ISSUE 351 · August 31, 2021

Trends

The Modern Data Experience
A lot of "normal" people work with data these days and most don't care about the stack of tools that data practitioners use to move, store, wrangle, analyze and present data. They care about their own experience with data. "...the modern data stack isn't enough. We have to create a modern data experience." Great post!

Sponsored Link

Using third-party data to make smarter decisions
The insights gained by using third-party data with your internal data can help you make smarter business and technical decisions. This technical eBook can help you learn how to use AWS Data Exchange to couple data sets with advanced analytics and machine learning with step-by-step instructions.

Tutorials, Projects & Opinions

A lightweight data validation ecosystem
Nice approach for building an "advanced" but right-size data monitoring solution using common tools: GitHub (Actions, Pages, issues), R (pointblank + projmgr pkgs), and Slack notifications.

Inferring Concept Drift Without Labeled Data
A trained model might be static but because people and systems are constantly changing, models tend to drift over time. Detecting drift is key but that can be problematic with unlabeled data. This research report explores the problem and offers four ways to approach it using unsupervised methods.

Pseudo-R²: A Metric for Quantifying Interestingness
Great post that shows how the data science team at Heap uses a statistical metric called pseudo-R² to quantify how interesting an insight will be before deciding to recommend it for analysis.

Science For People Who Give A Shit
Join tens of thousands of other smart humans and subscribe to Important, Not Important for the most vital science news of the week, deep analysis, and Action Steps you can take to 1) feel better and 2) measurably improve the world around you. Get it for free.

Code & Tools

SQLModel
SQLModel is a library that makes it easy to interact with SQL databases from Python code.
It's designed to be intuitive, highly compatible, and robust. SQLModel is based on Python type annotations and is built as a layer on top of Pydantic and SQLAlchemy.

Resources

Analyzing US Census Data: Methods, Maps & Models
Kyle Walker is a spatial data science expert who specializes in population geography and demographic trends. In this new book, Kyle shows how to acquire, wrangle, visualize, and model Census data using R. This looks like a great resource and is free to read online.

Data Visualization

prettymaps
prettymaps is a small Python library to draw pretty maps from OpenStreetMap data. It is based on the osmnx, matplotlib and shapely libraries.

Data Vis Dispatch
Datawrapper's new data visualization newsletter is a great source for curated visualizations from the media. Along with showing how data is being used to portray current events, many of the picks are good examples of visualization techniques. Sign up at the bottom of the page.