Data Elixir

ISSUE 351 · August 31, 2021

 

Trends

The Modern Data Experience

Plenty of "normal" people work with data these days, and most don't care about the stack of tools that data practitioners use to move, store, wrangle, analyze, and present data. They care about their own experience with data. "...the modern data stack isn't enough. We have to create a modern data experience." Great post!
Benn Stancil

 

Sponsored Link

Using third-party data to make smarter decisions

Combining third-party data with your internal data can help you make smarter business and technical decisions. This technical eBook gives step-by-step instructions for using AWS Data Exchange to couple data sets with advanced analytics and machine learning.

 


 
 

Tutorials, Projects & Opinions

A lightweight data validation ecosystem

Nice approach for building an "advanced" but right-sized data monitoring solution using common tools: GitHub (Actions, Pages, Issues), R (the pointblank and projmgr packages), and Slack notifications.
Emily Riederer

 

Inferring Concept Drift Without Labeled Data

A trained model may be static, but because people and systems are constantly changing, the data it sees shifts and its performance tends to drift over time. Detecting drift is key, but that is difficult when the incoming data is unlabeled. This research report explores the problem and offers four ways to approach it using unsupervised methods.
Cloudera Fast Forward Labs
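
For a feel of what label-free drift detection can look like, here is a minimal sketch that compares one feature's distribution in a reference window against a recent window using the two-sample Kolmogorov-Smirnov statistic. It illustrates the general idea only and is not necessarily one of the report's four methods. The function names and the 0.3 threshold are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest gap between the empirical CDFs of the two samples:
    0.0 means identical distributions, 1.0 means no overlap at all.
    """
    ref = sorted(reference)
    cur = sorted(current)
    if not ref or not cur:
        raise ValueError("both samples must be non-empty")

    max_gap = 0.0
    # The empirical CDFs only change at observed values, so checking
    # those points is enough to find the maximum gap.
    for x in set(ref) | set(cur):
        ref_cdf = bisect_right(ref, x) / len(ref)
        cur_cdf = bisect_right(cur, x) / len(cur)
        max_gap = max(max_gap, abs(ref_cdf - cur_cdf))
    return max_gap


def drift_suspected(reference, current, threshold=0.3):
    """Flag drift when the KS statistic exceeds a hypothetical threshold."""
    return ks_statistic(reference, current) > threshold


# Example: feature values at training time vs. values seen in production.
training_values = [1, 2, 3, 4]
production_values = [3, 4, 5, 6]
print(ks_statistic(training_values, production_values))
print(drift_suspected(training_values, production_values))
```

In practice the threshold would come from a significance test or from historical variation rather than a fixed constant, and each feature would be checked separately.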

 

Pseudo-R²: A Metric for Quantifying Interestingness

Great post that shows how the data science team at Heap uses a statistical metric called pseudo-R² to quantify how interesting an insight will be before deciding to recommend it for analysis.
Heap | David Robinson
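
As a rough illustration of the metric, the sketch below computes a pseudo-R² for a binary outcome, assuming McFadden's definition (one minus the ratio of the model's log-likelihood to that of a baseline that always predicts the overall rate). The Heap post may define and apply its metric differently, and the function names here are hypothetical.

```python
import math


def log_likelihood(outcomes, probabilities):
    """Bernoulli log-likelihood of 0/1 outcomes under predicted probabilities."""
    eps = 1e-12  # clip to avoid log(0)
    total = 0.0
    for y, p in zip(outcomes, probabilities):
        p = min(max(p, eps), 1 - eps)
        total += math.log(p) if y == 1 else math.log(1 - p)
    return total


def mcfadden_pseudo_r2(outcomes, probabilities):
    """McFadden's pseudo-R²: 1 - LL(model) / LL(null).

    The null model predicts the overall base rate for every observation.
    Values near 0 mean the model adds little over the base rate; values
    closer to 1 mean it explains the outcome much better.
    """
    base_rate = sum(outcomes) / len(outcomes)
    if base_rate in (0.0, 1.0):
        raise ValueError("outcomes must contain both classes")
    ll_model = log_likelihood(outcomes, probabilities)
    ll_null = log_likelihood(outcomes, [base_rate] * len(outcomes))
    return 1 - ll_model / ll_null


# Example: did users convert (1) or not (0), and what did a model predict?
converted = [1, 0, 1, 0]
predicted = [0.9, 0.1, 0.9, 0.1]
print(round(mcfadden_pseudo_r2(converted, predicted), 3))
```

A model that does worse than the base rate yields a negative value, which is one way this metric differs from the familiar R² of linear regression.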

 

Science For People Who Give A Shit

Join tens of thousands of other smart humans and subscribe to Important, Not Important for the most vital science news of the week, deep analysis, and Action Steps you can take to 1) feel better and 2) measurably improve the world around you. Get it for free.
// sponsored

 

Code & Tools

SQLModel

SQLModel is a library that makes it easy to interact with SQL databases from Python code. It's designed to be intuitive, highly compatible, and robust. SQLModel is based on Python type annotations and is built as a layer on top of Pydantic and SQLAlchemy.
SQLModel | Sebastián Ramírez

 

Resources

Analyzing US Census Data: Methods, Maps & Models 

Kyle Walker is a spatial data science expert who specializes in population geography and demographic trends. In this new book, Kyle shows how to acquire, wrangle, visualize, and model Census data using R. This looks like a great resource and is free to read online.
Kyle Walker

 

Data Visualization

prettymaps

prettymaps is a small Python library for drawing pretty maps from OpenStreetMap data. It is based on the osmnx, matplotlib, and shapely libraries.
GitHub | Marcelo Prates

 

Data Vis Dispatch

Datawrapper's new data visualization newsletter is a great source for curated visualizations from the media. Along with showing how data is being used to portray current events, many of the picks are good examples of visualization techniques. Sign up at the bottom of the page.
Datawrapper | Rose Mintzer-Sweeney and Lisa Charlotte Rost

 
 


 
 
 

Data Elixir, LLC
P.O. Box 21255
Boulder, CO 80308

Data Elixir is curated and maintained by Lon Riesberg. If you have questions or suggestions for the newsletter, just reply to this email.
