Data Elixir logo

ISSUE 380  ·   March 29, 2022

 

Trends

Emerging Architectures for Modern Data Infrastructure

This detailed look at the modern data stack was initially created in 2020 and has been recently updated to show how things are evolving. Covers best-in-class architectures for both analytic and operational systems and lays out a hypothesis for why specific changes are happening. The article came out of discussions with dozens of practitioners.
a16z future | Matt Bornstein, Jennifer Li, Martin Casado

 

Sponsored Link

Understanding and Overcoming Four Types of Biases in AI

Four types of bias are found in machine learning models: algorithmic bias, sample bias, prejudicial bias, and measurement bias. How does each of these biases arise, and how is each mitigated? Read this article to understand how you can produce better business outcomes by training AI models to do precisely what they are meant to do.

 

Reach Data Elixir readers by sponsoring an issue. Click here for details.

 

Tutorials, Projects & Opinions

Advanced exploratory data analysis (EDA) w/ Python

Great post that shows how to quickly get a handle on nearly any tabular dataset. This is a comprehensive tutorial that's well organized and includes lots of code snippets and screenshots along the way.
Michael Notter

 

Precision & Recall

Nice introduction to simple classification issues in machine learning. This post is built around interactive visuals that make it easy to understand why accuracy isn't always a great measure of classifier performance and how precision, recall, and the F1-score work.
Jared Wilber
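
To make the accuracy point concrete, here is a minimal sketch in plain Python. It is not taken from the linked post; the toy labels and the function names are assumptions made up for illustration. It compares accuracy with precision, recall, and F1 on an imbalanced dataset where only 5 of 100 examples are positive.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Fall back to 0.0 when a denominator is zero (a common convention).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Hypothetical toy data: 5 positives followed by 95 negatives.
y_true = [1] * 5 + [0] * 95

# Model A always predicts the majority class (negative).
y_all_negative = [0] * 100

# Model B finds 4 of the 5 positives but raises 6 false alarms.
y_model_b = [1] * 4 + [0] * 1 + [1] * 6 + [0] * 89

for name, y_pred in [("always negative", y_all_negative),
                     ("model B", y_model_b)]:
    p, r, f1 = precision_recall_f1(y_true, y_pred)
    print(f"{name}: accuracy={accuracy(y_true, y_pred):.2f} "
          f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

On this toy data the always-negative model scores higher on accuracy (0.95 versus 0.93) even though it never finds a positive example, while precision, recall, and F1 are all zero for it and nonzero for model B.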

 

Artificial Counterfactual Estimation: Machine Learning-Based Causal Inference at Airbnb

When they wanted to measure the impact of changes that couldn't be tested with A/B tests, Airbnb developed a new methodology that uses ML and causal inference to artificially reproduce the “counterfactual” scenario produced by random assignment. Here's how it works.
Airbnb Tech Blog | Zhiying Gu, Qianrong Wu

 

Frustration: One Year With R

Despite the title, this post isn't an attack on R or a pitch for anything else. Rather, it's a carefully thought-out review of R's strengths and weaknesses by someone who uses it every day. It's a long post, but a linked index makes it easy to jump around.
Reece Goding

 

Machine Learning with PyTorch and Scikit-Learn – Out Now!

Looking for a comprehensive book on machine learning and deep learning using PyTorch? Look no further! Machine Learning with PyTorch and Scikit-Learn by Sebastian Raschka, Yuxi (Hayden) Liu, and Vahid Mirjalili is your essential guide to this powerful Python framework. Available on Amazon and the Packt website.
// sponsored

 

Discussions - Click through to read and/or participate

 
Twitter discussion: salaries
 
Twitter discussion: R / Python code comparisons
 

Outlier

Building games & apps entirely with natural language

Impressive demonstration of how OpenAI's code-davinci model can be used to build apps and games with ZERO coding. Andrew Mayne gives the model instructions in plain English, and the output is code for games like Wordle, Zelda, Tic Tac Toe, and more. Includes input instructions, output code, and running demos.
Andrew Mayne

 
 

Sign up to get Data Elixir's data science newsletter in your Inbox >>

 
 
 

Data Elixir, LLC
P.O. Box 21255
Boulder, CO 80308

Data Elixir is curated and maintained by Lon Riesberg. If you have questions or suggestions for the newsletter, just reply to this email.
