Data Elixir logo

ISSUE 378  ·   March 15, 2022

 

In this week's issue, Roni Kobrosly joins Data Elixir to curate the first of a new series called "Invited Topics." Roni is the head of data science at a health tech company in D.C., and he's also the creator of a collection of tools for causal inference analysis.

In this first Invited Topics section, Roni curates a collection of key links for getting started in causal inference. This week is Foundations. Coming up will be a selection of tools, methods, and more from around the web.

If you like this idea or you might be interested in curating your own section of Data Elixir, let me know!

-Lon

 

Trends

Snowflake goes shopping, and buys the store.

It may not seem like much now, but Snowflake's recent acquisition of Streamlit could become a very big deal. We're not there yet, but as Benn Stancil argues here, the Streamlit acquisition points to the future of data apps, where they'll live on the data stack, and what's needed to get there. Great post.
Benn Stancil

 

Sponsored Link

Delivering Accurate Ground Truth Data for AI/ML Models

30+ years of experience working with leading data-centric AI/ML models. With 3,500+ global SMEs and experience with any data type, we accelerate operations and advance models in record time. Scale your AI with your model's new secret weapon, Innodata.

 

Reach Data Elixir readers by sponsoring an issue. Click here for details.

 

Tutorials, Projects & Opinions

"Just get some labelled data"

Nice introduction to the art of data labelling and how it's more complex than you might think. Fundamentally, data labelling encodes key decisions about the domain and the problems you're trying to solve. Starting with an "ideal" case, here's how it gets complicated.
Neal Lathia

 

Jupyter Everywhere

The latest version of JupyterLite lets you easily embed a console, a notebook, or a fully-fledged IDE on any web page. This post walks through how it works with lots of examples along the way.
Jupyter Blog | Jeremy Tuloup

 

CS 329S: Machine Learning Systems Design

Chip Huyen's course on Machine Learning Systems Design is a solid introduction to developing real-world machine learning systems. There aren't videos here but it's a great set of lecture notes and readings.
Stanford, Winter 2022

 

Good-Bye Digital Natives. Hello AI Natives.

TikTok understands what AI Natives want. Do you? AI is fundamentally changing the rules of business & creating a new consumer class. Learn about AI Natives & how you can win their loyalty. Download a free copy of Prolego’s ‘AI Natives Among Us’ research report.
// sponsored

 
 

Invited Topics: Causal Inference, Part 1 (Foundations)

Causal Inference: What If?

Miguel A. Hernán and James Robins are causal inference superstars within the public health world. Part 1 of this book ("Causal inference without models") is not short, but if you want to learn about making causal inferences from data, it provides one of the best introductions to the topic you can find on the web. It assumes zero prior knowledge of modeling or public health and is quite approachable. Free to download.
Miguel A. Hernán and James Robins
 

Causality for Machine Learning

This report by the fabulous Fast Forward Labs may sound intimidating but chapters one and two provide a math-less introduction to a number of critical topics in causal inference. You'll learn about causal graphs (a simple way to visualize causal relationships), the concept of the "causal hierarchy", and counterfactuals.
Fast Forward Labs
 

Thinking Clearly About Correlations and Causation

Julia Rohrer provides a slightly deeper but still math-less explanation of causal graphs.
Sage Journals | Julia Rohrer
 

Causal Inference Challenges in Industry: A perspective from experiences at LinkedIn

While the links above have been more on the theoretical side, this video gives you a look at how causal inference is viewed from an industry perspective (specifically at LinkedIn).
YouTube | Ya Xu

 
 

Code & Tools

nbpreview

nbpreview is a terminal viewer for Jupyter notebooks. It's like a full-featured cat for ipynb files. Features include rendering for markdown, LaTeX, and DataFrames; syntax highlighting for code; previews for Vega charts and much more.
nbpreview | Paulo S. Costa

 

Career

Data salaries at FAANG companies in 2022

Mikkel Dengsøe explores 4,000+ data points from salary sharing sites and puts some numbers on data salaries around the world, organized by geography, seniority, and company. The info here may help you find a higher-paying job, but the real point of the post is to spur discussion about transparency. Should salaries be transparent?
Mikkel Dengsøe

 
 

 
 
 
Data Elixir, LLC
P.O. Box 21255
Boulder, CO 80308

This week's issue of Data Elixir was curated and edited by Roni Kobrosly and Lon Riesberg. If you have questions or suggestions for the newsletter, just reply to this email.
