Data Elixir

ISSUE 410  ·   October 25, 2022

 

Talks

Earth System Modeling with ML and Big Data

Earlier this year, the Aspen Global Change Institute hosted a weeklong workshop on using machine learning to model and help understand Earth's climate. Videos of the talks, along with PDFs of the slides, were recently released and are a great overview of the work being done across a variety of groups and applications.
Aspen Global Change Institute

 

The Normcore Tech Conference - December 15

Normcore is a free, online conference about all the mundane, behind-the-scenes, how-the-sausage-is-made, middlebrow, unsexy, normcore stuff in the data and ML parts of the tech world. This is a ~grassroots gathering that looks amazing! Check out the schedule 👉
NormConf

 


 

Tutorials, Projects & Opinions

Posterior Predictions Guide

Nice exploration of the differences between Bayesian posterior predictions, linear predictions, and the expectation of posterior predictions. A lot of visuals help make this intuitive and there's also a link to a cheatsheet at the bottom of the post.
Andrew Heiss

 

Rebels with a Cause: Monologues from Heckman, Pearl, Robins, and Rubin

The latest issue of Observational Studies is a must-read in the field of causal inference. It features in-depth interviews with James Heckman, Judea Pearl, Jamie Robins, and Don Rubin, four of the most prominent and influential leaders in the field.
Observational Studies

 

Modern Data Stack in a Box with DuckDB

Here's how you can deploy an open-source Modern Data Stack on your laptop or to a single machine using the combination of DuckDB, Meltano, dbt, and Apache Superset. The post explores how it's useful and how to set it up on your own machine.
DuckDB | Jacob Matson

 

Russian Roulette: An Unbiased Estimator of the Limit

In statistics, the Russian Roulette estimator offers a simple way to construct an unbiased estimator for the limit of a sequence. It has many potential applications, since many hard problems can be cast as estimating the limit of a sequence. This is a nice write-up of how it works, how it could be useful, and why it never caught on.
Fabian Pedregosa
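
For readers who want a feel for the idea before clicking through, here is a minimal sketch of a Russian Roulette estimator. It is not taken from the linked post. The toy sequence a_k = 1 - 0.5^(k+1) (limit 1), the geometric stopping rule with continuation probability q, and all function names are assumptions chosen for illustration. The estimator sums the increments a_k - a_(k-1), each divided by the probability of reaching step k, and stops at a random step.

```python
import random


def sequence_term(k):
    # Toy sequence (an assumption for this sketch): a_k = 1 - 0.5^(k+1), limit 1.
    return 1.0 - 0.5 ** (k + 1)


def russian_roulette_estimate(term, q, rng):
    # One random estimate of lim a_k.
    # After each step we continue with probability q, so P(reach step k) = q^k.
    estimate = 0.0
    previous = 0.0   # convention: a_{-1} = 0
    survival = 1.0   # P(reach step k), starts at q^0 = 1
    k = 0
    while True:
        current = term(k)
        # Reweight the increment by the chance of having survived to step k.
        estimate += (current - previous) / survival
        previous = current
        if rng.random() >= q:   # stop with probability 1 - q
            return estimate
        k += 1
        survival *= q


def average_estimate(num_samples=100_000, q=0.7, seed=0):
    # Average many independent estimates; this should be close to the limit.
    rng = random.Random(seed)
    total = sum(
        russian_roulette_estimate(sequence_term, q, rng)
        for _ in range(num_samples)
    )
    return total / num_samples


if __name__ == "__main__":
    print(average_estimate())
```

Each single estimate only evaluates a finite, random number of terms, yet the average over many runs approaches the limit. The choice of q matters: if the sequence converges slowly relative to q^k, the variance can blow up, which is one of the practical concerns with this family of estimators.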

 
 

Turn insights into action with Hightouch

Your team is devoted to building precise dashboards and brilliant models, so why not do the most with those hard-earned insights? Go beyond analytics with Hightouch: sync data from your warehouse into the tools your business teams live in (e.g., Salesforce, HubSpot) so it can drive business impact. Book a demo to learn more.
// sponsored

 

Career

The Data Science Interview Book

This online interview guide covers a wide variety of topics in statistics, model building, algorithms, visualization, SQL, Python, machine learning, and more. Updated monthly.
GitHub | Dip Ranjan Chatterjee

 

Data Visualization

Using ggplot2 to create Treatment Timelines with Multiple Variables

This tutorial introduces treatment timelines or “swimmer” plots and shows how to create them in R using ggplot2. These types of plots can help visualize treatment or measurement patterns, time-varying covariates, outcomes, and loss to follow-up in longitudinal data settings.
KHstats | Katherine Hoffman

 

aRt

This R package is a great playground for creating generative art. To see what you can do with its 30+ functions, just scroll through the examples.
GitHub | Nicola Rennie

 
 


 
 

Data Elixir, LLC
P.O. Box 21255
Boulder, CO 80308

Data Elixir is curated and maintained by Lon Riesberg. If you have questions or suggestions, just reply to this email.
