— In the News —
Last week, a widely criticized tweet from the White House Council of Economic Advisers suggested that COVID deaths would end in mid-May. In this post, Thomas Lumley breaks down the chart behind the controversy and why, ultimately, prediction is hard.
This article came out in March, and it's more relevant now than ever. The data uses it outlines seem reasonable enough, but many of the specific applications lack transparency, so it's worth paying attention as the pandemic evolves. Also, see the COVID Tracing Tracker from MIT Technology Review.
— Tools and Techniques —
How can you tell if a machine learning model is fair? It's harder than you might think. In this interactive article, Adam Pearce shows that there are multiple ways to measure a model's accuracy, and that aligning all of them across different groups is impossible.
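To make the tension concrete, here is a minimal sketch using made-up labels and predictions for two hypothetical groups (none of this data comes from the article). Both groups get the same false positive and false negative rates, yet because the share of true positives differs between them, the precision of a positive prediction does not match.

```python
def group_metrics(y_true, y_pred):
    """Return error rates and precision for one group of binary predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "fpr": fp / (fp + tn),  # false positive rate
        "fnr": fn / (fn + tp),  # false negative rate
        "ppv": tp / (tp + fp),  # precision of a positive prediction
    }

# Hypothetical group A: 4 of 8 people are true positives (base rate 50%).
a_true = [1, 1, 1, 1, 0, 0, 0, 0]
a_pred = [1, 1, 1, 0, 1, 0, 0, 0]

# Hypothetical group B: 4 of 16 people are true positives (base rate 25%).
b_true = [1] * 4 + [0] * 12
b_pred = [1, 1, 1, 0] + [1, 1, 1] + [0] * 9

metrics_a = group_metrics(a_true, a_pred)
metrics_b = group_metrics(b_true, b_pred)
print("A:", metrics_a)
print("B:", metrics_b)
```

In this toy setup the error rates are equal across the groups, but a positive prediction is right 75% of the time in group A and only 50% of the time in group B.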
A streak is a run of the same event happening several times in a row. In this step-by-step tutorial, Josh Devlin shows how to calculate streaks in Python using the pandas library and visualize them using Matplotlib.
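As a rough illustration of the idea, one common pandas pattern for counting streaks is to flag each row where the value changes, then take a cumulative sum of those flags to label each run. This is a sketch on a made-up `win` column and is not necessarily the approach taken in the tutorial.

```python
import pandas as pd

# Hypothetical game results: True is a win, False is a loss.
results = pd.DataFrame({"win": [True, True, False, True, True, True, False]})

# A new streak starts whenever the value differs from the previous row.
starts_new_streak = results["win"] != results["win"].shift()

# The running total of those flags gives every streak its own id.
results["streak_id"] = starts_new_streak.cumsum()

# Position within the current streak, counting from 1.
results["streak_len"] = results.groupby("streak_id").cumcount() + 1

# Longest run of consecutive wins.
longest_win_streak = results.loc[results["win"], "streak_len"].max()
print(results)
print(longest_win_streak)
```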
To effectively use machine learning algorithms that have a large number of hyperparameters, you need to pick good hyperparameter values. This new Distill article shows how to do that using Bayesian Optimization, which is especially useful when the function evaluations are expensive.
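For a feel of how the loop works, below is a bare-bones sketch of Bayesian optimization in one dimension using only NumPy. Everything in it is an assumption made for illustration: `expensive_function` is a hypothetical stand-in for a costly training run, the surrogate is a simple Gaussian process with a fixed RBF kernel, and the next point is chosen by expected improvement over a grid. It is not the code from the Distill article, and real projects would normally reach for a dedicated library.

```python
import math
import numpy as np

def expensive_function(x):
    # Hypothetical stand-in for a costly evaluation (lower is better).
    return (x - 2.0) ** 2 + np.sin(3.0 * x)

def rbf_kernel(a, b, length_scale=1.0):
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_new, noise=1e-6):
    # Gaussian process posterior mean and standard deviation at x_new.
    y_mean = y_obs.mean()
    k_obs = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_cross = rbf_kernel(x_obs, x_new)
    solved = np.linalg.solve(k_obs, k_cross)
    mean = y_mean + solved.T @ (y_obs - y_mean)
    var = 1.0 - np.sum(k_cross * solved, axis=0)
    return mean, np.sqrt(np.clip(var, 1e-12, None))

def expected_improvement(mean, std, best_y):
    # Expected improvement for a minimization problem.
    z = (best_y - mean) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (best_y - mean) * cdf + std * pdf

candidates = np.linspace(0.0, 5.0, 201)  # grid of hyperparameter values to consider
x_obs = np.array([0.5, 2.5, 4.5])        # a few initial evaluations
y_obs = expensive_function(x_obs)
initial_best = y_obs.min()

for _ in range(10):
    mean, std = gp_posterior(x_obs, y_obs, candidates)
    ei = expected_improvement(mean, std, y_obs.min())
    x_next = candidates[np.argmax(ei)]   # most promising point to try next
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, expensive_function(x_next))

best_x = x_obs[np.argmin(y_obs)]
best_y = y_obs.min()
print(best_x, best_y)
```

The point of the surrogate is that each pass through the loop spends one expensive evaluation where the model expects the largest gain, rather than sweeping the whole grid.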
There are some big updates in the latest version of Papers with Code. Along with 5500 new results and 800+ new leaderboards, results from papers are now systematically extracted, aggregated and linked to their original location in papers! See details in this announcement.
A feature store is a system for managing the curated data that's used to productionize machine learning applications. Without a centralized feature store, data scientists end up duplicating work. This is a great collection of talks about the variety of ways that organizations build and manage their feature stores.