— Insight —
The AI Index Report from the HAI group at Stanford is a "starting point for informed conversations about the state of AI." The report is organized into nine chapters covering topics such as Technical Performance, Research, the Economy, and Public Perception. Start with the short Highlights section at the beginning.
When asked what they think, many practitioners will say the right thing. But what drives their decisions when they actually start building a system?
— Tools and Techniques —
Tryolabs' annual collection of top Python picks is consistently a must-read post. As in previous years, some of these picks will be familiar and some probably will not. Each pick includes a useful description and links.
Aaron Horowitz, Chief Data Scientist at the ACLU, describes how his team uses dbplyr as a meta-programming language to generate complex SQL code. It's a unique approach that enables them to create reusable data workflows that anyone on the team can use, regardless of whether they prefer R, Python, or pure SQL.
Users may have a right to have their data deleted, but getting a machine learning model to unlearn data is notoriously difficult. This paper offers a nice overview of the issues and introduces a framework that simplifies the process of machine unlearning.
Data marketplaces use a variety of schemes to put a price on data, but the schemes tend to be ad hoc and difficult to scale. In this post, Ruoxi Jia shows how techniques using Shapley values and K-nearest neighbors can be used to provide more consistent and scalable valuations.
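For readers who want a feel for the idea, here is a minimal sketch of a closed-form Shapley computation for an unweighted K-nearest-neighbor classifier and a single test point. It assumes the utility of a training subset is the fraction of the K nearest neighbors in that subset whose label matches the test label. The function name `knn_shapley` and the toy data are illustrative assumptions, not taken from the post.

```python
from math import dist


def knn_shapley(train_x, train_y, test_x, test_y, k):
    """Shapley value of each training point for one test point.

    Assumed utility: (1/k) * number of correct labels among the
    k nearest neighbors drawn from the chosen training subset.
    """
    n = len(train_x)
    if n == 0:
        return []
    # Rank training points from nearest to farthest from the test point.
    order = sorted(range(n), key=lambda i: dist(train_x[i], test_x))
    # 1.0 where the ranked point's label matches the test label, else 0.0.
    match = [1.0 if train_y[i] == test_y else 0.0 for i in order]

    values = [0.0] * n
    # The farthest point only matters when it is picked first.
    values[order[-1]] = match[-1] / n
    # Walk inward: each point's value follows from its farther neighbor's.
    for pos in range(n - 2, -1, -1):
        rank = pos + 1  # 1-based rank of the current point
        step = (match[pos] - match[pos + 1]) / k * min(k, rank) / rank
        values[order[pos]] = values[order[pos + 1]] + step
    return values


# Toy one-dimensional data (hypothetical): two points share the test label.
train_x = [(0.0,), (1.0,), (2.0,)]
train_y = [1, 0, 1]
values = knn_shapley(train_x, train_y, test_x=(0.1,), test_y=1, k=1)
print(values)
```

The cost is dominated by the sort, which is what makes this kind of valuation scale where a brute-force Shapley computation over all subsets would not. The values should add up to the utility of the full training set, which is a handy sanity check.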
Nice introduction to R for the complete beginner. Starts with getting set up and then walks through the basics of exploring and visualizing data with R. Includes code and exercises.