Top DS Python libs of 2019. Reusable data workflows for polyglot teams. Machine unlearning. Data valuation. AI Index Report. NeurIPS highlights.


Data Elixir

ISSUE 264 · December 17, 2019

 

Data Elixir is taking a break next week, which means the next issue will go out on December 31st. In the meantime, if you're looking for something to read, head over to the Archives or check out the Data Elixir Search page and you're sure to find something interesting. Have great holidays, everyone!
- Lon

 

Insight

The 2019 AI Index report

The AI Index Report from the HAI group at Stanford is a "starting point for informed conversations about the state of AI." The report is organized into 9 chapters that cover topics including Technical Performance, Research, the Economy, and Public Perception. Start with the short Highlights section at the beginning.
HAI / Stanford

 
 
 

How practitioners and academics think (and then forget) about fairness when building AI systems.

When asked what they think, many practitioners will say the right thing. But what drives their decisions when they actually start building a system?
David Sumpter

 

Sponsored Link

See how top BI platforms compare

This month, Mode was named a leader in BI and analytics by G2 Crowd. G2 aggregated customer reviews for all the top BI platforms. See which companies do best in ease of setup, ease of admin, future direction, and more.  Download Report 

 


 
 

Tools and Techniques

Top 10 Python libraries of 2019

Tryolabs' annual collection of top Python picks is consistently a must-read post. As in previous years, some of these picks will be familiar and some probably won't. Each pick includes useful descriptions and links.
Tryolabs Blog

 
 
 

dbplyr: A Path to More Inclusive Data Transformations at the ACLU

Aaron Horowitz, Chief Data Scientist at the ACLU, describes how his team uses dbplyr as a meta-programming language to generate complex SQL code. It's a unique approach that enables them to create reusable data workflows that anyone on the team can use, regardless of whether they prefer R, Python, or pure SQL.
ACLU Tech & Analytics

 
 
 

Machine Unlearning

Users may have a right to have their data deleted, but getting a machine learning model to unlearn data is notoriously difficult. This paper offers a nice overview of the issues and introduces a framework that simplifies the process of machine unlearning.
arXiv
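
To make the idea concrete, here is a minimal sketch of one way an unlearning framework can limit retraining: split the training data into shards, train one small model per shard, and combine predictions by majority vote. Deleting a record then means retraining only the shard that held it. Everything in the code (the ShardedEnsemble class, the toy nearest-centroid model, the sample records) is a hypothetical illustration and not the paper's implementation.

```python
from collections import Counter


class ShardedEnsemble:
    """Toy ensemble: one model per data shard, majority vote at predict time."""

    def __init__(self, data, num_shards):
        # data: list of (point_id, x, label); shard assignment by id is an assumption
        self.shards = [[] for _ in range(num_shards)]
        for record in data:
            self.shards[record[0] % num_shards].append(record)
        self.models = [self._train(shard) for shard in self.shards]
        self.retrain_count = 0

    @staticmethod
    def _train(shard):
        # Hypothetical model: the mean of x for each label (nearest centroid)
        sums = {}
        for _, x, label in shard:
            total, count = sums.get(label, (0.0, 0))
            sums[label] = (total + x, count + 1)
        return {label: total / count for label, (total, count) in sums.items()}

    def predict(self, x):
        # Each non-empty shard model votes for its closest centroid
        votes = Counter()
        for model in self.models:
            if model:
                votes[min(model, key=lambda label: abs(model[label] - x))] += 1
        return votes.most_common(1)[0][0]

    def unlearn(self, point_id):
        # Drop the record and retrain only the shard that contained it
        idx = point_id % len(self.shards)
        self.shards[idx] = [r for r in self.shards[idx] if r[0] != point_id]
        self.models[idx] = self._train(self.shards[idx])
        self.retrain_count += 1


records = [
    (0, 0.10, "a"), (1, 0.20, "a"), (2, 0.90, "b"),
    (3, 1.00, "b"), (4, 0.15, "a"), (5, 0.95, "b"),
]
ensemble = ShardedEnsemble(records, num_shards=3)
untouched_model = dict(ensemble.models[0])
ensemble.unlearn(4)  # only the shard holding record 4 is retrained
print(ensemble.predict(0.0), ensemble.retrain_count)
```

The trade-off in a design like this is that each shard model sees less data, so accuracy can suffer as the number of shards grows, while the cost of a deletion shrinks.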

 
 
 
 

What is My Data Worth?

Data marketplaces use a variety of schemes to put a price on data, but the schemes tend to be ad hoc and difficult to scale. In this post, Ruoxi Jia shows how techniques using Shapley values and K-nearest neighbors can be used to provide more consistent and scalable valuations.
BAIR at UC Berkeley
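
For a feel of what a Shapley-based valuation computes, here is a minimal sketch that scores each training point by its average marginal contribution to the accuracy of a nearest-neighbor classifier, averaged over every ordering of the training set. This brute-force enumeration only works for a handful of points and is not the efficient algorithm described in the post. The function names, the 1-D features, and the sample data are all assumptions made for illustration.

```python
from itertools import permutations
from math import factorial


def knn_utility(subset, train, test, k=1):
    """Accuracy on `test` of a k-NN classifier that only sees train[i] for i in subset."""
    if not subset:
        return 0.0
    correct = 0
    for x, y in test:
        # Sort by (distance, index) so the result depends on the set, not its order
        nearest = sorted(subset, key=lambda i: (abs(train[i][0] - x), i))[:k]
        votes = [train[i][1] for i in nearest]
        prediction = max(sorted(set(votes)), key=votes.count)
        correct += prediction == y
    return correct / len(test)


def shapley_values(train, test, k=1):
    """Exact Shapley value per training point; cost grows factorially with len(train)."""
    n = len(train)
    values = [0.0] * n
    for order in permutations(range(n)):
        chosen = []
        previous = 0.0  # utility of the empty set
        for i in order:
            chosen.append(i)
            current = knn_utility(chosen, train, test, k)
            values[i] += current - previous  # marginal contribution of point i
            previous = current
    return [v / factorial(n) for v in values]


# Hypothetical toy data: (feature, label) pairs
train = [(0.0, 0), (1.0, 0), (5.0, 1), (9.0, 0)]
test = [(0.5, 0), (4.8, 1)]
values = shapley_values(train, test)
print(values)
```

Two properties are worth checking on any toy run: the values add up to the utility of the full training set, and points that the classifier depends on most receive the largest share.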

 
 
 
 

An Introduction to R With Hockey Data

Nice introduction to R for the complete beginner. Starts with getting set up and then walks through the basics of exploring and visualizing data with R. Includes code and exercises.
Hockey-Graphs

 
 
 

Data scientists are in demand on Vettery

Vettery is an online hiring marketplace that's changing the way people hire and get hired. Ready for a bold career move? Make a free profile, name your salary, and connect with hiring managers from top employers today.
// sponsored

 

Conferences

NeurIPS 2019

NeurIPS 2019 generated a lot of activity around the web last week. Whether you attended or not, here are key links worth knowing about:

  • For detailed notes, this collection by David Abel is hard to beat.
  • If you prefer notes with lots of visuals, check out these notes by Robert Lange.
  • All of the presentations are online at https://slideslive.com/neurips
  • If you only have time to watch one presentation, check out Celeste Kidd's talk, "How to Know," about how people know what they know.
 

Job Board

  • Research Associate, Center for Data Insights at MDRC - New York, NY or Oakland, CA
  • Data Analytics Course Mentor at Springboard - Anywhere
  • Senior Data Scientist at Intercom - San Francisco, CA, USA
  • Data Scientist at University of Michigan - Ann Arbor, MI, USA
  • Data Engineer at OECD (Organisation for Economic Co-operation and Development) - Paris, France


 

Data Elixir is curated and maintained by @lonriesberg. For additional finds from around the web, follow Data Elixir on LinkedIn, Twitter or Facebook.

 
Data Elixir, LLC
P.O. Box 21255
Boulder, CO 80308