Data Elixir logo

ISSUE 316 · December 15, 2020

 

Phew. 2020 is almost over! I've never been so excited for a year to end. To celebrate and reflect on new directions for 2021, Data Elixir will be taking the next two weeks off. I hope you and your families have a great holiday season!

Cheers,
Lon

 

Insight

You Can’t Escape Hyperparameters and Latent Variables: ML as a Software Engineering Enterprise

Great "keynote" from NeurIPS 2020 about the past, present and future of ML research — especially as it pertains to algorithmic bias. It's 45 minutes but don't let that deter you. This is a well-paced presentation that's full of insightful conversations with researchers, engineers, and data scientists. Starts at 29:04.
NeurIPS 2020 | Charles Isbell

 
 

Sponsored Link

Learn R, Python, and SQL with Dataquest, the data science platform recommended by 97% of learners.

No video lectures, no fill-in-the-blank quizzes. Learn through high-quality, step-by-step courses developed by a team of data experts. Sign up for free to start analyzing data today -- all in your browser.

 


 
 

Tutorials, Projects & Opinions

Easier Code Reviews For Jupyter Notebooks

The raw content of Jupyter Notebooks is a mix of dissimilar source code, Markdown, and HTML, making Jupyter notoriously challenging for code reviews. This tutorial shows how to use an open-source utility called nbautoexport to simplify the process.
DrivenData

 
 
 

Why Some Models Leak Data

Machine learning models use large amounts of data, some of which can be sensitive. If the models aren't trained correctly, sometimes that data is inadvertently revealed. This interactive essay shows how.
PAIR Explorables | Adam Pearce and Ellen Jiang

 
 
 

Data Catalogs Are Dead; Long Live Data Discovery

Data catalogs aren't cutting it anymore when it comes to metadata management and data governance. Here's how data discovery can help.
Monte Carlo Blog | Barr Moses and Debashis Saha

 
 
 

Financial Times Data Platform: From zero to hero

Great walk-through of the evolution of the Financial Times data platform. Includes detailed timelines and decisions along the way.
FT Product & Technology Blog

 
 
 

End-to-end Production ML Monitoring

This post bills itself as a "deep dive" into production machine learning monitoring, but it's organized in a way that also makes it easy to get a high-level view of what ML monitoring is all about. Covers outlier detectors, drift detectors, metrics servers, and explainers.
Alejandro Saucedo

 
 
 

AMAX: Revolutionary GPU solutions for data insights at deskside

The fastest time-to-results for deep learning training and visualization: AMAX GPU workstations with the latest AMD EPYC™ processors provide unprecedented data-transfer speed and performance for data- and graphics-intensive applications. Special pricing for academia and startups.
// sponsored

 

Resources

Advanced Data Science 2020

This GitBook accompanies a semester-long course that focuses on the "hard" part of data science. The assumption is that you already have a background in statistics, programming and the basics of reproducible research. The focus in this course is on synthesizing those tools into a data analysis and then communicating that analysis to an audience.
Johns Hopkins University | Jeff Leek and Roger D. Peng

 
 
 

Papers with Code is Expanding

Papers with Code recently launched new sites for statistics, math, computer science, and more. Use these sites to discover new projects and/or sync your code to show on arXiv. See the post for details.
Papers with Code Blog

 

Project Pick

Blob Opera

Blob Opera is a surprisingly addictive machine learning experiment by David Li in collaboration with Google Arts and Culture. Nothing to learn here. Just a fun machine learning toy to brighten up your holidays! ✨
Google Arts & Culture

 

Data Elixir is curated and maintained by Lon Riesberg. For full-text search of prior issues, visit Data Elixir's Search Page. If you have suggestions or questions for the newsletter, just reply to this email.

 


 
Data Elixir, LLC
P.O. Box 21255
Boulder, CO 80308