Data Elixir

ISSUE 315 · December 8, 2020

 

Insight

Most important statistical ideas of the past 50 years?

This new paper explores eight major statistical ideas of the past 50 years, with an overview of each and a discussion of what they have in common, how they differ, and what to expect over the next few decades.
arXiv | Andrew Gelman, Aki Vehtari

 
 

Sponsored Link

Join the data science community at SIAM: new journal, book series, conference, and activity group!

The Society for Industrial and Applied Mathematics (SIAM) has launched a collection of new data science resources! The most recent additions are the SIAM Activity Group on Data Science (a place to network and collaborate!) and the SIAM Journal on Mathematics of Data Science. Learn more about SIAM’s data science offerings and stay connected to the community.

 


 
 

Tutorials, Projects & Opinions

Airbnb-quality data for all

Great follow-on article to the recent data quality series from Airbnb. In this article, Jeremy Stanley, founder of Anomalo, shows how to build and maintain high-quality data "without raising billions."
Anomalo | Jeremy Stanley

 
 
 

Machine Learning model governance at scale

The dynamic nature of machine learning makes model governance particularly challenging — especially at scale. This is a best-practices article from Microsoft that explores the issues and approaches.
Data Science at Microsoft

 

Code & Tools

(Re-)introducing Distill for R Markdown

Distill is a package for R Markdown that makes it easy to create technical articles, websites, and blogs in the style of the Distill Machine Learning Journal. Output is clean, interactive and engaging. Here are the highlights for the new 1.0 release, including links to key resources.
RStudio | Alison Hill and JJ Allaire

 
 
 

RipTable

High-performance 64-bit Python analytics engine for NumPy arrays with multi-threaded support. Enhances or replaces NumPy or pandas and claims to crunch numbers 1.5 to 10 times faster.
GitHub | rtosholdings

 
 
 

Looking to Annotate Your Video Data?

Alegion just opened up its powerful video annotation capabilities for self-serve. Label your video with complete end-to-end control. Upload, configure, and label with speed, accuracy, rich annotation, ML efficiencies, and real-time playback, then download your annotated data, all on your own time.
// sponsored

 

Resources

2020’s Top AI & Machine Learning Research Papers

These short summaries of recent AI and Machine Learning research papers cover a wide variety of authors, topics and venues. Includes key points, diagrams and links for each paper.
TOPBOTS

 

Data Visualization

A ggplot2 Tutorial for Beautiful Plotting in R

Awesome ggplot2 tutorial with lots of examples. Includes a linked Table of Contents and useful resources at the end. Worth bookmarking.
Cédric Scherer

 
 
 

Why use a radial data visualization?

Radial visualizations include circular layouts like pie charts, circular trees, sunbursts and weather wheels. They're an efficient way to present data, but they can also be counter-productive. Here's a visual orientation of circular visualization options, why you might use them, and the cases where there are clearly better alternatives.
Observable | Kerry Rodden

 

And Finally...

Machine Learning concepts as animal GIFs
 

Data Elixir is curated and maintained by Lon Riesberg. For full-text search of prior issues, visit Data Elixir's Search Page. If you have suggestions or questions for the newsletter, just reply to this email.

 

Data Elixir, LLC
P.O. Box 21255
Boulder, CO 80308