Python ML trends & developments. The business of AI. Projects to know. What makes ML reproducible? Communicating model uncertainty. Inside look at OpenAI.

Data Elixir

ISSUE 272  ·   February 18, 2020        

 

Insight

The New Business of AI (and How It’s Different From Traditional Software)

"Just as SaaS ushered in a novel economic model compared to on-premise software, we believe AI is creating an essentially new type of business." 
a16z

 
 
 

The Map of Mathematics

Nice rabbit hole! Explore modern mathematics and how its major elements fit together in this interactive project from Quanta Magazine.
Quanta

 

Profiles

The messy, secretive reality behind OpenAI’s bid to save the world

Karen Hao spent half a year digging into OpenAI, one of the leading AI research labs in the world. The lab is intended to "ensure that artificial general intelligence benefits all of humanity." Following dozens of interviews, here's Karen's inside look at how that's going.
Technology Review

 

Sponsored Link

The Expense of Poorly Labeled Data: An Experiment in Distortion

What happens when you train a machine learning model on biased data? In this article, we take a good data set that is well suited to modeling and compare the effects of random and biased distortion. The analysis illustrates how biased distortion is demonstrably worse and will ruin a dataset and any model trained on it.

 

Reach Data Elixir readers by sponsoring an issue. Click here for details.

 
 

Tools and Techniques

Quantifying Independently Reproducible ML

After attempting to reproduce results from 255 papers (!), Edward Raff, Chief Scientist at Booz Allen, distilled 26 key features of reproducibility. What makes a machine learning paper reproducible? Read this!
The Gradient

 
 
 

Five Interesting Data Engineering Projects

Nice introduction to a few projects that are worth being familiar with.
Dmitriy Ryaboy

 
 
 

Understanding Maximum Likelihood

This interactive post by Kristoffer Magnusson is a great explainer of maximum likelihood estimation and some common hypothesis tests, such as the likelihood ratio test, Wald test, and score test.
R Psychologist
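
For readers who want a feel for the three tests before opening the post, here is a minimal sketch for the simplest case we could think of: k successes in n Bernoulli trials, tested against a null value p0. This is not taken from Kristoffer's post. The function names and the example numbers are hypothetical, and it assumes 0 < k < n so the logarithms are defined.

```python
import math


def bernoulli_loglik(p, k, n):
    """Log-likelihood of success probability p given k successes in n trials."""
    return k * math.log(p) + (n - k) * math.log(1 - p)


def bernoulli_tests(k, n, p0):
    """Return the MLE and the LR, Wald, and score statistics for H0: p = p0.

    Illustrative helper (hypothetical name). Under H0 each statistic is
    approximately chi-square distributed with 1 degree of freedom.
    """
    if not 0 < k < n:
        raise ValueError("this sketch assumes 0 < k < n")
    if not 0 < p0 < 1:
        raise ValueError("p0 must lie strictly between 0 and 1")

    # The maximum likelihood estimate is the sample proportion.
    p_hat = k / n

    # Likelihood ratio: twice the log-likelihood gap between the MLE and p0.
    lr = 2 * (bernoulli_loglik(p_hat, k, n) - bernoulli_loglik(p0, k, n))

    # Wald: squared distance from p0, scaled by the variance evaluated at the MLE.
    wald = (p_hat - p0) ** 2 / (p_hat * (1 - p_hat) / n)

    # Score: same distance, scaled by the variance evaluated at the null value.
    score = (p_hat - p0) ** 2 / (p0 * (1 - p0) / n)

    return {"p_hat": p_hat, "lr": lr, "wald": wald, "score": score}


if __name__ == "__main__":
    for name, value in bernoulli_tests(k=60, n=100, p0=0.5).items():
        print(f"{name}: {value:.4f}")
```

With 60 successes in 100 trials and p0 = 0.5, all three statistics come out a little above 4, which is past the 3.84 cutoff for a chi-square test with 1 degree of freedom at the 5% level. The only difference between the Wald and score statistics here is where the variance is evaluated: at the estimate or at the null value.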

 
 
 

Comet: Machine Learning Experiment Management

Join tens of thousands of data scientists worldwide who use Comet.ml. Automatically track, compare, explain and reproduce your ML models and experiments. Sign up for free.
// sponsored

 

Resources

Machine learning in Python: Main developments and technology trends in data science, ML, and AI

This new survey paper explores the Python machine learning landscape with a focus on recent trends and developments. This is a well-written long read, with lots of references along the way. By Sebastian Raschka, Joshua Patterson and Corey J Nolet.
arXiv

 
 
 

rstudio::conf 2020

All RStudio Conference 2020 videos are now available for streaming. There are over 100 talks here, covering a wide range of topics. Most of these talks are about 20 minutes and include links to related materials.
RStudio

 

Data Viz

How big is that, though?

Even for people who work with maps a lot, it can be hard to grasp the size of things that are in the news. How big is that "2400 km² locust swarm that's devastating Kenya"? This tool by Hans Hack has the answer. It's simple and brilliant.
Datawrapper

 
 
 

Communicating Model Uncertainty Over Space

Great post by Adam Pearce that walks through his process for designing an interface that shows ML model uncertainty. The vehicle is a model that's very good at detecting prostate cancer, but it's not perfect. How do you show a pathologist where a model can be trusted? Adam shows the strengths and weaknesses of 6 different approaches.
People + AI Research

 

Upcoming Events

  • February 25 - Columbia Business School Executive Education Webinar: How Data Science and AI Are Shaping the Business Landscape - Assaf Zeevi will introduce some of the key innovations in this space in a manner that provides a base for understanding these technologies and the promise they hold.  For info, see the event website >>
     
  • March 2 - Data Elixir, a Global Women in Data Science (WiDS) Conference media collaborator, is pleased to present the WiDS Stanford livestream. The Global WiDS Conference aims to inspire and educate data scientists worldwide, regardless of gender, and support women in the field. This annual one-day technical conference features outstanding women doing outstanding work. For info, see the event website >>
 

Job Board

New on the Job Board this week:

  • Data Scientist at Stack Overflow (remote)
  • VP of Data Science & Analytics at Mozilla
  • Data Science Instructor at Lambda School (remote)
  • Data Engineering Manager at Spotify
  • Data Reporter at The Center for Investigative Reporting
  • Data Product Manager at iRobot
  • 2 Data Science positions at Tesla

Check these out and more >>

 

Data Elixir is curated and maintained by @lonriesberg. For additional finds from around the web, follow Data Elixir on LinkedIn, Twitter or Facebook.

 
Data Elixir, LLC
P.O. Box 21255
Boulder, CO 80308