Data Elixir logo

ISSUE 373  ·   February 8, 2022

 

Insight

The Economics of Data Businesses

Great article about how data businesses work, how they differ from other types of tech businesses, why it's hard for them to find investors, and what it takes to make these businesses worthwhile. If you're interested in getting involved in a business where data is the product, this is a must-read. Check out the resources at the end too.
Pivotal | Abraham Thomas

 

Sponsored Link

Get training data for ML in record time

Designed by engineers for engineers, Toloka combines cutting-edge technologies with the power of the crowd to deliver high-performing data for Machine Learning projects in record time. A built-in quality control system provides superb data accuracy at scale.

 


 

Tutorials, Projects & Opinions

Modern ML Monitoring: Research Challenges

In this final part of a 4-part series, Shreya Shankar uses real-world post-deployment issues to explore research challenges and solution ideas for ML monitoring, which, as she puts it, is a mess. This post covers a variety of issues, including monitoring and diagnosing performance issues, visualizations, datasets and benchmarks.
Shreya Shankar

 

Researchers Build AI That Builds AI

A new approach for training deep neural networks could help make AI more accessible to people without deep pockets and access to big data. Here's a high-level look at how it works.
Quanta Magazine | Anil Ananthaswamy

 

Everything gets a package: Python data science setup

It may sound like overkill, but if you keep your Python projects organized as described here, your projects will always be consistent, portable and reproducible. Spinning up a quick Jupyter notebook to check something out? Build a package first. This post starts with the whys and shows how to simplify the process.
Ethan Rosenthal

 

Understanding “statistical significance” and p-values 

Using simple examples, this post makes it easy to understand how statistical significance and p-values work.
For-loops and piep kicks

 

The 7 Best ELT Tools for Data Warehouses

Focus on actionable insights instead of stressing about managing data
// sponsored

 

Code & Tools

Mercury

Mercury is a simple web framework that converts Python Notebooks into interactive web applications. Just add a simple YAML header and deploy it to your Mercury server.
GitHub | mljar

 

Career

AI and Machine Learning Salaries Drop

Overall, tech salaries were up last year, but not for machine learning professionals. This article is based on a report from Dice that explores U.S. tech salaries and where the money is flowing now.
IEEE Spectrum

 

Data Visualization

Expansion for Continuous Scales

ggplot2's expansion function can help you add space between your data and the axes, but many people aren't familiar with it. In this new cheatsheet, Christian Burkhart offers an intuitive approach to understanding how to use the expansion function for continuous scales.
Christian Burkhart

 

Data Visualization - State of the Industry Survey

Roles, salaries, tech stacks, problems, trends and much more are covered in this report from a recent survey of data visualization practitioners around the world. This post offers a nice summary of the results. For details, there's a link at the top to download the full report.
Data Visualization Society

 

Outlier

River Runner Global

Awesome interactive visualization. Click to drop a raindrop anywhere in the world and watch its path to wherever it ends up. Start with this shortlist of interesting paths.
Sam Learner

 
 


 
 
 

Data Elixir, LLC
P.O. Box 21255
Boulder, CO 80308

Data Elixir is curated and maintained by Lon Riesberg. If you have questions or suggestions for the newsletter, just reply to this email.
