Data Elixir

ISSUE 318 · January 12, 2021

 

In the News

Google Research: Looking Back at 2020, and Forward to 2021

Google has a massive impact on the tools, applications, and research that help steer the data science community. As in prior years, this retrospective by Jeff Dean is amazing in its scope. It includes useful summaries, screenshots, videos, and linked references throughout.
Google AI Blog | Jeff Dean

 
 
 

He Created the Web. Now He’s Out to Remake the Digital World.

When Tim Berners-Lee helped create the World Wide Web, there was no way to know how far it would go. Now, three decades later, the big tech companies control vast troves of data and have become surveillance platforms and gatekeepers of innovation. Here's how his new startup aims to fix that.
New York Times | Steve Lohr

 

Sponsored Link

Online Data Science Programs from Drexel University


Find your algorithm for success with an online data science degree from Drexel University. Gain essential skills in tool creation and development, data and text mining, trend identification, and data manipulation and summarization by using leading industry technology to apply to your career. Learn more.

 

Reach Data Elixir readers by sponsoring an issue. Click here for details.

 
 

Tutorials, Projects & Opinions

Real-time Machine Learning For Recommendations

Eugene Yan's latest article explores how real-time machine learning looks in practice, especially regarding recommendations. When does real-time recommendation make sense? When does it not? How should you approach an MVP? This is an awesome article that's easy to follow.
Eugene Yan

 
 
 

What do you wish you knew before you deployed your first ML model?

The suggestions in this thread may surprise you, especially if you're new to deploying models.
Twitter | Chip Huyen

 

Code & Tools

Validating Data in Python with Cerberus

The Cerberus library provides data validation functions for Python and is designed to be simple to use and extensible. This introduction walks through some examples for getting started.
Trading Fish | Hector Castro

 
 
 

Simple Graph

Simple Graph is a graph database for SQLite, inspired by the article "SQLite as a document database".
GitHub | Denis Papathanasiou
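
To give a feel for the idea of keeping a graph in SQLite, here is a minimal sketch using Python's built-in `sqlite3` module. It is not Simple Graph's actual schema or API: the table layout, the JSON `body` column, and the helper names (`connect`, `add_node`, `add_edge`, `reachable`) are all assumptions made for illustration.

```python
import json
import sqlite3


def connect(path=":memory:"):
    """Open a database and create the two illustrative tables if needed."""
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS nodes (
            id   TEXT PRIMARY KEY,
            body TEXT NOT NULL          -- node properties stored as JSON text
        );
        CREATE TABLE IF NOT EXISTS edges (
            source TEXT NOT NULL REFERENCES nodes(id),
            target TEXT NOT NULL REFERENCES nodes(id),
            PRIMARY KEY (source, target)
        );
        """
    )
    return conn


def add_node(conn, node_id, **props):
    """Insert a node whose properties are serialized to JSON."""
    conn.execute(
        "INSERT INTO nodes (id, body) VALUES (?, ?)",
        (node_id, json.dumps(props)),
    )


def add_edge(conn, source, target):
    """Insert a directed edge from source to target."""
    conn.execute(
        "INSERT INTO edges (source, target) VALUES (?, ?)",
        (source, target),
    )


def reachable(conn, start):
    """Return ids reachable from start, excluding start itself.

    The recursive CTE uses UNION (not UNION ALL), so already-visited
    ids are dropped and cycles terminate.
    """
    rows = conn.execute(
        """
        WITH RECURSIVE walk(id) AS (
            SELECT ?
            UNION
            SELECT e.target FROM edges e JOIN walk w ON e.source = w.id
        )
        SELECT id FROM walk WHERE id != ? ORDER BY id
        """,
        (start, start),
    ).fetchall()
    return [row[0] for row in rows]
```

A short usage example: after `conn = connect()`, add a few nodes with `add_node(conn, "a", name="Alice")`, link them with `add_edge(conn, "a", "b")`, and call `reachable(conn, "a")` to traverse. Storing properties as JSON text keeps the schema fixed while node contents vary, which is the document-database idea the project description mentions.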

 

Resources

Data is Plural archive

The Data is Plural newsletter is a great resource for learning about obscure datasets, but the archives have been hard to browse. Not anymore. This project by Amelia Wattenberger makes it easy to search the archives using keywords and topics.
dataset-finder | Amelia Wattenberger

 
 
 

Probabilistic Machine Learning: An Introduction

This popular text by Kevin P. Murphy has gotten a big upgrade this year. The updated book was expected to run over 1,600 pages, so to keep it manageable it has been split into two volumes. This is the electronic version of Volume 1. The second volume, Advanced Topics, is planned for 2022.
Kevin Patrick Murphy

 

Project Pick

Parler video GPS data

The social media app Parler has taken a lot of heat recently for its role in the planning of domestic terrorist attacks in the U.S. Between the app stores and AWS, it has been shut down, but not before nearly all of its data was downloaded via some hacking heroics. This repo contains GPS metadata and code for working with it. Also, see this thread >>
GitHub | kylemcdonald

 

Data Elixir is curated and maintained by Lon Riesberg. For full-text search of prior issues, visit Data Elixir's Search Page. If you have suggestions or questions for the newsletter, just reply to this email.

 

Sign up to get Data Elixir's data science newsletter in your inbox >>

 
Data Elixir, LLC
P.O. Box 21255
Boulder, CO 80308