
Data Elixir

ISSUE 304 · September 22, 2020

 

In the News

The High Privacy Cost of a “Free” Website

Most people know that trackers follow them around the web, but few realize just how deep that rabbit hole goes. This deep dive into online tracking explains who's tracking you, what they do with your data, and how to protect yourself.
The Markup

 
 

Sponsored Link

Global Master of Management Analytics from Smith School of Business at Queen’s University

Future-Proof Your Career, While You Work

The Global Master of Management Analytics from Smith School of Business at Queen’s University is a 12-month program that can be taken from anywhere in the world.  Master the essential strategies for applying analytics to business needs in this ground-breaking program.

 

Reach Data Elixir readers by sponsoring an issue. Click here for details.

 
 

Tutorials, Projects & Opinions

How randomized response can help collect sensitive information responsibly

The availability of giant datasets and faster computers is making it harder to collect and study private data without inadvertently violating people’s privacy. In this interactive post, Adam Pearce and Ellen Jiang take a look at how random numbers can help.
PAIR Explorables
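
For a feel of how random numbers can protect individual answers, here is a minimal sketch of the classic coin-flip form of randomized response. It is not taken from the post itself. The function names, the 30% true rate, and the population size are all assumptions chosen for illustration. Each respondent answers truthfully on a first coin flip of heads and otherwise reports the result of a second coin flip, so no single answer reveals the truth, yet the overall rate can still be estimated.

```python
import random


def randomized_response(truth: bool, rng: random.Random) -> bool:
    """Report the true answer half the time, a random coin flip otherwise."""
    if rng.random() < 0.5:       # first coin: heads -> answer truthfully
        return truth
    return rng.random() < 0.5    # tails -> report a second, unrelated coin flip


def estimate_true_rate(reports) -> float:
    """Invert the noise: P(report yes) = 0.5 * true_rate + 0.25."""
    observed = sum(reports) / len(reports)
    return 2 * (observed - 0.25)


# Hypothetical survey: 100,000 people, 30% of whom would truthfully answer "yes".
rng = random.Random(0)           # fixed seed so the run is repeatable
true_rate = 0.30
population = [rng.random() < true_rate for _ in range(100_000)]

reports = [randomized_response(answer, rng) for answer in population]
estimate = estimate_true_rate(reports)

print(f"share of 'yes' reports: {sum(reports) / len(reports):.3f}")
print(f"estimated true rate:    {estimate:.3f}")
```

The trade-off is that the estimate is noisier than a direct survey of the same size, so more respondents are needed for the same precision.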

 
 
 

Analytics at Netflix: Who We Are and What We Do

Great insights into the data team at Netflix. Here's an inside look at exactly what they do and how they're organized.
The Netflix Tech Blog

 
 
 

An Intake Form for Data Requests

Which is more important: getting the answer quickly or getting an accurate answer? It's never that easy, but the question is a great starting point for discussion. Ultimately, this post by Caitlin Hudon leads to a well-thought-out intake form to consider for your own data team.
Caitlin Hudon

 
 
 

Coding for Sports Analytics

For anyone interested in getting started with sports analytics, this is an awesome collection of tutorials to explore. Topics include coding, analysis techniques, data access and libraries for 🏈 ⚾ ⚽ 🏒 🏀 & more!
Brendan Kent

 

Code & Tools

• DuckDB - an embeddable SQL OLAP database management system, optimized for analytics. See also the discussion on Hacker News.

• Deequ - library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.

• Norfair - customizable lightweight Python library for real-time 2D object tracking.

• Eiten - open source Python toolkit that implements a variety of statistical and algorithmic investing strategies.

 

Resources

Array programming with NumPy

NumPy is a key component of nearly every Python library that does scientific or numerical computation. This is a very readable paper that explores how NumPy became so important, the ecosystem around it and a high-level view of how it works. This paper will be widely cited. 
Nature

 
 
 

Tidy Modeling with R

This new online book introduces a collection of R software for model building. It covers modeling fundamentals, feature engineering, tools, and tips, with lots of examples along the way. The last chapters aren't finished yet, but the book is far enough along to be useful.
Julia Silge and Max Kuhn

 

And Finally...

The story behind our record certification design

Sony Music asked Nadieh Bremer to design a data-driven alternative to the customary gold and platinum discs that popular artists receive, and the result is AWESOME. Here's a detailed look at the considerations for the new design, with lots of graphics along the way. Stunning.
Sony Music Data and Insights | Nadieh Bremer

 

Data Elixir is curated and maintained by Lon Riesberg. If you need help on a data project or have a suggestion for the newsletter, reply to this email or grab a spot on my calendar.

 
 
Data Elixir, LLC
P.O. Box 21255
Boulder, CO 80308