Issue 263
— Insight —
Why machine learning can't save the NFL
After years of scandals involving the health of its players, the NFL wants to use data and machine learning to reduce risk and prevent injuries. Here's why data won't change the game.
Biased Algorithms Are Easier to Fix Than Biased People
Software can just be rewritten, so intuitively this makes perfect sense. But since people are behind the algorithms, the real problems are complex and nuanced.
— Tools and Techniques —
Probability Distribution Explorer
This tool by Justin Bois makes it easy to explore commonly used probability distributions, including the stories behind them, their probability mass/probability density functions, their moments, etc. Each distribution includes interactive vignettes and syntax for NumPy, SciPy, and Stan.
How to analyze 100 GB of data on your laptop with Python
Nice tutorial by Jovan Veljanoski that shows how to use the Vaex library for working with datasets that fit on your hard drive but are too large for RAM. Vaex is an open-source DataFrame library that enables visualization, exploration, analysis and even machine learning with tabular datasets that are as large as your hard drive.
Open-Sourcing Metaflow, a Human-Centric Framework for Data Science
Metaflow is an end-to-end workflow tool from the Machine Learning Infrastructure team at Netflix. It helps you design your workflow, version experiments, deploy models to production, run them at scale and inspect results in notebooks, all without engineering expertise.
Tensorflow User Experience
This short rant on the TensorFlow developer experience struck a nerve for many this past week. Things are moving fast, but the problems here are partly rooted in organizational politics, and those, unfortunately, tend to be some of the hardest problems to work through.
Fermat's Library - Librarian Extension for Chrome
Here's a nice tool for arXiv users. Fermat's Librarian is an extension for Chrome that provides direct links to references, BibTeX extraction and comments on all arXiv papers.
Data Governance & Data Security for Cloud Analytics
While powerful cloud-based analytics brings incredible benefits to data-driven organizations, it comes with the risks of data breaches, noncompliance with data regulations, and unrestricted access to sensitive data. Join Databricks and Immuta for a webinar on 12/11 as we explore this common challenge facing data science teams.
// sponsored
— Data Viz —
Vega-Lite 4.0
Vega-Lite 4 is out! This is a major release that includes a variety of new interactive features, new transforms (density, regression, quantiles), responsive sizing and more. See these visual release notes for details.
— Conferences & Events —
Experiment Management: Rethink your Machine Learning Workflow
Join Comet.ml data scientist Niko Laskaris for a free webinar to learn how experiment management platforms help data scientists and teams track, compare, explain and reproduce their ML experiments, leading to improved team collaboration, productivity and visibility. Tuesday, December 10, 2pm ET. Register here.
— Career —
2019 End of Year Salary Sharing thread
Some of the entries in this Reddit thread are definitely low and some are OMG high. Either way, there's a lot of useful info here about compensation packages in a variety of industries around the world.