— Insight —
After years of scandals involving the health of its players, the NFL wants to use data and machine learning to reduce risk and prevent injuries. Here's why data won't change the game.
Software can just be rewritten, so intuitively this makes perfect sense. But since people are behind the algorithms, the real problems are complex and nuanced.
— Tools and Techniques —
This tool by Justin Bois makes it easy to explore commonly used probability distributions, including information about the stories behind them, their probability mass/probability density functions, their moments, etc. Each distribution includes interactive vignettes and syntax for NumPy, SciPy, and Stan.
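To give a feel for what a probability mass function and its moments look like in practice, here is a minimal sketch for a binomial distribution. It is not taken from the tool itself: it uses only the Python standard library rather than NumPy, SciPy, or Stan, and the function names and the parameters `n = 10`, `p = 0.3` are illustrative assumptions.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_moments(n, p):
    """Mean and variance computed directly from the PMF (hypothetical helper)."""
    support = range(n + 1)
    pmf = [binomial_pmf(k, n, p) for k in support]
    mean = sum(k * w for k, w in zip(support, pmf))
    variance = sum((k - mean) ** 2 * w for k, w in zip(support, pmf))
    return mean, variance


# Illustrative parameters, chosen only for this example.
n, p = 10, 0.3
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))
mean, variance = binomial_moments(n, p)

print(f"PMF sums to {total:.6f}")
print(f"mean = {mean:.4f}, variance = {variance:.4f}")
```

The computed values should agree with the closed forms for the binomial, mean `n * p` and variance `n * p * (1 - p)`, which is a handy sanity check when exploring any distribution.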
Nice tutorial by Jovan Veljanoski that shows how to use the Vaex library for working with datasets that fit on your hard drive but are too large for RAM. Vaex is an open-source DataFrame library that enables visualization, exploration, analysis, and even machine learning with tabular datasets that are as large as your hard drive.
Metaflow is an end-to-end workflow tool from the Machine Learning Infrastructure team at Netflix. It helps you design your workflow, version experiments, deploy models to production, run them at scale, and inspect results in notebooks, all without engineering expertise.
This short rant on the TensorFlow developer experience struck a nerve for many this past week. Things are moving fast, but the problems here are partly rooted in organizational politics, and those, unfortunately, tend to be some of the hardest problems to work through.
Here's a nice tool for arXiv users. Fermat's Librarian is an extension for Chrome that provides direct links to references, BibTeX extraction and comments on all arXiv papers.
While powerful cloud-based analytics brings incredible benefits to data-driven organizations, it comes with the risks of data breaches, noncompliance with data regulations, and unrestricted access to sensitive data. Join Databricks and Immuta for a webinar on 12/11 as we explore this common challenge facing data science teams.
— Data Viz —
Vega-Lite 4 is out! This is a major release that includes a variety of new interactive features, new transforms (density, regression, quantiles), responsive sizing and more. See these visual release notes for details.
— Career —
Some of the entries in this Reddit thread are definitely low and some are OMG high. Either way, there's a lot of useful info here about compensation packages in a variety of industries around the world.