— In the News —
Data science has the potential to fundamentally alter medicine, but despite that promise, biased datasets and unaccountable algorithms threaten to further disempower patients.
In India, the black market for data resembles markets for wholesale vegetables and smuggled goods. Customers are encouraged to buy in bulk, and the variety of what’s available is mind-boggling.
— Tools and Techniques —
In her latest post, Chip Huyen explores the current state of real-time, online machine learning, including use cases, solutions, and challenges. After interviews with about a dozen companies, she concludes that "machine learning is going real-time, whether you’re ready or not."
What we lose if we abandon p values.
This visual guide is a great introduction to NumPy and has gotten a lot of attention around the web recently. It builds on a previous article by Jay Alammar and covers a broad range of operations on vectors, matrices, and higher-dimensional arrays.
Tryolabs' annual list of top Python libraries is consistently a must-read post. This year's list is tailored for data science and ML and includes tools for high-dimensional plotting, config management, forecasting, command line interfaces, productivity, outlier detection, and more.
This Python toolkit provides standard metrics to quantify and compare uncertainty estimates from ML methods. It also offers intuition for these metrics, produces visualizations, and implements simple "re-calibration" procedures to help improve the uncertainties. This is well-documented and includes a collection of linked references.
JupyterLab 3.0 is a major release whose key features include a visual debugger, a table of contents for notebooks, multiple display languages, and a much-improved extension system. See this announcement post for details.
— Resources —
This curated list of the latest breakthroughs in AI includes short video explanations, links to in-depth articles, and code.
Awesome roundup of ⚽ analytics content from 2020. Covers research papers, blog posts, podcasts, talks, Python libraries, datasets, and more!