ISSUE 317 · January 5, 2021

In the News

Medicine's Machine Learning Problem
Data science has the potential to fundamentally alter medicine, but despite that promise, biased datasets and unaccountable algorithms threaten to further disempower patients.

Inside India’s booming dark data economy
In India, the black market for data resembles markets for wholesale vegetables and smuggled goods. Customers are encouraged to buy in bulk, and the variety of what’s available is mind-boggling.

Trends

Machine learning is going real-time
In her latest post, Chip Huyen explores the current state of real-time, online machine learning, including use cases, solutions, and challenges. After interviews with about a dozen companies, she concludes that "machine learning is going real-time, whether you’re ready or not."

Sponsored Link

2021: The year you make ML (and your job) easier
Make it easy to get your ML projects from experiment to production. Comet automatically tracks datasets, code changes, experimentation history, and production models, so you can focus on data science. Get started today with the free community edition.

Tutorials, Projects & Opinions

The value of p
What we lose if we abandon p values.

NumPy Illustrated
This visual guide is a great introduction to NumPy and has gotten a lot of attention around the web recently. It builds on a previous article by Jay Alammar and covers a broad range of operations on vectors, matrices, and higher-dimensional arrays.

Code & Tools

Top 10 Python libraries of 2020
Tryolabs' annual list of top Python libraries is consistently a must-read post. This year's list is tailored for data science and ML and includes tools for high-dimensional plotting, config management, forecasting, command line interfaces, productivity, outlier detection, and more.

Uncertainty Toolbox
This Python toolkit provides standard metrics to quantify and compare uncertainty estimates from ML methods. It also offers intuition for these metrics, produces visualizations, and implements simple "re-calibration" procedures to help improve the uncertainties. The project is well documented and includes a collection of linked references.

JupyterLab 3.0 is released!
JupyterLab 3.0 is a major release whose key features include a visual debugger, a table of contents for notebooks, multiple display languages, and a much improved extension system. See this announcement post for details.

Resources

2020: A Year Full of Amazing AI papers - A Review
This curated list of the latest breakthroughs in AI includes short video explanations, links to in-depth articles, and code.

⚽ Analytics 2020 Review
Awesome roundup of ⚽ analytics content from 2020. Covers research papers, blog posts, podcasts, talks, Python libraries, datasets, and more!

Data Elixir is curated and maintained by Lon Riesberg. For full-text search of prior issues, visit Data Elixir's Search Page. If you have suggestions or questions for the newsletter, just reply to this email.