ISSUE 272 · February 18, 2020

Insight

The New Business of AI (and How It’s Different From Traditional Software)
"Just as SaaS ushered in a novel economic model compared to on-premise software, we believe AI is creating an essentially new type of business."

The Map of Mathematics
Nice rabbit hole! Explore modern mathematics and how its major elements fit together in this interactive project from Quanta Magazine.

Profiles

The messy, secretive reality behind OpenAI’s bid to save the world
Karen Hao spent half a year digging into OpenAI, one of the leading AI research labs in the world. The lab is intended to "ensure that artificial general intelligence benefits all of humanity." Following dozens of interviews, here's Karen's inside look at how that's going.

Sponsored Link

The Expense of Poorly Labeled Data: An Experiment in Distortion
What happens when you train a machine learning model on biased data? In this article, we take a good data set that is conducive to modeling and compare the effects of random and biased distortion. The analysis illustrates how biased distortion is demonstrably worse and will ruin a dataset and any model trained on it.

Tools and Techniques

Quantifying Independently Reproducible ML
After attempting to reproduce results from 255 papers (!), Edward Raff, Chief Scientist at Booz Allen, distilled 26 key features of reproducibility. What makes a machine learning paper reproducible? Read this!

Five Interesting Data Engineering Projects
Nice introduction to a few projects that are worth being familiar with.

Understanding Maximum Likelihood
This interactive post by Kristoffer Magnusson is a great explainer of maximum likelihood estimation and some common hypothesis tests, such as the likelihood ratio test, Wald test, and score test.

Comet: Machine Learning Experiment Management
Join tens of thousands of data scientists worldwide who use Comet.ml. Automatically track, compare, explain, and reproduce your ML models and experiments. Sign up for free.

Resources

Machine learning in Python: Main developments and technology trends in data science, ML, and AI
This new survey paper explores the Python machine learning landscape with a focus on recent trends and developments. This is a well-written long read, with lots of references along the way. By Sebastian Raschka, Joshua Patterson and Corey J Nolet.

rstudio::conf 2020
All RStudio Conference 2020 videos are now available for streaming. There are over 100 talks here, covering a wide range of topics. Most of these talks are about 20 minutes and include links to related materials.

Data Viz

How big is that, though?
Even for people who work with maps a lot, it can be hard to grasp the size of things that are in the news. How big is that "2400 km² locust swarm that's devastating Kenya"? This tool by Hans Hack has the answer. It's simple and brilliant.

Communicating Model Uncertainty Over Space
Great post by Adam Pearce that walks through his process for designing an interface that shows ML model uncertainty. The vehicle is a model that's very good at detecting prostate cancer, but it's not perfect. How do you show a pathologist where a model can be trusted? Adam shows the strengths and weaknesses of 6 different approaches.

Upcoming Events

Job Board
New on the Job Board this week:

Data Elixir is curated and maintained by @lonriesberg. For additional finds from around the web, follow Data Elixir on LinkedIn, Twitter or Facebook.