ISSUE 357 · October 12, 2021

In the News

State of AI Report 2021
This slide deck is one of the most comprehensive resources you'll find about the current state of AI. It explores a variety of topics including AI research, talent supply, commercial applications, and predictions for the coming year. It's dense, but it's a very visual deck that's easy to follow.

Covid response hampered by population data glitches
Great Twitter thread about bad Covid data and how it's costing lives around the world. It turns out that many countries don't even know how many residents they have. This thread breaks down exactly how that's a problem and how to think about the data you see in the news, reports, papers, etc.

Sponsored Link

Distilled Insights from Data Leaders
The past year has been marked by a great acceleration. As a result, becoming data-driven is a priority for every organization. In this white paper, you will find distilled insights on data transformation, from enterprise CDOs to data startup CEOs, shared on the most useful episodes of DataCamp's DataFramed podcast. Download now.

Tutorials, Projects & Opinions

Interpreting A/B test results: false positives and statistical significance
Nice introduction to false positives and statistical significance, with simple examples along the way to build up intuition. This is the third post of a multi-part series on how Netflix uses A/B tests to inform decisions.

Choosing the right KPIs to evaluate your models
This first post of a two-part series introduces a framework for choosing the "right" KPIs to evaluate your ML models. In Part 2, Amir Dolev explores a related question: how do you know if your model is actually correct?

Pull Request Flow with usethis
If you use git and GitHub to collaborate on R packages or projects, then the "pr_" helper functions in the usethis package are super helpful. This flowchart/cheatsheet is a great reference that shows how.

Cracking the Mystery of Deep Learning
To help them explain the shocking success of deep neural networks, researchers are turning to older but better-understood models of ML.

Activity Schema: a faster, simpler data model
An activity schema enables faster querying, easier maintenance, and simpler governance. Create models with no dependencies or foreign keys, and query any combination of them to build all necessary tables for BI and analysis. Narrator is the only platform for building, maintaining, and querying an activity schema.

Resources

Bayesian Optimization Book
This text on Bayesian optimization aims to provide a self-contained and comprehensive introduction to Bayesian optimization, starting from scratch and carefully developing all the key ideas along the way. The intended audience is graduate students and researchers in machine learning, statistics, and related fields.

Data Visualization

Map Projection Playground
There are other places on the web that let you play with map projections but this new tool offers a wide selection of customization options, the code is easily edited, and the output can be exported as svg files for further editing in Illustrator or Inkscape.

Robservable
This package allows Observable notebooks to be used as htmlwidgets in R. It doesn't embed an entire notebook but rather, it lets you choose which cells to display, update cell values from R, and add observers to get cell values back into a Shiny application.