ISSUE 348 · August 10, 2021

Insight

Data Monetization With Strategic Data Assets
Most data is underutilized. To use it more effectively, companies should transform it so it can be reused and recombined to create new value.

Sponsored Link

Understanding Bias in AI: What is it, how to identify it, and how to mitigate it.
It’s no secret...bias in AI is a problem. The more powerful AI capacity becomes, the more important it becomes to ensure that we are producing tools that make life better for more people, not simply scaling and perpetuating forms of error. That’s why we’ve created a guide to understanding and addressing possible sources of bias for your AI/ML project.

Tutorials, Projects & Opinions

How Airbnb Built “Wall” to prevent data bugs
This post continues a series that explores Airbnb's massive effort to ensure data quality across the organization. In this post, Subrata Biswas walks through the motivations, features, and architecture behind Airbnb's in-house data quality management framework called "Wall." For more in
this series, see the Archives.

Applications of survival analysis
Survival analysis has been a standard tool for decades in clinical research, but data scientists in other domains have mostly ignored it. This introduction to survival analysis highlights practical applications to help jump-start your creative problem solving.

ML Won't Solve Natural Language Understanding
In last week's Data Elixir, the posts "What Have Language Models Learned?" and the "Jessica
Simulation" were super popular. They're great demonstrations, but ultimately they treat language like data, and that approach needs to be reconsidered if we're going to make progress towards natural language understanding. Here's why.

Share Databases, Not Files
Help eliminate the CSV. bit.io is the fastest way to create, share, and collaborate on a real database. Get a Postgres-compatible database in one click with easy, secure permissions and zero management. Join us in creating a single place for public and private data that makes everyone immediately
productive with data. Try it now!

Resources

An Introduction to Statistical Learning (ISLR)
The long-awaited second edition of this popular statistics text has just been released and it's been significantly expanded. It covers some of the most important modeling and prediction techniques and is intended to be accessible to a broad audience. Free to download.

ISLR tidymodels Labs
If you're planning to read the 2nd edition of An Introduction to Statistical Learning (above), learn tidymodels too with this collection of labs that have been rewritten from the book to use tidymodels packages. As much as possible, these labs stay true to the original material.

📺 ML YouTube Courses
Great collection of some of the best and most recent machine learning courses available on YouTube.

Career

The local minima of suckiness
How to become good by getting slowly less bad.

Data Visualization

Visualizing a codebase
How could visualization be used to quickly show the structure of a codebase? Would that be useful? Check it out!

The Typical Weather Anywhere on Earth
Awesome visualizations for the typical weather at nearly 150K locations around the world. Get graphical reports for temperature, precipitation, cloud cover, wind, sunrise/sunset, and more.