— Insight —
Predictive performance is just the beginning of how you should be evaluating your models. In this post, Christoph Molnar takes a step back and explores what you should really be thinking about.
A data subscription business may seem like a great idea, but if the value of the information erodes as more people know it, you have a problem. In this post, Mark Johnson, CEO of Descartes Labs, describes the thinking behind their pivot from selling data to selling insights.
Stop trying to produce the right answers. "You're probably asking the wrong question anyway."
— Sponsored Link —
In a recent post, Derwen’s Paco Nathan reveals themes for the upcoming Rev Summit and previews what he is most excited for at the conference.
Come to New York City on May 23–24 to learn from data science teams and leaders at Netflix, Slack, Stitch Fix, Domino Data Lab, Microsoft, Dell, Red Hat, Google, Turner Broadcasting System, Humana, Workday, Lloyds Banking Group, BNP Paribas Cardif, and many others!
Use exclusive promo code Data-Elixir_Rev to get $100 off your order!
— How-to —
This four-part course explores a variety of techniques used for natural language processing, including basic statistical models, extracting information from large volumes of text, working with pipelines, and training neural net models.
When the Mueller Report came out last week, the data community was quick to start analyzing it. This curated collection of tweets highlights some of the most interesting posts from around the R community, covering data collection, ways to analyze the data, visualization and, of course, ████ and ████.
Great post on Lyft's Engineering blog about the value of metadata: what it is, how it's useful, and the thinking behind the data discovery platform that has enabled Lyft's data scientists to be more effective.
Great introduction to the Hamiltonian Monte Carlo method, which is used by software like PyMC3 and Stan. This is the second in a series of posts by Colin Carroll that shows how to implement gradient-based samplers.
— Data Viz —
The information design studio Fathom is well known for building tools that deconstruct long documents into understandable components. Their latest project, ConText, enables you to quickly gather impressions of how people and groups are described in the Mueller Report.
The latest edition of the R Graphics Cookbook is now available online for free. This cookbook offers more than 150 "recipes" for generating high-quality graphs quickly. If you prefer print, it's also available for purchase online.