Issue 189
— Insight —
Why businesses fail at machine learning
Cassie Kozyrkov from Google offers practical insights into why businesses have a hard time implementing machine learning solutions. This is a short post with ideas you can act on. "If you're opening a bakery... are you in the business of making bread? Or making ovens?..."
If You Say Something Is “Likely,” How Likely Do People Think It Is?
People often use vague words to describe possible events. Words such as "likely," "probably," "maybe," and "almost certainly" aren't specific but they actually map to distinct probability intervals in people's heads. This study in the Harvard Business Review takes a look at the use of vague terms, including motivations for using them, how, specifically, common terms are interpreted, inherent problems, and practical suggestions based on their analysis.
Great Power, Great Responsibility: The 2018 Big Data & AI Landscape
Awesome overview of the current state of the Big Data & AI industries by Matt Turck of FirstMark Capital. Includes a discussion of what's changed over the past year, where things are going, and a comprehensive map of important players. Matt is super knowledgeable in this space and is one of the key writers I watch for industry insights.
— Sponsored Link —
Master's in Data Science Online from UC Berkeley
The Master of Information and Data Science (MIDS) is an online degree program for professionals looking to advance in the field of data science. The program’s multidisciplinary approach covers everything from planning and gathering data to analyzing and presenting findings. Course work in this program examines issues of security, explores machine learning, and considers techniques for data storage and management. Students in the program benefit from UC Berkeley’s strong ties to the Bay Area and Silicon Valley.
— Tools and Techniques —
Agile Analytics
For a lot of reasons, Agile development has become standard practice for software engineering teams. In this three-part series, Michael Kaminsky explores how Agile techniques can also be applied to analytics and why you'll want to.
Tensorflow: The Confusing Parts
Nice introduction to TensorFlow, especially for people who are already familiar with Machine Learning and Python. This tutorial is by Jacob Buckman, who's currently a Google AI Resident.
— Data Viz —
12 extensions to ggplot2 for more powerful R visualizations
For R users, ggplot2 is key for data viz and one of R's most widely used packages. It's based on the well-known Grammar of Graphics, which is an awesome foundation but may not always provide what you need. For those times when you need something different, ggplot2 extensions offer additional functionality. This curated collection of extensions is a useful starting point that includes use cases, screenshots and linked references for each extension.
H3: Uber’s Hexagonal Hierarchical Spatial Index
Great article from the Uber Engineering team about their geospatial indexing system called "H3." H3 uses a hexagonal grid that can be approximately subdivided into finer and finer hexagonal grids. What they're doing here is clever, it's open-source, and this article offers a nice overview of Uber's motivations for creating it and how to get started.
— Career —
Red Flags In Data Science Interviews
Interviewing is a two-way street. If you're looking for work, you should be interviewing the company as much as they're interviewing you. This post from Emily Robinson and Jonathan Nolis highlights key things to watch out for from the perspective of a data scientist. And if you're hiring for data science roles, there are useful insights here for you too.
— About —
Data Elixir is curated and maintained by @lonriesberg. For additional finds from around the web, follow Data Elixir on Twitter, Facebook, or Google Plus.
Sign up for Free
and join the thousands of data lovers who already start their week with us.
No spam, ever. We'll never share your email address and you can opt out at any time.