— Insight —
Cassie Kozyrkov from Google offers practical insights into why businesses have a hard time implementing machine learning solutions. This is a short post with ideas you can act on. "If you're opening a bakery... are you in the business of making bread? Or making ovens?..."
People often use vague words to describe possible events. Words such as "likely," "probably," "maybe," and "almost certainly" aren't specific, but they actually map to distinct probability intervals in people's heads. This study in the Harvard Business Review looks at the use of vague terms, including the motivations for using them, how common terms are actually interpreted, the problems they cause, and practical suggestions based on the authors' analysis.
Awesome overview of the current state of the Big Data & AI industries by Matt Turck of FirstMark Capital. Includes a discussion of what's changed over the past year, where things are going, and a comprehensive map of important players. Matt is super knowledgeable in this space and is one of the key writers I watch for industry insights.
— Sponsored Link —
The Master of Information and Data Science (MIDS) is an online degree program for professionals looking to advance in the field of data science. The program’s multidisciplinary approach encompasses everything from planning and gathering data to analyzing and presenting findings. Course work in this program examines issues of security, explores machine learning, and considers techniques for data storage and management. Students in the program benefit from UC Berkeley’s strong ties to the Bay Area and Silicon Valley.
— Tools and Techniques —
For a lot of reasons, Agile development has become standard practice for software engineering teams. In this three-part series, Michael Kaminsky explores how Agile techniques can also be applied to analytics and why you'll want to.
Nice introduction to TensorFlow, especially for people who are already familiar with machine learning and Python. This tutorial is by Jacob Buckman, who's currently a Google AI Resident.
— Data Viz —
For R users, ggplot2 is key for data viz and one of R's most widely used packages. It's based on the well-known Grammar of Graphics, which is an awesome foundation but may not always provide what you need. For those times when you need something different, ggplot2 extensions offer additional functionality. This curated collection of extensions is a useful starting point that includes use cases, screenshots and linked references for each extension.
Great article from the Uber Engineering team about their geospatial indexing system called "H3." H3 uses a hierarchical hexagonal grid in which each hexagon can be (approximately) subdivided into finer and finer hexagons. What they're doing here is clever, it's open-source, and this article offers a nice overview of Uber's motivations for creating it and how to get started.
— Career —
Interviewing is a two-way street. If you're looking for work, you should be interviewing the company as much as they're interviewing you. This post from Emily Robinson and Jonathan Nolis highlights key things to watch out for from the perspective of a data scientist. And if you're hiring for data science roles, there are useful insights here for you too.