— In the News —
Microsoft is submerging a data center in the ocean. That might sound crazy, but the deeper you dive into this story, the more reasonable this "moonshot" idea seems. It's pretty awesome, actually.
Researchers simulated the entire soccer tournament 100,000 times to predict the outcome! Here's an overview of their approach. If you're interested in the details, along with a few other ideas, check out these papers on arXiv.org.
— Sponsored Link —
Future-proof your career in an increasingly data-driven world with a master’s degree from James Cook University. Demand for advanced data analysis skills is growing rapidly, with an expected 38,000 new jobs by 2021 (Deloitte Access Economics, 2018). Capitalize on this growth with a stand-out qualification you can study entirely online, without compromising your work or family commitments.
— Tools and Techniques —
Sure, the applications may be tiny. But the market here is "massive" and "untapped."
Feature engineering is a vital part of machine learning and is typically a manual process that relies on domain knowledge, intuition, and data wrangling skills. This tutorial shows how to use the Python featuretools library to automate at least some of the process and create numerous features with minimal effort.
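To give a feel for what "automating" feature engineering means, here is a minimal sketch of the underlying idea: apply a fixed set of aggregation functions to every numeric column of a child table, grouped by a parent key. This is not the featuretools API. It uses only the Python standard library, and the `transactions` data, the `PRIMITIVES` table, and the `synthesize_features` helper are all hypothetical names made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical child table: one row per transaction, keyed by customer_id.
transactions = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 35.0},
    {"customer_id": 2, "amount": 5.0},
    {"customer_id": 2, "amount": 15.0},
    {"customer_id": 2, "amount": 10.0},
]

# Aggregation functions applied to every numeric column (illustrative set).
PRIMITIVES = {"sum": sum, "mean": mean, "max": max, "min": min, "count": len}


def synthesize_features(rows, key, numeric_columns):
    """Build one feature per (primitive, column) pair for each parent key."""
    # Collect the values of each numeric column per parent key.
    grouped = defaultdict(lambda: defaultdict(list))
    for row in rows:
        for col in numeric_columns:
            grouped[row[key]][col].append(row[col])

    # Apply every primitive to every collected column.
    features = {}
    for parent_id, columns in grouped.items():
        features[parent_id] = {
            f"{name}({col})": func(values)
            for col, values in columns.items()
            for name, func in PRIMITIVES.items()
        }
    return features


features = synthesize_features(transactions, "customer_id", ["amount"])
for customer_id, feats in sorted(features.items()):
    print(customer_id, feats)
```

Adding a column or a primitive multiplies the number of generated features without any extra hand-written code, which is roughly the appeal of the automated approach. A real library would also handle multiple related tables, data types, and time cutoffs, none of which this sketch attempts.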
There's a big difference between creating machine learning models and creating software that runs in production environments. This post explores the ways that model development does not equal software development and it offers practical considerations for closing the gap.
This 32-part course uses tutorials, quizzes, hands-on assignments, and real-world projects to teach data science, along with advanced Python tools for data science. You can think of this list as a "Free Online Nano Book".
RoboSat is an end-to-end pipeline written in Python 3 for feature extraction from aerial and satellite imagery. Features can be anything that is visually distinguishable in the imagery, such as buildings, parking lots, roads, or cars. This is a new open source project from Mapbox, which includes support for using their Maps API as well as your own imagery.
— Podcasts —
The latest season of Stanford's Raw Data podcast traces the origins of power in Silicon Valley. Data, of course, is an important part of the story, which evolves like a novel. Start with The Triple Fence (Episode 5) if it's the data part that interests you most. That said, the entire season is fantastic if you're interested in how a tech scene like Silicon Valley gets started.
— Data Viz —
In this article, Elijah Meeks helps to demystify D3.js by exploring its structure and separating it into more manageable pieces. It's useful for both D3 novices who are trying to figure out which parts to learn and also for experts who are interested in making better use of the library. This is a must-read for anyone doing data visualization work on the web.