— In the News —
The data collections of big companies like Google, Apple, Facebook, Amazon, and Microsoft give them enormous advantages over potential new competitors. Most of us benefit from that data, and for a variety of reasons, typical antitrust arguments are hard to make. This article by Steve Lohr explores the issues and how they affect business and innovation.
It's being called "a landmark achievement for artificial intelligence." A poker-bot has defeated several professional players in Texas hold’em poker using an approach that's essentially an artificial gut feeling.
From deep learning to decoupling, here are the data trends to watch in the year ahead.
— Sponsored Link —
Blendo is an ETL-as-a-service platform on steroids. Load data from any source, into any data warehouse - like Redshift, BigQuery and MS SQL Server - in minutes. Focus on answering questions about your business, not building data pipelines. Get started for free now!
— Tools and Techniques —
word2vec is a group of algorithms that transform words into vectors, so that words with similar meanings end up close to each other. Moreover, it allows us to use vector arithmetic to work with analogies, for example the famous king - man + woman = queen. Here's a clear description of how it works, including a helpful interactive.
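For a feel of the analogy trick, here is a minimal sketch in plain Python. It does not train word2vec: the three-dimensional vectors and the `analogy` helper are made up for illustration, and real embeddings have hundreds of learned dimensions. It only shows the arithmetic, which is to compute king - man + woman and pick the remaining word with the highest cosine similarity.

```python
import math

# Hand-made toy vectors (hypothetical values, NOT trained word2vec output).
# The three dimensions loosely stand for: royalty, maleness, femaleness.
TOY_VECTORS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c, vectors=TOY_VECTORS):
    """Return the word closest to vec(a) - vec(b) + vec(c).

    The three input words are excluded from the candidates, which is the
    usual convention when evaluating analogies.
    """
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


result = analogy("king", "man", "woman")
print(result)  # queen
```

With trained embeddings the same idea applies, but the nearest neighbor is searched over the whole vocabulary and the match is approximate rather than exact.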
This is the end of a fantastic series by David Robinson on empirical Bayesian methods. In this post, David explores the question, do these methods actually work? If you're new to the series, the previous 9 posts are linked at the top.
This is a pretty awesome project that keeps getting resurrected in my inbox. While you may not need an automated Boss Sensor of your own, this is a fun and worthy read.
— Resources —
Two University of Washington professors have created this free online seminar to teach people how to think critically about data. Aside from the cheekiness, this is not a joke. Check out the Syllabus and Case Studies. This is a well-crafted curriculum for a smart audience.
A free book about using NumPy to write efficient Python code, especially for scientific applications.
— Challenges —
This year's Data Science Bowl is a competition to develop cancer detection algorithms. $1,000,000 in prizes will be awarded to those who observe the right patterns, ask the right questions, and, in turn, create unprecedented impact around cancer screening, care, and prevention.