— In the News —
This week, Facebook, Google, Microsoft, and Twitter officially unveiled the Data Transfer Project, which allows users to move their data between providers. The stated intention is to "improve data portability." Whether or not that's the real motive, this project opens the door for new entrants to create business models around user data that was previously locked up. The announcement includes a white paper and technical overview.
Insightful post by Ben Lorica that explores the value, and the costs, of data. Includes discussion of the costs of acquiring data, the liabilities of holding it, business valuations that result from data, data liquidity, data marketplaces, and more.
— Sponsored Link —
Strata Data Conference is happening in New York September 11-13.
Get an insider’s look at data’s latest developments, review extraordinary technical expertise, and tap the brightest minds in data discussing the most important issues. Also, get full access to networking events, Findata Day, and Strata Business Summit.
Register now before Early Price discounts end on Friday, July 27.
Special for Data Elixir members: Save an additional 20% on Gold, Silver, Bronze, and Findata Day passes with code DE20
— Tools and Techniques —
This is a fantastic overview of how decision trees work by Brandon Rohrer. Includes lots of diagrams, easy-to-follow descriptions, and a short video if you'd rather watch.
In this continuation of his super popular series, Ben Weber introduces the importance of creating services that other teams (and/or products) can use. This can be especially challenging in a resource-constrained environment, such as a startup, so he also offers a step-by-step guide for setting up such a service using AWS. This is a great tutorial.
Evolutionary algorithms are inspired by natural selection and are super interesting for solving search and optimization problems. Together, two articles this week make for a great primer. The first article, from the MIT Tech Review, explores how evolutionary algorithms work, and the second is a step-by-step tutorial that shows how to build one from scratch using Python.
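For a quick feel of the idea before diving into the articles, here is a minimal sketch of an evolutionary algorithm. It is not taken from either article. It assumes a toy problem (evolve a list of 20 bits toward all ones), and the names fitness, mutate, crossover, and evolve, along with every parameter value, are illustrative choices only.

```python
import random


def fitness(genome):
    # Toy objective (an assumption for this sketch): count the 1 bits.
    return sum(genome)


def mutate(genome, rate, rng):
    # Flip each bit independently with probability `rate`.
    return [1 - g if rng.random() < rate else g for g in genome]


def crossover(a, b, rng):
    # Single-point crossover: head of one parent, tail of the other.
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def evolve(genome_len=20, pop_size=30, generations=100,
           mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        if fitness(population[0]) == genome_len:
            break  # a perfect genome has been found
        # Selection: the fitter half survives unchanged.
        parents = population[: pop_size // 2]
        # Reproduction: refill the population with mutated offspring.
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng),
                   mutation_rate, rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)


if __name__ == "__main__":
    best = evolve()
    print(fitness(best), best)
```

Because the fitter half always survives unchanged, the best score never drops from one generation to the next. Real applications swap in a problem-specific fitness function and often use fancier selection schemes.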
Many leaders only use data to feel better about decisions they’ve already made. It's called "confirmation bias" and it can completely undermine your team's efforts to be data-driven. This short post offers a practical approach for making sure your data science efforts are more than just an expensive hobby.
— Resources —
I found the Linear Digressions Podcast last week, and in a recent episode, Katie introduced this gem of a site. ShortScience.org is a platform for discussing research papers, which may not sound interesting, but it also serves as an easy way to keep up with the papers that are being discussed. The site currently has 800+ summaries, mostly in machine learning.
— Data Viz —
Get an intuitive perspective of how map projections affect the display of geographical data with this interactive online tool. There are 63 projections here to choose from!
This new book club from the folks at Datawrapper might be just the thing you need to inspire and advance your interest in data visualization. The first book is the classic The Visual Display of Quantitative Information by Edward Tufte. Check out the post for details.