— In the News —
In 2007, researchers started warning about ways that social interaction data might be used to predict and manipulate behaviour. They called it "computational social science," and while those warnings look prescient in light of current events, much bigger threats are looming in datasets that people rarely think about. This article in Wired looks at how the research has evolved, with an eye towards what we should really be worried about.
Talia Borodin, CEO of Amaro Science, explores four key questions you should answer before jumping on the data science bandwagon. Her approach here is well thought out and practical.
— Profiles —
The Tembé tribe from the central Amazon are hiding old cell phones in trees and using machine learning to listen for sounds of illegal logging. This post on Google's Blog profiles this ambitious project and the people behind it.
— Sponsored Link —
If you’re ready to learn what it means to be a data scientist, the skills necessary to become a data scientist and the steps to follow to get a job in data science, download Springboard’s comprehensive 70-page guide on How to get your first job in data science.
— Tools and Techniques —
This post on the Netflix Tech Blog explores the technical challenges of video streaming and how statistical models and machine learning techniques are used to overcome them. It's pretty high-level, but it offers an interesting perspective on the kinds of problems that data scientists at Netflix get to work on.
Pandas is super flexible and it's often possible to perform a given task in multiple ways. To help select which approach is best for a particular problem, this project provides benchmarks for different operations, given various DataFrame sizes.
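To give a feel for what this kind of benchmark looks like, here is a minimal sketch (not the linked project's code). It times two ways of adding two columns, a row-wise `apply` and a vectorized sum, across a few DataFrame sizes. The function names, column names, and sizes are assumptions chosen for illustration.

```python
import timeit

import numpy as np
import pandas as pd


def make_frame(n_rows, seed=0):
    """Build a hypothetical two-column DataFrame of random floats."""
    rng = np.random.default_rng(seed)
    return pd.DataFrame({"a": rng.random(n_rows), "b": rng.random(n_rows)})


def sum_with_apply(df):
    # Approach 1: call a Python function once per row.
    return df.apply(lambda row: row["a"] + row["b"], axis=1)


def sum_vectorized(df):
    # Approach 2: add whole columns at once.
    return df["a"] + df["b"]


def benchmark(sizes=(100, 1_000, 10_000), repeats=3):
    """Return the best-of-`repeats` wall time per approach and size."""
    methods = (("apply", sum_with_apply), ("vectorized", sum_vectorized))
    results = []
    for n in sizes:
        df = make_frame(n)
        for name, func in methods:
            timings = timeit.repeat(lambda: func(df), number=1, repeat=repeats)
            results.append({"rows": n, "method": name, "seconds": min(timings)})
    return pd.DataFrame(results)


if __name__ == "__main__":
    print(benchmark())
```

Both functions return the same values, so the only thing that varies is the time taken. Actual timings depend on your machine and pandas version, so treat the numbers as relative rather than absolute.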
LabNotebook is a tool that allows you to monitor, record, save, and query your machine learning experiments. This is an alpha version, so expect some issues, but it looks promising.
Monte Carlo Tree Search is commonly used in games for selecting the next best move. Here’s a solid introduction to get started with.
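For a concrete picture of the four steps (selection, expansion, simulation, backpropagation), here is a minimal sketch, not taken from the linked introduction. It assumes a toy Nim game in which players alternately take 1 to 3 stones and whoever takes the last stone wins. The game, the `Node` class, the function names, and the exploration constant are all illustrative choices.

```python
import math
import random


def legal_moves(pile):
    """A player may take 1, 2, or 3 stones, but not more than remain."""
    return [m for m in (1, 2, 3) if m <= pile]


class Node:
    def __init__(self, pile, parent=None, move=None):
        self.pile = pile              # stones left after `move` was played
        self.parent = parent
        self.move = move              # stones taken to reach this node
        self.children = []
        self.untried = legal_moves(pile)
        self.visits = 0
        self.wins = 0.0               # wins for the player who made `move`


def uct_child(node, c=1.4):
    # Trade off average reward against how rarely a child has been tried.
    def score(ch):
        explore = math.sqrt(math.log(node.visits) / ch.visits)
        return ch.wins / ch.visits + c * explore
    return max(node.children, key=score)


def rollout(pile, rng):
    """Play random moves; return True if the player to move now wins."""
    mover_is_start_player = True
    while pile > 0:
        pile -= rng.choice(legal_moves(pile))
        if pile == 0:
            return mover_is_start_player
        mover_is_start_player = not mover_is_start_player
    return False  # empty pile: the previous player took the last stone


def mcts(pile, iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(pile)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while not node.untried and node.children:
            node = uct_child(node)
        # 2. Expansion: add one untried move as a new child.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.pile - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new position.
        won = not rollout(node.pile, rng)  # from the mover's perspective
        # 4. Backpropagation: flip the result at each level going up.
        while node is not None:
            node.visits += 1
            node.wins += 1.0 if won else 0.0
            won = not won
            node = node.parent
    # Recommend the most-visited move at the root.
    return max(root.children, key=lambda ch: ch.visits).move


if __name__ == "__main__":
    print(mcts(3))
```

With three stones left, taking all three wins on the spot, so the search should settle on that move. Real game engines typically add domain knowledge to the rollouts and cap the search by time rather than by iteration count.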
monday.com is a visual and intuitive project management and software development tool. Manage your design layouts and coding tasks visually, collaborate with your team in real time, and see what everyone is working on at a glance.
— Data Viz —
Altair is a Python visualization library that's based on the Vega-Lite visualization grammar. Fundamentally, it's rooted in the "Grammar of Graphics," like ggplot2. The results are effective and beautiful and can be created with a minimal amount of code. This set of projects is very well done and this article by Jim Vallandingham is a great starting point.
Lisa Charlotte Rost from Datawrapper explores choropleth maps, including use-cases, practical how-tos, lots of examples and links to key references. It's a great tutorial and if you like these kinds of posts, Datawrapper's blog is also worth paying attention to.