— In the News —
In 2013, President Obama signed an executive order that made open and machine-readable data the new default for U.S. government information. It was one of Obama's hallmark achievements and led to several initiatives to scale up the accessibility of data across government sectors. As has been widely reported this week, Open Data appears to be Closed.
Facebook builds complex profiles of each of its users so it can offer data points to advertisers for targeting ads. Some data points are obvious, like your age and interests, but many may surprise you. This free tool reveals the unsettling amount of information Facebook tries to deduce about you.
— Sponsored Link —
Build a full-stack web application without being a full-stack web developer. Work in-depth with the technology you know. Rely on the Exaptive Studio for the rest, including the glue code. Come check out the Studio and get a free account.
— Tools and Techniques —
Developers often use the term "stack" to describe the software infrastructure their applications require. The concept is rarely applied to data systems, but it could be. In this post, Thomas Ebermann describes a Data Stack with a variety of solutions organized into five layers: Sources, Processing, Storage, Analysis, and Visualization. This is very well done and worthwhile.
DataKit is a tool to orchestrate applications using a Git-like dataflow. It revisits the UNIX pipeline concept, with a modern twist: streams of tree-structured data instead of raw text. DataKit allows you to define and build complex pipelines over version-controlled data.
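For a rough feel of what "streams of tree-structured data instead of raw text" can look like, here is a minimal sketch in plain Python. It is not DataKit's API: nested dictionaries stand in for the trees, and the stage names (`only_passing`, `add_summary`), the `pipeline` helper, and the sample records are all hypothetical, made up for illustration.

```python
from functools import reduce
from typing import Callable, Dict, Iterable, Iterator

# Assumption: a nested dict stands in for one tree-structured record.
Tree = Dict[str, object]
Stage = Callable[[Iterable[Tree]], Iterator[Tree]]


def only_passing(trees: Iterable[Tree]) -> Iterator[Tree]:
    """Hypothetical stage: keep only trees whose build status is 'ok'."""
    for tree in trees:
        if tree["build"]["status"] == "ok":
            yield tree


def add_summary(trees: Iterable[Tree]) -> Iterator[Tree]:
    """Hypothetical stage: attach a short 'repo@commit' summary to each tree."""
    for tree in trees:
        yield {**tree, "summary": f'{tree["repo"]}@{tree["commit"]}'}


def pipeline(*stages: Stage) -> Callable[[Iterable[Tree]], Iterator[Tree]]:
    """Chain stages left to right, like `a | b | c` in a shell."""
    def run(source: Iterable[Tree]) -> Iterator[Tree]:
        return reduce(lambda stream, stage: stage(stream), stages, iter(source))
    return run


# Made-up sample data: two snapshots of a repository with build results.
snapshots = [
    {"repo": "demo", "commit": "a1", "build": {"status": "ok"}},
    {"repo": "demo", "commit": "b2", "build": {"status": "failed"}},
]

results = list(pipeline(only_passing, add_summary)(snapshots))
```

Each stage receives whole structured records rather than lines of text, so downstream stages can address fields directly instead of re-parsing strings.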
This tutorial provides a framework for working through time series forecasting problems using Python. It's very well structured and includes lots of code snippets and suggestions for further exploration.
Here's how the team at Silicon Valley Data Science got TensorFlow to work on a Raspberry Pi. It's not a tutorial but rather a practical look at their approach, with thoughts about the decisions made along the way.
— Resources —
A well-organized and comprehensive collection of curated Jupyter Notebooks. Highly recommended.
Many people forget that the hardest part of building a new AI solution is not the AI or the algorithms; it's the data collection and labeling. This collection of standard datasets can be used for validation or as a starting point for building a more tailored solution.
— Career —
Interested in introductory courses in Data Science? Definitely don't miss this article.