Tools and Techniques
Developers often use the term "stack" to describe the software infrastructure their applications require. The term is rarely applied to data systems, but it sure could be. In this post, Thomas Ebermann describes a Data Stack with a variety of solutions organized into five layers: Sources, Processing, Storage, Analysis, and Visualization. It is very well done and worth reading.
DataKit is a tool to orchestrate applications using a Git-like dataflow. It revisits the UNIX pipeline concept, with a modern twist: streams of tree-structured data instead of raw text. DataKit allows you to define and build complex pipelines over version-controlled data.
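To make the "streams of tree-structured data" idea concrete, here is a minimal Python sketch of a pipeline whose stages pass nested records to one another instead of lines of text. It does not use DataKit's actual API and leaves out version control entirely; the stage names (keep_passing, add_summary), the pipeline helper, and the sample records are all hypothetical, chosen only for illustration.

```python
from functools import reduce
from typing import Callable, Iterable, Iterator

# Assumption: a nested dict stands in for one tree-structured record.
Tree = dict
Stage = Callable[[Iterator[Tree]], Iterator[Tree]]


def keep_passing(trees: Iterator[Tree]) -> Iterator[Tree]:
    """Hypothetical stage: drop records whose build did not succeed."""
    for tree in trees:
        if tree["build"]["status"] == "ok":
            yield tree


def add_summary(trees: Iterator[Tree]) -> Iterator[Tree]:
    """Hypothetical stage: attach a summary built from nested fields."""
    for tree in trees:
        yield {**tree, "summary": f'{tree["repo"]}@{tree["build"]["commit"]}'}


def pipeline(*stages: Stage) -> Callable[[Iterable[Tree]], Iterator[Tree]]:
    """Compose stages left to right, like `a | b | c` in a shell."""
    def run(source: Iterable[Tree]) -> Iterator[Tree]:
        return reduce(lambda stream, stage: stage(stream), stages, iter(source))
    return run


def run_example() -> list:
    # Made-up sample data, not taken from any real project.
    source = [
        {"repo": "app", "build": {"commit": "a1b2", "status": "ok"}},
        {"repo": "lib", "build": {"commit": "c3d4", "status": "failed"}},
    ]
    return list(pipeline(keep_passing, add_summary)(source))
```

Because each stage receives whole records, it can address fields by path (tree["build"]["status"]) rather than re-parsing text at every step, which is the main difference from a classic text pipeline.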
This tutorial provides a framework for working through time series forecasting problems using Python. It's very well-structured and includes lots of code snippets and suggestions for further exploration.
Here's how the team at Silicon Valley Data Science got TensorFlow to work on a Raspberry Pi. It's not a tutorial but rather a practical look at their approach, with thoughts on the decisions made along the way.