Tools and Techniques
You'd probably recognize a repetitive song when you heard one, but is it possible to measure repetitiveness? For instance, maybe you have a large dataset of songs and you're interested in discovering trends over time. You might try counting unique words, but you'd quickly discover that this doesn't really work. This post is very well done and is a fun read.
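To see why counting unique words falls short, here is a minimal sketch. It is not taken from the linked post; it assumes that compressed size is a reasonable proxy for repetitiveness, and the function names and toy "lyrics" are made up for illustration. Two texts built from exactly the same words get the same unique-word score, even though one repeats a phrase over and over and the other is shuffled.

```python
import random
import zlib


def unique_word_ratio(lyrics: str) -> float:
    """Fraction of words that are distinct (hypothetical baseline metric)."""
    words = lyrics.lower().split()
    return len(set(words)) / len(words) if words else 0.0


def compressed_fraction(lyrics: str) -> float:
    """Compressed size divided by raw size; lower suggests more repetition.

    Assumption: zlib (DEFLATE) stands in for whatever compressor you prefer.
    """
    raw = lyrics.lower().encode("utf-8")
    return len(zlib.compress(raw, 9)) / len(raw) if raw else 0.0


# Toy data: the same 12 words, 25 times each, in two different orders.
phrase = "one two three four five six seven eight nine ten eleven twelve".split()
repeated_words = phrase * 25             # the same line sung 25 times
shuffled_words = list(repeated_words)
random.Random(0).shuffle(shuffled_words)  # same words, no repeated lines

repeated = " ".join(repeated_words)
shuffled = " ".join(shuffled_words)

print("unique-word ratio:", unique_word_ratio(repeated), unique_word_ratio(shuffled))
print("compressed fraction:", compressed_fraction(repeated), compressed_fraction(shuffled))
```

The unique-word ratio cannot tell the two texts apart because it ignores word order, while the compressed fraction is noticeably lower for the text that repeats whole lines.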
This Stanford course is a great introduction to TensorFlow for deep learning. The course ended in March, but the notes, slides, and examples are still available via the online syllabus.
If you think your privacy is ensured in aggregated and anonymized data, think again. This summary of a recent paper shows how to identify individual users in aggregated mobile data. Although the techniques used here are specific to mobility data, their cleverness offers insights into how vulnerable "privacy" really is.
This open source tool lets you use SQL-like queries to search through your file system.