— In the News —
Enter the first-ever National Data Science Bowl, co-sponsored by Booz Allen and Kaggle in partnership with Oregon State University's Hatfield Marine Science Center. At stake? The very health of our oceans. The competition looks well organized and well funded, with awards totaling $175K. This looks fantastic!
One of the last bastions of human mastery over computers is about to fall. This article describes why Go is difficult for computers to master and how they're mastering it anyway.
Great article about how easy it is to exploit data that seems completely harmless. This article should creep you out. Really.
— Tools and Techniques —
Pattern is a web mining module for Python. It has tools for data mining (Google, Twitter and Wikipedia APIs, a web crawler, an HTML DOM parser), natural language processing (part-of-speech taggers, n-gram search, sentiment analysis, WordNet), machine learning (vector space model, clustering, SVM), network analysis and visualization. It's free, well-documented, and comes with lots of examples and unit tests.
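For a feel of what a "vector space model" does, here is a minimal sketch in plain Python. It does not use Pattern's API; the function names (`term_vector`, `cosine_similarity`, `most_similar`) and the sample documents are made up for illustration. It assumes simple word counts as vector weights, whereas real toolkits typically add weighting schemes such as TF-IDF.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_vector(text):
    """Represent a document as a bag-of-words vector (term -> raw count)."""
    return Counter(tokenize(text))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(query, docs):
    """Return the key of the document in `docs` closest to `query`."""
    q = term_vector(query)
    return max(docs, key=lambda name: cosine_similarity(q, term_vector(docs[name])))


if __name__ == "__main__":
    # Hypothetical mini-corpus, loosely based on this issue's links.
    docs = {
        "bowl": "classify plankton images from the ocean",
        "go": "computers learning to play the board game go",
    }
    print(most_similar("ocean plankton images", docs))
```

Documents that share more terms point in a more similar direction, which is the idea that clustering and search build on.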
This is a hands-on tutorial for learning about deep learning. It uses Kaggle's Facial Keypoints Detection challenge as a vehicle and introduces the Lasagne library for building neural networks with Python and Theano. With sample code, visualizations, and detailed instructions, this looks very worthwhile.
— Resources —
Nice collection of free data sources from around the world.
Interested in a job as a "data scientist" but not sure what that means exactly? Sebastian Gutierrez is the author of the recently released "Data Scientists at Work" and hosted this Reddit "Ask Me Anything" post about the book. Super insightful.
40 data viz interviews of 2014: a great collection of data viz interviews, curated by Visualoop.
— Data Viz —
It's pretty hard to come up with the best collection of data visualization projects, but this is certainly a very good one. Some of these are right at the top of my own list of favorites for 2014. Interestingly, every one of them is interactive, animated, or both. There's definitely a lot to explore here.
This is a collection of "physical visualizations," which aren't exactly data visualizations but representations of data in physical form. Most of these are modern, but some go back hundreds or even thousands of years. This might be my favorite find yet.