— In the News —
The amount of data being collected from athletes on the field is unprecedented and is radically changing the game. Lots of interesting issues here.
Thick Data is difficult to quantify, but it's vital for making Big Data work. This is a worthwhile exploration of why you should care and how to approach it.
Marine biologists crowdsourced a facial-recognition algorithm to help them identify individual whales. This article in The Atlantic is a great story about how it all came together. The winning solutions are detailed here:
— Tools and Techniques —
Great set of projects that were developed for an introductory artificial intelligence course at UC Berkeley. The projects use Pac-Man as the vehicle to teach foundational AI concepts.
Fun, interactive guide to thinking about complex systems.
Free and open source face recognition with deep neural networks. Uses Python and Torch and is gaining fans fast!
— Resources —
Kaggle is not just hosting datasets here. It is offering a platform for working with the data, sharing results, and discussing them. This will be an interesting project to watch.
Yahoo recently released a 13TB dataset with the news-reading habits of some 20 million users. The goal is to help researchers create software that’s better at predicting what we want. This post on the Yahoo Labs Blog is a good overview of the data and how to get it.
Here are 17 R packages that connect to a variety of curated data sources. These are well documented and include useful examples.
Need image data? Here you go!
— Data Viz —
384 data visualization tools and resources with lots of filters to help you find things. Contributions are welcome and easy to provide via the Google Sheet that powers this site.
If you've been interested in learning D3 but aren't sure where to start, START HERE! The curriculum that Lynn Cherny developed for this introductory, semester-long D3 course is fantastic.