— In the News —
Wearable devices have enabled people to track their activities and health stats in ever-increasing levels of detail. This is a great overview of how useful that is, some very interesting upcoming technologies, and things we should all be concerned about. Highly recommended.
Researchers are claiming that an automated rap-generating algorithm pushes the boundaries of machine creativity. Here's a good overview of how it works along with sample lyrics. How good are they? Check it out.
This NPR story pits a seasoned White House correspondent against a machine. You can probably guess who wins the race, but can you tell who wrote what?
— Sponsored Link —
This two-day event in San Francisco will begin with a day of presentations by data science experts, including UC Berkeley Professors Mike Franklin and Mike Jordan, Stanford Professor Rob Tibshirani, UW Professor Carlos Guestrin, and CMU Professor Alex Smola. See the complete list of speakers here.
The second day, 7/21/15, will be the 4th year of the Dato Conference (previously known as the GraphLab Conference), an event focused on machine learning implementation with GraphLab Create, including data science tutorials and case studies.
Data Elixir readers can save 35% by using "DataElixir" as the discount code during registration! Offer expires on June 1st. Register Now and Save!
— Tools and Techniques —
Fascinating project. A group of researchers at Google and the University of Washington mined 86 million photos from the Internet and automatically pieced them together to create time-lapse sequences of popular places. The video is just 5 minutes long and well worth watching.
Great tutorial by Zev Ross on extracting data from website tables and lists using the R package rvest. It is well written, with code snippets that make it easy to follow. Along with demonstrating rvest, the tutorial includes steps to geocode and map the extracted demo data.
This is a great introduction to popular data mining algorithms. Each section includes a description of the algorithm, related terms, common use cases, and linked references. If you're not already a data mining pro, this article is highly recommended.
— Resources —
If you're just getting into Python, this is a nice place to start. Rick Muller of Sandia National Laboratories created this tutorial to help colleagues get up to speed quickly. It starts with the basics and quickly gets into NumPy, SciPy, Matplotlib, and code optimization. It's presented as an IPython notebook and includes lots of code snippets and references.
MIT Press's Deep Learning textbook is currently in development and is worth paying attention to. Part I, which covers applied math and machine learning basics, is now complete. Large sections of Parts II and III are also available and cover practical deep networks and deep learning research.
— Inspiration —
This is an inspiration on many levels. In 2012, Fernanda Viégas and Martin Wattenberg created their well-known Wind Map using data from the National Digital Forecast Database. Wind Map inspired Cameron Beccario to create his Earth project, which includes global data and a variety of additional options. Cameron open-sourced his project and Ivo Lukačovič used that as the foundation for Windyty. Prepare to be amazed.