Issue 36
— In the News —
Health and Data: Can Digital Fitness Monitors Revolutionize Our Lives?
Wearable devices have enabled people to track their activities and health stats in ever-increasing levels of detail. This is a great overview of how useful that tracking is, some very interesting upcoming technologies, and the issues we should all be concerned about. Highly recommended.
Machine-Learning Algorithm Mines Rap Lyrics, Then Writes Its Own
Researchers are claiming that an automated rap-generating algorithm pushes the boundaries of machine creativity. Here's a good overview of how it works along with sample lyrics. How good are they? Check it out.
An NPR Reporter Raced A Machine To Write A News Story. Who Won?
This NPR story pits a seasoned White House correspondent against a machine. You can probably guess who wins the race, but can you tell who wrote what?
— Sponsored Link —
Check out this lineup!
This two-day event in San Francisco will begin with a day of presentations by Data Science experts such as UC Berkeley Professors Mike Franklin and Mike Jordan, Stanford Professor Rob Tibshirani, UW Professor Carlos Guestrin, CMU Professor Alex Smola, and more. See the complete list of speakers here. The second day, 7/21/15, will be the 4th year of the Dato conference (previously referred to as The GraphLab Conference), an event focused on machine learning implementation with GraphLab Create, including data science tutorials and case studies. Data Elixir readers can save 35% by using "DataElixir" as the discount code during registration! Offer expires on June 1st. Register Now and Save!
— Tools and Techniques —
Scientists Tell Us How Your Old Pics Will Change Time-lapse Photography
Fascinating project. A group of researchers at Google and the University of Washington mined 86 million photos from the Internet and automatically pieced them together to create time-lapse sequences of popular places. The video is just 5 minutes and very worthwhile.
Scrape Website Data With The New R Package rvest
Great tutorial by Zev Ross for extracting data from website tables and lists using the R package, rvest. This is well-written with code snippets to make it easy to follow. Along with demonstrating rvest, the tutorial includes steps to geocode and map the extracted demo data.
Top 10 Data Mining Algorithms in Plain English
This is a great introduction to popular data mining algorithms. Each section includes a description of the algorithm, related terms, common use cases, and linked references. If you're not already a data mining pro, this article is Highly Recommended.
The Unreasonable Effectiveness of Recurrent Neural Networks
Here's the latest gem by Andrej Karpathy. This is a clearly presented deep dive into recurrent neural networks with a GitHub repo of code to go along with the article. There's also a worthwhile discussion about the article on Hacker News. Highly Recommended.
— Resources —
A Crash Course in Python for Scientists
If you're just getting into Python, this is a nice place to start. Rick Muller of Sandia National Laboratories created this tutorial to help colleagues come up to speed quickly. It starts with the basics and quickly gets into Numpy, Scipy, Matplotlib, and code optimization. It's presented as an IPython notebook and includes lots of code snippets and references.
Deep Learning - An MIT Press Book in Preparation
MIT Press' Deep Learning textbook is currently in development and is worth paying attention to. Part I, which includes applied math and machine learning basics, is now complete. Large sections of parts II and III are also available and cover practical deep networks and deep learning research.
— Inspiration —
Windyty
This is an inspiration on many levels. In 2012, Fernanda Viégas and Martin Wattenberg created their well-known Wind Map using data from the National Digital Forecast Database. Wind Map inspired Cameron Beccario to create his Earth project, which includes global data and a variety of additional options. Cameron open-sourced his project and Ivo Lukačovič used that as the foundation for Windyty. Prepare to be amazed.
— About —
Data Elixir is curated and maintained by @lonriesberg. If you find this newsletter worthwhile, please help spread the word by forwarding it to your colleagues or sharing it on your favorite network. Thanks!
Sign up for free and join the thousands of data lovers who already start their week with us.
No spam, ever. We'll never share your email address and you can opt out at any time.