— In the News —
This video by Frank Chen at Andreessen Horowitz explores what companies are doing with AI today and what’s bubbling up from the research community that’s just a few years out. This is a super interesting survey of where these technologies are going.
95% of Earth's peoples communicate with ~100 languages. That sounds like a lot but the remaining 5% use nearly 7000 languages! Automated translation services could help preserve these languages but the datasets that would be required don't exist. A new machine translation technique might change that.
— Sponsored Link —
ActivePython comes pre-bundled with essential tools for data preparation, analysis, visualization and machine learning. Connect to your data sources and start being productive with all the hard-to-build libraries ready to go. Integrate your analysis into web apps, test for production, and train complex neural networks, all with one Python solution.
— Tools and Techniques —
You'd probably recognize a repetitive song when you heard one, but is it possible to measure repetitiveness? For instance, maybe you have a large dataset of songs and you're interested in discovering trends over time. You might try counting unique words, but you'd quickly discover that this doesn't really work. This post is very well done and is a fun read.
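To see why counting unique words falls short, here is a minimal sketch that compares it against one possible alternative: how well the text compresses. Using compressibility as a proxy is an assumption made for illustration here, not a description of the linked post's method, and the function names and sample lyrics are hypothetical.

```python
import random
import zlib


def unique_word_ratio(lyrics: str) -> float:
    """Fraction of words that are distinct (lower means a smaller vocabulary)."""
    words = lyrics.lower().split()
    return len(set(words)) / len(words) if words else 0.0


def compression_savings(lyrics: str) -> float:
    """Fraction of bytes saved by zlib (higher suggests more repeated sequences).

    Very short inputs can come out negative because of zlib's fixed overhead.
    """
    raw = lyrics.lower().encode("utf-8")
    if not raw:
        return 0.0
    return 1 - len(zlib.compress(raw, 9)) / len(raw)


# Hypothetical sample: the same chorus repeated 40 times.
words = "na na na na hey hey hey goodbye".split() * 40
repeated = " ".join(words)

# The same words in a shuffled order: identical vocabulary, no repeated lines.
shuffled_words = words[:]
random.Random(0).shuffle(shuffled_words)
shuffled = " ".join(shuffled_words)

for label, text in [("repeated", repeated), ("shuffled", shuffled)]:
    print(label, unique_word_ratio(text), round(compression_savings(text), 3))
```

Both texts contain exactly the same words, so the unique-word ratio can't tell them apart. The compression-based score can, because it responds to repeated sequences rather than just repeated vocabulary.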
This Stanford course is a great introduction to TensorFlow for deep learning. The course ended in March but the notes, slides, and examples are available via the online syllabus.
If you think your privacy is ensured in aggregated and anonymized data, think again. This summary of a recent paper shows how to identify individual users in aggregated mobile data. Although the techniques used here are specific to mobility data, their cleverness offers insights into how vulnerable "privacy" really is.
This open source tool lets you use SQL-like queries to search through your file system.
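If you're curious what querying files like a table feels like, here is a rough sketch using only Python's standard library. It is not the linked tool's syntax or implementation. It simply loads file metadata into an in-memory SQLite table so you can run ordinary SQL against it, and the `files` table, its columns, and the `index_files` helper are all hypothetical names chosen for this example.

```python
import os
import sqlite3
import tempfile


def index_files(root: str) -> sqlite3.Connection:
    """Walk `root` and load basic file metadata into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE files (path TEXT, name TEXT, size INTEGER, mtime REAL)"
    )
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.stat(full)
            except OSError:
                continue  # skip files that vanish or can't be read
            conn.execute(
                "INSERT INTO files VALUES (?, ?, ?, ?)",
                (full, name, st.st_size, st.st_mtime),
            )
    conn.commit()
    return conn


if __name__ == "__main__":
    # Build a small throwaway directory so the demo doesn't depend on your disk.
    with tempfile.TemporaryDirectory() as demo_dir:
        for name, body in [("notes.txt", "hello"), ("data.csv", "a,b\n1,2\n")]:
            with open(os.path.join(demo_dir, name), "w") as fh:
                fh.write(body)
        conn = index_files(demo_dir)
        query = "SELECT name, size FROM files WHERE size > 0 ORDER BY size DESC"
        for row in conn.execute(query):
            print(row)
```

A dedicated tool would likely query the file system directly instead of building an index first. The sketch only shows the general idea of treating files as rows.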
— Data Viz —
OpenVis Conf is one of the best data visualization conferences around. For anyone interested in data visualization, information design and data analysis, especially for the web, this is a must-watch collection of presentations.
This post is a great overview of the thought process and experimentation that led to a new chart type called a "Weighted Pivot Scatter Plot." It looks useful and you won't be able to resist playing with the experiments along the way.
— Conferences —
If you're interested in AI and can be in London on June 20-21, take a look at the lineup of speakers, panels and workshops at CogX London. If you want to go, I have 5 free VIP passes to give to Data Elixir readers. Just send me an email with "I want to go to CogX" in the subject line. I'll select winners on Tuesday, the 23rd. Also, if you register by the 21st, you can get a free Trade Expo Pass with this code: 3l!xr4873xp0