— Notes —
This new directory from Data Elixir includes information and links to conferences from around the world. Keyword search makes it easy to find things and featured events include discount codes. This directory will evolve and be regularly updated. To add an event, get in touch.
— In the News —
It takes a lot of manual labor to clean, categorize, and label data before it's useful for machine learning. An extremely competitive industry has developed around these kinds of data prep activities and, not surprisingly, the work is often off-shored to low-paid workers. This article in the MIT Tech Review explores the opportunities, the challenges, and how some organizations are striving to do good.
— Tools and Techniques —
These "simple rules" for Jupyter Notebooks amount to a set of best practices for ensuring that your work is maintainable, reproducible and easy to follow. Unlike other Jupyter how-to guides, this isn't about code tricks and widgets. This article highlights Jupyter's capabilities as a computational notebook. Includes links to examples and useful tools.
Computing a conversion rate can be fairly straightforward, but when there is a substantial delay before the conversion event, the analysis gets far more complex. This is a great post that walks through a variety of solutions to, and issues with, these types of analyses. It includes links to key resources and introduces a Python package called "convoys" that helps fit these types of models.
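To see why the delay matters, here is a minimal sketch in plain Python. It does not use the "convoys" package or its API. Instead it swaps in a hand-written Kaplan-Meier estimator, one standard way to handle users who have not converted yet but still might. The function names and the toy data are assumptions made up for illustration.

```python
from typing import List, Tuple

# Each observation is (duration, converted):
#   converted=True  -> duration is the time until the conversion happened
#   converted=False -> duration is how long the user has been observed so far
#                      without converting (a "censored" observation)
Observation = Tuple[float, bool]


def naive_conversion_rate(observations: List[Observation]) -> float:
    """Converted users divided by all users, ignoring how long each was observed."""
    converted = sum(1 for _, did_convert in observations if did_convert)
    return converted / len(observations)


def kaplan_meier_conversion(observations: List[Observation], horizon: float) -> float:
    """Estimate the fraction of users converted by `horizon` with Kaplan-Meier."""
    event_times = sorted({d for d, did_convert in observations if did_convert and d <= horizon})
    still_unconverted = 1.0
    for t in event_times:
        # Users still under observation and not yet converted just before time t.
        at_risk = sum(1 for d, _ in observations if d >= t)
        # Users who converted exactly at time t.
        events = sum(1 for d, did_convert in observations if did_convert and d == t)
        still_unconverted *= 1.0 - events / at_risk
    return 1.0 - still_unconverted


# Hypothetical toy data: durations in days.
observations = [(2, True), (5, True), (3, False), (10, False), (1, False), (8, True)]

print(naive_conversion_rate(observations))
print(kaplan_meier_conversion(observations, horizon=10))
```

On this toy data the naive rate comes out lower than the Kaplan-Meier estimate, because the naive version counts recently acquired users as failures even though they have had little time to convert.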
Kevin Markham, founder of Data School, has expanded his popular pandas tricks series. There are now more than 45 tricks and new ones are added daily.
This beginner's guide to common sampling techniques is easy to follow and includes code snippets to play with.
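For a quick feel of what such techniques look like, here is a small standard-library sketch. Which techniques the linked guide actually covers is not stated here, so the four below (simple random, systematic, stratified, and reservoir sampling) are an assumed selection, and the function names are illustrative only.

```python
import random
from collections import defaultdict


def simple_random_sample(items, k, seed=0):
    """Pick k items uniformly at random, without replacement."""
    return random.Random(seed).sample(items, k)


def systematic_sample(items, step, start=0):
    """Take every `step`-th item, beginning at index `start`."""
    return items[start::step]


def stratified_sample(items, key, fraction, seed=0):
    """Sample the same fraction from each group defined by key(item)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in items:
        strata[key(item)].append(item)
    sample = []
    for group in strata.values():
        # Keep at least one item per group so small groups are not dropped.
        k = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, k))
    return sample


def reservoir_sample(stream, k, seed=0):
    """Uniformly sample k items from a stream of unknown length in one pass."""
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)
        else:
            # Replace a reservoir slot with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


# Hypothetical data: 80 rows labeled "a" and 20 rows labeled "b".
rows = [("a", i) for i in range(80)] + [("b", i) for i in range(20)]
picked = stratified_sample(rows, key=lambda row: row[0], fraction=0.1)
print(len(picked))
```

Stratified sampling keeps the 80/20 label split in the sample, which a small simple random sample is not guaranteed to do.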
This piece explains how to choose the right ML approach for your business goals and how to determine the best data labeling technique for your use cases.
— Resources —
This classic text by Christopher M. Bishop is easy to follow and is available for free as a PDF.
— Data Viz —
This recent survey offers insights into the latest trends and issues in the data visualization community. More than 1,300 practitioners answered questions about their work experience, the tools they use, how data visualization fits into their organizations, and their desires and frustrations. In this post, Elijah Meeks summarizes the results and the key themes that emerged.