— In the News —
Around the world, countries are releasing official strategies that promote the use and development of AI. They're all different. Some focus on research, others focus on public and private sector adoption, some focus on infrastructure, some on standards, etc. In this article, Tim Dutton summarizes the strategies and planned investment for each country, including links for official announcements and key resources.
— Sponsored Link —
On July 24-25 Metis will host its 2nd annual Demystifying Data Science event - a free, online conference - featuring 28 interactive data science talks from industry-leading speakers. Keynotes include Lillian Pierson, CEO of Data-Mania LLC and author of Data Science for Dummies, and Beth Comstock, author and former Vice Chair of GE. Register now!
— Tools and Techniques —
If you've been following the Pandas on Ray project, this post is a good overview of how it's evolving. If you're not familiar with the project, Pandas on Ray is a drop-in replacement for Pandas that enables you to transparently distribute your data and computation. It's fast, scalable and can be implemented simply by changing a single import statement. This post also includes an introduction to the project and links for getting started.
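To make the "single import statement" idea concrete, here is a minimal sketch. It assumes the module name `modin.pandas`, the name the project later adopted, which this blurb does not state. The helper `import_dataframe_library` is hypothetical and exists only for illustration. It falls back to plain Pandas when the distributed library is not installed.

```python
import importlib

# Candidate module names, tried in order.
# ASSUMPTION: "modin.pandas" is the import path later adopted by the
# Pandas on Ray project; it is not named in the post summarized above.
CANDIDATES = ("modin.pandas", "pandas")


def import_dataframe_library(candidates=CANDIDATES):
    """Return the first importable module from `candidates`, or None."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


if __name__ == "__main__":
    # The rest of the script only ever refers to `pd`, so switching
    # libraries comes down to which module gets bound to that name.
    pd = import_dataframe_library()
    if pd is None:
        print("No DataFrame library is installed.")
    else:
        df = pd.DataFrame({"city": ["Oslo", "Lima", "Oslo"], "sales": [3, 5, 4]})
        print(df.groupby("city")["sales"].sum())
```

In an ordinary script, the same swap is usually just a matter of replacing `import pandas as pd` with the alternative import and leaving the rest of the code unchanged.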
Big changes are underway in the world of Natural Language Processing. The long reign of word vectors as NLP's core representation technique has seen an exciting new line of challengers emerge: ELMo, ULMFiT, and the OpenAI transformer. This post by Sebastian Ruder explores this evolving landscape.
Shirin Glander recently taught a Machine Learning with R workshop at the University of Heidelberg. The slides and detailed notes that she released for the workshop are fantastic, and they include code snippets to get you started.
A non-technical overview of database options, including the different types of databases that are currently available and how to choose the right one for your requirements.
— Resources —
Datasets by Microsoft Research are now available in the cloud. These datasets are intended to advance state-of-the-art research in areas such as natural language processing, computer vision, and domain-specific sciences. There's a lot here. Be sure to check the license terms for your particular use.
Kaggle Kernels can be super useful for understanding the practical implementation of algorithms, but with nearly 200,000 kernels, it's not always easy to find what you're looking for. This "glossary" of Kernels is a well-organized resource that will help you quickly find useful models, techniques, and tools.
— Data Viz —
The latest update of ggplot2 is a major release with lots of new features. Here's what you need to know.
A curated collection of data visualization research papers, books, blog posts, and other readings. Covers topics such as automated visualization design, color, perception, data management, human-computer interaction, scientific visualization, and uncertainty. This is a different collection from awesome-dataviz, which is a more practical collection of tools and libraries.
— Career —
If you think you might be looking for work in the next few years, study this! Here's how to cultivate an online presence that shows what you're capable of. Covers specific ideas for a portfolio and tips for resumes, LinkedIn profiles, Kaggle, social media, and more. Includes lots of examples.