What I learned from 200 machine learning tools
To better understand the landscape of available tools for ML production, Chip Huyen researched every AI/ML tool she could find. In this post, she explores the landscape and identifies under-served problems and opportunities. This is well-researched and insightful.
Machine learning: go full stack or go home
Traditional machine learning startups building single tools have the wrong idea. Today, companies need to be full-stack to thrive.
Introducing Kedro: The open source library for production-ready Machine Learning code
Kedro is an open-source Python framework for creating reproducible, maintainable and modular data science code. It borrows concepts from software engineering and applies them to machine-learning code; applied concepts include modularity, separation of concerns and versioning.
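To make the modularity and separation-of-concerns idea concrete, here is a minimal sketch in plain Python. It is not Kedro's API. The `Node` tuple, the `run_pipeline` helper and the dictionary "catalog" are hypothetical names chosen for illustration. They show how small pure functions can be wired together by name so that data loading, logic and orchestration stay separate.

```python
# Illustrative sketch only: plain Python, NOT Kedro's actual API.
from typing import Callable, Dict, List, Tuple

# A node is (function, input names, output name). Hypothetical structure.
Node = Tuple[Callable[..., object], List[str], str]


def clean(raw: List[float]) -> List[float]:
    """Drop negative readings."""
    return [x for x in raw if x >= 0]


def average(values: List[float]) -> float:
    """Compute the mean of the cleaned values."""
    return sum(values) / len(values)


def run_pipeline(nodes: List[Node], catalog: Dict[str, object]) -> Dict[str, object]:
    """Run nodes in order, reading inputs from and writing outputs to a copy of the catalog."""
    data = dict(catalog)
    for func, inputs, output in nodes:
        data[output] = func(*(data[name] for name in inputs))
    return data


# The pipeline definition is just data, so it can be reviewed and versioned like code.
pipeline = [(clean, ["raw"], "cleaned"), (average, ["cleaned"], "mean")]
result = run_pipeline(pipeline, {"raw": [1.0, -2.0, 3.0]})
```

Because each step is a plain function with named inputs and outputs, it can be unit-tested on its own and reused in another pipeline.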
The What-If Tool: Code-Free Probing of Machine Learning Models
Google's new What-If Tool enables users to analyze a machine learning model without writing code. Given pointers to a TensorFlow model and a dataset, the What-If Tool offers an interactive visual interface for exploring model results. The post on the Google AI blog offers a good overview. For more info and online demos, check out the What-If Tool project site.
Turn Python Scripts into Beautiful ML Tools
Streamlit is a new open source app framework that's being billed as the "fastest way to build custom ML tools." The founders are industry veterans with first-hand insights into the pain points of machine learning engineers. As Streamlit co-founder Adrien Treuille describes it, "we’re giving engineers these sort of Lego blocks to build whatever they want."
How to Deploy Machine Learning Models
Nice guide to getting machine learning models into production. It's fairly high-level but there are links throughout to go deeper. Includes discussion of the complexities involved, design considerations, tooling, testing, developments to watch, etc.
Darts: Time Series Made Easy in Python
Doing machine learning with time series data can get complicated fast. Darts is an open-source library that aims to simplify the process. It's inspired by scikit-learn and offers a consistent API across a powerful set of tools. This announcement explores its capabilities and motivations.
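As a rough picture of what a scikit-learn-style, consistent forecasting interface looks like, here is a small sketch in plain Python. It does not use Darts. The `SeasonalNaiveForecaster` class and its `season_length` parameter are hypothetical, and only the `fit`/`predict` shape is the point.

```python
# Illustrative sketch only: plain Python, NOT the Darts API.
from typing import List, Optional


class SeasonalNaiveForecaster:
    """Forecast by repeating the last observed season (hypothetical example model)."""

    def __init__(self, season_length: int = 1) -> None:
        if season_length < 1:
            raise ValueError("season_length must be at least 1")
        self.season_length = season_length
        self._last_season: Optional[List[float]] = None

    def fit(self, series: List[float]) -> "SeasonalNaiveForecaster":
        """Remember the final season of the training series."""
        if len(series) < self.season_length:
            raise ValueError("series is shorter than one season")
        self._last_season = list(series[-self.season_length:])
        return self

    def predict(self, n: int) -> List[float]:
        """Return n future values by cycling through the stored season."""
        if self._last_season is None:
            raise RuntimeError("call fit() before predict()")
        return [self._last_season[i % self.season_length] for i in range(n)]


history = [10, 20, 30, 12, 22, 32]
forecast = SeasonalNaiveForecaster(season_length=3).fit(history).predict(4)
```

Any other model exposing the same two methods could be swapped in without changing the calling code, which is the appeal of a uniform API.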
doccano is an open source text annotation tool for humans. It provides annotation features for text classification, sequence labeling and sequence to sequence tasks. So, you can create labeled data for sentiment analysis, named entity recognition, text summarization and so on. Just create a project, upload data and start annotating. You can build a dataset in hours.
Using GitHub Actions for MLOps and Data Science
GitHub just released a collection of new tools to help with automation, collaboration and reproducibility in your data science and machine learning workflows. Here are the details.
Open-Sourcing Metaflow, a Human-Centric Framework for Data Science
Metaflow is an end-to-end workflow tool from the Machine Learning Infrastructure team at Netflix. It helps you design your workflow, version experiments, deploy models to production, run them at scale and inspect results in notebooks, all without engineering expertise.
Manifold: A Model-Agnostic Visual Debugging Tool for Machine Learning at Uber
This post from the Uber Engineering Blog introduces a new internal tool for debugging machine learning models. Called "Manifold," the tool leverages visual analytics to help machine learning practitioners optimize their models and identify trouble spots. This post describes the thinking behind Manifold's visual design and how it works. Here's the code.