Your search for "machine learning tools" returned 54 results
Issue 294 - Tools and Techniques
Darts: Time Series Made Easy in Python
Doing machine learning with time series data can get complicated fast and Darts is an open-source library that aims to simplify the process. It's inspired by scikit-learn and uses a consistent API with a powerful set of tools. This announcement explores its capabilities and motivations.
Issue 291 - Tools and Techniques
What I learned from 200 machine learning tools
To better understand the landscape of available tools for ML production, Chip Huyen researched every AI/ML tool she could find. In this post, she explores the landscape and identifies under-served problems and opportunities. This is well-researched and insightful.
Issue 291 - Tools and Techniques
Using GitHub Actions for MLOps and Data Science
GitHub just released a collection of new tools to help with automation, collaboration and reproducibility in your data science and machine learning workflows. Here are the details.
Issue 279 - Resources
CS472: Data science and AI for COVID-19
This new Stanford course will introduce the epidemiology and biology of the COVID-19 virus and will investigate it using data science and machine learning tools. Much of the course material, including lecture videos, is planned to be posted on the course website for non-Stanford students. Starts April 10.
Issue 263 - Tools and Techniques
Open-Sourcing Metaflow, a Human-Centric Framework for Data Science
Metaflow is an end-to-end workflow tool from the Machine Learning Infrastructure team at Netflix. It helps you design your workflow, version experiments, deploy models to production, run them at scale and inspect results in notebooks - all without engineering expertise.
Issue 254 - Tools
MLOps Tooling
In this post from the MLOps NYC Conference, Todd Morrill summarizes use-cases and pros/cons of several tools that are commonly used for building and managing machine learning models. Covers Kubeflow, MLFlow, SageMaker, Dask and Rapids.
Issue 254 - Tools
Turn Python Scripts into Beautiful ML Tools
Streamlit is a new open source app framework that's being billed as the "fastest way to build custom ML tools." The founders are industry veterans with first-hand insights into the pain points of machine learning engineers. As Streamlit co-founder Adrien Treuille describes it, "we’re giving engineers these sort of Lego blocks to build whatever they want."
Issue 251 - Tools and Techniques
Machine Learning Workspace
The ML workspace is an all-in-one web-based IDE for machine learning and data science. Combines Jupyter, VS Code, TensorFlow, and many other tools and libraries into one convenient Docker image.
Issue 251 - Tools and Techniques
Machine learning: go full stack or go home
Traditional machine learning startups building single tools have the wrong idea. Today, companies need to be full-stack to thrive.
Issue 237 - Tools and Techniques
Introducing Kedro: The open source library for production-ready Machine Learning code
Kedro is an open-source workflow development tool that helps you build data pipelines that are robust, scalable, deployable, reproducible and versioned. This post introduces the project; for details, see the code repo.
Issue 236 - Tools and Techniques
GAMS in R
GAMs offer a middle ground between simple linear models and complex machine-learning techniques, allowing you to model and understand complex systems. This short, interactive course will teach you how to use these flexible, powerful tools to model data and solve data science problems.
Issue 235 - How-to
How to Deploy Machine Learning Models
Nice guide to getting machine learning models into production. It's fairly high-level but there are links throughout to go deeper. Includes discussion of the complexities involved, design considerations, tooling, testing, developments to watch, etc.
Issue 233 - Tools
Data version control with DVC. What do the authors have to say?
DVC is an open-source version control system for datasets and machine learning models. It's a key tool that's worth being familiar with. In this interview, Dmitry Petrov, the creator of DVC, discusses the project's origins, the problems it solves, how it's used, upcoming features, etc. This is a great discussion.
Issue 228 - Tools
doccano
doccano is an open-source text annotation tool for machine learning. It provides annotation features for text classification, sequence labeling and sequence to sequence.
Issue 216 - Data Viz
Manifold: A Model-Agnostic Visual Debugging Tool for Machine Learning at Uber
This post from the Uber Engineering Blog introduces a new internal tool for debugging machine learning models. Called "Manifold," the tool leverages visual analytics to help machine learning practitioners optimize their models and identify trouble spots. This post describes the thinking behind Manifold's visual design and how it works. For anyone interested in data visualization, Uber's team is one of the most innovative around.
Issue 202 - Data Viz
Machine Learning for Visualization
Ian Johnson, a DataVis UXE engineer at Google, explores how a relatively new set of tools can change the way we explore large datasets.
Issue 200 - Tools and Techniques
The What-If Tool: Code-Free Probing of Machine Learning Models
Google's new What-If Tool enables users to analyze a machine learning model without writing code. Given pointers to a TensorFlow model and a dataset, the What-If Tool offers an interactive visual interface for exploring model results. The post on the Google AI blog offers a good overview. For more info and online demos, check out the What-If Tool project site.
Issue 191 - Tools and Techniques
What do machine learning practitioners actually do?
Machine learning talent is reportedly in short supply but what is it that these experts really do? Answering that will help guide which skills to teach, which tools to build, and which processes to automate. This post by Rachel Thomas offers a practical perspective of the field, including an overview of common organizational issues that hinder machine learning efforts.
Issue 175 - Tools and Techniques
LabNotebook - A simple experiment manager
LabNotebook is a tool that allows you to monitor, record, save, and query your machine learning experiments. This is an Alpha version so expect some issues but it looks promising.
Issue 171 - Tools and Techniques
Gartner’s 2018 Take on Data Science Tools
Here's a nice overview of Gartner’s 2018 report, "Magic Quadrant for Data Science and Machine Learning Platforms." From ~100 companies that sell data science software, Gartner selected 16 of the most important to rate on their vision and ability to execute. This overview by Robert A. Muenchen includes key developments and a link to an in-depth analysis.
Issue 162 - In Case You Missed It
Be sure to catch the most popular links from last week's issue...
Issue 160 - Tools and Techniques
Amazon SageMaker – Accelerating Machine Learning
At its re:Invent Conference last week, AWS unveiled multiple tools that will be useful for data scientists and engineers. In particular, check out this announcement about SageMaker.
Issue 155 - Resources
Awesome Machine Learning for Cyber Security
From the Awesome Series, here's a curated list of tools and resources that are related to the use of machine learning for cyber security.
Issue 141 - Tools and Techniques
Using Machine Learning to Predict Value of Homes On Airbnb
Robert Chang from Airbnb's Engineering and Data Science team describes their machine learning infrastructure and how it enables their team to work effectively. Includes discussion about the tools and process they use for feature engineering, prototyping & training, model selection & validation, and deploying models to production.
Issue 140 - Data Viz
Facets - Know Your Data
As part of its new People and AI Research Initiative (PAIR), Google open-sourced two new visualization tools to help engineers understand and analyze machine learning datasets. This page offers a project description, live demos and the ability to upload your own datasets.
Issue 134 - Tools and Techniques
How A Data Scientist Can be More Productive
This is a good entry point for a data science iteration tool called "DVC." It stands for "data version control" and is based on concepts that are used in software engineering to facilitate ongoing development. DVC makes it easy to create versions of machine learning algorithms and to share the corresponding code, dependencies, and data in a single, reproducible environment.
Issue 125 - Deep Learning
Try Deep Learning in Python now with a fully pre-configured VM
Here's an easy way to get all those open source libraries installed and working on your own computer. This virtual machine image is complete with Ubuntu, Python 3.5, all the required libraries, and tools like TensorFlow, Theano, Keras, OpenCV, and dlib.
Issue 118 - Resources
100+ Free Data Science Books for 2017
This collection has some great picks with books in a variety of categories including analytics, interviews, distributed tools, Python, R, SQL, NoSQL, machine learning, AI, data visualization, and math.
Issue 111 - In the News
A Guide to Solving Social Problems with Machine Learning
There are enormous gains that can be made from using the latest machine learning tools. But there are also many challenges and some of the most important are easy to miss. This article is aimed at anyone who wants to use data science to create social good, but is unsure how to proceed.
Issue 108 - In the News
Economists are prone to fads, and the latest is machine learning
Is it really a useful tool or is this latest craze distorting economics?
Issue 106 - In the News
How a researcher used big data to beat her own ovarian cancer
When Shirley Pepke was diagnosed with ovarian cancer, she started working on a tool that could tailor cancer treatment to individual patients using a machine learning algorithm and genomics data. "Some people get cancer and do fundraisers — I'm good at doing computational research on complex systems."
Issue 106 - Tools and Techniques
Data Science Deployments with Docker
Containers, such as Docker, are widely used in the software industry but have been tricky to use for machine learning. A new tool from NVIDIA seeks to change that. Here's why using Docker makes sense for a lot of data science projects, what the challenges have been, and how to use NVIDIA's new tool.
Issue 82 - Resources
Setting up a Deep Learning Machine from Scratch (Software)
Detailed guide to setting up your machine for deep learning research. Includes instructions to install drivers, tools and various deep learning frameworks.
Issue 81 - Tools and Techniques
Terrapattern: "similar-image search" for satellite photos
Terrapattern is a visual search tool for satellite imagery. It uses machine learning to find places that look similar.
Issue 62 - Tools and Techniques
Introducing TPOT, the Data Science Assistant
TPOT is a Python tool that automatically creates and optimizes Machine Learning pipelines using genetic programming. It intelligently explores thousands of possible pipelines to find the best one for your data. This is an open-source project by Randy Olson and looks super interesting.
Issue 59 - Tools and Techniques
7 Tools in Every Data Scientist’s Toolbox
Nice collection of statistical and machine learning concepts that are widely used and consistently useful in a large variety of domains and problem settings.
Issue 50 - Resources
Free Data Science Books
Nice collection of eBooks covering a variety of topics in business analytics, data mining, big data, machine learning, algorithms, tools, and programming languages. Along with a link to the online version, many of the titles include a link to purchase a hardcopy at Amazon. Sometimes lists like these link to torrents and pirated copies but these look legitimate.
Issue 31 - Tools and Techniques
The data science ecosystem, part 3: Data applications
The final installment of a three-part series that explores the massive landscape of tools that are available to data scientists. This week's focus: Data Applications - where the "sexy stuff" like predictive analysis, data mining and machine learning happens. This is the part where you take all your data and do something really amazing with it...
Issue 23 - Resources
mlxtend - Machine Learning Library Extensions
Sebastian Raschka’s mlxtend library has a number of tools to extend Python's data analysis and machine learning libraries. This is definitely worth checking out.
Issue 14 - Tools and Techniques
Pattern - A Python Module for Mining the Web
Pattern is a web mining module for Python. It has tools for data mining (Google, Twitter and Wikipedia APIs, a web crawler, an HTML DOM parser), natural language processing (part-of-speech taggers, n-gram search, sentiment analysis, WordNet), machine learning (vector space model, clustering, SVM), network analysis and visualization. It's free, well-documented, and comes with lots of examples and unit tests.
Issue 9 - Tools and Techniques
Introduction to Scikit-Learn: Machine Learning with Python
This tutorial covers the basics of Scikit-Learn, a popular package containing a collection of tools for machine learning in Python. This was one of several tutorials presented at the ESAC Data Analysis and Statistics Workshop recently. They're all well done but after reading the introduction, definitely check out the Support Vector Machines tutorial.
Issue 6 - Tools and Techniques
Six of the Best Open Source Data Mining Tools
Here are six powerful open source data mining tools to accomplish tasks using artificial intelligence, machine learning and other techniques to extract value from data.
Issue 4 - Data Viz
MLDemos - A visualization tool for machine learning
Here's one more great resource for learning about machine learning. MLDemos is an open-source visualization tool for studying how these algorithms work. Fun to play with!