Data Elixir

ISSUE 338   ·   June 1, 2021

 

Insight

Nasty, brutish and short: The life of the modern CDO

What does a Chief Data Officer do exactly? This short post offers a glimpse into the world of a CDO and how they're often not set up for success within their organizations.
LinkedIn | Ian Thomas

 
 
 

To regulate AI, try playing in a sandbox

In the EU's recently proposed AI regulations, the word "sandbox" is mentioned 38 times. The idea is to develop and test AI applications in low-stakes, monitored environments before rolling them out to the general public. Experts agree that it's a step in the right direction, but it's no silver bullet.
Emerging Tech Brew | Dan McCarthy

 

Sponsored Link

Mitigate Bias, Duplication, and Costly Errors with High-Quality Data

At Alegion, we understand that poor-quality training data can cost companies millions of dollars. That is why we have created a whitepaper on how our team of experts achieves quality training data, and how you can too. Read more about our four prioritized phases, the significance of quality metrics, and how to implement it all.

 

Reach Data Elixir readers by sponsoring an issue. Click here for details.

 
 

Tutorials, Projects & Opinions

Understanding the data (error) generating processes for data validation

In stats, we talk about the data generating process (DGP), yet data validation is often conducted without a theory of error generation. This post explores some failure modes in ELT and what they imply for how data consumers can validate data effectively.
Emily Riederer

 
 
 

JavaScript for Data Analysis

JavaScript may not be widely considered important for working with data, but the tools are evolving, and JavaScript already excels at enabling collaboration and communication on the web. In this post, Mike Bostock, creator of D3.js and the Observable platform, makes a case for JavaScript as a key language for data and describes what to expect in the coming years.
Mike Bostock

 
 
 

Interactive Gaussian Process Visualization

This interactive visualization of Gaussian processes makes it easy to understand prior and posterior kernel functions. The description of Gaussian processes is high-level, but the link in the "Background" section leads to an in-depth tutorial.
Infinite Curiosity

 
 
 

Can you build an ML model to monitor another model?

Can you train a machine learning model to predict your model's mistakes? It sounds reasonable on the surface. Machine learning models make mistakes. Let's take these mistakes and train another model to predict the missteps of the first one! Sort of a "trust detector," based on learnings from how our model did in the past.
Evidently AI | Emeli Dral and Elena Samuylova

 
 
 

Session-based Recommender Systems

Session-based recommendation algorithms provide recommendations based solely on a user's interactions in a single session. That's useful because it doesn't require a history of that user's preferences. This introduction shows how these algorithms work, how to evaluate them, and things to consider if you're building your own.
Cloudera Fast Forward
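
To make the idea concrete, here is a minimal sketch of one simple session-based approach: scoring items by how often they co-occurred with the current session's items in past sessions. This is an illustration only, not the method from the linked post; the function names and the toy sessions are made up for the example.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(sessions):
    """Count how often each pair of items appears in the same session."""
    co = defaultdict(Counter)
    for session in sessions:
        # Use a sorted set so repeated items in one session count once.
        for a, b in combinations(sorted(set(session)), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co


def recommend(co, current_session, k=3):
    """Rank unseen items by co-occurrence with items in the current session."""
    seen = set(current_session)
    scores = Counter()
    for item in seen:
        # .get avoids adding empty entries for items never seen before.
        for other, count in co.get(item, {}).items():
            if other not in seen:
                scores[other] += count
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]


# Hypothetical toy data: each list is one anonymous browsing session.
past_sessions = [
    ["shoes", "socks", "laces"],
    ["shoes", "socks"],
    ["hat", "scarf"],
    ["socks", "laces"],
]

co = build_cooccurrence(past_sessions)
print(recommend(co, ["shoes"]))  # ['socks', 'laces']
```

Note that nothing here depends on who the user is. Only the items in the current session matter, which is what lets this kind of recommender work without a stored preference history.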

 
 
 

Apply today: Master of Data Science for Public Policy

Passionate about turning data into valuable insights for the common good? Join the next cohort of the Master of Data Science for Public Policy (MDS) at the Hertie School in Berlin, Germany. This two-year programme taught entirely in English welcomes students from any academic discipline who are interested in using data to solve real-world challenges. Are you curious about the MDS? Get in touch.
// sponsored

 

Code & Tools

AgentPy - Agent-based modeling in Python

AgentPy is an open-source library for the development and analysis of agent-based models in Python. The framework integrates the tasks of model design, interactive simulations, numerical experiments, and data analysis within a single environment, and is optimized for interactive computing with IPython and Jupyter.
agentpy | Joël Foramitti

 
 
 

NocoDB - The Open Source Airtable Alternative 

NocoDB is an open-source Airtable alternative that turns your relational databases into "smart-spreadsheets." It connects to cloud services like S3 for storage and to services like Slack, Twilio, and Discord via an App Store for workflow automations. This project was open-sourced just last week and already has over 10K stars! Also, see the Hacker News discussion >>
GitHub | nocodb

 


 

Data Elixir is curated and maintained by Lon Riesberg. If you have questions or suggestions for the newsletter, just reply to this email.

 

To find specific content from prior issues or to research topics, check out the catalogued Archives on Data Elixir's Search Page >> 

 
 
 
Data Elixir, LLC
P.O. Box 21255
Boulder, CO 80308