Data Elixir logo

ISSUE 364 · November 30, 2021

 

Insight

Storm in the stratosphere: how the cloud will be reshuffled

Do you think cloud stack consolidation is inevitable? Here's a reasonable take on how the next few years could play out.
Erik Bernhardsson

 

The Sobering Truth About Your Business Ideas

Through rigorous measurement and analysis, data science has shown that most business ideas will fail. It's not because of the data team. 
O'Reilly | Eric Colson, Daragh Sibley and Dave Spiegel

 

Sponsored Link

eBook: 101 Ways to Use Third-Party Data to Make Smarter Decisions


AWS Data Exchange has created a new eBook designed as a broad compilation of use cases submitted by AWS Marketplace Data Providers. Gain access to the strategies, insights, and behaviors necessary to take companies to the next level with cost-effective third-party data solutions in the cloud.  

 

Reach Data Elixir readers by sponsoring an issue. Click here for details.

 
 

Tutorials, Projects & Opinions

Doing Data Science for Social Good, Responsibly

There are a lot of "data for good" projects out there and they can be incredibly helpful for both participants and the organizations being served. But it's not all sunshine and roses. Here's what to watch for.
fast.ai | Rachel Thomas

 

Predicting viewership for Doctor Who episodes

Using a tidymodels workflow can make many modeling tasks more convenient, but sometimes you want more flexibility and control over how your modeling objects are handled. Here's how.
Julia Silge

 

Data Quality Validation for Python Dataframes

In this post, Miguel Cabrera explores libraries that check the data quality of Pandas and Spark dataframes. Covers basic usage and the pros and cons of Great Expectations, Pandera, and Deequ/PyDeequ.
Miguel Cabrera

 

Financial market data analysis with pandas

Nice introductory tutorial that walks through a simple time-series analysis of the stock market using pandas.
Matt Wright

 

Motion planning, scaling data pipelines, ML edge cases? Hear the latest.  

Top AI and ML speakers from Facebook AI, Cruise, Zoox, GE Healthcare, and more unveil how to successfully deploy machine learning data operations. Register now for the iMerit ML DataOps Summit, a free one-day virtual event on December 2nd in partnership with TechCrunch.
// sponsored

 

Code & Tools

BookNLP

BookNLP is a natural language processing pipeline that scales to books and other long documents in English. Handles a variety of tasks such as part-of-speech tagging, entity recognition, event tagging, and more. 
GitHub | booknlp

 

Career

What you can and cannot control in your job hunt

When you're looking for a job, most of the hiring process is out of your control. This post, written from a data science interviewer's perspective, focuses on what you can control and what you can do to stand out.
Eric J. Ma

 

Data Visualization

pybaobabdt

If you're building Decision Trees or Random Forests in Python, this new package uses colors and link widths to make it easy to interpret the visualizations, even when they're large. These are much nicer to work with than typical node-link visualizations.
pypi | Stef van den Elzen

 
 

Sign up to get Data Elixir's data science newsletter in your Inbox >>

 
 
 

Data Elixir, LLC
P.O. Box 21255
Boulder, CO 80308

Data Elixir is curated and maintained by Lon Riesberg. If you have questions or suggestions for the newsletter, just reply to this email.
