ISSUE 413 · November 15, 2022

Insights

Data’s day of reckoning
Data is sometimes said to be the "new oil" but for most businesses, that analogy doesn't work. In the most successful data organizations, data might burn bright but, as Benn Stancil puts it here, in most businesses, data burns more like peat moss. Here are some reasons why, with some ideas for all the median businesses.

Sponsored Link

Web scraping datasets made easy - ScrapFly.io
The web is full of quality data, though scraping it can be difficult. ScrapFly API can retrieve any web page or simplify the web scraping process through cloud web browsers - click buttons, input forms and retrieve the data. ScrapFly comes with a Python SDK making scraping in notebooks a breeze - Try ScrapFly for free!

Tutorials, Projects & Opinions

How Federated Learning Protects Privacy
Most machine learning models are trained by collecting vast amounts of data on a central server. This is a great visual explainer that shows how federated learning makes it possible to train models without any user's raw data leaving their device.

Forecasting with Structural AR Timeseries
The strength of a Bayesian model is largely the flexibility it offers for different modeling tasks. In this tutorial, Nathaniel Forde shows how to fit a range of auto-regressive structural timeseries models and how to predict future observations with them.

Method Chaining in Pandas: Bad Form Or a Recipe For Success?
Matt Harrison has written books on pandas and Python and regularly trains data science teams at top companies. And yet, his code is sometimes met with derision online. In this interview, he explores his approach to code, how to think about method chaining, and what separates naive code from good code.

Using Functional Analysis to Model Air Pollution Data
Functional analysis is one approach to understanding how your data changes within a given timeframe, such as a day, or between timeframes such as many days.
This is an easy-to-follow tutorial that shows how to apply functional analysis to some messy air pollution data using R.

How I learn machine learning
In a rapidly evolving field like machine learning, you need to figure out what works for you to navigate the never-ending task of staying up to date. In her latest post, Vicki Boykis shares her own process, including lots of links and resources along the way.

Tools & Code

Debirdify
This is a great tool if you're looking for Mastodon accounts to follow and want something more nuanced than a haphazard list of user handles. Debirdify searches a specific Twitter user's Lists and/or Followed Accounts for associated Mastodon handles and returns a Mastodon-friendly csv file.

Resources

Advanced NLP - Carnegie Mellon 2022
Graham Neubig's "Advanced NLP" is one of the best resources you'll find for current state-of-the-art techniques and algorithms in modern NLP. Follow the links for the slides and an awesome collection of readings and resources. Go here for the lecture videos 👉

Career

Looking for Ambitious Machine Learning Engineers
Ratio is a revenue-generating startup that's looking for ambitious machine learning engineers to help automate a big part of the advertising space. The product is "like a self-driving car of marketing" and largely uses existing models from OpenAI. Remote OK.

New Opportunities
In addition to office-based positions around the world, Data Elixir's Job Board currently has 35+ listings for remote positions, including roles for data scientists, data analysts, researchers, data architects, machine learning engineers, and more. The roles cover a variety of job levels, from Junior to Senior.
If you're HIRING, join the Data Elixir Talent Collective and get regular drops of outstanding data practitioners and leaders who are open to new opportunities 👉

Data Visualization

Images by Daniel Coe / CC BY-NC-ND 2.0 / Links: Image 1 Image 2 Image 3

Visualizing Rivers and Floodplains with USGS Data
Awesome tutorial that shows how to create visualizations of the flow of water through rivers and floodplains using publicly available USGS data and open source tools. Includes links to tools, data, key resources, and a gallery of stunning visualizations.

Galileo’s Telescopic Discoveries: |