ISSUE 336 · May 18, 2021

Insight

Be Decision-Driven Not Data-Driven
Since launching its annual executive survey in 2012, NewVantage Partners has watched leading companies steadily invest in efforts to become more data-driven. Nearly 99% of surveyed companies report active investments in data initiatives, yet only 24% claim to be in a data-driven organization. Maybe being data-driven is the wrong goal...

Sponsored Link

Ray Summit: Scalable ML & AI for everyone
Want to learn the best way to scale? Ray Summit brings together data scientists and engineers to build scalable ML & AI using Ray, the dominant platform for distributed computing. Learn about top trends in machine learning & AI, ML in production, reinforcement learning, cloud computing & more. Register to join live or on-demand.

Tutorials, Projects & Opinions

Why Dagster is the next-generation data orchestrator
Dagster is a data orchestrator for machine learning, analytics, and ETL. It's similar to Airflow, but it handles each stage of the data life cycle differently. In this post, Nick Schrock compares the two systems and explores the advantages of using Dagster.

Using PostgreSQL as a Data Warehouse
With some tweaking, Postgres can be a great data warehouse. Here's why that's worth considering and how to configure it.

Good Data Scientist, Bad Data Scientist
Regardless of what part of the stack you work on, there are common traits that separate "good" data scientists from "bad" data scientists. This short post applies to most people in tech and is good food for thought.

What is a Vector Database?
Vector embeddings are fundamental parts of many recommendation and search algorithms, and they've become increasingly important to machine learning applications. This post introduces vector embeddings, their unique needs, and why they ultimately call for a new type of database.

Scale Transform: The Present and Future of AI
Scale Transform broke records and brought together more than 10,000 leading researchers, practitioners, and executives. The conference featured an all-star line-up of 27 leading AI researchers and practitioners and 19 sessions ranging from the latest research breakthroughs to real-world impact across industries.

Code & Tools

Greykite for flexible, intuitive, and fast forecasting
Greykite is an open-source Python library that was developed to support LinkedIn’s forecasting needs. Its main forecasting algorithm, called Silverkite, is fast, accurate, and intuitive, making it suitable for interactive and automated forecasting at scale.

Data Profiler | What's in your data?
The DataProfiler is a Python library that makes it easy to extract schema, statistics, and entities from your datasets. Data profiles can then be used in downstream applications or reports.

Events

Data Week
General Assembly's Data Week is happening THIS WEEK! Sessions are scheduled throughout the week and cover things like bias detection, communicating with data, career development, playlist recommenders, and lots more. All sessions are online and free.

Data Visualization

Introducing Dataflow, a self-hosted Observable Notebook Editor
Dataflow is a standalone notebook editor for Observable. Create, run, and compile Observable notebooks on your own machine! MIT license.

Falx: Visualization by Example
Falx is a visualization-by-example tool that uses small examples of a dataset to show how the full dataset should be visualized. This post is a nice introduction to the project. Follow the links for an online demo.

Data Elixir is curated and maintained by Lon Riesberg. If you have questions or suggestions for the newsletter, just reply to this email. To find specific content from prior issues or to research topics, check out the catalogued Archives on Data Elixir's Search Page.