ISSUE 269 · January 28, 2020

Insight

GPT-2 and the Nature of Intelligence
In many ways, GPT-2 works remarkably well, so does it matter that it doesn't know what it's talking about? In this essay, Gary Marcus explores what GPT-2 reveals about the nature of intelligence.

Sponsored Link

Data scientists are in demand on Vettery
Vettery is an online hiring marketplace that's changing the way people hire and get hired. Ready for a bold career move? Make a free profile, name your salary, and connect with hiring managers from top employers today.

Tools and Techniques

ML CO2 Impact
More and more, researchers are acknowledging that training big models requires a non-trivial amount of energy. How much energy? This interactive site estimates the carbon impact of your research based on your specific hardware, run-time and cloud provider. Also, see the related paper, Quantifying the Carbon Emissions of Machine Learning.

Bayesian Product Ranking at Wayfair
Wayfair's product catalog is massive. They have something for everyone, but that selling point makes it hard to know which items to present to new customers. In this post, Dave Harris and Tom Croonenborghs walk through Wayfair's new Bayesian system that considers a variety of factors to identify products with broad appeal.

What's wrong with computational notebooks?
Notebooks are wildly popular, but they're far from perfect. In this post, Austin Z. Henley summarizes the findings of a recent paper that explores key pain points, needs, and design opportunities.

Survey Shows AI & ML Are Still Nascent
277 data scientists across 20 industries confirmed the broad challenges observed in our customer organizations. Labeling training data for ML projects poses a significant obstacle for data science teams as they strive to prove ROI and get their projects into production.

Resources

Discovering millions of datasets on the web
Google's Dataset Search is now officially out of beta.
There are currently ~25 million datasets in the index, and it's still growing. Here's an overview of what's in the index now, how to access it, and how to make sure your own datasets get included.

Elements of Data Science
This introduction to data science is intended for people with no programming experience. The goal is to present a small subset of Python that allows you to do real work in data science as quickly as possible. This is by Allen Downey, author of Think Python, and will be used to teach introductory data science in courses at Olin and Harvard.

Data Viz

Understanding The Altair Stack
Altair is a Python visualization library that's rooted in the "Grammar of Graphics," like ggplot2. It's related to a few other visualization specifications and libraries, such as Vega-Lite, Vega, and D3.js. This is a super useful post that walks through what they each do, how they're connected, and ultimately, how to best utilize this visualization stack.

‘Twelve Million Phones, One Dataset, Zero Privacy’
In this interview, Stuart A. Thompson gives a behind-the-scenes look at how the “One Nation, Tracked” story came to life at the New York Times.

Conferences & Events

Strata Data & AI - London
Data feeds AI; AI makes sense of data. So it also made sense to combine the O’Reilly Strata Data and AI Conferences, covering two of the most pressing technological trends of the decade, and give you access to the full breadth of both programs. April 20-23. Details and Registration Info >>

Career

Analysis of compensation, level, and experience details of 19k tech workers
Chip Huyen analyzed compensation and levels data for 19k tech workers to answer questions about compensation, career paths, and bias. The data primarily covers software engineers in large U.S. companies, but this is a great analysis with broad insights across tech.

Getting into Sports Analytics 2.0
Great post by an industry insider about getting work in sports analytics.
Includes insights about university opportunities, where to find datasets for personal projects, competitions, conferences and more.

Job Board

Along with the jobs below, announcements on the Job Board this week include openings for a Head of Data at the NBA, a remote Data Scientist at Life Epigenetics, a Lead for NASA's Scientific Visualization Studio, and a Data Scientist at Apple. The Job Board is being actively curated. If you know of an interesting opening, get in touch!
Data Elixir is curated and maintained by @lonriesberg. For additional finds from around the web, follow Data Elixir on LinkedIn, Twitter or Facebook.