The New York Times’ Dr. Chris Wiggins Talks Data Science
Dr. Chris Wiggins, chief data scientist at the New York Times and associate professor of applied mathematics at Columbia University, was the inaugural lecturer for the College of Science’s Lecture on Data Science on Monday, January 14, in Patrick F. Taylor Hall.
At Columbia, he is a founding member of the executive committee of the Data Science Institute and of the Department of Systems Biology, and is affiliated faculty in Statistics. He is also a Fellow of the American Physical Society and a recipient of Columbia’s Avanessians Diversity Award. In addition, Wiggins co-founded and co-organized hackNY, a nonprofit that since 2010 has organized once-a-semester student hackathons and the hackNY Fellows Program, a structured summer internship at NYC startups.
Dr. Wiggins used the phrase “new tools require new mindsets” when talking about the changes scientists have had to make in their research as technology has expanded the data available to them. When genome sequencing brought big data sets to biology, there was a need for people who could make sense of the large amounts of data. After years of technological advances, data sets grew larger in areas of research outside the natural sciences as well.
Data scientists are now needed in almost every field to sift through the numbers and turn them into something industry leaders can use. When it comes to working in industry and collaborating with people in other business areas like marketing and finance, Wiggins said data scientists need to be able to develop and deploy machine learning solutions to newsroom and business problems as easy-to-use software.
“As a data scientist, you are supposed to work with people from different domains,” said Wiggins. “You are the broker between the data and the collaborator you are working with.”
Wiggins broke down data science into three categories: descriptive data sets, predictive models, and prescriptive solutions. He described a descriptive data set as one that characterizes the type of reader different New York Times articles can target and captures readers’ reading behaviors, which helps in recommending articles to them.
The New York Times uses predictive models to predict which potential readers will subscribe and which subscribers will cancel their subscriptions. By looking at subscribers who cancel, the Times can also pinpoint risky behaviors that might lead more subscribers to cancel. Wiggins has also helped create software that can predict how a reader will feel after reading an article, which advertisers can use when deciding where to place their ads.
Prescriptive solutions are used to recommend the best treatments to reach certain goals. For the New York Times, these treatments include using data to recommend articles to readers based on articles they have read in the past and to recommend how to use social media more scientifically to reach readers.
When an attendee asked Wiggins where he thought data science was headed in the next five years, he responded, “I am most excited about seeing advances on how to learn causality from observation, understanding deep learning and hard-to-interpret data, and health data, specifically with self-reporting mobile apps.”