Back in February, I got the chance to speak at O’Reilly’s Strata+Hadoop World 2015 about some of the unique data science work that we do here at Civis. We recently got a video of the talk and thought this would be a great time to share it.
Over the last year, we’ve spent a good chunk of time thinking about the ways in which data scientists and social scientists can work together. While we don’t always think the same way, use the same tools, or even speak the same mathematical languages, here at Civis we’re determined to leverage the strengths of both disciplines to understand, predict, and change human behavior for the better. In my talk, I go through some of the challenges and benefits of data science/social science collaborations and share some success stories from our projects. Here’s the video, with a quick summary from the abstract below. Enjoy!
The Two Cultures of People Science
When it comes to understanding how people behave, data scientists are relative newcomers to a game that social scientists have been playing for a long time. With the rise of vast new amounts of data, and of the tools and techniques that data scientists bring to bear in analyzing that data, a wealth of new collaboration opportunities has opened up. But along with those opportunities come challenges. While the two disciplines don’t face anything like C.P. Snow’s “mutual incomprehension”, they also aren’t natural bedfellows. Data science is steeped in the culture of technology, computer science, and other “harder” sciences. The tools and techniques that come naturally to social scientists (surveys, randomized controlled trials, natural experiments) aren’t usually part of a data scientist’s training. And machine learning, optimization, and massively parallel computing aren’t traditional parts of a social scientist’s toolkit either.
This talk will present tips for truly productive collaboration based on several successfully executed case studies from Civis Analytics, where we employ a team of social scientists, political scientists, and data scientists to understand the many facets of human behavior. These include using simulated annealing optimization methods to generate representative poll weightings, novel transfer learning techniques from machine learning for modeling small surveys, Bayesian MCMC methods for poll combination, and the design and modeling of randomized controlled experiments to understand the true causal impact of various interventions.