Civis Data Science R&D Bookshelf

September 1, 2017 Michael H.

This post is part of a new series from the Data Science R&D department at Civis Analytics. In this series, a Civis data scientist will share some links to interesting software tools, blog posts, scientific articles, and other things that he or she has read about recently, along with a little commentary about why these things are worth checking out. Are you reading anything interesting? We’d love to hear from you on Twitter.

pyheatmagic

This is a nice little IPython magic that can be used in a Jupyter notebook to see which parts of your code are taking a long time to run. It also demonstrates the awesomeness of open source software, since it builds on the existing packages pprofile and matplotlib. At Civis, we’re particularly interested in cool Jupyter notebook add-ons like this because we recently released Jupyter notebooks as a feature in our data science platform.

ipywidgets

I spent some time the past week exploring Jupyter/IPython widgets (ipywidgets). Jupyter notebooks are a great way to integrate code with documentation and outputs such as visualizations. In some cases, however, a data scientist may want to expose their methods to a less-technical user as an interactive web app. The R ecosystem has a great tool for this use case in R Shiny, but it’s a little less clear what the “market leader” is for Python data science apps. After trying ipywidgets out for a few different applications, I’m really optimistic about them as a way to make notebooks more interactive and/or to make data science web app development easier, depending on how you look at it.
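To give a flavor of what this looks like, here is a minimal sketch using the `interact` helper from ipywidgets. The function `scale_and_shift` and its slider ranges are hypothetical stand-ins I made up for illustration, not anything from a real Civis app. The import is guarded so the plain function still works where ipywidgets isn't installed.

```python
# A plain Python function: a hypothetical stand-in for a method a data
# scientist might want to expose to a less-technical user.
def scale_and_shift(x, slope=1.0, intercept=0.0):
    """Return slope * x + intercept."""
    return slope * x + intercept


try:
    # Third-party package; only needed when running inside a notebook.
    from ipywidgets import interact
except ImportError:
    interact = None

if interact is not None:
    # In a Jupyter notebook, each (min, max[, step]) tuple of floats becomes
    # a slider, and the function is re-run whenever a slider moves.
    interact(
        scale_and_shift,
        x=(0.0, 10.0),
        slope=(-2.0, 2.0, 0.1),
        intercept=(-5.0, 5.0, 0.5),
    )
```

The appealing part is that the function itself stays ordinary Python. The widget layer is a thin wrapper, so the same code can be tested or reused outside the notebook.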

CS 294: Fairness in Machine Learning

I came across this course about fairness in machine learning at UC Berkeley. It’s really great to see courses like this popping up, so that data scientists and others use machine learning algorithms carefully and appropriately rather than just as magical black boxes that might perpetuate human biases. The syllabus here looks like an excellent collection of further reading on this topic.

The Innovators

This book by Walter Isaacson is a couple of years old, but I just finished it and was blown away. It tells the story of the pioneers of computing, from early visionaries such as Ada Lovelace and Alan Turing (both of whom have Civis conference rooms named after them) to the advent of the Internet, the World Wide Web, search engines, etc. It also highlights two themes that I find appealing: the importance of teams in modern technology, and the power of human-computer interaction. Protip: reading this while watching Halt and Catch Fire will turn your sense of tech nostalgia up to 11.

The post Civis Data Science R&D Bookshelf appeared first on Civis Analytics.
