Civis Bookshelf: Modeling to predict sarcasm with emoji

September 15, 2017 Keith I.

This post is part of our Bookshelf series organized by the Data Science R&D department at Civis Analytics. In this series, Civis data scientists share links to interesting software tools, blog posts, scientific articles, and other things that they have read about recently, along with a little commentary about why these things are worth checking out. Are you reading anything interesting? We’d love to hear from you on Twitter.

Learning To Optimize With Reinforcement Learning

Neural networks are a surprisingly useful tool, helping us classify images, translate text, and even play Atari. They provide a very general framework for creating algorithms. Interestingly, the process of training a neural network is itself an algorithm, which means it can be carried out by another neural network. This blog post summarizes two papers that demonstrate the idea. Their evidence suggests that having one neural network train another might reduce training time compared with traditional methods based on stochastic gradient descent.
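To make the idea of a learned optimizer concrete, here is a minimal toy sketch in Python with NumPy. It is not the method from the papers: a two-parameter update rule (a step size and a momentum term) stands in for the optimizer network, random quadratic functions stand in for the networks being trained, and plain random search stands in for the reinforcement learning named in the post's title. Every name here (`sample_problem`, `run_optimizer`, `meta_loss`) is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_problem():
    """Draw a random 2-D quadratic f(x) = 0.5 * sum(a * x**2) and a start point."""
    a = rng.uniform(0.5, 2.0, size=2)
    x0 = rng.normal(size=2)
    return a, x0

def run_optimizer(theta, a, x0, steps=20):
    """Run the parameterized update rule and return the final objective value."""
    step_size, momentum = theta
    x, velocity = x0.copy(), np.zeros_like(x0)
    for _ in range(steps):
        grad = a * x                                   # gradient of the quadratic
        velocity = momentum * velocity - step_size * grad
        x = x + velocity
    return 0.5 * np.sum(a * x**2)

def meta_loss(theta, problems):
    """Average final objective over a set of training problems."""
    return np.mean([run_optimizer(theta, a, x0) for a, x0 in problems])

# "Train the optimizer": search for update-rule parameters that do well
# across many problems, starting from a deliberately poor rule.
problems = [sample_problem() for _ in range(50)]
theta = np.array([0.01, 0.0])
initial = meta_loss(theta, problems)
best = initial
for _ in range(200):
    candidate = theta + rng.normal(scale=0.05, size=2)
    loss = meta_loss(candidate, problems)
    if loss < best:                                    # keep only improvements
        theta, best = candidate, loss

print("learned (step_size, momentum):", theta)
print("average final loss before and after:", initial, best)
```

The point of the sketch is the two nested loops: the inner loop applies the update rule to a problem, and the outer loop adjusts the update rule itself based on how well the inner loop performed.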

Detecting Sarcasm With Emoji

Sarcasm is a notoriously difficult concept to identify with the methods of natural language processing. For example, in this paper the authors found that even people have a hard time agreeing whether a tweet was sarcastic or not! With DeepMoji, the authors tackle sarcasm detection (and other NLP tasks such as sentiment analysis) using a fun and indirect approach. First, the authors build a model to predict which emoji a tweet contains. There is a lot of labeled data for this task, so the model is well-tuned. Next, the authors apply this emoji model to a new task like sarcasm detection. They slightly adjust the emoji model using the small amount of labeled data they have for the specific sarcasm detection task. Surprisingly, this technique works 💯😃👍. It seems that building a model on emoji results in a model that can (approximately) predict the emotion of a tweet, which is useful for predicting sarcasm.
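The two-step recipe described above, pretraining on a large emoji-labeled dataset and then adapting to a small sarcasm dataset, can be sketched in a few lines of NumPy. This is a toy illustration and not the DeepMoji model: the "tweets" are random synthetic vectors, the network is a single tanh layer, and the adjustment step simply freezes the pretrained layer and fits a new output layer on top of it, which is one simple way to adapt a model with little labeled data. All names (`X_emoji`, `W_hidden`, `w_sarcasm`, and so on) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_hidden, n_emoji = 20, 8, 5

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)       # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Step 1: pretrain on a large synthetic "emoji" dataset (2000 examples).
X_emoji = rng.normal(size=(2000, n_features))
y_emoji = (X_emoji @ rng.normal(size=(n_features, n_emoji))).argmax(axis=1)

W_hidden = rng.normal(scale=0.1, size=(n_features, n_hidden))
W_emoji = rng.normal(scale=0.1, size=(n_hidden, n_emoji))
for _ in range(300):
    H = np.tanh(X_emoji @ W_hidden)             # shared hidden representation
    d_logits = softmax(H @ W_emoji)
    d_logits[np.arange(len(y_emoji)), y_emoji] -= 1.0
    d_logits /= len(y_emoji)                    # cross-entropy gradient w.r.t. logits
    d_hidden = (d_logits @ W_emoji.T) * (1.0 - H**2)
    W_emoji -= 0.5 * (H.T @ d_logits)
    W_hidden -= 0.5 * (X_emoji.T @ d_hidden)

# Step 2: reuse the pretrained layer on a small synthetic "sarcasm" dataset
# (100 examples). The hidden layer is frozen; only a new output layer is fit.
X_sarcasm = rng.normal(size=(100, n_features))
y_sarcasm = (X_sarcasm @ rng.normal(size=n_features) > 0).astype(float)

features = np.tanh(X_sarcasm @ W_hidden)        # pretrained features, not updated
w_sarcasm, b_sarcasm = np.zeros(n_hidden), 0.0

def log_loss(w, b):
    p = sigmoid(features @ w + b)
    return -np.mean(y_sarcasm * np.log(p) + (1 - y_sarcasm) * np.log(1 - p))

loss_before = log_loss(w_sarcasm, b_sarcasm)
for _ in range(500):
    error = sigmoid(features @ w_sarcasm + b_sarcasm) - y_sarcasm
    w_sarcasm -= 0.5 * (features.T @ error) / len(y_sarcasm)
    b_sarcasm -= 0.5 * error.mean()
loss_after = log_loss(w_sarcasm, b_sarcasm)

print("sarcasm log loss before and after fine-tuning:", loss_before, loss_after)
```

Because the data here are random, the sketch shows only the mechanics of reusing a pretrained layer, not the accuracy gains reported for DeepMoji. With real tweets, the hope is that the features learned for emoji prediction already encode emotional tone, so the small sarcasm dataset only has to teach the final layer.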

Tips for Project Managers Building Internal Tools

On the R&D team at Civis, we write a lot of tools that are used around the company (some of which we open source!). This article has some great tips on developing and releasing software to colleagues. Briefly, the article suggests that you pay attention to how your colleagues use your software, make it easier for them based on what you see, and be up front about what you won’t or can’t do. I couldn’t agree more with the author on how rewarding it is to create internal tools. Talking to and hearing from users is a great part of writing software.

A Machine Learning Research Paper Aggregator

Machine learning is developing so rapidly it can be hard to keep up. From SELU to ByteNet to Sparsely-Gated Mixture-of-Experts, researchers are presenting new techniques at a blistering rate and it’s difficult to understand what’s most important. Andrej Karpathy, Tesla’s Director of AI, created a website that tracks the papers mentioned by machine learning experts on Twitter. While I still find talking to my colleagues to be the best way to learn about new results, this aggregator often picks up on the biggest findings very quickly.

The post Civis Bookshelf: Modeling to predict sarcasm with emoji appeared first on Civis Analytics.
