You’ve built it, now automate it with Workflows in Civis Platform

October 18, 2017 Lori E.

Today we’re launching Workflows in Civis Platform! This new feature is a data science automation tool that enables data scientists to cross small, manual tasks off their daily to-do lists so they can focus on what they do best: help decision makers see the big picture and make informed decisions. Workflows allow piecewise development, testing, and deployment of data science code, making it easier for data scientists to build robust, repeatable solutions.

Workflows in Civis Platform are a great solution for all types of data science work. With Workflows, teams can:

      • Run a Workflow every morning to update model scores, cut a list of sales targets from those scores, and upload the list to a CRM so sales teams have the best, most current information.
      • Periodically clean and unify multiple datasets to make them useful for the rest of a team’s modeling efforts.
      • Import the most up-to-date customer engagement data from multiple sources, then automatically deliver reports on each engagement metric to marketing teams.

Solving a data science problem starts with data and ends in actionable results — but what happens in the middle? There are many small, iterative steps a data scientist has to take to create data-driven solutions. From cleaning, appending, and unifying data, to choosing an algorithm, building a training set, and training a model, to scoring a dataset and shipping a report to the decision makers who need it…the path to action can get pretty complicated.

Getting all of this work automated and into production by yourself is a challenge. You could kick off each step manually whenever a new report is needed. Or you could write your own scripts to chain the work together. Or you could put every small step into one long script and hope it doesn’t break. These options are time-consuming and difficult to debug — distractions from the heart of a data scientist’s actual work.

Data scientists of the world: we see your struggles, and we’re here to help.

With Workflows, data scientists can use a graphical UI or YAML code to chain together imports, SQL scripts, Python scripts, models, exports, and more into flexible end-to-end data pipelines. From your simplest daily reporting to your most complicated data consolidation pipeline, a Workflow is the right tool for the job. Data scientists can set up a Workflow to automatically run on a schedule, which puts that Workflow into production and frees teams up to move on to solve the next problem.
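To make the idea of chaining steps concrete, here is a minimal sketch in plain Python. It does not use Civis Platform's Workflow YAML or API; the task names and the `run_pipeline` helper are hypothetical, chosen only to illustrate how a pipeline can be described as tasks plus their dependencies and then run in a valid order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical task names (not Civis identifiers).
# Each task maps to the set of tasks that must finish before it starts.
PIPELINE = {
    "import_data": set(),
    "clean_sql": {"import_data"},
    "score_model": {"clean_sql"},
    "export_report": {"score_model"},
}


def run_pipeline(pipeline, run_task):
    """Run every task after all of its dependencies; return the order used."""
    order = list(TopologicalSorter(pipeline).static_order())
    for name in order:
        run_task(name)
    return order


if __name__ == "__main__":
    # Stand-in for real work: just report which step would run.
    run_pipeline(PIPELINE, lambda name: print(f"running {name}"))
```

Putting a pipeline like this on a schedule would be a separate concern (a scheduler calling `run_pipeline` at set times); the sketch only covers ordering.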

Grace Shrader is an applied data scientist at Civis who investigates the effectiveness of digital advertising campaigns for one of our large clients. She used a Workflow to automate her pipeline:

Digital advertising is a hairy beast. I discovered this while working with a client to measure the effectiveness of their digital advertising campaigns. Building a reporting pipeline to capture a range of metrics (clicks, impressions, etc.) across numerous campaigns is a challenge, but it benefits the client because they can adjust their strategy to get a better return on their investment. In order to accurately assess the effectiveness of a digital campaign (and, more importantly, the “why” behind that performance), we needed data not only from the company doing the advertising, but also from several other partners. It takes a full understanding of all of these data sources to see which aspects of an ad campaign matter most for driving success and meeting our client’s KPIs.

Executing this process means:

      • Importing data from a multitude of sources, including Google Cloud Storage, Amazon S3, Box, and the client’s own databases.
      • Linking the data together in a way that enables us to extract the key relationships among features in all of the data sets.
      • Using our understanding of those relationships to make predictions, calculate KPIs, and ultimately obtain the answers we need to help our client optimize their advertising.

And we don’t just need to execute this process once; it has to run on a regular schedule in order to keep our analyses up to date.

As a visual thinker and communicator, I initially planned out my pipeline with Post-it notes on a whiteboard. Creating a pipeline is a highly visual task and I’d previously spent a fair amount of time sketching it out to keep my coworkers in the know.

Grace’s visualization method before Workflows.

Then came Platform Workflows.

Workflows automatically produces a visual representation that consistently reflects the current state of the job-flow sequence. Each step in the data pipeline is represented by a Workflow task, and the logical path is very comprehensible. Workflows enabled me to easily translate my sticky note vision into a functional pipeline and share that vision with my team.

All the data we accumulate for this work is measured in terabytes, so as you can imagine, processing efficiency is key. Workflows in Civis Platform gave me specificity and control: When two tasks in my pipeline didn’t depend on one another, I needed them to run in parallel to save time. I also needed multiple tasks to complete before subsequent jobs got started. Finally, I needed to define how my pipeline should proceed if one of its scripts failed.

Platform Workflows gave me the control to specify all this.

Grace’s visualization method after Workflows: the Workflow she used to build and share her pipeline.

With Workflows, I now have a series of data sets in Platform that are joined and aggregated in the way I need them, and they are always up to date without requiring me to touch my keyboard. My process is transparent, so both the client and my team here at Civis can view my Workflow in Platform directly. When we discuss solutions with the client, my Workflows provide a visual representation that makes communication so much easier. The client team can quickly understand how we at Civis combine their data to get the answers they want.

Workflows gave Grace a path to production. She had a seamless transition from sticky notes to a Workflow graph, and from there she could fill in the details for each node on the graph. Workflows helped her organize her project plan, build out a solution, and automate it for the entire team.
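The three kinds of control Grace describes (independent tasks running in parallel, a later step waiting for all of them, and a defined path when something fails) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not how Workflows is implemented: `run_parallel_then_join`, the task names, and the row counts are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel_then_join(parallel_tasks, join_task, on_failure):
    """Run independent tasks concurrently, then join only if all succeeded.

    parallel_tasks: dict of name -> zero-argument callable
    join_task: callable that receives a dict of name -> result
    on_failure: callable that receives a dict of name -> exception
    """
    results, errors = {}, {}
    with ThreadPoolExecutor() as pool:
        # Submit everything first so the tasks can overlap in time.
        futures = {name: pool.submit(task) for name, task in parallel_tasks.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result()  # blocks until this task is done
            except Exception as exc:
                errors[name] = exc
    if errors:
        # At least one upstream task failed: take the failure path instead.
        return on_failure(errors)
    return join_task(results)


if __name__ == "__main__":
    # Hypothetical imports that do not depend on each other.
    tasks = {
        "import_ad_data": lambda: 120,
        "import_partner_data": lambda: 80,
    }
    total_rows = run_parallel_then_join(
        tasks,
        join_task=lambda rows: sum(rows.values()),
        on_failure=lambda errs: f"failed: {sorted(errs)}",
    )
    print(total_rows)
```

In this sketch the join step never runs on partial results; a pipeline that should continue with whatever data arrived would need a different `on_failure` policy.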

Workflows allow Civis users to build a solution once, put it on a schedule, and have data-driven results automatically delivered to the people who need them the most. We’re excited to offer this feature to all Civis Platform users so they can recoup their time and solve more problems.

Check out this video to learn more and, for Civis Platform users, visit the Platform Help Center to get started!

The post You’ve built it, now automate it with Workflows in Civis Platform appeared first on Civis Analytics.
