Imagine you founded a successful business…
Your product is getting traction and even revenue. You’re also collecting tons of interesting data. To stay competitive you need to use that data, perhaps to predict user behavior, recommend content, customize the experience, or price more effectively. But that means you must hire a data scientist, and data science is one of the most in-demand and expensive skill sets on the market today. Even if you’re able to find a good one, you will end up paying him or her more than $150,000 a year in salary alone. And it is going to take a few months for that person to ramp up… In comes DataRobot.
DataRobot is a software company that attempts to automate the work done by data scientists. It promises to be better, faster, and cheaper.
This sounds too good to be true…
DataRobot uses brute force to ingest your dataset and algorithmically output the best predictive model for it. The platform evaluates thousands of models from open source libraries. It then “searches through millions of possible combinations of algorithms, pre-processing steps, features, transformations and tuning parameters to deliver the best models for your dataset and prediction target”. The process is computationally expensive, but far more comprehensive, cheaper, and quicker than hiring a data scientist.
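To make the brute-force idea concrete, here is a toy sketch in plain Python. It is not DataRobot's code or API. The transforms, the candidate models, the synthetic dataset, and every function name below are assumptions made up for illustration. The sketch tries every combination of preprocessing step and model, scores each with cross-validation, and ranks the results on a leaderboard.

```python
import itertools
import math
import random
from statistics import mean

# Hypothetical preprocessing steps (feature transformations).
def identity(x):
    return x

def log1p(x):
    return math.log1p(x)

# Hypothetical candidate models: each fit function returns a predict function.
def fit_mean(xs, ys):
    m = mean(ys)
    return lambda x: m  # always predicts the training average

def fit_linear(xs, ys):
    # Ordinary least squares with a single feature.
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var if var else 0.0
    return lambda x: my + slope * (x - mx)

def make_knn(k):
    # k-nearest-neighbours regression; k is the tuning parameter.
    def fit(xs, ys):
        pairs = list(zip(xs, ys))
        def predict(x):
            nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
            return mean(y for _, y in nearest)
        return predict
    return fit

def cv_mse(fit, transform, xs, ys, folds=5):
    """Mean squared error, averaged over k-fold cross-validation."""
    txs = [transform(x) for x in xs]
    errors = []
    for f in range(folds):
        train = [i for i in range(len(xs)) if i % folds != f]
        test = [i for i in range(len(xs)) if i % folds == f]
        predict = fit([txs[i] for i in train], [ys[i] for i in train])
        errors.append(mean((predict(txs[i]) - ys[i]) ** 2 for i in test))
    return mean(errors)

def search(xs, ys):
    """Score every (transform, model) combination; lowest error first."""
    transforms = {"identity": identity, "log1p": log1p}
    models = {"mean": fit_mean, "linear": fit_linear}
    models.update({f"knn(k={k})": make_knn(k) for k in (1, 3, 5)})
    leaderboard = []
    for (t_name, t), (m_name, m) in itertools.product(transforms.items(), models.items()):
        leaderboard.append((cv_mse(m, t, xs, ys), t_name, m_name))
    return sorted(leaderboard)

# Synthetic dataset (an assumption): y depends on log(1 + x), plus noise.
rng = random.Random(0)
xs = [rng.uniform(0, 100) for _ in range(200)]
ys = [3 * math.log1p(x) + rng.gauss(0, 0.1) for x in xs]

leaderboard = search(xs, ys)
for score, t_name, m_name in leaderboard[:3]:
    print(f"{m_name} with {t_name}: cross-validated MSE {score:.4f}")
```

This toy has only ten combinations. The real search space described above runs to millions, which is why the approach is so hungry for compute.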
DataRobot creates value by running state-of-the-art analytics on your datasets. It captures that value by charging a fee for use of an API that encapsulates the best predictive model it found for that dataset. So now, with only one additional line of code, your software makes an API call and gains a predictive analytics layer. It’s like magic!
For a visual explanation of how DataRobot works, see this video.
Who else is out there?
Today, DataRobot’s primary competition is data scientists. Good ones are hard to find, easy to lose, and expensive to retain, so they do not pose a great threat to DataRobot. But other companies are attempting to build platforms just like DataRobot, such as Watson Analytics and Loom Systems. There is even machineJS, a push in the open source community to build this kind of platform. These pose a more serious threat than humans.
DataRobot’s “brute force” method, which tests every open source predictive model on a dataset and picks the best combination, is not much of a secret, so these competitors can likely catch up easily. What is more, the brute force method requires a lot of servers and computational power, which is expensive. Bigger competitors such as Amazon and Google, which have their own cloud infrastructure, can do this at lower cost than DataRobot, which relies on AWS for the heavy lifting.
However, DataRobot has one interesting advantage: data about its data. It can develop a broad understanding of which algorithmic methods work best for a given kind of dataset. Given a certain data type, distribution, size, or shape, it can look to previous analyses it has conducted and reduce its compute time and costs. To stay competitive, DataRobot must optimize using, and learn from, its own previous analysis attempts.
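Here is one way the “data about its data” idea could look in code. It is a rough sketch, not a description of how DataRobot actually works. The log of past runs, the meta-feature names, and the model names are all hypothetical. The idea is to describe each dataset with a few summary numbers, find the most similar datasets seen before, and try their winning models first instead of searching everything.

```python
import math
from collections import Counter

# Hypothetical log of past searches: (dataset meta-features, winning model).
# All feature names, values, and model names are invented for illustration.
PAST_RUNS = [
    ({"log_rows": 3.0, "n_features": 10, "frac_categorical": 0.8}, "gradient_boosting"),
    ({"log_rows": 3.2, "n_features": 12, "frac_categorical": 0.7}, "gradient_boosting"),
    ({"log_rows": 6.0, "n_features": 200, "frac_categorical": 0.0}, "linear_model"),
    ({"log_rows": 5.8, "n_features": 180, "frac_categorical": 0.1}, "linear_model"),
    ({"log_rows": 3.1, "n_features": 15, "frac_categorical": 0.9}, "random_forest"),
]

def distance(a, b):
    # Plain Euclidean distance over meta-features. A real system would
    # normalize each feature first so that no single one dominates.
    return math.sqrt(sum((a[key] - b[key]) ** 2 for key in a))

def shortlist(meta, past_runs, k=3, top=2):
    """Return the models that won most often on the k most similar past datasets."""
    nearest = sorted(past_runs, key=lambda run: distance(meta, run[0]))[:k]
    votes = Counter(model for _, model in nearest)
    return [model for model, _ in votes.most_common(top)]

new_dataset = {"log_rows": 3.1, "n_features": 11, "frac_categorical": 0.75}
candidates = shortlist(new_dataset, PAST_RUNS)
print(candidates)  # ['gradient_boosting', 'random_forest']
```

The full brute-force search would then run only on this shortlist, or run it first, which is where the savings in compute time and cost would come from.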
So while today we see data scientists and software engineers building tools that successfully replace many high-skilled jobs, we are about to see them automate their own work too!