Best practice in data analysis is to begin with a question, form a hypothesis, design an experiment, and perform the appropriate analysis to find correlation or causation between dependent and independent variables. In business, this analysis is usually used to make predictions about future behavior of customers, sales teams, etc.
In reality, most businesses look to solve a problem at hand, for example fraudulent credit card transactions, by analyzing data they have and assuming certain relationships or linkages between either inputs or outcomes. This approach is problematic for several reasons – it is not generally randomized, and it is often overly localized analysis that can lead to overfitting or overly generalized analysis that can lead to analysis that tries to group all similar outcomes in one bucket of cause and effect.
Ayasdi is an enterprise software company that uses topological analysis to help data scientists understand the shape of their data. The science of topology is primarily concerned with the study of shape (i.e. linear, clustered, flared, etc.). With data, the shape encodes structure and meaning. Because business data is often large and high-dimensional, most companies bucket outcomes (eg. fraud) in a single category and apply one understanding of shape. By using shape as the first step in the analytical process, Ayasdi allows businesses to build principled local models.
In the example of credit card fraud, banks may experience fraud for many reasons but will create one rules-engine to provide predictive criteria for identifying fraudulent behavior. However, this approach will attempt to explain both the fraud perpetrated by a guy who steals wallets and uses the cards as well as the fraud where bulk user data is stolen from a retailer and used elsewhere. Clearly, these cases are different and you would expect different behavior between these groups – a clear example of clustering (see image of this analysis below).
But if you’ve applied only a clustering model you might not see similar trends across nodes or predict new suspicious behavior as seen in the clusters of predicted fraud shown in the image above to the right. Their results from an engagement with one credit card issuer is shown below:
The value creation in this case study is immediate and quantifiable in the amount of savings the credit card issuer achieved in better prediction of fraud. Value capture is done in two ways: first, in providing access to the software tools that can enable companies to do this analysis on their own; or two, through consulting engagements, working with partners to help them process and interpret the analysis.
Ayasdi started out of a Stanford research lab and has been funded by DARPA and several other research grants. They have now commercialized the product and raised several rounds of VC funding. Their challenge will be in clearly articulating their value to enterprises that may not have sophisticated data teams nor systems to analyze that data. Also, given what a hot industry analytics current is, the search for talent will be expensive and time consuming. However, they appear poised for growth and have the potential to greatly impact the way companies think about filtering, analyzing and using their data to drive intelligent business decisions.