Unsecured consumer debt in the United States totals roughly $3.8T, with student loans accounting for just over $1.4T of that figure[i]. Undergraduate and graduate students in the United States typically have limited credit history and spotty FICO scores, which restricts their options for affordable financing. For the last six months, I’ve served as the co-founder of an organization that negotiates student loan rates on behalf of large groups of graduate students[ii]. I’m heavily invested in the use of alternative data sources to assess the creditworthiness of students, an area in which SoFi, a leading fintech lender, has been innovating for the past seven years.
SoFi, founded in 2011, provides student loan refinancing and other financial products aimed at higher-income individuals who have yet to build material wealth. Many of these potential borrowers have limited credit histories, so figures like FICO scores provide an incomplete view of credit risk. In fact, nearly 20 percent of Americans have no FICO score at all[iii]. SoFi began using machine learning to assess the creditworthiness of borrowers by taking into account a wide array of factors that traditional lenders rarely consider – educational attainment, utility payments, insurance claims, and mobile phone usage, among others[iv]. The Philadelphia Federal Reserve argues that “adding alternative data into the mix may make it possible to open up more affordable credit for millions of additional consumers”[v]. This additional data is compared with actual credit payments over time to identify a large set of traits that make a borrower more or less likely to fulfill financial obligations.
SoFi uses machine learning to identify new customers and to learn which ancillary products it should recommend to existing customers. Today, SoFi’s data science team mines large datasets and runs various regression techniques to tease out the relationship, if any, between attributes in those datasets and creditworthiness.
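To make the idea of regressing repayment on alternative attributes concrete, here is a minimal sketch in Python. It is not SoFi’s model: the two features (share of utility bills paid on time, whether the borrower holds a graduate degree), the ten synthetic borrowers, and the function names are all assumptions chosen for illustration. It fits a plain logistic regression by gradient descent and then scores two hypothetical applicants.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Estimated probability of repayment for one applicant."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit logistic regression with batch gradient descent."""
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = predict(weights, bias, x) - y  # gradient of the log loss
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

# Synthetic, hypothetical data: [share of utility bills paid on time, has graduate degree]
rows = [[0.95, 1], [0.90, 0], [0.85, 1], [0.80, 1], [0.70, 0],
        [0.60, 0], [0.55, 0], [0.50, 1], [0.40, 0], [0.30, 1]]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]  # 1 = repaid, 0 = defaulted

weights, bias = fit_logistic(rows, labels)
p_strong = predict(weights, bias, [0.90, 1])  # pays bills on time, has degree
p_weak = predict(weights, bias, [0.40, 0])    # often late, no degree
print(f"strong applicant: {p_strong:.2f}, weak applicant: {p_weak:.2f}")
```

On this toy data the fitted weight on the utility-payment feature is positive, so the applicant with the stronger payment record receives the higher estimated repayment probability. A real dataset would need far more rows, feature scaling, and out-of-sample validation.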
Credit history datasets are unwieldy, comprising millions of rows and thousands of columns merged into a single file[vi]. Storing, cleaning, and analyzing that data can take considerable time, so much of SoFi’s current action plan is aimed at increasing computational capacity and reducing the time it takes to analyze the data. Yan Wu, SoFi’s Head of Analytics, recently said, “the most important thing that cloud computing has done is make incredibly high-powered machines available for testing. What happens is cycle times become shorter and iterations become quicker”[vii]. With faster iterations, SoFi can more readily learn whether a new data source improves its ability to predict credit risk.
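One way to judge whether a new data source improves predictions is to compare a ranking metric such as AUC (the probability that a randomly chosen repaying borrower is scored above a randomly chosen defaulting one) for a model with and without the new data. The sketch below assumes we already have risk scores from two such models; the score lists and labels are made up for illustration and do not come from SoFi.

```python
def auc(scores, labels):
    """Share of (repaid, defaulted) pairs where the repaid borrower scores higher.

    Ties count as half a win.
    """
    repaid = [s for s, y in zip(scores, labels) if y == 1]
    defaulted = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for r in repaid:
        for d in defaulted:
            if r > d:
                wins += 1.0
            elif r == d:
                wins += 0.5
    return wins / (len(repaid) * len(defaulted))

# Hypothetical outcomes and model scores for six borrowers (1 = repaid).
labels = [1, 1, 1, 0, 0, 0]
baseline_scores = [0.80, 0.40, 0.60, 0.50, 0.30, 0.70]  # traditional data only
enriched_scores = [0.90, 0.60, 0.70, 0.40, 0.20, 0.65]  # plus the new data source

baseline_auc = auc(baseline_scores, labels)
enriched_auc = auc(enriched_scores, labels)
print(f"baseline AUC: {baseline_auc:.3f}, enriched AUC: {enriched_auc:.3f}")
```

Here the enriched scores order 8 of the 9 repaid/defaulted pairs correctly versus 6 of 9 for the baseline. In practice the comparison should be run on held-out data, and each quicker iteration of this loop is one more data source tested.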
Once SoFi has customers in the funnel, it seeks to cross-sell new financial products and to increase its share of wallet. It is currently releasing a financial planning app, which has tens of thousands of customers on a waiting list[viii]. Through this app, SoFi provides insights to customers on how others in a similar age and income bracket are using their money. It plans to use machine learning to identify financial products that individual customers are likely to benefit from and is developing a recommendation algorithm to market those services to customers.
SoFi’s current strategy is based on collecting data later in a customer’s lifecycle, specifically after graduation. It could benefit by: 1) expanding its customer funnel and lending to students while they are still in school, and 2) publishing data on expected outcomes for students based on major, school, and a variety of other factors.
If SoFi enters the direct student lending market, it could leverage its rich refinancing data to price private loans better than existing players in a highly fragmented market. It could look at its existing dataset to identify the least risky graduate programs, based on repayment rates on refinanced loans from those programs.
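As a rough illustration of ranking programs by repayment performance, the Python sketch below aggregates refinanced-loan outcomes by graduate program. The records, program names, and the `repayment_rates` helper are hypothetical; SoFi’s actual data and method are not public.

```python
from collections import defaultdict

def repayment_rates(loans):
    """Return {program: share of loans repaid} from (program, repaid) records."""
    totals = defaultdict(lambda: [0, 0])  # program -> [repaid count, loan count]
    for program, repaid in loans:
        totals[program][0] += 1 if repaid else 0
        totals[program][1] += 1
    return {program: r / t for program, (r, t) in totals.items()}

# Hypothetical refinanced loans: (graduate program, repaid in full?)
loans = [
    ("MBA", True), ("MBA", True), ("MBA", False), ("MBA", True),
    ("JD", True), ("JD", False), ("JD", False), ("JD", True),
    ("MD", True), ("MD", True), ("MD", True), ("MD", True),
]

rates = repayment_rates(loans)
ranked = sorted(rates, key=rates.get, reverse=True)  # least risky first
for program in ranked:
    print(f"{program}: {rates[program]:.0%} repaid")
```

With so few loans per program the ranking is only indicative; a real analysis would require a minimum sample size per program and would control for loan size and vintage.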
SoFi should also consider providing a public service that could build its brand awareness among students before they even choose which school to attend or which major to study. Most schools publish high-level data such as median starting salary and employment rate. SoFi could begin publishing much more granular data that allows prospective students to input some basic assumptions, such as school, intended major, graduate program (if applicable), and desired city/state, and see expected financial outcomes based on historical data.
The use of machine learning presents the opportunity to democratize risk and increase access to financing for millions of people who may otherwise be denied it. A broad array of lenders, not just SoFi, are using alternative data sources to assess risk. As these techniques become more prevalent, we must be mindful that historical data is subject to historical biases. Marginalized groups who may have been systemically denied access to capital in the past may have incomplete data to draw from. As we add new data sources, we should always ask whether these new metrics are biased against certain portions of the population.
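A simple first check on that question is to compare approval rates across groups after a new metric is added. The Python sketch below uses invented decisions for two placeholder groups, "A" and "B", and computes the ratio of the lowest to the highest approval rate. The 0.8 threshold is an assumption borrowed from the "four-fifths" heuristic used in US employment guidelines; it is a screening signal, not a legal test for lending, and a low ratio is a prompt for investigation rather than proof of bias.

```python
def approval_rates(decisions):
    """Return {group: share approved} from (group, approved) records."""
    counts = {}
    for group, approved in decisions:
        approved_so_far, total = counts.get(group, (0, 0))
        counts[group] = (approved_so_far + int(approved), total + 1)
    return {g: a / t for g, (a, t) in counts.items()}

def adverse_impact_ratio(rates):
    """Lowest group approval rate divided by the highest."""
    return min(rates.values()) / max(rates.values())

# Hypothetical decisions: group A has 8 of 10 approved, group B has 5 of 10.
decisions = (
    [("A", True)] * 8 + [("A", False)] * 2 +
    [("B", True)] * 5 + [("B", False)] * 5
)

group_rates = approval_rates(decisions)
ratio = adverse_impact_ratio(group_rates)
flagged = ratio < 0.8  # assumed screening threshold, see note above
print(f"approval rates: {group_rates}, ratio: {ratio:.3f}, flagged: {flagged}")
```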
[i] New York Federal Reserve, Student Loan Data and Demographics [Excel Download], 2017, URL: https://www.newyorkfed.org/microeconomics/topics/student-debt, Accessed Nov 2018
[iii] Consumer Financial Protection Bureau, “Data Point: Credit Invisibles”, May 2015, URL: https://files.consumerfinance.gov/f/201505_cfpb_data-point-credit-invisibles.pdf, Accessed Nov 2018
[iv] Peter Rudegeair, “Silicon Valley: We Don’t Trust FICO Scores”, The Wall Street Journal, Jan 11, 2016, URL: https://www.wsj.com/articles/silicon-valley-gives-fico-low-score-1452556468, Accessed Nov 2018
[v] Julapa Jagtiani, “The Roles of Alternative Data and Machine Learning in Fintech Lending: Evidence From the LendingClub Platform”, Philadelphia Federal Reserve, 2018, Pg. 3, URL: https://www.philadelphiafed.org/-/media/research-and-data/publications/working-papers/2018/wp18-15.pdf, Accessed Nov 2018
[vi] Thomson Reuters, “SoFi’s data science head: Opening the funnel to non-traditional borrowers with machine learning”, October 17, 2018, Pg. 2, URL: https://blogs.thomsonreuters.com/answerson/sofis-data-science-head-opening-the-funnel-to-non-traditional-borrowers-with-machine-learning/, Accessed Nov 2018
[vii] Thomson Reuters, “SoFi’s data science head: Opening the funnel to non-traditional borrowers with machine learning”, October 17, 2018, Pg. 3, URL: https://blogs.thomsonreuters.com/answerson/sofis-data-science-head-opening-the-funnel-to-non-traditional-borrowers-with-machine-learning/, Accessed Nov 2018
[viii] Ainsley Harris, “Are you ready to ditch your bank? SoFi is betting its future on it”, Fast Company, June 19, 2018, URL: https://www.fastcompany.com/40585328/are-you-ready-to-ditch-your-bank-sofi-is-betting-its-future-on-it, Accessed Nov 2018