Don’t be fooled by the friendly emoji in the company’s actual name: HuggingFace means business. What started out in 2016 as a humble chatbot company with investors like Kevin Durant has become a central provider of open-source natural language processing (NLP) infrastructure for the AI community. HuggingFace boasts an impressive list of users, including the big four of the AI world (Facebook, Google, Microsoft, and Amazon). What’s most surprising is that, despite their completely open-source business model, HuggingFace has been cash-flow positive and maintains a staff of under 100 people. This blog post will describe the basics of their business model and attempt to explain how they’ve accomplished so much with so little.
HuggingFace’s core product is an easy-to-use NLP modeling library. The library, Transformers, is both free and ridiculously easy to use. With as few as three lines of code, you could be using cutting-edge NLP models like BERT or GPT-2 to generate text, answer questions, summarize larger bodies of text, or perform any number of other standard NLP tasks. The library is fully compatible with popular deep learning frameworks like PyTorch and TensorFlow. Furthermore, the library provides simple hooks for custom training or fine-tuning of existing models.
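To give a feel for how little code that is, here is a minimal sketch of text generation with GPT-2 through the library’s `pipeline` helper. It assumes that `transformers` and a backend such as PyTorch are installed and that the machine can download the model weights on first use. The wrapper name `generate_text`, the prompt, and the default of 20 new tokens are illustrative choices of mine, not anything prescribed by HuggingFace.

```python
def generate_text(prompt, max_new_tokens=20):
    """Continue `prompt` with GPT-2 (hypothetical helper, for illustration only)."""
    # Imported lazily so the heavy dependency is only loaded when the function runs.
    from transformers import pipeline

    # Builds a text-generation pipeline; the GPT-2 weights are downloaded on first use.
    generator = pipeline("text-generation", model="gpt2")

    # The pipeline returns a list of dicts, one per generated sequence.
    outputs = generator(prompt, max_new_tokens=max_new_tokens)
    return outputs[0]["generated_text"]
```

Calling `generate_text("HuggingFace means business because")` should return the prompt followed by a short continuation. The three lines inside the function (import, build the pipeline, call it) are the whole workflow; the rest is packaging.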
HuggingFace does not charge for their core product; instead, they open-source it. The models and core library are all available on GitHub under the Apache License 2.0, an extremely permissive license that allows others to build on and capture value from their work with very few conditions. The company is active in responding to technical issues encountered by its users, and generally seems to have a goal of promoting as much adoption of their models as possible.
The core value of HuggingFace comes from distilling the work of the broader research community and making it accessible via thoughtful tool design. For the most part, HuggingFace does not research its own models, but rather builds on the research of others. Importantly, the research community has a norm of sharing the product of research as open-source code as well, which enables HuggingFace to do this at extremely low cost. HuggingFace spends a lot of effort on the software design that makes their models accessible to others; the heavy focus on UX is a big reason for their popularity in the research community.
Beyond their core products, HuggingFace is deeply embedded within the NLP research community, and uses that position to create additional value. HuggingFace organizes a large community of users who share the company’s norms around openness. They collaborate with universities and larger companies on research papers. They’ve coordinated with large MLOps infrastructure providers to ensure their service is available on the main cloud computing services (e.g. AWS SageMaker). One PhD researcher I’ve spoken with went as far as to say “I don’t really know how I’d do [big-model] NLP research without HuggingFace”.
Amazingly, the company has been cash-flow positive for over a year. The company does this by providing consulting and infrastructure services to aid in the use and application of their product. In particular, the company’s specialty in operating large language models enables them to help other companies run these models efficiently at scale. Demand for this type of service has exploded this past year, with a sharp rise in the demand for document classification from many organizations.
HuggingFace is effectively pioneering a new business model, pushing the business models of AI away from capturing value from models directly, and towards capturing value from the complementary products and processes necessary for deploying them. The company is betting that machine learning will be as important in the future as software engineering is today (source).
Is this a sustainable business model? It’s hard to argue with the results. The core reason they are profitable is that they have extremely low costs relative to the value that they are creating. The company successfully raised a Series B round early last year to grow the size of their team, resisting acquisition interest from the big tech companies. It seems fairly clear, though, that they’re leaving tremendous value to be captured by others, especially those providing the technical infrastructure necessary for AI services. However, their openness does seem to generate a lot of benefit for our society. For that reason, HuggingFace deserves a big hug.