Hugging Face: Embracing Natural Language Processing

Learn how the leading provider of large language models does it with a completely open-source business model

Don’t be fooled by the friendly emoji in the company’s actual name: HuggingFace means business. What started out in 2016 as a humble chatbot company with investors like Kevin Durant has become a central provider of open-source natural language processing (NLP) infrastructure for the AI community. HuggingFace boasts an impressive list of users, including the big four of the AI world (Facebook, Google, Microsoft, and Amazon). What’s most surprising is that, despite their completely open-source business model, HuggingFace has been cash-flow positive and maintains a staff of under 100 people. This blog post will describe the basics of their business model and attempt to explain how they’ve accomplished so much with so little.

Value Creation

HuggingFace’s core product is an easy-to-use NLP modeling library. The library, Transformers, is both free and ridiculously easy to use. With as few as three lines of code, you could be using cutting-edge NLP models like BERT or GPT-2 to generate text, answer questions, summarize larger bodies of text, or perform any number of other standard NLP tasks. The library is fully compatible with popular deep learning frameworks like PyTorch and TensorFlow. Furthermore, the library also provides simple hooks for custom training or fine-tuning of existing models.
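To make the "three lines of code" claim concrete, here is a minimal sketch using the library's `pipeline` helper. It assumes the `transformers` package and a backend such as PyTorch are installed, and that the machine can download a default pretrained model on first use. The function names `build_classifier` and `format_predictions` are illustrative helpers of my own, not part of the library.

```python
def build_classifier():
    """Create a sentiment-analysis pipeline backed by a default pretrained model.

    The import is done lazily so this module can be loaded even when the
    transformers package is not installed (an assumption made for this sketch).
    """
    from transformers import pipeline  # requires `pip install transformers`
    return pipeline("sentiment-analysis")


def format_predictions(texts, predictions):
    """Pair each input text with its predicted label and score.

    `predictions` is expected to be a list of dicts with "label" and "score"
    keys, which is the shape a text-classification pipeline returns.
    """
    return [
        f"{text!r}: {pred['label']} ({pred['score']:.2f})"
        for text, pred in zip(texts, predictions)
    ]
```

In use, this would look something like `clf = build_classifier()` followed by `print(format_predictions(texts, clf(texts)))` for a list of strings `texts`. The exact labels and scores depend on which default model the library selects, so none are shown here.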

[Figure: The conceptual architecture behind Transformer models.] To learn more about Transformer architectures, see this amazing blog post.

Amazingly, HuggingFace does not charge for their core product; instead, they open-source the library and give it away for free. The models and core library are all available on GitHub under the Apache License 2.0, a highly permissive license that allows others to build on and capture value from their work with few conditions beyond preserving the license and attribution notices. The company is active in responding to technical issues encountered by its users, and generally seems to have a goal of promoting as much adoption of their models as possible.

The core value of HuggingFace comes from distilling the work of the broader research community and making it accessible via thoughtful tool design. For the most part, HuggingFace does not research its own models, but rather builds on the research of others. Importantly, the research community has a norm of sharing the product of research as open-source code as well, which enables HuggingFace to do this at extremely low cost. HuggingFace spends a lot of effort on the software design that makes their models accessible to others; the heavy focus on UX is a big reason for their popularity in the research community.

Beyond their core products, HuggingFace is deeply embedded within the NLP research community, and uses that position to create additional value. HuggingFace organizes a large community of users who share the company’s norms around openness. They collaborate with universities and larger companies on research papers. They’ve coordinated with large MLOps infrastructure providers to ensure their service is available on the main cloud computing services (e.g. AWS SageMaker). One PhD researcher I’ve spoken with went as far as to say “I don’t really know how I’d do [big-model] NLP research without HuggingFace”.

Value Capture

Remarkably, the company has been cash-flow positive for over a year. The company does this by providing consulting and infrastructural services to aid in the use and application of their product. In particular, the company’s specialty in operating large language models enables them to help other companies run these models efficiently at scale. Demand for this type of service has exploded this past year, driven by a sharp rise in demand for document classification across many organizations.

HuggingFace is effectively pioneering a new business model, pushing the business models of AI away from capturing value from models directly, and towards capturing value from the complementary products and processes necessary for deploying them. The company is betting that machine learning will be as important in the future as software engineering is today (source).

Is this a sustainable business model? It’s hard to argue with the results. The core reason they are profitable is that they have extremely low costs relative to the value that they are creating. The company successfully raised a Series B round early last year to grow the size of their team, resisting acquisition interest from the big tech companies. It seems fairly clear, though, that they’re leaving tremendous value to be captured by others, especially those providing the technical infrastructure necessary for AI services. However, their openness does seem to generate a lot of benefit for our society. For that reason, HuggingFace deserves a big hug.

Student comments on Hugging Face: Embracing Natural Language Processing

  1. Great post, Daniel! I was definitely tricked by the emoji on the front page. They are a totally different company than what I first expected 🙂 I love the business model and focus on collaboration and access to new research.

  2. Very interesting, thanks for sharing Daniel. I’m surprised they have built such an impressive technical tool, yet are offering it as an open-source platform for others and instead making money through consulting services. Do you have any insight into why they took this approach (perhaps it relates to the competitive dynamics of this kind of tool)? I’m also curious to learn more about other use cases for the technology – could this be used by companies for AI chatbots (e.g., customer service chats)?

  3. Thank you for this post, Daniel. I’m very impressed by the atypical business model they currently use and by the benefit that openness provides. I’m very curious to see if they’ll be able to maintain it in the long run and if they’ll resist possible acquisitions.

  4. Thank you so much for sharing this interesting post, Daniel! There are a few cases of companies in the machine learning industry that are open-sourcing, and it is so nice to see that they are actually making money. For many companies, the algorithms and code typically are not proprietary, but they still protect them as a “secret sauce”. Hugging Face took another way and succeeded, and they even acquired another chatbot service, Sam.ai? I would very much like to see what their next approach will be: will they end up profiting from selling ads, which is how other big AI labs are actually financed?

  5. What an intriguing post and business model! Thanks for posting about this Daniel! I wonder if through their consulting services, they will be able to determine unmet market demand and begin developing tools in that area. Over time, they can begin developing more services and vertically integrating the workflow to become a dominant player in the field!
