honnibal.dev

Back to our roots

2024-07-17 · 17 minute read

For most of Explosion’s life we’ve been a very small company, running off revenues. In 2021 that changed, and we became a slightly less small company running off venture capital. We’ve been unable to make that configuration work, so we’re back to running Explosion as an independent-minded self-sufficient company. We’re going to stay small and not look for any more venture capital. spaCy and Prodigy will continue.

We have an announcement post up on the company blog that lays out what this means for our various projects, but I know people will also want to know more about the how and why. These things are easier to talk about from a personal perspective, so I decided to make a separate post.

I started work on spaCy in 2014. Before I started I had been researching syntactic parsers during my PhD and as a post-doc, and I saw that the Python ecosystem really didn’t have an NLP toolchain suitable for most companies. The largest companies like Google or Microsoft had internal tools, and the Java ecosystem had CoreNLP, but there was a gap in the Python ecosystem that I was uniquely qualified to fill. The system I’d built for my research was the fastest in the world, and I’d written it in Cython. I was disenchanted with academia, so with some support from my family I left academia and got to work. I met Ines in late 2014, and she started working on the project with me shortly after. In 2015 I hit “publish” and moved to Berlin, and in 2016 we founded Explosion together.

My initial commercial plan for spaCy was admittedly vague. I figured that if enough value was flowing through the library commercially, there would be a variety of ways to capture enough of it to make the effort worthwhile. An early idea was dual licensing, modelled after Stanford CoreNLP, but we decided to switch to the MIT license soon after the first release, to make it easier to adopt.

When Ines and I founded Explosion, we took consulting projects to understand what users were doing with the library and what they would need. We settled on annotation tooling as the commercial complement to the free offering. The most common advice for monetising the library would have been a hosted solution. However, very few users were asking for that. The two best things about spaCy are its Python API and its performance (blend of speed and accuracy). All the major cloud providers, and lots of smaller vendors, already had JSON-over-REST NLP APIs, so a spaCy API wouldn’t really be anything special. The second most common advice for monetising spaCy was something like “enterprise support”, but that would really just be consulting by another name. What users wanted was help with NLP, not help with technical details about the library or some sort of SLA.

The job that spaCy signed up to do for people is hard but somewhat humble. It’s positioned as a utility library; a tool for building tools. Core functionality includes segmenting strings of text into sentences and words, providing categories for words and phrases (e.g. noun, verb, digit, proper name, date etc.), recognising and normalising morphology, and training and applying your own custom labels. The processing pipeline is divided into a number of steps, and there’s extensive support for configuring, saving and loading a pipeline to suit your requirements. At the heart of it all are efficient data structures for the Doc, Span and Token objects, implemented in Cython. These data structures let you actually make use of all these properties of the text that the pipeline produced.
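To make the idea of the Doc, Span and Token objects a little more concrete, here is a toy sketch in plain Python. It is not spaCy's actual Cython implementation, and the class and attribute names only echo spaCy's for readability. The assumption it illustrates is that the document owns the per-word arrays, while tokens and spans are lightweight views that index into them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Doc:
    """Owns the per-word data. Illustrative only, not spaCy's real Doc."""
    words: List[str]
    tags: List[str]  # one coarse category per word, e.g. "NUM" or "VERB"

    def __len__(self):
        return len(self.words)

    def __getitem__(self, key):
        if isinstance(key, slice):
            start, stop, _ = key.indices(len(self.words))
            return Span(self, start, stop)
        # range(...)[key] normalises negative indices and raises IndexError
        # when the index is out of bounds.
        return Token(self, range(len(self.words))[key])

    def __iter__(self):
        return (Token(self, i) for i in range(len(self.words)))


@dataclass
class Token:
    """A view of one position in the Doc; it stores no text of its own."""
    doc: Doc
    i: int

    @property
    def text(self):
        return self.doc.words[self.i]

    @property
    def tag(self):
        return self.doc.tags[self.i]


@dataclass
class Span:
    """A view of the half-open range [start, end) of the Doc."""
    doc: Doc
    start: int
    end: int

    @property
    def text(self):
        return " ".join(self.doc.words[self.start:self.end])

    def __iter__(self):
        return (Token(self.doc, i) for i in range(self.start, self.end))


doc = Doc(
    words=["Explosion", "was", "founded", "in", "2016"],
    tags=["PROPN", "AUX", "VERB", "ADP", "NUM"],
)
span = doc[3:5]
# A simple rule over the annotations: collect every numeric token.
numbers = [token.text for token in doc if token.tag == "NUM"]
```

Because the views hold only a reference and an offset, anything a pipeline step writes into the document's arrays is immediately visible through every token and span that points at it.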

Many applications can get everything they need from the text by applying rules over the general linguistic properties spaCy provides. However, many others benefit from at least some custom, problem-specific machine learning as well. The trick to building this sort of thing is to factor out as much business logic as you can.

For instance, if you’re building a system to route customer service questions, and Alex handles complaints and short usage questions, while Bo handles longer usage questions unless they’re from users in Austria, you don’t want to train a system with the labels “Alex” and “Bo”. Just train a model to identify the complaints and usage questions, and do the rest with rules. When doing this, it’s important to recognise that the categories “complaint” and “usage question” are not natural kinds in the language. There’s no general definition of them — it will be up to you to define what they mean in the context of your specific data and use-case. Thanks to deep learning, if you can come up with consistent definitions, you really don’t need many examples to produce an accurate classifier. But you do need some, and it will be up to you to sort that out — either doing the work in-house, or engaging some external provider.
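The routing example can be sketched roughly as follows. Everything here is hypothetical: `classify` is a keyword stub standing in for a trained text classifier, the 30-word cutoff for a "short" question is invented, and the paragraph doesn't say who handles long usage questions from Austria, so the sketch assumes they fall back to Alex.

```python
from dataclasses import dataclass

# Hypothetical business parameters; none of these values come from a real system.
SHORT_QUESTION_MAX_WORDS = 30
AUSTRIA = "AT"
AUSTRIA_FALLBACK = "Alex"  # assumption: the text does not say who takes these


@dataclass
class Ticket:
    text: str
    country: str  # ISO country code of the user


def classify(text: str) -> str:
    """Stand-in for a trained model that predicts 'complaint' or 'usage_question'.

    A real system would call a statistical classifier here; this keyword
    check only exists so the sketch runs on its own.
    """
    lowered = text.lower()
    if any(word in lowered for word in ("refund", "broken", "unacceptable")):
        return "complaint"
    return "usage_question"


def route(ticket: Ticket) -> str:
    """Apply the business rules on top of the model's general-purpose label."""
    label = classify(ticket.text)
    if label == "complaint":
        return "Alex"
    # From here on the ticket is a usage question.
    if len(ticket.text.split()) <= SHORT_QUESTION_MAX_WORDS:
        return "Alex"
    if ticket.country == AUSTRIA:
        return AUSTRIA_FALLBACK
    return "Bo"


long_question = "How do I " + "really " * 40 + "configure this?"
assignments = [
    route(Ticket("I want a refund, this is unacceptable", "DE")),
    route(Ticket("How do I load a pipeline?", "DE")),
    route(Ticket(long_question, "DE")),
    route(Ticket(long_question, AUSTRIA)),
]
```

The model only ever sees the labels "complaint" and "usage_question". If Bo goes on holiday or the Austria rule changes, you edit `route` and leave the training data alone.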

Ines and I saw this need for custom data from users when we started consulting, so we built Prodigy as our first commercial product. Prodigy is a flexible annotation tool that integrates very well with spaCy, although you can also use it separately. A key idea behind Prodigy is to help annotation projects scale down: what’s the least annotation you can do to define the custom behaviours you need, and how can you do it as efficiently as possible?

We built Prodigy as a developer tool, and we sell it as a standalone purchase. You buy a license to it ($390 for a personal license, $2450 for 5 company seats), and can then download and use it. This matches the nature of the product, so it made things simpler both for us and for users. Writing a license server and lockout system to charge subscription fees would have been more work, and it would have imposed more operational complexity for users. We also figured that it would be easier for enterprise users to get approval for a relatively small once-off purchase, instead of a new subscription. This product scope (downloadable tool, flat fee) played to our strengths. It was small enough for Ines and me to build it by ourselves, but few other small teams would have the right mix of skills to copy it, and it seemed unlikely anyone would get funding just to build a tool sold for a flat fee. We knew Prodigy left out a lot of important things, but we figured we could release a follow-up project later to address that.

We released beta versions of Prodigy in mid 2017, and had the product on sale by December. By early 2018 Explosion was still just me and Ines, but we had a big community of spaCy users and Prodigy was making around $40,000 a month. We met Sofie that summer, and she started working with us, first as a freelancer and then full-time. Adriane started contributing to spaCy in early 2019, and joined full-time later that year as well.

By 2019 we also started work on a follow-up product to Prodigy, Prodigy Teams. There was a lot we left out of Prodigy, in the name of simplicity. Because it runs locally, users have to provide their own infrastructure solution to host it, and develop their own workflows for project management tasks such as assigning work to different annotators. We wanted a product that would fill that gap, without giving up the programmability and data privacy that our users valued in Prodigy. Justin and Sebastián joined to work on it, with Explosion sponsoring 20% of Sebastián’s time being spent on his open-source projects (FastAPI and later Typer).

Ines and I were spread thin across the three projects and the internal systems we needed. spaCy is quite an operationally complex project: we train dozens of statistical models that we combine in various ways for the pipelines we release. We have to test and benchmark on a matrix of CPU and GPU hardware; Windows, Linux and macOS operating systems; Python versions; and packaging solutions (pip and conda). We also had to develop some internal systems for analytics, systems for hosted demos, and our various websites. We knew that getting Prodigy Teams built and released on top of the work we were already doing would be a stretch, but we felt we could get there so long as nothing major went wrong.

In 2020 and 2021, major things did indeed go wrong. The pandemic took a toll on all of us, and we ended up in a pattern of running just to stand still. By the end of summer in Berlin, Ines and I decided to go to Australia for at least the winter. Australia had been able to handle the pandemic much better than Europe, and I wanted to be close to my family. We ended up setting up shop in Melbourne.

As 2020 came and went, we had to face the fact that progress on Prodigy Teams had stalled. If we wanted to build this thing and do what we wanted to do with spaCy, we would need investment. We’d built up a backlog of investor interest over the years, and the pandemic had made US investors much more open to international deals, as everyone got used to doing business over Zoom. We chose to work with SignalFire, a Silicon Valley firm who really understood our stack and open-source roots. They offered us a variety of deal configurations, and we sold a small stake for $6m. They put in an additional $3m in 2022.

Going into the round, we decided we wanted to raise an amount of money that was a good fit for our plans, rather than just trying to raise as much as investors would give us. More money sounds strictly better in theory, but you generally can’t raise money without making a good argument for how the capital will make the business more valuable. If you want to raise a lot of money, you need plans that justify it. We wanted to make sure we were getting investment to make our plans happen, not making plans to get investment to happen.

The terms we ended up with were pretty unusual, because the company was in an unusual position. It was clear that NLP was an explosive trend, and Explosion had built itself an amazing position within the space with almost nothing. We’d out-engineered whole teams with just a couple of people, and we were selling software to hundreds of companies, including a good chunk of the Fortune 100. What would we be able to do with more resources available?

The answer, unfortunately, was disappointing. There’s this saying “hindsight is 20/20”, but like a lot of sayings, if you think about it, it’s really not true at all. For most difficult decisions you’ll never know what would have happened if you had done things differently. I don’t have any simple answer to the question “what went wrong”, but there were a few different things that felt like they were making things harder.

The thing that feels most upstream of the other problems was that we had these different projects that were all underway to some extent, and we suddenly had to hand them off to other people and get a team working on them. The multiple projects divided Ines’ and my attention, and made us a larger company than we’d have wanted to be for any one project.

The technical hand-off issue was especially difficult for spaCy. spaCy represents one person’s take on how a suite of NLP tools should be designed to come together in a practical whole. There are many design decisions that mean you basically have to do novel research to develop new components: you can’t just review the literature, pick a paper, and adapt its implementation. If you do that, you find that the method you’re trying to implement doesn’t work on whole documents, or it requires a whole GPU to function, or it requires some other process to run over a text collection before you can start processing your text. This process of synthesising research to extend spaCy in line with the existing direction was very difficult to hand over to a team.

Engineering for spaCy and our other projects was also very challenging to hand over. spaCy is implemented in Cython, and big chunks of the project are essentially C code with funny syntax. We pass around pointers to arrays of structs, and if you access them out of bounds, well, hopefully it crashes. You have to just not do that. And then in addition to this manually memory-managed code, there are all the GPU-specific considerations, all the numpy minutiae, and maintaining compatibility with a big matrix of Python versions, operating systems and hardware. It’s a lot.

By the time we took the investment, Sofie and Adriane had been working on spaCy for years. They knew everything and could do almost all of it. But there was never any organised hand-off where we sat down and decided how things should work with me less involved. The whole company was only 4-6 people up to that point, and the roots of all the work came back to decisions I’d made alone years ago — so I was in the middle of everything. We really didn’t have a model of a distributed decision-making process. Instead, we just sort of added more people, and hoped that they’d get up to speed and be productive contributors. As I became less involved in the hands-on work, I struggled to be effective as a decision-maker. A lot of the bigger questions got deferred, and we had an increasing bias towards whichever approach was least committal.

spaCy and its associated libraries have been exceptionally well maintained over the last few years, and there have been lots of extensions and additions. But if I compare the output from 2016-2019 to the output from 2021-2024, and I look at the difference in what was spent, I really can’t consider it a good result. Ultimately I take responsibility for not setting up a situation where we were able to get the most out of what everyone was able to give, but the truth is I still don’t know what that situation should have looked like in detail.

Why couldn’t I just stay focussed on spaCy? Well, the project that would make or break the company’s future was Prodigy Teams. Justin and Sebastián left the company with the transition to investment, so Ines and I had to hire a team and get them to work. We felt that we had a lot of progress already, and I wanted to work with the team to get them set up.

We had announced that we were starting work on Prodigy Teams in 2019, so we felt behind on the project before the new team was even in place. This dynamic caused problems.

People join a startup to work on an unreleased product because they want to be part of creating something. But Prodigy Teams had been in flight for years already. We had a lot of customers of Prodigy waiting for it. Many of them have built their own internal systems to operationalise Prodigy, and are really looking forward to having a stable product to replace them. We’d been talking to these users for a long time and we felt like we understood their requirements well. We also felt like we had a good record of hitting the mark with our releases, so we were pretty confident about what we wanted to build.

What ended up happening was pretty chaotic. Ines and I were still in Australia when we got the investment, but of course we wanted to get started right away. An old contact from Australia was also in Melbourne freelancing, so we linked up. Things seemed to be going well, so we made him an offer to come and work with us full-time — and through his network we soon had two other Australian hires. However, we didn’t want to commit to an Australia-only team. Experience building B2B SaaS systems was obviously essential, and this experience is heavily concentrated in the US. When Ines and I went back to Berlin, we had a team split across three timezones.

Of the mistakes we made, I feel like this is one that was especially stupid. We had always been a distributed company, but we always had workdays that matched up pretty well. This changed when Ines and I went to Australia, and we told ourselves that we were making it work just fine — so it shouldn’t be a problem for the team, right?

What should have been obvious to us is that the situations aren’t at all similar. It’s not surprising that we weren’t able to get a new team to gel and work closely together while split across three timezones. I think this is where my lack of experience working in a large, normal software development team set us back. There were a bunch of smaller, contingent fires going on in the team, and it took me too long to accept that the timezone split was one of the biggest problems.

By the time we reorganised as a team just in Europe, we felt even more behind than before. These years of history weighed on the project. Instead of a bunch of decisions the team had come to together, there was all this context stretching back a long way. I wasn’t a good manager and didn’t do a good job of coordinating the team, but this hole was awkward for someone else to fill. Ines and I knew what we were trying to build and we had good reason to be confident in it, so the job description wasn’t exactly “product manager” in the standard sense — at least not at first. We knew that the product would adapt and evolve after release, but we didn’t want to delegate the release roadmap.

It wasn’t until late 2022 that the team finally started to feel really productive, and by then it was too late. We got to the point where we had external testers, but we ran out of money before we could get the product finished. The VC market tanked in 2023, and without the product complete we couldn’t make the case for additional investment. I hold no bitterness about this. I do think we’d have gotten there with some additional investment, but I can easily understand investors’ perspective. It was our job to make the company undeniably successful with the money we had available. That’s the deal we signed up for. If we’d succeeded the investment climate and investor sentiment wouldn’t matter.

We considered selling the company, but we weren’t able to find a good fit. Instead, we’re back at the same sort of size we had before the investment. We’re very grateful to Hugging Face for a $250,000 grant to support our open-source work as our funding ran out, and we’ve applied successfully for a German R&D reimbursement grant that will give us up to €1.5m in unconditional funding.

I’ve been finding the transition back to the way things were quite difficult. I still know our codebases well, but the associated infrastructure isn’t easy to wrangle. Overall I haven’t been very productive over the last few months, but it’s getting better now. I hope the community can be patient with me while I pick everything back up. I don’t want to say too much about our plans for the various projects here, though; it’s better to communicate about that from the Explosion site. You can find that post on the company blog.