Building a data-driven sourcing strategy

This article originally appeared in Forbes.

Venture capital (VC) deal-sourcing today is tough, and a new approach is needed. Crunchbase reported a broad slowdown in both funding volume and number of deals announced in Q3 2024, continuing a trend that has been playing out since VC mania reached its peak in 2021.

Since then, the macroeconomic environment has turned. Even as fears of a recession fade, investors have become far more conservative (shifting their focus from hype and growth-at-all-costs to companies with strong fundamentals and durable long-term growth), and the average time for due diligence has risen as investors reset their portfolios with investments that meet much stricter criteria.

This trend has not yet bottomed out from a deal count perspective. Looking at dollars invested, the picture seems more optimistic. This is in large part thanks to breakthroughs and funding in AI startups (as Crunchbase notes, the one subsector of tech bucking the trend).

Many AI companies—especially those working on designing and training their own models—are highly capital-intensive. We’re seeing the “power law” at play, where the biggest AI players drive a huge percentage of investment in the space. Two examples demonstrate this: xAI’s $6 billion round in Q2 and OpenAI’s record-breaking $6.6 billion round at the start of Q4.

These were outlier events. We’re not going to see $6 billion-plus rounds every quarter, and in fact, I wouldn’t be surprised if total funding returned to a downward trend from Q4 onward. An analysis of our own platform’s data is consistent with this downward trend.

Deal sourcing: A data challenge

Many firms have funds to deploy and are challenged to find the right deals. They must first “see” deals, then pick those most relevant to their thesis, and ultimately win (“get into”) them.

The problem right now is that too much capital is chasing too few good deals. The number of funds increased by 76% from 2015 to 2023. And while more companies are being founded than ever before (the number of new startup filings hit a record high this year), not all of them are quality companies.

With more competition and more potential investments to sift through and track, it’s harder than ever to see the right companies early enough to get into them before other investors do.

At its heart, this sourcing challenge is a data challenge. There are only so many things (companies, signals, to-dos) a human can hold in their head at any given point. This leads to regret and missed opportunities, especially today.

I’ve heard many investors say, “X company fits our thesis perfectly and just got funded by Sequoia—how did we not even know about it?” or “We’ve seen and met Y company before but didn’t stay on top of them, and now they’ve been funded by A16z—how did we miss that?”

The answer is to get and stay ahead with a data-driven sourcing strategy.

Data-driven sourcing

Top firms are using data to find deals before their competitors do. They look at “alternative” signals, specifically those that correlate with growth, such as head count, web traffic and app downloads. By monitoring changes in these signals at high-potential startups, firms can flag when major announcements or inflections happen and it’s time for a dealmaker to reach out.
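To make the idea of flagging an inflection concrete, here is a minimal sketch in Python. It is illustrative only and not a description of any firm's actual system: the `SignalSeries` type, the `flag_inflections` function, the company names and the 25% threshold are all assumptions chosen for the example. It compares the two most recent readings of each signal and surfaces companies whose period-over-period growth crosses the threshold.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class SignalSeries:
    """One growth signal for one company (hypothetical structure)."""
    company: str
    signal: str            # e.g. "headcount", "web_traffic", "app_downloads"
    values: list[float]    # readings ordered oldest to newest, one per period


def growth_rate(previous: float, current: float) -> float:
    """Period-over-period growth; returns 0.0 when there is no usable baseline."""
    if previous <= 0:
        return 0.0
    return (current - previous) / previous


def flag_inflections(series_list: list[SignalSeries],
                     threshold: float = 0.25) -> list[tuple[str, str, float]]:
    """Return (company, signal, growth) for signals whose latest growth
    meets or exceeds the threshold, strongest first.

    The 25% default is an arbitrary illustrative cutoff, not a recommendation.
    """
    flagged = []
    for s in series_list:
        if len(s.values) < 2:
            continue  # need at least two readings to measure change
        rate = growth_rate(s.values[-2], s.values[-1])
        if rate >= threshold:
            flagged.append((s.company, s.signal, rate))
    flagged.sort(key=lambda item: item[2], reverse=True)
    return flagged


if __name__ == "__main__":
    # Made-up data for illustration only.
    watchlist = [
        SignalSeries("Acme", "headcount", [40, 42, 60]),
        SignalSeries("Beta", "web_traffic", [1000, 1050]),
    ]
    for company, signal, rate in flag_inflections(watchlist):
        print(f"{company}: {signal} grew {rate:.0%} in the latest period")
```

In practice a firm would likely smooth noisy signals and combine several of them before alerting a dealmaker. The point of the sketch is only that the flagging step itself can be a simple, auditable rule.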

Firms may also monitor the portfolio companies of other investors they respect (especially those that invest one stage earlier than them) so they’re ready to reach out through their network and access great deals as early as possible.

This type of monitoring is already becoming the new norm. My warning to investors today is that if you don’t start tapping into these kinds of signals, you risk falling behind.

Building a tech stack for data-driven sourcing

The reality is that the biggest firms in VC are staying ahead by investing a lot—some upward of $1 million annually—into in-house sourcing platforms. By aggregating trillions of data signals, they are more likely to see every opportunity first. The cost? It involves hiring in-house data engineering teams, buying and/or scraping unique datasets and building custom models and UI on top of that data after reconciling it into one place.

This isn’t possible for most other firms, or even recommended. I would strictly reserve building for things that you genuinely think can generate a unique edge, and buy the rest. Buying is far more pragmatic, and you get a dedicated product that will improve far faster than an internal build, especially when you don’t have the resources or fund size for a huge investment.

When you do decide to build, it’s important to be very specific about what truly generates a unique edge. Many firms believe sourcing fits within this camp. I’d argue that sourcing has many parts: some generate alpha, and others do not.

In the first category are things like scraping or building a proprietary dataset that truly no other firm has access to. In the second are already-solved problems, such as rebuilding an entity resolution engine or building a custom UI on top of your sourcing data. Rebuilding these from scratch doesn’t give you an edge.

Stick to this framework to build a data-driven deal-sourcing strategy that maximizes the efficiency of your resources.

A rebound in 2025?

Our annual survey of hundreds of dealmakers reveals that a large majority expect to do more deals in 2025. In a sense, it’s self-fulfilling—if respondents say they expect to do more deals, they will. This is also supported by the direction of U.S. monetary policy, with the Fed cutting interest rates in September and November and more expected to come.

While there may not be a huge rebound, especially given the continuing constraints of finding high-quality deals, there is certainly reason for optimism as we look to the year ahead.

author
Ray Zhou
Co-Founder