How to overcome the barriers to data-driven investing


This article originally appeared in Forbes.

The AI hype cycle continues to unfold before us. New tools and models like GPT-4, Claude and Midjourney are having real, profound impacts on the way people work and create, and the world of private capital is no exception.

The possibilities for the industry are vast, but many VCs I speak to today are still in the early stages of their data-driven journey and face significant hurdles. The biggest of these is the need for high-quality data and, often, in-house engineering: the plumbing that has to be in place before data-driven VC initiatives become truly useful.

What is the profile of a data-driven firm, and how can you become one?

Dr. Andre Retterath’s "Data-driven VC Landscape 2023" found that the median data-driven firm has three engineers in a workforce of 43 employees and looks after $540 million in assets under management (AUM) (pg. 20).

Interestingly, the report found an exponential correlation between AUM and engineering investment. Firms often make at least one engineering hire to kick-start their data-driven initiatives, but larger engineering efforts scale closely with AUM. Even so, there are early steps that firms of any size can take to test the waters of data-driven investing.

Determine data needs based on investment thesis and value chain focus

Data-driven VCs should start by answering one fundamental question: "What data available in the world correlates with the signals most relevant to our firm?"

Every firm has an investment thesis and criteria, whether written down explicitly or held implicitly in each partner’s head. It helps to begin by translating that thesis into the specific attributes and signals of the great startups or founders you are looking for.

From there, it’s useful to articulate which stage of the firm’s value chain will benefit most from a data-driven initiative. For example, good deal sourcing datasets require exhaustive coverage. You need as much data as possible on your entire universe of potential startups and founders to ensure nothing gets missed.

In contrast, research and due diligence datasets need to be more targeted and accurate. You already know the company you’re trying to learn more about; now, you want to go deeper into questions such as mapping its market and competitive landscape. And then there’s portfolio value creation, where the most valuable data might be relationship-based, to help with introductions, or talent-based, to help with hiring.

Getting specific on the types of data your firm requires and why will give you an actionable blueprint for the next step—experimenting with existing tools.

Source datasets and experiment with existing tools

With your data needs defined, you can be proactive about sourcing datasets that cover those signals and attributes and experimenting with that data using tools that are widely available.

There are a variety of off-the-shelf data vendors that unlock more signals than many VCs realize—from headcount growth and website traffic to app downloads.
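To make this concrete, here is a minimal Python sketch of how a firm might screen on one such signal, headcount growth. Everything in it is an assumption made for illustration: the `CompanySnapshot` fields, the `shortlist` helper, and the 50% growth threshold are hypothetical and do not come from any particular vendor's data format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CompanySnapshot:
    """Hypothetical record built from a data vendor's export."""
    name: str
    headcount_prior: int  # headcount at the start of the period (assumed field)
    headcount_now: int    # headcount at the end of the period (assumed field)


def headcount_growth(company: CompanySnapshot) -> float:
    """Fractional headcount growth over the period (0.5 means +50%)."""
    if company.headcount_prior <= 0:
        raise ValueError("headcount_prior must be positive")
    change = company.headcount_now - company.headcount_prior
    return change / company.headcount_prior


def shortlist(companies: List[CompanySnapshot], min_growth: float = 0.5) -> List[str]:
    """Names of companies at or above the growth threshold, fastest first."""
    scored = [(headcount_growth(c), c.name) for c in companies]
    scored.sort(reverse=True)
    return [name for growth, name in scored if growth >= min_growth]
```

The same pattern extends to other signals such as website traffic or app downloads: compute a comparable growth figure per company, then rank or filter on it. The threshold is a placeholder that each firm would tune to its own thesis.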

One of our customers focuses on developer platforms and infrastructure software. They told us that they extract data from sources on developer platforms and then operationalize it with their internal tooling to pinpoint projects that stand out as intriguing or deserving of further attention. That approach helps them recognize emerging trends and the factors that contribute to the success of specific products.

The flexibility of building data-driven initiatives means that you can start small and then iterate on what works well.

Consider hiring your first engineer

The biggest step toward data-driven investing requires a leap of faith. It’s committing the upfront investment to start an engineering or data science team that can further build tools personalized to your firm’s investment strategy.

Thankfully, this barrier to entry is rapidly coming down. The proliferation of large language models has made it possible for a much larger number of entry-level engineers and analysts, not just machine learning PhDs, to build useful data-driven VC tools. This means that for venture capital, a new competitive standard is coming.

In the words of Francesco Corea, Director of Data Science at Greycroft Partners: "I think we are approaching the tipping point where the majority [of firms] would need some degree of data-driven culture to become and remain competitive" (pg. 15 of the DDVC report linked above). Firms that don’t adopt data-driven strategies will find they are increasingly falling behind.

Author: Ray Zhou, Co-Founder