Venture capital plays an instrumental role in the modern American economy. Companies such as Google, Apple, Facebook, Instagram, PayPal, Tesla, SpaceX, Airbnb, FedEx, and Intel all received venture funding on their way to becoming the iconic companies they are today. As of 2015, venture-backed companies made up 17% of U.S. public companies, accounted for 44% of R&D spending, and employed 11% of U.S. citizens [3]. Despite the impact venture capital has on the United States, the process behind venture investment decisions remains manual, subjective, and unsystematic.
This project explores whether venture capital can benefit from machine learning, particularly deep learning, to make investment decisions more systematically. Specifically, we aim to predict the valuation step-up multiple in the subsequent financing round of venture-backed U.S. companies. The primary dataset for our model comes from Pitchbook, a popular commercial database of company and funding information. We built a regression model that uses a fully-connected ten-layer neural network to encode features and predict the valuation step-up multiple.
Keywords: fully-connected neural network, venture capital, regression, company valuation prediction
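As a minimal illustration of the prediction target, the sketch below computes a step-up multiple under the common definition: the next round's pre-money valuation divided by the prior round's post-money valuation. This definition is an assumption and may differ from the exact field used in the Pitchbook data. The function name and the dollar figures are hypothetical.

```python
def step_up_multiple(prev_post_money: float, next_pre_money: float) -> float:
    """Return next-round pre-money valuation divided by prior-round post-money valuation.

    Assumed definition for illustration; not necessarily the exact target
    variable used in the model described above.
    """
    if prev_post_money <= 0:
        raise ValueError("prior post-money valuation must be positive")
    return next_pre_money / prev_post_money


# Hypothetical valuations in millions of USD, for illustration only.
series_a_post_money = 20.0
series_b_pre_money = 50.0

multiple = step_up_multiple(series_a_post_money, series_b_pre_money)
print(multiple)  # 2.5
```

A multiple above 1 means the company raised its next round at a higher valuation than it closed the previous one; a multiple below 1 corresponds to a down round.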
Some previous work has attempted to predict the success of a startup based on information about the company and its founders. One project, the Holy Grail of Venture Capital [4], was conducted by a team from the University of California, Berkeley with both technical and VC experience. The team used founder traits, including fear of failure, persistence, persuasion, reliability, competitiveness, network strength, and trust, to determine the likelihood of success. Each trait was scored on a scale from 1 to 5. Using data from Crunchbase and a founder survey, the team arrived at eight features and one target output after several cycles of pre-processing. When building their model, they compared logistic regression, SVM, perceptron, Naive Bayes, XGBoost, and Random Forest. Although they compared many algorithms, they did not explore deep learning techniques.
Another prior work, by Will Gornall and Ilya Strebulaev, built a valuation model of venture capital-backed companies with multiple rounds of financing [3]. They built the model using data from Pitchbook and Genesis and limited their scope to U.S. companies from 2004 onwards, analyzing about 19,000 companies across 37,000 financing rounds. The paper explores regression models to determine how current value, value change, and prior contractual terms affect the terms of a new round. Their results indicated that the post-money valuations of companies are overestimated, while remaining in line with the prices reported by the VC industry’s financial intermediaries.