Graph Machine Learning for Asset Pricing: Traversing the Supply Chain

Abstract
We develop a nonparametric method to aggregate firm characteristics across a large supply chain network to explain cross-sectional expected returns. Each firm receives a pricing signal, nonlinearly constructed from the characteristics of neighboring firms within $d$ hops on the network. We find that $d = 3$, which encompasses network effects up to the third order, balances bias reduction from higher-order relations against variance from added complexity.
Our model yields a portfolio sorted on ML-estimated firm-level returns that condition on both historical supply chain data and firm characteristics. The portfolio achieves an out-of-sample Sharpe ratio gain of over 16% relative to direct-link models and outperforms the Fama–French five-factor and PCA benchmarks. The ML-managed portfolio also improves mean-variance efficiency, as measured by the Sharpe ratio. Lastly, we show that the estimated conditional mean returns of more central firms are 55% more sensitive to missing supply chain links than those of peripheral firms in the supply chain graph.
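As a rough illustration of the portfolio step described in the abstract, the sketch below sorts firms each period on a predicted return, forms an equal-weighted long-short portfolio from the extreme quantiles, and computes its Sharpe ratio. The function names, the equal weighting, the 10% quantile cutoff, and the monthly annualization factor are all assumptions made for illustration; the abstract does not specify these details.

```python
import numpy as np

def long_short_returns(predicted, realized, quantile=0.1):
    """Equal-weighted long-short returns from a cross-sectional sort.

    predicted, realized: arrays of shape (T, N) holding, per period,
    each firm's predicted return and its subsequently realized return.
    The quantile cutoff is an illustrative assumption.
    """
    T, N = predicted.shape
    k = max(1, int(round(N * quantile)))       # firms per leg
    order = np.argsort(predicted, axis=1)      # ascending by prediction
    rows = np.arange(T)[:, None]
    short_leg = realized[rows, order[:, :k]].mean(axis=1)
    long_leg = realized[rows, order[:, -k:]].mean(axis=1)
    return long_leg - short_leg

def sharpe_ratio(returns, periods_per_year=12):
    """Annualized Sharpe ratio of a series of excess returns."""
    return np.sqrt(periods_per_year) * returns.mean() / returns.std(ddof=1)
```

A typical call would pass out-of-sample predictions and the matching realized returns to `long_short_returns`, then pass the resulting series to `sharpe_ratio`.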
Key Contributions
- A graph neural network (GNN) that aggregates firm characteristics up to $d$ hops along supplier–customer edges, with data-driven depth selection
- Formal characterization of the bias–variance tradeoff in higher-order neighborhood aggregation for return prediction
- An empirical asset pricing study showing that supply chain topology contains return-predictive information invisible to linear factor models
- A centrality-based guide for supply chain data quality investment
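To make the idea of aggregating characteristics up to $d$ hops concrete, here is a minimal sketch that averages the characteristics of the firms first reached at each hop $k = 1, \dots, d$ and concatenates those averages with the firm's own characteristics. It is a deliberately simplified stand-in: it uses a fixed mean in place of a learned nonlinear aggregator, treats supplier–customer links as undirected, and the name `hop_features` is hypothetical. None of these choices is taken from the paper.

```python
import numpy as np

def hop_features(adj, X, d=3):
    """Concatenate own characteristics with per-hop neighbor averages.

    adj: (N, N) array, adj[i, j] != 0 if firm i is linked to firm j.
    X:   (N, p) array of firm characteristics.
    Returns an (N, p * (d + 1)) array. Firms with no neighbors at a
    given hop get zeros for that hop (an illustrative convention).
    """
    n = adj.shape[0]
    # Assumption: ignore edge direction and self-loops.
    A = ((adj + adj.T) > 0).astype(int)
    np.fill_diagonal(A, 0)

    visited = np.eye(n, dtype=bool)    # each firm has visited itself
    frontier = np.eye(n, dtype=int)    # firms reached at the previous hop
    blocks = [X]
    for _ in range(d):
        reach = (frontier @ A) > 0     # one step beyond the frontier
        new = reach & ~visited         # firms first reached at this hop
        visited |= new
        frontier = new.astype(int)
        counts = new.sum(axis=1, keepdims=True)
        blocks.append((new.astype(float) @ X) / np.maximum(counts, 1))
    return np.hstack(blocks)
```

Setting `d=1` keeps only direct links, while larger `d` adds higher-order neighborhoods at the cost of more features per firm, which is the tradeoff the depth selection is meant to manage.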
Presentations
UT Austin, UNC, Baruch; SoFiE, NFA, EFA, Inaugural Finance Research Revolution Conference (Vitznau, Switzerland), INFORMS, Luohan Academy
