Graph Machine Learning for Asset Pricing: Traversing the Supply Chain

Published:

Paper (SSRN) Code

Abstract

We develop a nonparametric method to aggregate firm characteristics across a large supply chain network to explain cross-sectional expected returns. Each firm receives a pricing signal, nonlinearly constructed from the characteristics of neighboring firms within $d$ hops on the network. We find that $d = 3$, which captures network effects up to the third order, balances the bias reduction from higher-order relations against the variance from added complexity.

Our model yields a portfolio sorted on ML-estimated firm-level returns that condition on both historical supply chain data and firm characteristics. We achieve an out-of-sample Sharpe ratio gain of over 16% versus direct-link models, and outperform the Fama–French five-factor and PCA benchmarks. We find that the ML-managed portfolio improves mean-variance efficiency, as measured by the Sharpe ratio. Lastly, we show that conditional mean return estimates for more central firms are 55% more sensitive to missing supply chain links than those for peripheral firms in the supply chain graph.
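For readers who want to reproduce this kind of comparison on their own return series, the sketch below shows one way to compute an annualized Sharpe ratio and a relative Sharpe gain between two portfolios. It assumes monthly excess returns and reads "Sharpe gain" as a relative improvement over the baseline; the abstract does not state either convention, and the function names are illustrative rather than taken from the paper's code.

```python
import numpy as np

def sharpe_ratio(excess_returns, periods_per_year=12):
    """Annualized Sharpe ratio of a series of per-period excess returns.

    periods_per_year=12 assumes monthly data (an assumption, not from the paper).
    """
    r = np.asarray(excess_returns, dtype=float)
    # Sample standard deviation (ddof=1) in the denominator.
    return np.sqrt(periods_per_year) * r.mean() / r.std(ddof=1)

def relative_sharpe_gain(model_returns, baseline_returns, periods_per_year=12):
    """Relative Sharpe improvement of a model portfolio over a baseline.

    A value of 0.16 would correspond to a 16% gain under this (assumed) definition.
    """
    s_model = sharpe_ratio(model_returns, periods_per_year)
    s_base = sharpe_ratio(baseline_returns, periods_per_year)
    return s_model / s_base - 1.0
```

Because the annualization factor cancels in the ratio, the relative gain does not depend on the sampling frequency as long as both series use the same one.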

Key Contributions

  • A graph attention network, a type of graph neural network (GNN), that aggregates firm characteristics up to $d$ hops along supplier–customer edges, with data-driven depth selection
  • Formal characterization of the bias–variance tradeoff in higher-order neighborhood aggregation for return prediction
  • An empirical asset pricing study showing that supply chain topology contains return-predictive information invisible to linear factor models
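  • A centrality-based guide for prioritizing investment in supply chain data quality, motivated by the greater sensitivity of central firms' return estimates to missing links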
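
To make the idea of aggregating characteristics over $d$ hops concrete, here is a minimal sketch that propagates firm characteristics along supply chain links for up to $d$ steps and stacks the results. It is a deliberately simplified stand-in, not the paper's model: it uses unweighted neighbor means instead of learned attention weights, treats links as undirected, and the name `d_hop_features` is hypothetical.

```python
import numpy as np

def d_hop_features(adj, X, d):
    """Stack firm characteristics propagated over 0..d steps on the network.

    adj : (n, n) 0/1 array; adj[i, j] = 1 if firms i and j share a
          supplier-customer link (undirected here, an assumption).
    X   : (n, k) array of firm characteristics.
    d   : maximum number of propagation steps (hops).
    Returns an (n, k * (d + 1)) array.
    """
    A = np.asarray(adj, dtype=float)
    X = np.asarray(X, dtype=float)
    # Row-normalize so one step is a plain mean over direct neighbors;
    # firms with no links keep an all-zero row.
    deg = A.sum(axis=1, keepdims=True)
    P = np.divide(A, deg, out=np.zeros_like(A), where=deg > 0)
    blocks = [X]          # step 0: the firm's own characteristics
    H = X
    for _ in range(d):
        H = P @ H         # one more step of neighbor averaging
        blocks.append(H)
    return np.hstack(blocks)
```

Each additional step widens the set of firms that can influence a given firm's features, which adds information from more distant relations at the cost of more inputs to fit. That is the tradeoff the depth choice has to balance.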

Presentations

UT Austin, UNC, Baruch; SoFiE, NFA, EFA, Inaugural Finance Research Revolution Conference (Vitznau, Switzerland), INFORMS, Luohan Academy