Selective Multiple Testing: Inference for Large Panels with Many Covariates (Paper) (Code) (Slides) Co-author: Markus Pelger
We propose Panel Multiple Testing, a method for selecting the covariates that explain a large cross-section while controlling false discoveries. In our empirical asset pricing study, we select a sparse set of risk factors from a factor zoo of 114 candidates to explain the excess returns of 243 doubly-sorted portfolios.
NASMES 2023, AMES 2023, INFORMS 2023, 11th Western Conference on Mathematical Finance, NBER-NSF SBIES 2022, California Econometrics Conference 2022, Stanford HAI Financial Services Industry Review
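As a generic illustration of what false discovery control means when selecting among many candidate covariates, the sketch below applies the textbook Benjamini-Hochberg step-up procedure to a list of p-values. This is a stand-in for intuition only and is not the Panel Multiple Testing procedure from the paper; the function name, the target level `q`, and the example p-values are all hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Textbook Benjamini-Hochberg step-up procedure (illustration only).

    Returns a list of booleans marking which hypotheses are selected
    at target false discovery rate q.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= k * q / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank
    # Select every hypothesis ranked at or below the cutoff.
    selected = [False] * m
    for idx in order[:cutoff_rank]:
        selected[idx] = True
    return selected


# Hypothetical p-values for four candidate covariates.
candidate_p_values = [0.001, 0.03, 0.012, 0.30]
selected_covariates = benjamini_hochberg(candidate_p_values, q=0.05)
```

With these made-up inputs, the first three candidates are selected and the fourth is not.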
Large Dimensional Change Point Detection with FWER Control as Automatic Stopping (Paper) (Poster) (Code) Co-authors: Yang Fan, Markus Pelger
With hundreds of time series and an unknown number of change points to detect, our inference-based method is better suited than the classical DP-based algorithm because it explicitly trades off Type I against Type II errors. We provide theory for family-wise error rate (FWER) control. In simulations, we show a 20% lift in F1 scores over leading benchmarks.
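To give a feel for how an FWER threshold can double as a stopping rule, here is a minimal sketch of the textbook Holm step-down procedure, which tests candidates from most to least significant and stops at the first one that fails its threshold. It is a generic example under assumed inputs, not the detection procedure from the paper; the function name, `alpha`, and the p-values are hypothetical.

```python
def holm_rejections(p_values, alpha=0.05):
    """Textbook Holm step-down procedure (illustration only).

    Returns a list of booleans marking which hypotheses are rejected
    while keeping the family-wise error rate at or below alpha.
    """
    m = len(p_values)
    # Visit hypotheses from the smallest p-value to the largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, idx in enumerate(order):
        # The threshold loosens as hypotheses are rejected: alpha / (m - rank).
        if p_values[idx] <= alpha / (m - rank):
            rejected[idx] = True
        else:
            # First failure: stop, and retain all remaining hypotheses.
            break
    return rejected


# Hypothetical p-values for four candidate change points.
example_p_values = [0.001, 0.04, 0.012, 0.30]
decisions = holm_rejections(example_p_values, alpha=0.05)
```

In this toy input the two most significant candidates are kept as detections, and the procedure stops at the third.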
Inference for Large Panel Data with Machine Learning (Paper)
This is my PhD thesis, accessible from Stanford’s archival system.
Asset Pricing with Supply Chain Relationships (Paper) (Code) Co-authors: Agostino Capponi, Jose Sidaoui
We propose a nonparametric method that aggregates rich firm characteristics over a large supply chain network to explain the cross-section of expected returns. Each target firm receives a nonlinearly constructed pricing signal passed from neighboring firms within $d$ hops on the supply chain network. We find that supply chain information is useful for asset pricing: our model achieves over 50% higher out-of-sample Sharpe ratios than models using only direct suppliers and consumers, and it outperforms the Fama-French five-factor and principal component models. Through a graph-Monte Carlo experiment, we demonstrate the interplay between $d$ and degree centrality, showing that the most central firms are twice as sensitive as peripheral firms. Our recommended $d = 6$ balances the bias-variance trade-off and ensures robustness.
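The notion of firms within $d$ hops can be made concrete with a breadth-first search. The sketch below only shows the neighborhood lookup and does not attempt the nonparametric aggregation of characteristics described above. It assumes an undirected network stored as an adjacency dictionary; the function name and the toy network are hypothetical.

```python
from collections import deque


def neighbors_within_d_hops(adjacency, source, d):
    """Return {firm: hop distance} for firms reachable from source in 1..d hops.

    adjacency maps each firm to an iterable of directly linked firms
    (assumed here to list links in both directions).
    """
    distance = {source: 0}
    queue = deque([source])
    while queue:
        firm = queue.popleft()
        # Do not expand beyond the hop limit.
        if distance[firm] == d:
            continue
        for neighbor in adjacency.get(firm, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[firm] + 1
                queue.append(neighbor)
    # The target firm itself is not one of its own neighbors.
    del distance[source]
    return distance


# Hypothetical toy network: chain A - B - C - D with a branch B - E.
toy_network = {
    "A": ["B"],
    "B": ["A", "C", "E"],
    "C": ["B", "D"],
    "D": ["C"],
    "E": ["B"],
}
two_hop = neighbors_within_d_hops(toy_network, "A", 2)
```

For firm A in this toy network, $d = 2$ reaches B, C, and E, while D only enters at $d = 3$.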