AI-Powered (Finance) Scholarship

Co-Writing and Collaboration · Domain Specialization in LLMs

This paper describes a process for automatically generating academic finance papers using large language models (LLMs). It demonstrates the process’s efficacy by producing hundreds of complete papers on stock return predictability, a topic particularly well-suited for our illustration. We first mine over 30,000 potential stock return predictor signals from accounting data, and apply the Novy-Marx and Velikov (2024) “Assaying Anomalies” protocol to generate standardized “template reports” for 96 signals that pass the protocol’s rigorous criteria. Each report details a signal’s performance predicting stock returns using a wide array of tests and benchmarks it against more than 200 other known anomalies. Finally, we use state-of-the-art LLMs to generate three distinct complete versions of academic papers for each signal. The different versions include creative names for the signals, contain custom introductions providing different theoretical justifications for the observed predictability patterns, and incorporate citations to existing (and, on occasion, imagined) literature supporting their respective claims.
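The scale of the pipeline described above can be illustrated with a toy sketch. It is not the authors' code. The accounting item names and the ratio-based construction of candidate signals are assumptions made for illustration only; the counts of 96 surviving signals and three paper versions per signal come from the abstract.

```python
from itertools import permutations

# Hypothetical accounting items; the paper's actual variable list is not given here.
ACCOUNTING_ITEMS = ["assets", "sales", "inventory", "debt", "capex"]


def candidate_signals(items):
    """Enumerate ordered ratio signals 'numerator/denominator'.

    Building signals as simple ratios is an assumed construction,
    used only to show how a small variable set yields many candidates.
    """
    return [f"{num}/{den}" for num, den in permutations(items, 2)]


def papers_generated(n_signals_passing, versions_per_signal=3):
    """Number of complete papers when each surviving signal gets several versions."""
    return n_signals_passing * versions_per_signal


signals = candidate_signals(ACCOUNTING_ITEMS)
print(len(signals))          # 5 items give 5 * 4 = 20 ordered ratios
print(papers_generated(96))  # 96 signals x 3 versions = 288 papers
```

Under these assumptions, five items already yield 20 ordered ratios, so a candidate pool above 30,000 is plausible from a modest variable set. The 288 papers implied by 96 signals and three versions each is consistent with the abstract's "hundreds of complete papers".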

Introduction. Consider this scenario: a junior professor submits a paper documenting a novel return predictor, which includes precisely formulated hypotheses and robust empirical evidence. The paper is well written, the analysis appears correct, and the hypotheses accurately predict the patterns observed in the data. Should it matter if an AI system generated these hypotheses after seeing the results? This question cuts to the heart of how we understand scientific discovery and hypothesis formation, and how our views are being tested by the introduction of Large Language Models (LLMs).

In modern academia we face an inherent tension in our treatment of hypothesis formation. We often view post-hoc theorizing with suspicion, labeling it as “HARKing” (Hypothesizing After Results are Known) (Kerr, 1998). The prevailing academic standard insists that researchers should first develop their theories and predictions and then test them against data. Few significant scientific discoveries in history have,

Discussion / Conclusion. Our findings suggest that the introduction of AI into academic research production represents more than just a technological advancement: it has the potential to be a fundamental shift in how we generate and validate knowledge in finance. The ability to automate hypothesis generation challenges us to reconsider what constitutes a meaningful research contribution. The questions posed here have no easy answers, but demand careful consideration as we enter an era where AI becomes an increasingly integral part of the research process. The future of financial research may depend less on our ability to generate hypotheses and more on our capacity to distinguish meaningful insights from statistically significant but theoretically hollow findings.