Leveraging 30,000 stock predictors and AI to generate 288 finance papers with custom hypotheses, highlighting risks of automated research and HARKing (Hypothesizing After Results are Known) in academia, from Robert Novy-Marx and @VelikovMihail https://www.nber.org/papers/w33363
Posts by Mihail Velikov
Agree completely! One point was to show how close you get with simple prompts. We are working on quantifying and fixing the hallucinations, though that will take more work. But even with the current state of agentic AI, a lot of the remaining issues are, I think, fixable, let alone with what comes next.
Thank you, @amanela.bsky.social! We did consider that, but decided against it due to the ethical considerations and the strain it would have put on editors and referees. I'm pretty sure they could be published somewhere, and I was super curious how high up the ladder they would have made it.
Very excited to release a major revision to our paper on algorithmic collusion by large language models.
#EconSky
Researchers used AI to generate 288 complete academic finance papers predicting stock returns, complete with plausible theoretical frameworks & citations. Each paper looks and reads like legitimate research.
They did this to show how easy it now is to mass produce "credible" research. Academia isn't ready.
Thanks for featuring our work, Ethan!
Really cool work that raises questions about how we think about science and progress. The push toward pre-registration has benefits, but forgoing HARKing has many costs too. The optimal balance is not obvious!
Cool results that raise interesting questions about a swath of asset pricing papers. Also really like the assaying framework they use.
In the paper we raise further questions about research integrity and evaluation that reflect the realities of AI-enabled research production and give some initial thoughts on ways to address those.
Key implication: When AI can rapidly produce plausible hypotheses for any empirical finding at unprecedented scale, how do we ensure quality control in academic research?
Another version for the OLM signal invokes production-based asset pricing arguments and cites Cochrane (1992) and Zhang (2005). While the stories are not always flawless, they are remarkably coherent, especially considering the scale at which we can produce them.
For example, one of the signals is the ratio of current assets to EBITDA. The LLM creatively names the signal "Operating Liquidity Margin". One version hypothesizes that OLM predicts returns due to slow diffusion of information and cites Hirshleifer and Teoh's (2003) limited attention model.
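The OLM signal described above is just a ratio of two accounting items. A minimal illustration (the paper's exact construction may differ, e.g., in scaling or winsorization; the numbers below are made up):

```python
def operating_liquidity_margin(current_assets: float, ebitda: float) -> float:
    """OLM = current assets / EBITDA, as described in the thread."""
    if ebitda == 0:
        raise ValueError("EBITDA must be nonzero")
    return current_assets / ebitda

# Illustrative firm: $500M current assets, $200M EBITDA
print(operating_liquidity_margin(500.0, 200.0))  # → 2.5
```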
The papers are remarkably coherent - they include creative names for the signals, contain custom introductions providing different hypotheses for the observed predictability patterns, and incorporate citations to existing (and, on occasion, imagined) literature.
To assess this question, we:
1⃣ Mined 30K+ potential stock return predictors
2⃣ Validated 96 robust signals using our "Assaying Anomalies" protocol
3⃣ Used LLMs to generate three versions of a complete paper, each with a different hypothesis, for each signal
Papers & code are available at:
github.com/velikov-miha...
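The third step above can be sketched as a prompt-template loop. This is a hedged illustration, not the authors' actual pipeline (see the linked repository for that): the prompt wording, the list of hypothesized channels, and the stub in place of a real LLM call are all assumptions.

```python
# Hypothesized economic channels, echoing the ones mentioned in the thread
# (limited attention, production-based asset pricing). The third is assumed.
HYPOTHESIS_CHANNELS = [
    "limited investor attention and slow information diffusion",
    "production-based asset pricing and investment frictions",
    "mispricing driven by limits to arbitrage",
]

def build_prompt(signal_name: str, definition: str, channel: str) -> str:
    # Hypothetical prompt template; the paper's prompts differ.
    return (
        f"Write the introduction of a finance paper showing that "
        f"{signal_name} ({definition}) predicts stock returns. "
        f"Develop a hypothesis based on {channel}, and cite related literature."
    )

def generate_paper_versions(signal_name: str, definition: str) -> list[str]:
    # Stub: in a real pipeline, each prompt would be sent to an LLM API
    # and the completion collected as one version of the paper.
    return [build_prompt(signal_name, definition, ch) for ch in HYPOTHESIS_CHANNELS]

prompts = generate_paper_versions(
    "Operating Liquidity Margin", "current assets / EBITDA"
)
print(len(prompts))  # one prompt per hypothesized channel
```

Run over 96 validated signals, three channels each, this kind of loop yields the 288 paper versions at essentially zero marginal cost, which is the point the thread is making.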
An academic paper has excellent empirical evidence & hypotheses that perfectly match the patterns in the data.
One catch: AI wrote the hypotheses after seeing the results.
Should this matter?
New paper w/ Robert Novy-Marx on AI-Powered (Finance) Scholarship🧵
papers.ssrn.com/sol3/papers....