Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs

By Arash Ahmadian · Chris Cremer · Sara Hooker

Preprint

February 22, 2024

Journal

arXiv (Cornell University)

Topics

#computer-science #machine-learning #artificial-intelligence #psychology #history

DOI

10.48550/arxiv.2402.14740
