Preprint
Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs
Authors: Arash Ahmadian, Chris Cremer, 4 more, Sara Hooker
Published: February 22, 2024
Journal: arXiv (Cornell University)
Topics: Computer Science, Machine Learning, Artificial Intelligence, Psychology, History
DOI: 10.48550/arxiv.2402.14740