Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs
Authors: Arash Ahmadian, Chris Cremer, 4 more, Sara Hooker
Published: February 22, 2024
Journal: arXiv (Cornell University)
Topics: Computer Science, Machine Learning, Artificial Intelligence, Reinforcement Learning, Political Science And International Relations
DOI: 10.48550/arxiv.2402.14740