Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs
By Arash Ahmadian · Chris Cremer · 4 more · Sara Hooker
Preprint
February 22, 2024
Journal
arXiv (Cornell University)
Topics
#computer-science
#machine-learning
#artificial-intelligence
#psychology
#history
DOI
10.48550/arxiv.2402.14740