Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
By David Raposo · Sam Ritter · 3 more · Alberto Santoro
Preprint
April 2, 2024
Journal
arXiv (Cornell University)
Peer Reviews
Ahan M R
almost 2 years ago
Topics
#computer-science
#machine-learning
#engineering
#artificial-intelligence
#voltage-1
DOI
10.48550/arxiv.2404.02258