Xiaocan (Bruce) Li's Blog
Archives
2025 (1)
February (1)
A-3PO: Accelerating Asynchronous LLM Training with Staleness-aware Proximal Policy Approximation
February 5, 2025 · 8 min · 1575 words · Xiaocan (Bruce) Li