Studying RLHF (2025)
About this seminar
In this seminar, we will read the latest RLHF textbook (published in April 2025) to gain background knowledge on post-training methods, such as reinforcement learning and preference learning, and to identify the challenges in applying reinforcement learning techniques to language models.
- Date and Time: 2Q, bi-weekly on Mondays, 17:15 - 18:55 (7 seminars in total)
- Attendees: Anyone interested in post-training and reinforcement learning for language models
- Textbook
  - Title: “Reinforcement Learning from Human Feedback - A short introduction to RLHF and post-training focused on language models.”
  - URL: https://rlhfbook.com/ (PDF / HTML)
  - Author: Nathan Lambert
- Presentation Time
  - 30-minute presentation
  - 20-minute discussion
  - Two presenters each seminar
  - 100 minutes in total for two presenters
- Zoom available; every seminar is planned to be recorded
Schedule
Date | Content | Presenters | Note |
---|---|---|---|
2025/06/02 (Mon) 13:30 - 15:10 | Seminar 1: Ch. 1-3, Ch. 4-6 | Ota, Ichinose | Different time for the first session only |
2025/06/16 (Mon) 17:15 - 18:55 | Seminar 2: Ch. 7, Ch. 8-10 | Matsushita, Takahashi | |
2025/06/30 (Mon) 17:15 - 18:55 | Seminar 3: Ch. 11, Ch. 12 | Ma, Shimada | |
2025/07/07 (Mon) 17:15 - 18:55 | Seminar 4: Presentations on RLHF-related papers (1) | Ohi, Saito | Not part of bi-weekly schedule |
2025/07/14 (Mon) 17:15 - 18:55 | Seminar 5: Ch. 13-16, Ch. 17-19 | Koike, Onami | |
2025/07/28 (Mon) 17:15 - 18:55 | Seminar 6: Presentations on RLHF-related papers (2) | Mizuki, Katsumata | Not part of bi-weekly schedule |
2025/08/04 (Mon) 17:15 - 18:55 | Seminar 7: Presentations on RLHF-related papers (3) | Oba, (Name) | One additional presenter possible |
Planned Seminars
2025/06/16 (Mon) 17:15 - 18:55 | Ch. 7, Ch. 8-10
- Optimization Tools 1 (P.37-43, total 7 pages) (Presenter: Matsushita)
- Optimization Tools 2 (P.44-55, total 12 pages) (Presenter: Takahashi)
2025/06/30 (Mon) 17:15 - 18:55 | Ch. 11, Ch. 12
- Optimization Tools 3 (P.56-74, total 19 pages) (Presenter: Ma)
- Optimization Tools 4 (P.75-83, total 9 pages) (Presenter: Shimada)
2025/07/07 (Mon) 17:15 - 18:55 | Presentations on RLHF-related papers (1)
- (Presenter: Ohi) Paper to be decided later.
- (Presenter: Saito) Shao et al. “DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models”. arXiv. 2024. Paper (I plan to cover GRPO. The presentation date and order can be adjusted as needed.)
2025/07/14 (Mon) 17:15 - 18:55 | Ch. 13-16, Ch. 17-19
- Advanced (P.84-99, total 16 pages) (Presenter: Koike)
- Open Questions (P.100-111, total 12 pages) (Presenter: Onami)
2025/07/28 (Mon) 17:15 - 18:55 | Presentations on RLHF-related papers (2)
- (Presenter: Mizuki) Chen et al. “Preference Learning Algorithms Do Not Learn Preference Rankings”. NeurIPS 2024. Paper
- (Presenter: Katsumata) “Rule Based Rewards for Language Model Safety”. NeurIPS 2024. Paper (I may switch to a different paper depending on circumstances.)
2025/08/04 (Mon) 17:15 - 18:55 | Presentations on RLHF-related papers (3)
- (Presenter: Oba) Paper to be decided later.
- (Presentation requested by: Name)
Past Seminars
2025/06/02 (Mon) 13:30 - 15:10 | Ch. 1-3, Ch. 4-6
- Introductions (P.5-18, total 14 pages) (Presenter: Ota)
  - [slides], [supplementary slides about RL], [supplementary slides about distributed RL]
- Problem Setup & Context (P.19-36, total 18 pages) (Presenter: Ichinose)
  - [slides]