LLM Post-training

Customization

In this session, our readings cover:

Required PPO readings

Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study

Tulu 3: Pushing Frontiers in Open Language Model Post-Training

More readings: