Agent - Planning / Test time scaling
- Notes: Agents planning
In this session, our readings cover:
Required Readings: PLANNING & ORCHESTRATION
Core Component: Agent Planning Module - Goal Decomposition and Strategy Formation
How agents break down complex tasks, form plans, and orchestrate multi-step workflows, leveraging world models when available.
Key Concepts: Task decomposition, planning algorithms (with/without world models), agent workflows, domain-specific planning strategies, plan-then-act vs. continuous replanning
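To make the contrast between plan-then-act and continuous replanning concrete, here is a minimal sketch. Everything in it is an illustrative assumption rather than something taken from the readings: the subtask names, the `make_plan` helper, and the flaky executor are hypothetical, and a topological sort stands in for whatever planner a real agent would use.

```python
from graphlib import TopologicalSorter

# Hypothetical decomposition of one goal into subtasks.
# Each key maps to the set of subtasks that must finish first.
SUBTASKS = {
    "collect_data": set(),
    "clean_data": {"collect_data"},
    "analyze": {"clean_data"},
    "draft_report": {"analyze"},
    "publish": {"draft_report"},
}

def make_plan(deps, done=()):
    """Order the unfinished subtasks so every prerequisite comes first."""
    finished = set(done)
    remaining = {
        task: pre - finished
        for task, pre in deps.items()
        if task not in finished
    }
    return list(TopologicalSorter(remaining).static_order())

def plan_then_act(deps, execute):
    """Plan once, then follow the plan; stop at the first failed step."""
    done = []
    for step in make_plan(deps):
        if not execute(step):
            break
        done.append(step)
    return done

def act_with_replanning(deps, execute, max_attempts=10):
    """Re-plan from the current progress before every step."""
    done = []
    for _ in range(max_attempts):
        plan = make_plan(deps, done)
        if not plan:
            break  # nothing left to do
        if execute(plan[0]):
            done.append(plan[0])
        # On failure the next iteration simply plans again.
    return done

def make_flaky_executor(fail_once):
    """Toy executor: each step named in fail_once fails on its first try."""
    already_failed = set()
    def execute(step):
        if step in fail_once and step not in already_failed:
            already_failed.add(step)
            return False
        return True
    return execute

one_shot = plan_then_act(SUBTASKS, make_flaky_executor({"clean_data"}))
adaptive = act_with_replanning(SUBTASKS, make_flaky_executor({"clean_data"}))
print(one_shot)  # stops after the first failure
print(adaptive)  # recovers and finishes every subtask
```

In this toy setup the one-shot agent stops after `collect_data`, while the replanning agent retries and completes all five subtasks. The cost of that robustness is an extra planning call per step.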
| Topic | Slide Deck | Previous Semester |
|---|---|---|
| Agent - Planning / World Model | W10.1-Team 3-Planning | 25course |
| Test time scaling | Week14.1-T5-Test-Time-Scaling | 25course |
| Platform - Prompt Engineering Tools / Compression | W5.1.Team5-Prompt | 25course |
| Prompt Engineering | W11-team-2-prompt-engineering-2 | 24course |
| LLM Alignment - PPO | W11.2-team6-PPO | 25course |
| LLM Post-training | W14.3.DPO | 25course |
| Scaling Law and Efficiency | W11-ScalinglawEfficientLLM | 24course |
| LLM Fine Tuning | W14-LLM-FineTuning | 24course |
2025 HIGH-IMPACT PAPERS on this topic
a. EnCompass: Separating Search from Agent Workflows (December 2025)
- arXiv: https://arxiv.org/abs/2512.03571
- Press: https://techxplore.com/news/2025-12-ai-agents-results-large-language.html
Key Innovation: Separates search strategy from workflow code
- Performance: 15-40% accuracy boost on code repository translation
- Search strategies: Backtracking, parallel exploration, beam search (best: two-level beam search)
Use Cases: Code translation, digital grid transformation rules
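The idea of keeping the search strategy apart from the workflow code can be sketched as follows. This is not the EnCompass API: `beam_search`, `expand`, `score`, and the `SCORES` table are hypothetical names invented for illustration, and the hard-coded scores stand in for whatever evaluator a real agent would call on partial outputs.

```python
from typing import Callable, List, TypeVar

S = TypeVar("S")

def beam_search(
    initial: S,
    expand: Callable[[S], List[S]],
    score: Callable[[S], float],
    steps: int,
    beam_width: int,
) -> S:
    """Generic search strategy: knows nothing about the workflow itself."""
    beam = [initial]
    for _ in range(steps):
        candidates = [child for state in beam for child in expand(state)]
        if not candidates:
            break
        # Keep only the beam_width best partial results.
        beam = sorted(candidates, key=score, reverse=True)[:beam_width]
    return max(beam, key=score)

# --- Workflow side (assumed toy example) ---------------------------------
# Each step appends one of two candidate outputs, "a" or "b". In a real
# agent these would be alternative LLM samples for the same workflow step.
SCORES = {"a": 2, "b": 1, "aa": 3, "ab": 3, "ba": 5, "bb": 2}

def expand(state: str) -> List[str]:
    return [state + choice for choice in "ab"]

def score(state: str) -> float:
    return SCORES[state]

greedy = beam_search("", expand, score, steps=2, beam_width=1)
wider = beam_search("", expand, score, steps=2, beam_width=2)
print(greedy, score(greedy))
print(wider, score(wider))
```

With a beam width of 1 the search commits to the locally best first step and ends at a score of 3; with a width of 2 it keeps the weaker-looking first step alive and reaches `"ba"` with a score of 5. The workflow functions are untouched in both runs; only the strategy arguments change.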
b. Model-First Reasoning LLM Agents: Reducing Hallucinations through Explicit Problem Modeling (December 2025)
- Link: https://arxiv.org/abs/2512.14474
Two-Phase Paradigm:
- Modeling Phase: LLM constructs explicit model (entities, state variables, actions, constraints)
- Solution Phase: Generate plan based on explicit model
- Reduces constraint violations across medical scheduling, route planning, resource allocation, logic puzzles
- Outperforms Chain-of-Thought and ReAct
- Critical finding: Many planning failures stem from representational deficiencies, not reasoning limitations
Domains Tested: Medical scheduling, route planning, resource allocation, logic puzzles, procedural synthesis
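The two phases can be sketched on a toy scheduling problem. The code below is an assumption-laden illustration, not the paper's method: in the paper an LLM performs both phases, whereas here the explicit model is written by hand and a simple backtracking search stands in for plan generation. `ProblemModel`, `solve`, and the patient names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

State = Dict[str, int]    # state variables: patient -> assigned slot
Action = Tuple[str, int]  # action: (patient, slot)

@dataclass
class ProblemModel:
    """Modeling phase output: entities, actions, and constraints made explicit."""
    entities: List[str]
    slots: List[int]
    constraints: List[Callable[[State], bool]]

    def actions(self, state: State) -> List[Action]:
        unassigned = [p for p in self.entities if p not in state]
        if not unassigned:
            return []
        return [(unassigned[0], slot) for slot in self.slots]

    def apply(self, state: State, action: Action) -> State:
        patient, slot = action
        return {**state, patient: slot}

    def is_valid(self, state: State) -> bool:
        return all(check(state) for check in self.constraints)

    def is_goal(self, state: State) -> bool:
        return len(state) == len(self.entities)

def no_double_booking(state: State) -> bool:
    return len(set(state.values())) == len(state)

def bob_not_at_nine(state: State) -> bool:
    return state.get("bob") != 9

def solve(model: ProblemModel, state: Optional[State] = None,
          plan: Optional[List[Action]] = None) -> Optional[List[Action]]:
    """Solution phase: build a plan that never violates the explicit model."""
    state = state or {}
    plan = plan or []
    if model.is_goal(state):
        return plan
    for action in model.actions(state):
        next_state = model.apply(state, action)
        if model.is_valid(next_state):
            result = solve(model, next_state, plan + [action])
            if result is not None:
                return result
    return None  # no constraint-satisfying plan exists

model = ProblemModel(
    entities=["alice", "bob"],
    slots=[9, 10],
    constraints=[no_double_booking, bob_not_at_nine],
)
print(solve(model))
```

Because every candidate step is checked against the explicit constraints before it enters the plan, a constraint violation cannot slip into the result. If the model is incomplete (for example, a constraint is missing), the plan will be wrong no matter how good the search is, which mirrors the finding about representational deficiencies.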
