More - LLM Alignments
- SlideDeck: 2026-SP-W2.1-advanced-llm-alignment.pdf
- Version: current
- Notes: advanced preference alignment
In this session, our readings cover:
Required Readings: ACTION & TOOL USE
Understanding agent tooling frameworks and how agents execute actions through external tools, APIs, and interfaces.
Core Component: Agent-Computer Interface (ACI) - How Agents Interact with Tools and Systems
Key Concepts: Prompt engineering, tool calling, function APIs, agent tooling frameworks, efficient tool use
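
To make the tool calling and function API concepts above concrete, here is a minimal sketch of the execution side of an agent-computer interface: the model emits a JSON tool call, and a dispatcher runs the matching function and returns an observation. Everything here is an assumption for illustration (the `TOOLS` registry, the `tool` decorator, the `execute_tool_call` helper, and the `{"name": ..., "arguments": ...}` call shape); it is not the API of any particular framework covered in the readings.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical registry mapping a tool name to a plain Python function.
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function as a callable tool under its own name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def add(a: float, b: float) -> float:
    """Example tool: add two numbers."""
    return a + b


@tool
def word_count(text: str) -> int:
    """Example tool: count whitespace-separated words."""
    return len(text.split())


def execute_tool_call(raw_call: str) -> str:
    """Parse a JSON tool call, run the tool, and return a JSON observation.

    Errors are returned as observations rather than raised, so the agent
    loop can show them to the model and let it retry.
    """
    try:
        call = json.loads(raw_call)
    except json.JSONDecodeError as exc:
        return json.dumps({"ok": False, "error": f"invalid JSON: {exc}"})

    # The call must be a JSON object with a name and keyword arguments.
    if not isinstance(call, dict):
        return json.dumps({"ok": False, "error": "tool call must be a JSON object"})

    name = call.get("name")
    args = call.get("arguments", {})
    if not isinstance(name, str) or name not in TOOLS:
        return json.dumps({"ok": False, "error": f"unknown tool: {name}"})
    if not isinstance(args, dict):
        return json.dumps({"ok": False, "error": "arguments must be a JSON object"})

    try:
        result = TOOLS[name](**args)
    except Exception as exc:  # surface tool failures (e.g. bad arguments) as observations
        return json.dumps({"ok": False, "error": str(exc)})
    return json.dumps({"ok": True, "result": result})


if __name__ == "__main__":
    # In a real agent, this string would come from the model's output.
    print(execute_tool_call('{"name": "add", "arguments": {"a": 2, "b": 3}}'))
```

In an actual agent loop, the returned observation string would be appended to the conversation so the model can decide on its next action. Returning errors as data instead of raising is one possible design choice, not the only one.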
| Topic | Slide Deck | Previous Semester |
|---|---|---|
| Platform - Prompt Engineering Tools / Compression | W5.1.Team5-Prompt | 25course |
| Platform - Agent Tooling | W6.1-team2-master-ai-agent-book-review | 25course |
| Platform - More Agent Related | W6.2-team2-agent24-full | 25course |
| Prompt Engineering | W11-team-2-prompt-engineering-2 | 24course |
| Bonus Session: KV Cache, Tooling and WMDP | W15-KVcahe-WMDP-Tools | 24course |
2025 HIGH-IMPACT PAPERS on this topic
- c. SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks? (September 2025)
  - arXiv: https://arxiv.org/abs/2509.16941
  - Leaderboard: https://scale.com/leaderboard/swe_bench_pro_public
  - 1,865 problems from 41 actively maintained repositories
  - Enterprise-level complexity: Tasks requiring hours to days for professional engineers
  - Multi-file modifications: Substantial code changes across repositories
  - Three datasets: Public (11 repos), held-out (12 repos), commercial (18 proprietary repos)
  - Contamination-resistant: GPL-licensed and commercial codebases
More Readings:
- d. From LLMs to LLM-based Agents for Software Engineering: A Survey (August 2024, Updated 2025)
  - Link: https://arxiv.org/html/2408.02479v2
  - Six key topics: Requirement engineering, code generation, autonomous decision-making, software design, test generation, software maintenance
