Extra readings - Agent Guardrailing

Safety

This session covers the following readings:

Required Readings:

AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents

SafeRAG: Benchmarking Security in Retrieval-Augmented Generation of Large Language Model

AgentMonitor: A Plug-and-Play Framework for Predictive and Secure Multi-Agent Systems

More Readings:

Agents for Software Development

Identifying the Risks of LM Agents with an LM-Emulated Sandbox

Building Safe GenAI Applications: An End-to-End Overview of Red Teaming for Large Language Models

Safeguarding Large Language Models: A Survey