Agent Safety / More Autonomous Agents

Safety Agent

In this session, our readings cover:

Required Readings:

AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents

The Emerged Security and Privacy of LLM Agent: A Survey with Case Studies
Feng He, Tianqing Zhu, Dayong Ye, Bo Liu, Wanlei Zhou, Philip S. Yu (submitted 28 Jul 2024)

Abstract: Inspired by the rapid development of Large Language Models (LLMs), LLM agents have evolved to perform complex tasks. LLM agents are now extensively applied across various domains, handling vast amounts of data to interact with humans and execute tasks. The widespread applications of LLM agents demonstrate their significant commercial value; however, they also expose security and privacy vulnerabilities, so comprehensive research on the security and privacy of LLM agents is urgently needed. This survey aims to provide a comprehensive overview of the newly emerged privacy and security issues faced by LLM agents. We begin by introducing the fundamental knowledge of LLM agents, followed by a categorization and analysis of the threats. We then discuss the impacts of these threats on humans, the environment, and other agents. Subsequently, we review existing defensive strategies, and finally explore future trends. Additionally, the survey incorporates diverse case studies to facilitate a more accessible understanding. By highlighting these critical security and privacy issues, the survey seeks to stimulate future research toward enhancing the security and privacy of LLM agents, thereby increasing their reliability and trustworthiness in future applications.
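To make the survey's mention of defensive strategies concrete, here is a minimal, hypothetical sketch of one such defense: screening an agent's proposed tool calls against a deny-list policy before execution. The function name, patterns, and policy are illustrative assumptions, not an API from any of the papers above; real guardrails use far richer policies and learned classifiers.

```python
import re

# Illustrative deny-list for a tool-call guardrail (assumed patterns,
# not taken from the surveyed papers).
DENIED_PATTERNS = [
    r"\brm\s+-rf\b",      # destructive shell command
    r"\bDROP\s+TABLE\b",  # destructive SQL statement
    r"\bpassword\b",      # naive sensitive-data check
]

def screen_tool_call(tool_name: str, arguments: str) -> bool:
    """Return True if the proposed tool call passes the policy check."""
    payload = f"{tool_name} {arguments}"
    return not any(
        re.search(pattern, payload, re.IGNORECASE)
        for pattern in DENIED_PATTERNS
    )

# A benign search passes; a destructive shell command is blocked.
print(screen_tool_call("search", "LLM agent safety surveys"))  # True
print(screen_tool_call("shell", "rm -rf /tmp/workdir"))        # False
```

Pattern matching like this is only a baseline: it catches obvious misuse but is easy to evade, which is one reason the survey's defensive-strategy discussion goes beyond simple filtering.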

More Readings:

Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models

Unique Security and Privacy Threats of Large Language Model: A Comprehensive Survey

Large Language Model Safety: A Holistic Survey

MobileSafetyBench: Evaluating Safety of Autonomous Agents in Mobile Device Control

Privacy-Preserving Large Language Models: Mechanisms, Applications, and Future Directions