Platform - VLM Jailbreaking / Probing

Jailbreaking Multimodal Models

In this session, our readings cover:

Required Readings:

garak: A Framework for Security Probing Large Language Models (see the probe-run sketch after this list)

MMJ-Bench: A Comprehensive Study on Jailbreak Attacks and Defenses for Multimodal Large Language Models
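
To give a feel for the kind of probing the garak paper describes, below is a minimal sketch of kicking off a garak scan from Python. It is not the paper's own experimental setup: it assumes garak is installed (`pip install garak`), that an OpenAI API key is set in the environment, and that the `--model_type`, `--model_name`, and `--probes` flags match your installed version (check `python -m garak --help`); the chosen model and probe family are illustrative.

```python
"""Minimal sketch: launching a garak probe run against a hosted model.

Assumptions: garak is pip-installed, OPENAI_API_KEY is exported, and the
CLI flags below match the installed garak version.
"""
import subprocess

cmd = [
    "python", "-m", "garak",
    "--model_type", "openai",          # generator family under test
    "--model_name", "gpt-3.5-turbo",   # concrete model to probe (illustrative)
    "--probes", "encoding",            # probe family: encoding-based prompt injection
]

# garak iterates the probe's attack prompts against the model and scores
# responses with its detectors; findings are written to a JSONL report file.
subprocess.run(cmd, check=True)
```

Swapping the `--probes` value selects a different attack family, which is how garak organizes its catalog of jailbreak and injection tests.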

More Readings:

Bag of Tricks: Benchmarking of Jailbreak Attacks on LLMs

Jailbreak Attacks and Defenses Against Large Language Models: A Survey

Safeguarding Large Language Models: A Survey