Posts by Tag

The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning Nathaniel Li, Alexander Pan, Anjali Gopal, Summer Yue, Daniel Berrios, Alice Gatt...

More FM risk

38 minute read

In this session, our readings cover:

LLM multimodal harm responses

14 minute read

In this session, our readings cover:

FM toxicity / harmful outputs

10 minute read

In this session, our readings cover:

FM fairness / bias issues

33 minute read

In this session, our readings cover:

FM privacy leakage issues

14 minute read

In this session, our readings cover:

FM copyright infringement

26 minute read

In this session, our readings cover:

Survey AI Risk framework

14 minute read

In this session, our readings cover:

GenAI Guardrails

19 minute read

In this session, our readings cover:

Safety

Model Interpretability for FM

6 minute read

In this session, our readings cover:

Agent Safety

9 minute read

In this session, our readings cover:

Extra readings - Agent Guardrailing

6 minute read

In this session, our readings cover:

Platform - VLM Jailbreaking / Probing

5 minute read

In this session, our readings cover:

Platform - Model Jailbreaking / Safeguarding

5 minute read

In this session, our readings cover:

Safety Benchmark WMDP

1 minute read

The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning Nathaniel Li, Alexander Pan, Anjali Gopal, Summer Yue, Daniel Berrios, Alice Gatt...

More FM risk

38 minute read

In this session, our readings cover:

LLM multimodal harm responses

14 minute read

In this session, our readings cover:

FM toxicity / harmful outputs

10 minute read

In this session, our readings cover:

LLMEvaluate

Privacy

1 minute read

Beyond Memorization: Violating Privacy Via Inference with Large Language Models

Gradient for LLM

less than 1 minute read

Gradient for LLM

1 minute read

TextGrad: Automatic “Differentiation” via Text

Safety Benchmark WMDP

1 minute read

The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning Nathaniel Li, Alexander Pan, Anjali Gopal, Summer Yue, Daniel Berrios, Alice Gatt...

FM privacy leakage issues

14 minute read

In this session, our readings cover:

FM copyright infringement

26 minute read

In this session, our readings cover:

Survey AI Risk framework

14 minute read

In this session, our readings cover:

LLM evaluating framework

16 minute read

In this session, our readings cover:

BasicLLM

more LLM basics - a survey

1 minute read

In this session, our readings cover:

LLM basics - emergent ability and GenAI platform

less than 1 minute read

Readings:

Introduction

less than 1 minute read

Readings:

Recent LLM basics

21 minute read

In this session, our readings cover:

Open Source LLM - Mistral Data preparation

27 minute read

In this session, our readings cover:

Survey LLMs and Multimodal FMs

2 minute read

In this session, our readings cover:

LLM basics

less than 1 minute read

Required Readings:

Agent

Agent Safety

9 minute read

In this session, our readings cover:

Platform - More agent related

1 minute read

In this session, our readings cover:

Platform - Agent Tooling

less than 1 minute read

In this session, our readings cover:

Platform - Context construction via RAG and Agent

2 minute read

In this session, our readings cover:

MultiAgent LLMs

16 minute read

In this session, our readings cover:

LLM Agents

23 minute read

Required Readings:

Mitigate

Safety Benchmark WMDP

1 minute read

The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning Nathaniel Li, Alexander Pan, Anjali Gopal, Summer Yue, Daniel Berrios, Alice Gatt...

FM privacy leakage issues

14 minute read

In this session, our readings cover:

FM copyright infringement

26 minute read

In this session, our readings cover:

Survey AI Risk framework

14 minute read

In this session, our readings cover:

GenAI Guardrails

19 minute read

In this session, our readings cover:

Applications

Agent - In Healthcare

8 minute read

In this session, our readings cover:

Extra - BioLLM

11 minute read

In this session, our readings cover:

Survey - FMs in Robotics

3 minute read

In this session, our readings cover:

Survey - FMs in healthcare

4 minute read

In this session, our readings cover:

Survey - BioScience LLMs

2 minute read

In this session, our readings cover:

RL

RL course review

2 minute read

Overview

RLHF + InstructGPT

less than 1 minute read

Papers Paper URL Abstract Training language models to follow instructions with human feedback URL ...

Decision Transformers

1 minute read

Decision Transformer: Reinforcement Learning via Sequence Modeling Lili Chen, Kevin Lu, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Piet...

A Generalist Agent + offline RL + UniMask

less than 1 minute read

Papers Paper URL Abstract

AGI

RLHF + InstructGPT

less than 1 minute read

Papers Paper URL Abstract Training language models to follow instructions with human feedback URL ...

Decision Transformers

1 minute read

Decision Transformer: Reinforcement Learning via Sequence Modeling Lili Chen, Kevin Lu, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Piet...

A Generalist Agent + offline RL + UniMask

less than 1 minute read

Papers Paper URL Abstract

language model

RLHF + InstructGPT

less than 1 minute read

Papers Paper URL Abstract Training language models to follow instructions with human feedback URL ...

Emergent Abilities of LLM

1 minute read

Emergent Abilities of Large Language Models URL “an ability to be emergent if it is not present in smaller models but is present in larger models. Thus...

DiffDock + ESMfold

less than 1 minute read

Papers Paper URL Abstract Evolutionary-scale prediction of atomic level protein structure with a language mo...

RAG

Platform - long context vs RAG + Hallucination

5 minute read

In this session, our readings cover:

Platform - Context construction via RAG and Agent

2 minute read

In this session, our readings cover:

Knowledge Augmented FMs

17 minute read

In this session, our readings cover:

Reasoning

advanced LLM - Math reasoning

6 minute read

In this session, our readings cover:

advanced LLM - for code reasoning

3 minute read

In this session, our readings cover:

Self-exam LLM and reasoning

19 minute read

In this session, our readings cover:

Train

Privacy

1 minute read

Beyond Memorization: Violating Privacy Via Inference with Large Language Models

Gradient for LLM

less than 1 minute read

Gradient for LLM

1 minute read

TextGrad: Automatic “Differentiation” via Text

Customization

LLM Post-training

2 minute read

In this session, our readings cover:

LLM Alignment - PPO

4 minute read

In this session, our readings cover:

Platform - Model Customization - instruction tuning / LoRA

2 minute read

In this session, our readings cover:

Alignment

LLM fine tuning

29 minute read

In this session, our readings cover:

Survey human alignment

18 minute read

In this session, our readings cover:

Prompting

Platform - Prompt Engineering tools / Prompt Compression

2 minute read

In this session, our readings cover:

Prompt Engineering

26 minute read

In this session, our readings cover:

Jailbreaking

Platform - VLM Jailbreaking / Probing

5 minute read

In this session, our readings cover:

Platform - Model Jailbreaking / Safeguarding

5 minute read

In this session, our readings cover:

Scaling

Inference test time scaling law

4 minute read

In this session, our readings cover:

More on LLM based agents

9 minute read

In this session, our readings cover:

Protein

DiffDock + ESMfold

less than 1 minute read

Papers Paper URL Abstract Evolutionary-scale prediction of atomic level protein structure with a language mo...

Diffusion

Stable diffusion + DreamBooth + LoRA

1 minute read

Stable diffusion URL “High-Resolution Image Synthesis with Latent Diffusion Models”

Image synthesis

Stable diffusion + DreamBooth + LoRA

1 minute read

Stable diffusion URL “High-Resolution Image Synthesis with Latent Diffusion Models”

Human Alignment

RLHF + InstructGPT

less than 1 minute read

Papers Paper URL Abstract Training language models to follow instructions with human feedback URL ...

Bias

FM fairness / bias issues

33 minute read

In this session, our readings cover:

Hallucination

LLM Hallucination

15 minute read

In this session, our readings cover:

DomainAdapt

Domain Centered FMs

23 minute read

In this session, our readings cover:

ModelEdit

Model editing and Disgorgement

19 minute read

In this session, our readings cover:

Interpretability

LLM interpretability, trust and knowledge conflicts

15 minute read

Required Readings:

Serving

Platform - Model Serving

4 minute read

In this session, our readings cover:

LongContext

Platform - long context vs RAG + Hallucination

5 minute read

In this session, our readings cover:

Planning

Agent - Planning / World Model

7 minute read

In this session, our readings cover:

Multiagent

Agent - multiagent collaboration

6 minute read

In this session, our readings cover:

Multimodal

multimodal FMs - video / audio

5 minute read

In this session, our readings cover:
