
Overview

  • RL Course Material from Prof. Guni at TAMU
    • Excellent slide decks! HERE
    • TAMU CSCE-642, a graduate-level course on Reinforcement Learning in Fall 2025
  • CSCE-642 Reinforcement Learning Course Briefing: Fall 2025
    • CSCE-642 is a comprehensive Reinforcement Learning (RL) course scheduled for the Fall 2025 term. The curriculum is designed to transition students from fundamental RL concepts to state-of-the-art deep learning applications and research. Success in the course requires a specific prerequisite background, active participation in online assessments, and the completion of an original research project. The pedagogical approach combines theoretical foundations—covering Markov Decision Processes (MDPs) and Monte-Carlo methods—with modern techniques such as Deep Q-Learning, Soft Actor-Critic, and Curriculum Learning.
    • The curriculum is supported by four primary textbooks covering the breadth of reinforcement learning, artificial intelligence, and deep learning:

  Title                                       | Scope
  --------------------------------------------|-----------------------------------------------
  Reinforcement Learning: An Introduction     | Foundational RL concepts
  Reinforcement Learning: State-of-the-Art    | Advanced and current RL methodologies
  Artificial Intelligence: A Modern Approach  | General AI framework
  Deep Learning                               | Neural network architectures and optimization

Summary of Lecture Schedule

The course progresses logically from basic multi-armed bandits to advanced topics such as derivative-free optimization and curriculum learning.

Phase 1: Foundations and Classical Methods

The initial sessions establish the mathematical framework for RL, focusing on tabular methods and fundamental algorithms.

  • Introduction and Multi-Armed Bandits: Initial focus on the exploration-exploitation trade-off.
  • Markov Decision Processes (MDPs): Extensive coverage over two sessions to define the environment and agent interactions.
  • Monte-Carlo and Temporal Difference (TD): Introduction to learning from experience and bootstrapping techniques.
  • Model-based RL: Exploring environments where the transition dynamics are known or learned.
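As a toy illustration of the exploration-exploitation trade-off from the bandit lectures, here is a minimal ε-greedy sketch. It is not course code; the function name and the Bernoulli arm probabilities are hypothetical choices for the example.

```python
import random

def epsilon_greedy_bandit(arm_probs, steps=5000, epsilon=0.1, seed=0):
    """Run epsilon-greedy on Bernoulli bandit arms; return value estimates and pull counts."""
    rng = random.Random(seed)
    k = len(arm_probs)
    q = [0.0] * k          # estimated value of each arm
    n = [0] * k            # number of pulls per arm
    for _ in range(steps):
        if rng.random() < epsilon:               # explore: random arm
            a = rng.randrange(k)
        else:                                    # exploit: current best estimate
            a = max(range(k), key=lambda i: q[i])
        reward = 1.0 if rng.random() < arm_probs[a] else 0.0
        n[a] += 1
        q[a] += (reward - q[a]) / n[a]           # incremental sample-mean update
    return q, n

q, n = epsilon_greedy_bandit([0.2, 0.5, 0.8])
print(q, n)   # the highest-probability arm should dominate the pull counts
```

With enough steps, the occasional exploratory pulls keep the estimates accurate while the greedy choice concentrates pulls on the best arm.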

Phase 2: Approximation and Deep Reinforcement Learning

As the course progresses, it moves into high-dimensional state spaces requiring function approximation.

  • Function Approximation: Coverage of Prediction and Control with approximation, specifically utilizing Deep Neural Networks (DNNs).
  • Eligibility Traces: Bridging the gap between TD and Monte-Carlo methods.
  • Deep RL Architectures: Detailed sessions on Deep Q-Learning, Policy Gradient methods, and Actor-Critic frameworks.
  • Advanced Optimization: In-depth analysis of Trust Regions and Soft Actor-Critic (SAC) methods, both of which are allocated multiple sessions for thorough examination.
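The deep variants above all generalize the one-step tabular Q-learning update, which can be sketched on a small chain MDP. The environment and names here are hypothetical illustrations, not material from the course.

```python
import random

def q_learning_chain(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                     epsilon=0.1, seed=0):
    """Tabular Q-learning on a chain: actions move left/right, reward 1 at the right end."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n_states)]    # Q[state][action]; 0 = left, 1 = right
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:                 # rightmost state is terminal
            if rng.random() < epsilon:
                a = rng.randrange(2)             # explore
            else:                                # greedy with random tie-breaking
                best = max(Q[s])
                a = rng.choice([i for i in (0, 1) if Q[s][i] == best])
            s2 = max(s - 1, 0) if a == 0 else s + 1
            r = 1.0 if s2 == n_states - 1 else 0.0
            # One-step TD target: r + gamma * max_a' Q(s', a')
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

Q = q_learning_chain()
print([max(q) for q in Q[:-1]])   # state values grow toward the rewarding end
```

Deep Q-Learning replaces the table `Q[s][a]` with a neural network and the same TD target drives the loss, which is why the tabular case is worth internalizing first.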

Phase 3: Advanced Topics and Research Frontiers

The final segment of the course addresses specialized sub-fields and modern research challenges.

  • Transfer Learning: Methods for applying knowledge from one task to another.
  • Imitation Learning: Learning from expert demonstrations.
  • Derivative-Free Methods: Optimization techniques that do not rely on gradients.
  • Curriculum Learning: Structured learning paths to improve agent training efficiency.
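Of the topics above, derivative-free optimization is the easiest to demonstrate in a few lines. The sketch below is a hypothetical random-search example on a quadratic, assuming nothing from the course slides: it improves a solution using only function evaluations, never gradients.

```python
import random

def random_search(f, x0, steps=500, sigma=0.5, seed=0):
    """Derivative-free hill climbing: keep a Gaussian perturbation only if it lowers f."""
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    for _ in range(steps):
        cand = [xi + rng.gauss(0.0, sigma) for xi in x]   # random perturbation
        fc = f(cand)
        if fc < fx:                                        # accept only improvements
            x, fx = cand, fc
    return x, fx

# Minimize a quadratic with minimum at (3, -2); only f-values are used.
best, val = random_search(lambda v: (v[0] - 3) ** 2 + (v[1] + 2) ** 2, [0.0, 0.0])
print(best, val)
```

More sophisticated derivative-free methods (e.g. evolution strategies) follow the same evaluate-perturb-select loop but aggregate many perturbations per step.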

  • Summary of schedule

  Lecture | Slides
  --------|--------------------------------------
  1       | Introduction.pptx
  2       | Multi-Armed Bandit.pptx
  3       | MDPs.pptx
  4       | MDPs.pptx (continued)
  5       | Monte-Carlo Methods.pptx
  6       | Monte-Carlo Methods.pptx (continued)
  7       | Temporal Difference.pptx
  8       | Bootstrapping.pptx
  9       | Model-based RL.pptx
  10      | Prediction with Approximation.pptx
  11      | DNN Approximation.pptx
  12      | Control with Approximation.pptx
  13      | Eligibility Traces.pptx
  14      | Deep Q-Learning.pptx
  15      | No Class
  16      | Policy Gradient.pptx
  17      | Actor Critic.pptx
  18      | Trust Regions.pptx
  19      | Trust Regions.pptx (continued)
  20      | Soft Actor-Critic.pptx
  21      | Soft Actor-Critic.pptx (continued)
  22      | Transfer Learning.pptx
  23      | Transfer Learning.pptx (continued)
  24      | Imitation Learning.pptx
  25      | Derivative Free.pptx
  26      | Derivative Free.pptx (continued)
  27      | Curriculum Learning.pptx
  28      | Curriculum Learning.pptx (continued)