Unsupervised and Reinforcement Learning

Weekly hours
Competences
Objectives
Contents
Activities
Teaching methodology
Evaluation methodology
Bibliography
Previous capacities

Credits

Types

Compulsory

Requirements

This subject has not requirements, but it has got previous capacities

Department

Web

https://sites.google.com/upc.edu/aprns

This course covers two important fields of machine learning: non-supervised learning and reinforcement learning. Non-supervised learning is a type of machine learning where the algorithm learns patterns and structures from unlabeled data, whereas reinforcement learning is a type of machine learning where the algorithm learns from feedback given in the form of rewards or punishments.

The course will start with an introduction to the fundamental concepts and algorithms of non-supervised deep learning, such as autoencoders, adversarial networks or denosiing diffuson. The course will then move on to reinforcement learning, covering concepts such as Markov Decision Processes, Q-Learning, and Policy Gradient methods. The course will also explore the latest research in these fields, including deep reinforcement learning and unsupervised deep learning.

By the end of the course, students will have a strong foundation in non-supervised and reinforcement learning, and will be able to apply these techniques to real-world problems.

Teachers

Person in charge

Javier Béjar Alonso ( )
Mario Martín Muñoz ( )

Weekly hours

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Competences

Transversal Competences

Transversals

CT6 [Avaluable] - Autonomous Learning. Detect deficiencies in one's own knowledge and overcome them through critical reflection and the choice of the best action to extend this knowledge.

Basic

CB5 - That the students have developed those learning skills necessary to undertake later studies with a high degree of autonomy

Technical Competences

Especifics

CE18 - To acquire and develop computational learning techniques and to design and implement applications and systems that use them, including those dedicated to the automatic extraction of information and knowledge from large volumes of data.

Generic Technical Competences

Generic

CG2 - To use the fundamental knowledge and solid work methodologies acquired during the studies to adapt to the new technological scenarios of the future.
CG4 - Reasoning, analyzing reality and designing algorithms and formulations that model it. To identify problems and construct valid algorithmic or mathematical solutions, eventually new, integrating the necessary multidisciplinary knowledge, evaluating different alternatives with a critical spirit, justifying the decisions taken, interpreting and synthesizing the results in the context of the application domain and establishing methodological generalizations based on specific applications.

Objectives

To distinguish the kind of problems can be modeled as a reinforcement learning problem and identify the techniques that can be applied to solve them.
Related competences: CG2, CT6, CE18,
To understand the need, fundamentals, and particularities of behavior learning and the differences it has from supervised and unsupervised machine learning.
Related competences: CG2, CE18,
To understand the most important algorithms and state of the art in the area of learning by reinforcement
Related competences: CG4, CE18,
To know how to computationally formalize a real world problem as learning by reinforcement and know how to implement in the most current environments the learning algorithms that solve them
Related competences: CG2, CG4, CT6, CE18,
Know the problems that can be modeled with deep unsupervised algorithms
Related competences: CG2, CT6, CE18,
Understand the particularities of deep unsupervised algorithms
Related competences: CG4, CT6, CE18,
Know the most important algorithms and the state of the art of deep unsupervised learning
Related competences: CG2, CT6, CB5, CE18,
Knowing how to implement and apply deep learning algorithms to a problem using the most current environment
Related competences: CG2, CT6, CB5, CE18,

Introduction: Behavior Learning in Agents and description of main elements in Reinforcement Learning
Intuition, motivation and definition of the reinforcement learning (RL) framework. Key elements in RL.
Finding optimal policies using Dynamic Programming
How to learn the optimal policy with full knowledge of the world model: algebraic solution, policy iteration and value iteration.
Introduction to Model-Free approaches.
Basic algorithms for reinforcement learning: Monte-Carlo, Q-learning, Sarsa, TD(lambda). The need for Exploration. Differences between On-policy and Off-policy methods.
Function approximation in Reinforcement Learning
Need for function approximation and Incremental methods in RL. The Gradient Descent approach. RL with Linear function approximation. The deadly triad for function approximation in RL. Batch methods and Neural Networks for function Approximation.
Deep Reinforcement Learning (DRL)
Introducing Deep Learning in RL. Dealing with the deadly triad with the DQN algorithm. Application to the Atari games case. Evolutions of the DQN algorithm: Double DQN, Prioritized Experience Replay, multi-step learning and Distributional value functions. Rainbow: the state-of-the-art algorithm in discrete action space.
Policy gradient methods
What to do in continuous action spaces. How probabilistic policies allow to apply the gradient method directly in the policy network. The REINFORCE algorithm. The Actor-Critic algorithms. State-of-the-art algorithms in continuous action spaces: DDPG, TD3 and SAC.
Advanced Topics: How to deal with sparse rewards
The problem of the sparse reward. Introduction to advanced exploration techniques: curiosity and empowerment in RL. Introduction to curriculum learning to easy the learning of the goal. Hierarchical RL to learn complex tasks. The learning of Universal Value Functions and Hindsight Experience Replay (HER).
Reinforcement Learning in the multi-agent framework
Learning of behaviors in environment where several agents act. Learning of cooperative behaviors, Learning of competitive behaviors, and mixed cases. State-of-the art algorithms. The special case of games: The AlfaGo case and the extension to Alfa-Zero.
Introduction: Deep unsupervised learning
Introduction to the need for deep unsupervised learning and its applications
Autoregressive models
Introduction to learning probability distributions defined as autoregressive distributions and main models
Normalizing flows
Introduction to normalized flows for learning probability distributions
Latent variables models
Introduction to models based on latent variables and variational autoencoders
Generative Adversarial Networks
Introduction to generative adversarial networks, conditional and unconditional generation, attribute disentanglement
Denoising Diffusion netwoks
Introduction to models based on noise diffusion, denoising networks, conditioning, multimodal generation
Self supervised learning
Introduction to self-supervised learning for feature-generating networks and embeddings, contrastive and non contrastive methods, masking

Activities

Activity Evaluation act

Introduction: Behavior Learning in Agents and description of main elements in Reinforcement Learning

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Finding optimal policies using Dynamic Programming

How to learn the optimal policy with full knowledge of the world model: algebraic solution, policy iteration and value iteration.

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Introduction to Model-Free approaches. Monte-Carlo, Q-learning, Sarsa, TD(lambda)

Development of the corresponding topic

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Function approximation in RL

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Deep Reinforcement Learning (DRL)

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Policy gradient methods

What to do in continuous action spaces. How probabilistic policies allow to apply the gradient method directly in the policy network. The REINFORCE algorithm. The Actor-Critic algorithms. State-of-the-art algorithms in continuous action spaces: DDPG, TD3 and SAC.

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Advanced Topics: How to deal with sparse rewards

The problem of the sparse reward. Introduction to advanced exploration techniques: curiosity and empowerment in RL. Introduction to curriculum learning to easy the learning of the goal. Hierarchical RL to learn complex tasks. The learning of Universal Value Functions and Hindsight Experience Replay (HER).

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Reinforcement Learning in the multi-agent framework

Learning of behaviors in environment where several agents act. Learning of cooperative behaviors, Learning of competitive behaviors, and mixed cases. State-of-the art algorithms. The special case of games: The AlfaGo case and the extension to Alfa-Zero.

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Control of the reinforcement learning part

Objectives: 3 4 2 1
Week: 8 (Outside class hours)

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Introduction: Deep unsupervised learning

Introduction to the need for deep unsupervised learning and its applications

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Autoregressive models

Introduction to learning probability distributions defined as autoregressive distributions and main models

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Normalizing flows

Introduction to normalized flows for learning probability distributions

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Latent variables models

Introduction to models based on latent variables and variational autoencoders

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Generative Adversarial Networks

Introduction to generative adversarial networks, conditional and unconditional generation, attribute disentanglement

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Denoising Diffusion netwoks and Self supervised learning

Introduction to models based on noise diffusion, denoising networks, conditioning, multimodal generation

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Unsupervised learning syllabus control

Objectives: 5 6 7 8
Week: 15 (Outside class hours)

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Teaching methodology

The classes are divided into theory, problem and laboratory sessions.

In the theory sessions, knowledge of the subject will be developed, interspersed with the presentation of new theoretical material with examples and interaction with the students in order to discuss the concepts.

In the laboratory classes, small practices will be developed using tools and using specific libraries that will allow you to practice and reinforce the knowledge of the theory classes.

Evaluation methodology

The subject will include the following assessment acts:

- Reports of the laboratory activities, which must be delivered within the deadline indicated for each session (roughly, 2 weeks). Based on a weighted average of the grades of these reports, a laboratory grade will be calculated, L.

- A first partial exam, taken towards the middle of the course, of the material seen until then. Let P1 be the grade obtained in this exam.

- On the designated day within the exam period, a second partial exam of the subject not covered by the first partial. Let P2 be the grade obtained in this exam.

The three grades L, P1, P2 are between 0 and 10.

The final grade of the subject will be: 0.4*L +0.3*P1 + 0.3*P2

Only can do the re-evaluation those people who, have failed the final exam. The maximum mark that can be obtained in the re-evaluation is a 7.

Bibliography

Basic:

Reinforcement learning : an introduction - Sutton, Richard S; Barto, Andrew G, The MIT Press, [2020]. ISBN: 9780262039246
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991004166329706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
Grokking deep reinforcement learning - Morales, Miguel, Manning Publications, 2020. ISBN: 9781617295454
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991004208939706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
Generative deep learning: teaching machines to paint, write, compose, and play - Foster, D, O'Reilly Media, Incorporated, 2023. ISBN: 9781098134143
Hands-on image generation with TensorFlow: a practical guide to generating images and videos using deep learning - Cheong, S.Y, Packt Publishing, 2020. ISBN: 9781838821104

Complementary:

Deep reinforcement learning in action - Zai, Alexander; Brown, Brandon, Manning Publications Co , 2020. ISBN: 9781617295430
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991004203829706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
Generative AI with Python and TensorFlow 2: harness the power of generative models to create images, text, and music - Babcock, J.; Bali, R, Packt Publishing , 2021. ISBN: 9781800208506

Previous capacities

Basic knowledge of Deep Learning and Machine Learning.

Unsupervised and Reinforcement Learning

Teachers

Person in charge

Weekly hours

Competences

Transversal Competences

Transversals

Basic

Technical Competences

Especifics

Generic Technical Competences

Generic

Objectives

Contents

Activities

Introduction: Behavior Learning in Agents and description of main elements in Reinforcement Learning

Finding optimal policies using Dynamic Programming

Introduction to Model-Free approaches. Monte-Carlo, Q-learning, Sarsa, TD(lambda)

Function approximation in RL

Deep Reinforcement Learning (DRL)

Policy gradient methods

Advanced Topics: How to deal with sparse rewards

Reinforcement Learning in the multi-agent framework

Control of the reinforcement learning part

Introduction: Deep unsupervised learning

Autoregressive models

Normalizing flows

Latent variables models

Generative Adversarial Networks

Denoising Diffusion netwoks and Self supervised learning

Unsupervised learning syllabus control

Teaching methodology

Evaluation methodology

Bibliography

Basic:

Complementary:

Previous capacities

Where we are

Contact with us

Unsupervised and Reinforcement Learning

You are here

Teachers

Person in charge

Weekly hours

Competences

Transversal Competences

Transversals

Basic

Technical Competences

Especifics

Generic Technical Competences

Generic

Objectives

Contents

Activities

Introduction: Behavior Learning in Agents and description of main elements in Reinforcement Learning

Finding optimal policies using Dynamic Programming

Introduction to Model-Free approaches. Monte-Carlo, Q-learning, Sarsa, TD(lambda)

Function approximation in RL

Deep Reinforcement Learning (DRL)

Policy gradient methods

Advanced Topics: How to deal with sparse rewards

Reinforcement Learning in the multi-agent framework

Control of the reinforcement learning part

Introduction: Deep unsupervised learning

Autoregressive models

Normalizing flows

Latent variables models

Generative Adversarial Networks

Denoising Diffusion netwoks and Self supervised learning

Unsupervised learning syllabus control

Teaching methodology

Evaluation methodology

Bibliography

Basic:

Complementary:

Previous capacities

Where we are

Contact with us