The goals of this course are two-fold: first, to provide students with a sufficient mathematical and computational background to analyze distributed intelligent systems through appropriate models, and second, to illustrate several coordination strategies and show how to concretely implement and optimize them. The course is a well-balanced mixture of theory and laboratory exercises using simulation and real hardware platforms. It involves the following topics: 1) introduction to key concepts such as self-organization and software and hardware tools used in the course, 2) examples of natural, artificial, and hybrid distributed intelligent systems, 3) machine-learning methods: single- and multi-agent techniques, and 4) coordination strategies and distributed control.
Teachers
Person in charge
Sergio Álvarez Napagao
Others
Javier Vazquez Salceda
Ramon Sangüesa Sole
Ulises Cortés García
Weekly hours
Theory
2
Problems
0
Laboratory
2
Guided learning
0
Autonomous learning
6
Competences
Transversal Competences
Teamwork
G5 [Assessable] - To be able to work as a team member, whether as just another member or performing management tasks, in order to contribute to developing projects pragmatically and with a sense of responsibility, making commitments that take the available resources into account.
G5.3
- To identify the roles, skills and weaknesses of the different group members. To propose improvements to the group structure. To interact effectively and professionally. To negotiate and manage conflicts within the group. To recognize and support, or take on, the leader role in the working group. To evaluate and present the results of the group's work. To represent the group in negotiations with other people. To be able to collaborate in a multidisciplinary environment. To know and apply techniques for promoting creativity.
Technical Competences of each Specialization
Computer science specialization
CCO2 - To develop, effectively and efficiently, the appropriate algorithms and software to solve complex computational problems.
CCO2.1
- To demonstrate knowledge of the fundamentals, paradigms and techniques specific to intelligent systems, and to analyse, design and build computer systems, services and applications that use these techniques in any applicable field.
CCO2.2
- Capacity to acquire, obtain, formalize and represent human knowledge in a computable way to solve problems through a computer system in any applicable field, in particular in the fields related to computation, perception and operation in intelligent environments.
Objectives
To master the basic concepts of Distributed Artificial Intelligence
Related competences: G9.1, CCO2.1, CCO2.2
To become familiar with the intelligent agent paradigm as a key piece in the construction of multi-agent systems
Related competences: G7.1, G9.1, G5.3, CCO2.1, CCO2.2
To know the logical and computational models that allow the construction of goal-oriented agents
Related competences: G7.1, G9.1, G5.3, CCO2.1, CCO2.2
To know the logical and computational models that allow the construction of utility-driven agents
Related competences: G7.1, G9.1, G5.3, CCO2.1, CCO2.2
To know the different methodologies, algorithms and technologies to train agents through reinforcement learning
Related competences: G7.1, G9.1, G5.3, CCO2.1, CCO2.2
To learn the basic concepts of game theory and its relationship with multi-agent systems
Related competences: G7.1, G9.1, G5.3, CCO2.1, CCO2.2
To learn several cooperation methodologies and algorithms for agents in a multi-agent system
Related competences: G7.1, G9.1, G5.3, CCO2.1, CCO2.2
To know various methodologies and algorithms for the competition between agents in a multi-agent system
Related competences: G7.1, G9.1, G5.3, CCO2.1, CCO2.2
To understand the most relevant aspects of the field of Mechanism Design
Related competences: G9.1, CCO2.1
To know and to understand the social and ethical implications of Artificial Intelligence applied to systems capable of making decisions autonomously
Related competences: G9.1, CCO2.1
Contents
Introduction: intelligent distributed systems
Perspectives on Artificial Intelligence.
Introduction to distributed computing systems.
Cognitive architecture paradigm and historical overview.
Introduction to multi-agent systems.
Intelligent agents
Definition of intelligent agent.
Rationality.
Bounded rationality.
Definition of environment.
Properties of an environment.
Intelligent agent architectures: reactive, goal-driven deliberative, utility-driven deliberative, adaptive.
Goal-driven agents
What is a logic-symbolic agent.
Modal logic.
Possible worlds logic.
Alethic, doxastic, epistemic modal logics.
Goal-guided practical reasoning: the agent as an intentional system.
Implementation of a goal-driven agent: the agent control loop (see the sketch after this list).
Commitment management with respect to a goal.
BDI logic (Belief-Desire-Intention).
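
As an illustration of the agent control loop above, here is a minimal sketch of a BDI-style deliberation cycle in Python (the laboratory language of the course). The BDIAgent class and the desire interface (achievable, done, next_action) are invented for illustration; they are not a specific course library.

    # Minimal BDI-style control loop (illustrative names, not a course library).
    class BDIAgent:
        def __init__(self, beliefs, desires):
            self.beliefs = beliefs      # what the agent currently holds true
            self.desires = desires      # candidate goals
            self.intention = None       # the goal the agent is committed to

        def perceive(self, observation):
            """Belief revision: fold a new observation into the belief base."""
            self.beliefs.update(observation)

        def deliberate(self):
            """Commit to an achievable desire (here, simply the first one)."""
            options = [d for d in self.desires if d.achievable(self.beliefs)]
            self.intention = options[0] if options else None

        def step(self, observation):
            self.perceive(observation)
            # Reconsider commitment only when the intention is absent or done:
            # a simple single-minded commitment strategy.
            if self.intention is None or self.intention.done(self.beliefs):
                self.deliberate()
            return self.intention.next_action(self.beliefs) if self.intention else None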
Ontologies
Representing the world: ontology and epistemology.
The semiotic triangle.
Elements of an ontology.
Representation languages: OWL and RDF.
Knowledge graphs.
Ontological reasoning.
Description logic: ABox, TBox.
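
As a small illustration of these elements, the following sketch uses the rdflib Python library to build a tiny knowledge graph with a TBox (class hierarchy) and an ABox (facts about individuals); the http://example.org/onto# namespace and the toy ontology are made up for the example.

    from rdflib import Graph, Namespace, RDF, RDFS

    EX = Namespace("http://example.org/onto#")
    g = Graph()

    # TBox: terminological axioms (classes and their hierarchy).
    g.add((EX.Dog, RDF.type, RDFS.Class))
    g.add((EX.Animal, RDF.type, RDFS.Class))
    g.add((EX.Dog, RDFS.subClassOf, EX.Animal))

    # ABox: assertional axioms (facts about individuals).
    g.add((EX.rex, RDF.type, EX.Dog))

    # Serialize the graph in Turtle, one of the RDF syntaxes.
    print(g.serialize(format="turtle"))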
Utility-driven agents
Goals vs utility.
Definition of utility.
Reward hypothesis and reward signal.
Definition of sequential decision problem.
Markov Decision Processes (MDPs).
Trajectories and policies: discount factor.
Algorithms for solving MDPs: policy evaluation and value iteration (see the sketch after this list).
Brief Introduction to Partially Observable Markov Decision Processes (POMDPs).
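
As a complement to the list above, here is a minimal sketch of value iteration on a toy MDP given as explicit tables; the transition model is invented for illustration.

    # Toy MDP: 3 states, 2 actions. P[s][a] is a list of (prob, next_state, reward).
    P = {
        0: {0: [(1.0, 0, 0.0)], 1: [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
        1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 2, 10.0)]},
        2: {0: [(1.0, 2, 0.0)], 1: [(1.0, 2, 0.0)]},  # absorbing state
    }
    gamma, theta = 0.9, 1e-8  # discount factor and convergence threshold

    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            # Bellman optimality backup: V(s) = max_a sum_s' p * (r + gamma * V(s'))
            q = [sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a]) for a in P[s]]
            v_new = max(q)
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < theta:
            break

    # Greedy policy extraction from the converged value function.
    policy = {s: max(P[s], key=lambda a: sum(p * (r + gamma * V[s2])
                                             for p, s2, r in P[s][a])) for s in P}
    print(V, policy)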
Reinforcement learning
Multi-armed bandits: exploration vs exploitation.
How to learn to decide: reinforcement learning, categorization and taxonomy.
Model-based Monte Carlo.
Temporal-difference learning algorithms: SARSA and Q-Learning (see the sketch after this list).
Policy gradient algorithms: REINFORCE.
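
Building on the list above, here is a minimal sketch of tabular Q-learning; the environment is assumed to follow the Gymnasium step API (reset/step) used in the course laboratory, but any equivalent interface would do.

    import random
    from collections import defaultdict

    def q_learning(env, episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1):
        """Tabular Q-learning over an environment with discrete states and actions."""
        Q = defaultdict(float)                 # Q[(state, action)], defaults to 0
        actions = list(range(env.action_space.n))
        for _ in range(episodes):
            state, _ = env.reset()
            done = False
            while not done:
                # Epsilon-greedy trade-off between exploration and exploitation.
                if random.random() < epsilon:
                    action = random.choice(actions)
                else:
                    action = max(actions, key=lambda a: Q[(state, a)])
                next_state, reward, terminated, truncated, _ = env.step(action)
                done = terminated or truncated
                # Off-policy TD target (Q-learning); SARSA would instead use
                # the action actually selected in next_state.
                best_next = 0.0 if done else max(Q[(next_state, a)] for a in actions)
                Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
                state = next_state
        return Q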
Multi-agent systems: Game Theory
Why formalize multi-agent systems: Braess's paradox.
Definition of multi-agent environment and multi-agent system.
Brief introduction to computational models for multi-agent systems: MDPs, DCOPs, planning, distributed systems, socio-technical systems, game theory.
Introduction to Normal Form Game Theory: the prisoner's dilemma (see the sketch after this list).
Solution concepts: dominant strategy, minimax and maximin strategies, Nash equilibrium.
How to compute expected reward.
Equilibrium efficiency: price of anarchy, Pareto optimality.
Introduction to multi-agent coordination: competition vs cooperation.
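
The prisoner's dilemma above can be analyzed with a few lines of code; the sketch below enumerates the pure-strategy profiles of a 2x2 payoff matrix (the payoffs are the usual illustrative values) and checks the Nash equilibrium condition that no player gains by deviating unilaterally.

    from itertools import product

    # Prisoner's dilemma payoffs (row player, column player); 0 = cooperate, 1 = defect.
    payoff = {
        (0, 0): (-1, -1), (0, 1): (-3, 0),
        (1, 0): (0, -3),  (1, 1): (-2, -2),
    }

    def pure_nash(payoff, n_actions=2):
        eqs = []
        for a, b in product(range(n_actions), repeat=2):
            # No profitable unilateral deviation for either player.
            row_ok = all(payoff[(a, b)][0] >= payoff[(a2, b)][0] for a2 in range(n_actions))
            col_ok = all(payoff[(a, b)][1] >= payoff[(a, b2)][1] for b2 in range(n_actions))
            if row_ok and col_ok:
                eqs.append((a, b))
        return eqs

    print(pure_nash(payoff))  # [(1, 1)]: mutual defection, which is not Pareto-optimal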
Cooperation
What is cooperation?
Challenges, structures and modes of cooperation.
Brief introduction to theories and models of cooperation.
Theory of Coalitions.
Definition of superadditive, simple and convex games.
Fair coalitional game: the Shapley value (see the sketch after this list).
Stable coalitional game: the Core.
Social choice theory: Condorcet's paradox and desirable properties.
Functions of social choice: majority, plurality, Condorcet, Borda, Hare, fixed agenda, dictatorial.
Introduction to consensus algorithms: Paxos.
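
To make the Shapley value concrete, here is a brute-force sketch that averages marginal contributions over all player orderings; the 3-player game at the end is invented, and the factorial enumeration is only practical for very small games.

    from itertools import permutations

    def shapley(players, v):
        """Shapley value of a coalitional game with characteristic function v,
        where v maps a frozenset of players to the coalition's worth."""
        phi = {p: 0.0 for p in players}
        orders = list(permutations(players))
        for order in orders:
            coalition = frozenset()
            for p in order:
                # Marginal contribution of p when joining this coalition.
                phi[p] += v(coalition | {p}) - v(coalition)
                coalition = coalition | {p}
        return {p: phi[p] / len(orders) for p in players}

    # Toy game: any coalition of 2 or more players earns 1, singletons earn 0.
    v = lambda S: 1.0 if len(S) >= 2 else 0.0
    print(shapley([1, 2, 3], v))  # by symmetry, each player gets 1/3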
Competition
What is competition?
Competition theories and models.
Definition of game in extensive form.
Reduction of extensive form to normal form.
How to compute a Nash equilibrium: the backward induction algorithm (see the sketch after this list).
Negotiation as a mechanism of competition.
Bargaining problem definition and how to solve it using backward induction (subgame perfect equilibria).
Nash bargaining solution.
Competition resolution as an adversarial game: Minimax, Expectiminimax, Monte Carlo tree search.
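
The backward induction algorithm above admits a very compact recursive sketch for perfect-information games; the game-tree encoding and the example payoffs are invented for illustration.

    def backward_induction(node):
        """Return the payoff vector reached under subgame perfect play.
        A node is either a tuple of payoffs (leaf) or (player, [children])."""
        if isinstance(node, tuple):          # leaf: one payoff per player
            return node
        player, children = node
        values = [backward_induction(c) for c in children]
        # The player to move picks the subtree maximizing their own payoff.
        return max(values, key=lambda v: v[player])

    # Toy game: player 0 moves first, then player 1 responds.
    game = (0, [
        (1, [(3, 1), (0, 0)]),   # after 0's first action, 1 picks (3,1) over (0,0)
        (1, [(2, 2), (1, 3)]),   # after 0's second action, 1 picks (1,3)
    ])
    print(backward_induction(game))  # (3, 1): the subgame perfect outcome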
Mechanism design
Definition of mechanism.
Theory of implementation.
Incentive compatibility.
Principle of revelation.
Design of mechanisms seen as an optimization problem.
Example of type of mechanism: auctions.
Market mechanisms.
Naive, first-price and second-price auctions (Vickrey-Clarke-Groves); see the sketch after this list.
Example of auction and consensus combination.
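
As a small illustration of the second-price rule, the sketch below implements a sealed-bid auction in which the highest bidder wins but pays the second-highest bid, the property that makes truthful bidding a dominant strategy; the bids are invented.

    def second_price_auction(bids):
        """Sealed-bid second-price (Vickrey) auction over a dict bidder -> bid."""
        ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
        winner, _ = ranked[0]
        # The winner pays the second-highest bid, not their own.
        price = ranked[1][1] if len(ranked) > 1 else 0.0
        return winner, price

    print(second_price_auction({"a": 10.0, "b": 7.0, "c": 4.0}))  # ('a', 7.0)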
Multi-agent reinforcement learning
From game theory to reinforcement learning: stochastic games and partially observable stochastic games.
How to add communication to a stochastic game.
Definition of multi-agent reinforcement learning problem.
Computing expected utility: individual policy vs joint policy.
Solution concepts: equilibria, Pareto optimality, social welfare, no-regret.
Training process, convergence guarantees and types of convergence to a solution: what happens when policies are non-stationary.
Training methodologies based on reduction to single-agent learning: centralized learning, independent learning, self-play (AlphaZero); see the sketch after this list.
Multi-agent training algorithms: Joint Action Learning, Agent Modeling.
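
As a taste of independent learning and its non-stationarity problem, the sketch below trains two independent Q-learners on a repeated prisoner's dilemma; each learner treats the other as part of the environment, so single-agent convergence guarantees no longer apply. The payoffs and hyperparameters are illustrative.

    import random

    payoff = {(0, 0): (-1, -1), (0, 1): (-3, 0), (1, 0): (0, -3), (1, 1): (-2, -2)}

    def independent_learners(rounds=5000, alpha=0.1, epsilon=0.1):
        Q = [[0.0, 0.0], [0.0, 0.0]]    # Q[player][action] for the stateless game
        for _ in range(rounds):
            # Each player acts epsilon-greedily on its own Q-values.
            acts = tuple(
                random.randrange(2) if random.random() < epsilon
                else max((0, 1), key=lambda a: Q[i][a])
                for i in range(2)
            )
            rewards = payoff[acts]
            for i in range(2):
                # Stateless update: the other player's adaptation makes the
                # reward distribution non-stationary from i's point of view.
                Q[i][acts[i]] += alpha * (rewards[i] - Q[i][acts[i]])
        return Q

    print(independent_learners())  # typically both players learn to defect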
Symbolic models for social AI
Introduction to socio-technical systems: impact on society of intelligent distributed systems.
Social coordination and organizational models: social abstractions, norms, roles.
Electronic organizations: OperA.
Normative models: electronic institutions, HarmonIA.
Holistic models: OMNI.
Agents and ethics
Review of the concepts of intelligent agent and rational agent.
Relationship between agency and intelligence.
Social and ethical issues of Artificial Intelligence: privacy, responsible AI.
Activities
Activity / Evaluation act
Introduction: intelligent distributed systems
Theory: Perspectives on Artificial Intelligence.
Introduction to distributed computing systems.
Cognitive architecture paradigm and historical overview.
Introduction to multi-agent systems.
Goal-driven agents
Theory: What is a logic-symbolic agent.
Modal logic.
Possible worlds logic.
Alethic, doxastic, epistemic modal logics.
Goal-guided practical reasoning: the agent as an intentional system.
Implementation of a goal-driven agent: the agent control loop.
Commitment management with respect to a goal.
BDI logic (Belief-Desire-Intention).
Laboratory: Introduction to Python.
Setting up the Python environment.
Installation of the multi-agent environment.
Practical work with a logic-symbolic language for goal-driven agents.
Development of goal-driven agents.
In this activity, the students, organized in groups, will have to analyze a recent academic article in which a novel agent architecture is presented. Objectives: 1, 2. Week: 3 (Outside class hours)
Theory
0h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
3h
Ontologies
Theory: Representing the world: ontology and epistemology.
The semiotic triangle.
Elements of an ontology.
Representation languages: OWL and RDF.
Knowledge graphs.
Ontological reasoning.
Description logic: ABox, TBox.
Laboratory: Learn how to use Protégé to define concepts using description logic: definition by inclusion and by equivalence.
Implementation of other description logic axioms.
How to do ontological reasoning: theory and practice.
Utility-driven agents
Theory: Goals vs utility.
Definition of utility.
Reward hypothesis and reward signal.
Definition of sequential decision problem.
Markov Decision Processes (MDPs).
Trajectories and policies: discount factor.
Algorithms for solving MDPs: policy evaluation and value iteration.
Brief Introduction to Partially Observable Markov Decision Processes (POMDPs).
Laboratory: Practical exercises in solving Markov decision processes (MDPs).
How to formalize a problem as an MDP.
Solving an MDP with policy evaluation and value iteration.
In this activity, groups of students will have to modify an already existing ontology to apply a set of description logic axioms, both on paper and in an ontology design tool (e.g. Protégé). Objectives: 3. Week: 5 (Outside class hours)
Theory
0h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
3h
Reinforcement learning
Theory: Multi-armed bandits: exploration vs exploitation.
How to learn to decide: reinforcement learning, categorization and taxonomy.
Model-based Monte Carlo.
Temporal-difference learning algorithms: SARSA and Q-Learning.
Policy gradient algorithms: REINFORCE.
Laboratory: Introduction to the Gymnasium library for agent simulation and training.
Reinforcement learning practice with a working environment: value iteration, direct estimation, Q-Learning, REINFORCE.
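
To show the interaction loop the laboratory uses, below is a minimal random-agent rollout with the Gymnasium API; the environment name is just an example, not necessarily the one used in the assignments.

    import gymnasium as gym

    env = gym.make("FrozenLake-v1")
    obs, info = env.reset(seed=0)
    done, total = False, 0.0
    while not done:
        action = env.action_space.sample()   # placeholder for a learned policy
        obs, reward, terminated, truncated, info = env.step(action)
        total += reward
        done = terminated or truncated
    env.close()
    print("episode return:", total)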
In this laboratory assignment, the teams of students will design and develop intelligent agents in a complex environment, using techniques and logics seen in the theory and laboratory sessions. Objectives: 1, 2, 3. Week: 6 (Outside class hours)
Theory
0h
Problems
0h
Laboratory
0.5h
Guided learning
0h
Autonomous learning
20h
Partial Exam
The partial exam will take place during standard class hours. Students who do not pass it will be evaluated again in the final exam. Objectives: 1, 2, 3, 4, 5. Week: 7
Theory
2h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
10h
Multi-agent systems: Game Theory
Theory: Why formalize multi-agent systems: Braess's paradox.
Definition of multi-agent environment and multi-agent system.
Brief introduction to computational models for multi-agent systems: MDPs, DCOPs, planning, distributed systems, socio-technical systems, game theory.
Introduction to Normal Form Game Theory: the prisoner's dilemma.
Solution concepts: dominant strategy, minimax and maximin strategies, Nash equilibrium.
How to compute expected reward.
Equilibrium efficiency: price of anarchy, Pareto optimality.
Introduction to multi-agent coordination: competition vs cooperation.
Laboratory: Solving normal-form game exercises: problem modeling, computation of strategies and equilibria, price of anarchy and Pareto optimality.
Best-response algorithm for finding dominant strategies and equilibria: theory and practice.
Algorithm for computing mixed equilibria: theory and practice.
Cooperation
Theory: What is cooperation?
Challenges, structures and modes of cooperation.
Brief introduction to theories and models of cooperation.
Theory of Coalitions.
Definition of superadditive, simple and convex games.
Fair coalitional game: Shapley value.
Stable coalitional game: the Core.
Social choice theory: Condorcet's paradox and desirable properties.
Functions of social choice: majority, plurality, Condorcet, Borda, Hare, fixed agenda, dictatorial.
Introduction to consensus algorithms: Paxos.
Laboratory: Solving coalitional games.
Practical computation of the Shapley value and the Core.
Solving social choice exercises.
Competition
Theory: What is competition?
Competition theories and models.
Definition of game in extensive form.
Reduction of extensive form to normal form.
How to compute Nash Equilibrium: the backward induction algorithm.
Negotiation as a mechanism of competition.
Bargaining problem definition and how to solve it using backward induction (subgame perfect equilibria).
Nash bargaining solution.
Competition resolution as an adversarial game: Minimax, Expectiminimax, Monte Carlo tree search.
Laboratory: Solving competition problems.
Formalization of problems as games in extensive form.
Reduction of extensive form to normal form.
Formalization and resolution of bargaining problems.
Application of backward induction to find Nash equilibria and SPE (subgame perfect equilibria).
Student teams will be required to write a report with a comparative study of the performance of various reinforcement learning techniques in a proposed environment. Objectives: 4, 5. Week: 10 (Outside class hours)
Theory
0h
Problems
0h
Laboratory
0.5h
Guided learning
0h
Autonomous learning
20h
Mechanism design
Theory: Definition of mechanism.
Theory of implementation.
Incentive compatibility.
Principle of revelation.
Design of mechanisms seen as an optimization problem.
Example of type of mechanism: auctions.
Market mechanisms.
Naive, first-price and second-price auctions (Vickrey-Clarke-Groves).
Example of auction and consensus combination.
Students will have to deliver solutions to game theory exercises proposed in Racó, potentially including games in normal form, coalitional games, games in extensive form and/or bargaining problems. Objectives: 6, 7, 8. Week: 11 (Outside class hours)
Theory
0h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
4h
Multi-agent reinforcement learning
Theory: From game theory to reinforcement learning: stochastic games and partially observable stochastic games.
How to add communication to a stochastic game.
Definition of multi-agent reinforcement learning problem.
Computing expected utility: individual policy vs joint policy.
Solution concepts: equilibria, Pareto optimality, social welfare, no-regret.
Training process, convergence guarantees and types of convergence to a solution: what happens when policies are non-stationary.
Training methodologies based on reduction to single-agent learning: centralized learning, independent learning, self-play (AlphaZero).
Multi-agent training algorithms: Joint Action Learning, Agent Modeling.
Laboratory: Introduction to multi-agent reinforcement learning environments.
Reinforcement learning in adversarial games: self-play MCTS and AlphaZero.
Practical work with various methodologies to train agents in environments of mixed interests: joint-action learning, agent modeling, policy gradient.
Symbolic models for social AI
Theory: Introduction to socio-technical systems: impact on society of intelligent distributed systems.
Social coordination and organizational models: social abstractions, norms, roles.
Electronic organizations: OperA.
Normative models: electronic institutions, HarmonIA.
Holistic models: OMNI.
Agents and ethics
Theory: Review of the concepts of intelligent agent and rational agent.
Relationship between agency and intelligence.
Social and ethical issues of Artificial Intelligence: privacy, responsible AI.
Student teams will have to write a report with a comparative study of the performance of various multi-agent reinforcement learning techniques in a proposed environment that is cooperative, competitive, or a mixture of the two. Objectives: 5, 6, 7, 8. Week: 15 (Outside class hours)
Theory
0h
Problems
0h
Laboratory
1h
Guided learning
0h
Autonomous learning
20h
Final Exam
Final exam for all the course contents. Objectives: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10. Week: 15 (Outside class hours)
Theory
3h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
10h
Teaching methodology
The teaching methodology consists of lectures presenting the theory and of the application of concepts in problem-solving and laboratory classes.
The examination will be the same for all groups.
Evaluation methodology
Evaluation is based on a partial exam and a final exam, the grading of course assignments, and a grade for laboratory work. The partial and final exams will test the theoretical knowledge and the methodology acquired by students during the course. The grade for course assignments will be based on submissions of small problems set during the course. Lab grades will be based on students' reports and on the practical lab work carried out throughout the course.
Around the middle of the 4-month term there will be an exemptive partial exam covering the first half of the course (exemptive only if the grade is 5 or more). The final exam will cover both the first and the second part of the course; its first half is compulsory for students who did not pass the partial exam, and optional for the rest. The maximum of the two grades (the partial exam and the first half of the final exam) will stand as the grade for the first part.
The final grade will be calculated as follows:
GPar = partial exam grade
GEx1 = grade of the 1st half of the final exam
GEx2 = grade of the 2nd half of the final exam
Total exams grade = [max(GPar, GEx1) + GEx2] / 2
Final grade = Total exams grade * 0.5 + Exercises grade * 0.2 + Lab grade * 0.3 (code + report)
For example, with GPar = 6, the first half of the final exam skipped, GEx2 = 7, an exercises grade of 8 and a lab grade of 7, the total exams grade is (6 + 7) / 2 = 6.5 and the final grade is 6.5 * 0.5 + 8 * 0.2 + 7 * 0.3 = 6.95.
Competences' Assessment
The assessment of the teamwork competence is based on the work done during the laboratory assignments.