Machine Learning I

Credits
6
Types
Compulsory
Requirements
This subject has no formal requirements, but it does assume previous capacities
Department
CS;TSC
The goal of machine learning is the development of theories, techniques and algorithms for automatically inferring models from data (e.g., to find structure or regularities, or to make predictions). This inference is based on observed data that represent incomplete information about a process. Machine learning is a meeting point for different disciplines: multivariate and computational statistics, computer science, and mathematical optimization, among others.

Teachers

Person in charge

  • Luis Antonio Belanche Muñoz

Others

  • Jaume Baixeries Juvillà
  • Jordi Cortadella Fortuny
  • Marta Arias Vicente

Weekly hours

Theory
2
Problems
0
Laboratory
2
Guided learning
0.333
Autonomous learning
5

Competences

Technical Competences

  • CE1 - Skillfully use mathematical concepts and methods that underlie the problems of science and data engineering.
  • CE3 - Analyze complex phenomena through probability and statistics, and propose models of these types in specific situations. Formulate and solve mathematical optimization problems.
  • CE8 - Ability to choose and employ techniques of statistical modeling and data analysis, evaluating the quality of the models, validating and interpreting them.
  • CE9 - Ability to choose and employ a variety of machine learning techniques and build systems that use them for decision making, even autonomously.

Transversal Competences

  • CT3 - Efficient oral and written communication. Communicate orally and in writing with others about the results of learning, thinking and decision making; participate in debates on topics within the specialty.
  • CT7 - Third language. Know a third language, preferably English, with an adequate oral and written level and in line with the needs of graduates.

Generic Technical Competences

  • CG1 - Design computer systems that integrate data of very diverse origins and forms, build mathematical models from them, reason about these models and act accordingly, learning from experience.
  • CG2 - Choose and apply the most appropriate methods and techniques to a problem defined by data that represents a challenge for its volume, speed, variety or heterogeneity, including computer, mathematical, statistical and signal processing methods.

Objectives

  1. Formulate the problem of machine learning from data, and become familiar with the types of tasks that can arise.
    Related competences: CE1, CE9, CG1, CG2
  2. Organize the workflow for solving a machine learning problem, analyzing the possible options and choosing the most suitable one for the problem.
    Related competences: CE1, CE9, CT7, CG1, CG2
  3. Decide on, defend and critique a solution to a machine learning problem, arguing the strong and weak points of the approach.
    Related competences: CE9, CT3, CG2
  4. Understand and apply linear techniques to solve supervised learning problems.
    Related competences: CE3, CE8, CG2
  5. Understand and apply single-layer and multilayer neural network techniques to solve supervised learning problems.
    Related competences: CE8, CE9, CG2
  6. Understand and apply support vector machines to solve supervised learning problems.
    Related competences: CE8, CE9, CG2
  7. Understand and apply the basic techniques for solving unsupervised learning problems, with emphasis on data clustering tools.
    Related competences: CE8, CE9, CG2
  8. Understand and apply the basic techniques for solving reinforcement learning problems.
    Related competences: CE8, CE9, CG2
  9. Understand and apply ensemble techniques to solve supervised learning problems.
    Related competences: CE8, CE9, CG2

Contents

  1. Introduction to Machine Learning
    General information and basic concepts. Description and formulation of the problems addressed by machine learning. Supervised learning (regression and classification), unsupervised learning (clustering) and semi-supervised learning (reinforcement and transductive). Modern examples of application.
  2. Unsupervised machine learning: clustering
    Definition and approach of unsupervised machine learning. Introduction to clustering. Probabilistic algorithms: k-means and Expectation-Maximization (E-M).
  3. Supervised machine learning (I): linear regression methods
    Maximum likelihood for regression. Errors for regression. Least squares: analytical (pseudo-inverse and SVD) and iterative (gradient descent) methods. Notion of regularization. L1 and L2 regularized regression: the ridge regression, LASSO and Elastic Net algorithms.
  4. Supervised machine learning (II): linear methods for classification
    Maximum likelihood for classification. Error functions for classification. Bayesian Generative Classifiers: LDA/QDA/RDA, Naïve Bayes and k-nearest neighbours.
  5. Hierarchical methods: decision trees
    General construction of decision trees. Split criteria: gain in entropy and Gini. Regularization in decision trees. CART trees for regression and classification.
  6. Feed-forward shallow neural networks
    Feed-forward shallow neural networks (one hidden layer). Activation functions. Multilayer perceptron with one hidden layer and RBF (radial basis function network) and their training algorithms.
  7. Recurrent shallow neural networks
    Recurrent shallow neural networks: Hopfield networks and their training algorithms. Applications in associative memories and combinatorial optimization problems.
  8. Kernel based learning methods
    Introduction to learning with kernel functions. Regularized kernelized linear regression. Basic kernel functions. Complexity and generalization: Vapnik-Chervonenkis dimension. Support Vector Machine.
  9. Ensemble methods
    Introduction to ensemble methods. Bagging and Random Forests. Boosting. Adaboost and variants.
  10. Reinforcement learning
    Description of reinforcement learning. Markov processes. Bellman equations. Value functions and temporal-difference methods. Q-learning and the SARSA algorithm. Applications.
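
As an illustration of the iterative least-squares and L2 regularization ideas listed under topic 3, the following is a minimal sketch, not course material. It is written in Python for illustration (the laboratory sessions themselves use R). The function name `fit_ridge_gd`, the toy data, the learning rate and the number of steps are all assumptions chosen for this example.

```python
def fit_ridge_gd(xs, ys, lam=0.0, lr=0.05, steps=5000):
    """Fit y ~ w*x + b by gradient descent on mean squared error + lam * w**2.

    lam = 0 gives ordinary least squares; lam > 0 gives ridge (L2) regression.
    All names and default values here are illustrative assumptions.
    """
    n = len(xs)
    w, b = 0.0, 0.0
    for _ in range(steps):
        # Residuals of the current model on every sample.
        residuals = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradient of (1/n) * sum(r^2) + lam * w^2; the intercept is not penalized.
        grad_w = (2.0 / n) * sum(r * x for r, x in zip(residuals, xs)) + 2.0 * lam * w
        grad_b = (2.0 / n) * sum(residuals)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated exactly from y = 2x + 1 (hypothetical, noise-free).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w_ols, b_ols = fit_ridge_gd(xs, ys, lam=0.0)      # recovers slope 2, intercept 1
w_ridge, b_ridge = fit_ridge_gd(xs, ys, lam=1.0)  # slope shrinks towards 0
```

On this toy data the unregularized fit recovers the generating line, while the penalty shrinks the slope to 4/3. This is the basic effect of L2 regularization.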

Activities


Development of topic 1


Objectives: 1
Contents:
Theory
2h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
3.3h

Development of topic 2


Objectives: 1 3 7
Contents:
Theory
4h
Problems
0h
Laboratory
2h
Guided learning
0h
Autonomous learning
6.6h

Development of topic 3


Objectives: 1 4
Contents:
Theory
6h
Problems
0h
Laboratory
2h
Guided learning
0h
Autonomous learning
10h

Development of topic 4


Objectives: 1 2 4
Contents:
Theory
5h
Problems
0h
Laboratory
2h
Guided learning
0h
Autonomous learning
8.3h

Development of topic 6


Objectives: 1 2 5
Contents:
Theory
7h
Problems
0h
Laboratory
2h
Guided learning
0h
Autonomous learning
11.6h

Development of topic 7


Objectives: 1 2 5
Contents:
Theory
3h
Problems
0h
Laboratory
1h
Guided learning
0h
Autonomous learning
5h

Development of topic 8


Objectives: 1 6
Contents:
Theory
7h
Problems
0h
Laboratory
2h
Guided learning
0h
Autonomous learning
11.6h

Development of topics 5 and 9


Objectives: 1 9
Contents:
Theory
8h
Problems
0h
Laboratory
2h
Guided learning
0h
Autonomous learning
13.3h

Development of topic 10


Objectives: 1 8
Contents:
Theory
3h
Problems
0h
Laboratory
1h
Guided learning
0h
Autonomous learning
5h

Control session for the practical work


Objectives: 1 2 3 4 5 6 7 8 9
Week: 8 (Outside class hours)
Type: assignment
Theory
0h
Problems
0h
Laboratory
0h
Guided learning
2h
Autonomous learning
0h

Delivery of the practical work


Objectives: 1 2 3 4 5 6 7 8 9
Week: 15 (Outside class hours)
Type: assignment
Theory
0h
Problems
0h
Laboratory
0h
Guided learning
3h
Autonomous learning
0h

Teaching methodology

The theory classes introduce all the knowledge, techniques, concepts and results needed to reach a well-founded and comprehensible level. These concepts are put into practice in the laboratory classes, where R code is provided that solves certain aspects of a data analysis problem using the techniques of the current topic. The laboratory also serves as a guide for the corresponding part of the practical work, which students develop throughout the course. Some laboratory hours may be used to solve problems (without a computer) in the theory classroom.

There is one graded practical work, which addresses a real problem chosen by the student and which gathers and integrates the knowledge and competences of the whole course. The practical work is also used to assess the generic competence of effective written communication.

Evaluation methodology

The subject is evaluated through a partial exam, a final exam and a practical work in which students tackle a real problem and write the corresponding report.

The final grade is calculated as:

Grade = 0.4 * Work + 0.6 * max(Final, 1/3 * Partial + 2/3 * Final)

For students who are eligible for re-evaluation and choose to attend it, the re-evaluation exam grade replaces max(Final, 1/3 * Partial + 2/3 * Final).
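
The grading rule above can be sketched as a small function. This is only an illustration: the function name `final_grade`, the optional `reevaluation` argument and the sample marks are assumptions made for the example, not part of the official regulations.

```python
def final_grade(work, partial, final, reevaluation=None):
    """Compute the course grade from the formula stated above.

    `reevaluation` is a hypothetical optional argument: when given, it
    replaces the exam component, as described for re-evaluation.
    """
    # Exam component: the partial exam only counts if it helps the student.
    exam = max(final, partial / 3 + 2 * final / 3)
    if reevaluation is not None:
        exam = reevaluation
    return 0.4 * work + 0.6 * exam


# Hypothetical marks: the partial (6) is better than the final (5), so it is used.
grade_a = final_grade(work=8, partial=6, final=5)

# Hypothetical marks: the partial (3) would lower the exam mark, so it is ignored.
grade_b = final_grade(work=7, partial=3, final=6)

# Hypothetical re-evaluation mark of 7 replacing the exam component.
grade_c = final_grade(work=8, partial=6, final=5, reevaluation=7)
```

Because of the max, a poor partial exam never lowers the grade; it only counts when it raises the exam component.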

Previous capacities

Intermediate knowledge of probability and statistics.
Intermediate knowledge of linear algebra, matrix calculus and real analysis.
Good level of programming in high-level languages.