Machine Learning I

Credits: 6
Type: Compulsory
Requirements: This subject has no formal requirements, but it assumes some prior knowledge (see Previous capacities).
Department: CS;TSC
The goal of machine learning is to develop theories, techniques and algorithms for automatically inferring models from data (e.g., to find structure or regularities, or to make predictions). This inference is based on observed data that represent incomplete information about a process. Machine learning is a meeting point for different disciplines: multivariate and computational statistics, computer science, and mathematical optimization, among others.

Teachers

Person in charge

  • Marta Arias Vicente

Others

  • Alexis Molina Martinez de los Reyes
  • Luis Antonio Belanche Muñoz

Weekly hours

Theory: 2
Problems: 0
Laboratory: 2
Guided learning: 0
Autonomous learning: 6

Competences

Technical Competences

  • CE1 - Skillfully use the mathematical concepts and methods that underlie the problems of data science and engineering.
  • CE3 - Analyze complex phenomena through probability and statistics, and propose models of these types in specific situations. Formulate and solve mathematical optimization problems.
  • CE8 - Ability to choose and employ techniques of statistical modeling and data analysis, evaluating the quality of the models, validating and interpreting them.
  • CE9 - Ability to choose and employ a variety of machine learning techniques and build systems that use them for decision making, even autonomously.

Transversal Competences

  • CT3 - Efficient oral and written communication. Communicate orally and in writing with other people about the results of learning, thinking and decision making; participate in debates on topics within the specialty.
  • CT4 [Assessable] - Teamwork. Be able to work in an interdisciplinary team, either as a member or in a management role, contributing to the development of projects with pragmatism and a sense of responsibility, and making commitments that take the available resources into account.
  • CT7 - Third language. Know a third language, preferably English, with an adequate oral and written level and in line with the needs of graduates.

Generic Technical Competences

  • CG1 - Design computer systems that integrate data of very diverse origins and forms, build mathematical models from them, reason about these models and act accordingly, learning from experience.
  • CG2 - Choose and apply the most appropriate methods and techniques to a problem whose data pose a challenge because of their volume, velocity, variety or heterogeneity, including computational, mathematical, statistical and signal processing methods.

Objectives

  1. Formulate the problem of learning automatically from data, and become familiar with the types of tasks that can arise.
    Related competences: CE1, CE9, CG1, CG2
  2. Organize the workflow for solving a machine learning problem, analyzing the possible options and choosing the most suitable one for the problem.
    Related competences: CE1, CE9, CT4, CT7, CG1, CG2
  3. Decide on, defend and criticize a solution to a machine learning problem, arguing the strong and weak points of the approach.
    Related competences: CE9, CT3, CT4, CG2
  4. Understand and be able to apply linear techniques to solve supervised learning problems.
    Related competences: CE3, CE8, CG2
  5. Understand and be able to apply single-layer and multilayer neural network techniques to solve supervised learning problems.
    Related competences: CE8, CE9, CG2
  6. Understand and be able to apply support vector machines to solve supervised learning problems.
    Related competences: CE8, CE9, CG2
  7. Understand and be able to apply the basic techniques for solving unsupervised learning problems, with emphasis on data clustering tools.
    Related competences: CE8, CE9, CG2
  8. Understand and be able to apply the basic techniques for solving reinforcement learning problems.
    Related competences: CE8, CE9, CG2
  9. Understand and be able to apply ensemble techniques to solve supervised learning problems.
    Related competences: CE8, CE9, CG2

Contents

  1. Introduction to Machine Learning
    General information and basic concepts. Description and formulation of the problems addressed by machine learning. Supervised learning (regression and classification), unsupervised learning (clustering) and semi-supervised learning (reinforcement and transductive). Modern application examples.
  2. Unsupervised machine learning: clustering
    Definition and formulation of unsupervised machine learning. Introduction to clustering. Probabilistic algorithms: k-means and Expectation-Maximization (E-M).
  3. Supervised machine learning (I): linear regression methods
    Maximum likelihood for regression. Error functions for regression. Least squares: analytical (pseudo-inverse and SVD) and iterative (gradient descent) methods. Notion of regularization. L1 and L2 regularized regression: the ridge regression, LASSO and Elastic Net algorithms.
  4. Supervised machine learning (II): linear methods for classification
    Maximum likelihood for classification. Error functions for classification. Bayesian Generative Classifiers: LDA/QDA/RDA, Naïve Bayes and k-nearest neighbours.
  5. Hierarchical methods: decision trees
    General construction of decision trees. Split criteria: gain in entropy and Gini. Regularization in decision trees. CART trees for regression and classification.
  6. Ensemble methods
    Introduction to ensemble methods. Bagging and Random Forests. Boosting. Adaboost and variants.
  7. Kernel based learning methods
    Introduction to learning with kernel functions. Regularized kernelized linear regression. Basic kernel functions. Complexity and generalization: Vapnik-Chervonenkis dimension. Support Vector Machine.
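
As a rough illustration of the material listed under topic 3, the sketch below fits a least-squares model analytically (pseudo-inverse) and iteratively (gradient descent), and then adds L2 regularization (ridge regression). It is not course material: the synthetic data, the names (`true_w`, `lam`, `lr`), the learning rate and the number of iterations are all assumptions chosen for the example, and NumPy is assumed to be available.

```python
import numpy as np

# Synthetic regression data (an assumption for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=200)

# Analytical least squares: w = pinv(X) y.
# np.linalg.pinv computes the pseudo-inverse through the SVD of X.
w_pinv = np.linalg.pinv(X) @ y

# Iterative least squares: batch gradient descent on the mean squared error.
w_gd = np.zeros(3)
lr = 0.1  # hypothetical step size, small enough for this data
for _ in range(500):
    grad = (2.0 / len(y)) * X.T @ (X @ w_gd - y)
    w_gd -= lr * grad

# Ridge regression (L2 penalty): w = (X^T X + lam I)^{-1} X^T y.
lam = 1.0  # hypothetical regularization strength
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)

print("pseudo-inverse:  ", w_pinv)
print("gradient descent:", w_gd)
print("ridge:           ", w_ridge)
```

On this well-conditioned problem the two least-squares estimates should agree closely, while the ridge estimate is shrunk towards zero; LASSO and Elastic Net have no closed form and are not shown.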

Activities

Activity Evaluation act


Development of topic 1


Objectives: 1
Contents:
Theory: 2h
Problems: 0h
Laboratory: 0h
Guided learning: 0h
Autonomous learning: 3.3h

Development of topic 2


Objectives: 1 3 7
Contents:
Theory: 3h
Problems: 0h
Laboratory: 2h
Guided learning: 0h
Autonomous learning: 6.6h

Development of topic 3


Objectives: 1 4
Contents:
Theory: 8h
Problems: 0h
Laboratory: 2h
Guided learning: 0h
Autonomous learning: 10h

Development of topic 4


Objectives: 1 2 4
Contents:
Theory: 6h
Problems: 0h
Laboratory: 2h
Guided learning: 0h
Autonomous learning: 8.3h

Development of topic 5+6


Objectives: 1 2 5
Theory: 5h
Problems: 0h
Laboratory: 2h
Guided learning: 0h
Autonomous learning: 11.6h

Development of topic 7


Objectives: 1 2 5
Theory: 6h
Problems: 0h
Laboratory: 2h
Guided learning: 0h
Autonomous learning: 5h

Control session for the practical work


Objectives: 1 2 3 4 5 6 7 8 9
Week: 8 (Outside class hours)
Theory: 0h
Problems: 0h
Laboratory: 0h
Guided learning: 2h
Autonomous learning: 0h

Delivery of the practical work


Objectives: 1 2 3 4 5 6 7 8 9
Week: 15 (Outside class hours)
Theory: 0h
Problems: 0h
Laboratory: 0h
Guided learning: 3h
Autonomous learning: 0h


Teaching methodology

The theory classes introduce all the knowledge, techniques, concepts and results necessary to reach a well-founded and insightful level of maturity. These concepts are put into practice in the laboratory classes. In these labs, Python code is provided that allows solving certain aspects of a data analysis problem with the techniques corresponding to the current topic of study. This laboratory also serves as a guide for the corresponding part of the term project, which must be developed by the students throughout the course. Some laboratory hours may be used to solve problems (without a computer) in the theory classroom.

There is a graded practical project that addresses a real problem chosen by the student and integrates the knowledge and skills of the entire course. The generic competence of effective written communication is also evaluated through this practical work.

Evaluation methodology

The subject is evaluated through a partial exam, a final exam and a practical work in which a real problem is addressed and documented in a written report.

The final grade is calculated as:

Grade = 0.4 * Work + 0.6 * max(Final, 1/3 * Partial + 2/3 * Final)

For students who are eligible for re-evaluation and choose to take it, the re-evaluation exam grade replaces max(Final, 1/3 * Partial + 2/3 * Final).
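
The grading rule above can be checked with a short Python sketch. The function and argument names (`final_grade`, `reevaluated_grade`, `work`, `partial`, `final`, `reevaluation`) are hypothetical, and the 0 to 10 grading scale used in the example is an assumption; the syllabus does not state the scale.

```python
def final_grade(work: float, partial: float, final: float) -> float:
    """Grade = 0.4 * Work + 0.6 * max(Final, 1/3 * Partial + 2/3 * Final)."""
    # The partial exam only counts when it improves the exam component.
    exam = max(final, partial / 3 + 2 * final / 3)
    return 0.4 * work + 0.6 * exam


def reevaluated_grade(work: float, reevaluation: float) -> float:
    """The re-evaluation exam grade replaces the whole max(...) term."""
    return 0.4 * work + 0.6 * reevaluation


# Assumed 0-10 scale; the values are made up for illustration.
print(round(final_grade(work=8, partial=9, final=6), 2))  # partial helps the exam component
print(round(final_grade(work=8, partial=4, final=6), 2))  # partial is ignored
print(round(reevaluated_grade(work=8, reevaluation=5), 2))
```

With these made-up marks, a good partial exam raises the exam component from 6 to 7, whereas a poor partial exam leaves it at the final exam grade.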

Previous capacities

Intermediate knowledge of probability and statistics.
Intermediate knowledge of linear algebra, matrix calculus and real analysis.
Good level of programming in high-level languages.