Optimization Techniques for Data Mining

Teachers
Weekly hours
Competences
Objectives
Contents
Activities
Teaching methodology
Evaluation methodology
Bibliography
Web links
Previous capacities

Credits

6

Types

MIRI: Specialization complementary (Data Science)
MDS: Elective

Requirements

This subject has not requirements, but it has got previous capacities

Department

EIO

The course introduces the basic concepts of optimization and the different types of optimization problems, the iterative algorithms to solve these problems, and their properties. The practice of optimization using modeling languages to describe a problem and commercial and publicly available solvers is also emphasized.

Teachers

Person in charge

Jordi Castro Pérez ( )

Others

F. Javier Heredia Cervera ( )
Jessica Rodríguez Pereira ( )

Weekly hours

Theory

4

Problems

0

Laboratory

0

Guided learning

0

Autonomous learning

7.53

Competences

Transversal Competences

Information literacy

CT4 - Capacity for managing the acquisition, the structuring, analysis and visualization of data and information in the field of specialisation, and for critically assessing the results of this management.

Third language

CT5 - Achieving a level of spoken and written proficiency in a foreign language, preferably English, that meets the needs of the profession and the labour market.

Basic

CB6 - Ability to apply the acquired knowledge and capacity for solving problems in new or unknown environments within broader (or multidisciplinary) contexts related to their area of study.
CB7 - Ability to integrate knowledge and handle the complexity of making judgments based on information which, being incomplete or limited, includes considerations on social and ethical responsibilities linked to the application of their knowledge and judgments.
CB10 - Possess and understand knowledge that provides a basis or opportunity to be original in the development and/or application of ideas, often in a research context.

Generic Technical Competences

Generic

CG2 - Identify and apply methods of data analysis, knowledge extraction and visualization for data collected in disparate formats

Technical Competences

Especifics

CE1 - Develop efficient algorithms based on the knowledge and understanding of the computational complexity theory and considering the main data structures within the scope of data science

Objectives

Discerning what is an optimization problem and its type and having a basic knowledge of optimization algorithms
Related competences: CT5, CE1, CB10,
Optimization of neural networks and support vector machines.
Related competences: CT5, CE1, CB10,
Modelling languages. Formulating optimization problems and representing them through a modeling language. Choosing and adequate solver for a given problem.
Related competences: CT4, CT5, CG2, CB6, CB7,

Unconstrained Optimization
Optimality conditions. Convexity. Descent directions.
Line search. Acceptability of step sizes.
General minimization algorithm.
Gradient method. Rate of convergence.
Newton's method. Factorizations to ensure convergence.
Quasi-Newton methods.
Optimization of neural networks.
Constrained Optimization and Support Vector Machines.
- Introduction to the modelling langauge AMPL.
- Introduction to Support Vector Macines (SVM)
- KKT Optimality conditions of constrained optimization. Optimality conditions of SVM.
- Duality in Optimization. The dual of the SVM.
Integer Programming
- Modelling problems with binary variables.
- The branch and bound algorithm for integer programming
- Gomory's cutting planes algorithm.
- Minimal spanning tree problem and algorithms of Kruskal and Prim. Application to clustering.

Activities

Activity Evaluation act

Unconstrained Optimization

Optimality conditions. Convexity. Descent directions. Line search. Acceptability of step sizes. General minimization algorithm. Gradient method. Rate of convergence. Newton's method. Factorizations to ensure convergence. Weighted least squares. Introduction to AMPL. The Neos solver site.
Objectives: 3 1 2

Theory

17.3h

Problems

0h

Laboratory

0h

Guided learning

0h

Autonomous learning

33h

Constrained Optimization and Support Vector Machines

- Introduction to Support Vector Macines (SVM) - KKT Optimality conditions of constrained optimization. Optimality conditions of SVM. - Duality in Optimization. The dual of the SVM.
Objectives: 3 1 2

Theory

17.4h

Problems

0h

Laboratory

0h

Guided learning

0h

Autonomous learning

33h

Integer Programming

- Modelling problems with binary variables. - The branch and bound algorithm for integer programming - Gomory's cutting planes algorithm. - Minimal spanning tree problem and algorithms of Kruskal and Prim
Objectives: 3 1

Theory

17.3h

Problems

0h

Laboratory

0h

Guided learning

0h

Autonomous learning

33h

Teaching methodology

The students will have available all the course material.

About two thirds of lecture time will be devoted to optimization algorithms and their properties, and the rest will be for presenting and solving exercises and problems

Lab sessions will be devoted to the solution of some data science applications (neural networks, support vector machines, clustering) using optimization methods.

Evaluation methodology

Each of the three parts of the topic (unconstrained optimization, constrained optimization, integer optimization) has a weight 1/3 on the final mark, and is evaluated as: 40% theory and 60% practical assignment. The evaluation activities are:

- Theory (40%). There will be two midterm exams (there is no final exam). The first midterm exam (2/3 of 40%) will consist on some practical exercises about the first two parts of the course, and it will be done before January. The second midterm exam (1/3 of 40%) will be about the third part of the course, and it will be done in January.

- Practical assignments (60%). There will be 3 lab assignments, one for each part of the course, all of them with the same weight on the final mark.

Additional exercises provided during the lectures may be taken into consideration to decide or to boost the final mark.

Bibliography

Basic:

Numerical optimization - Nocedal, J.; Wright, S.J, Springer, 2006. ISBN: 9780387303031
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991003178739706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
Linear and nonlinear programming - Luenberger, D.G.; Ye, Y, Springer, 2021. ISBN: 9783030854492
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991005136979306711&context=L&vid=34CSUC_UPC:VU1&lang=ca
Integer programming - Wolsey, L.A, Wiley, 2021. ISBN: 9781119606536
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991005125279506711&context=L&vid=34CSUC_UPC:VU1&lang=ca
AMPL: a modeling language for mathematical programming - Fourer, R.; Gay, D.M.; Kernighan, B.W, Thomson Brooks/Cole, 2003. ISBN: 0534388094
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991002629329706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
An introduction to support vector machines: and other kermel-based learning methods - Cristianini, N.; Shawe-Taylor, J, Cambridge University Press, 2000. ISBN: 0521780195
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991001992979706711&context=L&vid=34CSUC_UPC:VU1&lang=ca

Web links

Tool for self-learning LP an ILP algorithms. http://www-eio.upc.es/teaching/ple/pfc_ing.html
NEOS server https://neos-server.org/neos/

Previous capacities

Basic background in linear algebra, calculus, and programming languages is needed for the course.

Optimization Techniques for Data Mining

Teachers

Person in charge

Others

Weekly hours

Competences

Transversal Competences

Information literacy

Third language

Basic

Generic Technical Competences

Generic

Technical Competences

Especifics

Objectives

Contents

Activities

Unconstrained Optimization

Constrained Optimization and Support Vector Machines

Integer Programming

Teaching methodology

Evaluation methodology

Bibliography

Basic:

Web links

Previous capacities

Where we are

Contact with us

Optimization Techniques for Data Mining

You are here

Teachers

Person in charge

Others

Weekly hours

Competences

Transversal Competences

Information literacy

Third language

Basic

Generic Technical Competences

Generic

Technical Competences

Especifics

Objectives

Contents

Activities

Unconstrained Optimization

Constrained Optimization and Support Vector Machines

Integer Programming

Teaching methodology

Evaluation methodology

Bibliography

Basic:

Web links

Previous capacities

Where we are

Contact with us