Optimization Techniques for Data Mining

You are here

Credits
6
Department
EIO
Types
Specialization complementary (Data Science)
Requirements
This subject has not requirements
The course introduces the basic concepts of optimization and the different types of optimization problems, the iterative algorithms to solve these problems, and their properties. The practice of optimization using modeling languages to describe a problem and commercial and publicly available solvers is also emphasized.

Teachers

Person in charge

  • Jordi Castro Pérez ( )

Others

  • Elena Fernández Areizaga ( )
  • F. Javier Heredia Cervera ( )

Weekly hours

Theory
3.2
Problems
0
Laboratory
0
Guided learning
0.77
Autonomous learning
0

Competences

Technical Competences of each Specialization

Advanced computing

  • CEE3.2 - Capability to use a wide and varied spectrum of algorithmic resources to solve high difficulty algorithmic problems.
  • CEE3.3 - Capability to understand the computational requirements of problems from non-informatics disciplines and to make significant contributions in multidisciplinary teams that use computing.

Service engineering

  • CEE5.3 - Capability to work in interdisciplinary engineering services teams and, provided the necessary domain experience, capability to work autonomously in specific service systems.

Specific

  • CEC1 - Ability to apply scientific methodologies in the study and analysis of phenomena and systems in any field of Information Technology as well as in the conception, design and implementation of innovative and original computing solutions.
  • CEC2 - Capacity for mathematical modelling, calculation and experimental design in engineering technology centres and business, particularly in research and innovation in all areas of Computer Science.

Generic Technical Competences

Generic

  • CG1 - Capability to apply the scientific method to study and analyse of phenomena and systems in any area of Computer Science, and in the conception, design and implementation of innovative and original solutions.
  • CG3 - Capacity for mathematical modeling, calculation and experimental designing in technology and companies engineering centers, particularly in research and innovation in all areas of Computer Science.

Transversal Competences

Sustainability and social compromise

  • CTR2 - Capability to know and understand the complexity of the typical economic and social phenomena of the welfare society. Capacity for being able to analyze and assess the social and environmental impact.

Reasoning

  • CTR6 - Capacity for critical, logical and mathematical reasoning. Capability to solve problems in their area of study. Capacity for abstraction: the capability to create and use models that reflect real situations. Capability to design and implement simple experiments, and analyze and interpret their results. Capacity for analysis, synthesis and evaluation.

Basic

  • CB9 - Possession of the learning skills that enable the students to continue studying in a way that will be mainly self-directed or autonomous.

Objectives

  1. Discerning what is an optimization problem and its type and having a basic knowledge of optimization algorithms
    Related competences: CEE5.3, CEE3.2, CTR2, CTR6,
  2. Formulating optimization problems and representing them through a modeling language
    Related competences: CEE5.3, CG1, CG3, CEE3.2, CEE3.3, CB9, CEC1, CEC2,
  3. Choosing an adequate solver type for a given problem
    Related competences: CEE5.3, CEE3.2, CEE3.3,
  4. Using publicly available and commercial solvers. Interpreting results from solvers and communicating in writing results from optimization
    Related competences: CEE5.3, CTR6,

Contents

  1. Unconstrained Optimization
    Optimality conditions. Convexity. Descent directions.
    Line search. Acceptability of step sizes.
    General minimization algorithm.
    Gradient method. Rate of convergence.
    Newton's method. Factorizations to ensure convergence.
    Weighted least squares.
    Introduction to AMPL. The Neos solver site.
  2. Constrained Optimization and Support Vector Machines.
    - Introduction to Support Vector Macines (SVM)
    - KKT Optimality conditions of constrained optimization. Optimality conditions of SVM.
    - Duality in Optimization. The dual of the SVM.
  3. Integer Programming
    - Modelling problems with binary variables.
    - The branch and bound algorithm for integer programming
    - Gomory's cutting planes algorithm.
    - Minimal spanning tree problem and algorithms of Kruskal and Prim.

Activities

Unconstrained Optimization

Optimality conditions. Convexity. Descent directions. Line search. Acceptability of step sizes. General minimization algorithm. Gradient method. Rate of convergence. Newton's method. Factorizations to ensure convergence. Weighted least squares. Introduction to AMPL. The Neos solver site.
Theory
17.3
Problems
0
Laboratory
0
Guided learning
0
Autonomous learning
35
Objectives: 1 2 3 4

Constrained Optimization and Support Vector Machines

- Introduction to Support Vector Macines (SVM) - KKT Optimality conditions of constrained optimization. Optimality conditions of SVM. - Duality in Optimization. The dual of the SVM.
Theory
17.4
Problems
0
Laboratory
0
Guided learning
0
Autonomous learning
35
Objectives: 1 2 3 4

Integer Programming

- Modelling problems with binary variables. - The branch and bound algorithm for integer programming - Gomory's cutting planes algorithm. - Minimal spanning tree problem and algorithms of Kruskal and Prim
Theory
17.3
Problems
0
Laboratory
0
Guided learning
0
Autonomous learning
35
Objectives: 1 2 3 4

Teaching methodology

The students will have available all the course material.

About two thirds of lecture time will be devoted to optimization algorithms and their properties, and the rest will be for presenting and solving exercises and problems

Lab sessions (using AMPL) will be devoted to the solution of some data mining applications.

Evaluation methodology

- Theory (40%). There will be a short midterm exam based on practical questions for the first two parts of the course, and some individual numerical exercises for the third (and last) part.

- Practical assigments (60%). There will be 3 lab assignments, one for each part of the course, all of them with the same weight on the final mark.

Additional exercises provided during the lectures may be taken into consideration to decide or to boost the final mark.

Bibliografy

Basic:

  • Numerical Optimization - J. Nocedal and S.J. Wright., Springer , 1999. ISBN:
  • Linear and Nonlinear Programming - D.G. Luenberger and Y. Ye, Springer , 2008. ISBN:
  • Integer Programming - L.A. Wolsey, Wiley , 1998. ISBN:
  • AMPL A modeling language for mathematical programming - . Fourer, D.M. Gay, B.W. Kernighan, Thomson Brooks/Cole , 2003. ISBN:
  • An introduction to support vector machines : and other kermel-based learning methods - Cristianini, Nello; Shawe-Taylor, John, Cambridge University Press , 2000. ISBN: 0521780195
    http://cataleg.upc.edu/record=b1165003~S1*cat

Web links

Previous capacities

Basic background in linear algebra, calculus, and programming languages is needed for the course.