Algorithmics for Data Mining

Credits
6
Types
Specialization complementary (Data Science)
Requirements
This subject has no requirements

Department
CS
In the discipline of Data Mining, many technologies allow organizations to improve their processes by analyzing existing data and searching for patterns in them. However, despite the available success stories, it is essential to be aware of the limitations of these technologies. We will study the most common algorithms and their main parameters, so that students become capable of identifying suitable tools for a given application. We will explain the theory and practical usage of clusterers, associators, and classifiers so that students acquire criteria for choosing values for the many free parameters of each of these algorithms.

Teachers

Person in charge

  • Jose Luis Balcázar Navarro

Others

  • Jorge Castro Rabal
  • Jose Carmona Vargas
  • Marta Arias Vicente
  • Ricard Gavaldà Mestre

Weekly hours

Theory
1
Problems
0
Laboratory
2
Guided learning
0
Autonomous learning
0

Competences

Technical Competences of each Specialization

Advanced computing

  • CEE3.1 - Capability to identify computational barriers and to analyze the complexity of computational problems in different areas of science and technology as well as to represent high complexity problems in mathematical structures which can be treated effectively with algorithmic schemes.
  • CEE3.2 - Capability to use a wide and varied spectrum of algorithmic resources to solve high difficulty algorithmic problems.
  • CEE3.3 - Capability to understand the computational requirements of problems from non-informatics disciplines and to make significant contributions in multidisciplinary teams that use computing.

Generic Technical Competences

Generic

  • CG1 - Capability to apply the scientific method to the study and analysis of phenomena and systems in any area of Computer Science, and to the conception, design and implementation of innovative and original solutions.
  • CG3 - Capacity for mathematical modeling, calculation and experimental design in technology centers and engineering companies, particularly in research and innovation in all areas of Computer Science.
  • CG5 - Capability to apply innovative solutions and to advance knowledge in order to exploit new computing paradigms, particularly in distributed environments.

Transversal Competences

Teamwork

  • CTR3 - Capacity to work as a team member, either as a regular member or in a leadership role, in order to help develop projects pragmatically and with a sense of responsibility; capability to take into account the available resources.

Sound use of information resources

  • CTR4 - Capability to manage the acquisition, structuring, analysis and visualization of data and information in the area of informatics engineering, and critically assess the results of this effort.

Appropriate attitude towards work

  • CTR5 - Capability to be motivated by professional achievement and to face new challenges, and to have a broad vision of the possibilities of a career in the field of informatics engineering. Capability to be motivated by quality and continuous improvement, and to act rigorously in professional development. Capability to adapt to technological or organizational changes. Capacity to work with incomplete information and/or under time and/or resource constraints.

Reasoning

  • CTR6 - Capacity for critical, logical and mathematical reasoning. Capability to solve problems in their area of study. Capacity for abstraction: the capability to create and use models that reflect real situations. Capability to design and implement simple experiments, and analyze and interpret their results. Capacity for analysis, synthesis and evaluation.

Basic

  • CB6 - Ability to apply the acquired knowledge and capacity for solving problems in new or unknown environments within broader (or multidisciplinary) contexts related to their area of study.
  • CB8 - Capability to communicate their conclusions, and the knowledge and rationale underpinning these, to both specialist and non-specialist audiences in a clear and unambiguous way.
  • CB9 - Possession of the learning skills that enable the students to continue studying in a way that will be mainly self-directed or autonomous.

Objectives

  1. To be aware of the theoretical and practical problems that constitute Data Mining, and to understand the main models and algorithms used to tackle them, both at the conceptual level and at the level of their application through commercial tools, preferably open-source ones.
    Related competences: CB6, CTR4, CTR5, CTR6, CEE3.1, CEE3.2, CEE3.3, CG1, CG3, CG5
  2. To acquire and demonstrate the ability to apply the knowledge obtained by carrying out, autonomously and as a team, a practical data mining case, including a public presentation of the work developed.
    Related competences: CB6, CB8, CB9, CTR3, CTR4, CTR5, CTR6, CEE3.2, CG3

Contents

  1. Main models and algorithms for Data Mining

Activities

Theoretical and conceptual study of the main data mining algorithms.

Theory
18
Problems
6
Laboratory
0
Guided learning
0
Autonomous learning
6
Objectives: 1
Contents:

Deployment of a practical case study

Theory
0
Problems
0
Laboratory
36
Guided learning
0
Autonomous learning
18
Objectives: 1 2
Contents:

Teaching methodology

Theory sessions, problem solving sessions with or without a programming component, practical sessions with commercial data mining software, development of a case study.

Evaluation methodology

Optional mid-term exam, final exam, presentation of the development of the case study.

The final grade will be the sum of two grades:

Basic understanding mark, between 0 and 7.

Commendable understanding mark, between 0 and 3.

The commendable understanding mark is obtained through specific questions in the final exam, clearly marked as such, and worth 3 points.

The basic understanding mark is the sum of three marks: the mid-term exam, worth 4 points; the presentation of the case study, worth 4 points; and the rest of the final exam, worth 4 points. If this sum is higher than 7, it is capped at 7.
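The grading rule above can be sketched as a small calculation. This is only an illustrative sketch, not an official grading tool: the function name final_grade and its parameter names are hypothetical, and treating a skipped optional mid-term as 0 points is an assumption that the rules above do not state explicitly.

```python
def final_grade(case_study, final_basic, commendable, midterm=0.0):
    """Combine the partial marks into a final grade between 0 and 10.

    Assumption (not stated in the syllabus): a student who skips the
    optional mid-term exam simply scores 0 on that component.
    """
    # Maximum value of each component, as stated in the evaluation rules.
    limits = {
        "midterm": (midterm, 4.0),
        "case_study": (case_study, 4.0),
        "final_basic": (final_basic, 4.0),
        "commendable": (commendable, 3.0),
    }
    for name, (value, maximum) in limits.items():
        if not 0.0 <= value <= maximum:
            raise ValueError(f"{name} must be between 0 and {maximum}")

    # Basic understanding mark: sum of three components, capped at 7.
    basic = min(midterm + case_study + final_basic, 7.0)

    # Final grade: basic understanding mark plus commendable understanding mark.
    return basic + commendable


# Example: the three basic components add up to 9, which is capped at 7.
print(final_grade(midterm=3.0, case_study=4.0, final_basic=2.0, commendable=1.5))
```

Because the three basic components can add up to 12 but are capped at 7, a student can reach the full basic understanding mark without the mid-term; the remaining 3 points can only come from the specially marked questions in the final exam.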

Bibliography

Basic:

  • The Top Ten Algorithms in Data Mining - Xindong Wu (Editor), Vipin Kumar (Editor), Chapman & Hall/CRC Data Mining and Knowledge Discovery Series. ISBN: 978-1420089646

Prerequisite skills

Thorough understanding of computing in general; good command of several programming languages; basic ability to formalize informatics engineering problems mathematically.