This course addresses Intelligent Data Analysis (IDA) from the viewpoint of Data Mining.
Most areas of science, engineering and business are becoming increasingly data dependent; bioinformatics, medicine and electronic commerce are clear examples.
Data analysis techniques are needed to deal with these data and extract usable knowledge from them. Among these, IDA techniques are some of the most promising approaches, and they are at the core of this course.
Teaching staff
Coordinator
Alfredo Vellido Alcacena
Others
Carlos Cano Domingo
Caroline König
Weekly hours
Theory: 2.9
Problems: 0
Laboratory: 0
Guided learning: 0.1
Autonomous learning: 4.6
Competences
General Technical Competences
Generic
CG3 - Capacity for modeling, calculation, simulation, development and implementation in technology centers and company engineering centers, particularly in research, development and innovation tasks in all areas related to Artificial Intelligence.
Technical Competences of each specialization
Academic
CEA4 - Capability to understand the basic operating principles of the main Computational Intelligence techniques, and to know how to use them in the environment of an intelligent system or service.
CEA7 - Capability to understand the problems, and the solutions to those problems, that arise in the professional practice of applying Artificial Intelligence in business and industrial environments.
CEA11 - Capability to understand advanced Computational Intelligence techniques, and to know how to design, implement and apply them in the development of intelligent applications, services or systems.
Professional
CEP1 - Capability to meet the information analysis needs of different organizations, identifying the sources of uncertainty and variability.
CEP5 - Capability to design new software tools and new Artificial Intelligence techniques in professional practice.
Transversal Competences
Sound use of information resources
CT4 - Capacity to manage the acquisition, structuring, analysis and visualization of data and information in the field of specialization, and to critically assess the results of this management.
Reasoning
CT6 - Capability to evaluate and analyze, in a reasoned and critical way, situations, projects, proposals, reports and scientific-technical studies. Capability to argue the reasons that explain or justify such situations, proposals, etc.
Analysis and synthesis
CT7 - Capability to analyze and solve complex technical problems.
Objectives
Presenting DM as a process that should involve a methodology if it is to be applied at its best.
Related competences: CEA7, CEP5, CT4, CT6
Introducing the students to the new concept of DM for processes, called Process Mining.
Related competences: CEA7, CG3, CEP1, CEP5, CT4, CT6
Delving into some detail in one of the stages of DM: data exploration.
Related competences: CEA4, CG3, CEP1, CT4
Dealing in detail with the problem of data visualization for exploration as a key issue in DM.
Related competences: CEA11, CG3, CEP1, CEP5, CT4, CT6
Introducing the students to the basics of probability theory as applied in Intelligent Data Analysis (IDA).
Related competences: CEP1, CT4, CT6, CT7
Introducing the students to the probabilistic variant of IDA in the form of Statistical Machine Learning, both for supervised and unsupervised learning models.
Related competences: CEA11, CG3, CT4, CT6, CT7
Dealing in detail with different unsupervised models for data visualization, including case studies.
Related competences: CEA11, CG3, CEP1, CEP5, CT4, CT6, CT7
Approaching the multi-faceted concept of data mining (DM) from different perspectives.
Related competences: CEA7, CG3, CEP5, CT4, CT6, CT7
Contents
Introduction to the concept of data mining (DM).
DM is a multi-faceted concept that requires discussion and clarification. We will do this at the beginning of the course.
DM as a methodology.
We argue that DM should not be focused on the concept of data analysis/modeling, but, instead, should be treated as a methodology with diverse inter-related stages.
DM for processes: Process Mining.
A recent development in DM methodologies is one specifically suited to processes. It is called Process Mining and will be described and discussed in this course.
Data exploration in DM.
One of the main stages of well-structured DM methodologies is data exploration. It will be discussed as a preamble to data visualization.
Basics of probability theory in Intelligent Data Analysis (IDA).
For much of the last half-century, multivariate statistics and artificial intelligence (mostly in the field of machine learning) developed in parallel without fully meeting. Statistical machine learning has bridged that gap over the last two decades. We introduce it by first providing some basic principles of probability theory (Bayesian inference).
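As a minimal illustration of the kind of Bayesian inference referred to here, the sketch below applies Bayes' rule to a hypothetical diagnostic test. The function name, parameter names and all numbers are illustrative assumptions, not course material.

```python
def posterior(prior: float, likelihood: float, false_positive_rate: float) -> float:
    """Return P(H | E) via Bayes' rule for a binary hypothesis H and evidence E.

    prior               -- P(H)
    likelihood          -- P(E | H)
    false_positive_rate -- P(E | not H)
    """
    # Total probability of the evidence: P(E) = P(E|H) P(H) + P(E|not H) P(not H)
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 1% prevalence, 95% sensitivity, 5% false-positive rate.
p = posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(f"P(condition | positive test) = {p:.3f}")
```

With these assumed numbers the posterior stays well below the test's sensitivity, because the low prior dominates; this is the sort of reasoning that a Bayesian treatment makes explicit.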
Data visualization for exploration.
One of the aspects of the problem of data exploration is data visualization. It has a research 'life' of its own as it involves not only computer-based mathematical models, but also natural perception and processing.
Statistical Machine Learning for IDA: supervised models.
Once the basics of Bayesian inference are set, we will delve into the field of Statistical Machine Learning for IDA, starting with supervised learning models, with an emphasis on feed-forward artificial neural networks.
Statistical Machine Learning for IDA: unsupervised models.
Once the basics of Bayesian inference and of Statistical Machine Learning for IDA in supervised models are set, we will continue with unsupervised models, focusing on self-organizing maps and related models.
Unsupervised models for data visualization, with case studies.
In the final item of the course contents, we will bring statistical machine learning and data visualization together by discussing some probabilistic unsupervised learning models for data visualization, illustrated with case studies.
Activities
Activity / Assessment
Essay on IDA for DM
Students will have to write a research essay on the topic of IDA for DM, with different options:
1. State of the art on a specific IDA-DM topic
2. Evaluation of an IDA-DM software tool with original experiments
3. Pure research essay, with original experimental content
Objectives: 8, 1, 2, 3, 4, 5, 6, 7
Week: 15
Theory: 0h
Problems: 0h
Laboratory: 0h
Guided learning: 3h
Autonomous learning: 0h
Introduction to Data Mining and its Methodologies
Introduction to Data Mining as a general concept and to its methodologies for practical implementation
Theory: in-person seminars covering the theory of this topic.
Guided learning: students' guided learning related to the topic.
Autonomous learning: students' autonomous learning related to the topic.
The meeting of statistics and machine learning: Statistical Machine Learning (SML) methods, from the point of view of both supervised and unsupervised learning
Theory: in-person seminars covering the theory of this topic.
Guided learning: students' guided learning related to the topic.
Autonomous learning: students' autonomous learning related to the topic.
Theory: 12h
Problems: 0h
Laboratory: 0h
Guided learning: 0h
Autonomous learning: 16h
SML in data visualization, with case studies
We merge the topics of SML and data visualization, illustrating their combined use with some real case studies.
Theory: in-person seminars covering the theory of this topic.
Guided learning: students' guided learning related to the topic.
Autonomous learning: students' autonomous learning related to the topic.
This course will build on different teaching methodology (TM) aspects, including:
TM1: Expositive seminars
TM2: Expositive-participative seminars
TM3: Orientation for individual assignments (essays)
TM4: Individual tutoring
Evaluation method
The course will be evaluated through a final essay that will take one of these three forms:
1. State of the art on a specific IDA-DM topic
2. Evaluation of an IDA-DM software tool with original experiments
3. Pure research essay, with original experimental content
Students are expected to have at least some basic background in artificial intelligence and, more specifically, in Machine Learning and Computational Intelligence.
Some basic knowledge of probability theory and statistics would be beneficial.
Other than this, the course is open to students and researchers of all backgrounds.