Processes of Intelligent Data Analysis

Credits

Types

Compulsory

Requirements

This subject has no formal requirements, but it does assume previous capacities.

Department

EIO

Intelligent Data Analysis Processes is the fourth course in a sequence of courses in which the rudiments of probability, statistical inference and statistical modelling have been acquired. This course culminates the training in bringing data to more complex decision-making, with an in-depth study of the design of comprehensive processes that incorporate data and use various forms of artificial intelligence and advanced data models to extract strategic value from them, while connecting the results of the data models with other components of the decision systems and processes. In this subject, the techniques seen in a large part of the preceding subjects, such as "Probability and Statistics", "Intelligent Data Analysis", "Machine Learning", "Knowledge, Automatic Reasoning", "Knowledge-Based Systems" and "Treatment of Human Language", will be seen as parts of more complex analysis processes, ranging from data collection to the integration of data and knowledge-based models in comprehensive decision support systems or different schemes for integrating AI and data in decisions.

Teachers

Person in charge

Karina Gibert Oliveras (karina.gibert@upc.edu)
Xavier Angerri Torredeflot (xavier.angerri@upc.edu)

Others

Sergi Ramirez Mitjans (sergi.ramirez@upc.edu)

Weekly hours

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Competences

Transversal Competences

Transversals

CT4 [Assessable] - Teamwork. Be able to work in an interdisciplinary team, either as a regular member or performing management tasks, with the aim of contributing to the development of projects with pragmatism and a sense of responsibility, making commitments that take available resources into account.

CT6 [Assessable] - Autonomous Learning. Detect deficiencies in one's own knowledge and overcome them through critical reflection and the choice of the best action to extend this knowledge.

CT8 [Assessable] - Gender perspective. An awareness and understanding of sexual and gender inequalities in society in relation to the field of the degree, and the incorporation of different needs and preferences due to sex and gender when designing solutions and solving problems.

Basic

CB2 - That students know how to apply their knowledge to their work or vocation in a professional way and possess the skills that are usually demonstrated through the elaboration and defense of arguments and problem solving within their area of study.

CB3 - That students have the ability to gather and interpret relevant data (usually within their area of study) to make judgments that include a reflection on relevant social, scientific or ethical issues.

CB4 - That students can transmit information, ideas, problems and solutions to both specialized and non-specialized audiences.

CB5 - That students have developed the learning skills necessary to undertake later studies with a high degree of autonomy.

Technical Competences

Specific

CE09 - To ideate, design and integrate intelligent data analysis systems with their application in production and service environments.

CE17 - To develop and evaluate interactive systems and presentation of complex information and its application to solving human-computer and human-robot interaction design problems.

CE18 - To acquire and develop computational learning techniques and to design and implement applications and systems that use them, including those dedicated to the automatic extraction of information and knowledge from large volumes of data.

CE20 - To select and put to use techniques of statistical modeling and data analysis, assessing the quality of the models, validating and interpreting.

Generic Technical Competences

Generic

CG1 - To ideate, draft, organize, plan and develop projects in the field of artificial intelligence.

CG2 - To use the fundamental knowledge and solid work methodologies acquired during the studies to adapt to the new technological scenarios of the future.

CG3 - To define, evaluate and select hardware and software platforms for the development and execution of computer systems, services and applications in the field of artificial intelligence.

CG4 - To reason, analyze reality and design algorithms and formulations that model it. To identify problems and construct valid algorithmic or mathematical solutions, possibly novel ones, integrating the necessary multidisciplinary knowledge, evaluating different alternatives with a critical spirit, justifying the decisions taken, interpreting and synthesizing the results in the context of the application domain and establishing methodological generalizations based on specific applications.

CG5 - Work in multidisciplinary teams and projects related to artificial intelligence and robotics, interacting fluently with engineers and professionals from other disciplines.

CG7 - To interpret and apply current legislation, as well as specifications, regulations and standards in the field of artificial intelligence.

CG8 - Perform an ethical exercise of the profession in all its facets, applying ethical criteria in the design of systems, algorithms, experiments, use of data, in accordance with the ethical systems recommended by national and international organizations, with special emphasis on security, robustness, privacy, transparency, traceability, prevention of bias (race, gender, religion, territory, etc.) and respect for human rights.

CG9 - To face new challenges with a broad vision of the possibilities of a professional career in the field of Artificial Intelligence. Develop the activity applying quality criteria and continuous improvement, and act rigorously in professional development. Adapt to organizational or technological changes. Work in situations of lack of information and/or with time and/or resource restrictions.

Objectives

1. Make use of available open data sources in combination with private data.
   Related competences: CG8, CT6, CT8, CB3
2. Identify what kind of preprocessing real data needs.
   Related competences: CG4, CG8
3. Know methods for the integrated analysis of data and knowledge and be able to apply them correctly to a real problem.
   Related competences: CG2, CG4, CE18
4. Given a problem, the data and the intended uses of the model, know how to choose the best model to apply among all those seen in this subject and in the previous ones.
   Related competences: CG1, CG4, CG8, CT4, CT8, CB5, CE09, CE18, CE20
5. Combine the results of data-driven models with methods that produce useful knowledge for subsequent decision-making.
   Related competences: CT4, CB2, CE09, CE17, CE18
6. Identify the reporting or visualization tools most suitable for a specific problem.
   Related competences: CB4
7. Integrate the known tools and models in the design of an intelligent data analysis process suitable for a specific problem.
   Related competences: CG2, CG3, CG4, CG9
8. Master the technologies for putting an intelligent data analysis process into production.
   Related competences: CG3, CG7, CG9, CE18
9. Be aware of AI's digital footprint and be able to apply strategies that reduce it in a process of intelligent data analysis.
   Related competences: CG2, CG3, CG8, CE09, CE18
10. Integrate intelligent data analysis processes into intelligent decision support system architectures.
   Related competences: CG1, CG3, CG4, CG5, CG8, CG9, CT4, CT6, CT8, CB2, CE09, CE20
11. Be able to research new methods or technologies autonomously.
   Related competences: CT6
12. Understand the ethical principles of the current AI model and assess whether we can implement it in the debate.
   Related competences: CG4, CG8, CG9, CT8, CB4
13. Be able to research new methods or technologies independently and to self-train in the future.
   Related competences: CG5, CT6

Introduction. The insertion of the data in the real decision processes
Introduction to decision theory and to real decision-making support processes.
Intelligent decision-making support systems
Intelligent decision-making support systems. General purpose architecture for IDSS
Intelligent decision-making support systems
Intelligent decision-making support systems
Design of relevant data sources for a decision-making process
The relevant sources of information (data, images, videos, knowledge); static/dynamic; on-line/off-line; open, sample, experimental data. Primary/Secondary data.
Linking the data with the objectives of the study. Data representativeness, biases and compensation policies
Best practices from design.
Integrated preprocessing design
Construction of data preprocessing organizational charts for complex projects
Role of study objectives and data models to be trained in data preprocessing processes
Automated choice of data modeling methods for the decision support process
- Integration of the DMMCM map in the method selection process
- The DMMT model of representation of the data-based methods
- Relation between the available methods and the objectives of the study
- Relation between the available methods and the available data
- Relation between different data-driven models
- Relation between the available method and the intended use of the model
Criteria to determine knowledge models
Criteria for determining the knowledge representation models to integrate in the decision process (ontologies, knowledge bases, linguistic labels, etc.)
Relationship between knowledge components/reasoning engines and data-based models in the decision support process
Mixed data/knowledge-driven AI models in IDSS
Mixed data/knowledge-driven models. Hybrid Artificial Intelligence systems.
Impact of interface design in IDSS
Inputs: IDSS inputs, perception, knowledge representation in system inputs, access (roles, authentication, permissions), digital divide, modes of user interaction (voice, forms, chatbots, etc.). Good practices in menu design, accessibility, multilingual systems. Data connection (lakes, APIs, SQL, scraping...).
Outputs: applications of data visualization to an IDSS, explainability and argumentation, recommenders, automatic reporting, communication of metrics and KPIs, accountability (registers), system role (agency, assistance/automation).
Validation of an IDSS
Insertion of intelligent data analysis in real processes
Health and wellness systems
Business (retailing, negotiations, etc.)
Administrative processes: public administration, hospital administration, large corporations, etc.
Industry 4.0
Strategic decision-making (business strategies or public policy-making)
Sustainability (biodiversity, carbon footprint, etc.)

Activities


Introduction to the practical work and formation of work teams

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Introduction. The insertion of the data in the real decision processes

Objectives: 2
Contents:

1. Introduction. The insertion of the data in the real decision processes

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Generation of the AI Canvas from a real case

Once the topic of the practical assignment has been chosen, define the Artificial Intelligence project and outline it using the CANVAS-AI methodology.
Objectives: 4
Contents:

2. Intelligent decision-making support systems

Theory

Problems

Laboratory

1.5h

Guided learning

Autonomous learning

Intelligent decision-making support systems

Design the system architecture
Objectives: 2 3 4
Contents:

3. Intelligent decision-making support systems

Theory

Problems

Laboratory

1.5h

Guided learning

Autonomous learning

Design of relevant data sources for a decision-making process

Objectives: 1 5
Contents:

4. Design of relevant data sources for a decision-making process

Theory

Problems

Laboratory

1.5h

Guided learning

Autonomous learning

Integrated preprocessing design

Objectives: 6 7 8
Contents:

5. Integrated preprocessing design

Theory

Problems

Laboratory

1.5h

Guided learning

Autonomous learning

Choice of data modeling methods for the decision support process

Objectives: 2 3 4
Contents:

6. Automated choice of data modeling methods for the decision support process

Theory

Problems

Laboratory

1.5h

Guided learning

Autonomous learning

10h

Determination of knowledge models

Objectives: 2 4 5 6 8
Contents:

7. Criteria to determine knowledge models

Theory

Problems

Laboratory

1.5h

Guided learning

Autonomous learning

Other components of the decision-making and integration process

Objectives: 10 11 13

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Intermediate presentation of the practical works

Objectives: 10
Week: 8 (Outside class hours)

Theory

Problems

Laboratory

Guided learning

Autonomous learning

User model and input interfaces in IDSS

Objectives: 1 7 8
Contents:

9. Impact of interface design in IDSS

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Design of the IDSS outputs and model explainability

Objectives: 5 6 7
Contents:

9. Impact of interface design in IDSS

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Global validation and deployment plan of the IDSS

Objectives: 3 5 7 8
Contents:

10. Insertion of intelligent data analysis in real processes

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Ethical considerations and the carbon footprint of AI

Objectives: 9 12

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Final presentations of the practices

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Presentation of real-world IDSS applications

Theory

Problems

Laboratory

Guided learning

Autonomous learning

Teaching methodology

The 12 suggested topics will be developed in 12 theory class sessions (2 hours per week), each with its associated practical or laboratory session (also 2 hours per week).

The remaining 3 of the 15 sessions per semester established by the FIB will be used for theoretical evaluations (quizzes or similar) and practical evaluations (defense of the practical work in the middle and at the end of the semester). There are also a couple of non-teaching weeks reserved for mid-term and/or final exams, during which advice, support and guidance can be offered to students as reinforcement or preparation for their assessments.

In the theory classes, the flipped classroom scheme will be used whenever possible.
There is a web page for the subject.
The schedule of the subject's contents and the materials to be prepared before each class will be published on this platform.
The lecture format will be used occasionally, when the teacher needs to clarify complex concepts that the previously distributed materials did not make clear.
The theory class will be mainly devoted to the presentation of cases and to interactive activities with the students, such as the discussion of the cases or the completion of short questionnaires.
One of the activities of the theory classes will be the presentation of real cases, with proposals for the design of the intelligent data system to support certain decisions, followed by an open discussion in the classroom about the strengths and weaknesses of the proposed design. This activity is fundamental to train students in designing sound, safe and viable processes with little risk of failure in real environments. The methodological questions to be clarified by the teacher will emerge from the debate.

Additionally, the students will carry out, in groups, a number of short practical assignments on the design of intelligent data analysis processes in scenarios of varying technological maturity. The entire process must be covered, from the collection or identification of data or knowledge sources, where needed, through to the communication of results and recommendations to the user.
The analysis case can be proposed by the students themselves, based on certain characteristics set by the teaching staff. Each team will carry out practice sessions, applying each week the techniques seen in the course to tackle the challenge. The teacher will monitor all the work teams weekly in the laboratory sessions. The design proposal will include a proof of concept, as far as the means of the subject allow.

Twice during the semester the teams will present their proposals in a sharing session where all the projects will be discussed together.

Supporting material resources include:
* Slides for each topic in PDF format or similar.
* Links to articles, forums, discussions or practical cases in relevant and reliable repositories for the subject.
* Videos or similar to show case studies or topics complementary to the lectures.
* Use of GNU software for the practical part. The use of R, RStudio and similar platforms is suggested.
* Specialized software developed by research groups within the UPC, such as GESCONDA, Klass, Freeling, etc., may also be used.

Evaluation methodology

The following evaluation system is proposed:
- 4 team assignments carried out throughout the course: 80%.

Each team assignment is evaluated on:
  - Technical quality of the proposed design and integration of the knowledge involved (30%)
  - Proof of concept (20%)
  - Oral knowledge control test (10%), consisting of a discussion with the teaching staff during the oral presentation of the team assignment
  - Quality and performance of the work team (10%)
  - Oral and written communication (10%)
  - Ethics of the work team and of the work itself (10%)
  - Gender perspective of the team and the work (10%)

- Attendance and participation in classes and laboratories: 10%.
- 2 quizzes throughout the course: 10% (5% each).

Reevaluation: Only students who sat the final exam and failed it may take the reevaluation. The maximum grade attainable in the reevaluation is 7.
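
As an illustration of how the weights above combine, the following is a minimal sketch in Python, not an official grading tool. It assumes a 0 to 10 grading scale and that the four team works count equally within their 80%, neither of which is stated explicitly in this guide. The function and variable names are hypothetical.

```python
# Criterion weights within each team work (they sum to 1.0).
TEAM_WORK_CRITERIA = {
    "technical_quality": 0.30,
    "proof_of_concept": 0.20,
    "oral_test": 0.10,
    "team_performance": 0.10,
    "communication": 0.10,
    "ethics": 0.10,
    "gender_perspective": 0.10,
}

# Weights of each block in the final grade (they sum to 1.0).
COURSE_WEIGHTS = {"team_works": 0.80, "attendance": 0.10, "quizzes": 0.10}

REEVALUATION_CAP = 7.0  # maximum grade attainable in the reevaluation


def team_work_grade(scores):
    """Weighted grade of one team work from its per-criterion scores."""
    return sum(w * scores[name] for name, w in TEAM_WORK_CRITERIA.items())


def final_grade(team_works, attendance, quizzes):
    """Final grade from team works, attendance score and quiz scores.

    Assumption: all team works weigh the same, and so do both quizzes
    (5% each, as stated in the evaluation scheme).
    """
    team_avg = sum(team_work_grade(tw) for tw in team_works) / len(team_works)
    quiz_avg = sum(quizzes) / len(quizzes)
    return (
        COURSE_WEIGHTS["team_works"] * team_avg
        + COURSE_WEIGHTS["attendance"] * attendance
        + COURSE_WEIGHTS["quizzes"] * quiz_avg
    )


def reevaluation_grade(score):
    """Grade recorded after the reevaluation, capped at 7."""
    return min(score, REEVALUATION_CAP)


if __name__ == "__main__":
    # Hypothetical scores: every criterion of every team work graded 8.
    one_work = {name: 8.0 for name in TEAM_WORK_CRITERIA}
    grade = final_grade([one_work] * 4, attendance=10.0, quizzes=[6.0, 7.0])
    print(round(grade, 2))
```

Keeping the weights in dictionaries makes it easy to check that each level sums to 100% and to adjust the sketch if the teaching staff publishes a different breakdown.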

Bibliography

Basic

Intelligent Decision Support Systems - Sanchez-Marre, M, Springer, 2022. ISBN: 9783030877903
https://link-springer-com.recursos.biblioteca.upc.edu/book/10.1007/978-3-030-87790-3
IOS Press - Gibert, Karina; Sánchez-Marré, Miquel; Izquierdo, J, IOS Press, 2016.
https://journals.sagepub.com/doi/abs/10.3233/AIC-160710
The Elements of statistical learning : data mining, inference, and prediction - Hastie, Trevor; Tibshirani, Robert; Friedman, Jerome, Springer, cop. 2009. ISBN: 9780387848570
https://link-springer-com.recursos.biblioteca.upc.edu/book/10.1007/978-0-387-84858-7
Exploratory multivariate analysis by example using R - Husson, François; Lê, Sébastien; Pagès, Jérôme, CRC Press, Taylor & Francis Group, 2017. ISBN: 9781315301860
https://ebookcentral-proquest-com.recursos.biblioteca.upc.edu/lib/upcatalunya-ebooks/detail.action?pq-origsite=primo&docID=4856173
Applied multivariate statistical analysis - Johnson, Richard A; Wichern, Dean W, Pearson Education Limited, [2014]. ISBN: 9781292024943
https://ebookcentral-proquest-com.recursos.biblioteca.upc.edu/lib/upcatalunya-ebooks/detail.action?pq-origsite=primo&docID=5174865
Practical statistics for data scientists: 50+ essential concepts using R and Python - Bruce, Peter; Bruce, Andrew; Gedeck, Peter, O'Reilly, [2020]. ISBN: 9781492072942
https://ebookcentral-proquest-com.recursos.biblioteca.upc.edu/lib/upcatalunya-ebooks/detail.action?pq-origsite=primo&docID=6173908
Visualization analysis and design - Munzner, Tamara, CRC Press, Taylor & Francis Group, 2015. ISBN: 9781466508934
https://ebookcentral-proquest-com.recursos.biblioteca.upc.edu/lib/upcatalunya-ebooks/detail.action?pq-origsite=primo&docID=1664615
Show me the numbers : designing tables and graphs to enlighten - Few, Stephen, Analytics Press, 2012. ISBN: 0970601972
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991004067739706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
Fundamentals of Data Visualization: A Primer on Making Informative and Compelling Figures - Wilke, Claus O, O'Reilly, 2019. ISBN: 9781492031079
https://ebookcentral-proquest-com.recursos.biblioteca.upc.edu/lib/upcatalunya-ebooks/detail.action?pq-origsite=primo&docID=5734202
The Visual Display of Quantitative Information - Tufte, Edward R, Graphics Press, 2001. ISBN: 096139210X
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991001453439706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
Process mining: data science in action - Aalst, Wil van der, Springer, 2016. ISBN: 9783662498514
https://ebookcentral-proquest-com.recursos.biblioteca.upc.edu/lib/upcatalunya-ebooks/detail.action?pq-origsite=primo&docID=4505537
Process mining in action : principles, use cases and outlook - Reinkemeyer, Lars, Springer, 2020. ISBN: 9783030401726
https://ebookcentral-proquest-com.recursos.biblioteca.upc.edu/lib/upcatalunya-ebooks/detail.action?pq-origsite=primo&docID=6134217
Guide to Intelligent Data Science: How to Intelligently Make Use of Real Data - Berthold, Michael R; Borgelt, Christian; Höppner, Frank ... [et al.], Springer, 2020. ISBN: 9783030455736
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991005348570306711&context=L&vid=34CSUC_UPC:VU1&lang=ca
Analytics, Data Science, & Artificial Intelligence: Systems for Decision Support - Sharda, Ramesh; Delen, Dursun; Turban, Efraim, 11th Edition, 2020. ISBN: 978-1292341552
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991005323855806711&context=L&vid=34CSUC_UPC:VU1&lang=ca
Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems - Kleppmann, Martin, O'Reilly, 2017. ISBN: 978-1449373320
https://ebookcentral-proquest-com.recursos.biblioteca.upc.edu/lib/upcatalunya-ebooks/detail.action?pq-origsite=primo&docID=4825244
Fundamentals of Data Engineering: Designing and Building Scalable Data Systems for Modern Applications [audiobook] - Murray, Brian, 2024. ISBN: 979-8391793649
https://www.storytel.com/co/books/fundamentals-of-data-engineering-designing-and-building-scalable-data-systems-for-modern-applications-9848877

Complementary

Exploratory multivariate analysis by example using R - Husson, F.; Lê, S.; Pagès, J., CRC Press, 2017. ISBN: 9781315301860
https://ebookcentral-proquest-com.recursos.biblioteca.upc.edu/lib/upcatalunya-ebooks/detail.action?pq-origsite=primo&docID=4856173

Previous capacities

This subject builds on the techniques seen in a large part of the preceding subjects, such as "Probability and Statistics", "Intelligent Data Analysis", "Machine Learning", "Logic, Automatic Reasoning", "Knowledge-Based Systems" and "Human language processing and perception".