Intelligent Data Analysis Processes is the fourth course in a sequence in which the rudiments of probability, statistical inference and statistical modelling have been acquired. This course completes the training in bringing data to more complex decision-making, with an in-depth study of the design of comprehensive processes that incorporate data and use various forms of artificial intelligence and advanced data models to extract strategic value from them, while connecting the results of the data models with other components of the decision systems and processes. In this subject, the techniques seen in many of the preceding subjects, such as "Probability and Statistics", "Intelligent Data Analysis", "Machine Learning", "Knowledge, Automatic Reasoning", "Knowledge-Based Systems" and "Treatment of Human Language", will be seen as parts of more complex analysis processes, ranging from data collection to the integration of data and knowledge-based models in comprehensive decision support systems or other schemes for integrating AI and data into decisions.
Teachers
Person in charge
Karina Gibert Oliveras
Sergi Ramirez Mitjans
Others
Xavier Angerri Torredeflot
Weekly hours
Theory: 2
Problems: 0
Laboratory: 2
Guided learning: 0
Autonomous learning: 6
Competences
Transversal Competences
Transversals
CT4 [Assessable] - Teamwork. Be able to work as a member of an interdisciplinary team, either as a regular member or performing management tasks, with the aim of contributing to the development of projects with pragmatism and a sense of responsibility, making commitments in view of the available resources.
CT6 [Assessable] - Autonomous Learning. Detect deficiencies in one's own knowledge and overcome them through critical reflection and the choice of the best action to extend this knowledge.
CT8 [Assessable] - Gender perspective. An awareness and understanding of sexual and gender inequalities in society in relation to the field of the degree, and the incorporation of different needs and preferences due to sex and gender when designing solutions and solving problems.
Basic
CB2 - That the students know how to apply their knowledge to their work or vocation in a professional way and possess the skills that are usually demonstrated through the elaboration and defense of arguments and problem solving within their area of study.
CB3 - That students have the ability to gather and interpret relevant data (usually within their area of study) to make judgments that include a reflection on relevant social, scientific or ethical issues.
CB4 - That the students can transmit information, ideas, problems and solutions to a specialized and non-specialized public.
CB5 - That the students have developed those learning skills necessary to undertake later studies with a high degree of autonomy.
Technical Competences
Specific
CE09 - To ideate, design and integrate intelligent data analysis systems with their application in production and service environments.
CE17 - To develop and evaluate interactive systems and presentation of complex information and its application to solving human-computer and human-robot interaction design problems.
CE18 - To acquire and develop computational learning techniques and to design and implement applications and systems that use them, including those dedicated to the automatic extraction of information and knowledge from large volumes of data.
CE20 - To select and put to use techniques of statistical modeling and data analysis, assessing the quality of the models, validating and interpreting.
Generic Technical Competences
Generic
CG1 - To ideate, draft, organize, plan and develop projects in the field of artificial intelligence.
CG2 - To use the fundamental knowledge and solid work methodologies acquired during the studies to adapt to the new technological scenarios of the future.
CG3 - To define, evaluate and select hardware and software platforms for the development and execution of computer systems, services and applications in the field of artificial intelligence.
CG4 - Reasoning, analyzing reality and designing algorithms and formulations that model it. To identify problems and construct valid algorithmic or mathematical solutions, eventually new, integrating the necessary multidisciplinary knowledge, evaluating different alternatives with a critical spirit, justifying the decisions taken, interpreting and synthesizing the results in the context of the application domain and establishing methodological generalizations based on specific applications.
CG5 - Work in multidisciplinary teams and projects related to artificial intelligence and robotics, interacting fluently with engineers and professionals from other disciplines.
CG7 - To interpret and apply current legislation, as well as specifications, regulations and standards in the field of artificial intelligence.
CG8 - Perform an ethical exercise of the profession in all its facets, applying ethical criteria in the design of systems, algorithms, experiments, use of data, in accordance with the ethical systems recommended by national and international organizations, with special emphasis on security, robustness, privacy, transparency, traceability, prevention of bias (race, gender, religion, territory, etc.) and respect for human rights.
CG9 - To face new challenges with a broad vision of the possibilities of a professional career in the field of Artificial Intelligence. Develop the activity applying quality criteria and continuous improvement, and act rigorously in professional development. Adapt to organizational or technological changes. Work in situations of lack of information and / or with time and / or resource restrictions.
Objectives
Work competently with available open data sources in combination with private data
Related competences: CG8, CT6, CT8, CB3
Identify what kind of preprocessing real data needs
Related competences: CG4, CG8
Know methods of integrated analysis of data and knowledge and be able to apply them correctly to a real problem
Related competences: CG2, CG4, CE18
Given a problem, the available data and the intended use of the model, know how to choose the best model to apply among all those seen in this subject and in the previous ones
Related competences: CG1, CG4, CG8, CT4, CT8, CB5, CE09, CE18, CE20
Combine the results of data-driven models with useful knowledge production methods for subsequent decision-making
Related competences: CT4, CB2, CE09, CE17, CE18
Identify the reporting or visualization tools most suitable for a specific problem.
Related competences: CB4
Integrate the tools and models that are known in the design of an intelligent data analysis process suitable for a specific problem.
Related competences: CG2, CG3, CG4, CG9
Master the technologies for putting an intelligent data analysis process into production.
Related competences: CG3, CG7, CG9, CE18
Be aware of AI's digital footprint and be able to apply strategies that reduce it in a process of intelligent data analysis.
Related competences: CG2, CG3, CG8, CE09, CE18
Integrate intelligent data analysis processes into intelligent decision support system architectures.
Related competences: CG1, CG3, CG4, CG5, CG8, CG9, CT4, CT6, CT8, CB2, CE09, CE20
Be able to research new methods or technologies autonomously
Related competences: CT6
Understand the ethical principles of the current AI model and assess, through debate, whether they can be implemented.
Related competences: CG4, CG8, CG9, CT8, CB4
Be able to learn about new methods or technologies independently and to self-train in the future.
Related competences: CG5, CT6
Contents
Introduction. The insertion of data into real decision processes
General outline of a data process (pre-processing, processing, post-processing, interpretation, insertion in the decision process)
Intelligent decision-making support systems
Design of relevant data sources for a decision-making process
The relevant sources of information (data, images, videos, knowledge); static/dynamic; open, sample, experimental data
Linking the data with the objectives of the study. Data representativeness, biases and compensation policies
Best practices from design.
Integrated preprocessing design
Construction of data preprocessing organizational charts for complex projects
Role of study objectives and data models to be trained in data preprocessing processes
Choice of data modeling methods for the decision support process
- Integration of the DMMCM map in the method selection process
- The DMMT model of representation of the data-based methods
- Relation between the available methods and the objectives of the study
- Relation between the available methods and the available data
- Relation between the available methods and the intended use of the model
Determination of knowledge models
Criteria for determining the knowledge representation models to integrate in the decision process (ontologies, knowledge bases, linguistic labels, etc.)
Relationship between knowledge components and data-based models in the decision support process
Knowledge representation and explainability of data models
Mixed components of data and knowledge
Other components of the decision process
Display
User interface
Modes of interaction with the user (voice, text, etc.)
Insertion of intelligent data analysis in administrative processes
Real cases related to public administration, hospital administration, large corporations, etc. will be worked on
Insertion of intelligent data analysis in industrial processes
Real industry 4.0 cases will be worked on
Insertion of intelligent data analysis in business processes
Real cases will be worked on to improve business processes through the insertion of data and intelligent analysis (retailing, negotiations, etc.)
Insertion of intelligent data analysis in strategic decision processes
Real cases in the field of defining business strategies and drafting public policies
Ethical considerations and the carbon footprint of AI
European ethical models, assessment tools
Carbon footprint of AI, strategies to reduce it
Activities
Introduction. The insertion of data into real decision processes
Design of an intelligent process to improve an administrative process
Teamwork on a real case, applying the techniques seen in the course to the design of an intelligent data analysis process to improve an administrative process.
Objectives: 10
Theory: 2h
Problems: 0h
Laboratory: 7h
Guided learning: 0h
Autonomous learning: 10h
Design of a decision support system for a strategic decision process
Teamwork on real data following the IDSS architectures seen in class.
Objectives: 10
Theory: 2h
Problems: 0h
Laboratory: 7h
Guided learning: 0h
Autonomous learning: 10h
Design of an intelligent data process to improve a business process
In this case, the team will work on improving a business process, focusing on small companies that are not yet technologically mature and that lack massive data or continuous monitoring.
Objectives: 10
Theory: 2h
Problems: 0h
Laboratory: 7h
Guided learning: 0h
Autonomous learning: 10h
Design of an intelligent system to support an industrial process
Teamwork on a highly technical industrial production process.
Objectives: 10
Theory: 2h
Problems: 0h
Laboratory: 7h
Guided learning: 0h
Autonomous learning: 10h
Ethical considerations and the carbon footprint of AI
Teaching methodology
The 12 suggested topics will be developed in 12 theory sessions (2 hours per week), each with its associated practical or laboratory session (also 2 hours per week).
The remaining 3 of the 15 sessions per semester established at the FIB will be used for theoretical evaluations (quizzes or similar) and practical evaluations (defense of the practical work in the middle and at the end of the semester). There are also a couple of non-teaching weeks set aside for mid-term and/or final exams, during which advice, support and guidance can be offered to students as reinforcement or preparation for their assessments.
In the theory classes, the flipped classroom approach will be used whenever possible.
There is a web page for the subject.
The schedule of the subject's contents and the materials to be prepared before each class will be published there.
The lecture format will be used occasionally, when the teacher needs to clarify complex concepts that have not become clear from the previously distributed materials.
The theory class will be mainly devoted to the presentation of cases and the development of interactive activities with the students such as the discussion of the cases, or the completion of specific short questionnaires.
One of the activities of the theory classes will be the presentation of real cases, with proposals for the design of an intelligent data system to support certain decisions, followed by an open discussion in the classroom about the strengths and weaknesses of the proposed design. This activity is fundamental to train the student in designing robust, safe, viable processes with little risk of failure in real environments. The methodological questions to be clarified by the teacher will derive from the outcome of the debate.
Additionally, the students will carry out, in groups, a number of short practical works on the design of intelligent data analysis processes in scenarios of varying technological maturity. In each one, the entire process will have to be covered, from the collection or identification of data or knowledge sources to the communication of results and recommendations to the user.
The analysis case can be proposed by the students themselves, based on certain characteristics set by the teaching staff. Each team will carry out practice sessions, applying each week the techniques seen in the course to tackle the challenge. The teacher will monitor all the work teams weekly in the laboratory sessions. The design proposal will include a proof of concept, as far as the means of the subject allow.
Twice during the semester the teams will present their proposals in a sharing session where all the projects will be discussed together.
Supporting material resources include:
* Slides/transparencies for each topic in PDF format or similar.
* Links to articles, forums, discussions or practical cases in relevant and reliable repositories for the subject.
* Videos or similar to show case studies or topics complementary to the lectures.
* Use of GNU software for the practical part. The use of R, RStudio and similar platforms is suggested.
* Specialized software developed by research groups within the UPC, such as GESCONDA, Klass and Freeling, can also be used.
Evaluation methodology
The following evaluation system is proposed:
- 4 team works carried out throughout the course: 80%.
  Each team work is evaluated on:
  - Technical quality of the proposed design and integration of the knowledge involved: 30%
  - Proof of concept: 20%
  - Oral knowledge control test (discussion with the teaching staff during the oral presentation of the team work): 10%
  - Quality and performance of the work team: 10%
  - Oral and written communication: 10%
  - Ethics of the work team and of the work itself: 10%
  - Gender perspective of the team and the work: 10%
- Attendance and participation in classes and laboratories: 10%
- 2 quizzes throughout the course: 10% (5% each).
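As an illustration only, the weighting above can be sketched as a small calculation. This sketch assumes that all marks are on a 0-10 scale and that the four team works count equally within their 80% (the guide does not state either point). All names are hypothetical and not part of any official grading tool.

```python
# Weights taken from the evaluation scheme above; all names are illustrative.
TEAM_WORK_CRITERIA = {
    "technical_quality": 0.30,
    "proof_of_concept": 0.20,
    "oral_test": 0.10,
    "team_performance": 0.10,
    "communication": 0.10,
    "ethics": 0.10,
    "gender_perspective": 0.10,
}
COURSE_WEIGHTS = {"team_works": 0.80, "attendance": 0.10, "quizzes": 0.10}


def team_work_mark(scores):
    """Weighted mark of one team work; `scores` maps each criterion to a 0-10 score."""
    return sum(weight * scores[name] for name, weight in TEAM_WORK_CRITERIA.items())


def final_mark(team_works, attendance, quizzes):
    """Final mark, ASSUMING equal weight per team work and per quiz."""
    team_avg = sum(team_work_mark(tw) for tw in team_works) / len(team_works)
    quiz_avg = sum(quizzes) / len(quizzes)
    return (COURSE_WEIGHTS["team_works"] * team_avg
            + COURSE_WEIGHTS["attendance"] * attendance
            + COURSE_WEIGHTS["quizzes"] * quiz_avg)
```

For example, a student whose four team works score 8 on every criterion, with 10 for attendance and quiz marks of 6 and 8, would obtain approximately 0.8 x 8 + 0.1 x 10 + 0.1 x 7 = 8.1 under these assumptions.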
Bibliography
Basic:
A survey on pre-processing techniques: Relevant issues in the context of environmental data -
GIBERT, Karina; SÁNCHEZ-MARRÉ, Miquel; IZQUIERDO, J.,
IOS Press, 2016.
Analytics, Data Science, & Artificial Intelligence: Systems for Decision Support -
Ramesh Sharda, Dursun Delen, Efraim Turban,
11th Edition, 2020. ISBN: 978-1292341552
Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems -
KLEPPMANN, Martin,
O'Reilly, 2016. ISBN: 978-1449373320
Fundamentals of Data Engineering: Designing and Building Scalable Data Systems for Modern Applications -
MURRAY, Brian,
ISBN: 979-8391793649
Complementary:
Exploratory multivariate analysis by example using R -
Husson, F.; Lê, S.; Pagès, J.,
CRC Press, 2011. ISBN: 9780367658021
Previous capacities
This subject builds on the techniques seen in many of the preceding subjects, such as "Probability and Statistics", "Intelligent Data Analysis", "Machine Learning", "Logic, Automatic Reasoning", "Knowledge-Based Systems" and "Human language processing and perception".