Process-Oriented Data Science

Credits: 6
Type: Compulsory
Requirements: This subject has no requirements, but it does have previous capacities.
Department: CS

Every organization is structured around a set of processes that define its operation. To manage these processes, organizations use models that allow the processes to be analyzed and continuously improved. In this course we will look at how data science can significantly improve the way organizations manage and improve their business processes.

Teachers

Person in charge

  • Carlos Escolano Peinado

Others

  • Aysel Palacios Ardanuy

Weekly hours

Theory: 1.9
Problems: 0
Laboratory: 1.9
Guided learning: 0
Autonomous learning: 6.85

Competences

Transversal Competences

Information literacy

  • CT4 - Capacity for managing the acquisition, the structuring, analysis and visualization of data and information in the field of specialisation, and for critically assessing the results of this management.

Third language

  • CT5 - Achieving a level of spoken and written proficiency in a foreign language, preferably English, that meets the needs of the profession and the labour market.

Entrepreneurship and innovation

  • CT1 - Know and understand the organization of a company and the sciences that govern its activity; have the ability to understand labor standards and the relationships between planning, industrial and commercial strategies, quality and profit. Being aware of and understanding the mechanisms on which scientific research is based, as well as the mechanisms and instruments for transferring results among socio-economic agents involved in research, development and innovation processes.

Basic

  • CB6 - Ability to apply the acquired knowledge and capacity for solving problems in new or unknown environments within broader (or multidisciplinary) contexts related to their area of study.
  • CB7 - Ability to integrate knowledge and handle the complexity of making judgments based on information which, being incomplete or limited, includes considerations on social and ethical responsibilities linked to the application of their knowledge and judgments.
  • CB8 - Capability to communicate their conclusions, and the knowledge and rationale underpinning these, to both skilled and unskilled public in a clear and unambiguous way.
  • CB9 - Possession of the learning skills that enable the students to continue studying in a way that will be mainly self-directed or autonomous.
  • CB10 - Possess and understand knowledge that provides a basis or opportunity to be original in the development and/or application of ideas, often in a research context.

Generic Technical Competences

Generic

  • CG2 - Identify and apply methods of data analysis, knowledge extraction and visualization for data collected in disparate formats
  • CG3 - Define, design and implement complex systems that cover all phases in data science projects

Technical Competences

Specific

  • CE5 - Model, design, and implement complex data systems, including data visualization
  • CE6 - Design the Data Science process and apply scientific methodologies to obtain conclusions about populations and make decisions accordingly, from both structured and unstructured data and potentially stored in heterogeneous formats.
  • CE7 - Identify the limitations imposed by data quality in a data science problem and apply techniques to smooth their impact
  • CE9 - Apply appropriate methods for the analysis of non-traditional data formats, such as processes and graphs, within the scope of data science
  • CE13 - Identify the main threats related to ethics and data privacy in a data science project (both in terms of data management and analysis) and develop and implement appropriate measures to mitigate these threats

Objectives

  1. To be aware of the theoretical and practical problems that constitute process-oriented data science, and to understand the main algorithms to tackle them, both at the conceptual level and at the level of their application through some of the current tools and libraries.
    Related competences: CB10, CB7, CB9, CT4, CT5, CE13, CE5, CE6, CE7, CE9, CG2, CG3
  2. To acquire and demonstrate the ability to apply the knowledge obtained during the course, and to relate it to the organizational and team perspectives of a process-oriented data science project running in a real organization.
    Related competences: CB6, CB8, CT1

Contents

  1. Process models and event data
    Describing the concepts of process models and event data
  2. Automatic process model discovery
    Overview of the different techniques to mine process models from event data
  3. Conformance checking of process models and event data
    The main techniques to relate observed and modeled behavior will be introduced
  4. Evidence-based process enhancement grounded in event data
    Techniques to improve and extend process models from event data
  5. Assorted advanced techniques and applications
    Advanced techniques to solve particular applications will be described, including online and multi-perspective techniques.
  6. Methodology for process-oriented data science projects
    A description of the life cycle of a process-oriented data science (PODS) project will be provided.

Activities

Process models and event data

This activity introduces process models for specifying the processes of an organization, together with event data, that is, data describing the events produced when those processes are executed.
Objectives: 1
Contents:
Theory: 5h
Problems: 0h
Laboratory: 4h
Guided learning: 0h
Autonomous learning: 15.9h
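
As an illustration of the event data this activity refers to, the following minimal Python sketch groups a flat list of events into one trace per case. The log contents, the field layout (case id, activity, timestamp) and the function name `build_traces` are assumptions made for this example; they are not taken from the course material.

```python
from collections import defaultdict

# Hypothetical event log: (case id, activity, timestamp) tuples.
EVENTS = [
    ("order-1", "register", "2024-01-01T09:00"),
    ("order-2", "register", "2024-01-01T09:05"),
    ("order-1", "check stock", "2024-01-01T09:10"),
    ("order-1", "ship", "2024-01-01T11:00"),
    ("order-2", "reject", "2024-01-01T09:30"),
]


def build_traces(events):
    """Group events by case and order each case's activities by timestamp."""
    by_case = defaultdict(list)
    for case_id, activity, timestamp in events:
        by_case[case_id].append((timestamp, activity))
    # ISO 8601 timestamps in the same format sort chronologically as strings.
    return {case: [a for _, a in sorted(rows)] for case, rows in by_case.items()}


traces = build_traces(EVENTS)
```

Each resulting trace is the ordered sequence of activities observed for one case, which is the usual starting point for the analyses covered in the later activities.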

Automatic process model discovery

In this activity, various techniques will be introduced that extract process models in various formalisms from event data.
Objectives: 1
Contents:
Theory: 6h
Problems: 0h
Laboratory: 6h
Guided learning: 0h
Autonomous learning: 16h
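
One of the simplest models that can be extracted from event data is a directly-follows graph, which records how often one activity is immediately followed by another. The sketch below is only an illustration of that idea, not necessarily one of the techniques taught in the course; the traces and the function name `directly_follows` are hypothetical.

```python
from collections import Counter

# Hypothetical traces: each is the ordered list of activities of one case.
TRACES = [
    ["register", "check stock", "ship"],
    ["register", "check stock", "ship"],
    ["register", "reject"],
]


def directly_follows(traces):
    """Count how often activity a is immediately followed by activity b."""
    counts = Counter()
    for trace in traces:
        # Pair each activity with its successor within the same trace.
        counts.update(zip(trace, trace[1:]))
    return counts


dfg = directly_follows(TRACES)
```

The keys of `dfg` are the edges of the graph and the values are their frequencies; discovery algorithms that produce richer formalisms typically start from this kind of relation.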

Conformance checking of process models and event data

This activity introduces algorithms that relate modeled process behavior to observed process behavior.
Objectives: 1
Contents:
Theory: 6h
Problems: 0h
Laboratory: 6h
Guided learning: 0h
Autonomous learning: 16h
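
To give a rough idea of what relating observed and modeled behavior can look like, the sketch below checks each observed trace against a very simple model and reports the fraction of traces that fit. The model representation (allowed start activities, end activities and directly-follows pairs), the sample traces and the names `fits` and `fitness` are assumptions for this example; conformance checking techniques used in practice are considerably more refined.

```python
# Hypothetical model: allowed directly-follows pairs plus start and end activities.
ALLOWED = {
    ("register", "check stock"),
    ("check stock", "ship"),
    ("register", "reject"),
}
START = {"register"}
END = {"ship", "reject"}


def fits(trace):
    """Return True if every step of the trace is allowed by the model."""
    if not trace or trace[0] not in START or trace[-1] not in END:
        return False
    return all(pair in ALLOWED for pair in zip(trace, trace[1:]))


def fitness(traces):
    """Fraction of traces that fit the model (a crude trace-level measure)."""
    return sum(fits(t) for t in traces) / len(traces)


OBSERVED = [
    ["register", "check stock", "ship"],
    ["register", "ship"],        # skips the stock check: deviation
    ["register", "reject"],
    ["check stock", "ship"],     # does not start with "register": deviation
]
score = fitness(OBSERVED)
```

Here two of the four observed traces deviate from the model, so `score` is 0.5.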

Evidence-based process enhancement grounded in event data

This activity presents techniques that use event data to project and enhance process models and event logs.
Objectives: 1
Contents:
Theory: 4h
Problems: 0h
Laboratory: 4h
Guided learning: 0h
Autonomous learning: 16h

Assorted advanced techniques and applications

Assorted techniques for solving real-life process-oriented data science problems.
Objectives: 1, 2
Contents:
Theory: 4h
Problems: 0h
Laboratory: 4h
Guided learning: 0h
Autonomous learning: 16h

Methodology for process-oriented data science projects

Overview of how to manage a process-oriented data science (PODS) project.
Objectives: 2
Contents:
Theory: 2h
Problems: 0h
Laboratory: 3h
Guided learning: 0h
Autonomous learning: 16h

Teaching methodology

The course combines theory sessions, which may include problem-solving sessions with or without a programming component; practical sessions with open-source or commercial process-oriented data science software; and the development of a case study.

Evaluation methodology

The evaluation of the subject consists of two elements: a final exam (60%) and practical assessments (40%).

The final exam will contain questions and problems about the theoretical contents that are explained in the theory classes.

The practical assessments will be guided assessments that will be conducted during the lab classes on various process mining tools and platforms. Assessments can be done in pairs or individually.

Previous capacities

Thorough understanding of computing in general; good command of several programming languages; basic ability to formalize informatics engineering problems mathematically.