Process-Oriented Data Science

Credits: 6
Type: Compulsory
Requirements: This subject has no requirements, but it does assume previous capacities.
Department: CS
Every organization is structured on a set of processes that define its operation. In order to be able to manage their processes, organizations use models that allow them to be analyzed and continuously improved. In this course we will look at how data science can significantly improve the way organizations manage and improve their business processes.

Teachers

Person in charge

Others

Weekly hours

Theory: 1.9
Problems: 0
Laboratory: 1.9
Guided learning: 0
Autonomous learning: 6.85

Competences

Information literacy

  • CT4 - Capacity for managing the acquisition, the structuring, analysis and visualization of data and information in the field of specialisation, and for critically assessing the results of this management.
Third language

  • CT5 - Achieving a level of spoken and written proficiency in a foreign language, preferably English, that meets the needs of the profession and the labour market.
Entrepreneurship and innovation

  • CT1 - Know and understand the organization of a company and the sciences that govern its activity; have the ability to understand labor standards and the relationships between planning, industrial and commercial strategies, quality and profit. Being aware of and understanding the mechanisms on which scientific research is based, as well as the mechanisms and instruments for transferring results among socio-economic agents involved in research, development and innovation processes.
Basic

  • CB6 - Ability to apply the acquired knowledge and capacity for solving problems in new or unknown environments within broader (or multidisciplinary) contexts related to their area of study.
  • CB7 - Ability to integrate knowledge and handle the complexity of making judgments based on information which, being incomplete or limited, includes considerations on social and ethical responsibilities linked to the application of their knowledge and judgments.
  • CB8 - Capability to communicate their conclusions, and the knowledge and rationale underpinning these, to both skilled and unskilled public in a clear and unambiguous way.
  • CB9 - Possession of the learning skills that enable the students to continue studying in a way that will be mainly self-directed or autonomous.
  • CB10 - Possess and understand knowledge that provides a basis or opportunity to be original in the development and/or application of ideas, often in a research context.
Generic

  • CG2 - Identify and apply methods of data analysis, knowledge extraction and visualization for data collected in disparate formats
  • CG3 - Define, design and implement complex systems that cover all phases in data science projects
Specific

  • CE5 - Model, design, and implement complex data systems, including data visualization
  • CE6 - Design the Data Science process and apply scientific methodologies to obtain conclusions about populations and make decisions accordingly, from both structured and unstructured data and potentially stored in heterogeneous formats.
  • CE7 - Identify the limitations imposed by data quality in a data science problem and apply techniques to smooth their impact
  • CE9 - Apply appropriate methods for the analysis of non-traditional data formats, such as processes and graphs, within the scope of data science
  • CE13 - Identify the main threats related to ethics and data privacy in a data science project (both in terms of data management and analysis) and develop and implement appropriate measures to mitigate these threats
    Objectives

    1. To be aware of the theoretical and practical problems that constitute process-oriented data science, and to understand the main algorithms that tackle them, both at the conceptual level and at the level of their application through some of the current tools and libraries.
      Related competences: CB10, CB7, CB9, CT4, CT5, CE13, CE5, CE6, CE7, CE9, CG2, CG3
    2. To acquire and demonstrate the ability to apply the knowledge obtained during the course, and to relate it to the organizational and team perspectives of a process-oriented data science project running in a real organization.
      Related competences: CB6, CB8, CT1

    Contents

    1. Process models and event data
      Describing the concepts of process models and event data
    2. Automatic process model discovery
      Overview on the different techniques to mine process models from event data
    3. Conformance checking of process models and event data
      The main techniques to relate observed and modeled behavior will be introduced
    4. Evidence-based process enhancement grounded in event data
      Techniques to improve and extend process models from event data
    5. Assorted advanced techniques and applications
      Advanced techniques to solve particular applications will be described, including online and multi-perspective techniques.
    6. Methodology for process oriented data science projects
      A description of the life cycle of a process-oriented data science (PODS) project will be provided.

    Activities



    Process models and event data

    This activity introduces process models, which specify the processes in an organization, and event data, which record the events that originate in the execution of those processes.
    Objectives: 1
    Contents:
    Theory: 5h
    Problems: 0h
    Laboratory: 4h
    Guided learning: 0h
    Autonomous learning: 15.9h
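As a rough illustration of what event data can look like, the sketch below groups a flat list of events into one trace per case. The case identifiers, activity names and timestamps are invented for the example, and `build_traces` is a hypothetical helper rather than part of any course tool.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event data: (case id, activity, timestamp). All values are invented.
EVENTS = [
    ("order-1", "register", "2024-01-01T09:00"),
    ("order-2", "register", "2024-01-01T09:05"),
    ("order-1", "check stock", "2024-01-01T09:10"),
    ("order-2", "check stock", "2024-01-01T09:20"),
    ("order-1", "ship", "2024-01-01T10:00"),
    ("order-2", "reject", "2024-01-01T10:30"),
]


def build_traces(events):
    """Group events by case and order each case's activities by timestamp."""
    by_case = defaultdict(list)
    for case_id, activity, timestamp in events:
        by_case[case_id].append((datetime.fromisoformat(timestamp), activity))
    # Sorting the (timestamp, activity) pairs yields the trace of each case.
    return {case: [act for _, act in sorted(rows)] for case, rows in by_case.items()}


traces = build_traces(EVENTS)
for case, trace in traces.items():
    print(case, "->", trace)
```

Each trace is the ordered sequence of activities observed for one process instance, which is the usual starting point for the techniques in the following activities.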

    Automatic process model discovery

    In this activity, various techniques will be introduced that extract process models in various formalisms from event data.
    Objectives: 1
    Contents:
    Theory: 6h
    Problems: 0h
    Laboratory: 6h
    Guided learning: 0h
    Autonomous learning: 16h
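One of the simplest models that can be extracted from event data is a directly-follows graph, which counts how often one activity is immediately followed by another. The sketch below is a minimal illustration on an invented log; it is not necessarily one of the discovery techniques taught in the course, and `directly_follows` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical event log: each trace is the ordered list of activities of one case.
LOG = [
    ["register", "check stock", "ship"],
    ["register", "check stock", "reject"],
    ["register", "check stock", "ship"],
]


def directly_follows(traces):
    """Count how often activity a is immediately followed by activity b."""
    counts = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


dfg = directly_follows(LOG)
for (a, b), n in sorted(dfg.items()):
    print(f"{a} -> {b}: {n}")
```

The resulting edge counts can be drawn as a graph whose nodes are activities, giving a first, purely frequency-based picture of the process.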

    Conformance checking of process models and event data

    This activity introduces algorithms that relate modeled and observed process behavior.
    Objectives: 1
    Contents:
    Theory: 6h
    Problems: 0h
    Laboratory: 6h
    Guided learning: 0h
    Autonomous learning: 16h
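To give a feel for what relating modeled and observed behavior means, the sketch below checks observed traces against a deliberately simple model: a start activity, a set of end activities and a set of allowed directly-follows pairs. This is an illustrative simplification, not token replay or alignments, and the model, activity names and the `deviations` helper are all assumptions made for the example.

```python
# Hypothetical model: allowed directly-follows pairs plus start and end activities.
ALLOWED = {
    ("register", "check stock"),
    ("check stock", "ship"),
    ("check stock", "reject"),
}
START = "register"
END = {"ship", "reject"}


def deviations(trace):
    """Return a list of human-readable deviations of a trace from the model."""
    if not trace:
        return ["empty trace"]
    problems = []
    if trace[0] != START:
        problems.append(f"starts with {trace[0]!r} instead of {START!r}")
    for a, b in zip(trace, trace[1:]):
        if (a, b) not in ALLOWED:
            problems.append(f"{a!r} -> {b!r} is not allowed by the model")
    if trace[-1] not in END:
        problems.append(f"ends with {trace[-1]!r}")
    return problems


for observed in (["register", "check stock", "ship"], ["register", "ship"]):
    print(observed, "->", deviations(observed) or "conforms")
```

A trace with an empty deviation list conforms to this toy model; any other trace comes with the specific steps where observed and modeled behavior disagree.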

    Evidence-based process enhancement grounded in event data

    This activity presents techniques that use event data to project information onto process models and to enhance both process models and event logs.
    Objectives: 1
    Contents:
    Theory: 4h
    Problems: 0h
    Laboratory: 4h
    Guided learning: 0h
    Autonomous learning: 16h
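One common form of enhancement is to annotate a model with performance information taken from the log. As a minimal sketch, assuming invented timestamps and a hypothetical helper `mean_minutes_between`, the code below computes the mean elapsed time for each pair of consecutive activities.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical timed log: case id -> ordered (activity, timestamp) pairs. Values are invented.
TIMED_LOG = {
    "order-1": [
        ("register", "2024-01-01T09:00"),
        ("check stock", "2024-01-01T09:10"),
        ("ship", "2024-01-01T10:00"),
    ],
    "order-2": [
        ("register", "2024-01-01T09:05"),
        ("check stock", "2024-01-01T09:35"),
        ("ship", "2024-01-01T10:05"),
    ],
}


def mean_minutes_between(timed_log):
    """Mean elapsed minutes for each pair of consecutive activities."""
    durations = defaultdict(list)
    for events in timed_log.values():
        for (a, t_a), (b, t_b) in zip(events, events[1:]):
            delta = datetime.fromisoformat(t_b) - datetime.fromisoformat(t_a)
            durations[(a, b)].append(delta.total_seconds() / 60)
    return {pair: sum(values) / len(values) for pair, values in durations.items()}


annotated = mean_minutes_between(TIMED_LOG)
for (a, b), minutes in annotated.items():
    print(f"{a} -> {b}: {minutes:.1f} min on average")
```

Attaching such averages to the edges of a process model points to the steps where cases spend most of their time.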

    Assorted advanced techniques and applications

    Assorted techniques for solving real-life process-oriented data science problems.
    Objectives: 1, 2
    Contents:
    Theory: 4h
    Problems: 0h
    Laboratory: 4h
    Guided learning: 0h
    Autonomous learning: 16h

    Methodology for process oriented data science projects

    Overview of how to manage a PODS project
    Objectives: 2
    Contents:
    Theory: 2h
    Problems: 0h
    Laboratory: 3h
    Guided learning: 0h
    Autonomous learning: 16h

    Teaching methodology

    Theory sessions, which may include problem-solving sessions with or without a programming component; practical sessions with open-source or commercial process-oriented data science software; and the development of a case study.

    Evaluation methodology

    The evaluation of the subject consists of two elements: final exam (60%), practical assessments (40%).

    The final exam will contain questions and problems about the theoretical contents that are explained in the theory classes.

    The practical assessments will be guided assessments that will be conducted during the lab classes on various process mining tools and platforms. Assessments can be done in pairs or individually.

    Bibliography

    Basic

    Complementary

    Web links

    Previous capacities

    Thorough understanding of computing in general; good command of several programming languages; basic ability to formalize informatics engineering problems mathematically.