Machine Learning Systems in Production

Credits
6
Types
  • MDS: Elective
  • MIRI: Elective
Requirements
This subject has no formal requirements, but it does assume previous capacities.
Department
ESSI
In this course, we introduce a project-based learning approach to teaching MLOps, focused on demonstrating and gaining hands-on experience with emerging software engineering practices and tools for automating the construction of ML-enabled components. The laboratory sessions cover the end-to-end ML component life cycle, from model building to production deployment.

Teachers

Person in charge

  • Silverio Juan Martínez Fernández

Others

  • Matías Sebastián Martínez Martínez
  • Santiago Del Rey Juarez

Weekly hours

Theory: 1.8
Problems: 0
Laboratory: 1.8
Guided learning: 0
Autonomous learning: 6.4

Competences

Transversal Competences

Sustainability and social commitment

  • CT2 - Capability to know and understand the complexity of the economic and social phenomena typical of the welfare society; capability to relate welfare to globalization and sustainability; capability to use technique, technology, economics and sustainability in a balanced and compatible way.

Teamwork

  • CT3 - Ability to work as a member of an interdisciplinary team, either as a regular member or in a leadership role, in order to develop projects with pragmatism and a sense of responsibility, making commitments that take into account the available resources.

Third language

  • CT5 - Achieving a level of spoken and written proficiency in a foreign language, preferably English, that meets the needs of the profession and the labour market.

Basic

  • CB8 - Capability to communicate their conclusions, and the knowledge and rationale underpinning these, to both specialist and non-specialist audiences in a clear and unambiguous way.
  • CB9 - Possession of the learning skills that enable the students to continue studying in a way that will be mainly self-directed or autonomous.

Generic Technical Competences

Generic

  • CG3 - Define, design and implement complex systems that cover all phases in data science projects
  • CG4 - Design and implement data science projects in specific domains and in an innovative way

Technical Competences

Specific

  • CE5 - Model, design, and implement complex data systems, including data visualization
  • CE7 - Identify the limitations imposed by data quality in a data science problem and apply techniques to mitigate their impact
  • CE10 - Identify machine learning and statistical modeling methods to use and apply them rigorously in order to solve a specific data science problem

Objectives

  1. Interpret the basic concepts of Software Engineering for ML systems, especially in relation to the use and exploitation of MLOps practices.
    Related competences: CT5, CG3, CE5
  2. Apply and analyze MLOps practices to build ML models, fostering reproducibility and quality assurance.
    Related competences: CT2, CT3, CE7, CE10, CB8, CB9
  3. Apply and analyze MLOps practices to deploy ML models, fostering API development and component delivery.
    Related competences: CT3, CG3, CG4, CE5, CB8, CB9
  4. Describe concepts and methods related to monitoring data obtained during the use of ML systems, in order to enable feedback loops in response to changes.
    Related competences: CT3, CG3, CG4, CE5, CB8, CB9

Contents

  1. Basic concepts of Software Engineering for ML systems (MLOps)
    Motivation for applying software engineering to ML systems. MLOps introduction and key concepts. Requirements engineering for ML. Collaborative development platforms.
  2. MLOps practices to build ML models
    The complexity and diversity of data science projects and ML systems call for engineering techniques to ensure they are built in a robust and future-proof manner. In this chapter we address software engineering best practices for data science projects and software that includes ML components: version control systems; ML pipeline reproducibility and tracking; software measurement for ML; quality assurance for ML.
  3. MLOps practices to deploy ML models
    The complexity and diversity of ML systems call for engineering techniques to ensure they are deployed in a robust and production-ready manner. In this chapter we address software engineering best practices for ML components: software architecture for ML; deploying ML models; APIs for ML; packaging of ML components; automation of ML pipelines.
  4. Monitoring data obtained during the use of ML systems
    A key problem in software development is the evolution of the ML system in response to new needs. Analyzing the data collected while users work with the ML system, including their explicit comments, makes it possible to discover their real needs, some of which they may not be fully aware of. Increasingly, software systems need to be aware of their context in order to provide a correct service. This requirement obliges them to monitor context data continuously, detect significant changes and react at runtime (possibly in near real time). This topic describes the problem and reviews some basic techniques: monitoring and telemetry; MLOps cycles and feedback loops.
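
As a rough illustration of the monitoring idea in topic 4 (watching data collected during use and detecting significant changes), the sketch below compares recent values of one numeric input against a reference sample. It is only a minimal example under stated assumptions: the function name has_shifted, the threshold of 3 standard errors, and the sample values are all hypothetical and are not taken from the course material.

```python
from statistics import mean, stdev


def has_shifted(reference, live, threshold=3.0):
    """Return True when the mean of `live` departs from the mean of
    `reference` by more than `threshold` standard errors.

    Hypothetical helper: a simple mean-shift check, not a full
    drift-detection method.
    """
    ref_mean = mean(reference)
    ref_std = stdev(reference)  # sample standard deviation of the reference
    if ref_std == 0:
        # A constant reference: any different live mean counts as a change.
        return mean(live) != ref_mean
    std_error = ref_std / len(live) ** 0.5
    z_score = abs(mean(live) - ref_mean) / std_error
    return z_score > threshold


# Illustrative values only (assumed, not real telemetry).
reference_sample = [10.0, 11.0, 9.0, 10.5, 9.5, 10.0, 10.2, 9.8]
recent_stable = [10.1, 9.9, 10.0, 10.3]
recent_shifted = [14.0, 15.0, 13.5, 14.5]

print(has_shifted(reference_sample, recent_stable))
print(has_shifted(reference_sample, recent_shifted))
```

In a feedback loop, a positive result from such a check could trigger an alert or a retraining step; which reaction is appropriate depends on the system.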

Activities



Study of basic concepts of Software Engineering for ML systems (MLOps)


Objectives: 1
Contents:
Theory: 3.6h
Problems: 0h
Laboratory: 0h
Guided learning: 0h
Autonomous learning: 1.8h

Study of MLOps practices to build ML models


Objectives: 2
Contents:
Theory: 7.2h
Problems: 0h
Laboratory: 0h
Guided learning: 0h
Autonomous learning: 3.6h

Study of MLOps practices to deploy ML models


Objectives: 3
Contents:
Theory: 7.2h
Problems: 0h
Laboratory: 0h
Guided learning: 0h
Autonomous learning: 3.6h

Study of concepts for monitoring data obtained during the use of ML systems


Objectives: 4
Contents:
Theory: 7.2h
Problems: 0h
Laboratory: 0h
Guided learning: 0h
Autonomous learning: 3.6h

Practical development of an end-to-end project of MLOps practices in the context of ML-based systems

Students will progressively develop a practical project that lets them exercise the basic concepts introduced in the theory part. It will be developed in teams of 4-5 students. The resulting software, duly documented, will be uploaded to a code repository. The team will present a report, written in English, summarizing the main aspects of the project: that is, the process of building and deploying an ML component of an ML-based system, and an evaluation of the accuracy of the models and algorithms used.
Objectives: 1 2 3 4
Contents:
Theory: 0h
Problems: 0h
Laboratory: 25.2h
Guided learning: 0h
Autonomous learning: 70.6h

Presentation of the summary of an existing article about MLOps

Each student will present a summary of a scientific article. All students need to present at least once. To foster discussion, presenters must also ask at least one question about the other presentations. The lecturers prepare the list of articles.
Objectives: 1
Week: 14
Theory: 1.8h
Problems: 0h
Laboratory: 0h
Guided learning: 0h
Autonomous learning: 6.4h

Presentation of the practical development of an end-to-end project of MLOps practices in the context of ML-based systems


Objectives: 1 2 3 4
Week: 14
Theory: 0h
Problems: 0h
Laboratory: 1.8h
Guided learning: 0h
Autonomous learning: 6.4h

Teaching methodology

The theoretical contents of the course are taught in the theory classes. These classes are complemented with practical examples and problems that students must solve in the Autonomous Learning hours.

In the laboratory sessions, the knowledge acquired in the theory classes is consolidated by solving problems and carrying out practical work related to the theoretical contents. During the laboratory classes, the teacher will introduce new techniques and leave a substantial part of the class for the students to work on the proposed exercises.

Evaluation methodology

The final grade is the weighted sum of the project grade (weight 90%) and the grade of the article presentation given in the theory classes (weight 10%). Both activities are mandatory.

FinalGrade = 90% ProjectGrade + 10% ArticlePresentation

The project grade reflects both the completion of the project and each student's individual work. Each student's final project grade is therefore given by the following formula:

ProjectGrade = TeamGrade * IndivFact

The team's overall grade, TeamGrade, takes into account the application of software engineering practices.

The individual factor IndivFact is a multiplicative factor between 0 and 1.2 (in any case, it cannot raise ProjectGrade above 10). It is derived from the teacher's evaluation of the student's participation in the project development and from the teammates' evaluation of that same participation.
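
The grading rules above can be sketched as a small calculation. This is only an illustration under stated assumptions: the function names, the 0 to 10 scale for TeamGrade and ArticlePresentation (inferred from the cap of 10 on ProjectGrade), and the sample values are hypothetical and not part of the official evaluation procedure.

```python
def project_grade(team_grade: float, indiv_fact: float) -> float:
    """ProjectGrade = TeamGrade * IndivFact, capped at 10."""
    if not 0.0 <= team_grade <= 10.0:
        raise ValueError("team_grade is assumed to lie in [0, 10]")
    if not 0.0 <= indiv_fact <= 1.2:
        raise ValueError("indiv_fact must lie in [0, 1.2]")
    return min(10.0, team_grade * indiv_fact)


def final_grade(team_grade: float, indiv_fact: float,
                article_presentation: float) -> float:
    """Weighted sum: 90% project grade plus 10% article presentation."""
    if not 0.0 <= article_presentation <= 10.0:
        raise ValueError("article_presentation is assumed to lie in [0, 10]")
    return (0.9 * project_grade(team_grade, indiv_fact)
            + 0.1 * article_presentation)


# Hypothetical student: team grade 8.0, individual factor 1.1,
# article presentation 7.0.
print(round(final_grade(8.0, 1.1, 7.0), 2))  # prints 8.62

# The cap applies when the product exceeds 10: 9.0 * 1.2 = 10.8 -> 10.0
print(project_grade(9.0, 1.2))  # prints 10.0
```

In the first case the project grade is 8.0 * 1.1 = 8.8, so the result is 0.9 * 8.8 + 0.1 * 7.0 = 8.62.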

Bibliography

Basic:

Complementary:

  • 2023 IEEE/ACM 45th International Conference on Software Engineering: Software Engineering Education and Training (ICSE-SEET) - Lanubile, Filippo; Martínez-Fernández, Silverio; Quaranta, Luigi, 2023.
    https://arxiv.org/pdf/2302.01048.pdf
  • Machine Learning in Production - Kästner, Christian, Carnegie Mellon University, 2022.
    https://mlip-cmu.github.io/s2023/

Previous capacities

The knowledge provided by the subjects of the previous quarters of the master's degree, together with fundamentals of machine learning.