Credits
6
Types
Compulsory
Requirements
This subject has no requirements, but it has previous capacities
Department
CS
Teachers
Person in charge
- Carlos Escolano Peinado ( carlos.escolano@upc.edu )
Others
- Jordi Luque Serrano ( jordi.luque.serrano@upc.edu )
Weekly hours
Theory
2
Problems
0
Laboratory
2
Guided learning
0
Autonomous learning
6
Competences
Transversal
Basic
Specific
Generic
Objectives
- Understand the fundamental theories and techniques associated with dialog processing and generation.
  Related competences: CB3, CB4, CT6, CE14, CE17, CG3, CG5, CG6
- Understand the fundamental theories and techniques associated with voice and speech processing.
  Related competences: CB3, CB4, CT6, CE14, CE17, CE27, CG3, CG5, CB2
- Get to know the most relevant resources and applications for dialog processing and generation.
  Related competences: CB3, CB4, CB5, CT6, CT8, CE15, CE27, CG3, CG4, CG5, CG6
- Develop programs to solve particular tasks in the dialog and speech area.
  Related competences: CB3, CT2, CT6, CT8, CB2, CE14, CE16, CE18, CG5, CE27, CT1, CG7, CG8, CG9
Contents
- Introduction
  Introduction to the subject's content and to speech and dialog processing.
- Rule-based systems.
  Dialog systems based on human-crafted rules.
- Corpus-based dialog systems: Frame-based and retrieval systems
  Statistical dialog systems based on an example corpus.
- Deep Learning based dialog systems
  Introduction to seq2seq, Transformer, and their application to dialog tasks.
- Ethical considerations and dialog policy.
  Possible risks of dialog systems and techniques to mitigate them.
- Speech processing.
  Techniques to transform speech and use it in our systems.
- Automatic speech recognition
  Deep learning methods for automatic speech recognition.
- Text-to-Speech systems.
  Generative text-to-speech systems based on Deep Learning.
Activities
Introductory Session
Introduction to the concepts of dialog and speech processing. We will also review some basic natural language processing concepts that are required to follow the subject (tokenization and embeddings).
- Theory: Explain the objectives and evaluation of the subject, and review some basic natural language processing concepts.
- Laboratory: Present the practical exercises to be done during the subject.
Contents:
Theory
2h
Problems
0h
Laboratory
2h
Guided learning
0h
Autonomous learning
0h
Rule-based dialog systems.
The historical context of dialog and rule-based systems.
- Theory: Historical context and rule-based systems. We'll cover hand-crafted rules and their advantages for interpretability.
- Laboratory:
Contents:
Theory
2h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
0h
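As a rough illustration of what "human-crafted rules" means in practice, the following is a minimal sketch of a pattern-matching responder. The rules, replies, and the `respond` function are hypothetical examples, not course material; the point is only that every reply can be traced back to the rule that produced it, which is where the interpretability advantage comes from.

```python
import re

# Hypothetical hand-written rules: (pattern, reply template).
# Rules are tried in order; the first match wins.
RULES = [
    (re.compile(r"\b(hello|hi)\b", re.IGNORECASE), "Hello! How can I help you?"),
    (re.compile(r"\bmy name is (\w+)", re.IGNORECASE), "Nice to meet you, {0}."),
    (re.compile(r"\b(bye|goodbye)\b", re.IGNORECASE), "Goodbye!"),
]
FALLBACK = "Sorry, I did not understand that."


def respond(utterance: str) -> str:
    """Return the reply of the first rule whose pattern matches the utterance."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            # Captured groups fill the template; unused groups are ignored.
            return template.format(*match.groups())
    return FALLBACK


if __name__ == "__main__":
    for text in ["Hi there", "my name is Ana", "What is the weather?"]:
        print(text, "->", respond(text))
```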
Corpus-based dialog systems: retrieval and frame-based systems.
This activity explains systems based on a corpus of examples and their main differences from rule-based systems. Among these systems, we will focus on those that retrieve examples from an example base (retrieval) and on generative systems built from frames (frame-based).
- Theory: In this activity we'll explain corpus-based systems and their main differences from rule-based systems. Within this approach, we will focus on retrieval systems over an example corpus and generative frame-based systems.
Contents:
Theory
4h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
0h
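The retrieval idea described in this activity can be sketched as follows: given a user query, find the most similar stored utterance and return its paired response. This is a minimal sketch under stated assumptions; the toy corpus, the bag-of-words representation, and the cosine similarity measure are illustrative choices, not necessarily the ones used in the course.

```python
import math
from collections import Counter

# Hypothetical example corpus of (utterance, response) pairs.
CORPUS = [
    ("what time do you open", "We open at 9 am."),
    ("where are you located", "We are on the main street."),
    ("how can i book a table", "You can book a table by phone."),
]


def bag_of_words(text: str) -> Counter:
    """Lowercase, whitespace-tokenized word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str) -> str:
    """Return the response paired with the corpus utterance closest to the query."""
    q = bag_of_words(query)
    best = max(CORPUS, key=lambda pair: cosine(q, bag_of_words(pair[0])))
    return best[1]


if __name__ == "__main__":
    print(retrieve("I want to book a table"))
```

Unlike a rule-based system, changing the behaviour here means adding or editing examples in the corpus rather than writing new rules.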
Deep Learning-based dialog systems.
Introduction to Seq2Seq systems, Transformer, and their application to dialog.
- Theory: Introduction to Seq2Seq systems, Transformer, and their application to dialog.
Contents:
Theory
6h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
0h
Ethical considerations and dialog policy.
Ethical considerations when training dialog systems and methods to mitigate the dangers of this kind of system.
- Theory: Ethical considerations when training dialog systems and methods to mitigate the dangers of this kind of system.
Contents:
Theory
2h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
0h
Speech processing.
Introduction to speech processing, especially the transformations needed to train deep learning-based systems.
- Theory: Introduction to speech processing, especially the transformations needed to train deep learning-based systems.
Contents:
Theory
2h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
0h
Automatic speech recognition.
Deep Learning-based techniques for speech recognition, CTC loss, and Seq2Seq-based systems.
- Theory: Deep Learning-based techniques for speech recognition, CTC loss, and Seq2Seq-based systems.
Contents:
Theory
4h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
0h
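To make the CTC topic above more concrete: CTC relates a frame-level label sequence to a shorter output by merging consecutive repeated labels and then removing a special blank symbol, and the CTC loss sums over all frame-level sequences that map to the target in this way. The sketch below shows only that collapsing step, as used in greedy decoding. The blank symbol `_` and the function name are assumptions made for illustration.

```python
from itertools import groupby

BLANK = "_"  # assumed blank symbol for this sketch


def ctc_greedy_decode(frame_labels):
    """Collapse a per-frame label sequence: merge repeats, then drop blanks."""
    # groupby yields one key per run of identical consecutive labels.
    collapsed = [label for label, _ in groupby(frame_labels)]
    return "".join(label for label in collapsed if label != BLANK)


if __name__ == "__main__":
    # The blank between the two "l" runs keeps the double letter in the output.
    print(ctc_greedy_decode(list("hh_e_ll_lo_")))
```

In a real recognizer the per-frame labels would come from taking the most probable label at each frame of the network output; here they are given directly.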
P2. Frame-based dialog system and Deep Learning.
Crafting a frame-based dialog system using deep learning techniques.
- Laboratory: Crafting a frame-based dialog system using deep learning techniques.
Contents:
Theory
0h
Problems
0h
Laboratory
8h
Guided learning
0h
Autonomous learning
0h
P3. Automatic speech recognition.
Crafting an automatic speech recognition system using deep learning techniques.
- Laboratory: Crafting an automatic speech recognition system using deep learning techniques.
Contents:
Theory
0h
Problems
0h
Laboratory
8h
Guided learning
0h
Autonomous learning
0h
Teaching methodology
The course deepens the concepts of Human Language Processing, extending them to dialogue tasks. In addition, it introduces a new modality of data, speech, and shows how both can be combined when creating our systems.
Classes are organized into theory and laboratory sessions. In the theory classes, the teacher will present the concepts, combining them with exercises and questions to make the classes more interactive and to ensure that students grasp the concepts of the subject. In laboratory classes, students work independently in groups to apply the concepts seen in class to real data. These tasks combine laboratory sessions, where students can ask questions and resolve their doubts, with independent work to develop their systems. The students' ability to research and find new solutions to the proposed problems will be assessed.
In addition, at the end of the subject, students will have to demonstrate their ability to acquire new knowledge independently by presenting a research article on one of the topics covered in the subject.
Evaluation methodology
20% Partial Exam + 25% Final Exam + 45% Laboratory + 10% Paper Presentation
The theoretical part of the subject will be evaluated based on two exams. The first (partial) exam will focus on the dialogue block (Contents 1-5). The second exam (final) will evaluate the second block, speech processing (Contents 6-8). This exam will include exercises that combine speech and dialogue to evaluate how students have acquired the knowledge of both blocks.
Regarding the laboratory part, the three activities will have the same weight, 15% of the total of the subject. Students will have around four weeks to complete them. The objective is to evaluate how the students apply the content seen in class in practice as well as their ability to solve problems and work as a team.
Finally, at the end of the subject, the students will have to choose an article on dialogue or voice processing and present it in class. The objective of this task is to evaluate their ability to analyze new information and acquire new knowledge of the subject autonomously.
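As a worked example of the weighting above, the following sketch computes a final grade from the individual components. The function and variable names are illustrative, and the 0-10 grading scale is an assumption; the weights and the equal 15% share of each of the three laboratory activities come from the text above.

```python
# Weights from the evaluation formula: 20% + 25% + 45% + 10% = 100%.
WEIGHTS = {
    "partial_exam": 0.20,
    "final_exam": 0.25,
    "laboratory": 0.45,
    "paper_presentation": 0.10,
}


def final_grade(partial_exam, final_exam, lab_grades, paper_presentation):
    """Weighted final grade; all inputs are assumed to be on a 0-10 scale."""
    if len(lab_grades) != 3:
        raise ValueError("expected the grades of the three laboratory activities")
    # Equal weight per activity: 45% / 3 = 15% of the total each.
    laboratory = sum(lab_grades) / len(lab_grades)
    return (
        WEIGHTS["partial_exam"] * partial_exam
        + WEIGHTS["final_exam"] * final_exam
        + WEIGHTS["laboratory"] * laboratory
        + WEIGHTS["paper_presentation"] * paper_presentation
    )


if __name__ == "__main__":
    # 0.20*6 + 0.25*7 + 0.45*8 + 0.10*10
    print(round(final_grade(6, 7, [8, 9, 7], 10), 2))
```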
Assessment of skills.
The assessment of competence on autonomous use of information will be carried out with the oral presentation of the scientific article (10%). The students must be able to draw their conclusions on a new work related to the topics seen in class.
Reevaluation
Only students who attended the exams and failed can attend the reevaluation. The maximum grade after reevaluation is 7.
Bibliography
Basic
- Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition
  Jurafsky, Dan; Martin, James H.
  The authors, 2019.
- Foundations of statistical natural language processing
  Manning, Christopher D.; Schütze, Hinrich.
  MIT Press, 1999.
  ISBN: 0262133601
  https://discovery.upc.edu/discovery/fulldisplay?docid=alma991001994779706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
- The Handbook of computational linguistics and natural language processing [Electronic resource]
  Clark, Alexander; Fox, Chris; Lappin, Shalom.
  Wiley-Blackwell, 2010.
  ISBN: 9781444324044
- Deep learning
  Goodfellow, Ian; Bengio, Yoshua; Courville, Aaron.
  2016.
  ISBN: 9780262035613
  https://discovery.upc.edu/discovery/fulldisplay?docid=alma991004107709706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
Previous capacities
To be able to do this subject, it is recommended to have previously taken the following subjects:
XNDL-IA: This subject explains the fundamentals of deep learning, including recurrent networks. Knowing these topics is necessary to understand how models based on Seq2Seq architectures work, which are the state of the art in both voice and dialogue processing.
PLH-IA: This subject explains the basics of human language processing. Concepts such as text preprocessing to reduce ambiguities or the continuous representation of text are necessary to be able to develop the systems we will study in the subject.