Spoken and Written Language Processing

Credits: 6
Type: Compulsory
Requirements: This subject has no requirements, but it assumes some previous capacities.
Department: TSC
This course focuses on speech and language technologies, a fundamental part of artificial intelligence that aims to develop systems to analyze, understand, translate, and generate spoken or written human language. Special attention is given to new technologies based on deep learning and their applications. The final project gives students the opportunity to explore a particular topic in depth and helps them strengthen their skills in application development or research.

Teachers

Person in charge

  • Marta Ruiz Costa-Jussa

Others

  • Jose Adrian Rodriguez Fonollosa

Weekly hours

Theory: 3
Problems: 0
Laboratory: 1
Guided learning: 0.4
Autonomous learning: 6

Competences

Technical Competences

  • CE5 - Design and apply signal processing techniques, choosing between different technological tools, including those for artificial vision, speech recognition and multimedia data processing.
  • CE6 - Build or use systems for processing and understanding written language, integrating them into other data-driven systems. Design systems for searching textual or hypertextual information and for analyzing social networks.

Transversal Competences

Transversals

  • CT5 - Competent use of information resources. Manage the acquisition, structuring, analysis and visualization of data and information in the field of specialty and critically evaluate the results of such management.
  • CT6 - Autonomous Learning. Detect deficiencies in one's own knowledge and overcome them through critical reflection and the choice of the best action to extend this knowledge.
  • CT7 - Third language. Know a third language, preferably English, with an adequate oral and written level and in line with the needs of graduates.

Basic

  • CB4 - Students can communicate information, ideas, problems and solutions to both specialized and non-specialized audiences.
  • CB5 - Students have developed the learning skills necessary to undertake later studies with a high degree of autonomy.

Generic Technical Competences

Generic

  • CG1 - Design computer systems that integrate data from very diverse sources and in very diverse forms, build mathematical models from them, reason about these models and act accordingly, learning from experience.
  • CG2 - Choose and apply the most appropriate methods and techniques to a problem defined by data that represents a challenge for its volume, speed, variety or heterogeneity, including computer, mathematical, statistical and signal processing methods.
  • CG4 - Identify opportunities for innovative data-driven applications in evolving technological environments.
  • CG5 - To be able to draw on fundamental knowledge and sound work methodologies acquired during the studies to adapt to the new technological scenarios of the future.

Objectives

  1. Know the most important deep learning technologies of interest in the processing of spoken and written language.
    Related competences: CE5, CE6, CT5, CT6, CT7, CG1, CG2, CG4, CG5, CB4, CB5
  2. The student must know the most important applications of speech and language technology.
    Related competences: CE5, CE6, CT5, CT6, CT7, CG1, CG2, CG4, CG5, CB4, CB5
  3. The student must be able to select the most appropriate speech and language technology for a particular task or application.
    Related competences: CE5, CE6, CT5, CT6, CT7, CG1, CG2, CG4, CG5, CB4, CB5
  4. Develop innovative applications that use speech technology appropriately.
    Related competences: CE5, CE6, CT5, CT6, CT7, CG1, CG2, CG4, CG5, CB4, CB5
  5. The student must be able to identify the fundamental parameters of the voice in the time and frequency domains.
    Related competences: CE5, CT5, CT6, CT7, CG1, CB4, CB5
  6. The student must know the most important mathematical and machine learning tools for voice analysis, such as vector quantization (VQ), Gaussian mixture models (GMM) and hidden Markov models (HMM).
    Related competences: CE5, CT5, CT6, CT7, CG1, CG2, CG4, CG5, CB4, CB5
  7. The student must know the techniques for statistical language modeling.
    Related competences: CE6, CT5, CT6, CT7, CG1, CG2, CG4, CG5, CB4, CB5

Contents

  1. Introduction to language and speech technologies and applications
    Applications of oral and written language processing. Social impact.
    Main blocks of a natural language processing system: speech recognition, natural language processing, text-to-speech conversion.
    Language as a sequence of words. Vector representation of words. One-hot encoding versus continuous-space representations.
    Word2vec: Continuous bag-of-words (CBOW) versus Continuous skip-gram. GloVe vectors. Structures and analogies in word vector representations.
  2. Language Modeling
    Statistical modeling based on N-grams.
    Modeling with neural networks. Recurrent networks. Convolutional networks. Attention mechanisms: the Transformer.
  3. Contextual language representations
    General purpose language representations.
    Unsupervised training. Unidirectional and bidirectional systems.
    Main architectures: ULMFiT, OpenAI GPT, ELMo, BERT, XLM. Applications.
  4. Neural Machine Translation
    Introduction to Machine Translation. Automatic quality evaluation: BLEU.
    Neural Machine Translation.
  5. Introduction to automatic speech recognition
    Pattern matching. Dynamic time warping.
    Hidden Markov models. Isolated word recognition.
    Large vocabulary continuous ASR: Acoustic modeling, Language modeling, Search.
  6. Speech synthesis
    Linguistic processing.
    Prosody modeling.
    Waveform generation.
    Concatenation methods.

Activities


Topic development: Introduction to speech and language technology and applications

Introduction to speech and language technology and applications. Word vectors
Objectives: 2, 3
Contents:
Theory: 6h
Problems: 0h
Laboratory: 2h
Guided learning: 0h
Autonomous learning: 10h

Topic development: Language Modeling


Objectives: 6
Contents:
Theory: 6h
Problems: 0h
Laboratory: 2h
Guided learning: 0h
Autonomous learning: 10h

Topic development: Automatic Speech Recognition

Automatic Speech Recognition
Objectives: 5, 6, 7
Contents:
Theory: 9h
Problems: 0h
Laboratory: 0h
Guided learning: 1h
Autonomous learning: 10h

Topic development: Speech Synthesis

Speech Synthesis
Objectives: 2
Contents:
Theory: 6h
Problems: 0h
Laboratory: 0h
Guided learning: 0h
Autonomous learning: 10h

Topic development: Contextual language representations


Objectives: 1
Contents:
Theory: 9h
Problems: 0h
Laboratory: 2h
Guided learning: 1h
Autonomous learning: 16h

Topic development: Neural Machine Translation

Neural Machine Translation

Theory: 6h
Problems: 0h
Laboratory: 2h
Guided learning: 0h
Autonomous learning: 10h

Theory: 3h
Problems: 0h
Laboratory: 7h
Guided learning: 4h
Autonomous learning: 24h

Teaching methodology

Lectures presenting new theoretical material and practical examples.
Theoretical and practical assignments grouped in subjects.
Research project, presented in written and oral form by the students.

Evaluation methodology

Course evaluation is based on three aspects:

- Two exams, a midterm exam and the final exam, to assess the knowledge acquired by the student on the topics covered in theory and practice sessions (40%)

- Evaluation of laboratory assignments (30%)

- Evaluation of the final project (30%)

Only the 40% of the grade corresponding to the exams can be reassessed. The new grade will replace the grade obtained in the two exams taken during the course, with the same weight in the final grade.
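
As a rough illustration of the weighting described above, the final grade could be computed as in the following sketch. The function name, the 0-10 grading scale, and the use of a single combined exam grade are assumptions made for the example; this guide does not state how the midterm and final exam are combined within the 40%.

```python
from typing import Optional


def final_grade(exams: float, labs: float, project: float,
                reassessment: Optional[float] = None) -> float:
    """Combine component grades using the 40/30/30 weights.

    All grades are assumed to be on the same scale (for example 0-10).
    `exams` is assumed to be a single grade that already summarizes
    the midterm and the final exam.
    """
    # A reassessment grade, if present, replaces the exam grade
    # and keeps the same 40% weight.
    if reassessment is not None:
        exams = reassessment
    return 0.4 * exams + 0.3 * labs + 0.3 * project


if __name__ == "__main__":
    # Hypothetical grades, for illustration only.
    print(final_grade(exams=5.0, labs=8.0, project=7.0))
    print(final_grade(exams=5.0, labs=8.0, project=7.0, reassessment=7.0))
```

Under these assumptions, a reassessment can only change the exam portion of the grade; the laboratory and project portions stay as they were.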

Previous capacities

College Calculus, Linear Algebra
Basic Probability and Statistics
Extensive programming experience in Python
Machine Learning
Introduction to Deep Learning