Advanced Statistical Modelling

Credits
6
Types
  • MIRI: Specialization complementary (Data Science)
  • MDS: Elective
Requirements
This subject has no requirements, but it assumes some previous capacities.
Department
EIO
The course focuses on two areas of advanced statistical modeling: Time Series and Bayesian Statistics. Emphasis is placed on model selection and validation. A key part of the course is the study of real cases, presented by the teachers and worked on by the students in the scheduled assignments.

Teachers

Person in charge

  • Jose Antonio Sánchez Espigares
  • Xavier Puig Oriol

Weekly hours

Theory
3
Problems
0
Laboratory
0
Guided learning
0
Autonomous learning
7

Competences

Transversal Competences

Information literacy

  • CT4 - Capacity for managing the acquisition, the structuring, analysis and visualization of data and information in the field of specialisation, and for critically assessing the results of this management.

Third language

  • CT5 - Achieving a level of spoken and written proficiency in a foreign language, preferably English, that meets the needs of the profession and the labour market.

Basic

  • CB6 - Ability to apply the acquired knowledge and capacity for solving problems in new or unknown environments within broader (or multidisciplinary) contexts related to their area of study.
  • CB7 - Ability to integrate knowledge and handle the complexity of making judgments based on information which, being incomplete or limited, includes considerations on social and ethical responsibilities linked to the application of their knowledge and judgments.
  • CB10 - Possess and understand knowledge that provides a basis or opportunity to be original in the development and/or application of ideas, often in a research context.

Generic Technical Competences

Generic

  • CG2 - Identify and apply methods of data analysis, knowledge extraction and visualization for data collected in disparate formats

Technical Competences

Specific

  • CE5 - Model, design, and implement complex data systems, including data visualization
  • CE6 - Design the Data Science process and apply scientific methodologies to obtain conclusions about populations and make decisions accordingly, from both structured and unstructured data and potentially stored in heterogeneous formats.
  • CE9 - Apply appropriate methods for the analysis of non-traditional data formats, such as processes and graphs, within the scope of data science
  • CE10 - Identify machine learning and statistical modeling methods to use and apply them rigorously in order to solve a specific data science problem
  • CE12 - Apply data science in multidisciplinary projects to solve problems in new or poorly explored domains from a data science perspective that are economically viable, socially acceptable, and in accordance with current legislation

Objectives

  1. Time Series
    Related competences: CT4, CT5, CG2, CE5, CE6, CE9, CE10, CE12, CB6, CB7, CB10
    Subcompetences:
    • At the end of the course the student will be able to propose, estimate and validate ARIMA models for the prediction of time series.
    • At the end of the course the student will be able to improve the ARIMA models with outlier treatment, calendar effects and intervention analysis
    • At the end of the course the student will be able to apply machine learning methods for the prediction of time series (recurrent neural networks and LSTM)
    • At the end of the course the student will be able to define state space models for time series and apply the Kalman filter to solve different types of problems (noise cleaning, imputation of missing data, separation of components in structural time series)
  2. Bayesian Statistics
    Related competences: CT4, CT5, CG2, CE5, CE6, CE9, CE10, CE12, CB6, CB7, CB10
    Subcompetences:
    • At the end of the course the student will be able to define a prior distribution, and go from prior to posterior distributions
    • At the end of the course the student will be able to check a Bayesian model, compare Bayesian models and use them for prediction
    • At the end of the course the student will be able to simulate from the posterior distribution by means of the suitable software
    • At the end of the course the student will be able to understand the difference between hierarchical and non-hierarchical Bayesian models

Contents

  1. Time Series
    1. Box-Jenkins methodology (ARIMA models) for prediction

    2. Extensions: outliers treatment, calendar effects and intervention analysis

    3. State space models and the Kalman filter. Applications

    4. Machine learning methods for time series forecasting (recurrent neural networks and LSTM)
  2. Bayesian Data Analysis
    1. Bayesian Model. The statistical model. The Likelihood function. The Bayesian model

    2. Bayesian Inference. Point and Interval estimation. Hypothesis testing

    3. Bayesian Computation. Markov Chain Monte Carlo simulation. Monitoring convergence

    4. Hierarchical Models

    5. Checking and defining the model

Activities

Presentation of Theme 1 (Time Series) in class

Objectives: 1
Contents:
Theory
22.5h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
52.5h

Presentation of Theme 2 (Bayesian Models) in class

Objectives: 2
Contents:
Theory
22.5h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
52.5h

Teaching methodology

There is one weekly three-hour session. The first two hours are devoted to the teacher's presentation of the theoretical content. The last hour is dedicated to putting this content into practice: each student brings a laptop to class and carries out the tasks proposed by the teacher. Each session ends with an assignment, which students must hand in at the following session.

Evaluation methodology

Homework will be assigned during the course. Homework grades are worth 50% of the course grade.

There will be an exam for the first part of the course (first theme), held during the partial exams week, and another for the second part (second theme). Each exam has a weight of 25%.

Course Grade = 0.5 * Hwk Grade + 0.25 * 1st part Exam Grade + 0.25 * 2nd part Exam Grade
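
As an illustration only, the weighting above can be written as a small Python function. The function name, the argument names and the example marks are hypothetical, and the 0 to 10 grading scale is an assumption that this page does not state.

```python
def course_grade(homework: float, exam_part1: float, exam_part2: float) -> float:
    """Weighted course grade: 50% homework, 25% for each of the two exams."""
    return 0.5 * homework + 0.25 * exam_part1 + 0.25 * exam_part2


# Hypothetical marks on an assumed 0-10 scale.
example = course_grade(homework=8.0, exam_part1=6.0, exam_part2=7.0)
print(example)  # 0.5*8.0 + 0.25*6.0 + 0.25*7.0 = 7.25
```

Because the homework carries the same weight as both exams together, one point of homework moves the final grade as much as one point on each exam combined.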

Previous capacities

Not specified