Advanced Statistical Modelling

Specialization complementary (Data Science)
This subject has no formal requirements, but it does assume previous capacities.
The course covers several statistical regression models: generalized linear models, nonparametric regression, generalized nonparametric regression, and Bayesian models. Model selection and validation are emphasized. A fundamental part of the course is the study of real cases, both by the teachers in class and by the students in the weekly assignments.

Weekly hours

Guided learning
Autonomous learning


Generic Technical Competences


  • CG3 - Capacity for mathematical modeling, calculation and experimental design in technology and company engineering centers, particularly in research and innovation in all areas of Computer Science.

Transversal Competences

Information literacy

  • CTR4 - Capability to manage the acquisition, structuring, analysis and visualization of data and information in the area of informatics engineering, and critically assess the results of this effort.

Appropriate attitude towards work

  • CTR5 - Capability to be motivated by professional achievement and to face new challenges, and to have a broad vision of the possibilities of a career in the field of informatics engineering. Capability to be motivated by quality and continuous improvement, and to act rigorously in professional development. Capability to adapt to technological or organizational changes. Capacity to work with incomplete information and/or under time and/or resource constraints.


  • CTR6 - Capacity for critical, logical and mathematical reasoning. Capability to solve problems in their area of study. Capacity for abstraction: the capability to create and use models that reflect real situations. Capability to design and implement simple experiments, and analyze and interpret their results. Capacity for analysis, synthesis and evaluation.

Technical Competences of each Specialization


  • CEC2 - Capacity for mathematical modelling, calculation and experimental design in engineering technology centres and business, particularly in research and innovation in all areas of Computer Science.


  1. At the end of the course the student will be able to propose, estimate, interpret and validate generalized linear models.
    Related competences: CG3, CEC2, CTR4, CTR6
  2. At the end of the course the student will be able to propose, estimate, interpret and validate non-parametric versions of linear regression models and generalized linear models.
    Related competences: CG3, CEC2, CTR4, CTR5, CTR6
  3. At the end of the course the student will know how to properly choose the smoothing parameters that, in nonparametric regression models, control the trade-off between good fit to the observed sample and good generalization.
    Related competences: CG3, CEC2, CTR4, CTR5, CTR6
  4. At the end of the course the student, facing a real modeling and/or prediction problem, will know how to choose the most suitable regression model (parametric, non-parametric, semi-parametric or Bayesian).
    Related competences: CG3, CEC2, CTR4, CTR5, CTR6
  5. At the end of the course the student will be able to distinguish between Bayesian and non-Bayesian statistical modelling.
    Related competences: CG3, CEC2, CTR4, CTR5, CTR6
  6. At the end of the course the student will be able to define a prior distribution, and go from prior to posterior distributions.
    Related competences: CG3, CEC2, CTR4, CTR5, CTR6
  7. At the end of the course the student will be able to understand the difference between hierarchical and non-hierarchical Bayesian models.
    Related competences: CG3, CEC2, CTR4, CTR5, CTR6
  8. At the end of the course the student will be able to check a Bayesian model, compare Bayesian models and use them for prediction.
    Related competences: CG3, CEC2, CTR4, CTR5, CTR6
  9. At the end of the course the student will be able to simulate from the posterior distribution by means of suitable software.
    Related competences: CG3, CEC2, CTR4, CTR5, CTR6
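
As an illustration of objective 6 (going from a prior to a posterior distribution), here is a minimal sketch using the conjugate Beta-Binomial pair. The choice of this model, the uniform Beta(1, 1) prior, the data (7 successes, 3 failures) and the function names are all assumptions made for the example; the syllabus does not prescribe them.

```python
def beta_binomial_update(a, b, successes, failures):
    """Conjugate update: a Beta(a, b) prior on a success probability,
    combined with binomial data, gives a Beta(a + successes, b + failures)
    posterior."""
    return a + successes, b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical example: uniform prior, then 7 successes and 3 failures observed.
prior = (1.0, 1.0)
posterior = beta_binomial_update(*prior, successes=7, failures=3)

print("posterior parameters:", posterior)
print("prior mean:", beta_mean(*prior))
print("posterior mean:", beta_mean(*posterior))
```

With a non-conjugate prior the posterior generally has no closed form, which is where simulation from the posterior (objective 9) comes in.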


  1. Parametric Modelling
    1. Introduction. Deterministic models and statistical models. Parametric, nonparametric and semiparametric models.

    2. Generalized linear models. Models for binary response data. Models for count data and contingency tables. Estimation by maximum likelihood and through the chi-squared (χ²) statistic. Inference. Model checking.

    3. Regularized estimation of LM and GLM. Ridge regression. LASSO estimation.
  2. Nonparametric Modelling
    1. Nonparametric regression model. Local polynomial regression. Kernels. Linear smoothers. Choosing the smoothing parameter: cross-validation, plug-in methods, varying windows.

    2. Generalized nonparametric regression model. Estimation by maximum local likelihood.

    3. Inference with nonparametric regression. Variability bands. Testing for no effects. Checking a parametric model. Comparing curves.

    4. Spline smoothing. Penalized least squares nonparametric regression. Cubic splines and interpolation. Smoothing splines. B-splines and P-splines. Fitting generalized nonparametric regression models with splines.

    5. Generalized additive models and Semiparametric models. Multiple nonparametric regression. The curse of dimensionality. Generalized additive models. Semiparametric models.
  3. Bayesian Data Analysis
    1. Bayesian Model. The statistical model. The likelihood function. The Bayesian model.

    2. Bayesian Inference. Point and interval estimation. Hypothesis testing.

    3. Bayesian Computation. Markov chain Monte Carlo simulation. Monitoring convergence.

    4. Hierarchical Models

    5. Checking and defining the model
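
To make the topic "choosing the smoothing parameter by cross-validation" from Theme 2 concrete, here is a small sketch in plain Python. It assumes a Nadaraya-Watson (local constant) estimator with a Gaussian kernel, synthetic data and a hand-picked grid of bandwidths; none of these choices comes from the syllabus, and the function names are hypothetical.

```python
import math
import random


def gaussian_kernel(u):
    """Unnormalized Gaussian kernel; the constant cancels in the ratio below."""
    return math.exp(-0.5 * u * u)


def nw_estimate(x0, xs, ys, h, skip=None):
    """Nadaraya-Watson estimate at x0 with bandwidth h.
    If skip is an index, that observation is left out."""
    num = den = 0.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        if i == skip:
            continue
        w = gaussian_kernel((x0 - x) / h)
        num += w * y
        den += w
    return num / den


def loo_cv_error(xs, ys, h):
    """Leave-one-out cross-validation mean squared prediction error."""
    errors = [
        (ys[i] - nw_estimate(xs[i], xs, ys, h, skip=i)) ** 2
        for i in range(len(xs))
    ]
    return sum(errors) / len(errors)


def choose_bandwidth(xs, ys, candidates):
    """Return the candidate bandwidth with the smallest LOO-CV error."""
    return min(candidates, key=lambda h: loo_cv_error(xs, ys, h))


# Synthetic data (an assumption for the example): noisy sine curve on [0, 2].
rng = random.Random(0)
xs = [i / 20 for i in range(41)]
ys = [math.sin(2 * math.pi * x) + rng.gauss(0.0, 0.2) for x in xs]

candidates = [0.02, 0.05, 0.1, 0.2, 0.5]
best_h = choose_bandwidth(xs, ys, candidates)
for h in candidates:
    print(f"h = {h:<4}  LOO-CV error = {loo_cv_error(xs, ys, h):.4f}")
print("selected bandwidth:", best_h)
```

A very small bandwidth follows the noise and a very large one flattens the curve; the cross-validation error is one way to quantify that trade-off between fit and generalization.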


Activity Evaluation act

Presentation of Theme 1 (parametric regression models) in class
Objectives: 1 4
Guided learning
Autonomous learning

Presentation of Theme 2 (non-parametric regression models) in class
Objectives: 2 3 4
Guided learning
Autonomous learning

Presentation of Theme 3 (Bayesian models) in class
Objectives: 4 5 6 7 8 9
Guided learning
Autonomous learning

Teaching methodology

There is one three-hour session per week. The first two hours are devoted to the teacher's presentation of the theoretical material. The last hour is dedicated to putting this material into practice: each student brings a laptop to class and carries out the tasks proposed by the teacher. Each session ends with an assignment that students must hand in at the following session.

Evaluation methodology

Homework will be assigned throughout the course. Homework grades are worth 50% of the course grade.

There will be one exam on the first part of the course (first and second themes), held during the partial exams week, and another on the second part (third theme). Each exam has a weight of 25%.

Course Grade = 0.5 * Hwk Grade + 0.25 * 1st part Exam Grade + 0.25 * 2nd part Exam Grade
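
The weighting above can be checked with a short calculation. The sketch below is only an illustration: the function name, the sample grades and the 0 to 10 scale are assumptions, since the syllabus does not state the grading scale.

```python
def course_grade(homework, exam_part1, exam_part2):
    """Weighted course grade: 50% homework, 25% for each of the two exams."""
    return 0.5 * homework + 0.25 * exam_part1 + 0.25 * exam_part2


# Hypothetical grades on an assumed 0-10 scale.
print(course_grade(homework=8.0, exam_part1=6.0, exam_part2=7.0))  # prints 7.25
```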



Previous capacities

Not specified