Advanced Statistical Modelling

Credits
6
Types
Specialization complementary (Data Science)
Requirements
This subject has no formal requirements, but it does assume previous capacities
Department
EIO;DAC
The course covers several statistical regression models: simple and multiple linear regression, parametric non-linear regression, generalized linear models, nonparametric regression, and generalized nonparametric regression. Model selection and validation are emphasized. A fundamental part of the course is the study of real cases, both by the teachers in class and by the students in the weekly assignments.

Teachers

Person in charge

  • Jose Antonio Sánchez Espigares
  • Pedro Delicado Useros

Weekly hours

Theory
3
Problems
0
Laboratory
0
Guided learning
0
Autonomous learning
7

Competences

Generic Technical Competences

Generic

  • CG3 - Capacity for mathematical modeling, calculation and experimental design in technology centers and engineering companies, particularly in research and innovation in all areas of Computer Science.

Transversal Competences

Information literacy

  • CTR4 - Capability to manage the acquisition, structuring, analysis and visualization of data and information in the area of informatics engineering, and critically assess the results of this effort.

Appropriate attitude towards work

  • CTR5 - Capability to be motivated by professional achievement and to face new challenges, and to have a broad vision of the possibilities of a career in the field of informatics engineering. Capability to be motivated by quality and continuous improvement, and to act rigorously in professional development. Capability to adapt to technological or organizational changes. Capacity to work with incomplete information and/or under time and/or resource constraints.

Reasoning

  • CTR6 - Capacity for critical, logical and mathematical reasoning. Capability to solve problems in their area of study. Capacity for abstraction: the capability to create and use models that reflect real situations. Capability to design and implement simple experiments, and analyze and interpret their results. Capacity for analysis, synthesis and evaluation.

Technical Competences of each Specialization

Specific

  • CEC2 - Capacity for mathematical modelling, calculation and experimental design in engineering technology centres and business, particularly in research and innovation in all areas of Computer Science.

Objectives

  1. At the end of the course the student will be able to propose and estimate simple and multiple linear regression models. The student will also be able to interpret and validate the estimated models.
    Related competences: CG3, CEC2, CTR4, CTR6
  2. At the end of the course the student will be able to propose, estimate, interpret and validate generalized linear models.
    Related competences: CG3, CEC2, CTR4, CTR6
  3. At the end of the course the student will be able to propose, estimate, interpret and validate non-parametric versions of linear regression models and generalized linear models.
    Related competences: CG3, CEC2, CTR4, CTR5, CTR6
  4. At the end of the course the student will know how to choose the smoothing parameters that, in nonparametric regression models, control the trade-off between a good fit to the observed sample and good generalization.
    Related competences: CG3, CEC2, CTR4, CTR5, CTR6
  5. At the end of the course the student, when facing a real modeling and/or prediction problem, will know how to choose the most suitable regression model (parametric, non-parametric or semi-parametric).
    Related competences: CG3, CEC2, CTR4, CTR5, CTR6

Contents

  1. Parametric Modelling
    1. Introduction. Deterministic models and statistical models. Parametric, nonparametric and semiparametric models. Statistical model building. Examples. Software.
    2. Normal linear models. Description of the normal linear model. Estimation by least squares. ANOVA table. Inference. Model checking. Use of categorical explanatory variables. Model selection. Prediction. Interpretation of the model and collinearity. Robust regression and outlier detection. Normal non-linear model.
    3. Generalized linear models. Description of generalized linear models. Models for binary response data. Models for count data and contingency tables. Models for lifetime response data. Estimation by maximum likelihood and through the chi-squared (χ²) statistic. Inference. Model checking.
    4. Bayesian models. Frequentist inference and inference based on the likelihood function. What is a Bayesian model? Posterior distribution. Prior predictive and posterior predictive distribution. Selection of a prior distribution. Bayesian inference.
  2. Nonparametric Modelling
    1. Nonparametric regression model. Introduction to nonparametric modeling. Local polynomial regression. The bias-variance trade-off. Kernels. Linear smoothers. Choosing the degree of the local polynomial. Choosing the smoothing parameter: Cross validation, plug-in methods, varying windows.

    2. Generalized nonparametric regression model. Nonparametric regression with binary response. Generalized nonparametric regression model. Estimation by maximum local likelihood.

    3. Inference with nonparametric regression. Variability bands. Testing for no effects. Checking a parametric model. Comparing curves.

    4. Spline smoothing. Penalized least squares nonparametric regression. Cubic splines and interpolation. Smoothing splines. B-splines and P-splines. Spline regression. Fitting generalized nonparametric regression models with splines.

    5. Generalized additive models and Semiparametric models. Multiple nonparametric regression. The curse of dimensionality. Additive models. Generalized additive models. Semiparametric models.
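As an illustration of the nonparametric topics listed above (local polynomial regression and the choice of the smoothing parameter by cross validation), the following is a minimal sketch and not course material. It assumes a Gaussian kernel, a local polynomial of degree 1, leave-one-out cross validation, and simulated data; the function names, the bandwidth grid and the data-generating curve are all hypothetical choices made for this example.

```python
import math
import random


def local_linear_fit(x0, xs, ys, h):
    """Local linear estimate at x0 using a Gaussian kernel with bandwidth h."""
    # Kernel weights: observations close to x0 count more.
    w = [math.exp(-0.5 * ((x - x0) / h) ** 2) for x in xs]
    s0 = sum(w)
    if s0 == 0.0:
        # All weights underflowed (h far too small): fall back to the global mean.
        return sum(ys) / len(ys)
    s1 = sum(wi * (x - x0) for wi, x in zip(w, xs))
    s2 = sum(wi * (x - x0) ** 2 for wi, x in zip(w, xs))
    t0 = sum(wi * y for wi, y in zip(w, ys))
    t1 = sum(wi * (x - x0) * y for wi, x, y in zip(w, xs, ys))
    det = s0 * s2 - s1 * s1
    if abs(det) < 1e-12:
        # Degenerate local design: use the kernel-weighted mean instead.
        return t0 / s0
    # Intercept of the weighted least-squares line centred at x0.
    return (s2 * t0 - s1 * t1) / det


def loocv_score(xs, ys, h):
    """Mean squared leave-one-out prediction error for bandwidth h."""
    n = len(xs)
    total = 0.0
    for i in range(n):
        # Refit without observation i, then predict it.
        x_train = xs[:i] + xs[i + 1:]
        y_train = ys[:i] + ys[i + 1:]
        pred = local_linear_fit(xs[i], x_train, y_train, h)
        total += (ys[i] - pred) ** 2
    return total / n


def choose_bandwidth(xs, ys, grid):
    """Return the bandwidth in grid with the smallest leave-one-out score."""
    return min(grid, key=lambda h: loocv_score(xs, ys, h))


# Simulated data (an assumption for this example): a sine curve plus Gaussian noise.
rng = random.Random(0)
n = 80
xs = [i / (n - 1) for i in range(n)]
ys = [math.sin(2 * math.pi * x) + rng.gauss(0.0, 0.3) for x in xs]

grid = [0.01, 0.02, 0.05, 0.1, 0.2, 0.5]
best_h = choose_bandwidth(xs, ys, grid)
print("selected bandwidth:", best_h)
```

A small bandwidth follows the sample closely but gives a noisy curve, while a large bandwidth gives a smooth curve that may miss the shape of the regression function. The leave-one-out score estimates prediction error on unseen data, which is why it can arbitrate between the two.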

Activities

Presentation of Theme 1 (parametric regression models) in class
Objectives: 1 2 5
Contents:
Theory
22.5h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
52.5h

Presentation of Theme 2 (non-parametric regression models) in class
Objectives: 3 4 5
Contents:
Theory
22.5h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
52.5h

Teaching methodology

There is one weekly three-hour session. In the first two hours the teacher presents the theoretical content. The last hour is devoted to putting that content into practice: each student brings a laptop to class and carries out the tasks proposed by the teacher. Each session ends with an assignment that students must hand in at the following session.

Evaluation methodology

Homework will be assigned during the course. Homework grades are worth 50% of the course grade.

There will be an exam on the first part of the course, held during the partial exams week, and another on the second part. Each exam has a weight of 25%.

Course Grade = 0.5 * Homework Grade + 0.25 * First-Part Exam Grade + 0.25 * Second-Part Exam Grade
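The weighting above can be checked with a short calculation. This is only a sketch: the function name and the sample grades are hypothetical, and the 0 to 10 scale used in the example is an assumption, since the guide does not state the grading scale.

```python
def course_grade(homework, exam_part1, exam_part2):
    """Weighted course grade: 50% homework, 25% for each of the two exams."""
    return 0.5 * homework + 0.25 * exam_part1 + 0.25 * exam_part2


# Hypothetical grades on an assumed 0-10 scale.
print(course_grade(8.0, 6.0, 7.0))  # prints 7.25
```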

Bibliography

Basic:

Previous capacities

Not specified

Addendum

Contents

No changes with respect to the information published in the teaching guide.

Teaching methodology

No changes with respect to the information published in the teaching guide.

Evaluation methodology

No changes with respect to the information published in the teaching guide.

Contingency plan

If face-to-face classes cannot be held, there will be online classes or videos for each session. If face-to-face exams cannot be held, the exams will be taken online.