Advanced Statistical Modeling

Credits: 6
Department: EIO
Types: Specialization complementary (Data Science)
Requirements: This subject has no requirements
The course covers several statistical regression models: simple and multiple linear regression, parametric non-linear regression, generalized linear models, nonparametric regression, and generalized nonparametric regression. Model selection and validation are emphasized. A fundamental part of the course is the study of real cases, both by the teachers in class and by the students in the weekly assignments.

Teachers

Person in charge

  • Marta Pérez Casany
  • Pedro Delicado Useros

Weekly hours

Theory: 3
Problems: 0
Laboratory: 0
Guided learning: 0
Autonomous learning: 7

Competences

Generic Technical Competences

Generic

  • CG3 - Capacity for mathematical modeling, calculation and experimental design in technology centers and engineering companies, particularly in research and innovation in all areas of Computer Science.

Transversal Competences

Solvent use of the information resources

  • CTR4 - Capability to manage the acquisition, structuring, analysis and visualization of data and information in the area of informatics engineering, and critically assess the results of this effort.

Appropriate attitude towards work

  • CTR5 - Capability to be motivated by professional achievement and to face new challenges, and to have a broad vision of the possibilities of a career in the field of informatics engineering. Capability to be motivated by quality and continuous improvement, and to act strictly on professional development. Capability to adapt to technological or organizational changes. Capacity for working in the absence of information and/or with time and/or resource constraints.

Reasoning

  • CTR6 - Capacity for critical, logical and mathematical reasoning. Capability to solve problems in their area of study. Capacity for abstraction: the capability to create and use models that reflect real situations. Capability to design and implement simple experiments, and analyze and interpret their results. Capacity for analysis, synthesis and evaluation.

Technical Competences of each Specialization

Specific

  • CEC2 - Capacity for mathematical modelling, calculation and experimental design in engineering technology centres and business, particularly in research and innovation in all areas of Computer Science.

Objectives

  1. At the end of the course the student will be able to propose and estimate simple and multiple linear regression models. The student will also be able to interpret and validate the estimated models.
    Related competences: CG3, CEC2, CTR4, CTR6
  2. At the end of the course the student will be able to propose, estimate, interpret and validate generalized linear models.
    Related competences: CG3, CEC2, CTR4, CTR6
  3. At the end of the course the student will be able to propose, estimate, interpret and validate non-parametric versions of linear regression models and generalized linear models.
    Related competences: CG3, CEC2, CTR4, CTR5, CTR6
  4. At the end of the course the student will know how to choose the smoothing parameters that, in nonparametric regression models, control the trade-off between a good fit to the observed sample and good generalization.
    Related competences: CG3, CEC2, CTR4, CTR5, CTR6
  5. At the end of the course the student, when facing a real modeling and/or prediction problem, will know how to choose the most suitable regression model (parametric, non-parametric or semi-parametric).
    Related competences: CG3, CEC2, CTR4, CTR5, CTR6
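
The trade-off named in objective 4 can be illustrated with a minimal sketch. It assumes a Nadaraya-Watson (local constant) estimator with a Gaussian kernel and leave-one-out cross-validation (LOOCV) over a small grid of bandwidths. The simulated data, the bandwidth grid and all function names are illustrative assumptions and are not taken from the course materials.

```python
import math
import random


def nw_estimate(x0, xs, ys, h, skip=None):
    """Nadaraya-Watson estimate at x0 with Gaussian kernel and bandwidth h.

    If skip is an index, that observation is left out of the average.
    """
    num = 0.0
    den = 0.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        if i == skip:
            continue
        w = math.exp(-0.5 * ((x - x0) / h) ** 2)
        num += w * y
        den += w
    return num / den


def mean_squared_error(xs, ys, h, leave_one_out):
    """Average squared residual, in-sample or leave-one-out."""
    total = 0.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        fitted = nw_estimate(x, xs, ys, h, skip=i if leave_one_out else None)
        total += (y - fitted) ** 2
    return total / len(xs)


# Simulated data (an assumption): a sine curve plus Gaussian noise.
rng = random.Random(0)
n = 100
xs = [i / (n - 1) for i in range(n)]
ys = [math.sin(2 * math.pi * x) + rng.gauss(0.0, 0.3) for x in xs]

# Hypothetical grid of candidate smoothing parameters.
bandwidths = [0.005, 0.01, 0.02, 0.05, 0.1, 0.2, 0.5]
train_err = {h: mean_squared_error(xs, ys, h, False) for h in bandwidths}
cv_err = {h: mean_squared_error(xs, ys, h, True) for h in bandwidths}
best_h = min(bandwidths, key=cv_err.get)

for h in bandwidths:
    print(f"h={h:<6} training MSE={train_err[h]:.4f}  LOOCV MSE={cv_err[h]:.4f}")
print("bandwidth chosen by LOOCV:", best_h)
```

A very small bandwidth gives a low in-sample error because the curve follows the noise, while the leave-one-out error penalizes both that overfitting and the oversmoothing produced by a very large bandwidth.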

Contents

  1. Parametric Modelling
    1. Introduction. Deterministic models and statistical models. Parametric, nonparametric and semiparametric models. Statistical model building. Examples. Software.
    2. Normal linear models. Description of the normal linear model. Estimation by least squares. ANOVA table. Inference. Model checking. Use of categorical explanatory variables. Model selection. Prediction. Interpretation of the model and collinearity. Robust regression and outlier detection. Normal non-linear model.
    3. Generalized linear models. Description of the generalized linear models. Models for binary response data. Models for count data and contingency tables. Models for lifetime response data. Estimation by maximum likelihood and through the χ² statistic. Inference. Model checking.
    4. Bayesian models. Frequentist inference and inference based on the likelihood function. What is a Bayesian model? Posterior distribution. Prior predictive and posterior predictive distribution. Selection of a prior distribution. Bayesian inference.
  2. Nonparametric Modelling
    1. Nonparametric regression model. Introduction to nonparametric modeling. Local polynomial regression. The bias-variance trade-off. Kernels. Linear smoothers. Choosing the degree of the local polynomial. Choosing the smoothing parameter: cross-validation, plug-in methods, varying windows.
    2. Generalized nonparametric regression model. Nonparametric regression with binary response. Generalized nonparametric regression model. Estimation by maximum local likelihood.
    3. Inference with nonparametric regression. Variability bands. Testing for no effects. Checking a parametric model. Comparing curves.
    4. Spline smoothing. Penalized least squares nonparametric regression. Cubic splines and interpolation. Smoothing splines. B-splines and P-splines. Spline regression. Fitting generalized nonparametric regression models with splines.
    5. Generalized additive models and semiparametric models. Multiple nonparametric regression. The curse of dimensionality. Additive models. Generalized additive models. Semiparametric models.

Activities

Presentation of Theme 1 (parametric regression models) in class

Theory: 22.5
Problems: 0
Laboratory: 0
Guided learning: 0
Autonomous learning: 52.5
Objectives: 1 2 5
Contents:

Presentation of Theme 2 (non-parametric regression models) in class

Theory: 22.5
Problems: 0
Laboratory: 0
Guided learning: 0
Autonomous learning: 52.5
Objectives: 3 4 5
Contents:

Teaching methodology

There is one three-hour session per week. The first two hours are devoted to the teacher's presentation of the theoretical content. The last hour is dedicated to putting this content into practice: each student brings a laptop to class and carries out the tasks proposed by the teacher. Each session ends with an assignment, which students must hand in at the following session.

Evaluation methodology

Homework will be assigned at the end of each session and it will be due the following session. Homework grades will be worth 50% of your course grade. The other 50% of your course grade will come from a final exam.

Course Grade = 0.5 × Homework Grade + 0.5 × Exam Grade
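
As a small worked example of this weighting, the sketch below computes a course grade from two component grades. The function name `course_grade`, the 0 to 10 grading scale and the sample grades are assumptions for illustration only.

```python
def course_grade(homework_grade, exam_grade, homework_weight=0.5):
    """Weighted average of the homework and exam grades.

    With the default weight, both components count 50%.
    """
    return homework_weight * homework_grade + (1.0 - homework_weight) * exam_grade


# Hypothetical grades on an assumed 0-10 scale.
example = course_grade(homework_grade=8.0, exam_grade=6.0)
print(example)
```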

Bibliography

Basic:

  • Applied Smoothing Techniques for Data Analysis - Bowman, A. W. and Azzalini, A., Oxford University Press, 1997.
  • Statistical Models in S - Chambers, J. M. and Hastie, T. J., Wadsworth and Brooks/Cole, 1992.
  • Generalized Additive Models - Hastie, T. J. and Tibshirani, R. J., Chapman and Hall, 1990.
  • The Elements of Statistical Learning: Data Mining, Inference, and Prediction - Hastie, T. J., Tibshirani, R. J. and Friedman, J., Springer Verlag, 2009.
  • Generalized Linear Models - McCullagh, P. and Nelder, J. A., Chapman and Hall, 1989.
  • Semiparametric Regression - Ruppert, D., Wand, M. P. and Carroll, R. J., Cambridge University Press, 2003.
  • Modern Applied Statistics with S-Plus - Venables, W. N. and Ripley, B. D., Springer Verlag, 2002.
  • All of Nonparametric Statistics - Wasserman, L., Springer, 2006.
  • Applied Linear Regression - Weisberg, S., Wiley, 2005.
  • Generalized Additive Models: An Introduction with R - Wood, S. N., Chapman and Hall/CRC, 2006.

Previous capacities

Not specified