Statistical Modelling and Design of Experiments

Credits
6
Types
Compulsory
Requirements
This subject has no formal requirements, but it assumes prior knowledge (see Previous capacities)
Department
EIO
The aim of the course is to provide students with the tools needed to cope with complex systems using statistical modeling techniques. The students also learn different techniques of experimental design.

Teachers

Person in charge

  • Pau Fonseca Casas

Others

  • Esteve Codina Sancho
  • Lidia Montero Mercadé

Weekly hours

Theory
1
Problems
1
Laboratory
2
Guided learning
0
Autonomous learning
6

Competences

Technical Competences of each Specialization

Computer networks and distributed systems

  • CEE2.3 - Capability to understand models, problems and mathematical tools to analyze, design and evaluate computer networks and distributed systems.

High performance computing

  • CEE4.1 - Capability to analyze, evaluate and design computers and to propose new techniques for improvement in their architecture.

Generic Technical Competences

Generic

  • CG1 - Capability to apply the scientific method to the study and analysis of phenomena and systems in any area of Computer Science, and to the conception, design and implementation of innovative and original solutions.
  • CG3 - Capacity for mathematical modeling, calculation and experimental design in technology centers and engineering companies, particularly in research and innovation in all areas of Computer Science.

Transversal Competences

Information literacy

  • CTR4 - Capability to manage the acquisition, structuring, analysis and visualization of data and information in the area of informatics engineering, and critically assess the results of this effort.

Reasoning

  • CTR6 - Capacity for critical, logical and mathematical reasoning. Capability to solve problems in their area of study. Capacity for abstraction: the capability to create and use models that reflect real situations. Capability to design and implement simple experiments, and analyze and interpret their results. Capacity for analysis, synthesis and evaluation.

Objectives

  1. Applying mathematical formalism to solve problems involving uncertainty.
    Related competences: CG1, CG3, CTR4, CTR6
  2. Applying queuing models to computer system performance evaluation and/or configuration analysis.
    Related competences: CEE2.3, CEE4.1, CTR6
  3. Designing and conducting experiments, and analyzing their results.
    Related competences: CG1, CG3, CTR4, CTR6

Contents

  1. Introduction to probability
    Students should feel comfortable with set notation and basic statistical terminology. They should be able to write down the sample space of simple experiments, including sampling with replacement (such as tossing coins or rolling dice), sampling without replacement, and Bernoulli trials with stopping rules. They should also be able to calculate probabilities in simple cases of these types of experiment.
  2. Introduction to statistical estimation
    Estimation, in the framework of statistical inference, is the set of techniques that aim to give an approximate value for a population parameter from the data provided by a sample. Of the different methods available (point estimation, interval estimation and Bayesian estimation), we focus on point estimation.
  3. Analysis of data
    The main objective of this section is to learn the procedures associated with the analysis of variance (ANOVA) and when it is useful to apply them. The section also introduces MANOVA, a technique that is useful when there are two or more dependent variables. We also work with linear regression and PCA, completing the repertoire of data analysis tools.
  4. Introduction to experimental design
    Statistical experimental design, also known as design of experiments (DoE), is the methodology for planning and conducting experiments so as to extract the maximum amount of information from the fewest runs, saving resources. In this section we describe different techniques to achieve that.
  5. Introduction to queuing theory
    This section introduces the student to operations research techniques for systems analysis, which support quantitative decision making under uncertainty by representing systems as queuing models.

Activities


Introduction to probability

At the end of this activity the student must be comfortable with basic set notation and terminology. The student must also be able to write down the sample space of simple experiments, including sampling with replacement (such as tossing coins or rolling dice), sampling without replacement, and Bernoulli trials with stopping rules, and to calculate probabilities in straightforward instances of these types of experiment.
Contents:
Theory
1h
Problems
1h
Laboratory
2h
Guided learning
0h
Autonomous learning
5h
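
As an illustration of the kind of exercise this activity targets, the following minimal sketch enumerates two small sample spaces and computes probabilities by counting. It assumes equally likely outcomes; the urn contents and the helper name `probability` are hypothetical and not part of the course material.

```python
from fractions import Fraction
from itertools import permutations, product


def probability(sample_space, event):
    """P(event), assuming every outcome in sample_space is equally likely."""
    favourable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favourable), len(sample_space))


# Sampling with replacement: two rolls of a fair six-sided die.
two_dice = list(product(range(1, 7), repeat=2))

# Sampling without replacement: two ordered draws from a hypothetical urn.
urn = ["red1", "red2", "blue1"]
two_draws = list(permutations(urn, 2))

p_sum_seven = probability(two_dice, lambda o: o[0] + o[1] == 7)
p_both_red = probability(two_draws, lambda o: all(b.startswith("red") for b in o))

print(len(two_dice), p_sum_seven)   # 36 1/6
print(len(two_draws), p_both_red)   # 6 1/3
```

Writing the sample space out explicitly, as above, is only practical for small experiments, but it makes the difference between the two sampling schemes visible: `product` allows repeated outcomes, `permutations` does not.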

Introduction to statistical estimation

Estimation, in the framework of statistical inference, is the set of techniques that aim to give an approximate value for a population parameter from the data provided by a sample. Of the different methods available (point estimation, interval estimation and Bayesian estimation), we focus on point estimation.
Contents:
Theory
2h
Problems
2h
Laboratory
4h
Guided learning
0h
Autonomous learning
8h
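
To make the idea of a point estimate concrete, the sketch below computes three common estimators from a sample: the sample mean, the unbiased sample variance (divisor n - 1) and the divisor-n variance. The data values are made up for illustration, and the choice of estimators is an assumption, not a list taken from the course.

```python
import statistics


def point_estimates(sample):
    """Return point estimates of the population mean and variance."""
    return {
        "mean": statistics.mean(sample),
        # Unbiased estimator: divides the sum of squares by n - 1.
        "variance_unbiased": statistics.variance(sample),
        # Divides by n; biased as an estimator of the population variance.
        "variance_n": statistics.pvariance(sample),
    }


sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical data
estimates = point_estimates(sample)
print(estimates)
```

For this sample the two variance estimates differ (32/7 versus 4), which shows that different estimators of the same parameter can give different values from the same data.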

ANOVA, introduction to MANOVA

The main objective of the activity is to learn the procedures associated with the analysis of variance (ANOVA) and when it is useful to apply them. This activity also introduces MANOVA, a technique that is useful when there are two or more dependent variables.
Contents:
Theory
1h
Problems
1h
Laboratory
2h
Guided learning
0h
Autonomous learning
5h
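
The core computation of a one-way ANOVA can be sketched as follows: the total variability is split into a between-groups and a within-groups sum of squares, and their mean squares are compared through the F statistic. This is a minimal sketch with made-up data; obtaining a p-value would additionally require the F distribution, which is left out here.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variability of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical responses for three treatments, three observations each.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_between, df_within = one_way_anova(groups)
print(f_stat, df_between, df_within)
```

A large F relative to the critical value of the F distribution with (df_between, df_within) degrees of freedom suggests that at least one group mean differs from the others.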

Linear regression

Linear regression is a mathematical method that models the relationship between a dependent variable Y, independent variables Xi and a random term. This section examines the method and illustrates its applicability through different examples.
Contents:
Theory
1h
Problems
1h
Laboratory
2h
Guided learning
0h
Autonomous learning
6h
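
For the simplest case, a single independent variable X and the model Y = b0 + b1 X + e, the least-squares estimates have a closed form. The sketch below computes them from made-up data; the function name `fit_line` is hypothetical, and the multiple-regression case covered by the general definition is not shown.

```python
def fit_line(xs, ys):
    """Least-squares estimates (intercept, slope) for y = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data lying exactly on y = 1 + 2x, so the residuals are zero.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
intercept, slope = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
print(intercept, slope)
```

With real data the residuals are not zero; they estimate the random term of the model and are the basis for checking its assumptions.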

Principal component analysis

Principal component analysis (PCA) is a statistical technique that reduces the dimensionality of a dataset. It lets us plot data with many variables in two or three dimensions by grouping the original variables into factors, or components. In this section we work with this technique from a practical point of view.
Contents:
Theory
1h
Problems
1h
Laboratory
2h
Guided learning
0h
Autonomous learning
6h
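
One way to see what PCA computes is to find the first principal component directly: it is the dominant eigenvector of the covariance matrix of the centered data. The sketch below uses power iteration in plain Python, which is a swapped-in technique chosen for brevity rather than the method used in the course; the data are made up, and the sketch assumes the covariance matrix is nonzero and the starting vector is not orthogonal to the dominant eigenvector.

```python
import math


def center(rows):
    """Subtract the column means from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance_matrix(rows):
    """Sample covariance matrix (divisor n - 1) of the columns of rows."""
    centered = center(rows)
    n, d = len(rows), len(rows[0])
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_principal_component(rows, iterations=200):
    """Dominant eigenvector of the covariance matrix, by power iteration."""
    cov = covariance_matrix(rows)
    d = len(cov)
    v = [1.0] * d  # assumed not orthogonal to the dominant eigenvector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Hypothetical 2-D data lying exactly on the line y = 2x.
rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
pc1 = first_principal_component(rows)
# One-dimensional representation: projection of each centered row on pc1.
scores = [sum(c * v for c, v in zip(row, pc1)) for row in center(rows)]
print(pc1, scores)
```

Because the example data lie on a line, a single component captures all of their variance, and the one-dimensional scores lose no information.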

Factorial design

Many experiments are conducted to study the effects of two or more factors. In these cases factorial designs, which this section presents, are the more efficient choice.
Contents:
Theory
3h
Problems
3h
Laboratory
9h
Guided learning
0h
Autonomous learning
12h
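
As a small illustration, the sketch below builds a two-level full factorial design with coded levels -1 and +1 and estimates main effects and the interaction from the responses. The response values are invented, and the helper names `full_factorial` and `effect` are hypothetical.

```python
from itertools import product


def full_factorial(n_factors):
    """All runs of a two-level full factorial design, coded as -1 / +1."""
    return list(product((-1, 1), repeat=n_factors))


def effect(runs, responses, columns):
    """Estimated effect for the given factor columns.

    One column gives a main effect; several columns give their interaction.
    The effect is the contrast divided by half the number of runs.
    """
    contrast = 0.0
    for run, y in zip(runs, responses):
        sign = 1
        for c in columns:
            sign *= run[c]
        contrast += sign * y
    return contrast / (len(runs) / 2)


runs = full_factorial(2)                  # (-1,-1), (-1,1), (1,-1), (1,1)
responses = [20.0, 30.0, 40.0, 52.0]      # hypothetical, one per run
effect_a = effect(runs, responses, [0])
effect_b = effect(runs, responses, [1])
effect_ab = effect(runs, responses, [0, 1])
print(effect_a, effect_b, effect_ab)      # 21.0 11.0 1.0
```

All four runs contribute to every estimate, which is the source of the efficiency of factorial designs compared with changing one factor at a time.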

Randomized blocks, Latin squares and related designs

In many research problems it is necessary to design experiments that systematically control the variability caused by different sources. This section considers some experimental designs for these situations.
Contents:
Theory
1h
Problems
1h
Laboratory
2h
Guided learning
0h
Autonomous learning
6h
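
A Latin square assigns each treatment exactly once to every row and every column, so that two nuisance sources of variability are controlled at once. The sketch below builds one by cyclic shifts and checks the defining property; the treatment labels are hypothetical, and in practice rows, columns and labels would also be randomized, which is not shown.

```python
def cyclic_latin_square(treatments):
    """Latin square in which each row is the previous one shifted by one."""
    n = len(treatments)
    return [[treatments[(r + c) % n] for c in range(n)] for r in range(n)]


def is_latin_square(square):
    """True if every symbol appears exactly once per row and per column."""
    n = len(square)
    symbols = set(square[0])
    if len(symbols) != n:
        return False
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok


square = cyclic_latin_square(["A", "B", "C", "D"])
for row in square:
    print(" ".join(row))
print(is_latin_square(square))  # True
```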

Incomplete block design

Description of incomplete block designs, useful when not all treatment combinations can be run within each block.
Contents:
Theory
1h
Problems
1h
Laboratory
2h
Guided learning
0h
Autonomous learning
6h
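
A common special case is the balanced incomplete block design, in which every pair of treatments appears together in the same number of blocks. The sketch below constructs one from all 3-element subsets of 4 hypothetical treatments and checks the balance conditions; the construction is chosen for brevity and is only one of many possible designs.

```python
from collections import Counter
from itertools import combinations


def replication_counts(blocks):
    """How many blocks each treatment appears in."""
    return Counter(t for block in blocks for t in block)


def pair_counts(blocks):
    """How many blocks each unordered pair of treatments shares."""
    counts = Counter()
    for block in blocks:
        for pair in combinations(sorted(block), 2):
            counts[pair] += 1
    return counts


treatments = ["A", "B", "C", "D"]              # v = 4 treatments
blocks = list(combinations(treatments, 3))     # b = 4 blocks of size k = 3

reps = replication_counts(blocks)              # each treatment: r = 3
pairs = pair_counts(blocks)                    # each pair: lambda = 2
all_pairs = list(combinations(treatments, 2))
balanced = len({pairs[p] for p in all_pairs}) == 1
print(dict(reps), balanced)
```

The counts satisfy the usual identities b·k = v·r (12 = 12) and λ(v - 1) = r(k - 1) (6 = 6), which are necessary conditions for a balanced design.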

General structure of queuing models

Introduction to queuing theory models. Kendall notation.
Contents:
Theory
1h
Problems
1h
Laboratory
2h
Guided learning
0h
Autonomous learning
5h
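
Kendall notation describes a queue as A/S/c, optionally extended to A/S/c/K/N/D (arrival process, service distribution, number of servers, system capacity, calling population, discipline). The sketch below parses such a string, assuming this field order and the usual defaults of infinite capacity, infinite population and FIFO discipline; some texts order the optional fields differently, and the names `QueueSpec` and `parse_kendall` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueSpec:
    arrivals: str                  # e.g. "M" (Markovian), "D", "G"
    service: str
    servers: int
    capacity: float = math.inf     # K: maximum number of jobs in the system
    population: float = math.inf   # N: size of the calling population
    discipline: str = "FIFO"       # D: service discipline


def _size(text):
    """Parse a capacity or population field; 'inf' means unbounded."""
    return math.inf if text.lower() in ("inf", "∞") else int(text)


def parse_kendall(text):
    """Parse 'A/S/c' or the extended 'A/S/c/K/N/D' form (assumed order)."""
    parts = text.strip().split("/")
    if not 3 <= len(parts) <= 6:
        raise ValueError(f"expected 3 to 6 fields, got {len(parts)}")
    return QueueSpec(
        arrivals=parts[0],
        service=parts[1],
        servers=int(parts[2]),
        capacity=_size(parts[3]) if len(parts) > 3 else math.inf,
        population=_size(parts[4]) if len(parts) > 4 else math.inf,
        discipline=parts[5] if len(parts) > 5 else "FIFO",
    )


print(parse_kendall("M/M/1"))
print(parse_kendall("M/D/2/10"))
```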

Queuing models based on birth and death processes

Introduction to basic concepts and elements of the analysis of Markov processes. Markov queues.
Contents:
Theory
1h
Problems
1h
Laboratory
2h
Guided learning
0h
Autonomous learning
5h
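
The simplest Markov queue, M/M/1, is a birth and death process with constant birth rate λ (arrivals) and death rate μ (service completions). For ρ = λ/μ < 1 its steady-state probabilities are p_n = (1 - ρ)ρ^n, from which the standard performance measures follow. The sketch below evaluates them; the rates used are made up.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state measures of an M/M/1 queue (requires rho < 1)."""
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    rho = arrival_rate / service_rate
    if rho >= 1:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return {
        "utilization": rho,
        "L": rho / (1 - rho),                         # mean jobs in system
        "Lq": rho ** 2 / (1 - rho),                   # mean jobs waiting
        "W": 1 / (service_rate - arrival_rate),       # mean time in system
        "Wq": rho / (service_rate - arrival_rate),    # mean waiting time
    }


def mm1_state_probability(n, arrival_rate, service_rate):
    """Steady-state probability of n jobs in the system."""
    rho = arrival_rate / service_rate
    return (1 - rho) * rho ** n


metrics = mm1_metrics(arrival_rate=2.0, service_rate=4.0)  # hypothetical rates
print(metrics)
```

The results satisfy Little's law, L = λW and Lq = λWq, which is a useful sanity check on any queuing calculation.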

Generalized queuing models with non-exponential distributions

Introduction to general service distributions and multiple types of work.
Contents:
Theory
1h
Problems
1h
Laboratory
2h
Guided learning
0h
Autonomous learning
5h
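
For a single server with Poisson arrivals and a general service distribution (M/G/1), the Pollaczek-Khinchine formula gives the mean queue length from the mean and variance of the service time alone. The sketch below is a minimal illustration with made-up parameters; it covers a single job type only, not the multiple types of work mentioned above.

```python
def mg1_metrics(arrival_rate, mean_service, var_service):
    """Mean-value measures of an M/G/1 queue (Pollaczek-Khinchine)."""
    rho = arrival_rate * mean_service
    if not 0 < rho < 1:
        raise ValueError("utilization must lie strictly between 0 and 1")
    lq = (arrival_rate ** 2 * var_service + rho ** 2) / (2 * (1 - rho))
    wq = lq / arrival_rate          # Little's law applied to the queue
    w = wq + mean_service           # waiting time plus service time
    return {"utilization": rho, "Lq": lq, "Wq": wq, "W": w, "L": arrival_rate * w}


# Hypothetical server: 2 arrivals per unit time, mean service time 0.25.
exponential = mg1_metrics(2.0, 0.25, var_service=0.25 ** 2)  # variance = mean^2
deterministic = mg1_metrics(2.0, 0.25, var_service=0.0)      # constant service
print(exponential["Lq"], deterministic["Lq"])                # 0.5 0.25
```

With exponential service the formula reproduces the M/M/1 result, and with constant service the mean queue length is halved, which shows how service-time variability alone drives waiting.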

Exponential queuing models in series

Networks of queues: open and closed networks.
Contents:
Theory
1h
Problems
1h
Laboratory
2h
Guided learning
0h
Autonomous learning
5h
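
For an open network of single-server exponential queues in series fed by Poisson arrivals, each station can be analyzed as an independent M/M/1 queue with the same arrival rate, since the departure process of a stable M/M/1 queue is again Poisson (Burke's theorem). The sketch below uses this to compute the mean number of jobs per station; the rates are made up, and closed networks are not covered.

```python
from fractions import Fraction


def tandem_mean_jobs(arrival_rate, service_rates):
    """Mean number of jobs at each station of a tandem of M/M/1 queues."""
    per_station = []
    for mu in service_rates:
        rho = Fraction(arrival_rate) / Fraction(mu)
        if rho >= 1:
            raise ValueError("every station needs arrival rate < service rate")
        per_station.append(rho / (1 - rho))
    return per_station


arrival_rate = 1                        # hypothetical external arrival rate
per_station = tandem_mean_jobs(arrival_rate, [2, 4])
total_jobs = sum(per_station)
# Little's law for the whole network gives the mean end-to-end time.
mean_time_in_network = total_jobs / arrival_rate
print(per_station, total_jobs, mean_time_in_network)
```

The end-to-end time of 4/3 equals the sum of the per-station M/M/1 times, 1/(2 - 1) + 1/(4 - 1), as expected.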

First report


Week: 5
Type: assignment
Theory
0h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
5h

Second report


Week: 10
Type: assignment
Theory
0h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
5h

Third report


Week: 15
Type: assignment
Theory
0h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
5h

Final exam


Week: 15 (Outside class hours)
Type: final exam
Theory
0h
Problems
0h
Laboratory
0h
Guided learning
2h
Autonomous learning
10h

Teaching methodology

The course is practical: by its end, and building on the work done in the sessions, students should be able to solve real problems similar to those developed in class.

Evaluation methodology

Students must solve several exercises during the course, which together account for 80% of the final grade.
A final exam accounts for the remaining 20%.
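
The weighting can be written as a short calculation. The sketch below assumes that the exercises have already been combined into a single mark on the same scale as the exam; how the individual exercises are combined is not specified here, and the function name and example marks are hypothetical.

```python
EXERCISES_WEIGHT = 0.8
EXAM_WEIGHT = 0.2


def final_grade(exercises_mark, exam_mark):
    """Weighted final grade; both marks are assumed to share one scale."""
    return EXERCISES_WEIGHT * exercises_mark + EXAM_WEIGHT * exam_mark


print(round(final_grade(7.0, 9.0), 2))  # 7.4
```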

Bibliography

Basic:

  • Statistics for experimenters : an introduction to design, data analysis, and model building - Box, George E. P.; Hunter, William Gordon; Hunter, J. Stuart, John Wiley and Sons, cop. 1978. ISBN: 0-471-09315-7
    http://cataleg.upc.edu/record=b1006823~S1*cat
  • Design and Analysis of Experiments - Montgomery, Douglas C., Wiley. ISBN: 1118146921
  • An Introduction to queueing systems - Bose, Sanjay K., Kluwer Academic/Plenum, 2002.
    http://home.iitk.ac.in/~skb/qbook/qbook.html
  • Estadística per a enginyers informàtics [Recurs electrònic] - González, José A, Edicions UPC, 2008. ISBN: 9788498803532
    http://cataleg.upc.edu/record=b1345832~S1*cat
  • Probability and statistics for computer scientists - Baron, Michael, Chapman & Hall, 2007.

Complementary:

  • The Art of computer systems performance analysis : techniques for experimental design, measurement, simulation, and modeling - Jain, Raj, John Wiley & Sons, cop. 1991. ISBN: 0471503363
    http://cataleg.upc.edu/record=b1080952~S1*cat
  • Probability and statistics with reliability, queuing and computer science applications - Trivedi, Kishor Shridharbhai, John Wiley & Sons, cop. 2002 [i.e. 2001]. ISBN: 0471333417
    http://cataleg.upc.edu/record=b1201882~S1*cat
  • Introduction to operations research - Hillier, Frederick S.; Lieberman, Gerald J., McGraw-Hill College, 1995. ISBN: 978-0078414473
  • Operations research : applications and algorithms - Winston, Wayne L, Brooks/Cole - Thomson Learning, cop. 2004. ISBN: 0534423620
    http://cataleg.upc.edu/record=b1243939~S1*cat
  • Practical reliability engineering - O'Connor, Patrick D. T.; Newton, David; Bromley, Richard, Wiley.
  • Probability models for computer science - Ross, Sheldon M, Harcourt/Academic Press, cop. 2002. ISBN: 9780125980517
    http://cataleg.upc.edu/record=b1312261~S1*cat

Previous capacities

Students must have sufficient knowledge of algebra and mathematical analysis to assimilate the concepts related to the algebra of sets, numerical series, real functions of one or more variables, differentiation and integration.