Probability and Statistics

Credits
6
Types
Compulsory
Requirements
  • Prerequisite: M1
  • Prerequisite: M2
Department
EIO
The subject deals with random phenomena: how to model them, how to quantify the predictive capacity of a forecast, and how to validate an improvement in a software product. Probability and statistics are the basis of scientific, technical and quality improvement methods. They allow computer scientists to quantify the guarantees they offer to customers and partners.

Teachers

Person in charge

  • Jose Antonio González Alastrue

Others

  • Bhumika Ashvinbhai Patel
  • Eduard Molins Lleonart
  • Erik Cobo Valeri
  • Joan Garcia Subirana
  • Jordi Cortés Martínez
  • Jordi Escayola Mansilla
  • Klaus Gerhard Langohr
  • Mireia Lopez Beltran
  • Nuria Perez Alvarez
  • Roser Rius Carrasco
  • Sergi Ramirez Mitjans

Weekly hours

Theory
2
Problems
2
Laboratory
0
Guided learning
0.4
Autonomous learning
5.6

Competences

Technical Competences

Common technical competencies

  • CT1 - To demonstrate knowledge and comprehension of essential facts, concepts, principles and theories related to informatics and their disciplines of reference.
    • CT1.2A - To interpret, select and assess concepts, theories, uses and technological developments related to computer science and its application, derived from the necessary fundamentals of mathematics, statistics and physics. Capacity to solve the mathematical problems that arise in engineering. Ability to apply knowledge of algebra, differential and integral calculus and numerical methods, statistics and optimization.
  • CT8 - To plan, conceive, deploy and manage computer projects, services and systems in every field, to lead their start-up and continuous improvement, and to assess their economic and social impact.
    • CT8.3 - To demonstrate knowledge and be able to apply appropriate techniques for modelling and analysing different kinds of decisions.

Transversal Competences

Reasoning

  • G9 [Assessable] - Capacity for critical, logical and mathematical reasoning. Capacity to solve problems in their field of study. Abstraction capacity: capacity to create and use models that reflect real situations. Capacity to design and perform simple experiments and to analyse and interpret their results. Analysis, synthesis and evaluation capacity.
    • G9.2 - Analysis and synthesis capacity, capacity to solve problems in their field, and to interpret the results critically. Abstraction capacity: capacity to create and use models which reflect real situations. Capacity to design and perform simple experiments and to analyse and interpret their results critically.

Objectives

  1. Define and calculate probabilities for a random experiment.
    Related competences: CT1.2A
  2. Calculate conditional and joint probabilities, detect whether there is (in)dependence in a random experiment with two variables, and apply Bayes' theorem to find the conditional probabilities for the other variable.
    Related competences: CT1.2A
  3. Graphically represent a random experiment.
    Related competences: CT1.2A
  4. Calculate the mean and variance of a discrete random variable from its probability and distribution functions.
    Related competences: CT1.2A
  5. Identify the most appropriate theoretical model to represent a given random variable from among the following: Bernoulli, binomial, Poisson, geometric, Normal, uniform and exponential.
    Related competences: CT1.2A
  6. Calculate cumulative probabilities for given values from the parameters of theoretical models, with the help of tables or R; conversely, find the values of the random variable that correspond to desired cumulative probabilities.
    Related competences: CT1.2A, CT8.3, G9.2
    Subcompetences:
    • They should also be able to identify and analyse theoretical models suitable for different IT situations.
  7. Calculate and interpret the covariance and correlation values for two random variables.
    Related competences: CT1.2A
  8. Calculate, using sample data, statistics that reflect central tendency (mean) and dispersion (variance and standard deviation).
    Related competences: CT1.2A
  9. From sample statistics obtained from a simple random sample (s.r.s.), compute confidence intervals for certain parameters. For example: from the mean, the standard deviation and the sample size of a Normally distributed variable, calculate the 95% CI.
    Related competences: CT1.2A
  10. Based on a hypothesis and the sample mean and standard deviation for a Normally distributed variable, calculate the P-value and justify the evidence against the hypothesis.
    Related competences: CT1.2A, CT8.3
  11. From the data of a comparative test (e.g., performance of two computer products), use the confidence interval to obtain a range of plausible values for the difference in the outcome.
    Related competences: CT1.2A, CT8.3
  12. Using the summary of the model, obtain and interpret the estimators of the model, compute and interpret the R-squared coefficient, obtain the estimators of the uncertainty of the estimate and build a CI for the population values.
    Related competences: CT1.2A
  13. Make predictions and assess their degree of uncertainty using summary data from the fitted model.
    Related competences: CT1.2A, CT8.3
  14. Based on the graphs of the fitted model, analyse the premises (assumptions) of the model and, if necessary, propose transformations of the variables.
    Related competences: CT1.2A, CT8.3
  15. Design a prediction study, collect data and analyse and interpret results.
    Related competences: CT1.2A, CT8.3, G9.2
    Subcompetences:
    • They should also be able to use the collected data to describe tendency and dispersion characteristics in numerical and graphical terms.
  16. Identify, for a deterministic process, variability sources and magnitudes.
    Related competences: CT1.2A, CT8.3, G9.2
    Subcompetences:
    • They should also be able to use collected data to describe tendency and dispersion characteristics in numerical and graphical terms.
  17. Design a comparative test of computer products, collect data and analyse and interpret results.
    Related competences: CT1.2A, CT8.3, G9.2
    Subcompetences:
    • They should also be able to use the collected data to describe tendency and dispersion characteristics in numerical and graphical terms.

Contents

  1. Block A. Probability and random variables
    Random experiment. Probability, conditional probability, joint probability. Definition of random variable and types. Probability function, probability density function and probability distribution function. Joint probability function. Indicators: expectation, variance, standard deviation, covariance, correlation. Independence between two random variables.
  2. Block B. Probabilistic models
    Parameterised theoretical models of random variables. Direct and inverse probabilities computation, with R. Introduction to simulation. Sample mean distribution. Central Limit Theorem, Normal approximations.
  3. Block C. Basis of statistics
    Population and sample. Parameter, statistic and estimator. Bias of an estimator. Confidence interval for a parameter, and for the difference of two parameters. Hypothesis testing.
  4. Block D. Statistical models and forecasting
    Comparison of two groups, paired design, independent samples. Linear model. Goodness-of-fit indicators. Validation of the premises. Introduction to data science. Experimental and observational studies. Ethics of science, research waste.
  5. Block T. Application.
    Identifying sources of variability in computer processes. Design of a study with planning of the goal, data collection, statistical analysis with R and results interpretation.

Activities


Block A activities. Probability and random variables

Place probability and statistics in context, especially in the IT field. Provide a grounding in probability. Be able to calculate and analyse joint and conditional probabilities. Analyse whether or not there is independence. Define random variable (RV), discrete and continuous RV. Define probability function, cumulative probability function and joint probability function. Relate RV indicators to sample indicators.
  • Theory: Tests to monitor pre-reading and study. Explanation of topics: foundations of probability and statistics; independence and conditional and joint probabilities; definition of random variable and of probability and probability distribution functions; random variable metrics and relationship to sample indicators.
  • Problems: Model examples of the topics. Follow-up tests. Completion of set exercises. Discussion of results.
  • Guided learning: Problem resolution in a mid-semester or final exam.
  • Autonomous learning: Study of materials before the theory sessions. Problem completion in E-status.
Objectives: 16 1 2 3 4 7
Contents:
Theory
6h
Problems
6h
Laboratory
0h
Guided learning
0h
Autonomous learning
15h
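
As an illustration of the Block A topics (joint and conditional probabilities, independence, Bayes' theorem), here is a minimal Python sketch. The joint probability table and the function names are hypothetical and chosen only for this example; the course itself works with R.

```python
from fractions import Fraction

# Hypothetical joint table P(X = x, Y = y) for two binary variables.
# X: server under heavy load (1) or not (0); Y: request fails (1) or not (0).
joint = {
    (0, 0): Fraction(63, 100),
    (0, 1): Fraction(7, 100),
    (1, 0): Fraction(21, 100),
    (1, 1): Fraction(9, 100),
}

def marginal(joint, axis):
    """Marginal distribution of the variable at position axis (0 = X, 1 = Y)."""
    out = {}
    for outcome, p in joint.items():
        out[outcome[axis]] = out.get(outcome[axis], 0) + p
    return out

def conditional(joint, given_axis, given_value):
    """Distribution of the other variable, given that one variable equals given_value."""
    p_given = marginal(joint, given_axis)[given_value]
    other = 1 - given_axis
    return {
        outcome[other]: p / p_given
        for outcome, p in joint.items()
        if outcome[given_axis] == given_value
    }

def independent(joint):
    """True if P(x, y) = P(x) * P(y) for every cell of the table."""
    px, py = marginal(joint, 0), marginal(joint, 1)
    return all(joint[(x, y)] == px[x] * py[y] for x in px for y in py)

def bayes(p_b_given_a, p_a, p_b):
    """Bayes' theorem: P(A | B) = P(B | A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

p_fail_given_load = conditional(joint, 0, 1)[1]   # P(Y=1 | X=1)
p_load = marginal(joint, 0)[1]                    # P(X=1)
p_fail = marginal(joint, 1)[1]                    # P(Y=1)
print(bayes(p_fail_given_load, p_load, p_fail))   # P(X=1 | Y=1), prints 9/16
print(independent(joint))                         # prints False
```

Exact fractions are used so that the independence check is not affected by floating-point rounding.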

Block B activities. Probabilistic models

Define the theoretical, discrete and continuous models typically used in the IT field and their characteristics and parameters.
  • Theory: Tests to monitor pre-reading and study. Explanation of topics: define the theoretical, discrete and continuous models typically used in the IT field, with their characteristics and parameters, and calculate direct and inverse probabilities with the defined models.
  • Problems: Model examples of the topics. Follow-up tests. Completion of set exercises. Discussion of results.
  • Guided learning: Problem resolution in a mid-semester or final exam.
  • Autonomous learning: Study of materials before the theory sessions. Problem completion in E-status.
Objectives: 5 6
Contents:
Theory
6h
Problems
6h
Laboratory
0h
Guided learning
0h
Autonomous learning
15h
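
The direct and inverse probability calculations of Block B can be sketched as follows. This is only an illustrative Python version under the assumption that the standard library is enough; in the course these values are obtained with tables or with R (for instance `pnorm`, `qnorm`, `pbinom` and `qbinom`). The helper names are hypothetical.

```python
from math import comb, exp, factorial
from statistics import NormalDist

def binomial_cdf(k, n, p):
    """Direct probability P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def poisson_cdf(k, lam):
    """Direct probability P(X <= k) for X ~ Poisson(lam)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))

def binomial_quantile(q, n, p):
    """Inverse probability: smallest k such that P(X <= k) >= q."""
    for k in range(n + 1):
        if binomial_cdf(k, n, p) >= q:
            return k
    return n

# Continuous case: standard Normal model.
z = NormalDist(mu=0, sigma=1)
direct = z.cdf(1.96)         # P(Z <= 1.96), roughly 0.975
inverse = z.inv_cdf(0.975)   # value with cumulative probability 0.975, roughly 1.96

print(binomial_cdf(2, 10, 0.5))        # P(X <= 2) = 56/1024
print(binomial_quantile(0.5, 10, 0.5)) # median of Binomial(10, 0.5)
print(direct, inverse)
```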

Mid-semester exam 1

Mid-semester exam consisting of problems corresponding to topics 1 to 3 (learning objectives 1 to 8).
Objectives: 16 1 2 3 4 5 6 7
Week: 7 (Outside class hours)
Type: theory exam
Theory
0h
Problems
0h
Laboratory
0h
Guided learning
2h
Autonomous learning
6h

Block C activities. Basis of statistics

Basic population, sampling, parameter and estimator concepts. Introduction to statistics; definition and linking of confidence intervals (CI) and hypothesis testing (HT).
  • Theory: Tests to monitor pre-reading and study. Explanation of topics: definition of sample, parameter, estimator and statistic for constructing confidence intervals (CI) and description of the statistics defining the more interesting CIs and HTs in an IT setting.
  • Problems: Model examples of the topics. Follow-up tests. Completion of set exercises. Discussion of results.
  • Guided learning: Problem resolution in a mid-semester or final exam.
  • Autonomous learning: Study of materials before the theory sessions. Problem completion in E-status.
Objectives: 8 9 10 11
Contents:
Theory
6h
Problems
6h
Laboratory
0h
Guided learning
0h
Autonomous learning
15h
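
A confidence interval and a P-value of the kind described in Block C can be sketched in Python as below. Note the assumption: this sketch uses the Normal quantile, which is appropriate when the standard deviation is known or the sample is large; with a small sample and an estimated standard deviation, the Student's t quantile would be used instead. The function names and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci(mean, sd, n, confidence=0.95):
    """Confidence interval for a mean using the Normal quantile (large-sample assumption)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sd / sqrt(n)
    return mean - half_width, mean + half_width

def two_sided_p_value(mean, sd, n, mu0):
    """Two-sided P-value for H0: mu = mu0, using the Normal approximation."""
    z_stat = (mean - mu0) / (sd / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z_stat)))

# Hypothetical data: response times (ms) of 100 requests.
low, high = mean_ci(mean=120.0, sd=15.0, n=100)
p = two_sided_p_value(mean=120.0, sd=15.0, n=100, mu0=117.0)
print(round(low, 2), round(high, 2))  # 95% CI for the mean
print(round(p, 4))                    # evidence against H0: mu = 117
```

A small P-value, together with a CI that excludes the hypothesised value, points the same way: the two tools are linked, as the block description says.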

Block D activities. Statistical models and forecasting

Definition of statistical models. Analysis of variability. Paired design and independent samples. Linear model. Validation of premises, possible transformations, predictions. Some models for data science. Implications of research.
  • Theory: Tests to monitor pre-reading and study. Explanation of topics: define the suitable model, validate it and analyse transformations, obtain effect estimates and make predictions.
  • Problems: Model examples of the topics. Follow-up tests. Completion of set exercises. Discussion of results.
  • Guided learning: Problem resolution in a mid-semester or final exam.
  • Autonomous learning: Study of materials before the theory sessions. Problem completion in E-status.
Objectives: 12 13 14
Contents:
Theory
6h
Problems
6h
Laboratory
0h
Guided learning
0h
Autonomous learning
15h
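
To illustrate the linear model of Block D, the following Python sketch fits a simple regression line by least squares, computes the R-squared coefficient and makes a point prediction. The data and function names are hypothetical; in the course, the equivalent figures are read from the model summary produced by R.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares estimates (intercept, slope) for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def r_squared(x, y, intercept, slope):
    """Proportion of the variability of y explained by the fitted line."""
    y_bar = mean(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Hypothetical data: input size (thousands of records) vs. run time (seconds).
size = [1, 2, 3, 4, 5]
time = [2.1, 3.9, 6.2, 7.8, 10.0]

b0, b1 = fit_line(size, time)
r2 = r_squared(size, time, b0, b1)
prediction = b0 + b1 * 6   # point prediction for a new input size

print(round(b0, 2), round(b1, 2))
print(round(r2, 4))
print(round(prediction, 2))
```

This sketch gives only the point estimates; the uncertainty of the estimates and of the prediction, and the residual plots used to validate the premises, are left out.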

Application activities

Identify problems in the IT field for a probability or statistical study. Design a study, collect data and analyse and interpret results. Summarise conclusions critically.
  • Theory: Propose and provide guidance for the probability and/or statistics studies performed by students. Monitor studies and encourage synthetic and critical evaluations.
  • Problems: Guidance and monitoring of probability and/or statistics studies. Guidance and monitoring regarding practical probability and statistics components.
  • Autonomous learning: Research computer situations where a probability or statistical study is necessary. Study design, data collection, results analysis and interpretation.
Objectives: 16 17 15
Contents:
Theory
6h
Problems
6h
Laboratory
0h
Guided learning
0h
Autonomous learning
12h

Mid-semester exam 2

Mid-semester exam consisting of problems corresponding to topics 4 to 6 (learning objectives 9 to 17).
Objectives: 8 9 10 11 17 12 13 14 15
Week: 14 (Outside class hours)
Type: theory exam
Theory
0h
Problems
0h
Laboratory
0h
Guided learning
2h
Autonomous learning
6h

Final Exam

Covers all the topics.

Week: 15 (Outside class hours)
Type: final exam
Theory
0h
Problems
0h
Laboratory
0h
Guided learning
2h
Autonomous learning
0h

Teaching methodology

The subject is based on active learning by the student, guided by the teacher with the help of e-status (an interactive platform that, with data individualized for each exercise, supports both assessment and learning through immediate feedback).

The teaching scheme of the 4 specific blocks consists of repeated cycles of: presentation of theoretical concepts, numerical resolution of exercises, support for exercises with R (using laptops in the classroom, or in the laboratory), follow-up tests by the teacher, and autonomous practice of exercises.

The application block develops the transversal competence through group work on a specific case proposed by the students, analysed with R, under the direction of the teacher.

Evaluation methodology

The grade for the subject is obtained through continuous assessment (AC) over the 15 weeks of class, or through the final exam (EF).

PE is divided into 5 topics or blocks: 4 specific (A, B, C, D) and one cross-disciplinary statistical application block (T).

Each block results in a Block Grade (NB.i, i = A,B,C,D,T). The following formula is applied in the AC:
AC = [NB.A + NB.B + NB.C + NB.D + NB.T] / 5

If AC >= 5, the student is exempt from the final exam.

Note that EF may take into account the grade for the transversal competence (NB.T):
EF = max {ef, (4 ef + NB.T) / 5}
where "ef" is the actual grade for the final exam.

The course grade of the subject PE is max(AC, EF).
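
The formulas above for AC, EF and the course grade can be sketched in Python as follows. The function names and the sample grades are hypothetical. Treating a student who does not sit the final exam as receiving AC is an assumption of this sketch, not something the text states.

```python
def continuous_assessment(nb):
    """AC = [NB.A + NB.B + NB.C + NB.D + NB.T] / 5."""
    blocks = ("A", "B", "C", "D", "T")
    return sum(nb[b] for b in blocks) / 5

def final_exam_grade(ef, nb_t):
    """EF = max{ef, (4 ef + NB.T) / 5}, where ef is the actual exam grade."""
    return max(ef, (4 * ef + nb_t) / 5)

def course_grade(nb, ef=None):
    """Course grade = max(AC, EF); AC alone if no final exam was taken (assumption)."""
    ac = continuous_assessment(nb)
    if ef is None:
        return ac
    return max(ac, final_exam_grade(ef, nb["T"]))

# Hypothetical block grades.
grades = {"A": 6.0, "B": 4.0, "C": 5.0, "D": 7.0, "T": 8.0}
print(continuous_assessment(grades))   # AC
print(final_exam_grade(6.5, 8.0))      # EF when ef = 6.5
print(course_grade(grades, ef=6.5))    # max(AC, EF)
```

Because of the max in EF, a good NB.T can raise the final exam grade but never lowers it.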

The grade for the transversal competence is:
A if NB.T >= 8.5; B if 6.5 <= NB.T < 8.5; C if 5 <= NB.T < 6.5; and D if NB.T < 5
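
The thresholds for the transversal competence can be written as a small Python function. This is a sketch; the function name is hypothetical.

```python
def transversal_grade(nb_t):
    """Map the T-block grade NB.T to the letter grade of the transversal competence."""
    if nb_t >= 8.5:
        return "A"
    if nb_t >= 6.5:
        return "B"
    if nb_t >= 5:
        return "C"
    return "D"

for value in (9.0, 8.5, 7.0, 5.0, 4.9):
    print(value, transversal_grade(value))
```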

Calculating NB.i grades:
- The first 4 blocks are each assessed through a Block Problem (PB.i, i = A,B,C,D) in a mid-semester exam held outside class hours. Usually there are 2 exams, which together give the grades for the 4 blocks.

In addition, a Block Monitoring factor (SB.i, i = A,B,C,D) is obtained for each of the four theoretical blocks, based on 3 tests: 2 written tests done in the classroom, and a mark for problems solved outside the classroom. The SB.i factor increases the grade for the corresponding Block Problem (PB.i) to obtain the Block Grade according to:
NB.i = min (10, PB.i * SB.i) for i = A,B,C,D
(The SB.i factor is 1 + Sum pj, where each pj is a number between 0 and 0.05 that comes from one of the block monitoring tests; the number of tests may be less than 3 if unforeseen changes to the school calendar cause classes to be lost.)

- The T-Block grade (NB.T) is calculated on the basis of two reports and a final presentation.
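
The Block Grade rule for the four theoretical blocks, NB.i = min (10, PB.i * SB.i) with SB.i = 1 + Sum pj, can be sketched in Python as below. The function name and the sample values are hypothetical, and rejecting a pj outside the stated range is a choice made for this sketch.

```python
def block_grade(pb, bonuses):
    """NB.i = min(10, PB.i * SB.i), with SB.i = 1 + sum of pj and each pj in [0, 0.05]."""
    for pj in bonuses:
        if not 0 <= pj <= 0.05:
            raise ValueError("each pj must be between 0 and 0.05")
    sb = 1 + sum(bonuses)
    return min(10, pb * sb)

# Hypothetical values: Block Problem grade 7 and three monitoring tests.
print(block_grade(7.0, [0.05, 0.03, 0.02]))  # 7 * 1.10, about 7.7
print(block_grade(9.5, [0.05, 0.05, 0.05]))  # capped at 10
print(block_grade(6.0, []))                  # no monitoring tests: NB.i = PB.i
```

With at most 3 tests, SB.i is at most 1.15, so the monitoring tests can raise a Block Problem grade by up to 15%.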

Previous capacities

Students need to be sufficiently knowledgeable about algebra and mathematical analysis to be able to assimilate concepts related to the algebra of sets, numerical series, functions of real variables of one or more dimensions, differentiation and integration. They should also be able to understand technical English.