Probability and Statistics

Credits: 6
Types: Compulsory
Requirements:
  • Prerequisite: M2
  • Prerequisite: M1
Department: EIO
The subject deals with random phenomena: how to model them, how to quantify the predictive capacity of a forecast and how to validate an improvement in a software product. Probability and statistics are the basis of scientific, technical and quality-improvement methods. They allow computer scientists to quantify the guarantees they give to customers and partners.

Teachers

Person in charge

  • José Antonio González Alastrue

Others

  • Erik Cobo Valeri
  • Joan Garcia Subirana
  • Jordi Cortés Martínez
  • Jordi Escayola Mansilla
  • Klaus Gerhard Langohr
  • Mireia Lopez Beltran
  • Miriam Mota Foix
  • Nuria Perez Alvarez
  • Roser Rius Carrasco

Weekly hours

Theory: 1
Problems: 1
Laboratory: 2
Guided learning: 0.4
Autonomous learning: 5.6

Competences

Technical Competences

Common technical competencies

  • CT1 - To demonstrate knowledge and comprehension of essential facts, concepts, principles and theories related to informatics and their disciplines of reference.
    • CT1.2A - To interpret, select and value concepts, theories, uses and technological developments related to computer science and its application, derived from the necessary fundamentals of mathematics, statistics and physics. Capacity to solve the mathematical problems that arise in engineering. Ability to apply knowledge of algebra, differential and integral calculus, numerical methods, statistics and optimization.
  • CT8 - To plan, conceive, deploy and manage computer projects, services and systems in every field, to lead their start-up and continuous improvement, and to assess their economic and social impact.
    • CT8.3 - To demonstrate knowledge and be able to apply appropriate techniques for modelling and analysing different kinds of decisions.

Transversal Competences

Reasoning

  • G9 - Capacity for critical, logical and mathematical reasoning. Capacity to solve problems in their study area. Abstraction capacity: capacity to create and use models that reflect real situations. Capacity to design and perform simple experiments and to analyse and interpret their results. Analysis, synthesis and evaluation capacity.
    • G9.2 - Analysis and synthesis capacity, capacity to solve problems in their field and to interpret the results critically. Abstraction capacity: capacity to create and use models which reflect real situations. Capacity to design and perform simple experiments and to analyse and interpret their results critically.

Objectives

  1. Define and calculate probabilities for a random experiment.
    Related competences: CT1.2A
  2. Calculate conditional and joint probabilities, detect whether there is (in)dependence in a random experiment with two variables, and apply Bayes' theorem to find the conditional probabilities of the other variable.
    Related competences: CT1.2A
  3. Graphically represent a random experiment.
    Related competences: CT1.2A
  4. Calculate the mean and variance of a discrete random variable from its probability and distribution functions.
    Related competences: CT1.2A
  5. Identify the most appropriate theoretical model to represent a given random variable from among the following: Bernoulli, binomial, Poisson, geometric, normal, uniform and exponential.
    Related competences: CT1.2A
  6. For the theoretical models, calculate cumulative probabilities for given values and parameters with the help of tables and, conversely, find the random variable values that correspond to desired cumulative probabilities.
    Related competences: CT1.2A, CT8.3, G9.2
    Subcompetences:
    • They should also be able to identify and analyse theoretical models suitable for different IT situations.
  7. Calculate and interpret the covariance and correlation values for two random variables.
    Related competences: CT1.2A
  8. Calculate, using sample data, statistics that reflect central tendency (mean) and dispersion (variance and standard deviation).
    Related competences: CT1.2A
  9. Construct a confidence interval for the mean of a normally distributed variable from the sample mean and standard deviation.
    Related competences: CT1.2A
  10. Based on a hypothesis and the sample mean and standard deviation for a normally distributed variable, calculate the P-value and justify the evidence against the hypothesis.
    Related competences: CT1.2A, CT8.3
  11. Quantify both the performance difference and the imprecision of random sampling using comparative performance test data for two computer products and report the value of the difference if the test has covered all possible situations of interest.
    Related competences: CT1.2A, CT8.3
  12. Design a comparative test of two computer products, collect data and analyse and interpret results.
    Related competences: CT1.2A, CT8.3, G9.2
    Subcompetences:
    • They should also be able to use the collected data to describe tendency and dispersion characteristics in numerical and graphical terms.
  13. Using the summary data for two variables, obtain and interpret the estimates of the regression line coefficients, compute and interpret the R-squared coefficient, estimate the uncertainty of those estimates and build a confidence interval (CI) for the population values.
    Related competences: CT1.2A
  14. Make predictions and assess their degree of uncertainty using summary data for two variables and the fitted model.
    Related competences: CT1.2A, CT8.3
  15. Analyse the model premises using graphs of the two variables and, if necessary, propose variable transformations for the fitted model.
    Related competences: CT1.2A, CT8.3
  16. Design a prediction study, collect data and analyse and interpret results.
    Related competences: CT1.2A, CT8.3, G9.2
    Subcompetences:
    • They should also be able to use the collected data to describe tendency and dispersion characteristics in numerical and graphical terms.
  17. Identify, for a deterministic process, variability sources and magnitudes.
    Related competences: CT1.2A, CT8.3, G9.2
    Subcompetences:
    • They should also be able to use the collected data to describe tendency and dispersion characteristics in numerical and graphical terms.

Contents

  1. Block 1. Probability calculations
    Probability and statistics (populations and sampling, induction and deduction, defining models and describing data). Random experiment. Probability, conditional probability, joint probability. Independence.
  2. Block 2. Random variables
    Definition of random variable. Discrete and continuous random variables. Probability function, probability density function and probability distribution function. Joint probability function. Indicators: expectation, variance, standard deviation, covariance, correlation. Independence between two random variables.
  3. Block 3. Random variable models
    Parameterised theoretical models of discrete and continuous random variables. Direct and inverse probabilities computation, with statistical tables and R. Sample mean distribution. Central Limit Theorem, Normal approximations.
  4. Block 4. Evidence: principles of inference
    Population and sample. Parameter and estimator: statistics. Bias and efficiency of an estimator. Confidence interval. Hypothesis test. P-value of a test. Error types. Power.
  5. Block 5. Experiment design
    Paired and independent samples design. Comparison of means and variances of Normal variables. Comparison of means in large samples (particular case: comparison of two proportions). Sample size.
  6. Block 6. Statistical models and forecasting
    Graphical fitting of the relationship between two numerical variables. Estimation of a linear model. Indicators of the fit quality. Validation of the premises and transformations. Predictions for an individual value and the average.
  7. Application.
    Identifying sources of variability in computer processes. Designing a study: planning of the goal, data collection, statistical analysis and interpretation of results.

Activities


Topic 1 activities. Probability calculations

Situate probability and statistics, especially in the IT setting. Basic probability concepts. Calculating and analysing conditional and joint probabilities. Analysing for (in)dependence.
  • Theory: Tests to monitor pre-reading and study. Explanation of topics: foundations of probability and statistics; independence and conditional and joint probabilities.
  • Problems: Model examples of the topics
  • Laboratory: Individual problem completion in E-status. Follow-up test. Completion of set exercises. Discussion of results.
  • Guided learning: Problem resolution in a mid-semester or final exam.
  • Autonomous learning: Study of materials before the theory sessions. Problem completion in E-status.
Objectives: 17 1 2 3
Contents:
Theory: 2h
Problems: 2h
Laboratory: 4h
Guided learning: 0h
Autonomous learning: 10h

Topic 2 activities. Random variables.

Define random variable, discrete random variable and continuous random variable. Define the probability function, the probability distribution function and the joint probability function. Link random variable indicators with sampling indicators.
  • Theory: Tests to monitor pre-reading and study. Explanation of topics: definition of random variable and of probability and probability distribution functions; random variable metrics and relationship to sample indicators.
  • Problems: Model examples of the topics
  • Laboratory: Individual problem completion in E-status. Follow-up test. Completion of set exercises. Discussion of results.
  • Guided learning: Problem resolution in a mid-semester or final exam.
  • Autonomous learning: Study of materials before the theory sessions. Problem completion in E-status.
Objectives: 4 6 7
Contents:
Theory: 2h
Problems: 2h
Laboratory: 4h
Guided learning: 0h
Autonomous learning: 10h

Topic 3 activities. Random variable models

Define the theoretical, discrete and continuous models typically used in the IT field and their characteristics and parameters.
  • Theory: Tests to monitor pre-reading and study. Explanation of topics: define the theoretical, discrete and continuous models typically used in the IT field, their characteristics and parameters, and calculate direct and inverse probabilities with the defined models.
  • Problems: Model examples of the topics
  • Laboratory: Individual problem completion in E-status. Follow-up test. Completion of set exercises. Discussion of results.
  • Guided learning: Problem resolution in a mid-semester or final exam.
  • Autonomous learning: Study of materials before the theory sessions. Problem completion in E-status.
Objectives: 5 6
Contents:
Theory: 2h
Problems: 2h
Laboratory: 4h
Guided learning: 0h
Autonomous learning: 10h

Mid-semester exam 1

Mid-semester exam consisting of problems corresponding to topics 1 to 3 (learning objectives 1 to 7 and 17).
Objectives: 17 1 2 3 4 5 6 7
Week: 7 (Outside class hours)
Type: theory exam
Theory: 0h
Problems: 0h
Laboratory: 0h
Guided learning: 2h
Autonomous learning: 6h

Topic 4 activities. Evidence: principles of inference

Basic population, sampling, parameter and estimator concepts. Introduction to statistics; definition and linking of confidence intervals (CI) and hypothesis testing (HT).
  • Theory: Tests to monitor pre-reading and study. Explanation of topics: definition of sample, parameter, estimator and statistic for constructing confidence intervals (CI) and description of the statistics defining the more interesting CIs and HTs in an IT setting.
  • Problems: Model examples of the topics
  • Laboratory: Individual problem completion in E-status. Follow-up test. Completion of set exercises. Discussion of results.
  • Guided learning: Problem resolution in a mid-semester or final exam.
  • Autonomous learning: Study of materials before the theory sessions. Problem completion in E-status.
Objectives: 8 9 10
Contents:
Theory: 2h
Problems: 2h
Laboratory: 4h
Guided learning: 0h
Autonomous learning: 10h

Topic 5 activities. Experiment design

Define tests with independent and paired samples. Situate and specify the comparison of two means (using Student's t, CIs and HTs in independent and paired samples) and the comparison of two variances (in independent samples and suitable transformations).
  • Theory: Tests to monitor pre-reading and study. Explanation of topics: comparison of means and variances.
  • Problems: Model examples of the topics
  • Laboratory: Individual problem completion in E-status. Follow-up test. Completion of set exercises. Discussion of results.
  • Guided learning: Problem resolution in a mid-semester or final exam.
  • Autonomous learning: Study of materials before the theory sessions. Problem completion in E-status.
Objectives: 11 12
Contents:
Theory: 2h
Problems: 2h
Laboratory: 4h
Guided learning: 0h
Autonomous learning: 10h

Topic 6 activities. Statistical models and forecasting

Define a relational model between two variables, analyse the variability, validate the premises, consider possible transformations and make predictions.
  • Theory: Tests to monitor pre-reading and study. Explanation of topics: define the linear model, validate it, analyse transformations and make predictions.
  • Problems: Model examples of the topics
  • Laboratory: Individual problem completion in E-status. Follow-up test. Completion of set exercises. Discussion of results.
  • Guided learning: Problem resolution in a mid-semester or final exam.
  • Autonomous learning: Study of materials before the theory sessions. Problem completion in E-status.
Objectives: 13 14 15
Contents:
Theory: 2h
Problems: 2h
Laboratory: 4h
Guided learning: 0h
Autonomous learning: 10h

Application activities

Identify problems in the IT field for a probability or statistical study. Design a study, collect data and analyse and interpret results. Summarise conclusions critically.
  • Theory: Propose and provide guidance for the probability and/or statistics studies performed by students. Monitor studies and encourage synthetic and critical evaluations.
  • Problems: Guidance and monitoring of probability and/or statistics studies.
  • Laboratory: Guidance and monitoring regarding practical probability and statistics components.
  • Autonomous learning: Research computer situations where a probability or statistical study is necessary. Study design, data collection, results analysis and interpretation.
Objectives: 17 12 16
Contents:
Theory: 3h
Problems: 3h
Laboratory: 6h
Guided learning: 0h
Autonomous learning: 12h

Mid-semester exam 2

Mid-semester exam consisting of problems corresponding to topics 4 to 6 (learning objectives 8 to 16).
Objectives: 8 9 10 11 12 13 14 15 16
Week: 14 (Outside class hours)
Type: theory exam
Theory: 0h
Problems: 0h
Laboratory: 0h
Guided learning: 2h
Autonomous learning: 6h

Final Exam

Covers all the topics.

Week: 15 (Outside class hours)
Type: final exam
Theory: 0h
Problems: 0h
Laboratory: 0h
Guided learning: 2h
Autonomous learning: 0h

Teaching methodology

The subject is based on students' active learning, guided and directed by the lecturer with the help of e-status (an interactive instant-feedback platform with individual exercise data).

The teaching method for the six specific topics consists of repeated cycles of theoretical explanations, numerical solutions of exercises, guidance in laboratory classes, follow-up tests set by the group teacher and independent practice of exercises.

The applications topic develops transferable competencies through group work on specific cases put forward by students under the lecturer's guidance.

Evaluation methodology

The subject is divided into seven topics: six specific topics and one cross-disciplinary applications topic.

For the first six topics, two mid-semester exams (depending on the calendar for each semester) result in six marks (PBi, i = 1 ... 6). For the same six topics, an assessment mark is calculated based on marks for two written exercises completed in the classroom and a mark for problems solved outside class time. These marks are transformed into a follow-up factor (SBi, i = 1 ... 6), which can increase the corresponding mark PBi to give the block mark:
NBi = min(10, PBi * SBi) for i=1..6
(The factor SBi is 1 + sum of pj, where each pj is a mark between 0 and 0.05 obtained in one of the block's tests; unexpected events that cause classes to be lost may reduce the number of marks for a given block.)
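
As an illustration only, the block mark rule can be sketched in Python. The function names and the sample marks below are hypothetical and are not part of the official course regulations; the sketch simply assumes that each follow-up mark pj lies between 0 and 0.05, as stated above.

```python
def follow_up_factor(test_marks):
    """Return SBi = 1 + sum of pj for one block (hypothetical helper).

    test_marks holds the follow-up marks pj of the block,
    each assumed to lie in the range [0, 0.05].
    """
    for p in test_marks:
        if not 0.0 <= p <= 0.05:
            raise ValueError("each follow-up mark pj must be between 0 and 0.05")
    return 1.0 + sum(test_marks)


def block_mark(pb, test_marks):
    """Return NBi = min(10, PBi * SBi) for one block."""
    return min(10.0, pb * follow_up_factor(test_marks))


# Made-up example: an exam mark of 7.0 with three follow-up marks.
example = block_mark(7.0, [0.05, 0.03, 0.02])  # roughly 7.7

# The cap applies when the product would exceed 10.
capped = block_mark(9.8, [0.05, 0.05, 0.05])   # capped at 10.0
```

The follow-up factor can only raise a block mark, never lower it, because SBi is at least 1.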

Topic 7 has no mid-semester exam; hence NB7 will be calculated on the basis of the final report and the presentation.

Given the cumulative nature of the material, the topics will have the following continuous assessment (AC) weighting:
AC = [ 10 NB1 + 11 NB2 + 12 NB3 + 13 NB4 + 14 NB5 + 15 NB6 + 10 NB7 ] / 85

Students who obtain AC >= 5 do not have to take the final exam (EF).
Note that EF can take the transferable competency mark (NB7) into account:
EF = max {ef, (75 ef + 10 NB7) / 85}
where "ef" is the mark obtained in the final exam itself.

The course mark for the subject is max(AC,EF).

The transferable competency is graded as follows:
A if NB7 >= 8.5; B if 6.5 <= NB7 < 8.5; C if 5 <= NB7 < 6.5; and D if NB7 < 5.
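
The rules above can be combined in a short Python sketch, given here only as an unofficial illustration. All function names and sample marks are hypothetical; the weights and thresholds are copied from the formulas in this section, and NB1 to NB7 are assumed to be already computed.

```python
# Weights of NB1..NB7 in the continuous assessment; they add up to 85.
WEIGHTS = (10, 11, 12, 13, 14, 15, 10)


def continuous_assessment(nb):
    """Return AC as the weighted average of the seven block marks."""
    if len(nb) != len(WEIGHTS):
        raise ValueError("exactly seven block marks (NB1..NB7) are expected")
    return sum(w * m for w, m in zip(WEIGHTS, nb)) / sum(WEIGHTS)


def final_exam_mark(ef, nb7):
    """Return EF = max(ef, (75 ef + 10 NB7) / 85)."""
    return max(ef, (75 * ef + 10 * nb7) / 85)


def course_mark(nb, ef=None):
    """Return max(AC, EF); if no final exam was taken (ef is None), return AC."""
    ac = continuous_assessment(nb)
    if ef is None:
        return ac
    return max(ac, final_exam_mark(ef, nb[6]))


def competency_grade(nb7):
    """Map NB7 to the transferable competency grade A, B, C or D."""
    if nb7 >= 8.5:
        return "A"
    if nb7 >= 6.5:
        return "B"
    if nb7 >= 5:
        return "C"
    return "D"


# Made-up example: weak block marks, a good report (NB7 = 8) and ef = 5.
marks = [3, 3, 3, 3, 3, 3, 8]
print(course_mark(marks, ef=5.0), competency_grade(marks[6]))
```

In this example NB7 is higher than ef, so the weighted expression (75 ef + 10 NB7) / 85 exceeds ef and determines EF.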

Previous capacities

Students need sufficient knowledge of algebra and mathematical analysis to assimilate concepts related to the algebra of sets, numerical series, real functions of one or more variables, differentiation and integration. They should also be able to understand technical English.