Concurrency, Parallelism and Distributed Systems

Credits
6
Types
Compulsory
Requirements
This subject has no requirements

Department
AC
This course provides the foundations of computing as a collection of tasks that may execute simultaneously and potentially interact with each other. These tasks can run on a single processor, on multiple processors, or distributed across a network. The course presents the models, challenges, algorithms and systems, focusing on three main aspects/modules: concurrency (multiple computations interacting with each other), parallelism (multiple cores or processors), and distribution (multiple computers across a network).

Following a set of introductory sessions, the course comprises three modules: concurrency (mandatory), parallelism (optional) and distribution (optional). Students must select one of the two optional modules (parallelism or distribution). The lectures are complemented with programming exercises that illustrate the problems and evaluate the solutions.

Teachers

Person in charge

  • Jordi Guitart Fernandez

Others

  • Joaquin Gabarró Vallés
  • Jorge Castro Rabal
  • Marc Gonzàlez Tallada

Weekly hours

Theory
2
Problems
0
Laboratory
2
Guided learning
0.15
Autonomous learning
5.8

Competences

Technical Competences of each Specialization

Computer networks and distributed systems

  • CEE2.1 - Capability to understand models, problems and algorithms related to distributed systems, and to design and evaluate algorithms and systems that address distribution problems and provide distributed services.
  • CEE2.3 - Capability to understand models, problems and mathematical tools to analyze, design and evaluate computer networks and distributed systems.

High performance computing

  • CEE4.2 - Capability to analyze, evaluate, design and optimize software considering the architecture and to propose new optimization techniques.

Generic Technical Competences

Generic

  • CG1 - Capability to apply the scientific method to the study and analysis of phenomena and systems in any area of Computer Science, and to the conception, design and implementation of innovative and original solutions.
  • CG5 - Capability to apply innovative solutions and make progress in the knowledge to exploit the new paradigms of computing, particularly in distributed environments.

Transversal Competences

Teamwork

  • CTR3 - Capacity to work as a team member, either as a regular member or performing directive activities, in order to help develop projects pragmatically and with a sense of responsibility; capability to take into account the available resources.

Reasoning

  • CTR6 - Capacity for critical, logical and mathematical reasoning. Capability to solve problems in their area of study. Capacity for abstraction: the capability to create and use models that reflect real situations. Capability to design and implement simple experiments, and analyze and interpret their results. Capacity for analysis, synthesis and evaluation.

Objectives

  1. Understand the definition of a distributed system and its possible applications, as well as the challenges faced when designing and implementing one.
    Related competences: CEE2.1, CG5,
    Subcompetences:
    • Understand the definition of distributed system
    • Know the different types of communication in a distributed system: direct vs. indirect (space and time uncoupling), persistent vs. transient, synchronous vs. asynchronous, discrete vs. continuous
    • Know the possible applications of a distributed system
    • Understand the challenges to design and implement a distributed system: heterogeneity, lack of global view, concurrency, lack of a single point of control, security, asynchrony, openness, transparency, fault tolerance, scalability
    • Know the basic system architectures in distributed systems: centralized (client-server), decentralized (peer-to-peer), hybrid
    • Know the basic communication paradigms in distributed systems: remote procedure call, message passing, message queuing, group communication, publish/subscribe, shared data spaces, shared-memory, mobile code, stream-oriented
    • Give examples of distributed systems
  2. Understand the problem of time and event ordering in a distributed system, and explain and implement logical clock mechanisms to attack this problem, as well as algorithms to synchronize physical clocks in a distributed system.
    Related competences: CEE2.3, CEE2.1,
    Subcompetences:
    • Understand the problem of time and event ordering in a distributed system
    • Explain and implement algorithms to synchronize physical clocks in a distributed system: Cristian (NTP), Berkeley
    • Explain and implement logical clock mechanisms to attack this problem: the happened-before relation, Lamport logical clocks (scalar, vector)
  3. Understand the problem of obtaining a consistent global state in a distributed system and explain the distributed snapshot mechanism to attack this problem, as well as define predicates for global assessment of properties in a distributed system.
    Related competences: CEE2.3, CEE2.1,
    Subcompetences:
    • Understand the problem of obtaining a consistent global state in a distributed system
    • Explain Chandy and Lamport's distributed snapshot mechanism to attack this problem
    • Define global predicates for evaluating properties in a distributed system: properties of predicates (stability), occurrence of predicates (possibly, definitely)
  4. Describe, compare and implement algorithms for the coordination of processes in a distributed system, including the coordination necessary to ensure mutual exclusion, leader election, multicast group communication and consensus.
    Related competences: CEE2.3, CEE2.1, CTR3,
    Subcompetences:
    • Describe, compare, and implement algorithms for the coordination of processes in a distributed system to ensure mutual exclusion: permission-based algorithms (centralized, Lin's, Maekawa's, Ricart & Agrawala's), token-based algorithms (token ring)
    • Describe, compare, and implement algorithms for the coordination of processes in a distributed system for multicast group communication: basic reliable multicast, scalable reliable multicast, ordered multicast (FIFO, causal, total), atomic multicast
    • Describe, compare, and implement algorithms for the coordination of processes in a distributed system for leader election: Bully, Ring
    • Describe, compare, and implement algorithms for the coordination of processes in a distributed system to ensure consensus: the two army problem, the Byzantine generals problem, consensus using failure detectors, Paxos
  5. Understand the problem of concurrent execution of transactions and describe, compare, and implement different concurrency control mechanisms.
    Related competences: CEE2.3, CEE2.1, CTR6,
    Subcompetences:
    • Understand the problem of concurrent execution of transactions: lost update, inconsistent retrievals, serial equivalence, recovery of aborts (dirty read, premature write)
    • Describe, compare, and implement different concurrency control mechanisms: two-phase locking (including deadlock detection and treatment), optimistic concurrency control, timestamp ordering
  6. Extend the concept of transaction, the commit protocol, and the concurrency control mechanisms in a distributed system.
    Related competences: CEE2.3, CEE2.1,
    Subcompetences:
    • Extend the concept of transaction in a distributed system
    • Extend concurrency control mechanisms in a distributed system: two-phase locking (including distributed deadlock detection and treatment), optimistic concurrency control, timestamp ordering
    • Describe, compare, and implement various distributed commit protocols: one and two-phase
  7. Understand the application of replication in a distributed system, as well as the consistency problems introduced, and describe the corresponding consistency models and their implementation.
    Related competences: CEE2.3, CEE2.1,
    Subcompetences:
    • Understand the application of replication in a distributed system, as well as the consistency problems it introduces
    • Describe the data-centric strong consistency models: strict, sequential, causal, FIFO
    • Describe the data-centric relaxed consistency models: usage of synchronization variables
    • Describe the client-centric consistency models: eventual, monotonic-read, monotonic-write, read-your-writes, writes-follow-reads
    • Describe specific implementations of consistency models: primary-based protocols (remote-write, local-write) and replicated-write protocols (active replication, quorum-based protocols)
  8. Understand the problem of concurrent access to resources by different agents (threads, processors), and the design principles to ensure correct coordination and avoid deadlocks.
    Related competences: CG1, CEE4.2,
    Subcompetences:
    • Understand the modeling of a concurrent system, the description and analysis of properties.
    • Understand and carry out the modeling of a concurrent system using a formal model, and analyze its properties.
    • Use the formalism of transition systems to describe the semantics of a concurrent system. Employ a modeling tool to analyze properties.
    • Understand the fundamental features to be addressed in a concurrent system: mutual exclusion, critical sections, interference, and the primitives for the design of concurrent programs.
  9. Understand the design of concurrent programs for shared memory architectures
    Related competences: CEE2.3, CEE4.2, CTR3,
    Subcompetences:
    • Understand the definition of thread, and the design of concurrent programs that use threads. Understand the basic primitives available in a modern programming language (like Java) to ensure proper coordination between threads.
    • Understand the problems of interference and blocking in a concurrent system, and design concurrent programs free of these problems.
    • Understand and apply the techniques offered by modern programming languages for the design of concurrent programs: threads, semaphores, critical sections, monitors and synchronization variables.
  10. Understand the design of concurrent programs for message-passing architectures
    Related competences: CEE2.3, CEE2.1, CEE4.2, CTR3,
    Subcompetences:
    • Apply various design patterns of concurrent systems on a message passing architecture: client-server, filter pipeline, worker-supervisor, advertiser-listener.
    • Understand the paradigm of message-passing programming languages like Erlang. Understand the concept of a process in Erlang, and the design of small and medium-sized programs in Erlang.
    • Design solutions based on message passing for well-known problems, such as quicksort.
  11. Understand and measure the potential parallelism available in a sequential application and the performance achieved with its parallel implementation
    Related competences: CG1, CEE4.2,
    Subcompetences:
    • Understand the concept of task graph and how to measure potential parallelism
    • Understand the performance of a parallel application and the overhead factors that contribute to its degradation
    • Use of instrumentation, visualization and analysis tools to understand parallelism and performance
  12. Decide the most appropriate decomposition strategy to express parallelism in an application
    Related competences: CEE4.2, CG5, CTR3, CTR6,
    Subcompetences:
    • Use of task and data decomposition
    • Given a task decomposition, decide the task ordering and data sharing constraints that guarantee dependencies
    • Given a data decomposition, decide communication requirements and the use of either point-to-point or collective operations to implement them
  13. Specify, using the appropriate programming paradigm, an efficient parallel version that corresponds to a certain task/data decomposition
    Related competences: CEE4.2, CG5, CTR3, CTR6,
    Subcompetences:
    • Use of the message-passing interface (MPI) for computation/data intensive applications in distributed-memory architectures
    • Use of CUDA to offload computation and transfer data for accelerator-based architectures
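As an illustration of objective 2 above, the Lamport scalar clock rules can be sketched in a few lines of Python. This is a minimal, illustrative sketch, not course material; the class and method names are assumptions made for the example:

```python
class LamportClock:
    """Scalar Lamport clock: every event advances the clock,
    and a receive jumps past the sender's timestamp."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Local (internal) event: advance the clock by one.
        self.time += 1
        return self.time

    def send(self):
        # A send is a local event; its timestamp travels with the message.
        return self.tick()

    def receive(self, msg_time):
        # On receipt, take the maximum of both clocks, then tick.
        self.time = max(self.time, msg_time) + 1
        return self.time


# Two processes exchanging one message:
p, q = LamportClock(), LamportClock()
p.tick()            # internal event at p -> 1
t = p.send()        # send at p           -> 2
q.receive(t)        # receive at q        -> 3
```

The key property the sketch preserves is that if event a happened-before event b, then clock(a) < clock(b); the converse does not hold for scalar clocks, which is why the course also covers vector clocks.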

Contents

  1. Transition systems and process algebra
    Transition systems and primitive processes. The prefix and choice operations. Concurrent processes and their implementation. Description of the LTS system. Implementation in Java and Erlang.
  2. Understanding parallelism
    Introduction to parallel architectures: shared- and distributed-memory architectures and accelerators. Task graph: nodes and edges. Metrics. Speed-up and efficiency. Amdahl's law.
  3. Concepts underlying distributed systems
    Definition of a distributed system. Potential applications of a distributed system. Examples of distributed systems. Challenges to design and implement a distributed system: heterogeneity, lack of global view, concurrency, lack of a single point of control, security, asynchrony, openness, transparency, fault tolerance, scalability. Basic system architectures in distributed systems: centralized (client-server), decentralized (peer-to-peer), hybrid. Types of communication in a distributed system: direct vs. indirect (space and time uncoupling), persistent vs. transient, synchronous vs. asynchronous, discrete vs. continuous. Basic communication paradigms in distributed systems: remote procedure call, message passing, message queuing, group communication, publish/subscribe, shared data spaces, shared-memory, mobile code, stream-oriented.
  4. Safety and liveness properties
    Description and examples of safety properties and their implementation in LTS. Description of liveness properties, especially the progress property, and their implementation in LTS.
  5. Concurrent objects, mutual exclusion and synchronization conditions. The deadlock problem.
    The problem of destructive interference. Locks and mutual exclusion. Modeling of semaphores and monitors and of the overlapping problem. The deadlock problem, analyzed with LTS.
  6. Message passing. Architectures
    Message passing. The client/server architecture, and an introduction to other architectures: pipelines of filters, supervisor/workers, advertiser/listener. Message passing in Erlang. Design of a client/server architecture in Erlang.
  7. Predicting and analyzing performance
    Use of models and tools to understand parallelism and performance (Tareador, Extrae, Paraver and Dimemas).
  8. Distributed-memory programming using MPI
    Cluster architecture overview. Process creation, identification and termination. Point-to-point vs. collective operations. Synchronous vs. asynchronous communications.
  9. Programming GPU devices for computation acceleration using CUDA
    GPU architecture overview. Decompositions suitable for GPU acceleration. CUDA programming principles. CUDA Parallel Execution Model: host and device.
  10. Distributed algorithms: Time, global state, coordination, and agreement
    Time and event ordering in a distributed system. Logical clocks: the happened-before relation, Lamport logical clocks (scalar, vector). Algorithms to synchronize physical clocks in a distributed system: Cristian (NTP), Berkeley. Consistent global state in a distributed system. Chandy and Lamport's distributed snapshot mechanism. Global predicates for evaluating properties in a distributed system: properties of predicates (stability), occurrence of predicates (possibly, definitely). Coordination of processes in a distributed system to ensure mutual exclusion: permission-based algorithms (centralized, Lin's, Maekawa's, Ricart & Agrawala's), token-based algorithms (token ring). Coordination of processes in a distributed system for leader election: Bully, Ring. Coordination of processes in a distributed system for multicast group communication: basic reliable multicast, scalable reliable multicast, ordered multicast (FIFO, causal, total), atomic multicast. Coordination of processes in a distributed system to ensure consensus: the two army problem, the Byzantine generals problem, consensus using failure detectors, Paxos.
  11. Distributed shared data: Transactions, consistency, and replication
    Concurrent execution of transactions: lost update, inconsistent retrievals, serial equivalence, recovery of aborts (dirty read, premature write). Concurrency control mechanisms: two-phase locking (including deadlock detection and treatment), optimistic concurrency control, timestamp ordering. Distributed transactions. Distributed commit protocols: one, two, and three-phase. Concurrency control mechanisms in a distributed system: two-phase locking (including distributed deadlock detection and treatment), optimistic concurrency control, timestamp ordering. Replication and consistency in a distributed system. Data-centric strong consistency models: strict, sequential, causal, FIFO. Data-centric relaxed consistency models: usage of synchronization variables. Client-centric consistency models: eventual, monotonic-read, monotonic-write, read-your-writes, writes-follow-reads. Implementations of consistency models: primary-based protocols (remote-write, local-write) and replicated-write protocols (active replication, quorum-based protocols).
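Content item 2 above introduces speed-up, efficiency and Amdahl's law. A small sketch of the speedup bound, with illustrative numbers chosen for the example (the function name is an assumption, not part of the course material):

```python
def amdahl_speedup(serial_fraction, n_processors):
    """Amdahl's law: the speedup attainable on n processors when a
    fraction f of the work is inherently serial is 1 / (f + (1-f)/n)."""
    f = serial_fraction
    return 1.0 / (f + (1.0 - f) / n_processors)


# With 10% serial work, 16 processors give at most 6.4x speedup,
# and even infinitely many processors cannot exceed 1/f = 10x.
print(round(amdahl_speedup(0.10, 16), 2))
```

The example shows why the course stresses identifying and reducing the serial fraction before adding processors: the serial fraction, not the processor count, sets the ceiling on speedup.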

Activities

Concurrency, parallelism and distribution: fundamental concepts

Class preparation with the help of the support material. Understanding and assimilation of the lesson contents and their subsequent application.
Theory
6
Problems
0
Laboratory
6
Guided learning
0
Autonomous learning
18
Objectives: 1 8 11
Contents:

Module I: Concurrency

Class preparation with the help of the support material. Understanding and assimilation of the lesson contents associated to Module I (concurrency) and their subsequent application in the practical sessions.
Theory
10
Problems
0
Laboratory
10
Guided learning
0
Autonomous learning
30
Objectives: 8 9 10
Contents:

Alternative modules: Parallelism or Distribution

Class preparation with the help of the support material. Understanding and assimilation of the lesson contents associated to Modules II (parallelism) or III (distribution) and their subsequent application in the practical sessions.
Theory
10
Problems
0
Laboratory
10
Guided learning
0
Autonomous learning
30

Module II: Parallelism

Class preparation with the help of the support material. Understanding and assimilation of the lesson contents associated to Module II (parallelism) and their subsequent application in the practical sessions.
Theory
0
Problems
0
Laboratory
0
Guided learning
0
Autonomous learning
0
Objectives: 11 12 13
Contents:

Module III: Distribution

Class preparation with the help of the support material. Understanding and assimilation of the lesson contents associated to Module III (distribution) and their subsequent application in the practical sessions.
Theory
0
Problems
0
Laboratory
0
Guided learning
0
Autonomous learning
0
Objectives: 1 2 3 4 5 6 7
Contents:

Teaching methodology

During the course there will be two types of activities:

a) Activities focused on the acquisition of theoretical knowledge.
b) Activities focused on the acquisition of knowledge through experimentation by implementing and evaluating empirically in the laboratory the mechanisms explained at a theoretical level.

The theoretical activities include participatory lectures, which explain the basic contents of the course. The practical activities include laboratory seminars where students implement the mechanisms described in the lectures. The seminars require preparation (reading the statement and supporting documentation) and a subsequent write-up of the conclusions in a report.

Evaluation methodology

There will be a final exam (EF) and a laboratory grade (L). The final exam will comprise problems on the theory taught. The laboratory grade will reflect the work done by the students in the practical assignments. The final grade will be calculated as NF = 0.6·EF + 0.4·L.
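For example, with illustrative grades of EF = 7.0 on the final exam and L = 9.0 in the laboratory, the final grade would be NF = 0.6·7.0 + 0.4·9.0 = 4.2 + 3.6 = 7.8.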

Bibliography

Basic:

  • Distributed Systems: Principles and Paradigms, 2nd Edition - Tanenbaum, A.S. and Van Steen, M., Pearson Prentice Hall , 2007. ISBN: 9780132392273
    http://cataleg.upc.edu/record=b1303976~S1*cat
  • Distributed Systems: Concepts and Design, 5th edition - Coulouris, G.F. and Dollimore, J. and Kindberg, T. and Blair, G., Addison-Wesley , 2011. ISBN: 9780132143011
    http://www.cdk5.net/wp/
  • Concurrency: State Models & Java Programs (Second Edition) - Magee, J. and Kramer, J., John Wiley & Sons, 2006.
  • Programming Erlang: software for a concurrent world (2nd ed) - Armstrong, Joe, Pragmatic Bookshelf , cop. 2013. ISBN: 9781937785536
    http://cataleg.upc.edu/record=b1431659~S1*cat
  • Java Concurrency in Practice - Goetz, B. and Peierls, T. and Bloch, J. and Bowbeer, J. and Holmes, D. and Lea, D., Addison-Wesley Professional, 2006.
  • Introduction to Parallel Computing - Grama, A. and Karypis, G. and Kumar, V. and Gupta, A., Pearson Education , 2003. ISBN: 0201648652
    http://cataleg.upc.edu/record=b1340978~S1*cat

Complementary:

  • Erlang Programming: A Concurrent Approach to Software Development - CESARINI, Francesco; THOMPSON, Simon, O'Reilly , 2009. ISBN: 9780596518189
    http://cataleg.upc.edu/record=b1366603~S1*cat
  • The Art of Multiprocessor Programming - Herlihy, M. and Shavit, N., Morgan Kaufmann, 2008.

Previous capacities

Concurrency: Java at the level of classes and objects.
Parallelism: basic understanding of parallel architectures, including shared- and distributed-memory multiprocessor systems.
Distribution: understanding of the internal structure and operation of an operating system and computer networks.