High Performance Computer Architecture

Credits
6
Types
Compulsory
Requirements
This subject has no formal requirements, but it does assume previous capacities
Department
AC
The main objective of this course is for students to acquire the fundamentals of the microarchitecture techniques used in high-performance computers, considering their implications for energy and power. A second objective is to learn the architectural techniques used to efficiently support the implementation of operating systems.
The course covers the application of pipelining and parallelism techniques in processor design. In particular, its topics enable students to evaluate the performance of a computing system running applications and provide knowledge of the architectural support needed to implement operating systems efficiently. The course also introduces hardware description languages and their use in describing processor elements.

Teachers

Person in charge

  • Jose M. Llaberia Griñó

Others

  • Miquel Moretó Planas

Weekly hours

Theory
1.5
Problems
1.5
Laboratory
1
Guided learning
0
Autonomous learning
7.11

Competences

Technical Competences of each Specialization

Specific

  • CTE1 - Capability to model, design, define the architecture, implement, manage, operate, administrate and maintain applications, networks, systems, services and computer contents.
  • CTE6 - Capability to design and evaluate operating systems and servers, and applications and systems based on distributed computing.
  • CTE7 - Capability to understand and to apply advanced knowledge of high performance computing and numerical or computational methods to engineering problems.

Generic Technical Competences

Generic

  • CG1 - Capability to plan, calculate and design products, processes and facilities in all areas of Computer Science.
  • CG3 - Capability to lead, plan and supervise multidisciplinary teams.
  • CG4 - Capacity for mathematical modeling, calculation and simulation in technology centers and engineering companies, particularly in research, development and innovation tasks in all areas related to Informatics Engineering.
  • CG6 - Capacity for general management, technical management and management of research, development and innovation projects in companies and technology centers in the area of Computer Science.
  • CG8 - Capability to apply the acquired knowledge and to solve problems in new or unfamiliar environments inside broad and multidisciplinary contexts, being able to integrate this knowledge.

Transversal Competences

Appropriate attitude towards work

  • CTR5 - Capability to be motivated by professional achievement and to face new challenges, and to have a broad vision of the possibilities of a career in the field of informatics engineering. Capability to be motivated by quality and continuous improvement, and to act rigorously in professional development. Capability to adapt to technological or organizational changes. Capacity for working in the absence of information and/or with time and/or resource constraints.

Basic

  • CB6 - Ability to apply the acquired knowledge and capacity for solving problems in new or unknown environments within broader (or multidisciplinary) contexts related to their area of study.

Objectives

  1. Learn to apply pipelining and parallelism techniques in processor design.
    Related competences: CTE1, CTR5
  2. Learn to evaluate the performance of a computing system when running applications.
    Related competences: CTE6, CTE7, CTR5, CG4
  3. Learn to exploit the capabilities of a computer system and to tolerate or hide its weaknesses.
    Related competences: CTE7, CTR5, CG4
  4. Learn to design and evaluate architectural support for the efficient implementation of operating systems.
    Related competences: CTE6, CTR5
  5. Learn to use a hardware description language and apply it to the specification of processor elements.
    Related competences: CTE1, CTE7, CG1, CG3, CG6, CG8

Contents

  1. Computer and performance metrics
    Constituent elements of a computer and their operation, memory hierarchy, multithreading, energy and performance metrics
  2. Pipelining and parallelism
    Using pipelining and parallelism techniques to increase throughput. Resources
  3. Pipelining instruction execution
    Data path and control of a linear pipelined processor. Concepts of data hazard and control hazard. Adapting the pipeline to the semantics of the machine language
  4. Performance enhancement
    Software and hardware techniques to reduce the number of stall cycles in a pipelined processor
  5. Parallel pipelines and superscalar processors
    Interpretation of instructions whose execution latency is greater than the initiation latency. Using parallelism to interpret instructions
  6. Exceptions and interrupts
    Requirements in the data path and control for supporting interrupts and exceptions
  7. Multiprocessors
    Elements of a multiprocessor system. Private caches. Interconnection network. Concepts of memory consistency and cache coherence.
  8. VHDL hardware description language
    Learning a hardware description language

Activities


Hardware description language

Learning the VHDL language to describe and simulate logic circuits. Description of basic components of a processor's data path and their subsequent verification
  • Laboratory: Description and verification of a one-bit adder. Using the previous design to describe and verify a four-bit adder. Description and verification of a data path with a register file and an adder
  • Autonomous learning: Learning basic VHDL constructs to describe combinational and sequential circuits. Learning circuit verification techniques. Preparing the associated lab session, answering questions and reflecting on the answers
Objectives: 5
Contents:
Theory
0h
Problems
0h
Laboratory
4.5h
Guided learning
0h
Autonomous learning
16h
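
As a software illustration of the laboratory progression in this activity (a one-bit adder reused to build a four-bit adder, followed by verification), the Python sketch below models the same structure. It is not the course's VHDL material: the names `full_adder`, `ripple_carry_adder4` and `verify_adder4` are hypothetical, and the four-bit adder is assumed to be a ripple-carry chain, which the syllabus does not specify.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out) for bit inputs 0 or 1."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_adder4(x, y, cin=0):
    """Four-bit adder built from four full adders (assumed ripple-carry chain).

    x and y are integers in 0..15; returns (4-bit sum, carry_out).
    """
    result = 0
    carry = cin
    for i in range(4):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry


def verify_adder4():
    """Exhaustive check against integer addition, in the spirit of a testbench."""
    for x in range(16):
        for y in range(16):
            for cin in (0, 1):
                total = x + y + cin
                assert ripple_carry_adder4(x, y, cin) == (total & 0xF, total >> 4)
    return True
```

The exhaustive loop plays the role of a testbench: the 16 × 16 × 2 = 512 input combinations are few enough to check completely.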

Analysis of a serial processor

Study the data path of a serial processor. Identify the parts of the data path used by each type of instruction. Analysis and calculation of the delay for each type of instruction and determination of the cycle time of the processor
  • Laboratory: Perform the actions and verifications indicated in the documentation
  • Autonomous learning: Study the documentation, answer questions and reflect on the answers
Objectives: 2
Contents:
Theory
0h
Problems
0h
Laboratory
3h
Guided learning
0h
Autonomous learning
6h
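
The delay calculation described in this activity can be sketched as follows, assuming a serial processor in which every instruction completes in one clock cycle, so that the cycle time is set by the slowest instruction type. The component names, their delays and the per-instruction paths are hypothetical values chosen for illustration; they are not taken from the course documentation.

```python
# Hypothetical component delays in picoseconds (not from the course material).
DELAY_PS = {
    "imem": 200,      # instruction memory read
    "regread": 100,   # register file read
    "alu": 150,       # ALU operation
    "dmem": 200,      # data memory access
    "regwrite": 100,  # register file write
}

# Assumed parts of the data path used by each instruction type.
PATH = {
    "alu": ["imem", "regread", "alu", "regwrite"],
    "load": ["imem", "regread", "alu", "dmem", "regwrite"],
    "store": ["imem", "regread", "alu", "dmem"],
    "branch": ["imem", "regread", "alu"],
}


def instruction_delay(kind):
    """Delay of one instruction type: sum of the delays along its path."""
    return sum(DELAY_PS[component] for component in PATH[kind])


def cycle_time():
    """Cycle time of the serial processor: the slowest instruction type."""
    return max(instruction_delay(kind) for kind in PATH)
```

With these assumed figures the load instruction uses the longest path, so it determines the cycle time even though the other instruction types would finish earlier.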

Designing control logic for a pipelined processor. Determining the cycle time

Analysis of the data path. Designing the control logic so that the processor's operation matches the semantics of the machine language. Determining the cycle time
  • Laboratory: Perform the actions and verifications indicated in the documentation
  • Autonomous learning: Study the documentation, design the control logic, answer questions and reflect on the answers
Objectives: 1
Contents:
Theory
0h
Problems
0h
Laboratory
3h
Guided learning
0h
Autonomous learning
8h

Design of an enhanced processor

Design of a pipelined processor with bypasses to reduce stall cycles, together with its control logic
  • Laboratory: Perform the actions and verifications indicated in the documentation and implement the control logic of the data path with bypasses
  • Autonomous learning: Study the documentation, design the control logic of the data path with bypasses, answer questions and reflect on the answers
Objectives: 1 3
Contents:
Theory
0h
Problems
0h
Laboratory
3h
Guided learning
0h
Autonomous learning
7h

Computer and performance metrics

Development of item 1 of the course
  • Theory: Understanding the basic components of a von Neumann computer, including the memory hierarchy. Understanding the performance metrics
  • Problems: Solving problems related to the topic
Objectives: 2 3
Contents:
Theory
2h
Problems
3h
Laboratory
0h
Guided learning
0h
Autonomous learning
9h
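
One performance metric commonly used in this kind of topic is execution time expressed as instruction count × cycles per instruction (CPI) × cycle time. The sketch below assumes that standard relation; the function names and the numbers in the usage note are hypothetical and are not taken from the course material.

```python
def execution_time_ns(instructions, cpi, cycle_time_ns):
    """Execution time in nanoseconds: instructions x CPI x cycle time."""
    return instructions * cpi * cycle_time_ns


def relative_speedup(time_old_ns, time_new_ns):
    """How many times faster the new design is than the old one."""
    return time_old_ns / time_new_ns
```

For example, under these assumptions a program of 1,000,000 instructions with a CPI of 1.5 and a 2 ns cycle takes `execution_time_ns(1_000_000, 1.5, 2.0)` nanoseconds, that is, 3 ms. Halving the CPI at the same cycle time gives a speedup of 2.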

Pipelining and parallelism

Development of item 2 of the course
  • Theory: Description of the pipelining and parallelism techniques to increase throughput. Using metrics to evaluate increases or reductions in throughput and energy consumption
  • Problems: Solving problems related to the topic
  • Autonomous learning: Study the concepts of the topic and related concepts, and solve problems to consolidate them
Objectives: 1 2
Contents:
Theory
3h
Problems
2h
Laboratory
0h
Guided learning
0h
Autonomous learning
7.7h
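
The throughput gain from pipelining can be illustrated with the idealized textbook model: with k balanced stages and no stalls, n instructions need k + n - 1 cycles instead of n × k. The sketch below assumes exactly that model (equal stage delays, no hazards, no register overhead between stages); the function names are hypothetical.

```python
def pipelined_cycles(n_instructions, n_stages):
    """Cycles to run n instructions on an ideal pipeline with no stalls."""
    return n_stages + n_instructions - 1


def serial_cycles(n_instructions, n_stages):
    """Cycles when each instruction passes through all stages before the next starts."""
    return n_instructions * n_stages


def pipeline_speedup(n_instructions, n_stages):
    """Ideal speedup of the pipelined design over the serial one."""
    return serial_cycles(n_instructions, n_stages) / pipelined_cycles(n_instructions, n_stages)
```

Under these assumptions the speedup approaches the number of stages as the instruction count grows, and it is exactly 1 for a single instruction, because the pipeline must still be filled.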

Pipelined instruction execution

Development of item 3 of the course
  • Theory: Application of the pipelining technique to the instruction-processing sequence. Observation and analysis of the need to adapt a naive pipeline to the semantics of the machine language
  • Problems: Solving problems related to the topic
  • Autonomous learning: Study the concepts of the topic and related concepts, and solve problems to consolidate them
Objectives: 1 2 3
Contents:
Theory
3h
Problems
3h
Laboratory
0h
Guided learning
0h
Autonomous learning
10h

Increased performance

Development of item 4 of the course
  • Theory: Use of software and hardware techniques to improve performance of a linear pipelined processor
  • Problems: Solving problems related to the topic
  • Autonomous learning: Study the concepts of the topic and related concepts, and solve problems to consolidate them
Objectives: 2 3
Contents:
Theory
3h
Problems
4.3h
Laboratory
0h
Guided learning
0h
Autonomous learning
11.2h
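
The effect of techniques that reduce stall cycles can be quantified with the usual model in which the effective CPI equals the ideal CPI plus the average number of stall cycles per instruction. The sketch below assumes that model; the function name, the stall frequencies and the penalties are hypothetical figures for illustration only.

```python
def effective_cpi(ideal_cpi, stall_sources):
    """Ideal CPI plus the average stall cycles per instruction.

    stall_sources: iterable of (events per instruction, stall cycles per event).
    """
    return ideal_cpi + sum(freq * penalty for freq, penalty in stall_sources)


# Hypothetical workload: 10% of instructions incur a 2-cycle data-hazard stall
# and 15% are branches that cost 2 stall cycles each.
cpi_without_bypass = effective_cpi(1.0, [(0.10, 2), (0.15, 2)])

# Assumption: bypasses shorten the data-hazard stall from 2 cycles to 1.
cpi_with_bypass = effective_cpi(1.0, [(0.10, 1), (0.15, 2)])
```

With these assumed numbers the CPI drops from about 1.5 to about 1.4, which shows how a hardware technique such as bypassing translates into a measurable performance gain.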

Parallel pipelines and superscalar processors

Development of item 5 of the course
  • Theory: Using parallel pipelines in the microarchitecture to improve performance and support the ability to interpret instructions in parallel
  • Problems: Solving problems related to the topic
  • Autonomous learning: Study the concepts of the topic and related concepts, and solve problems to consolidate them
Objectives: 1 2 3
Contents:
Theory
3h
Problems
4h
Laboratory
0h
Guided learning
0h
Autonomous learning
10h

Exceptions and interrupts

Development of item 6 of the course
  • Theory: Exception handling in a pipelined processor in a way that conforms to the semantics of the machine language. Interrupt control for servicing external devices
  • Problems: Solving problems related to the topic
  • Autonomous learning: Study the concepts of the topic and related concepts, and solve problems to consolidate them
Objectives: 2 4
Contents:
Theory
1h
Problems
1h
Laboratory
0h
Guided learning
0h
Autonomous learning
7h

Multiprocessors

Development of item 7 of the course
  • Theory: Introduction to parallelism using multiple processors. Concepts of memory consistency and cache coherence
  • Problems: Solving problems related to the topic
  • Autonomous learning: Study the concepts of the topic and related concepts, and solve problems to consolidate them
Objectives: 1 2 3
Contents:
Theory
1.3h
Problems
2h
Laboratory
0h
Guided learning
0h
Autonomous learning
4h

Final exam

Evaluation of how well the concepts presented during the course have been consolidated, through questions and problems that require reasoning about those concepts
Objectives: 1 2 3 4
Week: 15
Type: theory exam
Theory
3h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
0h

Midterm exam

Assessment of the objectives for the first three topics
Objectives: 1 2 3
Week: 8
Type: theory exam
Theory
1h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
0h

Midterm exam

Assessment of the objectives for the first three topics
Objectives: 1 2 3
Week: 8
Type: problems exam
Theory
0h
Problems
1h
Laboratory
0h
Guided learning
0h
Autonomous learning
0h

Teaching methodology

Theory classes, in which concepts are developed with student participation.
Problem classes, in which students apply the concepts developed in the lectures and the student is the active agent.
Laboratory classes, in which students apply the concepts developed in the theory classes to a concrete example of a processor. The student is the active agent, and collaboration among the members of the group is a means to increase or consolidate knowledge.
The course is developed constructively. That is, it builds on concepts learned in the bachelor's degree, and each topic of the course increases students' knowledge and their ability to understand, analyze and reason about aspects of a processor. This training is also quantitative.

Evaluation methodology

The competences have a weight proportional to the time spent on the activities, and they are evaluated indirectly through the midterm exam, the final exam and the laboratory.
The two midterm exams are held simultaneously and constitute a single exam.
Midterm exam (P): Written test that evaluates the objectives for the first three topics.
Final exam (F): Written test that evaluates all the objectives of the course.
Laboratory (L): Evaluated from the reports submitted in each of the practice sessions and, where appropriate, a personal interview.

The final grade (NF) is calculated using the following expression:
NF = max(0.8 × F, 0.65 × F + 0.15 × P) + 0.2 × L
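
The expression above can be checked with a short Python sketch. The function name `final_grade` is hypothetical, and the sketch assumes that F, P and L are all expressed on the same scale, which the text does not state explicitly.

```python
def final_grade(f, p, l):
    """Final grade NF from final exam F, midterm exam P and laboratory L.

    The exam part is the better of two options: the final exam alone
    (weight 0.8), or the final exam (0.65) combined with the midterm (0.15).
    The laboratory always contributes with weight 0.2.
    """
    exam_part = max(0.8 * f, 0.65 * f + 0.15 * p)
    return exam_part + 0.2 * l
```

Because 0.65 × F + 0.15 × P exceeds 0.8 × F only when P is greater than F, the midterm can raise the final grade but never lower it.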

Previous capacities

Combinational and sequential logic circuits. Operation of a computer: components, interconnections, exceptions and interrupts. Machine language: programming and data representation. Memory hierarchy: performance and mechanisms that support it. Operating Systems: address translation, interrupt and exception management