Supercomputers Architecture

Credits

Types

Specialization complementary (High Performance Computing)

Requirements

This subject has no requirements, but it does have previous capacities.

Department

Web

https://torres.ai/SA-MIRI/

This course introduces the fundamentals of high-performance and parallel computing. It is targeted at scientists and engineers seeking to develop the skills necessary for working with supercomputers, the leading edge in high-performance computing technology.

In the first part of the course, we will cover the basic building blocks of supercomputers and their system software stack. Then, we will introduce their traditional parallel and distributed programming models, which allow one to exploit parallelism, a central element for scaling the applications in these types of high-performance infrastructures.

In the second part of the course, we will motivate the current supercomputing systems developed to support artificial intelligence algorithms required in today's world. This year's syllabus will pay special attention to Deep Learning (DL) algorithms and their scalability using a GPU platform.

This course uses a learn-by-doing approach based on a set of exercises, made up of programming problems and paper readings, that students must carry out throughout the course. The course is marked by continuous assessment, which ensures constant, steady work.

All in all, this course seeks to give students practical skills that help them adapt to and anticipate the new technologies that will undoubtedly emerge in the coming years. For the practical part of the exercises, students will use supercomputing facilities at the Barcelona Supercomputing Center (BSC-CNS).

UPDATED VERSION: https://torres.ai/sa-miri/

Teachers

Person in charge

Jordi Torres Viñals

Weekly hours

Theory

Problems

Laboratory

Guided learning

0.15

Autonomous learning

7.7

Competences

Transversal Competences

Teamwork

G5 - To be capable of working as a team member, either as a regular member or performing management tasks, in order to contribute to developing projects pragmatically and with a sense of responsibility; to make commitments taking into account the available resources.
CT3 - Ability to work as a member of an interdisciplinary team, as a normal member or performing direction tasks, in order to develop projects with pragmatism and sense of responsibility, making commitments taking into account the available resources.
CTR3 - Capacity of being able to work as a team member, either as a regular member or performing directive activities, in order to help the development of projects in a pragmatic manner and with sense of responsibility; capability to take into account the available resources.

Entrepreneurship and innovation

G1 - To know and understand the organization of a company and the sciences which govern its activity; capacity to understand the labour rules and the relation between planning, industrial and business strategies, quality and benefit. To develop creativity, entrepreneur spirit and innovation tendency.
CT1 - Know and understand the organization of a company and the sciences that govern its activity; have the ability to understand labor standards and the relationships between planning, industrial and commercial strategies, quality and profit. Being aware of and understanding the mechanisms on which scientific research is based, as well as the mechanisms and instruments for transferring results among socio-economic agents involved in research, development and innovation processes.
CTR1 - Capacity for knowing and understanding a business organization and the science that rules its activity; capability to understand the labour rules and the relationships between planning, industrial and commercial strategies, quality and profit. Capacity for developing creativity, entrepreneurship and an inclination towards innovation.

Appropriate attitude towards work

G8 - To be motivated to act professionally and to face new challenges, and to have a broad vision of the career possibilities in the field of informatics engineering. To feel motivated by quality and continuous improvement, and to behave rigorously in professional development. Capacity to adapt to organizational or technological changes. Capacity to work in situations with information shortages and/or time and/or resource restrictions.
CT5 - Capability to be motivated for professional development, to meet new challenges and for continuous improvement. Capability to work in situations with lack of information.
CTR5 - Capability to be motivated by professional achievement and to face new challenges, to have a broad vision of the possibilities of a career in the field of informatics engineering. Capability to be motivated by quality and continuous improvement, and to act strictly on professional development. Capability to adapt to technological or organizational changes. Capacity for working in absence of information and/or with time and/or resources constraints.

Reasoning

G9 - Capacity for critical, logical and mathematical reasoning. Capacity to solve problems in their study area. Abstraction capacity: capacity to create and use models that reflect real situations. Capacity to design and perform simple experiments and to analyse and interpret their results. Analysis, synthesis and evaluation capacity.
CT6 - Capability to evaluate and analyze, in a reasoned and critical way, situations, projects, proposals, reports and scientific-technical surveys. Capability to argue the reasons that explain or justify such situations, proposals, etc.
CTR6 - Capacity for critical, logical and mathematical reasoning. Capability to solve problems in their area of study. Capacity for abstraction: the capability to create and use models that reflect real situations. Capability to design and implement simple experiments, and analyze and interpret their results. Capacity for analysis, synthesis and evaluation.

Sustainability and social commitment

G2 - To know and understand the complexity of the economic and social phenomena typical of the welfare society. To be capable of analysing and evaluating the social and environmental impact.
CT2 - Capability to know and understand the complexity of economic and social typical phenomena of the welfare society; capability to relate welfare with globalization and sustainability; capability to use technique, technology, economics and sustainability in a balanced and compatible way.
CTR2 - Capability to know and understand the complexity of the typical economic and social phenomena of the welfare society. Capacity for being able to analyze and assess the social and environmental impact.

Third language

G3 - To know the English language at an adequate oral and written level, in accordance with the needs of graduates in Informatics Engineering. Capacity to work in a multidisciplinary group and in a multi-language environment and to communicate, orally and in writing, knowledge, procedures, results and ideas related to the technical informatics engineer profession.
CT5 - Achieving a level of spoken and written proficiency in a foreign language, preferably English, that meets the needs of the profession and the labour market.

Effective oral and written communication

G4 - To communicate with other people knowledge, procedures, results and ideas orally and in a written way. To participate in discussions about topics related to the activity of a technical informatics engineer.

Information literacy

G6 - To manage the acquisition, structuring, analysis and visualization of data and information of the field of the informatics engineering, and value in a critical way the results of this management.
CT4 - Capacity for managing the acquisition, the structuring, analysis and visualization of data and information in the field of specialisation, and for critically assessing the results of this management.
CTR4 - Capability to manage the acquisition, structuring, analysis and visualization of data and information in the area of informatics engineering, and critically assess the results of this effort.

Autonomous learning

G7 - To detect deficiencies in one's own knowledge and overcome them through critical reflection and by choosing the best course of action to extend this knowledge. Capacity for learning new methods and technologies, and versatility to adapt to new situations.

Analysis and synthesis

CT7 - Capability to analyze and solve complex technical problems.

Basic

CB6 - Ability to apply the acquired knowledge and capacity for solving problems in new or unknown environments within broader (or multidisciplinary) contexts related to their area of study.
CB7 - Ability to integrate knowledge and handle the complexity of making judgments based on information which, being incomplete or limited, includes considerations on social and ethical responsibilities linked to the application of their knowledge and judgments.
CB8 - Capability to communicate their conclusions, and the knowledge and rationale underpinning these, to both skilled and unskilled public in a clear and unambiguous way.
CB9 - Possession of the learning skills that enable the students to continue studying in a way that will be mainly self-directed or autonomous.
CB1 - That students have demonstrated to possess and understand knowledge in an area of study that starts from the base of general secondary education, and is usually found at a level that, although supported by advanced textbooks, also includes some aspects that imply knowledge from the vanguard of their field of study.
CB2 - That the students know how to apply their knowledge to their work or vocation in a professional way and possess the skills that are usually demonstrated through the elaboration and defense of arguments and problem solving within their area of study.
CB3 - That students have the ability to gather and interpret relevant data (usually within their area of study) to make judgments that include a reflection on relevant social, scientific or ethical issues.
CB4 - That the students can transmit information, ideas, problems and solutions to a specialized and non-specialized public.
CB5 - That the students have developed those learning skills necessary to undertake later studies with a high degree of autonomy.
CB10 - Possess and understand knowledge that provides a basis or opportunity to be original in the development and/or application of ideas, often in a research context.

Transversals

CT1 - Entrepreneurship and innovation. Know and understand the organization of a company and the sciences that govern its activity; Have the ability to understand labor standards and the relationships between planning, industrial and commercial strategies, quality and profit.
CT2 - Sustainability and Social Commitment. To know and understand the complexity of economic and social phenomena typical of the welfare society; Be able to relate well-being to globalization and sustainability; Achieve skills to use in a balanced and compatible way the technique, the technology, the economy and the sustainability.
CT3 - Efficient oral and written communication. Communicate in an oral and written way with other people about the results of learning, thinking and decision making; Participate in debates on topics of the specialty itself.
CT4 - Teamwork. Be able to work as a member of an interdisciplinary team, either as a member or conducting management tasks, with the aim of contributing to develop projects with pragmatism and a sense of responsibility, making commitments that take into account available resources.
CT5 - Solvent use of information resources. Manage the acquisition, structuring, analysis and visualization of data and information in the field of specialty and critically evaluate the results of such management.
CT6 - Autonomous Learning. Detect deficiencies in one's own knowledge and overcome them through critical reflection and the choice of the best action to extend this knowledge.
CT7 - Third language. Know a third language, preferably English, with an adequate oral and written level and in line with the needs of graduates.

Gender perspective

CT6 - An awareness and understanding of sexual and gender inequalities in society in relation to the field of the degree, and the incorporation of different needs and preferences due to sex and gender when designing solutions and solving problems.

Technical Competences

Common technical competencies

CT1 - To demonstrate knowledge and comprehension of essential facts, concepts, principles and theories related to informatics and their disciplines of reference.
CT2 - To properly use theories, procedures and tools in the professional development of informatics engineering in all its fields (specification, design, implementation, deployment and product evaluation), demonstrating comprehension of the trade-offs adopted in the design decisions.
CT3 - To demonstrate knowledge and comprehension of the organizational, economic and legal context where their work is developed (proper knowledge about the company concept, the institutional and legal framework of the company and its organization and management).
CT4 - To demonstrate knowledge and capacity to apply the basic algorithmic procedures of the computer science technologies to design solutions for problems, analysing the suitability and complexity of the algorithms.
CT5 - To analyse, design, build and maintain applications in a robust, secure and efficient way, choosing the most adequate paradigm and programming languages.
CT6 - To demonstrate knowledge and comprehension about the internal operation of a computer and about the operation of communications between computers.
CT7 - To evaluate and select hardware and software production platforms for executing applications and computer services.
CT8 - To plan, conceive, deploy and manage computer projects, services and systems in every field, to lead the start-up, the continuous improvement and to value the economical and social impact.

Technical competencies

CE1 - Skillfully use mathematical concepts and methods that underlie the problems of science and data engineering.
CE2 - To be able to program solutions to engineering problems: Design efficient algorithmic solutions to a given computational problem, implement them in the form of a robust, structured and maintainable program, and check the validity of the solution.
CE3 - Analyze complex phenomena through probability and statistics, and propose models of these types in specific situations. Formulate and solve mathematical optimization problems.
CE4 - Use current computer systems, including high performance systems, to process large volumes of data, based on knowledge of their structure, operation and particularities.
CE5 - Design and apply techniques of signal processing, choosing between different technological tools, including those of artificial vision, speech recognition and multimedia data processing.
CE6 - Build or use systems of processing and comprehension of written language, integrating it into other systems driven by the data. Design systems for searching textual or hypertextual information and analysis of social networks.
CE7 - Demonstrate knowledge and ability to apply the necessary tools for the storage, processing and access to data.
CE8 - Ability to choose and employ techniques of statistical modeling and data analysis, evaluating the quality of the models, validating and interpreting them.
CE9 - Ability to choose and employ a variety of automatic learning techniques and build systems that use them for decision making, even autonomously.
CE10 - Visualization of information to facilitate the exploration and analysis of data, including the choice of adequate representation of these and the use of dimensionality reduction techniques.
CE11 - Within the corporate context, understand the innovation process, be able to propose models and business plans based on data exploitation, analyze their feasibility and be able to communicate them convincingly.
CE12 - Apply the project management practices in the integral management of the data exploitation engineering project that the student must carry out in the areas of scope, time, economic and risks.
CE13 - (End-of-degree work) Plan, design and carry out projects of a professional nature in the field of data engineering, leading their implementation and continuous improvement and assessing their economic and social impact. Defend the project developed before a university examination committee.

Specific

CE1 - Develop efficient algorithms based on the knowledge and understanding of the computational complexity theory and considering the main data structures within the scope of data science
CE2 - Apply the fundamentals of data management and processing to a data science problem
CE3 - Apply data integration methods to solve data science problems in heterogeneous data environments
CE4 - Apply scalable storage and parallel data processing methods, including data streams, once the most appropriate methods for a data science problem have been identified
CE5 - Model, design, and implement complex data systems, including data visualization
CE6 - Design the Data Science process and apply scientific methodologies to obtain conclusions about populations and make decisions accordingly, from both structured and unstructured data and potentially stored in heterogeneous formats.
CE7 - Identify the limitations imposed by data quality in a data science problem and apply techniques to smooth their impact
CE8 - Extract information from structured and unstructured data by considering their multivariate nature.
CE9 - Apply appropriate methods for the analysis of non-traditional data formats, such as processes and graphs, within the scope of data science
CE10 - Identify machine learning and statistical modeling methods to use and apply them rigorously in order to solve a specific data science problem
CE11 - Analyze and extract knowledge from unstructured information using natural language processing techniques, text and image mining
CE12 - Apply data science in multidisciplinary projects to solve problems in new or poorly explored domains from a data science perspective that are economically viable, socially acceptable, and in accordance with current legislation
CE13 - Identify the main threats related to ethics and data privacy in a data science project (both in terms of data management and analysis) and develop and implement appropriate measures to mitigate these threats
CE14 - Execute, present and defend an original exercise carried out individually in front of an academic commission, consisting of an engineering project in the field of data science synthesizing the competences acquired in the studies

Technical Competences of each Specialization

Information systems specialization

CSI2 - To integrate solutions of Information and Communication Technologies, and business processes to satisfy the information needs of the organizations, allowing them to achieve their objectives effectively.
CSI3 - To determine the requirements of the information and communication systems of an organization, taking into account the aspects of security and compliance of the current normative and legislation.
CSI4 - To participate actively in the specification, design, implementation and maintenance of the information and communication systems.
CSI1 - To demonstrate comprehension and apply the principles and practices of the organization, in a way that they could link the technical and management communities of an organization, and participate actively in the user training.

Software engineering specialization

CES1 - To develop, maintain and evaluate software services and systems which satisfy all user requirements, which behave reliably and efficiently, with a reasonable development and maintenance and which satisfy the rules for quality applying the theories, principles, methods and practices of Software Engineering.
CES2 - To value the client needs and specify the software requirements to satisfy these needs, reconciling conflictive objectives through searching acceptable compromises, taking into account the limitations related to the cost, time, already developed systems and organizations.
CES3 - To identify and analyse problems; design, develop, implement, verify and document software solutions having an adequate knowledge about the current theories, models and techniques.

Information technology specialization

CTI1 - To define, plan and manage the installation of the ICT infrastructure of the organization.
CTI2 - To guarantee that the ICT systems of an organization operate adequately, are secure and adequately installed, documented, personalized, maintained, updated and substituted, and the people of the organization receive a correct ICT support.
CTI3 - To design solutions which integrate hardware, software and communication technologies (and capacity to develop specific solutions of systems software) for distributed systems and ubiquitous computation devices.
CTI4 - To use methodologies centred on the user and the organization to develop, evaluate and manage applications and systems based on the information technologies which ensure the accessibility, ergonomics and usability of the systems.

Computer engineering specialization

CEC1 - To design and build digital systems, including computers, systems based on microprocessors and communications systems.
CEC2 - To analyse and evaluate computer architectures including parallel and distributed platforms, and develop and optimize software for these platforms.
CEC3 - To develop and analyse hardware and software for embedded and/or very low consumption systems.
CEC4 - To design, deploy, administrate and manage computer networks, and manage the guarantee and security of computer systems.

Computer science specialization

CCO1 - To have an in-depth knowledge about the fundamental principles and computations models and be able to apply them to interpret, select, value, model and create new concepts, theories, uses and technological developments, related to informatics.
CCO2 - To develop effectively and efficiently the adequate algorithms and software to solve complex computation problems.
CCO3 - To develop computer solutions that, taking into account the execution environment and the computer architecture where they are executed, achieve the best performance.

Academic

CEA1 - Capability to understand the basic operating principles of the main Multiagent Systems techniques, and to know how to use them in the environment of an intelligent service or system.
CEA2 - Capability to understand the basic operating principles of the main Planning and Approximate Reasoning techniques, and to know how to use them in the environment of an intelligent system or service.
CEA3 - Capability to understand the basic operating principles of the main Machine Learning techniques, and to know how to use them in the environment of an intelligent system or service.
CEA4 - Capability to understand the basic operating principles of the main Computational Intelligence techniques, and to know how to use them in the environment of an intelligent system or service.
CEA5 - Capability to understand the basic operating principles of the main Natural Language Processing techniques, and to know how to use them in the environment of an intelligent system or service.
CEA6 - Capability to understand the basic operating principles of the main Computational Vision techniques, and to know how to use them in the environment of an intelligent system or service.
CEA7 - Capability to understand the problems, and the solutions to problems in the professional practice of Artificial Intelligence application in business and industry environment.
CEA8 - Capability to research in new techniques, methodologies, architectures, services or systems in the area of Artificial Intelligence.
CEA9 - Capability to understand Multiagent Systems advanced techniques, and to know how to design, implement and apply these techniques in the development of intelligent applications, services or systems.
CEA10 - Capability to understand advanced techniques of Human-Computer Interaction, and to know how to design, implement and apply these techniques in the development of intelligent applications, services or systems.
CEA11 - Capability to understand the advanced techniques of Computational Intelligence, and to know how to design, implement and apply these techniques in the development of intelligent applications, services or systems.
CEA12 - Capability to understand the advanced techniques of Knowledge Engineering, Machine Learning and Decision Support Systems, and to know how to design, implement and apply these techniques in the development of intelligent applications, services or systems.
CEA13 - Capability to understand advanced techniques of Modeling, Reasoning and Problem Solving, and to know how to design, implement and apply these techniques in the development of intelligent applications, services or systems.
CEA14 - Capability to understand the advanced techniques of Vision, Perception and Robotics, and to know how to design, implement and apply these techniques in the development of intelligent applications, services or systems.

Professional

CEP1 - Capability to solve the analysis of information needs from different organizations, identifying the uncertainty and variability sources.
CEP2 - Capability to solve the decision making problems from different organizations, integrating intelligent tools.
CEP3 - Capacity for applying Artificial Intelligence techniques in technological and industrial environments to improve quality and productivity.
CEP4 - Capability to design, write and report about computer science projects in the specific area of Artificial Intelligence.
CEP5 - Capability to design new tools and new techniques of Artificial Intelligence in professional practice.
CEP6 - Capability to assimilate and integrate the changing economic, social and technological environment to the objectives and procedures of informatic work in intelligent systems.
CEP7 - Capability to respect the legal rules and deontology in professional practice.
CEP8 - Capability to respect the surrounding environment and design and develop sustainable intelligent systems.

Direction and management

CDG1 - Capability to integrate technologies, applications, services and systems of Informatics Engineering, in general and in broader and multidisciplinary contexts.
CDG2 - Capacity for strategic planning, development, direction, coordination, and technical and economic management in the areas of Informatics Engineering related to: systems, applications, services, networks, infrastructure or computer facilities and software development centers or factories, respecting the implementation of quality and environmental criteria in multidisciplinary working environments.
CDG3 - Capability to manage research, development and innovation projects in companies and technology centers, guaranteeing the safety of people and assets, the final quality of products and their homologation.

Specific

CTE1 - Capability to model, design, define the architecture, implement, manage, operate, administrate and maintain applications, networks, systems, services and computer contents.
CTE2 - Capability to understand and know how to apply the operation and organization of Internet, technologies and protocols for next generation networks, component models, middleware and services.
CTE3 - Capability to secure, manage, audit and certify the quality of developments, processes, systems, services, applications and software products.
CTE4 - Capability to design, develop, manage and evaluate mechanisms of certification and safety guarantee in the management and access to information in a local or distributed processing.
CTE5 - Capability to analyze the information needs that arise in an environment and carry out all the stages in the process of building an information system.
CTE6 - Capability to design and evaluate operating systems and servers, and applications and systems based on distributed computing.
CTE7 - Capability to understand and to apply advanced knowledge of high performance computing and numerical or computational methods to engineering problems.
CTE8 - Capability to design and develop systems, applications and services in embedded and ubiquitous systems.
CTE9 - Capability to apply mathematical, statistical and artificial intelligence methods to model, design and develop applications, services, intelligent systems and knowledge-based systems.
CTE10 - Capability to use and develop methodologies, methods, techniques, special-purpose programs, rules and standards for computer graphics.
CTE11 - Capability to conceptualize, design, develop and evaluate human-computer interaction of products, systems, applications and informatic services.
CTE12 - Capability to create and exploit virtual environments, and to create, manage and distribute multimedia content.

Computer graphics and virtual reality

CEE1.1 - Capability to understand and know how to apply current and future technologies for the design and evaluation of interactive graphic applications in three dimensions, either when prioritizing image quality or when prioritizing interactivity and speed, and to understand the associated trade-offs and the reasons that cause them.
CEE1.2 - Capability to understand and know how to apply current and future technologies for the evaluation, implementation and operation of virtual and/or augmented reality environments, and 3D user interfaces based on devices for natural interaction.
CEE1.3 - Ability to integrate the technologies mentioned in CEE1.2 and CEE1.1 skills with other digital processing information technologies to build new applications as well as make significant contributions in multidisciplinary teams using computer graphics.

Computer networks and distributed systems

CEE2.1 - Capability to understand models, problems and algorithms related to distributed systems, and to design and evaluate algorithms and systems that process the distribution problems and provide distributed services.
CEE2.2 - Capability to understand models, problems and algorithms related to computer networks and to design and evaluate algorithms, protocols and systems that process the complexity of computer communications networks.
CEE2.3 - Capability to understand models, problems and mathematical tools to analyze, design and evaluate computer networks and distributed systems.

Advanced computing

CEE3.1 - Capability to identify computational barriers and to analyze the complexity of computational problems in different areas of science and technology as well as to represent high complexity problems in mathematical structures which can be treated effectively with algorithmic schemes.
CEE3.2 - Capability to use a wide and varied spectrum of algorithmic resources to solve high difficulty algorithmic problems.
CEE3.3 - Capability to understand the computational requirements of problems from non-informatics disciplines and to make significant contributions in multidisciplinary teams that use computing.

High performance computing

CEE4.1 - Capability to analyze, evaluate and design computers and to propose new techniques for improving their architecture.
CEE4.2 - Capability to analyze, evaluate, design and optimize software considering the architecture and to propose new optimization techniques.
CEE4.3 - Capability to analyze, evaluate, design and manage system software in supercomputing environments.

Service engineering

CEE5.1 - Capability to participate in improvement projects or to create service systems, providing in particular: a) innovation and research proposals based on new uses and developments of information technologies, b) application of the most appropriate software engineering and databases principles when developing information systems, c) definition, installation and management of infrastructure / platform necessary for the efficient running of service systems.
CEE5.2 - Capability to apply obtained knowledge in any kind of service systems, being familiar with some of them, and thorough knowledge of eCommerce systems and their extensions (eBusiness, eOrganization, eGovernment, etc.).
CEE5.3 - Capability to work in interdisciplinary engineering services teams and, provided the necessary domain experience, capability to work autonomously in specific service systems.

Specific

CEC1 - Ability to apply scientific methodologies in the study and analysis of phenomena and systems in any field of Information Technology as well as in the conception, design and implementation of innovative and original computing solutions.
CEC2 - Capacity for mathematical modelling, calculation and experimental design in engineering technology centres and business, particularly in research and innovation in all areas of Computer Science.
CEC3 - Ability to apply innovative solutions and make progress in the knowledge that exploit the new paradigms of Informatics, particularly in distributed environments.

Generic Technical Competences

Generic

CG1 - Identify and apply the most appropriate data management methods and processes to manage the data life cycle, considering both structured and unstructured data
CG2 - Identify and apply methods of data analysis, knowledge extraction and visualization for data collected in disparate formats
CG3 - Define, design and implement complex systems that cover all phases in data science projects
CG4 - Design and implement data science projects in specific domains and in an innovative way
CG5 - To be able to draw on fundamental knowledge and sound work methodologies acquired during the studies to adapt to the new technological scenarios of the future.
CG6 - Capacity for general management, technical management and research projects management, development and innovation in companies and technology centers in the area of Computer Science.
CG7 - Capacity for implementation, direction and management of computer manufacturing processes, with guarantee of safety for people and assets, the final quality of the products and their homologation.
CG8 - Capability to apply the acquired knowledge and to solve problems in new or unfamiliar environments inside broad and multidisciplinary contexts, being able to integrate this knowledge.
CG9 - Capacity to understand and apply ethical responsibility, law and professional deontology of the activity of the Informatics Engineering profession.
CG10 - Capacity to apply economics, human resources and projects management principles, as well as legislation, regulation and standardization of Informatics.

Objectives

To train students to keep up on their own with the continuous development of supercomputing systems that enable the convergence of advanced analytic algorithms such as artificial intelligence.
Related competences: CB6, CB8, CB9, CTR3, CEE4.1, CEE4.2, CEE4.3, CG1,

Contents

00. Welcome: Course content and motivation
01. Supercomputing basics
02. General purpose supercomputers
03. Parallel programming models
04. Parallel performance metrics
05. Parallel performance models
06. Heterogeneous supercomputers
07. Parallel programming languages for heterogeneous platforms
08. Emerging Trends and Challenges in Supercomputing
09. Artificial Intelligence is a Supercomputing problem
10. Deep Learning essential concepts
11. Using Supercomputers for DL training
12. Accelerate the learning with parallel training using a multi-GPU parallel server
13. Accelerate the learning with distributed training using multiple parallel servers
14. How to speed up the training of Transformers-based models

Activities

00. Welcome (Objectives: 1; Theory: 0.5h)
01. Supercomputing basics (Guided learning: 0.1h)
Exercise 01: Read and present a paper about exascale computers challenges
02. General purpose supercomputers
Exercise 02: Getting started with Supercomputing (Guided learning: 0.2h)
03. Parallel programming models
Exercise 03: Getting Started with Parallel Programming Models (Guided learning: 0.1h)
04. Parallel performance metrics
Exercise 04: Getting Started with Parallel Performance Metrics (Guided learning: 0.1h)
05. Parallel performance models
Exercise 05: Getting started with parallel performance metrics and models (Guided learning: 0.1h)
06. Heterogeneous supercomputers
Exercise 06: Comparing supercomputers performance (Guided learning: 0.1h)
07. Parallel programming languages for heterogeneous platforms
Exercise 07: Getting started with CUDA (Guided learning: 0.1h)
08. Emerging Trends and Challenges in Supercomputing
Exercise 08: Read and present a paper about emerging trends in supercomputing (Guided learning: 0.1h)
Midterm (Autonomous learning: 10h)
09. Artificial Intelligence is a Supercomputing problem
Exercise 09: First contact with Deep Learning (Guided learning: 0.1h)
10. Deep Learning essential concepts
Exercise 10: The new edition of the TOP500 (Guided learning: 0.2h)
11. Using Supercomputers for DL training
Exercise 11: Using a supercomputer for Deep Learning training (Guided learning: 0.2h)
12. Accelerate the learning with parallel training using a multi-GPU parallel server
Exercise 12: Accelerate the learning with parallel training using a multi-GPU parallel server (Guided learning: 0.2h)
13. Accelerate the learning with distributed training using multiple parallel servers
Exercise 13: Accelerate the learning with distributed training using multiple parallel servers (Guided learning: 0.2h)
14. How to speed up the training of Transformers-based models
Exercise 14: How to speed up the training of Transformers-based models (Guided learning: 0.2h)
Final remarks (Theory: 0.5h)

Teaching methodology

Class attendance and participation: Regular attendance is expected, and is required to be able to discuss concepts that will be covered during class.

Lab activities: Some exercises will be conducted as hands-on sessions during the course using supercomputing facilities. Students will need their own laptops to access these resources during the theory class. Each hands-on session involves writing a lab report with all the results. There are no separate days for theory and laboratory classes: theoretical and practical activities are interspersed within the same session to facilitate the learning process.

Reading/presentation assignments: Some exercise assignments will consist of reading documentation/papers that expand the concepts introduced during lectures. Some exercises will involve student presentations (randomly chosen).

Assessment: There will be one midterm exam in the middle of the course. Students may use any type of documentation, including digital material on their own laptops.

Evaluation methodology

This course is evaluated by continuous assessment, which takes into account the following:

20% Attendance + participation
15% Midterm exam
65% Exercises (+ exercise presentations) and Lab exercises (+ Lab reports)
Details of the weight of each component of the course in the grade are described in the tentative scheduling section.
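
As an illustration of how the three weights above combine, the following minimal sketch computes a weighted grade. The weights come from the list above; the function name, the component keys, the 0-10 scale, and the example scores are assumptions made for illustration only, not the official grading procedure.

```python
# Weights as listed in the evaluation scheme above (20% / 15% / 65%).
WEIGHTS = {
    "attendance_participation": 0.20,
    "midterm_exam": 0.15,
    "exercises_and_labs": 0.65,
}


def continuous_assessment_grade(scores):
    """Return the weighted grade for component scores on a common scale.

    `scores` maps each component name in WEIGHTS to a score; a 0-10 scale
    is assumed here for illustration.
    """
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly the weighted components")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, not real student data.
example_scores = {
    "attendance_participation": 9.0,
    "midterm_exam": 6.0,
    "exercises_and_labs": 8.0,
}
grade = continuous_assessment_grade(example_scores)
print(f"{grade:.2f}")  # 0.20*9 + 0.15*6 + 0.65*8 = 7.90
```

Because exercises and lab work carry 65% of the weight, a change in that component moves the result more than four times as much as the same change in the midterm score.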

Course Exam: For students who have not followed the continuous assessment, a course exam will be announced during the course. This exam evaluates knowledge of the entire course (practical part, theoretical part, and self-learning part). During this exam, students are not allowed to use any documentation, whether on paper or digital.

Bibliography

Basic:

Class handouts and materials associated with this class - Torres, J, 2019.
Understanding Supercomputing, to speed up machine learning algorithms (Course notes) - Torres, J, 2018.
Marenostrum4 User's guide - BSC documentation, Operations department, 2019.
High performance computing : modern systems and practices - Sterling, T.; Anderson, M.; Brodowicz, M, Morgan Kaufmann, 2018. ISBN: 9780124201583
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991004173809706711&context=L&vid=34CSUC_UPC:VU1&lang=ca
Dive into deep learning - Zhang, A.; Lipton, Z.C.; Li, M.; Smola, A.J, 2020.
First contact with Deep learning: practical introduction with Keras - Torres, J, Kindle Direct Publishing, 2018. ISBN: 9781983211553
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991004153269706711&context=L&vid=34CSUC_UPC:VU1&lang=ca

Web links

Updated information for this course: https://torres.ai/SA-MIRI/

Previous capacities

Programming in C and Linux basics are expected for this course. In addition, prior exposure to parallel programming constructs, the Python language, linear algebra/matrices, or machine learning will be helpful.
