Cloud Computing and Big Data Analytics

Credits
6
Types
  • MIRI: Specialization complementary (Computer Networks and Distributed Systems)
  • MDS: Elective
Requirements
This subject has no formal requirements, but it assumes some previous capacities.
Department
AC
Cloud computing is a service model for large-scale distributed computing based on concentrated infrastructure and a set of collaborative services over which applications can be deployed and run over the network.

This course takes a mainly practical approach to Cloud computing and its related technologies. While the concepts explained apply to any application, the classes pay particular attention to the creation of Big Data Analytics applications on the Cloud.

In the lectures of this course, the students will learn the principles and the state of the art of large-scale distributed computing in a service-based model. Students will study how scale affects system properties, models, architecture, and requirements.

Regarding principles, this course looks at how scale affects system properties, issues (such as virtualization, availability, locality, performance, and adaptation), system models, architectural models, and environment and application requirements (such as fault tolerance and content distribution). This course also reviews the state of the art in resource management of cloud environments (composed of different types of platforms and organizations) to support current applications and their requirements.

In the laboratory sessions of this course, the students will gain a practical view of the latest Cloud technology in order to implement a prototype that realizes a business idea created by a student. The students will begin by building an essential toolbox to get started in the Cloud. They will later practice with APIs, the doors in the Cloud. All these things together will allow the students to mine the deluge of data coming from the Cloud or to use the new advanced analytics services that the Cloud now provides. Finally, they will look under the hood of these high-level analytics services in the Cloud, in terms of either software or hardware, to understand how to meet high-performance requirements.

Teachers

Person in charge

  • Angel Toribio Gonzalez
  • Jordi Torres Viñals

Others

  • Josep Lluís Berral García
  • René Serral Gracià

Weekly hours

Theory
2
Problems
0
Laboratory
1.2
Guided learning
0.8
Autonomous learning
6.8

Competences

Technical Competences of each Specialization

Computer networks and distributed systems

  • CEE2.1 - Capability to understand models, problems and algorithms related to distributed systems, and to design and evaluate algorithms and systems that process the distribution problems and provide distributed services.
  • CEE2.3 - Capability to understand models, problems and mathematical tools to analyze, design and evaluate computer networks and distributed systems.

Generic Technical Competences

Generic

  • CG4 - Capacity for general and technical management of research, development and innovation projects, in companies and technology centers in the field of Informatics Engineering.
  • CG5 - Capability to apply innovative solutions and make progress in the knowledge to exploit the new paradigms of computing, particularly in distributed environments.

Transversal Competences

Entrepreneurship and innovation

  • CTR1 - Capacity for knowing and understanding a business organization and the science that rules its activity; capability to understand the labour rules and the relationships between planning, industrial and commercial strategies, quality and profit. Capacity for developing creativity, entrepreneurship and a trend towards innovation.

Teamwork

  • CTR3 - Capacity of being able to work as a team member, either as a regular member or performing directive activities, in order to help the development of projects in a pragmatic manner and with sense of responsibility; capability to take into account the available resources.

Reasoning

  • CTR6 - Capacity for critical, logical and mathematical reasoning. Capability to solve problems in their area of study. Capacity for abstraction: the capability to create and use models that reflect real situations. Capability to design and implement simple experiments, and analyze and interpret their results. Capacity for analysis, synthesis and evaluation.

Basic

  • CB7 - Ability to integrate knowledge and handle the complexity of making judgments based on information which, being incomplete or limited, includes considerations on social and ethical responsibilities linked to the application of their knowledge and judgments.
  • CB8 - Capability to communicate their conclusions, and the knowledge and rationale underpinning these, to both skilled and unskilled public in a clear and unambiguous way.

Objectives

  1. Present the student with new execution environments required to manage the computing resources and simplify the development and integration of the different types of applications and services in today's Internet-scale systems.
    Related competences: CEE2.1, CEE2.3
  2. Collaborate in the design, implementation and presentation of a cloud computing environment that is required for a class project.
    Related competences: CB7, CB8, CTR1, CTR3, CTR6, CEE2.1
  3. Find and understand useful information to create innovative solutions.
    Related competences: CG4, CG5

Contents

  1. Lectures: Cloud Computing fundamentals
    Fundamental concepts: The effect of scale on system properties.
    ---- Issues in large-scale systems: virtualization, service orientation and composition, availability, locality, performance and adaptation.
    ---- Models for large-scale systems: system models for analysis, architectural models and service/deployment models.
    ---- Scaling techniques: basic techniques, scalable computing techniques for architectural models.
    ---- Middleware and Applications: computing, storage, web, content distribution, Internet-scale systems or services.
    ---- Environment and applications requirements.
  2. Laboratory sessions: Practical view of Cloud Computing
    Big Data Analytics in the Cloud
    ---- APIs: The Doors in the Cloud
    ---- Current required layers in Big Data Software Stack
    ---- New Software requirements for Advanced Analytics
    ---- New Hardware requirements for Advanced Analytics
  3. Assignment: Experimental part
    Development of a prototype application using Cloud service offerings (such as AWS, Google App Engine, OpenStack, OpenNebula)
    ---- Development of a prototype application using advanced analytics services provided either through APIs or as Software as a Service.

Activities

Activity Evaluation act


Introduction

Cloud Computing Definition. Service Oriented Architectures. Web Services. Business considerations.

Theory
2h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
4h

Cloud Computing Architecture

Technology. Architecture. Modelling and Design.
Objectives: 1
Theory
2h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
4h

Virtualization

Foundations. Grid, cloud and virtualization.

Theory
2h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
4h

Data Storage


Objectives: 1
Theory
4h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
8h

Cloud Services



Theory
4h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
8h

Cloud Security



Theory
2h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
4h

Service Oriented Architectures



Theory
2h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
4h

Cloud Tools



Theory
4h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
8h

Cloud Applications



Theory
3h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
6h

Future Trends



Theory
2h
Problems
0h
Laboratory
0h
Guided learning
0h
Autonomous learning
2h

Collaborative class project


Objectives: 2 3
Theory
0h
Problems
0h
Laboratory
0h
Guided learning
4h
Autonomous learning
30h

Lab: Basic knowledge toolbox



Theory
0h
Problems
0h
Laboratory
0.5h
Guided learning
1.5h
Autonomous learning
0h

Lab: Doors in the cloud



Theory
0h
Problems
0h
Laboratory
1.5h
Guided learning
0.5h
Autonomous learning
0h

Lab: Content Delivery Network



Theory
0h
Problems
0h
Laboratory
1.5h
Guided learning
0.5h
Autonomous learning
0h

Lab: Extract and analyze data



Theory
0h
Problems
0h
Laboratory
1.5h
Guided learning
0.5h
Autonomous learning
0h

Lab: Interact with users and services



Theory
0h
Problems
0h
Laboratory
1.5h
Guided learning
0.5h
Autonomous learning
0h

Lab: Monitoring and Security



Theory
0h
Problems
0h
Laboratory
1.5h
Guided learning
0.5h
Autonomous learning
0h

Lab: Data storage



Theory
0h
Problems
0h
Laboratory
1.5h
Guided learning
0.5h
Autonomous learning
0h

Lab: Web Services



Theory
0h
Problems
0h
Laboratory
1.5h
Guided learning
0.5h
Autonomous learning
0h

Discuss: Virtualization



Theory
0h
Problems
0h
Laboratory
1h
Guided learning
1h
Autonomous learning
2h

Discuss: Cloud providers comparison



Theory
0h
Problems
0h
Laboratory
1h
Guided learning
1h
Autonomous learning
2h

Discuss: Federated Cloud Computing



Theory
0h
Problems
0h
Laboratory
1h
Guided learning
1h
Autonomous learning
2h

Discuss: Cloud governance



Theory
0h
Problems
0h
Laboratory
1h
Guided learning
1h
Autonomous learning
2h

Discuss: Future trends



Theory
0h
Problems
0h
Laboratory
1.2h
Guided learning
1.5h
Autonomous learning
2h

Teaching methodology

Lectures, reading and discussion of technical and research papers, and presentation of topics (and papers) by students. Laboratory sessions and a practical class project.

Students are required to bring their laptop to carry out the laboratory sessions and practical class project.

This subject is taught only in English.

Evaluation methodology

Students will be evaluated on class participation and attendance, laboratory sessions, reading and presenting reports and papers, and assignments on specific topics.

The final grade for the course is the weighted average of the grades obtained in each of the following components of the course:
· Lab sessions: 30%
· Papers Reading/Presentation and homework: 20%
· Course Projects: 30%
· Final exam: 20%
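
As an illustration of the weighting above, the following minimal Python sketch computes a final grade from the four component grades. The weights come from the list above; the component names, the function name `final_grade`, the example grades and the 0-10 grading scale are assumptions made for illustration only and are not part of the official evaluation rules.

```python
# Weights taken from the evaluation list above.
# The dictionary keys are illustrative names, not official identifiers.
WEIGHTS = {
    "lab_sessions": 0.30,
    "papers_and_homework": 0.20,
    "course_project": 0.30,
    "final_exam": 0.20,
}


def final_grade(grades: dict[str, float]) -> float:
    """Return the weighted average of the four component grades."""
    missing = WEIGHTS.keys() - grades.keys()
    if missing:
        raise ValueError(f"missing component grades: {sorted(missing)}")
    return sum(WEIGHTS[name] * grades[name] for name in WEIGHTS)


# Hypothetical grades on an assumed 0-10 scale.
example = {
    "lab_sessions": 8.0,
    "papers_and_homework": 7.0,
    "course_project": 9.0,
    "final_exam": 6.0,
}
print(round(final_grade(example), 2))
```

Because the lab sessions and the course project each carry 30%, they contribute more to the result than the final exam or the paper presentations and homework, which carry 20% each.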

In order to be able to publicly defend the course project, students must have attended at least 70% of the classes and teams must have delivered on time the activities that have been planned during the course. The course project is the result of teamwork, which will be reflected in the grade given to the group as a whole. Each member of the group will be responsible for part of the project and might be graded individually on his or her contribution.

Previous capacities

General knowledge of:
- TCP/IP networking
- Basic operating system administration and use of the OS from programs
- Software development

Basic knowledge of:
- Unix command line.
- Python programming language.
- Git version control system.

Warning. Students are expected to have the above background before starting the laboratory sessions. Complementary fast-paced materials will be provided before class to help students meet the above requirements.