Algorismes, Estructures de Dades i Bases de Dades

Esteu aquí

Crèdits
6
Tipus
Obligatòria
Requisits
Aquesta assignatura no té requisits, però té capacitats prèvies
Departament
CS;ESSI
This is a fundamental course that covers basic concepts on algorithms, data structures and databases. This course spans two different tracks: one for students who have a major in Computer Science and another track for the rest.
During the first weeks, the course is different for both tracks. For the former, it introduces students to CS-specific advanced topics for Data Science, while for the latter it requires the students to cover basic material on algorithms, data structures and databases in a guided, yet autonomous learning manner.

The last part of the course (corresponding to the last 5 weeks) is common and it is structured as project-based learning. A statement about an end-to-end Data Science project will be delivered and students will need to create the required software infrastructure, using adequate tools.

The objective is to help students get a rigorous and strong knowledge required by the following master courses (for students with a minor in CS) or investigate further CS-specific topics for Data Science (for students with a major in CS). Last, but not least, the course promotes adopting good habits when creating software projects for Data Science in the final project.

Professors

Responsable

  • Jordi Delgado Pin ( )
  • Oscar Romero Moral ( )

Altres

  • Anna Queralt Calafat ( )
  • Maria Josefina Sierra Santibañez ( )

Hores setmanals

Teoria
1
Problemes
0
Laboratori
2.9
Aprenentatge dirigit
0
Aprenentatge autònom
6.85

Competències

Competències Transversals

ús solvent dels recursos d'informació

  • CT4 - Gestionar l'adquisició, l'estructuració, l'anàlisi i la visualització de dades i informació de l'àmbit d'especialitat, i valorar de forma crítica els resultats d'aquesta gestió.

Tercera llengua

  • CT5 - Conèixer una tercera llengua, preferentment l'anglès, amb un nivell adequat oral i escrit i en consonància amb les necessitats que tindran els titulats i titulades.

Bàsiques

  • CB6 - Que els estudiants sàpiguen aplicar els coneixements adquirits y la seva capacitat de resolució de problemes en entorns nous o poc coneguts dins de contexts més amplis (o multidisciplinaris) relacionats amb la seva àrea d'estudi.
  • CB9 - Que els estudiants posseeixin les habilitats d'aprenentatge que els permetin continuar estudiant d'una manera que haurà de ser en gran mesura autodirigida o autònoma..

Competències Tècniques Generals

Genèriques

  • CG1 - Identificar i aplicar els mètodes i processos de gestió de dades més adequats per gestionar el cicle de vida de les dades, incloent-hi dades estructurades i no estructurades

Competències Tècniques

Específiques

  • CE1 - Desenvolupar algoritmes eficients fonamentats en el coneixement i comprensió de la teoria de la complexitat computacional i les principals estructures de dades, dins de l'àmbit de ciència de dades
  • CE2 - Aplicar els fonaments de la gestió i processament de dades en un problema de ciència de dades

Objectius

  1. To analyse the cost of iterative and recursive algorithms
    Competències relacionades: CT4, CT5, CE1,
  2. To review some simple data structures: stacks, queues, lists, and trees
    Competències relacionades: CT4, CT5, CE1,
  3. To know, explain, design, analyse, compare and implement the main data structures and algorithms that can be used to implement priority queues
    Competències relacionades: CT4, CT5, CE1,
  4. To know, explain, design, analyse, compare and implement the main data structures and algorithms that can be used to implement dictionaries
    Competències relacionades: CT4, CT5, CE1,
  5. To know, explain, design, analyse, compare and implement the main data structures and algorithms that can be used to represent graphs and solve classic graph problems such as traversals, topological ordering and shortest paths
    Competències relacionades: CT4, CT5, CE1,
  6. To know, understand, explain, analyse and compare some algorithm design techniques: greedy, divide and conquer, and dynamic programming
    Competències relacionades: CT4, CT5, CE1,
  7. To be aware of the limits of computation: to understand the definitions of the P and NP classes, the concept of Polynomial-Time reduction, the notion of NP-Completeness, and to know some classic NP-complete problems
    Competències relacionades: CT4, CT5, CE1,
  8. Describe what is a database and a database management system
    Competències relacionades: CT4, CT5, CE2,
  9. Effectively use the standard Structured Query Language (SQL) to query relational databases
    Competències relacionades: CT4, CT5, CE2,
  10. Explain the relational data model, including its data structures, the relational algebra and integrity constraints
    Competències relacionades: CT4, CT5, CE2,
  11. Given a set of informational requirements, model the logic schema of a relational database
    Competències relacionades: CT4, CT5, CE2,
  12. Identify the main objectives of a database management system query optimizer
    Competències relacionades: CT4, CT5, CE2,
  13. Investigate, summarize and synthesize the main problems and their related techniques, methods and algorithms to undertake data science specific projects
    Competències relacionades: CT4, CT5, CB6, CB9,
  14. Develop a quality realistic end-to-end system architecture for a data science project
    Competències relacionades: CT4, CT5, CG1, CE1, CE2, CB6, CB9,

Continguts

  1. Basics of Analysis of Algorithms
    Worst case, best case and average case cost analysis. Asymptotic order of growth notations: Big-O, Omega and Theta. Analysis of the cost of iterative and recursive algorithms.
  2. Simple Data Structures: Review
    Stacks, queues, lists and trees.
  3. Priority Queues
    Operations of priority queues. Implementations with heaps. Heapsort.
  4. Dictionaries
    Operations of dictionaries. Basic implementations: tables and lists. Advanced implementations: hash tables, binary search trees, and AVL trees.
  5. Graphs
    Representations: adjacency matrices, adjacency lists and implicit representations. Depth-first search (DFS). Breadth-first search (BFS). Topological sort. Algorithms for shortest paths. Algorithm for minimum spanning trees.
  6. Algorithm design techniques
    Greedy, divide and conquer, and dynamic programming.
  7. Introduction to NP and Computational Intractability
    Basic introduction to P and NP classes, Polynomial-Time reduction, and NP-completeness. Examples of classic NP-complete problems.
  8. Introduction to databases and database management systems
    Main concepts on databases and database management systems. Relational database management systems.
  9. SQL: Data-definition language and data-manipulation language
    Introduction to the SQL language
  10. The relational model
    Data structures and integrity constraints. Views.
  11. The relational algebra
    The relational algebra operators and how to build data pipes with them. Notion of semantic and syntactic optimization.
  12. Logical design of relational databases
    Normalization theory. Translating conceptual schemas into relational schemas.
  13. Notions of physical design and physical database optimization
    Notions of query optimizer, access plan and cost model
  14. Data Science-related advanced topics
    Students with a major in Computer Science will investigate on advanced topics specific for data science projects. For example, data quality, entity resolution, data integration, etc.

Activitats

Activitat Acte avaluatiu


Basics of Analysis of Algorithms

To compare the efficiency of different algorithms for solving the same problem and select the most appropriate one. To compute the cost of an algorithm in the worst, best and average cases. To understand the definitions of the asymptotic order of growth notations Big-O, Omega and Theta, and their usefulness in characterising algorithm efficiency in time and space.
Objectius: 1
Continguts:
Teoria
1h
Problemes
0h
Laboratori
2h
Aprenentatge dirigit
0h
Aprenentatge autònom
0h

Review of simple data structures

Operations. Lists, Stacks, Queues, Trees
Objectius: 2
Continguts:
Teoria
1h
Problemes
0h
Laboratori
3h
Aprenentatge dirigit
0h
Aprenentatge autònom
0h

Priority Queues

Operations of priority queues. Implementations with heaps. Heapsort.
Objectius: 3
Continguts:
Teoria
1h
Problemes
0h
Laboratori
3h
Aprenentatge dirigit
0h
Aprenentatge autònom
0h

Dictionaries

Operations of dictionaries and ordered dictionaries. Basic implementations: tables and lists. Advanced implementations: hash tables, binary search trees, AVL trees
Objectius: 4
Continguts:
Teoria
1h
Problemes
0h
Laboratori
3h
Aprenentatge dirigit
0h
Aprenentatge autònom
0h

Graphs

Representations: adjacency matrices, adjacency lists and implicit representation. Depth-first search (DFS). Breadth-first search (BFS). Topological sort. Dijkstra's algorithm for shortest paths. Prim's algorithm for minimum spanning trees.
Objectius: 5
Continguts:
Teoria
1h
Problemes
0h
Laboratori
3h
Aprenentatge dirigit
0h
Aprenentatge autònom
0h

Algorithmic schemes

Divide and conquer, Greedy algorithms, Dynamic Programming, Exhaustive search, Backtracking.
Objectius: 6
Continguts:
Teoria
1h
Problemes
0h
Laboratori
4h
Aprenentatge dirigit
0h
Aprenentatge autònom
0h

Notions of Intractability

Basic introduction to P and NP classes. NP-completeness.
Objectius: 7
Continguts:
Teoria
0h
Problemes
0h
Laboratori
0h
Aprenentatge dirigit
0h
Aprenentatge autònom
0h

Partial Exam

For students with a minor in Computer Science, this exam evaluates their knowledge on fundamental concepts of algorithms, data structures and databases
Objectius: 3 9 4 6 7 1 2 5 8 10 11 12
Setmana: 12
Tipus: examen de teoria
Teoria
2h
Problemes
0h
Laboratori
0h
Aprenentatge dirigit
0h
Aprenentatge autònom
6h

Introduction to databases and database management systems

The student attends the lecture, takes notes and participates in the session exercises
Objectius: 8
Continguts:
Teoria
1h
Problemes
0h
Laboratori
0h
Aprenentatge dirigit
0h
Aprenentatge autònom
0h

SQL

The student attends the lecture, takes notes and participates in the session exercises
Objectius: 9
Continguts:
Teoria
0h
Problemes
0h
Laboratori
3h
Aprenentatge dirigit
0h
Aprenentatge autònom
0h

The Relational Model

The student attends the lecture, takes notes and participates in the session exercises
Objectius: 10 11
Continguts:
Teoria
1h
Problemes
0h
Laboratori
2h
Aprenentatge dirigit
0h
Aprenentatge autònom
0h

Logical Design of Relational Databases

The student attends the lecture, takes notes and participates in the session exercises
Objectius: 10 11
Continguts:
Teoria
1h
Problemes
0h
Laboratori
3h
Aprenentatge dirigit
0h
Aprenentatge autònom
0h

Physical Optimization

The student attends the lecture, takes notes and participates in the session exercises
Objectius: 12
Continguts:
Teoria
3h
Problemes
0h
Laboratori
4h
Aprenentatge dirigit
0h
Aprenentatge autònom
0h

Final Exam

For students with a minor in Computer Science, this exam evaluates their knowledge on fundamental concepts of algorithms, data structures and databases
Objectius: 3 9 4 6 7 1 2 5 8 10 11 12
Setmana: 17
Tipus: examen de teoria
Teoria
2h
Problemes
0h
Laboratori
0h
Aprenentatge dirigit
0h
Aprenentatge autònom
5h

Data Science Advanced Topics project (DS-AT project)

Students with a major in Computer Science will investigate on advanced topics specific for data science projects. For example, data quality, entity resolution, data integration, etc. Students with a minor in Computer Science will investigate and further study the fundamental concepts introduced in the lecturing hours.
Objectius: 13
Continguts:
Teoria
0h
Problemes
0h
Laboratori
4h
Aprenentatge dirigit
0h
Aprenentatge autònom
40h

Teoria
0h
Problemes
0h
Laboratori
4h
Aprenentatge dirigit
0h
Aprenentatge autònom
45h

Metodologia docent

During the first 10 weeks, the students are divided in two tracks: one for students with a minor in computer science (track 1) and another one for students with a major in computer science (track 2).

Students in track 1 will study fundamental concepts in algorithms, data structures and databases. First, additional material to read, study and understand is provided. Students will have a large bank of self-assessing exercises to practice their understanding on their own. Then, in the face-to-face lectures, the lecturer will solve doubts and go through representative exercises to guarantee a solid understanding. Also, for the most complex concepts, some regular teaching sessions will be scheduled.

Students in track 2 will carry out the DS-AT project. Relevant and critical problems such as data integration, entity resolution and data quality will be proposed. The student will have access to a description of the problem, some seminal research papers and reference tools in this matter.

The last part of the course (corresponding to the last 5 weeks of the semester) is common for all students regardless of the track they followed during the first 10 weeks. During these weeks the DS-EE project takes place. Students must create a end-to-end system architecture to ingest, store, process, learn models and deploy such system for a realistic project with realistic data.

This course has a strong self-learning component. Complementing it, the lecturers will provide additional material, advice hours and complementary lectures and labs to guarantee a solid comprehension.

Mètode d'avaluació

Let E1 be the score of the partial exam,
E2 the score of the final exam,
RPM the score of the DS-AT project and
CPM the score of the DS-EE project.

Then,

NE = MAX(E2, E1)
If the student followed track 1 (see methodology) then NT = NE
else if student followed track 2 (see methodology) then NT = RPM

The final mark will be 0.6*NT + 0.4*CPM
Transversal competences will weight a 0% in the final score.

Bibliografia

Bàsica:

Complementaria:

Web links

Capacitats prèvies

This course assumes basic competences in algorithms, data structures and databases. The course is structured to cope with different backgrounds and learning needs but basic knowledge on Computer Science principles is assumed: notions of computer architecture, basic programming constructs and data structures.

Addenda

Continguts

The syllabus does not change.

Metodologia docent

The course is scheduled to be given face-to-face. No changes.

Mètode d'avaluació

No changes.

Pla de contingència

In case a new lock-down is set, the course will be fully online. However, we will prioritize synchronous online sessions.