Crèdits
6
Tipus
Obligatòria
Requisits
Aquesta assignatura no té requisits
, però té capacitats prèvies
Departament
CS;ESSI
Web
https://learnsql.fib.upc.es/moodle/
The methodology is different for both tracks. For the former, it is a project-oriented course focusing on dataOps / MLOps, while for the latter it covers basic material on algorithms, data structures and databases in a guided, yet autonomous learning manner that will allow them to develop complex data systems.
Professorat
Responsable
- Anna Queralt Calafat ( anna.queralt@upc.edu )
- Maria Josefina Sierra Santibañez ( jsierra@cs.upc.edu )
Altres
- Marc Maynou Yelamos ( marc.maynou@upc.edu )
- Oscar Romero Moral ( oscar.romero@upc.edu )
Hores setmanals
Teoria
1
Problemes
0
Laboratori
3
Aprenentatge dirigit
0
Aprenentatge autònom
7.11
Competències
Ús solvent dels recursos d'informació
Tercera llengua
Bàsiques
Genèriques
Específiques
Objectius
-
To specify a computational problem and justify the correctness and termination of an iterative or recursive algorithm that solves this problem.
Competències relacionades: CT4, CT5, CE1, -
To carry out asymptotic analyses of the worst case running time of iterative and recursive algorithms.
Competències relacionades: CT4, CT5, CE1, -
To review some simple data structures: stacks, queues, lists, and trees
Competències relacionades: CT4, CT5, CE1, -
To know what a priority queue is, be able to use it to solve
computational problems, and understand and analyse the main data
structures and algorithms that are used to implement it.
Competències relacionades: CT4, CT5, CE1, -
To know what ordered and unordered dictionaries are, be able to
use them to solve computational problems, and understand and analyse
the main data structures and algorithms that are used to implement
them.
Competències relacionades: CT4, CT5, CE1, -
To know the main data structures and algorithms that can be used
to represent graphs and solve classic graph problems such as traversals,
topological ordering and shortest paths, analyse their asymptotic
running time and be able to use them to solve computational problems.
Competències relacionades: CT4, CT5, CE1, -
To know what a decision problem is, the definitions of the
P and NP classes of problems, what a polynomial-time reduction is
and the associated notion of NP-completeness.
Competències relacionades: CT4, CT5, CE1, -
Describe what is a database and a database management system
Competències relacionades: CT4, CT5, CE2, -
Effectively use the standard Structured Query Language (SQL) to query relational databases
Competències relacionades: CT4, CT5, CE2, -
Explain the relational data model, including its data structures, the relational algebra and integrity constraints
Competències relacionades: CT4, CT5, CE2, -
Given a set of informational requirements, model the logic schema of a relational database
Competències relacionades: CT4, CT5, CE2, -
Identify the main objectives of a database management system query optimizer
Competències relacionades: CT4, CT5, CE2, -
Apply data structures, algorithms and database queries to solve a problem in a realistic situation
Competències relacionades: CT5, CG1, CE1, CE2, CB6, CB9,
Continguts
-
Specification and Analysis of Algorithms
Justification of correctness and termination of iterative and recursive
algorithms. -
Asymptotic Analysis
Asymptotic analysis: Order of growth, Big-O, Omega and Theta notations, asymptotic analysis of iterative algorithms, introduction to the asymptotic analysis of recursive algorithms (recurrences and the master method). -
Review of Object Oriented Programming
Review of Fundamental Concepts of Object Oriented Programming. -
Basic Data Structures
Basic Data Structures: stack, queue, linked-list and tree.
Abstract Data Types, examples of use, and implementations. -
Priority Queue
Priority Queue: ADT (operations) and examples of use.
Implementation with heaps and asymptotic analysis. Heapsort algorithm. -
Ordered Dictionary
Ordered Dictionary: ADT (operations) and examples of use. Implementation with binary search trees and AVL trees, and asymptotic analysis. -
Unordered Dictionary
Unordered Dictionary: ADT (operations) and examples of use. Implementation with hash tables and asymptotic analysis. -
Graph
Graph: Adjacency list and adjacency matrix representation. Traversal algorithms: depth-first search (DFS) and breadth-first search (BFS). Topological sort, shortest paths, and minimum spanning tree algorithms. -
Introduction to Complexity and Intractability
Decision problems. P and NP classes of problems. Polynomial-time reduction and NP-completeness. -
Introduction to databases and database management systems
Main concepts on databases and database management systems. Relational database management systems. -
SQL: Data-definition language and data-manipulation language
Introduction to the SQL language. -
The relational model
Data structures and integrity constraints. -
The relational algebra
The relational algebra operators and how to build data pipes with them. Notion of semantic and syntactic optimization. -
Logical design of relational databases
Design the logical schema of a database. -
Notions of physical design and physical database optimization
Notions of query optimizer, access plan and cost model. -
Data Science-related advanced topics
Students with a major in Computer Science will investigate on advanced topics specific for data science projects. For example, data
quality, entity resolution, data integration, etc.
Activitats
Activitat Acte avaluatiu
Computer Science for Data Science
Students in the minor track will put in practice all the contents of the course by solving exercises related to algorithms, data structures, and databases. Students in the major track will undertake a project spanning all main phases of a data science. As result, they are asked to develop a quality realistic end-to-end system architecture for a data science project.Objectius: 4 9 5 7 2 3 6 10 11 12 13 1
Continguts:
- 1 . Specification and Analysis of Algorithms
- 5 . Priority Queue
- 6 . Ordered Dictionary
- 9 . Introduction to Complexity and Intractability
- 11 . SQL: Data-definition language and data-manipulation language
- 12 . The relational model
- 13 . The relational algebra
- 14 . Logical design of relational databases
- 15 . Notions of physical design and physical database optimization
- 16 . Data Science-related advanced topics
- 2 . Asymptotic Analysis
- 4 . Basic Data Structures
- 3 . Review of Object Oriented Programming
- 8 . Graph
- 7 . Unordered Dictionary
Teoria
0h
Problemes
0h
Laboratori
40.5h
Aprenentatge dirigit
0h
Aprenentatge autònom
84h
Metodologia docent
The students are divided into two tracks: one for students with a minor in computer science (track 1) and another one for students with a major in computer science (track 2).Students in track 1 will study fundamental concepts on algorithms, data structures and databases.
The Databases part for track 1 students is organized in theory sessions where the main concepts on relational databases, SQL, Relational Algebra and optimizations are presented. Examples and exercises will be introduced to practice and discuss alternative solutions during the class. Additional material to facilitate understanding theoretical concepts will be provided.
Between sessions, students will have to solve some mandatory exercises to guarantee a smooth learning process. These exercises are individual and correspond to the DBEx part of the final grade in the evaluation.
The Algorithms and Data Structures part for track 1 consists of theory, problems and programming sessions. During the 'theory sessions' the main concepts, data structures, algorithms and results will be presented and illustrated with examples. Additional material to read and problems to facilitate understanding theoretical concepts will be provided.
Exercises on theoretical aspects of the course contents, such as justifying correctness and temination, asymptotic analysis of worst case running time, application of data structure operations or algorithms to particular problem instances, or demonstration of results on data structures and algorithms will be proposed. These exercises will be solved in groups of two or three students during the 'problems sessions', to promote discussion and team work. The lecturer will answer questions and suggest reading course material relevant to the problems. Assessment of written solutions submitted by each group corresponds to the ADSEx grade in the evaluation.
Programming problems requiring the implementation and use of the main data structures presented in the course in Python will be proposed. Students should solve these problems individually during the 'programming sessions' and afterwards as homework. Assessment of programming problems corresponds to the ADSPr grade in the evaluation. and it will be carried out through short meetings requiring demonstration of some of the solutions submitted, explanation, and slight modifications.
Students in track 2 will carry out the DS-EtE project and, for them, this course is a project course. There, students must create an end-to-end system architecture to ingest, store, process, learn models and deploy such system for a realistic project with realistic data. The students must develop good practices developing such architecture (what nowadays is known as DataOps / MLOps).
This course has a strong self-learning component and shares the same objectives in both tracks. The lecturers will supervise the students progress during the semester to guarantee a proper progress.
Mètode d'avaluació
Let DBM = Databases midterm exam grade,DBF = Databases final exam grade,
DBEx = Database exercises to be solved during the course,
ADSM = Algorithms and Data Structures midterm exam grade,
ADSF = Algorithms and Data Structures final exam grade and
ADSEx = Algorithms and Data Structures exercises to be solved in groups during the course,
ADSPr = Programming problems to be solved individually during the course,
DS-EtE = DS-EtE project grade
Then,
1) If the student followed track 1 (see methodology) the mark is calculated as follows:
BD = MAX (0.2*DBEx + 0.8*DBM, DBF),
ADS = ADSPr*0.3 + ADSEx*0.1 + MAX(0.2*ADSM + 0.4*ADSF, 0.6*ADSF)
ADSDB (final course mark) = 0.4*BD + 0.6*ADS
2) If the student followed track 2 (see methodology) the mark is calculated as follows:
ADSDB (final course mark) = DS-EtE
Bibliografia
Bàsic
-
Introduction to algorithms
- Cormen, T.H. [et al.],
MIT Press,
[2022].
ISBN: 9780262046305
https://research-ebsco-com.recursos.biblioteca.upc.edu/c/ik5pvi/search/details/lq2sgumirf?db=nlebk -
Data structures and algorithms in Python
- Goodrich, M.T.; Tamassia, R.; Goldwasser, M.H,
John Wiley & Sons,
2013.
ISBN: 9781118290279
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991004196419706711&context=L&vid=34CSUC_UPC:VU1&lang=ca -
Data Structures and Algorithms with Python
- Lee, K.D.; Hubbard, S,
Springer,
2014.
ISBN: 9783319130712
http://cataleg.upc.edu/record=b1510089~S1*cat -
Computers and intractability: a guide to the theory of NP-Completeness
- Garey, M.R.; Johnson, D.S,
W.H. Freeman,
1979.
ISBN: 0716710447
http://cataleg.upc.edu/record=b1008002~S1*cat -
Problem solving with C++
- Savitch, Walter J; Mock, Kenrick,
Pearson,
[2018].
ISBN: 9780134448282
https://discovery.upc.edu/discovery/fulldisplay?docid=alma991004158189706711&context=L&vid=34CSUC_UPC:VU1&lang=ca -
Bases de dades
- Sistac, J.; Camps, R,
UOC,
2005.
ISBN: 8497883349
http://cataleg.upc.edu/record=b1300927~S1*cat -
Tècniques avançades de bases de dades
- Sistac i Planas, J. (coord.),
EDIUOC,
2000.
ISBN: 8484291065
http://cataleg.upc.edu/record=b1170684~S1*cat -
SQL: 1999: understanding relational, language components
- Melton, J.; Simon, A.R,
Morgan Kaufmann,
2002.
ISBN: 1558604561
http://cataleg.upc.edu/record=b1193378~S1*cat -
Database management systems
- Ramakrishnan, R.; Gehrke, J,
McGraw-Hill,
2003.
ISBN: 0071151109
http://cataleg.upc.edu/record=b1267022~S1*cat -
Database modeling and design: logical design
- Teorey, T.J. [et al.],
Morgan Kaufmann Publishers/Elsevier,
2011.
ISBN: 9780123820204
http://cataleg.upc.edu/record=b1431485~S1*cat -
Physical database design: the database professional's guide to exploiting indexes, views, storage, and more
- Lightstone, S.; Teorey, T.J.; Nadeau, T,
Morgan Kaufmann Publishers,
2007.
ISBN: 9780123693891
http://cataleg.upc.edu/record=b1310453~S1*cat -
Database systems: the complete book
- Garcia-Molina, H.; Ullman, J.D.; Widom, J,
Pearson Education,
2009.
ISBN: 9780131873254
http://cataleg.upc.edu/record=b1346544~S1*cat
Complementari
-
Introduction to algorithms: a creative approach
- Manber, U,
Addison-Wesley,
1989.
ISBN: 0201120372
http://cataleg.upc.edu/record=b1071970~S1*cat -
SQL-99 complete, really
- Gulutzan, P.; Pelzer, T,
R & D books,
1999.
ISBN: 0879305681
http://cataleg.upc.edu/record=b1181510~S1*cat -
Understanding SQL and Java together: a guide to SQLJ, JDBC, and related technologies
- Melton, J.; Eisenberg, A,
Morgan Kaufmann Publishers,
2000.
ISBN: 1558605622
http://cataleg.upc.edu/record=b1173373~S1*cat -
Encyclopedia of database systems
- Liu, L.; Özsu, M.T,
Springer,
2009.
ISBN: 9780387399409
http://cataleg.upc.edu/record=b1377257~S1*cat -
Cost-based oracle fundamentals
- Lewis, J,
Apress,
2006.
ISBN: 9781590596364
http://cataleg.upc.edu/record=b1327951~S1*cat
Web links
- Atenea http://atenea.upc.edu
- LEARN-SQL http://learnsql.fib.upc.edu