Data Science Technology
The course is taught in English and the exam is in English too.
The course aims to develop the essential skills for optimizing a relational database and to give an overview of the main alternative technologies for data persistence that have emerged with big data, so that students can make an informed choice of the most suitable system for each use case.
Most of the course covers physical database optimization. By the end, students should be able to read an execution plan and to decide when and how indexing, multidimensional clustering, range partitioning, or materialized views should be recommended in relational systems.
Concerning non-relational systems, students will be able to decide autonomously which data solution is most adequate, given the nature of the data and the requirements to be fulfilled.
Advanced features of PL/SQL will be presented, together with the rudiments of NoSQL technology.
Students are required to develop and discuss a project that adds advanced functionality and the capability to store and manage non-relational data through a polyglot system.
Knowledge and understanding:
Students are expected to manage and optimize a relational DBMS with good autonomy, to read an execution plan, and to master PL/SQL.
Knowledge of the fundamentals of NoSQL models (key-value, document store, graph-based, columnar) is required, in order to decide which technology should be preferred and under which circumstances.
Ability to apply knowledge and understanding:
The ability to add non-trivial functionality and multiple solutions to an existing database is expected.
The project must be rigorously documented, using precise language and UML diagrams.
Fundamental relational database theory and practice, as acquired in the first Database course of the bachelor's degree, is required; familiarity with Algorithms and Data Structures is recommended.
Physical Design: Data Structures, indexing, execution plans; concurrency control; reliability; Active Databases; DDBMS and NoSQL; Big Data; Data Warehouses; Database Mining.
The course treats the physical design of relational DBMSs in depth, so that students learn to tune the performance of a relational database properly. It also gives a broad overview of the most recent alternatives to relational systems and explains the appropriate use cases for each of them.
Lectures at the blackboard and practical exercises. There are no slides, and active student participation is encouraged.
Ramez A. Elmasri, Shamkant B. Navathe. Fundamentals of Database Systems. Pearson, 7th edition, 2016.
Sam Lightstone, Toby Teorey, Tom Nadeau. Physical Database Design. Morgan Kaufmann, 2007.
Scott Urman, Ron Hardman, Michael McLaughlin. PL/SQL Programming. Oracle Press, 2004.
References for some specific topics:
Joe Celko. Joe Celko’s Complete Guide to NoSQL. Morgan Kaufmann, 1st edition, 2014.
Dario Maio, Stefano Rizzi, Annalisa Franco. Esercizi di progettazione di Basi di Dati. Progetto Leonardo, 2nd edition, 2005.
Further educational material (video lessons covering the whole theory course, in Italian, and some PDF help files) is available on the e-learning platform.
Two midterm tests are offered to promote active student participation in the course. The exam is written and requires designing, through diagrams, an integrated data management system that combines various technologies.