PARALLEL AND DISTRIBUTED COMPUTING
The course introduces the methodologies, techniques and tools used to develop algorithms and software for parallel computing environments (high-performance computing, HPC).
For the laboratory activity, the course uses the C/C++ programming language and introduces the standard parallel computing libraries (MPI and OpenMP), applying them to parallel software development in different high-performance environments, such as clusters of multiprocessor and/or multicore CPUs.
Knowledge and understanding: the student must demonstrate knowledge of the fundamentals of parallel and distributed computing, particularly with regard to the different forms of hardware and software parallelism and the parallel strategies for some basic computational kernels.
Ability to apply knowledge and understanding: the student must demonstrate the ability to use the parallel strategies studied and the available standard libraries to develop algorithms in a high-performance environment, leveraging knowledge of the parallel software evaluation parameters and of the kind of hardware available.
Autonomy of judgement: the student must be able to independently evaluate the results of a parallel algorithm by analyzing its speed-up and efficiency.
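The evaluation parameters referred to here can be summarized as follows (one common convention; T(1) is the serial runtime and T(p) the runtime on p processors):

```latex
S(p) = \frac{T(1)}{T(p)} \quad \text{(speed-up)}, \qquad
E(p) = \frac{S(p)}{p} \quad \text{(efficiency)},
```

with ideal values S(p) = p and E(p) = 1; measured values below these reveal the overhead of the parallel implementation.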
Communication skills: the student should be able to illustrate a parallel algorithm and document its implementation in a high-performance environment.
Learning skills: the student must be able to update and deepen topics and specific applications of numerical computing, including by accessing databases, online scientific software repositories and other tools available on the web.
Students attending the course must have acquired the knowledge and skills taught in the following courses: Mathematics 1, Computer Programming 1 with Labs, Operating Systems with Laboratory, and Algorithms and Data Structures with Laboratory.
Introduction to high-performance computing: definition, motivation and evolution of supercomputers
- Kinds of parallelism: temporal, spatial and asynchronous
- First kind of on-chip parallelism: pipelined units and vector processors
- Flynn's classification; MIMD shared memory (SM) and MIMD distributed memory (DM)
- Interconnection networks
- Second kind of on-chip parallelism: cluster and multicore architectures
- Basic issues and differences between the concepts of parallel computing and distributed computing
- Evaluation parameters of parallel software: speed-up, overhead and efficiency of a parallel algorithm; the Ware-Amdahl law (basic and generalized formulation); communication overhead (unitary and total); scaled speed-up and efficiency; isoefficiency; scalability; Gustafson's law
- Summation in parallel: problem decomposition for strategies I, II and III (in MIMD shared-memory and MIMD distributed-memory environments)
- Matrix-vector product: problem decomposition (block algorithms) for strategies I, II and III (in MIMD shared-memory and MIMD distributed-memory environments)
- MPI (Message Passing Interface): the message-passing model, basic features and functions; main routines for process management and communication (functions for defining the environment, for one-to-one communications and for group communications); virtual topologies: processor grids
- OpenMP (Open specifications for Multi Processing): processes and threads; synchronization and semaphores; the fork-join parallel execution model; compiler directives, constructs and clauses; runtime library routines and environment variables
- Writing, compiling and running programs that use the MPI and OpenMP libraries in the C/C++ language
L. Marcellino: “Richiami di Calcolo Parallelo 1”, e-learning platform of the Department of Science and Technology, 2014.
A. Grama, G. Karypis, V. Kumar, A. Gupta: “Introduction to Parallel Computing (2nd Edition)”, Ed. Addison Wesley, 2003.
All lessons are available as slides (in PDF format) on the e-learning platform of the Department of Science and Technology, together with self-assessment exercises, library manuals, past exams, and recent papers on the most innovative parallel computing topics.
The goal of the verification procedure is to quantify, for each student, the degree of achievement of the learning objectives listed above. Specifically, the exam consists of a laboratory test verifying the ability to implement a simple high-performance computing program (30% of the final grade), a written test assessing knowledge of the parallel strategies for basic linear algebra kernels (40% of the final grade), and an oral test examining the ability to analyze parallel software in terms of efficiency (30% of the final grade).