PATTERN RECOGNITION AND CLASSIFICATION
Lectures in Italian, books in English
The aim is to provide the basics of probability theory that underpin the resolution of classification/clustering problems in pattern recognition. Theoretical problems will be discussed, and students will learn how to implement algorithms for classification and machine learning on big data.
Knowledge and understanding: the student must demonstrate knowledge and understanding of the theoretical and practical fundamentals of pattern recognition problems for classification and clustering, with a focus on automatic learning and feature extraction from data.
Ability to apply knowledge and understanding: the student must demonstrate the ability to use the acquired knowledge to analyse complex and big datasets, with the goal of implementing and customising the most appropriate and recent techniques and algorithms for knowledge extraction.
Autonomy of judgement: the student should be able to independently assess the challenges of heterogeneous and big datasets, and to detect and analyse the strengths and limitations of known machine learning algorithms for solving classification/clustering problems.
Communication skills: the student should be able to discuss, with scientific rigor and adequate terminology, the complex theoretical topics of machine learning techniques, as well as critically evaluate experimental results on real case studies.
Learning skills: students must be able to autonomously keep up to date with, and deepen their understanding of, recent and emerging machine learning and deep learning proposals in the scientific literature. They must also be able to critically assess the course lectures and develop a personal perspective on the state of the art.
A good knowledge of Image Processing is required to understand the problems of classification and clustering on images.
A good knowledge of probability theory and mathematical analysis is strongly required to understand the techniques and algorithms discussed during the course.
The extended program of the course is organised in the following lessons:
• Bayesian decision theory: probability theory, prior/posterior probability, maximum a posteriori (MAP) probability.
• Discriminant functions, the univariate/multivariate Gaussian (Normal) density, linear transformations.
• Non-parametric methods: non-parametric density estimation, the histogram method, K-nearest neighbours (KNN) and Parzen windows, examples and exercises.
• Unsupervised learning and clustering: definition of identifiability and mixtures of densities, maximum likelihood estimation (MLE), mixtures of Gaussians (MOG), definition of a cluster, squared-error partitioning, hierarchical clustering: dendrograms, single-linkage and complete-linkage clustering, graph-theoretic clustering, examples and exercises.
• Principal Components Analysis (PCA): dimensionality reduction techniques, geometrical representation of PCA, algebraic definition of PCA, examples of use of PCA, practical examples and exercises.
• Support Vector Machines (SVM): non-linear classification, mathematical definition of SVM, the kernel trick, examples and exercises on SVM.
• Expectation-Maximization Algorithm (EM): definition and discussion of EM algorithm, the practical use of EM algorithm for MOG, examples.
• Supervised and Unsupervised learning: Perceptron, Multilayer Perceptron network (MLP), Self-Organising Maps (SOM), examples and exercises.
• Deep Learning: Restricted Boltzmann Machines and Deep Belief Networks (DBN), convolutional neural networks (CNN), examples and exercises.
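To give a flavour of the topics above, the following is a minimal sketch of Principal Components Analysis via eigendecomposition of the sample covariance matrix, one of the dimensionality reduction techniques listed in the program. The function name and the toy dataset are invented for illustration; the course does not prescribe this exact implementation.

```python
import numpy as np

def pca(X, n_components):
    """Project data onto its top principal components (minimal sketch)."""
    # Centre the data: PCA is defined on zero-mean variables.
    mean = X.mean(axis=0)
    Xc = X - mean
    # Sample covariance matrix (features x features).
    cov = np.cov(Xc, rowvar=False)
    # eigh is used because the covariance matrix is symmetric.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Sort eigenvectors by decreasing eigenvalue (variance explained).
    order = np.argsort(eigvals)[::-1]
    components = eigvecs[:, order[:n_components]]
    # Project the centred data onto the top components.
    return Xc @ components, components

# Toy data: points scattered mostly along the direction (1, 2),
# so a single principal component captures almost all the variance.
rng = np.random.default_rng(0)
t = rng.normal(size=(200, 1))
X = np.hstack([t, 2 * t]) + 0.05 * rng.normal(size=(200, 2))
Z, comps = pca(X, n_components=1)
```

In the laboratory, such a sketch would typically be compared against library implementations and validated on the online datasets used during the exercises.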
The teaching provides the fundamentals of machine learning techniques and pattern recognition, starting from Bayesian decision theory, going through parametric and non-parametric methods, supervised/unsupervised learning and classification/clustering, up to neural networks and deep learning.
The teaching consists of a total of 72 hours of frontal lessons (9 CFU), divided into a module for the theoretical presentation of the algorithms and solutions (6 CFU) and a second module for laboratory activities (3 CFU). The teaching takes place in the classroom or laboratory and is organised in 3-hour lessons according to the academic calendar. The frontal lessons consist of theoretical and practical lectures as well as exercise sessions given by the teacher according to the topics of the course.

Theoretical lessons are meant to transfer to the student the fundamentals of decision theory and of parametric/non-parametric methods for estimating the probability distributions of big datasets. They also consider the challenges and open issues of classification/clustering problems on data, as well as techniques of feature extraction and data mining for knowledge extraction. Evolved and emerging techniques for automatic learning on data conclude the course, with an accurate discussion of the Hebbian theory of learning and MLP neural networks, up to deep learning with feed-forward neural networks and backpropagation, convolutional neural networks (CNN), Boltzmann Machines and Deep Belief Networks (DBN).

Laboratory lessons are meant to retrace the theoretical topics, providing feasible implementations of the algorithms and analysing their strengths and limitations on different datasets available online. During the laboratory part, exercise lessons are provided to the students. They are collegial in nature, take place in the laboratory and are given by the teacher, who proposes solutions to practical exercises meant to verify the adoption and implementation of the theoretical topics presented in the previous lessons. The resolution of such exercises allows students to verify their understanding of the theoretical concepts and their ability to propose alternative implementations.
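As an illustration of the kind of implementation developed in the laboratory lessons, below is a minimal sketch of the classic perceptron learning rule mentioned in the program. The function name, hyperparameters and toy dataset are invented for illustration and do not reflect the actual course exercises.

```python
import numpy as np

def train_perceptron(X, y, epochs=20, lr=1.0):
    """Perceptron learning rule for binary labels in {-1, +1} (minimal sketch)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            # Update weights only on misclassified (or boundary) samples.
            if yi * (xi @ w + b) <= 0:
                w += lr * yi * xi
                b += lr * yi
    return w, b

# Toy linearly separable data: class given by the sign of x0 + x1.
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, b = train_perceptron(X, y)
preds = np.sign(X @ w + b)
```

For linearly separable data, the perceptron convergence theorem guarantees that this loop terminates with a separating hyperplane; the MLP and deep learning lessons then generalise this idea to non-linear decision boundaries.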
Attendance is strongly encouraged, although it is optional. The exam is the same for all students, regardless of their attendance of the lessons.
Pattern Classification, 2nd Edition, Richard O. Duda, Peter E. Hart, David G. Stork, ISBN: 978-0-471-05669-0, Wiley-Interscience, 2000
Project and Final oral exam.
The exam consists of a project assigned to each student individually. The student must demonstrate the ability to read and understand a scientific article from the literature on one of the topics of the course. By preparing a report, the student must be able to describe and critically discuss, with scientific rigor, the assigned article. Moreover, he/she must provide a feasible and reliable implementation of the algorithm, replicate the experiments discussed in the paper, and try to widen the analysis of the results with his/her own experimentation. At this stage, the student can freely arrange his/her work and schedule; no deadlines are set for the delivery of the project.
The student proceeds to the second and final part of the exam, the oral test, only when he/she believes that the project is ready to be discussed. The student then books a place in the upcoming exam session. The oral test is organised in two separate parts. In the first part the student presents and discusses the project (possibly with the support of a few slides) to the teacher. On completion of this first stage, the oral test continues with some questions about the theoretical topics of the course. The teacher will evaluate the maturity of the student and his/her ability to provide answers with appropriate terms and technical/scientific rigor.
The final score will consider both the accuracy of the presentation of the project and the quality of the oral test. The maximum score is 30 and the minimum to pass the exam is 18.
The learning material is provided, in English, through the e-learning platform at the following weblink: http://e-scienzeetecnologie.uniparthenope.it. To access the published material, the student must be regularly registered on the platform and enrolled in the course.
Appointments are taken by contacting the teacher by email.