Code |
A005434 |
CFU |
3 |
Teacher |
Andrea Capotorti |
Teachers |
- Irene Benedetti (Co-teaching)
|
Hours |
- 21 hours (Co-teaching) - Irene Benedetti
|
Learning activities |
Related/supplementary |
Area |
Related or supplementary learning activities |
Academic discipline |
MAT/06 |
Type of study-unit |
Required |
Language of instruction |
Italian |
Contents |
Notions and techniques of Probability. Random walks, Markov chains. Stationary processes. |
Reference texts |
J. Jacod and P. Protter, Probability Essentials, Springer-Verlag, Berlin Heidelberg, 2004. G. Grimmett and D. Stirzaker, Probability and Random Processes, Clarendon Press, Oxford, 1982. |
Educational objectives |
The course deepens the fundamental concepts of Probability Theory, with particular attention to the main results on the different types of convergence of sequences of random variables, characteristic functions, and the classical limit theorems. It also provides general knowledge of the main stochastic processes and mastery of the methods used to study them. The student is expected to be able to present, connect, and compare the main concepts and results covered in the course, and to prove the fundamental theorems included in the exam syllabus. The student should also be able to solve problems, following the approach illustrated by the exercises carried out in class. |
Prerequisites |
To follow the course and pass the exam, students must have achieved the learning objectives of the course Probability and Statistics I. |
Teaching methods |
Lectures on all topics in the program; exercises solved in class to train students to tackle concrete problems and to explain their solutions. |
Other information |
1) Attendance: optional. 2) For students with Specific Learning Disorders and/or Disabilities please refer to the web page: http://www.unipg.it/disabilita-e-dsa |
Learning verification modality |
The evaluation consists of an oral exam aimed at ascertaining the student's level of knowledge and understanding of the theoretical and methodological contents listed in the program. The oral exam also tests the student's presentation skills and autonomy in organizing and presenting the theoretical topics. For information on support services for students with Specific Learning Disorders and/or Disabilities, please refer to the web page: http://www.unipg.it/disabilita-e-dsa |
Extended program |
Convergence of sequences of random variables: convergence in distribution, convergence in probability, convergence in r-th mean, almost sure convergence. Characteristic functions: definition, properties, relationship with convergence in distribution, continuity theorems and applications. Classical limit theorems: laws of large numbers and the central limit theorem. Random walks: distributions, first passage or return times, reflection principle and some consequences concerning occupation times. Markov chains: transition matrices, recurrent and transient states, classification of states. Stationary distributions and their connection with mean recurrence times. Consequences for random walks. Stationary processes, ergodic theorem and some of its consequences. |
Code |
A005435 |
CFU |
6 |
Teacher |
Andrea Capotorti |
Teachers |
|
Hours |
- 42 hours - Andrea Capotorti
|
Learning activities |
Related/supplementary |
Area |
Related or supplementary learning activities |
Academic discipline |
MAT/06 |
Type of study-unit |
Required |
Language of instruction |
Italian |
Contents |
Mathematical insights into advanced statistical methods in Statistical and Machine Learning, both in the case of supervised learning (classification and regression) and unsupervised learning (cluster analysis, dimensionality reduction). |
Reference texts |
Hastie T., Tibshirani R., Friedman J., The Elements of Statistical Learning: Data Mining, Inference, and Prediction (available online at https://hastie.su.domains/ElemStatLearn/download.html). James G., Witten D., Hastie T., Tibshirani R. (2021), An Introduction to Statistical Learning with Applications in R, 2nd edition, Springer-Verlag (freely available at https://www.statlearning.com). Lecture slides are available on the course's UniStudium page. |
Educational objectives |
The course is a mathematical study of the main methods and techniques of Statistical and Machine Learning, in both the supervised (regression and classification) and unsupervised (clustering and dimensionality reduction) settings. The main knowledge acquired will be: • introductory concepts and specific statistical learning models; • evaluation of the predictive capacity of regression and classification models through resampling techniques. The main skills (i.e. the ability to apply the acquired knowledge) will be: • autonomously applying the appropriate methods and algorithms to real regression, classification and clustering problems; • analyzing data with the R software to estimate supervised and unsupervised models. |
Prerequisites |
Knowledge of the main discrete and continuous statistical models, probability distributions and their properties, Bayes' theorem, and linear regression. |
Teaching methods |
Lectures and laboratory activities with the R software. |
Other information |
Attendance at classes is strongly recommended. For students with Specific Learning Disorders and/or Disabilities, please refer to the web page: http://www.unipg.it/disabilita-e-dsa |
Learning verification modality |
Ongoing assessments and final oral exam. Laboratory activities are aimed at assessing the student's ability to put into practice the methodologies introduced in class. The final oral exam is intended to assess the level of knowledge and understanding achieved by the student regarding the computational and methodological aspects covered during the course. |
Extended program |
The course includes a methodological study of advanced statistical methods for Data Science, both in the case of supervised learning (classification and regression) and unsupervised learning (cluster analysis, dimensionality reduction). These methods have been successfully applied in many fields, from finance to economics, from business analytics to social and natural sciences. The methods covered will be introduced starting from real case studies and analyzed using the R software. In detail, the following topics will be covered:
- Statistical and machine learning: introduction.
- Prediction vs. interpretability.
- Supervised vs. unsupervised learning.
- Classification vs. regression.
- Evaluation of the accuracy of a statistical model.
- Supervised learning: introduction.
- Extensions to the linear regression model: model selection and regularization. Polynomial regression.
- Resampling methods: cross-validation and bootstrap.
- Classification: introduction.
- Logistic and multinomial models.
- Linear and quadratic discriminant analysis.
- Gaussian naive Bayes.
- Gaussian finite mixture models.
- K-nearest neighbors algorithm.
- Advanced methods for regression and classification.
- Generalized Additive Models.
- Artificial neural networks.
- Decision trees.
- Bagging.
- Random forests.
- Boosting.
- Unsupervised learning: introduction.
- Principal component analysis.
- Similarity and distance measures. Distance matrix.
- Hierarchical methods for cluster analysis.
- Non-hierarchical methods (k-means method).
- Model-based clustering.
|