Unit STATISTICS FOR DATA SCIENCE WITH R AND PYTHON

Course
Quantitative finance and data science for economics
Study-unit Code
A003079
Location
PERUGIA
Curriculum
Data science for economics and finance
Teacher
Elena Stanghellini
CFU
12
Course Regulation
Cohort 2026
Offered
2026/27
Type of study-unit
Required
Type of learning activities
Integrated learning activity

Module I Generalized linear models

Code A003092
Location PERUGIA
CFU 6
Teacher Simone Del Sarto
Teachers
  • Simone Del Sarto
Hours
  • 42 hours - Simone Del Sarto
Learning activities Core
Area Mathematical, statistical and computer science disciplines
Sector STAT-01/A
Type of study-unit Required
Language of instruction English
Contents Review of probability and statistical inference; maximum likelihood theory; simple and multiple linear regression models; least squares method; model diagnostics; categorical explanatory variables and analysis of variance; introduction to generalised linear models; overview of the logistic regression model; Poisson model for count data; numerical methods for maximum likelihood estimation of generalised linear models.
Reference texts Alan Agresti, Maria Kateri (2021): Foundations of Statistics for Data Scientists (with R and Python). CRC Press, Chapman & Hall. ISBN: 9781003159834
Educational objectives Students will learn to formulate statistical models suited to the main types of response variable, to estimate them, and to draw inferential conclusions from observed data. The course also illustrates basic diagnostic techniques for model selection, while conveying the guiding principles of statistical modelling, which often go beyond technicalities.
Prerequisites Basic knowledge of univariate and bivariate descriptive statistics, probability theory (main random variables and their probability mass/density functions, expected values, variances, etc.) and inferential statistics (point estimation, confidence intervals, hypothesis testing).
Teaching methods Theoretical lectures and practical sessions using suitable software.
Other information
Learning verification modality Oral examination with questions on theory, plus analysis of and commentary on software output from estimating the models covered in the course.
Extended program
  • Review of probability and statistical inference: main random variables and their moments; properties of estimators; confidence intervals and hypothesis tests for means and proportions.
  • Likelihood theory: definition of the likelihood function and parameter estimation by maximising it; properties and examples for the parameters of the main distributions; brief introduction to bootstrap resampling methods; likelihood ratio test and Wald test.
  • Simple linear regression model: parameter estimation by the least squares method; standard error estimation; interpretation of effects; model diagnostics and goodness of fit; relationship between regression analysis and linear correlation.
  • Multiple linear regression model: parameter estimation and standard errors; interpretation of effects; proper specification of the functional form of the model (higher-order effects and interactions); diagnostic analysis (checking the assumptions underlying the model and remedies for possible violations); inference on the linear model (F-tests and t-tests for global and local significance); categorical explanatory variables and analysis of variance tests; matrix formulation of linear models.
  • Generalised linear models: the three key components and their specification for the major distributions (Normal, Binomial and Poisson); model deviance and likelihood ratio test; model selection; Poisson model for count data.
  • Numerical methods for estimating the parameters of a generalised linear model: Newton-Raphson algorithm and Fisher scoring.
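As an illustration of the last topic in the programme, the following is a minimal sketch of Fisher scoring for a Poisson model with log link and a single predictor. It is not taken from the course materials: the function name `fit_poisson_fisher_scoring`, the starting values and the toy data are assumptions made for this example. With the canonical log link, the expected and observed information coincide, so the same update is also the Newton-Raphson step.

```python
import math


def fit_poisson_fisher_scoring(x, y, max_iter=50, tol=1e-10):
    """Fit log(mu_i) = b0 + b1 * x_i by Fisher scoring (hypothetical helper).

    Pure-Python sketch for one predictor; real analyses would normally rely
    on an established GLM routine instead.
    """
    # Assumed starting point: the intercept-only fit, b0 = log(mean(y)), b1 = 0.
    b0 = math.log(sum(y) / len(y))
    b1 = 0.0
    for _ in range(max_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector: X^T (y - mu)
        s0 = sum(yi - mi for yi, mi in zip(y, mu))
        s1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Fisher information: X^T diag(mu) X, a symmetric 2x2 matrix
        i00 = sum(mu)
        i01 = sum(xi * mi for xi, mi in zip(x, mu))
        i11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = i00 * i11 - i01 * i01
        # Solve (information) * delta = score for the 2x2 system
        d0 = (i11 * s0 - i01 * s1) / det
        d1 = (i00 * s1 - i01 * s0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Toy data (made up for illustration): counts observed at five predictor values.
x_demo = [0, 1, 2, 3, 4]
y_demo = [1, 3, 2, 6, 9]
b0_hat, b1_hat = fit_poisson_fisher_scoring(x_demo, y_demo)
print(b0_hat, b1_hat)
```

At convergence the fitted means satisfy the likelihood equations, so the residuals `y - mu` sum to zero and are uncorrelated with the predictor; checking this is a simple way to confirm that the iteration has reached the maximum likelihood estimate.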
2030 Agenda for Sustainable Development goals

Credit scoring

Code A003093
Location PERUGIA
CFU 6
Teacher Elena Stanghellini
Teachers
  • Elena Stanghellini
Hours
  • 42 hours - Elena Stanghellini
Learning activities Core
Area Mathematical, statistical and computer science disciplines
Sector STAT-01/A
Type of study-unit Required