Unit SIGNAL PROCESSING AND OPTIMIZATION FOR BIG-DATA

Course
Computer engineering and robotics
Study-unit Code
A001256
Curriculum
Data science
Teacher
Paolo Banelli
Teachers
  • Paolo Banelli
Hours
  • 72 hours - Paolo Banelli
CFU
9
Course Regulation
Cohort 2022
Offered
2023/24
Learning activities
Affine/integrativa (Related/supplementary)
Area
Attività formative affini o integrative (Related or supplementary learning activities)
Academic discipline
ING-INF/03
Type of study-unit
Obbligatorio (Required)
Type of learning activities
Attività formativa monodisciplinare (Single-discipline learning activity)
Language of instruction
Italian
Contents
- RECALLS of STATISTICAL SIGNAL PROCESSING BASICS
- FUNDAMENTALS of CONVEX OPTIMIZATION
- BIG-DATA REDUCTION and SAMPLING
- GRAPH-BASED SIGNAL/DATA PROCESSING
- DISTRIBUTED OPTIMIZATION AND SIGNAL PROCESSING for LEARNING over NETWORKS
Reference texts
Most of the class content is inspired by selected chapters and sections of the following books:
- S. Kay, Fundamentals of Statistical Signal Processing, Vol. I & II, Prentice Hall, 1993-1998;
- S. Theodoridis, Machine Learning: A Bayesian and Optimization Perspective;
- T. Hastie et al., The Elements of Statistical Learning: Data Mining, Inference, and Prediction;
- M. E. J. Newman, Networks: An Introduction;
- S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, 2004;
- S. Boyd et al., Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers, Foundations and Trends in Machine Learning, 3(1):1-122, 2011.
Furthermore, some notes by the teacher will be made available.
Educational objectives
Understanding and applying the basics of statistical inference and convex optimization to (big-)data analytics. Understanding the concepts of data reduction and sampling, and the conditions under which statistical inference and reconstruction of the information do not suffer significantly from the reduction. Extending classical signal processing to signals defined over graphs, a natural representation of big data whose structure may reflect their distribution over a network, their statistical similarity, or both. Understanding the methodological tools to distribute complex statistical inference across parallel and distributed agents (computers, etc.), as a way to scale statistical inference to big data that may be geographically or logically distributed over a network. Learning from observed data the topological structure that characterizes their generation and evolution.
Prerequisites
Mandatory: Calculus, Linear Algebra, Random Variables and Stochastic Processes, Fourier Analysis, Digital Signal Processing. Suggested: Machine Learning and Data Mining. Useful: Estimation and Detection Theory (Statistical Inference).
Teaching methods
The class will be given face-to-face by the lecturer with the aid of computer slides. Furthermore, some of the algorithms will also be implemented in PC-based simulations, interactively with the students.
Other information

Learning verification modality
1) Short thesis on a topic related to the class content, with computer-aided simulations, to be submitted one week before the oral exam.
2) Oral exam: discussion of the thesis plus, typically, two questions.
Extended program
- Part I: RECALLS on BASICS OF STATISTICAL INFERENCE AND LEARNING (6 hours)
Recalls on estimators, frequentist and Bayesian, performance indicators, and common estimators (MVUE, MLE, MMSE, LS, etc.).
Recalls on binary hypothesis testing: likelihood ratio test (LRT), Neyman-Pearson and Bayesian perspectives (minimum error probability, MAP, Bayes risk); see the sketch after this part.
Statistical learning and its relationship with machine learning: linear regression, K-means, etc.
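As an illustration of the hypothesis-testing recalls, here is a minimal Python sketch of a Neyman-Pearson detector, assuming the textbook scenario of a known DC level A in white Gaussian noise, where the LRT reduces to thresholding the sample mean; the scenario and all parameter values are illustrative assumptions, not course material.

    # Neyman-Pearson LRT sketch: detect a known DC level A in white Gaussian
    # noise. The LRT reduces to comparing the sample mean to a threshold set
    # by the target false-alarm probability. Parameters are illustrative.
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(0)
    A, sigma, N, P_FA = 1.0, 2.0, 50, 0.05      # signal, noise std, samples, target false-alarm rate

    # NP threshold on the sample mean: gamma = (sigma / sqrt(N)) * Q^{-1}(P_FA)
    gamma = sigma / np.sqrt(N) * norm.isf(P_FA)

    trials = 10_000
    x0 = rng.normal(0.0, sigma, (trials, N))    # H0: noise only
    x1 = A + rng.normal(0.0, sigma, (trials, N))  # H1: signal plus noise

    print("empirical P_FA:", np.mean(x0.mean(axis=1) > gamma))
    print("empirical P_D :", np.mean(x1.mean(axis=1) > gamma))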

- Part II: FUNDAMENTALS OF (DISTRIBUTED) CONVEX OPTIMIZATION (15 hours)
Basics of convex optimization: convex sets, convex functions, convex optimization problems.
Duality theory: Lagrange dual problem, Slater's constraint qualification, KKT conditions.
Optimization algorithms: primal methods (steepest descent, gradient projection, Newton's method), primal-dual methods (dual ascent, alternating direction method of multipliers); see the ADMM sketch after this part.
Examples of applications: approximation and fitting, statistical estimation and detection, adaptive filtering, supervised and unsupervised learning from data.
Distributed optimization: consensus and sharing; primal and primal-dual methods.
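As an illustration of the primal-dual methods above, here is a minimal sketch of ADMM applied to the LASSO problem minimize (1/2)||Ax - b||^2 + lam*||x||_1, following the x-, z-, u-update structure described in the Boyd et al. (2011) reference; the problem instance, sizes, and penalty values are illustrative assumptions.

    # ADMM for the LASSO: x-update solves a regularized least-squares system,
    # z-update is soft thresholding, u is the scaled dual variable.
    import numpy as np

    rng = np.random.default_rng(1)
    m, n, lam, rho = 60, 100, 0.1, 1.0          # illustrative sizes and parameters
    A = rng.normal(size=(m, n))
    x_true = np.zeros(n)
    x_true[rng.choice(n, 5, replace=False)] = rng.normal(size=5)
    b = A @ x_true + 0.01 * rng.normal(size=m)

    soft = lambda v, k: np.sign(v) * np.maximum(np.abs(v) - k, 0.0)  # prox of k*||.||_1

    # Cache the Cholesky factor reused by every x-update.
    L = np.linalg.cholesky(A.T @ A + rho * np.eye(n))
    x = z = u = np.zeros(n)
    for _ in range(200):
        x = np.linalg.solve(L.T, np.linalg.solve(L, A.T @ b + rho * (z - u)))
        z = soft(x + u, lam / rho)              # z-update: soft thresholding
        u = u + x - z                           # scaled dual update

    print("estimated support:", np.flatnonzero(np.abs(z) > 1e-3))
    print("true support     :", np.flatnonzero(x_true))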

- Part III: BIG-DATA REDUCTION (9 hours)
Compressed sampling/sensing and reconstruction. Statistical inference by sparse sensing; classification by principal component analysis (see the sketch after this part), canonical correlation analysis, and the information bottleneck.
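As an illustration of data reduction by principal component analysis, a minimal sketch on synthetic two-class data, keeping only the top principal components before any downstream classification; the data model and dimensions are illustrative assumptions.

    # PCA dimensionality reduction: center the data, take the top-k right
    # singular vectors as principal axes, and project onto them.
    import numpy as np

    rng = np.random.default_rng(2)
    n, d, k = 200, 50, 2                        # samples per class, ambient dim, kept components
    X0 = rng.normal(size=(n, d))                # class 0
    X1 = rng.normal(size=(n, d))
    X1[:, :3] += 2.0                            # class 1 shifted in 3 coordinates
    X = np.vstack([X0, X1])

    Xc = X - X.mean(axis=0)                     # center the data
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:k].T                           # project onto top-k principal axes

    print("variance captured:", (S[:k]**2).sum() / (S**2).sum())
    print("class means in PC space:", Z[:n].mean(axis=0), Z[n:].mean(axis=0))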

- Part IV: GRAPH-BASED SIGNAL PROCESSING (15 hours)
Signals on graphs: motivating examples; algebraic graph theory, graph features; signal processing on graphs: Fourier transform (see the sketch after this part), smoothing, sampling, and data compression on graphs.
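As an illustration of the graph Fourier transform, a minimal sketch that projects a signal onto the eigenvectors of the combinatorial Laplacian L = D - A; the ring-graph topology and signal are illustrative assumptions.

    # Graph Fourier transform on a small ring graph: the Laplacian
    # eigenvectors play the role of the Fourier basis, and the eigenvalues
    # play the role of graph frequencies.
    import numpy as np

    N = 16
    A = np.zeros((N, N))
    for i in range(N):                          # ring: each node linked to its neighbors
        A[i, (i + 1) % N] = A[(i + 1) % N, i] = 1.0
    L = np.diag(A.sum(axis=1)) - A              # combinatorial Laplacian L = D - A

    eigval, U = np.linalg.eigh(L)               # ascending graph frequencies, GFT basis
    x = np.cos(2 * np.pi * np.arange(N) / N)    # a smooth signal on the ring
    x_hat = U.T @ x                             # GFT coefficients

    # A smooth signal concentrates its energy at low graph frequencies.
    print("GFT magnitudes:", np.round(np.abs(x_hat), 2))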

- Part V: DISTRIBUTED OPTIMIZATION, SIGNAL PROCESSING, and LEARNING over NETWORKS (27 hours)
Average consensus: theory and algorithms (see the sketch after this part).
Distributed signal processing: estimation and detection; LMS, RLS, and Kalman filtering on graphs.
Distributed supervised learning (LASSO, SVM, logistic regression).
Distributed unsupervised learning: dictionary learning and data clustering; learning of eigenvectors and eigenvalues of Laplacian matrices.
Graph learning: Gaussian Markov random fields and graphical LASSO, smoothness and total variation approaches, Gaussian processes for directed causal inference.
Matrix completion algorithms.
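As an illustration of average consensus, a minimal sketch of the linear iteration x(t+1) = W x(t) on a random graph; the use of Metropolis-Hastings weights is an assumption (one common choice of doubly stochastic weights that guarantees convergence to the average on a connected graph), and the graph model is illustrative.

    # Average consensus: each agent repeatedly replaces its value with a
    # weighted average of its neighbors' values, converging to the network
    # average when the weight matrix is doubly stochastic and the graph is
    # connected (assumed here; with this edge probability it almost surely is).
    import numpy as np

    rng = np.random.default_rng(3)
    N = 10
    A = (rng.random((N, N)) < 0.4).astype(float)
    A = np.triu(A, 1)
    A = A + A.T                                 # symmetric adjacency, no self-loops
    deg = A.sum(axis=1)

    W = np.zeros((N, N))                        # Metropolis-Hastings weights (assumption)
    for i in range(N):
        for j in range(N):
            if A[i, j]:
                W[i, j] = 1.0 / (1.0 + max(deg[i], deg[j]))
        W[i, i] = 1.0 - W[i].sum()              # make each row sum to one

    x = rng.normal(size=N)                      # each agent holds a local measurement
    target = x.mean()
    for _ in range(200):
        x = W @ x                               # one round of neighbor averaging

    print("consensus error:", np.max(np.abs(x - target)))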