Math 110.773 - Topics in Data Science
Probabilistic theory for time series learning

Instructor:     Fei Lu [ feilu##   ( ## = @math.jhu.edu) ]
Class meets: TuTr, 10:30-11:45, Maryland 202
Office Hours: TuTr, 10-10:30 and 11:45-12:30, Krieger 218

Textbooks: see more in the course Plan.
CS02: Felipe Cucker and Steve Smale. On the mathematical foundations of learning. Bulletin of the American Mathematical Society, 39(1):1--49, 2002.
DGL96: Luc Devroye, László Györfi, and Gábor Lugosi. A Probabilistic Theory of Pattern Recognition. Springer, 1996.
Mur22intro, Mur22adv: Kevin Murphy. Probabilistic Machine Learning: An Introduction, and Probabilistic Machine Learning: Advanced Topics.
BN06: Christopher M. Bishop. Pattern Recognition and Machine Learning. Springer, 2006.
SF12: Robert E. Schapire and Yoav Freund. Boosting: Foundations and Algorithms. MIT Press, 2012. Open Access

Syllabus: This course provides an introduction to three topics: learning kernels in operators arising from interacting particle systems, probability theory for time series classification, and time series modeling with neural networks. The course focuses on modeling with time series data by combining statistical and machine learning theory with dynamical systems. The underlying theme is a probabilistic perspective on learning, viewing time series and dynamical systems as descriptions of stochastic processes.

Grading: class participation (60%), presentation (40%)



Tentative schedule (updated weekly):
week Topic
Nonparametric learning of kernels in operators
8/29,31 Plan     
Overview and review of classical learning theory: LecNote1
9/5,7 Review of classical learning theory: LecNote1
Finitely many particles: LecNote2
9/12,14 Coercivity condition: LecNote2
Mean-field equations: construction of loss function LecNote4
9/19,21 Mean-field equations: identifiability LecNote4
Regularization: DARTR LecNote5
9/26,28 Regularization: DARTR LecNote5
Small noise analysis LecNote6
10/3,5 Small noise analysis: proof LecNote6
Bayesian perspective and measures on infinite-dimensional spaces
Time series classification: Keras and TSC website
10/10,12 Probability theory for Pattern Recognition: DGL96, Chapters 1-2
10/17,19 Linear discrimination: DGL96, Chapter 4
10/24,26 Nearest neighbor rules: DGL96, Chapter 5
10/31,11/2 Consistency: DGL96, Chapter 6
Boosting: slides by Rob Schapire; survey by Schapire, 2012; and XGBoost (Chuhuan Huang)
11/7,9 Convergence of AdaBoost: SF12 (Boosting: Foundations and Algorithms), rate of convergence
TSC: ROCKET paper and code (Yantao Wu)
11/14,16 Random Forest
TSC: ResNet
Time series modeling
11/20-24 No class: Thanksgiving break
11/28,30 Pangu-Weather in Nature: Bi et al.: Accurate medium-range global weather forecasting with 3D neural networks
GraphCast in Science: Lam et al.: Learning skillful medium-range global weather forecasting; DeepMind blog
Transformer: Vaswani et al. '17: Attention Is All You Need
12/5,7 Transformer for TS modeling: Geneva+Zabaras22: Transformers for modeling physical systems
Survey: Zeng et al. 2205: Are Transformers Effective for Time Series Forecasting? AAAI 2023
Neural ODEs: Chen et al. 18: Neural Ordinary Differential Equations; Deep Implicit Layers Tutorial
Neural CDEs: Kidger+Morrill+Foster+Lyons20: Neural Controlled Differential Equations for Irregular Time Series
12/12 Deep state-space model: Rangapuram et al. 18: Deep State Space Models for Time Series Forecasting
Deep SSM, nonlinear: Gedon et al. 21: Deep State Space Models for Nonlinear System Identification
Structured state-space sequential model (S4): Gu+Goel+Saab+Ré: Structured State Spaces for Sequence Modeling (S4)