Draft Schedule
Recordings of the lectures are accessible here.
Week 1
9/2: Intro Recording
syllabus
Jupyter Book
review survey
about me, my research, and a preview of the course
Week 2
9/9: Basic probability theory Recording
Random Variables
Probability space
Probability Mass and Density functions
Conditional Probability
Bayes' Theorem (see the sketch after this list)
Quantifying prior odds via betting
Incoherent beliefs
Axioms of probability
Examples
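For reference, a minimal numerical sketch of Bayes' theorem; the numbers are illustrative, not from lecture:

```python
# Minimal Bayes' theorem example: P(H|D) = P(D|H) P(H) / P(D).
# All numbers are illustrative.
p_h = 0.01          # prior P(H): a rare hypothesis
p_d_given_h = 0.95  # likelihood P(D|H)
p_d_given_not_h = 0.05

# law of total probability for P(D)
p_d = p_d_given_h * p_h + p_d_given_not_h * (1 - p_h)
posterior = p_d_given_h * p_h / p_d
print(f"P(H|D) = {posterior:.3f}")  # ~0.161: still unlikely despite the evidence
```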
Week 3
9/14: Class Recording
Conditional probability for continuous variables
Chain rule of probability
Sneak peek at graphical models
The Drake equation
Phosphine on Venus and Bayes' Theorem
Marginal Distributions
Independence
Empirical Distribution (see the numpy sketch after this list)
Expectation
Variance, Covariance, Correlation
Mutual Information
Simple Data Exploration
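A quick numpy sketch of empirical expectation, covariance, and correlation from samples (toy data, illustrative parameters):

```python
import numpy as np

# Empirical expectation, covariance, and correlation from samples.
# Toy data: a correlated bivariate Gaussian with illustrative parameters.
rng = np.random.default_rng(0)
cov = [[1.0, 0.8], [0.8, 2.0]]
x = rng.multivariate_normal(mean=[0.0, 1.0], cov=cov, size=10_000)

print(x.mean(axis=0))                # empirical expectation, ~[0, 1]
print(np.cov(x, rowvar=False))       # empirical covariance, ~cov
print(np.corrcoef(x, rowvar=False))  # correlation, off-diagonal ~0.8/sqrt(2)
```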
9/16: Class Recording
Likelihood
Change of variables
Demo: change of variables with autodiff (a jax sketch follows this list)
Independence and correlation
Conditioning
Autoregressive Expansion
Graphical Models
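A sketch of the change-of-variables demo using jax for the Jacobian; the log-normal transform here is an illustrative choice, not necessarily the one used in class:

```python
import jax
import jax.numpy as jnp
from jax.scipy.stats import norm

# Change of variables via autodiff: if x = g(z) with z ~ N(0, 1),
# then p_x(x) = p_z(g_inv(x)) * |d g_inv / dx|.
# Illustrative choice: g(z) = exp(z), so x is log-normal.
g_inv = lambda x: jnp.log(x)

def p_x(x):
    z = g_inv(x)
    jac = jax.grad(g_inv)(x)        # d g_inv / dx from autodiff
    return norm.pdf(z) * jnp.abs(jac)

print(p_x(1.5))   # ~0.245, matching scipy.stats.lognorm(s=1).pdf(1.5)
```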
Week 4
9/21: Recording
Change of variables formula
Probability Integral Transform (see the sketch after this list)
Intro to automatic differentiation
Demo with automatic differentiation
Transformation properties of the likelihood
Transformation properties of the MLE
Transformation properties of the prior and posterior
Transformation properties of the MAP
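A sketch of the probability integral transform; the exponential distribution is an arbitrary illustrative choice:

```python
import numpy as np
from scipy import stats

# Probability integral transform: if X ~ F, then U = F(X) is Uniform(0, 1).
rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=100_000)
u = stats.expon(scale=2.0).cdf(x)

counts, _ = np.histogram(u, bins=10, range=(0.0, 1.0))
print(counts / len(u))   # ~0.1 in every bin: U is uniform
```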
9/23: Estimators
Skipped material from last lecture
Lorentz-invariant phase space
Normalizing Flows
Copula
Bias, Variance, and Mean Squared Error
Simple Examples: Poisson and Gaussian
Cramér-Rao bound & Information Matrix
Bias-Variance tradeoff
James-Stein Demo (a minimal version appears after this list)
Shrinkage
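A minimal version of the James-Stein demo (toy setup; dimension, trial count, and true mean are illustrative):

```python
import numpy as np

# James-Stein shrinkage sketch: estimate the mean of a d-dim Gaussian
# (unit covariance) from one observation x per trial. For d >= 3 the
# JS estimator dominates the MLE (x itself) in mean squared error.
rng = np.random.default_rng(0)
d, n_trials = 10, 20_000
theta = rng.normal(size=d)                   # true mean, fixed across trials
x = theta + rng.normal(size=(n_trials, d))   # one observation per trial

norm2 = (x ** 2).sum(axis=1, keepdims=True)
js = (1 - (d - 2) / norm2) * x               # shrink toward the origin

mse_mle = ((x - theta) ** 2).sum(axis=1).mean()
mse_js = ((js - theta) ** 2).sum(axis=1).mean()
print(mse_mle, mse_js)   # the JS risk is smaller
```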
HW:
James Stein
Week 5
9/28 (Yom Kippur): Random Numbers Recording
Decision Theory
generalized decision rules (“for some prior”)
Consistency
Sufficiency
Exponential Family
Score Statistic
Information Matrix
Information Geometry
Transformation properties of Information Matrix
Jeffreys' prior (a worked Poisson example follows this list)
Transformation properties
Reference Prior
Sensitivity analysis
likelihood principle
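For reference, a standard worked example (mine, not from the lecture notes): for a Poisson count \(n \sim \mathrm{Pois}(\mu)\) the log-likelihood is \(\log L(\mu) = n\log\mu - \mu + \mathrm{const}\), so

\[
I(\mu) = -E\left[\frac{\partial^2 \log L}{\partial \mu^2}\right] = E\left[\frac{n}{\mu^2}\right] = \frac{1}{\mu},
\qquad
\pi_J(\mu) \propto \sqrt{I(\mu)} = \mu^{-1/2}.
\]

Under a reparametrization \(\lambda = \lambda(\mu)\) the information transforms as \(I(\lambda) = I(\mu)\,(d\mu/d\lambda)^2\), so \(\sqrt{I}\) picks up exactly the Jacobian a density needs: this is the sense in which the Jeffreys prior is invariant.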
9/30: Lecture 8: Consistency and homework
Neyman-Scott phenomenon (an example of an inconsistent MLE; see the sketch after this list)
Note: Elizabeth Scott was an astronomer by background. In 1957 Scott noted a bias in the observation of galaxy clusters. She noticed that for an observer to find a very distant cluster, it must contain brighter-than-normal galaxies and must also contain a large number of galaxies. She proposed a correction formula to adjust for (what came to be known as) the Scott effect.
Note: Revisiting the Neyman-Scott model: an Inconsistent MLE or an Ill-defined Model?
walkthrough of nbgrader and the homework assignment
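A minimal Neyman-Scott simulation (toy numbers) showing the inconsistency:

```python
import numpy as np

# Neyman-Scott sketch: pairs x_{i1}, x_{i2} ~ N(mu_i, sigma^2) with a separate
# nuisance mean mu_i for every pair. The MLE of sigma^2 converges to
# sigma^2 / 2, not sigma^2, no matter how many pairs we observe.
rng = np.random.default_rng(0)
sigma2, n_pairs = 4.0, 100_000
mu = rng.normal(size=(n_pairs, 1)) * 10.0
x = mu + rng.normal(size=(n_pairs, 2)) * np.sqrt(sigma2)

xbar = x.mean(axis=1, keepdims=True)    # MLE of each mu_i
sigma2_mle = ((x - xbar) ** 2).mean()   # MLE of sigma^2
print(sigma2_mle)   # ~2.0 = sigma^2 / 2: inconsistent
```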
Week 6
10/5: Lecture 9: Propagation of Errors
a simple example from physics 1: estimating \(g\)
Change of variables vs. Error propagation
Demo: error propagation fails (see the sketch after this list)
Error propagation and Marginalization
Convolution
Central Limit Theorem
Error propagation with correlation
track example
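A sketch contrasting linear error propagation with Monte Carlo for \(g = 2h/t^2\); the measured values and uncertainties are illustrative:

```python
import numpy as np

# Error propagation vs. Monte Carlo for g = 2h / t^2 (drop from height h,
# fall time t). Values and uncertainties are illustrative.
h, sigma_h = 1.80, 0.01      # meters
t, sigma_t = 0.60, 0.02      # seconds

g = 2 * h / t**2
# linear (first-order) propagation, assuming independent h and t
dg_dh = 2 / t**2
dg_dt = -4 * h / t**3
sigma_g_lin = np.sqrt((dg_dh * sigma_h)**2 + (dg_dt * sigma_t)**2)

# Monte Carlo propagation: push sampled measurements through the formula
rng = np.random.default_rng(0)
g_mc = 2 * rng.normal(h, sigma_h, 1_000_000) / rng.normal(t, sigma_t, 1_000_000)**2
print(g, sigma_g_lin, g_mc.mean(), g_mc.std())
# the MC mean sits slightly above g: the nonlinearity that linear propagation misses
```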
10/7: Lecture 10: Likelihood-based modeling
Building a probabilistic model for simple physics 1 example
Connection of MLE to traditional algebraic estimator
Connection to least squares regression
Week 7
10/12 Lecture 11: Sampling
Motivating examples:
Estimating high dimensional integrals and expectations
Bayesian credible intervals
Marginals are trivial with samples
Generating Random numbers
Scipy distributions
Probability Integral Transform
Accept-Reject MC
Acceptance and efficiency
native python loops vs. numpy broadcasting
Importance Sampling & Unweighting
Connection to Bayesian Credible Intervals
Metropolis-Hastings MCMC (a minimal sketch follows this list)
Proposal functions
Hamiltonian Monte Carlo
Excerpts from A Conceptual Introduction to Hamiltonian Monte Carlo by Michael Betancourt
Stan and PyMC3
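A minimal Metropolis-Hastings sketch (toy bimodal target, illustrative step size):

```python
import numpy as np

# Minimal Metropolis-Hastings with a Gaussian random-walk proposal.
# Target: an unnormalized bimodal density, chosen for illustration.
def log_target(x):
    return np.logaddexp(-0.5 * (x - 2)**2, -0.5 * (x + 2)**2)

rng = np.random.default_rng(0)
x, chain, step = 0.0, [], 1.0
for _ in range(50_000):
    proposal = x + step * rng.normal()       # symmetric proposal
    # accept with probability min(1, target(proposal) / target(x))
    if np.log(rng.uniform()) < log_target(proposal) - log_target(x):
        x = proposal
    chain.append(x)

chain = np.array(chain)
print(chain.mean(), chain.std())   # ~0 and ~sqrt(5) for this target
```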
10/14: Lecture 12: Hypothesis Testing and Confidence Intervals
Simple vs. Compound hypotheses
Type I and Type II errors
critical / acceptance region
Neyman-Pearson Lemma
Test statistics
Confidence Intervals
Interpretation
Coverage (a toy coverage check follows this list)
Power
No UMPU Tests
Neyman construction
Likelihood-Ratio tests
Connection to binary classification
prior and domain shift
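A toy coverage check for a Gaussian-mean confidence interval (illustrative numbers):

```python
import numpy as np
from scipy import stats

# Coverage sketch: a 68% confidence interval for a Gaussian mean should
# contain the true mean in ~68% of repeated experiments.
rng = np.random.default_rng(0)
mu_true, sigma, n = 3.0, 1.0, 10
z = stats.norm.ppf(0.84)   # central two-sided 68% interval

covered, n_toys = 0, 10_000
for _ in range(n_toys):
    x = rng.normal(mu_true, sigma, n)
    half = z * sigma / np.sqrt(n)
    covered += abs(x.mean() - mu_true) < half
print(covered / n_toys)    # ~0.68
```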
Week 8
10/19: Lecture 13
Simple vs. Compound hypotheses
Nuisance Parameters
Profile likelihood
Profile construction
Pivotal quantity
Asymptotic Properties of Likelihood Ratio
Wilks' theorem (a toy check follows this list)
Wald
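A toy check of Wilks' theorem for a Gaussian mean (illustrative sample sizes):

```python
import numpy as np
from scipy import stats

# Wilks' theorem sketch: under the null, -2 log LR for a Gaussian mean
# is n * (xbar - mu0)^2, which should follow a chi2 with 1 dof.
rng = np.random.default_rng(0)
mu0, n, n_toys = 0.0, 25, 100_000
xbar = rng.normal(mu0, 1.0, size=(n_toys, n)).mean(axis=1)
t = n * (xbar - mu0) ** 2       # -2 log likelihood ratio

# compare empirical quantiles to the chi2(1) prediction
for q in (0.68, 0.95):
    print(np.quantile(t, q), stats.chi2(df=1).ppf(q))
```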
10/21 Canceled
Week 9
10/26: Lecture 14
Upper Limits, Lower Limits, Central Intervals, Discovery
Power, Expected Limits, Bands
Sensitivity problem for upper limits
CLs
power-constrained limits
10/28: Lecture 15 flip-flopping, multiple testing
flip flopping
multiple testing
look elsewhere effect
Familywise error rate
False Discovery Rate
Hypothesis testing when nuisance parameter is present only under the alternative
Week 10
11/2 Lecture 16 Combinations, probabilistic modelling languages, probabilistic programming
Combinations
Combining p-values (a Fisher's-method sketch follows this list)
combining posteriors
likelihood-based combinations
likelihood publishing
probabilistic modelling languages
computational graphs
Probabilistic Programming
First order PPLs
Stan
Universal Probabilistic Programming
pyro
pyprob and ppx
Inference compilation
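A sketch of Fisher's method for combining independent p-values (illustrative inputs):

```python
import numpy as np
from scipy import stats

# Fisher's method for combining independent p-values:
# -2 * sum(log p_i) ~ chi2 with 2k dof under the null.
p_values = np.array([0.04, 0.20, 0.11])   # illustrative inputs

t = -2 * np.log(p_values).sum()
p_combined = stats.chi2(df=2 * len(p_values)).sf(t)
print(p_combined)   # ~0.03 for these inputs
```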
11/4 Lecture 17: Goodness of fit
conceptual framing
difference from hypothesis testing
chi-square test
Kolmogorov-Smirnov (a scipy sketch follows this list)
Anderson-Darling
Zhang’s tests
Bayesian Information Criteria
software
anomaly detection
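A scipy sketch of a KS goodness-of-fit test on toy data; note the caveat in the comments about parameters fitted on the same data:

```python
import numpy as np
from scipy import stats

# Goodness-of-fit sketch: KS test of data against a fitted Gaussian.
# Caveat: fitting the parameters on the same data biases the plain KS
# p-value; in practice one calibrates the distribution of the test
# statistic with toys.
rng = np.random.default_rng(0)
data = rng.normal(1.0, 2.0, size=500)

mu, sigma = stats.norm.fit(data)
print(stats.kstest(data, stats.norm(mu, sigma).cdf))
```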
Week 11
11/9: Lecture 18 Intro to machine learning
Supervised Learning
Statistical Learning Theory
Loss, Risk, Empirical Risk
Generalization
VC dimension and empirical risk minimization
No Free Lunch
Cross-validation and test/train splits
Preview: the mystery of deep learning
Least Squares
Regularized least squares (a ridge sketch follows this list)
Bayesian Curve fitting
Bias-Variance tradeoff
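A ridge-regression sketch with the closed-form solution, in the spirit of the curve-fitting example; the data, degree, and \(\lambda\) are illustrative:

```python
import numpy as np

# Regularized least squares (ridge): closed-form solution
# w = (X^T X + lam * I)^(-1) X^T y.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=50)
y = np.sin(np.pi * x) + 0.2 * rng.normal(size=50)

degree, lam = 9, 1e-3
X = np.vander(x, degree + 1)     # polynomial features
w = np.linalg.solve(X.T @ X + lam * np.eye(degree + 1), X.T @ y)
print(w)   # lam -> 0 recovers ordinary least squares; larger lam shrinks w
```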
11/11 Lecture 19
Generalization
Loss functions for regression
loss functions for classification
Information theory background
Entropy
Mutual information
cross entropy
Relative Entropy (see the sketch after this list)
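A small sketch relating entropy, cross entropy, and relative entropy for discrete distributions (illustrative probabilities):

```python
import numpy as np

# Entropy, cross entropy, and relative entropy (KL) for two discrete
# distributions, illustrating H(p, q) = H(p) + KL(p || q).
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])

H_p = -np.sum(p * np.log(p))           # entropy
H_pq = -np.sum(p * np.log(q))          # cross entropy
kl = np.sum(p * np.log(p / q))         # relative entropy KL(p || q)
print(H_p, H_pq, kl, H_p + kl - H_pq)  # last value ~0
```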
Week 12
11/16: Lecture 20 Density Estimation, Deep Generative Models
Unsupervised learning
Loss functions for density estimation
Divergences
KL Divergence
Fisher distance
Optimal Transport
Hellinger distance
f-divergences
Stein divergence
Maximum likelihood (Forward KL)
can be approximated with samples from the target; the target density is not needed
Variational Inference (Reverse KL)
Connection to statistical physics
LDA (Topic Modelling)
BBVI (black-box variational inference)
Deep Generative models
Normalizing Flows intro
background on auto-encoders
Variational Auto-encoder intro
11/18: Lecture 21 Deep Generative Models
Deep Generative models comparison
Normalizing Flows
Autoregressive models
Variational Auto-encoder
GANs
Week 13
11/23: Lecture 22 The data manifold
what it is and why it arises
in real data
in GANs etc.
How it complicates distances based on likelihood ratios
Optimal transport
11/25 Lecture 23 Optimization
Gradient descent
Momentum, Adam (a momentum sketch follows this list)
Differences between likelihood fits in classical statistics and the loss landscapes of deep learning models
intro to stochastic gradient descent and mini-batching
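A sketch of the gradient-descent-with-momentum update rules on a toy quadratic (illustrative hyperparameters):

```python
import numpy as np

# Gradient descent with momentum on a toy quadratic loss; a sketch of the
# update rules, not of any particular deep-learning fit.
def grad(w):
    return 2 * w * np.array([1.0, 10.0])   # ill-conditioned bowl

w, v = np.array([1.0, 1.0]), np.zeros(2)
lr, beta = 0.01, 0.9
for _ in range(200):
    v = beta * v + grad(w)   # accumulate a velocity
    w = w - lr * v
print(w)   # -> [0, 0], faster along the flat axis than plain gradient descent
```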
Week 14
11/30: Lecture 24: Stochastic gradient descent
Robbins-Monro
connection to Langevin dynamics and approximate Bayesian inference
12/2: Lecture 25: Implicit bias and regularization in learning algorithms
dynamics of gradient descent
Double descent
Week 15
12/7: Lecture 26: Deep Learning
Loss landscape
random matrix theory
connection to statistical mechanics
Deep Model Zoo
MLP
Convolutions
Sequence Models: RNN and Tree RNN
vanishing and exploding gradients
Graph Networks
Transformers
images, sets, sequences, graphs, hyper-graphs
DL and functional programming
Differentiable programming
12/9: Review
Review
Other topics that we touched on or planned to touch on.
I need to move some of these topics that we discussed into the schedule. This is a placeholder for now.
examples
unbinned likelihood exponential example
HW ideas
Conditional Distributions
Bernoulli to Binomial
Binomial to Poisson
Poisson to Gaussian
Product of Poissons vs. Multinomial
CLT to Extreme Value Theory
some other shrinkage?
Jeffreys' prior for examples
prior odds via betting example
Group Project: interactive Neyman construction demo
Simulation-based inference
ABC
Diggle
likelihood ratio
likelihood
posterior
Mining Gold
Topics to Reschedule
Parametric vs. non-parametric
Non-parametric
Histograms
Binomial / Poisson statistical uncertainty
weighted entries
Kernel Density Estimation
bandwidth and boundaries
K-D Trees
Parameterized
Unsupervised learning
Maximum likelihood
loss function
Neural Density Estimation
Adversarial Training
GANs
WGAN
Latent Variable Models
Simulators
Connections
graphical models
probability spaces
Change of variables
GANs
Classification
Binary vs. Multi-class classification
Loss functions
logistic regression
Softmax
Neural Networks
Domain Adaptation and Algorithmic Fairness
Kernel Machines and Gaussian Processes
Warm up with N-Dim Gaussian
Theory
Examples
Causal Inference
ladder of causality
simple examples
Domain shift, inductive bias
Statistical Invariance, pivotal quantities, Causal invariance
Elements of Causal Inference by Jonas Peters, Dominik Janzing, and Bernhard Schölkopf (free PDF)