Draft Schedule

Recordings of lectures are accessible here.

  1. Week 1

    1. 9/2: Intro Recording

      1. syllabus

      2. Jupyter book

      3. review survey

      4. about me and my research and a preview for the course

  2. Week 2

    1. 9/9: Basic prob theory Recording

      1. Random Variables

        1. Probability space

      2. Probability Mass and Density functions

      3. Conditional Probability

      4. Bayes Theorem

        1. Quantifying prior odds via betting

        2. Incoherent beliefs

      5. Axioms of probability

      6. Examples

  3. Week 3

    1. 9/14: Class Recording

      1. Conditional probability for continuous variables

        1. Chain rule of probability

        2. Sneak peek at graphical models

        3. The Drake equation

        4. Phosphine on Venus and Bayes Theorem

      2. Marginal Distributions

      3. Independence

      4. Empirical Distribution

      5. Expectation

      6. Variance, Covariance, Correlation

      7. Mutual Information

      8. Simple Data Exploration

    2. 9/16: Class Recording

      1. Likelihood

      2. Change of variables

      3. Demo change of variables with autodiff

      4. Independence and correlation

      5. Conditioning

      6. Autoregressive Expansion

      7. Graphical Models

  4. Week 4

    1. 9/21: Recording

      1. Change of variables formula

      2. Probability Integral Transform

      3. Intro to automatic differentiation

        1. Demo with automatic differentiation

      4. Transformation properties of the likelihood

      5. Transformation properties of the MLE

      6. Transformation properties of the prior and posterior

      7. Transformation properties of the MAP

    2. 9/23: Estimators

      1. Skipped material from last lecture

        1. Lorentz-invariant phase space

        2. Normalizing Flows

        3. Copula

      2. Bias, Variance, and Mean Squared Error

      3. Simple Examples: Poisson and Gaussian

      4. Cramér-Rao bound & Information Matrix

      5. Bias-Variance tradeoff

        1. James-Stein Demo

        2. Shrinkage

      6. HW:

        1. James Stein

  5. Week 5

    1. 9/28 (Yom Kippur): Random Numbers Recording

      1. Decision Theory

        1. Admissible decision rule

        2. generalized decision rules (“for some prior”)

      2. Consistency

      3. Sufficiency

      4. Exponential Family

      5. Score Statistic

      6. Information Matrix

        1. Information Geometry

        2. Transformation properties of Information Matrix

        3. Jeffreys’ prior

          1. Transformation properties

        4. Reference Prior

        5. Sensitivity analysis

        6. likelihood principle

    2. 9/30: Lecture 8: Consistency and homework

      1. Neyman-Scott phenomenon (an example of inconsistent MLE)

        1. Note: Elizabeth Scott was an astronomer by background. In 1957 Scott noted a bias in the observation of galaxy clusters. She noticed that for an observer to find a very distant cluster, it must contain brighter-than-normal galaxies and must also contain a large number of galaxies. She proposed a correction formula to adjust for (what came to be known as) the Scott effect.

        2. Note: Revisiting the Neyman-Scott model: an Inconsistent MLE or an Ill-defined Model?

      2. walkthrough of nbgrader and homework assignment

  6. Week 6

    1. 10/5: Lecture 9: Propagation of Errors

      1. a simple example from physics 1: estimating \(g\)

      2. Change of variables vs. Error propagation

      3. Demo Error propagation fails

      4. Error propagation and Marginalization

      5. Convolution

      6. Central Limit Theorem

      7. Error propagation with correlation

        1. track example

    2. 10/7: Lecture 10: Likelihood-based modeling

      1. Building a probabilistic model for simple physics 1 example

      2. Connection of MLE to traditional algebraic estimator

      3. Connection to least squares regression

  7. Week 7

    1. 10/12 Lecture 11: Sampling

      1. Motivating examples:

        1. Estimating high dimensional integrals and expectations

        2. Bayesian credible intervals

        3. Marginals are trivial with samples

      2. Generating Random numbers

        1. Scipy distributions

      3. Probability Integral Transform

      4. Accept-Reject MC

        1. Acceptance and efficiency

        2. native python loops vs. numpy broadcasting

      5. Importance Sampling & Unweighting

        1. Vegas

      6. Connection to Bayesian Credible Intervals

      7. Metropolis Hastings MCMC

        1. Proposal functions

      8. Hamiltonian Monte Carlo

        1. Excerpts from A Conceptual Introduction to Hamiltonian Monte Carlo by Michael Betancourt

        2. Stan and PyMC3

    2. 10/14: Lecture 12: Hypothesis Testing and Confidence Intervals

      1. Simple vs. Compound hypotheses

      2. Type I and Type II error

      3. critical / acceptance region

      4. Neyman-Pearson Lemma

      5. Test statistics

      6. Confidence Intervals

      7. Interpretation

      8. Coverage

      9. Power

      10. No UMPU Tests

      11. Neyman-Construction

      12. Likelihood-Ratio tests

      13. Connection to binary classification

        1. prior and domain shift

  8. Week 8

    1. 10/19: Lecture 13:

      1. Simple vs. Compound hypotheses

        1. Nuisance Parameters

      2. Profile likelihood

      3. Profile construction

      4. Pivotal quantity

      5. Asymptotic Properties of Likelihood Ratio

        1. Wilks

        2. Wald

    2. 10/21 Canceled

  9. Week 9

    1. 10/26: Lecture 14

      1. Upper Limits, Lower Limits, Central Limits, Discovery

      2. Power, Expected Limits, Bands

      3. Sensitivity Problem for upper limits

        1. CLs

        2. power-constrained limits

    2. 10/28: Lecture 15 flip-flopping, multiple testing

      1. flip flopping

      2. multiple testing

        1. look elsewhere effect

        2. Familywise error rate

        3. False Discovery Rate

        4. Hypothesis testing when nuisance parameter is present only under the alternative

          1. Asymptotics, Davies, Gross and Vitells

  10. Week 10

    1. 11/2 Lecture 16 Combinations, probabilistic modelling languages, probabilistic programming

      1. Combinations

        1. Combining p-values

        2. combining posteriors

        3. likelihood-based combinations

        4. likelihood publishing

      2. probabilistic modelling languages

        1. computational graphs

      3. Probabilistic Programming

        1. First order PPLs

          1. Stan

        2. Universal Probabilistic Programming

          1. pyro

          2. pyprob and ppx

          3. Inference compilation

    2. 11/4 Lecture 17: Goodness of fit

      1. conceptual framing

      2. difference to hypothesis testing

      3. chi-square test

      4. Kolmogorov-Smirnov

      5. Anderson-Darling

      6. Zhang’s tests

      7. Bayesian Information Criteria

      8. software

      9. anomaly detection

  11. Week 11

    1. 11/9: Lecture 18 Intro to machine learning

      1. Supervised Learning

      2. Statistical Learning Theory

        1. Loss, Risk, Empirical Risk

        2. Generalization

        3. VC dimension and Empirical risk minimization

        4. No Free Lunch

      3. Cross-validation test/train

        1. Preview: the mystery of deep learning

      4. Least Squares

      5. Regularized least squares

      6. Bayesian Curve fitting

      7. Bias-Variance tradeoff

    2. 11/11 Lecture 19

      1. Generalization

      2. Loss functions for regression

      3. loss function for classification

      4. Information theory background

        1. Entropy

        2. Mutual information

        3. cross entropy

        4. Relative Entropy

  12. Week 12

    1. 11/16: Lecture 20 Density Estimation, Deep Generative Models

      1. Unsupervised learning

      2. Loss functions for density estimation

        1. Divergences

          1. KL Divergence

          2. Fisher distance

          3. Optimal Transport

          4. Hellinger distance

          5. f-divergences

          6. Stein divergence

      3. Maximum likelihood (Forward KL)

        1. can approximate with samples, don’t need target distribution

      4. Variational Inference (Reverse KL)

        1. Connection to statistical physics

        2. LDA (Topic Modelling)

        3. BBVI

      5. Deep Generative models

        1. Normalizing Flows intro

        2. background on auto-encoders

        3. Variational Auto-encoder intro

    2. 11/18: Lecture 21 Deep Generative Models

      1. Deep Generative models comparison

        1. Normalizing Flows

        2. Autoregressive models

        3. Variational Auto-encoder

        4. GANs

  13. Week 13

    1. 11/23: Lecture 22 The data manifold

      1. what is it, why is it there

        1. in real data

        2. in GANs etc.

      2. How it complicates distances based on likelihood ratios

      3. Optimal transport

    2. 11/25 Lecture 23 Optimization

      1. Gradient descent

      2. Momentum, Adam

      3. Differences of likelihood fits in classical statistics and loss landscape of deep learning models

      4. stochastic gradient descent and mini-batching intro

        1. what is it

  14. Week 14

    1. 11/30: Lecture 23 Stochastic gradient descent

      1. Robbins-Monro

      2. connection to Langevin dynamics and approximate Bayesian inference

    2. 12/2: Lecture 24 Implicit bias and regularization in learning algorithms

      1. dynamics of gradient descent

      2. Double descent

  15. Week 15

    1. 12/7 Lecture 25 Deep Learning

      1. Loss landscape

        1. random matrix theory

        2. connection to statistical mechanics

      2. Deep Model Zoo

        1. MLP

        2. Convolutions

        3. Sequence Models: RNN and Tree RNN

          1. vanishing and exploding gradients

        4. Graph Networks

        5. Transformers

        6. images, sets, sequences, graphs, hyper-graphs

        7. DL and functional programming

        8. Differentiable programming

    2. 12/9: Review

      1. Review

Other topics that we touched on or planned to touch on.

I need to move some of these topics that we discussed into the schedule. This is a placeholder for now.

  1. examples

    1. unbinned likelihood exponential example

  2. HW ideas

    1. Conditional Distributions

    2. Bernoulli to Binomial

    3. Binomial to Poisson

    4. Poisson to Gaussian

    5. Product of Poissons vs. Multinomial

    6. CLT to Extreme Value Theory

    7. Neyman Scott Phenomena

    8. some other shrinkage?

    9. Jeffreys for examples

    10. prior odds via betting example

    11. Negatively biased relevant subsets

    12. Group Project: interactive Neyman-Construction Demo

  3. Simulation-based inference

    1. ABC

    2. Diggle

    3. likelihood ratio

    4. likelihood

    5. posterior

    6. Mining Gold

  4. Topics to Reschedule

    1. Parametric vs. non-parametric

    2. Non-parametric

      1. Histograms

        1. Binomial / Poisson statistical uncertainty

        2. weighted entries

      2. Kernel Density Estimation

        1. bandwidth and boundaries

        2. K-D Trees

    3. Parameterized

      1. Unsupervised learning

      2. Maximum likelihood

        1. loss function

        2. Neural Density Estimation

      3. Adversarial Training

        1. GANs

        2. WGAN

    4. Latent Variable Models

    5. Simulators

    6. Connections

      1. graphical models

      2. probability spaces

      3. Change of variables

    7. GANs

  5. Classification

    1. Binary vs. Multi-class classification

    2. Loss functions

    3. logistic regression

    4. Softmax

    5. Neural Networks

    6. Domain Adaptation and Algorithmic Fairness

  6. Kernel Machines and Gaussian Processes

    1. Warm up with N-Dim Gaussian

    2. Theory

    3. Examples

  7. Causal Inference

    1. ladder of causality

    2. simple examples

    3. Domain shift, inductive bias

    4. Statistical Invariance, pivotal quantities, Causal invariance

    5. Elements of Causal Inference by Jonas Peters, Dominik Janzing and Bernhard Schölkopf (free PDF)