Statistics and Data Science
This is the start of a book for a graduate-level course at NYU Physics titled Statistics and Data Science.
Here are some of the objectives of this course:
Learn essential concepts of probability
Become familiar with how intuitive notions of probability are connected to formal foundations.
Overcome barriers presented by unfamiliar notation and terminology.
Internalize the transformation properties of distributions, the likelihood function, and other probabilistic objects.
Understand the differences between Bayesian and frequentist approaches, particularly in the context of physical theories.
Connect these concepts to modern data science tools and techniques like the scientific Python ecosystem and automatic differentiation.
Learn essential concepts of statistics
Learn classical statistical procedures: point estimates, goodness-of-fit tests, hypothesis tests, confidence intervals, and credible intervals.
Become familiar with statistical decision theory
Recognize probabilistic programs as statistical models
Become familiar with the computational challenges found in statistical inference and techniques developed to overcome them.
Understand the difference between statistical associations and causal inference
Learn essential concepts of software and computing
Become familiar with the scientific Python ecosystem
Become familiar with software testing through the use of nbgrader
Become familiar with automatic differentiation and differentiable programming
Become familiar with probabilistic programming
Learn essential concepts of machine learning
Become familiar with core tasks such as classification and regression
Understand the notion of generalization
Understand the role of regularization and inductive bias
Become familiar with the taxonomy of different types of models found in machine learning: linear models, kernel methods, neural networks, deep learning
Become familiar with the interplay of model, data, and learning (optimization) algorithms
Touch on different learning settings: supervised learning, unsupervised learning, reinforcement learning
Learn essential concepts of data science
Understand how data science connects to the topics above
Gain confidence in using scientific Python and modern data science tools to analyze real data
Warning
Please note that the class website is under active development, and content will be added throughout the duration of the course.
Tip
If you would like to audit this class, email Prof. Cranmer (kyle.cranmer at nyu) with your NYU netID.
Note
In approaching this book, I am torn between different styles. I very much like the atomic nature of Quantum Field Theory by Mark Srednicki, which is readable and a useful reference without too much narrative. On the other hand, I want to blend hands-on coding elements with fundamental concepts, and I am inspired by the book Functional Differential Geometry by Gerald Jay Sussman and Jack Wisdom.