# Statistics and Data Science
This is the start of a book for a graduate-level course at NYU Physics titled *Statistics and Data Science*.
Here are some of the objectives of this course:
* **Learn essential concepts of probability**
  * Become familiar with how intuitive notions of probability are connected to formal foundations.
  * Overcome barriers presented by unfamiliar notation and terminology.
  * Internalize the transformation properties of distributions, the likelihood function, and other probabilistic objects.
  * Understand the differences between Bayesian and frequentist approaches, particularly in the context of physical theories.
  * Connect these concepts to modern data science tools and techniques, such as the scientific Python ecosystem and automatic differentiation.
* **Learn essential concepts of statistics**
  * Learn classical statistical procedures: point estimates, goodness-of-fit tests, hypothesis tests, confidence intervals, and credible intervals.
  * Become familiar with statistical decision theory.
  * Recognize probabilistic programs as statistical models.
  * Become familiar with the computational challenges found in statistical inference and the techniques developed to overcome them.
  * Understand the difference between statistical association and causal inference.
* **Learn essential concepts of software and computing**
  * Become familiar with the scientific Python ecosystem.
  * Become familiar with software testing through the use of nbgrader.
  * Become familiar with automatic differentiation and differentiable programming.
  * Become familiar with probabilistic programming.
* **Learn essential concepts of machine learning**
  * Become familiar with core tasks such as classification and regression.
  * Understand the notion of generalization.
  * Understand the role of regularization and inductive bias.
  * Become familiar with the taxonomy of models found in machine learning: linear models, kernel methods, neural networks, and deep learning.
  * Become familiar with the interplay of model, data, and learning (optimization) algorithms.
  * Touch on different learning settings: supervised learning, unsupervised learning, and reinforcement learning.
* **Learn essential concepts of data science**
  * Understand how data science connects to the topics above.
  * Gain confidence in using scientific Python and modern data science tools to analyze real data.
```{warning} The class website is under active development, and content will be added throughout the course.
```
```{tip} If you would like to audit this class, email Prof. Cranmer (kyle.cranmer at nyu) with your NYU NetID.
```
```{note}
In approaching this book, I am torn between different styles. I very much like the atomic nature of [Quantum Field Theory by Mark Srednicki](https://www.amazon.com/Quantum-Field-Theory-Mark-Srednicki/dp/0521864496), which is readable and a useful reference without too much narrative. On the other hand, I want to blend hands-on coding with fundamental concepts, and I am inspired by the book [Functional Differential Geometry by Gerald Jay Sussman and Jack Wisdom](https://mitpress.mit.edu/books/functional-differential-geometry).
```