# Estimators¶

One of the main differences between topics in probability and topics in statistics is that in statistics we have some task in mind.
While a probability model \(p_X(X \mid \theta)\) is an object of study when discussing probability, in statistics we usually want to
*do* something with it.

The first example that we will consider is to estimate the true, unknown value \(\theta^*\) given some dataset \(\{x_i\}_{i=1}^N\) assuming that the data were drawn from \(X_i \sim p_X(X|\theta^*)\).

Definition

An estimator \(\hat{\theta}(x_1, \dots, x_N)\) is a function of the data that aims to estimate the true, unknown value \(\theta^*\), assuming that the data were drawn from \(X_i \sim p_X(X \mid \theta^*)\).

There are several concrete estimators for different quantities, but this is an abstract definition of what is meant by an estimator. It is useful to think of the estimator as a procedure that you apply to the data, and then you can ask about the properties of a given procedure.
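To make the "procedure applied to data" view concrete, here is a minimal sketch in Python (the Gaussian model, sample size, and true value \(\theta^* = 1.5\) are illustrative assumptions, not from the text): the sample mean is an estimator, and the number it returns for a particular dataset is the estimate.

```python
import numpy as np

def mean_estimator(x):
    """An estimator: a function mapping a dataset to an estimate."""
    return np.mean(x)

# Simulate a dataset from p_X(X | theta*) with theta* = 1.5 (illustrative)
rng = np.random.default_rng(0)
data = rng.normal(loc=1.5, scale=1.0, size=100)

# Applying the procedure to this specific dataset yields the estimate
estimate = mean_estimator(data)
```

The same function `mean_estimator` can be applied to any dataset; asking about its properties (bias, variance, and so on) means asking about the procedure, not about any one returned value.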

Terminology

These closely related terms have slightly different meanings:

- The *estimand* refers to the parameter \(\theta\) being estimated.
- The *estimator* refers to the function or procedure \(\hat{\theta}(x_1, \dots, x_N)\).
- The *estimate* is the specific value that an estimator takes (returns) for specific data.

We already introduced two estimators when studying Transformation properties of the likelihood and posterior:

- The maximum likelihood estimator: \(\hat{\theta}_\textrm{MLE} := \textrm{argmax}_\theta \, p(X=x \mid \theta)\)

- The maximum a posteriori estimator: \(\hat{\theta}_\textrm{MAP} := \textrm{argmax}_\theta \, p(\theta \mid X=x)\)

Note that both of these estimators are defined by procedures that you apply once you have specific data.
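As a sketch of how these procedures look in practice, the snippet below computes both estimators for a Bernoulli model by a brute-force grid search over \(\theta\) (the dataset, the Beta(2, 2) prior, and the grid are illustrative assumptions; in practice one would use the closed-form answers or a proper optimizer):

```python
import numpy as np

x = np.array([1, 1, 0, 1, 0, 1, 1, 1])   # toy Bernoulli data: 6 of 8 successes
thetas = np.linspace(0.01, 0.99, 999)    # grid over the parameter space

# Log-likelihood of the Bernoulli model at each grid point
k, n = x.sum(), len(x)
log_lik = k * np.log(thetas) + (n - k) * np.log(1 - thetas)

# Hypothetical Beta(2, 2) prior -> log-posterior up to an additive constant
log_prior = np.log(thetas) + np.log(1 - thetas)
log_post = log_lik + log_prior

theta_mle = thetas[np.argmax(log_lik)]   # argmax of the likelihood
theta_map = thetas[np.argmax(log_post)]  # argmax of the posterior
```

Here the grid search recovers the known closed forms: \(\hat{\theta}_\textrm{MLE} = k/n = 0.75\), and with a Beta(2, 2) prior \(\hat{\theta}_\textrm{MAP} = (k+1)/(n+2) = 0.7\).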

Notation

The estimate \(\hat{\theta}(X_1, \dots, X_N)\) depends on the random variables \(X_i\), so it is itself a random variable (unlike the parameter \(\theta\)). Often the estimate is denoted \(\hat{\theta}\) and the dependence on the data is implicit. Subscripts are often used to indicate which estimator is being used, e.g. the maximum likelihood estimator \(\hat{\theta}_\textrm{MLE}\) and the maximum a posteriori estimator \(\hat{\theta}_\textrm{MAP}\).
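One way to see that the estimate is a random variable is to simulate: repeatedly draw fresh datasets from the same \(p_X(X \mid \theta^*)\) and apply the same estimator to each (the Gaussian model, \(\theta^* = 1.5\), and the sample sizes below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
theta_star = 1.5  # illustrative true parameter value

# Draw 1000 independent datasets of size 50 and apply the sample-mean
# estimator to each one
estimates = np.array([
    rng.normal(loc=theta_star, size=50).mean() for _ in range(1000)
])
# The estimate varies from dataset to dataset: it has its own distribution
```

The spread of `estimates` is the sampling distribution of \(\hat{\theta}\), which is exactly the object whose properties (bias, variance) the rest of the discussion studies.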

Hint

It is often useful to consider two straw man estimators:

- A constant estimator: \(\hat{\theta}_\textrm{const} = \theta_0\) for some fixed \(\theta_0 \in \Theta\)

- A random estimator: \(\hat{\theta}_\textrm{random} =\) some random value for \(\theta\), independent of the data

Neither of these is a useful estimator, but their obvious properties can help clarify your thinking.
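Both straw men are trivial to write down, which is the point: they are procedures that ignore the data entirely (the default \(\theta_0 = 0\) and the uniform range below are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def constant_estimator(x, theta0=0.0):
    """Ignores the data and always returns the fixed value theta0."""
    return theta0

def random_estimator(x, low=-10.0, high=10.0):
    """Ignores the data and returns a random value for theta."""
    return rng.uniform(low, high)

data = rng.normal(loc=1.5, size=100)
constant_estimator(data)  # always 0.0, whatever the data
random_estimator(data)    # a fresh random value on each call
```

Because their behavior is so transparent (the constant estimator has zero variance but is biased unless \(\theta_0 = \theta^*\); the random estimator learns nothing from the data), they make good baselines when reasoning about what properties a sensible estimator should have.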