| ID | Height | Shoe Size |
|---|---|---|
| 1 | 5.5 | 7 |
| 2 | 6.2 | 9 |
| 3 | 6.3 | 8 |
Recently, I’ve written a few posts about how survey sampling variances can be estimated using the method of linearization based on influence functions. The method is quite useful, but it is hard to find a complete proof of why it works. The first paper to contain a purported proof of the method (Deville, 1999) leaves a few key parts unexplained and, I think, also has an unfortunate typo or two in its proof. Two later papers (Goga, Deville, and Ruiz-Gazen 2009; Goga and Ruiz-Gazen 2013) contain clearer proofs in their appendices.
All of the proofs above are fairly intimidating, and so the purpose of this blog post is to add more context and detail to the proof from Goga, Deville, and Ruiz-Gazen, 2009.
Notation and Background
The method of linearization using influence functions applies to a broad class of estimators used in finite population inference which are referred to as “substitution estimators”. As the name suggests, substitution estimators are estimators calculated by pretending your (weighted) sample is the population. The weighted sample mean, for example, is a substitution estimator.
This idea is formalized using the concept of functionals. Loosely speaking, a functional is a recipe that indicates how to calculate a quantity of interest (a mean, ratio, quantile, etc.) given a population and a “measure” which assigns weight to each person in the population. One example of a measure is the “population measure”, which equals 1 for each person in the population and 0 for persons outside the population of interest. Another example is the “sample measure”, which equals the value of a sample weight for each person in the sample and zero for every person not in the sample. More formally, a functional is a function whose input is itself a function: the input, called a “measure”, assigns a weight to each value in a space.
Measures
Let’s give a more concrete example of what we mean by measure. Suppose we have a population of three individuals, with heights and shoe sizes as described in the table at the top of this post.
We can represent this population with a “population measure”, which is a function $M$ that assigns weight 1 to each person in the population:

$$M(k) = 1 \text{ for } k \in \{1, 2, 3\}, \qquad M(k) = 0 \text{ otherwise}$$
Now, suppose we draw a simple random sample without replacement of size two, resulting in us selecting persons 2 and 3. Then we can represent our sample with a “sample measure”, which is a function denoted $\hat{M}$. Each sampled person receives their sampling weight (here $1/\pi_k = 3/2$, since each person’s probability of selection is $\pi_k = 2/3$), and everyone else receives weight zero:

$$\hat{M}(k) = 3/2 \text{ for } k \in \{2, 3\}, \qquad \hat{M}(k) = 0 \text{ otherwise}$$
In general, the sample measure is based on each sample member’s “probability of inclusion” in the sample, denoted $\pi_k = P(k \in s)$:

$$\hat{M}(k) = \begin{cases} 1/\pi_k & \text{if } k \in s \\ 0 & \text{if } k \notin s \end{cases}$$
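To make this concrete, here is a minimal sketch (my own, not from the original post) representing the population and sample measures as Python dictionaries mapping person ID to weight, and computing measure-weighted totals:

```python
# A minimal sketch (not from the original post): the population and sample
# measures represented as plain Python dicts mapping person ID -> weight.
heights = {1: 5.5, 2: 6.2, 3: 6.3}

# Population measure: weight 1 for every person in the population.
population_measure = {1: 1.0, 2: 1.0, 3: 1.0}

# Sample measure for the sample {2, 3}: weight 1/pi_k = 1/(2/3) = 1.5
# for sampled persons, 0 for everyone else.
sample_measure = {1: 0.0, 2: 1.5, 3: 1.5}

def total(y, measure):
    """Measure-weighted total: sum over k of y_k * M(k)."""
    return sum(y[k] * measure[k] for k in y)

print(round(total(heights, population_measure), 6))  # 18.0, the true total
print(round(total(heights, sample_measure), 6))      # 18.75, the weighted estimate
```

The total under the sample measure (18.75) is the familiar Horvitz–Thompson estimate of the population total of heights.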
Functionals
With a clearer idea of what a measure is, we can now look at some examples of functionals. A functional $T$ takes a measure as input and returns a number. The population total of a variable $y$, for example, is the functional

$$T(M) = \int y \, dM = \sum_{k \in U} y_k \, M(k)$$

and the population mean is the functional

$$T(M) = \frac{\int y \, dM}{\int dM}$$

A substitution estimator is obtained by plugging the sample measure $\hat{M}$ into the same recipe: for the total, $T(\hat{M}) = \sum_{k \in s} y_k / \pi_k$, which is the familiar Horvitz–Thompson estimator.
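As an illustrative sketch (my own, not from the original post), the mean functional can be written as a small Python function and applied to either measure; applying it to the sample measure yields the substitution estimator of the mean:

```python
# The mean as a functional T(M) = (integral of y dM) / (integral of dM),
# applied to both the population measure and the sample measure described above.
heights = {1: 5.5, 2: 6.2, 3: 6.3}
population_measure = {1: 1.0, 2: 1.0, 3: 1.0}  # weight 1 per population member
sample_measure = {1: 0.0, 2: 1.5, 3: 1.5}      # weight 1/pi_k per sample member

def mean_functional(y, M):
    """T(M) = sum_k y_k * M(k) / sum_k M(k)."""
    return sum(y[k] * M[k] for k in y) / sum(M.values())

print(round(mean_functional(heights, population_measure), 6))  # 6.0, the population mean
print(round(mean_functional(heights, sample_measure), 6))      # 6.25, the substitution estimate
```

The same recipe is evaluated twice; only the measure changes, which is exactly the sense in which the weighted sample “pretends to be” the population.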
Influence Functions
The influence function of a functional $T$ is a function denoted $IT(M, y)$, which describes how sensitive $T(M)$ is to a small amount of additional mass placed at the point $y$.

Mathematically, it is a derivative:

$$IT(M, y) = \lim_{t \to 0} \frac{T(M + t\,\delta_y) - T(M)}{t}$$

where $\delta_y$ denotes the measure placing unit mass at the point $y$.
For example, consider again our tiny population with values $y_1 = 5.5$, $y_2 = 6.2$, and $y_3 = 6.3$, and let $T$ be the mean functional, so that $T(M) = \bar{Y} = 6.0$.

Then the influence function at the point $y_2 = 6.2$ is

$$IT(M, y_2) = \lim_{t \to 0} \frac{T(M + t\,\delta_{y_2}) - T(M)}{t}$$

And we can determine its value by noting:

$$T(M + t\,\delta_{y_2}) = \frac{18 + 6.2t}{3 + t}$$

If we work it out, we’d find that the influence function equals the following:

$$IT(M, y_2) = \frac{6.2 \times 3 - 18}{3^2} = \frac{0.2}{3} \approx 0.067$$

And in general, the influence function for a population mean is given by

$$IT(M, y) = \frac{y - \bar{Y}}{N}$$
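A quick numerical sanity check (my own sketch, with a hypothetical helper for adding a small point mass to the measure) confirms that the finite-difference version of this derivative matches the closed-form influence function for the mean:

```python
# Checking that the finite-difference version of the derivative defining the
# influence function matches the closed-form result (y - Ybar)/N for the mean.
heights = {1: 5.5, 2: 6.2, 3: 6.3}
M = {1: 1.0, 2: 1.0, 3: 1.0}  # population measure

def mean_with_point_mass(y, measure, t=0.0, at=None):
    """T(M + t*delta_at): the mean after adding mass t at the value `at`."""
    num = sum(y[k] * measure[k] for k in y)
    den = sum(measure.values())
    if at is not None:
        num += t * at
        den += t
    return num / den

t = 1e-6
y_point = 6.2
finite_diff = (mean_with_point_mass(heights, M, t, y_point)
               - mean_with_point_mass(heights, M)) / t

Ybar, N = 6.0, 3
closed_form = (y_point - Ybar) / N  # = 0.2/3, about 0.0667

print(finite_diff, closed_form)
```

Shrinking `t` further drives the finite difference toward the closed-form value, as the limit definition suggests.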
Assumptions of the Theorem
The first set of assumptions (1 through 4) applies to the sample design and must hold for each variable of interest, $y$.
Assumption 1. We assume that $n/N \to f \in (0, 1)$ as $N \to \infty$; that is, the sampling fraction converges to a constant.

Assumption 2. We assume that $\lim_{N \to \infty} \frac{1}{N} \sum_{k \in U} y_k$ exists, for any variable of interest $y$.

Assumption 3. As $N \to \infty$, we have that $\frac{1}{N}\left(\sum_{k \in s} y_k/\pi_k - \sum_{k \in U} y_k\right) \to 0$ in probability, for any variable of interest $y$.

Assumption 4. As $N \to \infty$, $\frac{\sqrt{n}}{N}\left(\sum_{k \in s} y_k/\pi_k - \sum_{k \in U} y_k\right) \to \mathcal{N}(0, \sigma_y^2)$ in distribution, for any variable of interest $y$, where $\sigma_y^2$ is finite.
The second set of assumptions, 5 through 7, applies to the functional $T$ that defines the population parameter $T(M)$ and its substitution estimator $T(\hat{M})$.
Assumption 5. We assume that $T$ is homogeneous, in that there exists a real number $\alpha$ (dependent on $T$) such that $T(rM) = r^{\alpha}\, T(M)$ for any real $r > 0$.

Assumption 6. We assume that $\lim_{N \to \infty} N^{-\alpha}\, T(M)$ exists.

Assumption 7. We assume that $T$ is Fréchet differentiable.
The following definition is adapted from Huber (1981), pages 34-35.
Definition: The functional $T$ is Fréchet differentiable at $M$ if
there exists a functional $dT(M, \cdot)$ (linear, continuous),
such that for any measure $M'$,

$$T(M') - T(M) = dT(M, M' - M) + o\big(d(M', M)\big)$$

where $d$ is some metric on the set of relevant measures for which the following conditions hold:

- For all measures $M'$, the set $\{M : d(M, M') < \varepsilon\}$ is open for all $\varepsilon > 0$.
- If $M_t = (1 - t)M + tM'$, then $d(M_t, M) = O(t)$.
Formal Statement of the Theorem
Let $\theta = T(M)$ denote the population parameter of interest, let $\hat{\theta} = T(\hat{M})$ denote its substitution estimator, and define the linearized variable $u_k = IT(M, y_k)$ for each person $k$ in the population.

If Assumptions 1 through 7 hold, then:

$$\frac{\sqrt{n}}{N^{\alpha}}\left(T(\hat{M}) - T(M)\right) = \frac{\sqrt{n}}{N^{\alpha}}\left(\sum_{k \in s} \frac{u_k}{\pi_k} - \sum_{k \in U} u_k\right) + o_p(1)$$

In short, asymptotically, the sampling variance of the sample statistic $T(\hat{M})$ equals the sampling variance of the Horvitz–Thompson estimator of the population total of the linearized variable:

$$AV\big(T(\hat{M})\big) = \mathrm{Var}\left(\sum_{k \in s} \frac{u_k}{\pi_k}\right)$$
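Here is a numerical sketch (my own, not from the original post) of the theorem’s conclusion, using the three-person population from earlier under simple random sampling without replacement with $n = 2$. For the mean under this design the linearization happens to be exact, so the two variances agree:

```python
# Compare: (a) the exact design variance of the substitution estimator of the
# mean, and (b) the exact design variance of the Horvitz-Thompson estimator of
# the total of the linearized variable u_k = (y_k - Ybar)/N.
from itertools import combinations

y = {1: 5.5, 2: 6.2, 3: 6.3}
N, n = 3, 2
pi = n / N                              # inclusion probability (equal for all k)
Ybar = sum(y.values()) / N              # population mean, T(M)
u = {k: (y[k] - Ybar) / N for k in y}   # linearized variable u_k = IT(M, y_k)

sub_estimates, ht_totals = [], []
for s in combinations(y, n):            # all three equally likely samples
    # Substitution estimator: the mean functional applied to the sample measure.
    sub_estimates.append(sum(y[k] / pi for k in s) / (len(s) / pi))
    # Horvitz-Thompson estimator of the population total of u_k.
    ht_totals.append(sum(u[k] / pi for k in s))

def design_variance(estimates):
    """Exact variance over the equally likely samples."""
    m = sum(estimates) / len(estimates)
    return sum((e - m) ** 2 for e in estimates) / len(estimates)

print(round(design_variance(sub_estimates), 6))  # about 0.031667
print(round(design_variance(ht_totals), 6))      # about 0.031667
```

With only three possible samples we can enumerate the design variance exactly rather than simulate; for more complex functionals the two variances agree only asymptotically, which is exactly what the theorem claims.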
Proof
From Assumption 5 (homogeneity of $T$), we have $T(\hat{M}) - T(M) = N^{\alpha}\left(T(\hat{M}/N) - T(M/N)\right)$, so it suffices to study the difference between the functional applied to the scaled measures $\hat{M}/N$ and $M/N$.
Let us provide the space of relevant measures with a metric $d$ satisfying the conditions given in the definition of Fréchet differentiability above.
From Assumption 4, then, we have that $\int y \, d(\hat{M}/N) - \int y \, d(M/N) = O_p(n^{-1/2})$ for any variable of interest $y$.
Why?
Assumption 4 states that $\frac{\sqrt{n}}{N}\left(\sum_{k \in s} y_k/\pi_k - \sum_{k \in U} y_k\right)$ converges in distribution to a Normal distribution (with mean zero and finite variance), and so we must have that $P\left(\left|\frac{\sqrt{n}}{N}\left(\sum_{k \in s} y_k/\pi_k - \sum_{k \in U} y_k\right)\right| > c\right)$ can be reduced to any arbitrarily small number by choosing a sufficiently large constant $c$. This implies that $P\left(\left|\frac{1}{N}\left(\sum_{k \in s} y_k/\pi_k - \sum_{k \in U} y_k\right)\right| > c\, n^{-1/2}\right)$ can be reduced to any arbitrarily small number by choosing sufficiently large $c$. Noting that $\int y \, d(\hat{M}/N) = \frac{1}{N}\sum_{k \in s} y_k/\pi_k$ and $\int y \, d(M/N) = \frac{1}{N}\sum_{k \in U} y_k$, we thus have that $\int y \, d(\hat{M}/N) - \int y \, d(M/N) = O_p(n^{-1/2})$.
Because of the equivalence between the metric $d$ and convergence of the totals $\int y \, d(\cdot)$ in terms of convergence properties, this implies that $d\left(\hat{M}/N, M/N\right) = O_p(n^{-1/2})$.
Now because $T$ is Fréchet differentiable (Assumption 7), we can apply a first-order von Mises expansion of $T$ at $M/N$, evaluated at $\hat{M}/N$.
What is a first-order Von Mises Expansion?
The following definition is loosely taken from “Robust Statistics” by Hampel et al. (1986), Cabrera and Fernholz (1999), and Huber (1981).
Von Mises expansion. For distributions $F$ and $G$, the first-order von Mises expansion of $T$ at $F$ (which is derived from a Taylor series), evaluated at $G$, is given by

$$T(G) = T(F) + \int IT(F, y) \, d(G - F)(y) + o\big(d(F, G)\big)$$

for a suitable distance metric $d$.
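As a numerical illustration (my own sketch; the data are made up), we can check the expansion for a ratio functional $T(M) = t_y / t_x$, whose influence function is the standard linearized ratio $IT(M, (x, y)) = (y - Rx)/t_x$ with $R = t_y/t_x$:

```python
# First-order von Mises expansion for a ratio functional T(M) = t_y / t_x.
# Adding a small mass eps at a new point (x0, y0), the linear term built from
# the influence function should capture almost all of the change in T.
x = [2.0, 3.0, 5.0]
y = [5.5, 6.2, 6.3]

t_x, t_y = sum(x), sum(y)
R = t_y / t_x                        # T(F) = 1.8

eps, x0, y0 = 0.01, 4.0, 7.0         # perturbation: G = F + eps * delta_(x0, y0)
T_G = (t_y + eps * y0) / (t_x + eps * x0)

linear_term = eps * (y0 - R * x0) / t_x   # integral of IT(F, .) d(G - F)
remainder = (T_G - R) - linear_term

print(linear_term, remainder)  # the remainder is far smaller than the linear term
```

The remainder shrinks quadratically in `eps`, consistent with its being $o\big(d(F, G)\big)$ for a first-order expansion.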
Next, we note that for a functional of degree $\alpha$, the influence function satisfies $IT(rM, y) = r^{\alpha - 1}\, IT(M, y)$ for $r > 0$, and so $IT(M/N, y) = N^{1 - \alpha}\, IT(M, y)$. The linear term of the expansion therefore equals

$$\int IT(M/N, y) \, d\left(\hat{M}/N - M/N\right)(y) = \frac{1}{N^{\alpha}}\left(\sum_{k \in s} \frac{u_k}{\pi_k} - \sum_{k \in U} u_k\right)$$

Denote the remainder term of the expansion by $R$; by Fréchet differentiability, $R = o\big(d(\hat{M}/N, M/N)\big) = o_p(n^{-1/2})$.

And so we have that:

$$T(\hat{M}/N) - T(M/N) = \frac{1}{N^{\alpha}}\left(\sum_{k \in s} \frac{u_k}{\pi_k} - \sum_{k \in U} u_k\right) + o_p(n^{-1/2})$$

Multiplying both sides by $\sqrt{n}$ and noting from homogeneity (Assumption 5) that $T(\hat{M}/N) - T(M/N) = N^{-\alpha}\left(T(\hat{M}) - T(M)\right)$, this gives us the desired result:

$$\frac{\sqrt{n}}{N^{\alpha}}\left(T(\hat{M}) - T(M)\right) = \frac{\sqrt{n}}{N^{\alpha}}\left(\sum_{k \in s} \frac{u_k}{\pi_k} - \sum_{k \in U} u_k\right) + o_p(1) \qquad \blacksquare$$