Stacking Replicate Weights

Suppose we select a stratified simple random sample, with $H$ strata and $n_{h}$ sampling units in stratum $h$ . Then the usual Horvitz-Thompson variance estimator has degrees of freedom $d f = \sum_{h = 1}^{H} (n_{h} - 1)$ , and we thus need at least $d f$ replicates in order to obtain a fully efficient replicate estimator of the sampling variance. As a concrete example, suppose we have two strata ( $H = 2$ ) with $n_{h} = 6$ for both strata. Then the Horvitz-Thompson variance estimator has $d f = 10$ , and to get a fully efficient variance estimator, we would need $R = 12$ replicates with the JKn jackknife method (i.e., the stratified, delete-one jackknife), or $R = 10$ replicates with a more sophisticated method such as Fay’s generalized replication method.

For illustration, here’s what it would look like to create the 12 JKn replicates for the two strata.

Show code

library(survey)

sample_data <- data.frame(
  STRATUM = rep(1:2, each = 6),
  PSU     = 1:12
)

survey::jknweights(
  strata   = sample_data$STRATUM,
  psu      = sample_data$PSU,
  compress = FALSE
) |> getElement("repweights")

      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
 [1,]  0.0  1.2  1.2  1.2  1.2  1.2  1.0  1.0  1.0   1.0   1.0   1.0
 [2,]  1.2  0.0  1.2  1.2  1.2  1.2  1.0  1.0  1.0   1.0   1.0   1.0
 [3,]  1.2  1.2  0.0  1.2  1.2  1.2  1.0  1.0  1.0   1.0   1.0   1.0
 [4,]  1.2  1.2  1.2  0.0  1.2  1.2  1.0  1.0  1.0   1.0   1.0   1.0
 [5,]  1.2  1.2  1.2  1.2  0.0  1.2  1.0  1.0  1.0   1.0   1.0   1.0
 [6,]  1.2  1.2  1.2  1.2  1.2  0.0  1.0  1.0  1.0   1.0   1.0   1.0
 [7,]  1.0  1.0  1.0  1.0  1.0  1.0  0.0  1.2  1.2   1.2   1.2   1.2
 [8,]  1.0  1.0  1.0  1.0  1.0  1.0  1.2  0.0  1.2   1.2   1.2   1.2
 [9,]  1.0  1.0  1.0  1.0  1.0  1.0  1.2  1.2  0.0   1.2   1.2   1.2
[10,]  1.0  1.0  1.0  1.0  1.0  1.0  1.2  1.2  1.2   0.0   1.2   1.2
[11,]  1.0  1.0  1.0  1.0  1.0  1.0  1.2  1.2  1.2   1.2   0.0   1.2
[12,]  1.0  1.0  1.0  1.0  1.0  1.0  1.2  1.2  1.2   1.2   1.2   0.0
attr(,"class")
[1] "repweights"

In practice, we might have computational constraints and can’t simply make all of the replicates that we would need to get a fully efficient estimator. In the two-stratum example above, say, we might only be able to make $6$ replicates, even though we need at least $10$ to get a fully-efficient estimator. So in practice, it’s common to simply create replicate weights separately in each stratum and then stack them together. For example:

Show code

stratum_one_replicates <- sample_data |> 
  subset(STRATUM == 1) |>
  svydesign(data = _, ids = ~ PSU, weights = ~ 1) |>
  as.svrepdesign(type = "JK1", compress = FALSE) |>
  getElement("repweights")

stratum_two_replicates <- sample_data |> 
  subset(STRATUM == 2) |>
  svydesign(data = _, ids = ~ PSU, weights = ~ 1) |>
  as.svrepdesign(type = "JK1", compress = FALSE) |>
  getElement("repweights")

stacked_reps <- rbind(stratum_one_replicates, stratum_two_replicates)

rownames(stacked_reps) <- rep(c("Stratum 1", "Stratum 2"), each = 6)

print(stacked_reps)

          [,1] [,2] [,3] [,4] [,5] [,6]
Stratum 1  0.0  1.2  1.2  1.2  1.2  1.2
Stratum 1  1.2  0.0  1.2  1.2  1.2  1.2
Stratum 1  1.2  1.2  0.0  1.2  1.2  1.2
Stratum 1  1.2  1.2  1.2  0.0  1.2  1.2
Stratum 1  1.2  1.2  1.2  1.2  0.0  1.2
Stratum 1  1.2  1.2  1.2  1.2  1.2  0.0
Stratum 2  0.0  1.2  1.2  1.2  1.2  1.2
Stratum 2  1.2  0.0  1.2  1.2  1.2  1.2
Stratum 2  1.2  1.2  0.0  1.2  1.2  1.2
Stratum 2  1.2  1.2  1.2  0.0  1.2  1.2
Stratum 2  1.2  1.2  1.2  1.2  0.0  1.2
Stratum 2  1.2  1.2  1.2  1.2  1.2  0.0

What Is a “Good” Way to Stack Replicate Weights?

It turns out that there are good ways to stack replicates and there are also some pretty bad ways to stack replicates, depending on the replication method and the way the sample was selected. The example above would be fine for a stratified simple random sample, but it could be really bad for a systematic sample. The safest thing to do in general is to randomize the order of each stratum’s replicates before combining replicates across strata.

But why though? What separates a good method of stacking from a bad method of stacking?

Our Goal with Replication

One way of thinking about replication methods is that they are supposed to yield variance estimates for linear statistics (e.g., totals, or means in certain cases) that match some target variance estimator, such as the Horvitz-Thompson variance estimator. The easiest way to see this is to write down our variance estimator as a quadratic form, $\hat{V} (\hat{Y}) = {\overset{˘}{y}}^{'} Σ \overset{˘}{y}$ , where $\overset{˘}{y}$ is the vector of weighted values $y_{i} / π_{i}$ , for $i = 1, \dots, n$ . For a stratified simple random sample with two strata and $n_{h} = 3$ in each stratum, we can see below what the quadratic form matrix $Σ$ looks like.

Show code

ht_quad_form <- data.frame(
  STRATUM = rep(1:2, each = 3),
  PSU     = 1:6
) |>
  svydesign(data = _, strata = ~ STRATUM,
            ids = ~ PSU, weights = ~ 1) |>
  svrep::get_design_quad_form(
    variance_estimator = "Ultimate Cluster"
  )

ht_quad_form

6 x 6 sparse Matrix of class "dsCMatrix"
                                  
[1,]  1.0 -0.5 -0.5  .    .    .  
[2,] -0.5  1.0 -0.5  .    .    .  
[3,] -0.5 -0.5  1.0  .    .    .  
[4,]  .    .    .    1.0 -0.5 -0.5
[5,]  .    .    .   -0.5  1.0 -0.5
[6,]  .    .    .   -0.5 -0.5  1.0

An important observation is that if units $i$ and $j$ are in different strata, then entry $i j$ in the quadratic form matrix $Σ$ equals $0$ (that is, $σ_{i j} = 0$ ).

With replication, we create $R$ sets of replicate weights, where the $r$ -th set of replicate weights is a vector of length $n$ denoted $a^{(r)}$ , whose $k$ -th value is denoted $a_{k}^{(r)}$ . This yields $R$ replicate estimates of the population total, ${\hat{T}}_{y}^{(r)} = \sum_{k \in s} a_{k}^{(r)} {\overset{˘}{y}}_{k}$ , for $r = 1, \dots, R$ , which can be used to estimate sampling variance.

The standard formula is:

$v_{R} ({\hat{T}}_{y}) = C \sum_{b = 1}^{R} c_{r} {({\hat{T}}_{y}^{(r)} - {\hat{T}}_{y})}^{2}$

where $C$ and $c_{r}, r = 1, \dots, R$ are constants that depend on the replication method.

Like the target variance estimator, the replication variance estimator can also be written as a quadratic form:

$v_{R} ({\hat{T}}_{y}) = {\overset{˘}{y}}^{'} Σ_{R} \overset{˘}{y}$

where

$Σ_{R} = C \sum_{b = 1}^{R} c_{r} (a^{(r)} - 1_{n}) {(a^{(r)} - 1_{n})}^{'}$

Ideally, we want the matrix $Σ_{R}$ to exactly equal $Σ$ . That’s possible if we use enough replicates. For instance, with the example we just looked at, if we create $R = 6$ replicates with the JKn method, which has $C = 1$ and $c_{r} = 2 / 3, r = 1, \dots, 6$ , then we have the following replicate weights:

Show code

rep_weights <- jknweights(
  strata = rep(1:2, each = 3),
  psu     = 1:6,
  compress = FALSE
) |> getElement("repweights")

rownames(rep_weights) <- rep(c("Stratum 1", "Stratum 2"), each = 3)

print(rep_weights)

          [,1] [,2] [,3] [,4] [,5] [,6]
Stratum 1  0.0  1.5  1.5  1.0  1.0  1.0
Stratum 1  1.5  0.0  1.5  1.0  1.0  1.0
Stratum 1  1.5  1.5  0.0  1.0  1.0  1.0
Stratum 2  1.0  1.0  1.0  0.0  1.5  1.5
Stratum 2  1.0  1.0  1.0  1.5  0.0  1.5
Stratum 2  1.0  1.0  1.0  1.5  1.5  0.0
attr(,"class")
[1] "repweights"

And we have $Σ_{R} = Σ$ .

Show code

Sigma_R <- (2/3) * tcrossprod((rep_weights - 1), (rep_weights - 1))

print(Sigma_R)

          Stratum 1 Stratum 1 Stratum 1 Stratum 2 Stratum 2 Stratum 2
Stratum 1       1.0      -0.5      -0.5       0.0       0.0       0.0
Stratum 1      -0.5       1.0      -0.5       0.0       0.0       0.0
Stratum 1      -0.5      -0.5       1.0       0.0       0.0       0.0
Stratum 2       0.0       0.0       0.0       1.0      -0.5      -0.5
Stratum 2       0.0       0.0       0.0      -0.5       1.0      -0.5
Stratum 2       0.0       0.0       0.0      -0.5      -0.5       1.0

Again, we note that the entries for sampling units in different strata are equal to 0.

Reducing the Number of Replicates

In order to make $Σ_{R}$ exactly equal to $Σ$ , we might need to create many replicates. To use fewer replicates, we would need to content ourselves with $Σ_{R} \approx Σ$ .

The method of stacking replicates from different strata is one way of obtaining $Σ_{R} \approx Σ$ .

Below we can see stacked jackknife replicate weights:

Show code

stratum_one_reps <- jknweights(
  strata = rep(1, each = 3),
  psu     = 1:3,
  compress = FALSE
) |> getElement("repweights")

stratum_two_reps <- jknweights(
  strata = rep(2, each = 3),
  psu     = 4:6,
  compress = FALSE
) |> getElement("repweights")

stacked_rep_weights <- rbind(
  stratum_one_reps,
  stratum_two_reps
)

rownames(stacked_rep_weights) <- rep(c("Stratum 1", "Stratum 2"), each = 3)

print(stacked_rep_weights)

          [,1] [,2] [,3]
Stratum 1  0.0  1.5  1.5
Stratum 1  1.5  0.0  1.5
Stratum 1  1.5  1.5  0.0
Stratum 2  0.0  1.5  1.5
Stratum 2  1.5  0.0  1.5
Stratum 2  1.5  1.5  0.0

And now we can see what $Σ_{R}$ looks like for the stacked replicates.

Show code

Sigma_R_stacked <- (2/3) * tcrossprod((stacked_rep_weights - 1), (stacked_rep_weights - 1))

print(Sigma_R_stacked)

          Stratum 1 Stratum 1 Stratum 1 Stratum 2 Stratum 2 Stratum 2
Stratum 1       1.0      -0.5      -0.5       1.0      -0.5      -0.5
Stratum 1      -0.5       1.0      -0.5      -0.5       1.0      -0.5
Stratum 1      -0.5      -0.5       1.0      -0.5      -0.5       1.0
Stratum 2       1.0      -0.5      -0.5       1.0      -0.5      -0.5
Stratum 2      -0.5       1.0      -0.5      -0.5       1.0      -0.5
Stratum 2      -0.5      -0.5       1.0      -0.5      -0.5       1.0

The key thing to note is that the entries of the two matrices, $Σ_{R}$ and $Σ$ exactly match for entries corresponding to sampling units in the same stratum, and the mismatches only occur for entries corresponding to sampling units in different strata.

Important

The downside of stacking replicate weights from different strata is that the between-stratum elements of $Σ_{R}$ are incorrect: they should all be zero, but stacking causes some of them to be nonzero. This adds error to the resulting variance estimates.

Using Randomization to Prevent Bias

Because stacking replicate weights introduces error, we might want to make sure that this error is random rather than systematic, so that our variance estimates don’t end up being biased one way or the other. Put another way, if we’re going to content ourselves with $Σ_{R} \approx Σ$ , we would probably feel better if we knew that $E [Σ_{R}] = Σ$ , where the expectation is evaluated with respect to a randomization process used to form the replicates.

To make $E [Σ_{R}] = Σ$ , we really only need to worry about the elements of $Σ_{R}$ that correspond to pairs of units in different strata. That is, if sampling units $i$ and $j$ are in different strata, then we need to make sure $E [σ_{R, i j}] = σ_{i j}$ .

The simplest way to do this is just to randomly shuffle the order of the replicates in each stratum, before stacking the replicates.

Show code

set.seed(1999)

stacked_rep_weights <- rbind(
  stratum_one_reps[,sample(1:3, size = 3, replace = FALSE)],
  stratum_one_reps[,sample(1:3, size = 3, replace = FALSE)]
)

rownames(stacked_rep_weights) <- rep(c("Stratum 1", "Stratum 2"), each = 3)

print(stacked_rep_weights)

          [,1] [,2] [,3]
Stratum 1  0.0  1.5  1.5
Stratum 1  1.5  0.0  1.5
Stratum 1  1.5  1.5  0.0
Stratum 2  1.5  1.5  0.0
Stratum 2  0.0  1.5  1.5
Stratum 2  1.5  0.0  1.5

We can see this using a Monte Carlo simulation. Below, our Monte Carlo estimate of $E [Σ_{R}]$ shows that the entries corresponding to pairs of units within the same stratum are exactly correct (i.e., match entries in $Σ$ ), while the entries corresponding to pairs of units from different strata are also on average correct.

Show code

expected_Sigma_R_stacked <- replicate(n = 100000, expr = {
  # Stack replicates from each stratum, after shuffling
  stacked_rep_weights <- rbind(
    stratum_one_reps[,sample(1:3, size = 3, replace = FALSE)],
    stratum_one_reps[,sample(1:3, size = 3, replace = FALSE)]
  )
  rownames(stacked_rep_weights) <- rep(c("Stratum 1", "Stratum 2"), each = 3)
  
  # Calculate Sigma_R
  Sigma_R_stacked <- (2/3) * tcrossprod((stacked_rep_weights - 1), (stacked_rep_weights - 1))
  
  return(Sigma_R_stacked)
}, simplify = FALSE) |> 
  # Average across Monte Carlo simulations
  Reduce(f = `+`) |> magrittr::divide_by(100000) |> round(2)

expected_Sigma_R_stacked

          Stratum 1 Stratum 1 Stratum 1 Stratum 2 Stratum 2 Stratum 2
Stratum 1       1.0      -0.5      -0.5       0.0       0.0       0.0
Stratum 1      -0.5       1.0      -0.5       0.0       0.0       0.0
Stratum 1      -0.5      -0.5       1.0       0.0       0.0       0.0
Stratum 2       0.0       0.0       0.0       1.0      -0.5      -0.5
Stratum 2       0.0       0.0       0.0      -0.5       1.0      -0.5
Stratum 2       0.0       0.0       0.0      -0.5      -0.5       1.0

Some Possible Causes of Bias

The error in the between-stratum elements of $Σ_{R}$ is a problem if there is some systematic pattern behind the ordering of sampling units in each stratum. This can happen by accident more often than you might think. For example, maybe your sampling frame was sorted in some way and you accidentally retained that frame ordering in your sample. That can easily happen, like so:

Show code

library(dplyr)    # For data manipulation
library(sampling) # For sampling

# Order the sampling frame of libraries from smallest to largest
  sampling_frame <- svrep::library_census |> arrange(LIBRARIA)

# Select the sample
  is_sampled <- sampling::srswor(n = 5, N = nrow(sampling_frame)) |>
    as.logical()
  
  selected_sample <- sampling_frame[is_sampled,]

# Inspect the sample data
  selected_sample |> dplyr::select(LIBNAME, LIBRARIA)

# A tibble: 5 × 2
  LIBNAME                                LIBRARIA
  <chr>                                     <dbl>
1 GERALDINE E. ANDERSON VILLAGE LIBRARY      0.7 
2 SONOMA COUNTY PUBLIC LAW LIBRARY           1   
3 GLADSTONE AREA SCHOOL & PUBLIC LIBRARY     1   
4 CALAIS FREE LIBRARY                        3.18
5 ABINGTON TWP PUBLIC LIBRARY                8.93

Or if you use unequal-probability sampling, your software might roughly order the sample data file by “number of hits”, which would mean that the sampling file is approximately sorted by the sampling weights.

Most of us probably wouldn’t think to double-check that our data file is actually unordered: we’re just happy that we got a random sample with the correct inclusion probabilities and weights, and we probably don’t think much about the order of the data file records. But if we are eventually going to form replicates using a stacking approach, we should be wary of the possibility that the data has some non-random order to it. So generally speaking, it’s a good idea to shuffle replicates before stacking.

Using the ‘svrep’ R Package to Stack Replicates

The ‘svrep’ package includes a few functions that are helpful for this. First, there’s the function shuffle_replicates() which “shuffles” the order of the replicates in a survey design object.

Show code

library(survey)
library(svrep)

# Create a replicate weights design object in R
  sample_data <- data.frame(
    STRATUM = rep(1:2, each = 6),
    PSU     = 1:12
  )
  
  stratum_one_rep_design <- sample_data |> 
    subset(STRATUM == 1) |>
    svydesign(data = _, ids = ~ PSU, weights = ~ 1) |>
    as.svrepdesign(type = "JK1", compress = FALSE)

# Inspect replicate weights
  stratum_one_rep_design$repweights

     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]  0.0  1.2  1.2  1.2  1.2  1.2
[2,]  1.2  0.0  1.2  1.2  1.2  1.2
[3,]  1.2  1.2  0.0  1.2  1.2  1.2
[4,]  1.2  1.2  1.2  0.0  1.2  1.2
[5,]  1.2  1.2  1.2  1.2  0.0  1.2
[6,]  1.2  1.2  1.2  1.2  1.2  0.0
attr(,"class")
[1] "repweights"

Show code

# Shuffle the order of the replicates
  stratum_one_shuffled_rep_design <- stratum_one_rep_design |>
    shuffle_replicates()

# Inspect the shuffled replicate weight columns
  stratum_one_shuffled_rep_design$repweights

     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]  0.0  1.2  1.2  1.2  1.2  1.2
[2,]  1.2  1.2  1.2  0.0  1.2  1.2
[3,]  1.2  1.2  0.0  1.2  1.2  1.2
[4,]  1.2  1.2  1.2  1.2  1.2  0.0
[5,]  1.2  0.0  1.2  1.2  1.2  1.2
[6,]  1.2  1.2  1.2  1.2  0.0  1.2

Second, there’s the function stack_replicate_designs(), which allows us to stack together two or more replicate designs, combining their columns of replicate weights, their data variables, and other metadata.

Show code

set.seed(2023)

# Create replicates for a second stratum
  stratum_two_rep_design <- sample_data |> 
    subset(STRATUM == 2) |>
    svydesign(data = _, ids = ~ PSU, weights = ~ 1) |>
    as.svrepdesign(type = "JK1", compress = FALSE)

# Combine the replicate designs from both strata,
# after independently shuffling their replicates
  
  stacked_rep_design <- stack_replicate_designs(
    stratum_one_rep_design |> shuffle_replicates(),
    stratum_two_rep_design |> shuffle_replicates()
  )

This allowed us to create just 6 replicates for a design with 12 PSUs.

Show code

# Create a data frame with the weights and specific variables of interest
stacked_rep_design |>
  as_data_frame_with_weights(
    vars_to_keep   = c("STRATUM", "PSU"),
    full_wgt_name  = "BASE_WGT",
    rep_wgt_prefix = "REP_WGT_"
  )

   STRATUM PSU BASE_WGT REP_WGT_1 REP_WGT_2 REP_WGT_3 REP_WGT_4 REP_WGT_5 REP_WGT_6
1        1   1        1       1.2       0.0       1.2       1.2       1.2       1.2
2        1   2        1       1.2       1.2       1.2       1.2       0.0       1.2
3        1   3        1       1.2       1.2       1.2       0.0       1.2       1.2
4        1   4        1       1.2       1.2       0.0       1.2       1.2       1.2
5        1   5        1       0.0       1.2       1.2       1.2       1.2       1.2
6        1   6        1       1.2       1.2       1.2       1.2       1.2       0.0
7        2   7        1       1.2       0.0       1.2       1.2       1.2       1.2
8        2   8        1       0.0       1.2       1.2       1.2       1.2       1.2
9        2   9        1       1.2       1.2       1.2       1.2       0.0       1.2
10       2  10        1       1.2       1.2       1.2       0.0       1.2       1.2
11       2  11        1       1.2       1.2       0.0       1.2       1.2       1.2
12       2  12        1       1.2       1.2       1.2       1.2       1.2       0.0