# statistics bootstrap – Bootstrapping bioequivalence results R

I am trying to run a bootstrap simulation of a bioequivalence study in R with 3 pharmacokinetic parameters (AUC0-infinity [`AUCIFO`]AUClast [`AUCLST`]Cmax [`CMAX`]). I want to sample with replacement from the actual study dataset to increase the sample size from 25 to 100, perform the BE calculation (using the `bw2x2` function from the `BE` package), and then repeat that 1000 to 10000 times (depending on how long it takes to run) and generate a geometric mean ratio (GMR) for each parameter with 90%CI.

Here is a reprex dataset that I generated to represent the study date (sample size of 25):

``````require(dplyr)
require(BE)

set.seed(123)

aucifo_test <- rnorm(25, 2500, 25)
aucifo_ref <- rnorm(25, 2600, 25)
auclst_test <- rnorm(25, 2400, 25)
auclst_ref <- rnorm(25, 2400, 25)
cmax_test <- rnorm(25, 50, 5)
cmax_ref <- rnorm(25, 55, 5)

SUBJ = c(1:25)
GRP = sample(c(1,2), 25, replace = TRUE)

be_df <- rbind(data.frame(SUBJ = SUBJ, GRP = GRP, TRT = "R", AUCIFO = aucifo_ref, AUCLST = auclst_ref, CMAX = cmax_ref),
data.frame(SUBJ = SUBJ, GRP = GRP, TRT = "T",AUCIFO = aucifo_test, AUCLST = auclst_test, CMAX = cmax_test)) %>%
mutate(PRD = case_when(
GRP == 1 & TRT == "R" ~ 1,
GRP == 1 & TRT == "T" ~ 2,
GRP == 2 & TRT == "R" ~ 2,
GRP == 2 & TRT == "T" ~ 1
)) %>%
mutate(GRP = case_when(
GRP == 1 ~ "RT",
GRP == 2 ~ "TR"
)) %>%
select(GRP, PRD, SUBJ, TRT, AUCIFO, AUCLST, CMAX)

be_results <- be2x2(be_df, c("AUCIFO", "AUCLST", "CMAX"))
``````

The output of `be2x2` is a list, which can be subsetted by the column name to obtain the GMR. Here is the output of the BE results:

``````print(be_results)

\$AUCIFO
\$AUCIFO\$`Analysis of Variance (log scale)`
Sum Sq Df   Mean Sq  F value    Pr(>F)
SUBJECT        0.0024154 24 0.0001006   1.8132   0.07915 .
GROUP          0.0001999  1 0.0001999   2.0749   0.16322
SUBJECT(GROUP) 0.0022155 23 0.0000963   1.7355   0.09686 .
PERIOD         0.0003246  1 0.0003246   5.8476   0.02392 *
DRUG           0.0203062  1 0.0203062 365.8577 1.332e-15 ***
ERROR          0.0012766 23 0.0000555
TOTAL          0.0245614 49
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

\$AUCIFO\$`Between and Within Subject Variability`
Between Subject Within Subject
Variance Estimate                  2.041146e-05    5.55029e-05
Coefficient of Variation, CV(%)    4.517928e-01    7.45013e-01

\$AUCIFO\$`Least Square Means (geometric mean)`
Reference Drug Test Drug
Geometric Means       2601.982  2499.114

\$AUCIFO\$`90% Confidence Interval of Geometric Mean Ratio (T/R)`
Lower Limit Point Estimate Upper Limit
90% CI for Ratio   0.9570003      0.9604654   0.9639432

\$AUCIFO\$`Sample Size`
True Ratio=1 True Ratio=Point Estimate
80% Power Sample Size            2                         2

\$AUCLST
\$AUCLST\$`Analysis of Variance (log scale)`
Sum Sq Df    Mean Sq F value  Pr(>F)
SUBJECT        0.0019802 24 0.00008251  0.9748 0.52554
GROUP          0.0000025  1 0.00000248  0.0288 0.86668
SUBJECT(GROUP) 0.0019778 23 0.00008599  1.0159 0.48504
PERIOD         0.0003203  1 0.00032025  3.7837 0.06408 .
DRUG           0.0001160  1 0.00011600  1.3705 0.25371
ERROR          0.0019467 23 0.00008464
TOTAL          0.0043485 49
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

\$AUCLST\$`Between and Within Subject Variability`
Between Subject Within Subject
Variance Estimate                  6.746561e-07   8.464061e-05
Coefficient of Variation, CV(%)    8.213747e-02   9.200228e-01

\$AUCLST\$`Least Square Means (geometric mean)`
Reference Drug Test Drug
Geometric Means       2407.201  2399.873

\$AUCLST\$`90% Confidence Interval of Geometric Mean Ratio (T/R)`
Lower Limit Point Estimate Upper Limit
90% CI for Ratio    0.992516      0.9969559    1.001416

\$AUCLST\$`Sample Size`
True Ratio=1 True Ratio=Point Estimate
80% Power Sample Size            2                         2

\$CMAX
\$CMAX\$`Analysis of Variance (log scale)`
Sum Sq Df  Mean Sq F value   Pr(>F)
SUBJECT        0.17640 24 0.007350  0.6666 0.834823
GROUP          0.00032  1 0.000321  0.0419 0.839670
SUBJECT(GROUP) 0.17608 23 0.007656  0.6943 0.805917
PERIOD         0.00308  1 0.003078  0.2792 0.602300
DRUG           0.12204  1 0.122036 11.0680 0.002934 **
ERROR          0.25360 23 0.011026
TOTAL          0.55687 49
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

\$CMAX\$`Between and Within Subject Variability`
Between Subject Within Subject
Variance Estimate                             0     0.01102599
Coefficient of Variation, CV(%)               0    10.52948160

\$CMAX\$`Least Square Means (geometric mean)`
Reference Drug Test Drug
Geometric Means       53.51791  48.47896

\$CMAX\$`90% Confidence Interval of Geometric Mean Ratio (T/R)`
Lower Limit Point Estimate Upper Limit
90% CI for Ratio   0.8608553      0.9058456   0.9531872

\$CMAX\$`Sample Size`
True Ratio=1 True Ratio=Point Estimate
80% Power Sample Size            3                         6
``````

You can subset the results such that you isolate the GMR point estimate. For example if I want the point estimate for AUCIFO, I can enter `be_results\$AUCIFO\$`90% Confidence Interval of Geometric Mean Ratio (T/R)`[2]`.

I am trying to use the `bootstrap` package to run the bootstrapping itself. So far I have the following `theta` function:

``````theta <- function(x) {

temp <- data.frame(SAMP = sample(c(1:25), 100, replace=TRUE), ID = c(1:100)) #get random sample of 100 subject IDs

ref <- temp %>% left_join(., x %>% filter(TRT == "R"), by = c("SAMP" = "SUBJ")) %>% select(-SAMP) #join reference samples subject IDs with study data and replace ID with new value

test <- temp %>% left_join(., x %>% filter(TRT == "T"), by = c("SAMP" = "SUBJ")) %>% select(-SAMP) #join test samples subject IDs with study data and replace ID with new value

be_df <- rbind(ref,test) %>% rename(SUBJ=ID) #join test and reference into single dataset

be_results <- be2x2(be_df, c("AUCIFO", "AUCLST", "CMAX"))
AUCIFO_GMR <- c(be_results\$AUCIFO\$`90% Confidence Interval of Geometric Mean Ratio (T/R)`[2])
AUCLST_GMR <- c(be_results\$AUCLST\$`90% Confidence Interval of Geometric Mean Ratio (T/R)`[2])
CMAX_GMR <- c(be_results\$CMAX\$`90% Confidence Interval of Geometric Mean Ratio (T/R)`[2])
}
``````

However, when I try to run the following (small `nboot` for troubleshooting): `bootstrap(be_df, 10, theta)`I get the following error message: `Error in UseMethod("filter") : no applicable method for 'filter' applied to an object of class "list"`

I am thinking this has something to do with the fact that I am trying to use a dataframe in the `bootstrap` function. Any help is appreciated!

Solution

Thanks to @jay.sf for the helpful answer! Made some additional tweaks to the code to output a dataframe, which are below:

``````theta <- function() {

temp <- data.frame(SAMP = sample(c(1:25), 100, replace=TRUE), ID = c(1:100)) #get random sample of 100 subject IDs

ref <- temp %>% left_join(., x101_boot_be %>% filter(TRT == "R"), by = c("SAMP" = "SUBJ")) %>% select(-SAMP) #join reference samples subject IDs with study data and replace ID with new value

test <- temp %>% left_join(., x101_boot_be %>% filter(TRT == "T"), by = c("SAMP" = "SUBJ")) %>% select(-SAMP) #join test samples subject IDs with study data and replace ID with new value

be_df <- rbind(ref,test) %>% rename(SUBJ=ID) #join test and reference into single dataset

be_results <- be2x2_quiet(be_df, c("AUCIFO", "AUCLST", "CMAX"))
AUCIFO_GMR <- c(be_results\$AUCIFO\$`90% Confidence Interval of Geometric Mean Ratio (T/R)`[2])
AUCLST_GMR <- c(be_results\$AUCLST\$`90% Confidence Interval of Geometric Mean Ratio (T/R)`[2])
CMAX_GMR <- c(be_results\$CMAX\$`90% Confidence Interval of Geometric Mean Ratio (T/R)`[2])

output = list(c(AUCIFO_GMR=AUCIFO_GMR, AUCLST_GMR=AUCLST_GMR, CMAX_GMR=CMAX_GMR))

}

boostrap_results <- replicate(1000L, theta())

boostrap_results_df <- as.data.frame(do.call(rbind, boostrap_results))
``````