| Title: | Analysis Results Data |
|---|---|
| Description: | Construct CDISC (Clinical Data Interchange Standards Consortium) compliant Analysis Results Data objects. These objects are used and re-used to construct summary tables, visualizations, and written reports. The package also exports utilities for working with these objects and creating new Analysis Results Data objects. |
| Authors: | Daniel D. Sjoberg [aut, cre] (ORCID: <https://orcid.org/0000-0003-0862-2018>), Becca Krouse [aut], Emily de la Rua [aut] (ORCID: <https://orcid.org/0009-0000-8738-5561>), Malan Bosman [aut] (ORCID: <https://orcid.org/0000-0002-3020-195X>), F. Hoffmann-La Roche AG [cph, fnd], GlaxoSmithKline Research & Development Limited [cph] |
| Maintainer: | Daniel D. Sjoberg <[email protected]> |
| License: | Apache License 2.0 |
| Version: | 0.8.0.9000 |
| Built: | 2026-05-28 18:04:49 UTC |
| Source: | https://github.com/insightsengineering/cards |
Data frame imported from the CDISC SDTM/ADaM Pilot Project
ADSL ADAE ADTTE ADLB
An object of class tbl_df (inherits from tbl, data.frame) with 254 rows and 49 columns.
An object of class tbl_df (inherits from tbl, data.frame) with 1191 rows and 56 columns.
An object of class tbl_df (inherits from tbl, data.frame) with 254 rows and 26 columns.
An object of class tbl_df (inherits from tbl, data.frame) with 5784 rows and 46 columns.
Use this function to add a new statistic row that is a function of the other statistics in an ARD.
add_calculated_row(
  x,
  expr,
  stat_name,
  by = c(all_ard_groups(), all_ard_variables(), any_of("context")),
  stat_label = stat_name,
  fmt_fun = NULL,
  fmt_fn = deprecated()
)
x |
( |
expr |
( |
stat_name |
( |
by |
( |
stat_label |
( |
fmt_fun |
( |
fmt_fn |
an ARD data frame of class 'card'
ard_summary(mtcars, variables = mpg) |>
  add_calculated_row(expr = max - min, stat_name = "range")

ard_summary(mtcars, variables = mpg) |>
  add_calculated_row(
    expr =
      dplyr::case_when(
        mean > median ~ "Right Skew",
        mean < median ~ "Left Skew",
        .default = "Symmetric"
      ),
    stat_name = "skew"
  )
Accepted aliases are non-negative integers and strings.
The integers are converted to functions that round the statistics to the number of decimal places to match the integer.
The formatting strings come in the form "xx", "xx.x", "xx.x%", etc.
The number of xs that appear after the decimal place indicates the number of
decimal places the statistics will be rounded to.
The number of xs that appear before the decimal place indicates the leading
spaces that are added to the result.
If the string ends in "%", results are scaled by 100 before rounding.
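To make the string aliases concrete, here is a minimal base R sketch of how a string such as "xx.x%" could be turned into a formatting function. The helper `fmt_from_alias()` is hypothetical and is not the package's implementation: it rounds with base R's round(), and it assumes that leading spaces are produced by padding the number to the width of the alias.

```r
# Hypothetical helper, for illustration only (not part of cards).
fmt_from_alias <- function(alias) {
  pct <- endsWith(alias, "%")                  # trailing "%" => scale by 100
  core <- sub("%$", "", alias)                 # alias without the "%"
  parts <- strsplit(core, ".", fixed = TRUE)[[1]]
  digits <- if (length(parts) == 2) nchar(parts[2]) else 0L
  width <- nchar(core)                         # assumption: pad to the alias width

  function(x) {
    if (pct) x <- x * 100
    out <- formatC(round(x, digits), format = "f", digits = digits, width = width)
    if (pct) paste0(out, "%") else out
  }
}

fmt_from_alias("xx.x")(3.14159)   # one decimal place, padded to four characters
fmt_from_alias("xx.x%")(0.2567)   # scaled by 100, then rounded to one decimal
```

In practice, the exported alias_as_fmt_fun() should be used; the sketch only illustrates how the pieces of the alias map to rounding, padding, and scaling.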
alias_as_fmt_fun(x, variable, stat_name)
x |
( |
variable |
( |
stat_name |
( |
a function
alias_as_fmt_fun(1)
alias_as_fmt_fun("xx.x")
Apply the formatting functions to each of the raw statistics.
Function aliases are converted to functions using alias_as_fmt_fun().
apply_fmt_fun(x, replace = FALSE)
x |
( |
replace |
(scalar |
an ARD data frame of class 'card'
ard_summary(ADSL, variables = "AGE") |>
  apply_fmt_fun()
Add variable attributes to an ARD data frame.
A label statistic is returned for every column. When a column has no label
attribute and no label has been supplied for it via the label= argument,
the column name is placed in the label statistic.
The class attribute will also be returned for all columns.
Any other attribute returned by attributes() will also be added, e.g. factor levels.
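As a rough illustration of the inputs this function draws on, the base R sketch below shows what attributes() returns for a labelled factor and how a column-name fallback for a missing label might look. The objects `x` and `y` and the helper `label_or_name()` are hypothetical and only mirror the behavior described above; they are not the package's code.

```r
# A toy column: a factor carrying a "label" attribute (assumed example data).
x <- factor(c("low", "high", "low"), levels = c("low", "high"))
attr(x, "label") <- "Dose Level"

# attributes() returns every attribute of the column as a named list.
names(attributes(x))
attributes(x)$levels

# Fallback described above: use the column name when no label is set.
y <- c(1, 2, 3)
label_or_name <- function(col, name) {
  lbl <- attr(col, "label")
  if (is.null(lbl)) name else lbl
}

label_or_name(x, "dose")    # the stored label
label_or_name(y, "visit")   # no label attribute, so the name is used
```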
ard_attributes(data, ...)

## S3 method for class 'data.frame'
ard_attributes(data, variables = everything(), label = NULL, ...)

## Default S3 method:
ard_attributes(data, ...)
data |
( |
... |
These dots are for future extensions and must be empty. |
variables |
( |
label |
(named |
an ARD data frame of class 'card'
df <- dplyr::tibble(var1 = letters, var2 = LETTERS)
attr(df$var1, "label") <- "Lowercase Letters"

ard_attributes(df, variables = everything())
Place default and passed argument values to a function into an ARD structure.
ard_formals(fun, arg_names, passed_args = list(), envir = parent.frame())
fun |
( |
arg_names |
( |
passed_args |
(named |
envir |
( |
a partial ARD data frame of class 'card'
# Example 1 ----------------------------------
# add the `mcnemar.test(correct)` argument to an ARD structure
ard_formals(fun = mcnemar.test, arg_names = "correct")

# Example 2 ----------------------------------
# S3 Methods need special handling to access the underlying method
ard_formals(
  fun = asNamespace("stats")[["t.test.default"]],
  arg_names = c("mu", "paired", "var.equal", "conf.level"),
  passed_args = list(conf.level = 0.90)
)
Functions ard_hierarchical() and ard_hierarchical_count() are primarily helper
functions for ard_stack_hierarchical() and ard_stack_hierarchical_count(),
so users will rarely need to call
ard_hierarchical()/ard_hierarchical_count() directly.
Performs hierarchical or nested tabulations, e.g. tabulates AE terms nested within AE system organ class.
ard_hierarchical() includes summaries for the last variable listed
in the variables argument, nested within the other variables included.
ard_hierarchical_count() includes summaries for all variables
listed in the variables argument, with each summary nested within the preceding
variables, e.g. variables=c(AESOC, AEDECOD) summarizes AEDECOD nested
in AESOC, and also summarizes the counts of AESOC.
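The idea of a nested tabulation can be sketched with base R alone. The example below uses assumed toy data (`ae`) and plain table()/aggregate() calls; it illustrates the two levels of counts described above and is not how the package computes them.

```r
# Toy adverse-event records (assumed data, for illustration only).
ae <- data.frame(
  AESOC   = c("CARDIAC", "CARDIAC", "SKIN", "SKIN", "SKIN"),
  AEDECOD = c("PALPITATIONS", "TACHYCARDIA", "RASH", "RASH", "PRURITUS")
)

# Counts of the outer variable (AESOC) on its own.
soc_counts <- as.data.frame(table(AESOC = ae$AESOC), responseName = "n")

# Counts of the inner variable (AEDECOD) nested within the outer variable.
ae$n <- 1L
term_counts <- aggregate(n ~ AESOC + AEDECOD, data = ae, FUN = sum)

soc_counts
term_counts
```

With variables=c(AESOC, AEDECOD), the text above indicates that ard_hierarchical_count() reports both levels, while ard_hierarchical() reports only the innermost one.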
ard_hierarchical(data, ...)

ard_hierarchical_count(data, ...)

## S3 method for class 'data.frame'
ard_hierarchical(
  data,
  variables,
  by = dplyr::group_vars(data),
  statistic = everything() ~ c("n", "N", "p"),
  denominator = NULL,
  fmt_fun = NULL,
  stat_label = everything() ~ default_stat_labels(),
  id = NULL,
  fmt_fn = deprecated(),
  ...
)

## S3 method for class 'data.frame'
ard_hierarchical_count(
  data,
  variables,
  by = dplyr::group_vars(data),
  fmt_fun = NULL,
  stat_label = everything() ~ default_stat_labels(),
  fmt_fn = deprecated(),
  ...
)
data |
( |
... |
Arguments passed to methods. |
variables |
( |
by |
( |
statistic |
( |
denominator |
(
|
fmt_fun |
( |
stat_label |
( |
id |
( |
fmt_fn |
an ARD data frame of class 'card'
ard_hierarchical(
  data = ADAE |>
    dplyr::slice_tail(n = 1L, by = c(USUBJID, TRTA, AESOC, AEDECOD)),
  variables = c(AESOC, AEDECOD),
  by = TRTA,
  id = USUBJID,
  denominator = ADSL
)

ard_hierarchical_count(
  data = ADAE,
  variables = c(AESOC, AEDECOD),
  by = TRTA
)
Function ingests pre-calculated statistics and returns the identical results, but in an ARD format.
ard_identity(x, variable, context = "identity")
x |
(named |
variable |
( |
context |
( |
an ARD
t.test(formula = AGE ~ 1, data = ADSL)[c("statistic", "parameter", "p.value")] |>
  ard_identity(variable = "AGE", context = "onesample_t_test")
Compute Analysis Results Data (ARD) for statistics related to data missingness.
ard_missing(data, ...)

## S3 method for class 'data.frame'
ard_missing(
  data,
  variables,
  by = dplyr::group_vars(data),
  statistic = everything() ~ c("N_obs", "N_miss", "N_nonmiss", "p_miss", "p_nonmiss"),
  fmt_fun = NULL,
  stat_label = everything() ~ default_stat_labels(),
  fmt_fn = deprecated(),
  ...
)
data |
( |
... |
Arguments passed to methods. |
variables |
( |
by |
( |
statistic |
( The value assigned to each variable must also be a named list, where the names
are used to reference a function and the element is the function object.
Typically, this function will return a scalar statistic, but a function that
returns a named list of results is also acceptable, e.g.
|
fmt_fun |
( |
stat_label |
( |
fmt_fn |
an ARD data frame of class 'card'
ard_missing(ADSL, by = "ARM", variables = "AGE")

ADSL |>
  dplyr::group_by(ARM) |>
  ard_missing(
    variables = "AGE",
    statistic = ~"N_miss"
  )
Function is similar to ard_summary(), but allows for more complex, multivariate
summaries. While ard_summary(statistic) only allows for a univariable
function, ard_mvsummary(statistic) can handle more complex data summaries.
ard_mvsummary(data, ...)

## S3 method for class 'data.frame'
ard_mvsummary(
  data,
  variables,
  by = dplyr::group_vars(data),
  strata = NULL,
  statistic,
  fmt_fun = NULL,
  stat_label = everything() ~ default_stat_labels(),
  fmt_fn = deprecated(),
  ...
)
data |
( |
... |
Arguments passed to methods. |
variables |
( |
by, strata
|
(
Arguments may be used in conjunction with one another. |
statistic |
(
It is unlikely any one function will need all of the above elements,
and it's recommended the function passed accepts |
fmt_fun |
( |
stat_label |
( |
fmt_fn |
an ARD data frame of class 'card'
# example how to mimic behavior of `ard_summary()`
ard_mvsummary(
  ADSL,
  by = "ARM",
  variables = "AGE",
  statistic = list(AGE = list(mean = \(x, ...) mean(x)))
)

# return the grand mean and the mean within the `by` group
grand_mean <- function(data, full_data, variable, ...) {
  list(
    mean = mean(data[[variable]], na.rm = TRUE),
    grand_mean = mean(full_data[[variable]], na.rm = TRUE)
  )
}

ADSL |>
  dplyr::group_by(ARM) |>
  ard_mvsummary(
    variables = "AGE",
    statistic = list(AGE = list(means = grand_mean))
  )
Utility to perform pairwise comparisons.
ard_pairwise(data, variable, .f, include = NULL)
data |
( |
variable |
( |
.f |
( |
include |
( |
list of ARDs
ard_pairwise(
  ADSL,
  variable = ARM,
  .f = \(df) {
    ard_mvsummary(
      df,
      variables = AGE,
      statistic = ~ list(ttest = \(x, data, ...) t.test(x ~ data$ARM)[c("statistic", "p.value")])
    )
  },
  include = "Placebo" # only include comparisons to the "Placebo" group
)
Stack multiple ARD calls sharing common input data and by variables.
Optionally incorporate additional information on represented variables, e.g.
overall calculations, rates of missingness, attributes, or transform results
with shuffle_ard().
If the ard_stack(.by) argument is specified, a univariate tabulation of the
by variable will also be returned.
ard_stack(
  data,
  ...,
  .by = NULL,
  .overall = FALSE,
  .missing = FALSE,
  .attributes = FALSE,
  .total_n = FALSE,
  .shuffle = FALSE,
  .by_stats = TRUE
)
data |
( |
... |
( |
.by |
( |
.overall |
( |
.missing |
( |
.attributes |
( |
.total_n |
( |
.shuffle |
|
.by_stats |
( |
an ARD data frame of class 'card'
ard_stack(
  data = ADSL,
  ard_tabulate(variables = "AGEGR1"),
  ard_summary(variables = "AGE"),
  .by = "ARM",
  .overall = TRUE,
  .attributes = TRUE
)

ard_stack(
  data = ADSL,
  ard_tabulate(variables = "AGEGR1"),
  ard_summary(variables = "AGE"),
  .by = "ARM"
)
Use these functions to calculate multiple summaries of nested or hierarchical data in a single call.
ard_stack_hierarchical(): Calculates rates of events (e.g. adverse events)
utilizing the denominator and id arguments to identify the rows in data
to include in each rate calculation.
ard_stack_hierarchical_count(): Calculates counts of events utilizing
all rows for each tabulation.
ard_stack_hierarchical(
  data,
  variables,
  by = dplyr::group_vars(data),
  id,
  denominator,
  include = everything(),
  statistic = everything() ~ c("n", "N", "p"),
  overall = FALSE,
  over_variables = FALSE,
  attributes = FALSE,
  total_n = FALSE,
  shuffle = FALSE,
  by_stats = TRUE
)

ard_stack_hierarchical_count(
  data,
  variables,
  by = dplyr::group_vars(data),
  denominator = NULL,
  include = everything(),
  overall = FALSE,
  over_variables = FALSE,
  attributes = FALSE,
  total_n = FALSE,
  shuffle = FALSE,
  by_stats = TRUE
)
data |
( |
variables |
( |
by |
( |
id |
( |
denominator |
(
|
include |
( |
statistic |
( |
overall |
(scalar |
over_variables |
(scalar |
attributes |
(scalar |
total_n |
(scalar |
shuffle |
|
by_stats |
( |
an ARD data frame of class 'card'
To calculate event rates, the ard_stack_hierarchical() function identifies
rows to include in the calculation.
First, the primary data frame is sorted by the columns identified in
the id, by, and variables arguments.
As the function cycles over the variables specified in the variables argument,
the data frame is grouped by id, intersect(by, names(denominator)), and variables
utilizing the last row within each of the groups.
For example, if the call is
ard_stack_hierarchical(data = ADAE, variables = c(AESOC, AEDECOD), id = USUBJID),
then we'd first subset ADAE to be one row within the grouping c(USUBJID, AESOC, AEDECOD)
to calculate the event rates in 'AEDECOD'. We'd then repeat and
subset ADAE to be one row within the grouping c(USUBJID, AESOC)
to calculate the event rates in 'AESOC'.
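The row-selection step described above can be sketched in base R. This is an illustration on assumed toy data, not the package's internal code; the data frame `ae` and the helper `keep_last()` are hypothetical names.

```r
# Toy adverse-event records (assumed data): subject 01 has a repeated event.
ae <- data.frame(
  USUBJID = c("01", "01", "01", "02", "02"),
  AESOC   = c("CARDIAC", "CARDIAC", "SKIN", "SKIN", "SKIN"),
  AEDECOD = c("PALPITATIONS", "PALPITATIONS", "RASH", "RASH", "PRURITUS"),
  stringsAsFactors = FALSE
)

# Step 1: sort by the id and variables columns (no `by` columns in this toy case).
sort_cols <- c("USUBJID", "AESOC", "AEDECOD")
ae <- ae[do.call(order, unname(as.list(ae[sort_cols]))), , drop = FALSE]

# Step 2: keep only the last row within each group defined by `cols`.
keep_last <- function(df, cols) {
  df[!duplicated(df[cols], fromLast = TRUE), , drop = FALSE]
}

by_term <- keep_last(ae, c("USUBJID", "AESOC", "AEDECOD"))  # used for AEDECOD rates
by_soc  <- keep_last(ae, c("USUBJID", "AESOC"))             # used for AESOC rates

nrow(by_term)
nrow(by_soc)
```

Each subject then contributes at most one row per term (or per system organ class), which is what makes the resulting counts subject-level rates rather than event counts.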
When we set overall=TRUE, we wish to re-run our calculations removing the
stratifying columns. For example, if we ran the code below, the results would
include results with the code chunk being re-run with by=NULL.
ard_stack_hierarchical( data = ADAE, variables = c(AESOC, AEDECOD), by = TRTA, denominator = ADSL, id = USUBJID, overall = TRUE )
But there is another case to be aware of: when the by argument includes
columns that are not present in the denominator, for example when tabulating
results by AE grade or severity in addition to treatment assignment.
In the example below, we're tabulating results by treatment assignment and
AE severity. By specifying overall=TRUE, we will re-run the calculations to get
results with by = AESEV and again with by = NULL.
ard_stack_hierarchical( data = ADAE, variables = c(AESOC, AEDECOD), by = c(TRTA, AESEV), denominator = ADSL, id = USUBJID, overall = TRUE )
ard_stack_hierarchical(
  ADAE,
  variables = c(AESOC, AEDECOD),
  by = TRTA,
  denominator = ADSL,
  id = USUBJID
)

ard_stack_hierarchical_count(
  ADAE,
  variables = c(AESOC, AEDECOD),
  by = TRTA,
  denominator = ADSL
)
General function for calculating ARD results within subgroups.
While the examples below show use with other functions from the cards package, this function would primarily be used with the statistical functions in the cardx package.
ard_strata(.data, .by = NULL, .strata = NULL, .f, ...)
.data |
( |
.by, .strata
|
(
These arguments should not include any columns that appear in the |
.f |
( |
... |
Additional arguments passed on to the |
an ARD data frame of class 'card'
# Example 1 ----------------------------------
ard_strata(
  ADSL,
  .by = ARM,
  .f = ~ ard_summary(.x, variables = AGE)
)

# Example 2 ----------------------------------
df <- data.frame(
  USUBJID = 1:12,
  PARAMCD = rep(c("PARAM1", "PARAM2"), each = 6),
  AVALC = c(
    "Yes", "No", "Yes", # PARAM1
    "Yes", "Yes", "No", # PARAM1
    "Low", "Medium", "High", # PARAM2
    "Low", "Low", "Medium" # PARAM2
  )
)

ard_strata(
  df,
  .strata = PARAMCD,
  .f = \(.x) {
    lvls <- switch(.x[["PARAMCD"]][1],
      "PARAM1" = c("Yes", "No"),
      "PARAM2" = c("Zero", "Low", "Medium", "High")
    )

    .x |>
      dplyr::mutate(AVALC = factor(AVALC, levels = lvls)) |>
      ard_tabulate(variables = AVALC)
  }
)
Compute Analysis Results Data (ARD) for simple continuous summary statistics.
ard_summary(data, ...)

## S3 method for class 'data.frame'
ard_summary(
  data,
  variables,
  by = dplyr::group_vars(data),
  strata = NULL,
  statistic = everything() ~ continuous_summary_fns(),
  fmt_fun = NULL,
  stat_label = everything() ~ default_stat_labels(),
  fmt_fn = deprecated(),
  ...
)
data |
( |
... |
Arguments passed to methods. |
variables |
( |
by, strata
|
(
Arguments may be used in conjunction with one another. |
statistic |
( The value assigned to each variable must also be a named list, where the names
are used to reference a function and the element is the function object.
Typically, this function will return a scalar statistic, but a function that
returns a named list of results is also acceptable, e.g.
|
fmt_fun |
( |
stat_label |
( |
fmt_fn |
an ARD data frame of class 'card'
ard_summary(ADSL, by = "ARM", variables = "AGE")

# if a single function returns a named list, the named
# results will be placed in the resulting ARD
ADSL |>
  dplyr::group_by(ARM) |>
  ard_summary(
    variables = "AGE",
    statistic =
      ~ list(conf.int = \(x) t.test(x)[["conf.int"]] |>
        as.list() |>
        setNames(c("conf.low", "conf.high")))
  )
Compute Analysis Results Data (ARD) for categorical summary statistics.
ard_tabulate(data, ...)

## S3 method for class 'data.frame'
ard_tabulate(
  data,
  variables,
  by = dplyr::group_vars(data),
  strata = NULL,
  statistic = everything() ~ c("n", "p", "N"),
  denominator = "column",
  fmt_fun = NULL,
  stat_label = everything() ~ default_stat_labels(),
  fmt_fn = deprecated(),
  ...
)
data |
( |
... |
Arguments passed to methods. |
variables |
( |
by, strata
|
(
Arguments may be used in conjunction with one another. |
statistic |
( |
denominator |
( |
fmt_fun |
( |
stat_label |
( |
fmt_fn |
an ARD data frame of class 'card'
By default, the ard_tabulate() function returns the statistics "n", "N", and
"p", where little "n" are the counts for the variable levels, and big "N" is
the number of non-missing observations. The calculation for the
proportion is p = n/N.
However, it is sometimes necessary to provide a different "N" to use
as the denominator in this calculation. For example, in a calculation
of the rates of various observed adverse events, you may need to update the
denominator to the number of enrolled subjects.
In such cases, use the denominator argument to specify a new definition
of "N", and subsequently "p".
The argument expects one of the following inputs:
- a string: one of "column", "row", or "cell".
  - "column", the default, returns percentages where the sum is equal to
    one within the variable after the data frame has been subset with by/strata.
  - "row" gives 'row' percentages where by/strata columns are the 'top'
    of a cross table, and the variables are the rows. This is well-defined
    for a single by or strata variable, and care must be taken when there
    are more to ensure the results are as you expect.
  - "cell" gives percentages where the denominator is the number of non-missing
    rows in the source data frame.
- a data frame. Any columns in the data frame that overlap with the by/strata
  columns will be used to calculate the new "N".
- an integer. This single integer will be used as the new "N".
- a structured data frame. The data frame will include columns from by/strata.
  The last column must be named "...ard_N...". The integers in this column will
  be used as the updated "N" in the calculations.
When the p statistic is returned, it is a proportion bounded by [0, 1].
The default formatting function scales the proportion by 100, so the formatted
value is a percentage, which matches the default statistic label of '%'.
To get the formatted values, pass the ARD to apply_fmt_fun().
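The effect of swapping in a different "N" can be illustrated with plain base R arithmetic. The numbers below are assumed toy values (three observed events, ten enrolled subjects), and the code does not call the package; it only shows how p = n/N changes with the denominator and how a proportion becomes a displayed percentage.

```r
# Toy data (assumed): three observed events among ten enrolled subjects.
events   <- c("HEADACHE", "NAUSEA", "HEADACHE")
enrolled <- 10L

# Little "n": counts per level, as a named integer vector.
n <- lengths(split(events, events))

# Default: big "N" is the number of non-missing observations.
p_default <- n / sum(!is.na(events))

# Custom denominator: big "N" is the number of enrolled subjects.
p_enrolled <- n / enrolled

p_default[["HEADACHE"]]
p_enrolled[["HEADACHE"]]

# The stored statistic stays a proportion; scaling by 100 happens at formatting.
sprintf("%.1f", 100 * p_enrolled[["HEADACHE"]])
```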
ard_tabulate(ADSL, by = "ARM", variables = "AGEGR1")

ADSL |>
  dplyr::group_by(ARM) |>
  ard_tabulate(
    variables = "AGEGR1",
    statistic = everything() ~ "n"
  )
Tabulate the number of rows in a data frame.
ard_tabulate_rows(
  data,
  colname = "..row_count..",
  by = dplyr::group_vars(data),
  strata = NULL,
  fmt_fun = NULL
)
data |
( |
colname |
( |
by, strata
|
(
Arguments may be used in conjunction with one another. |
fmt_fun |
( |
an ARD data frame of class 'card'
ard_tabulate_rows(ADSL, by = TRTA)
Tabulate an Analysis Results Data (ARD) for dichotomous or a specified value.
ard_tabulate_value(data, ...)

## S3 method for class 'data.frame'
ard_tabulate_value(
  data,
  variables,
  by = dplyr::group_vars(data),
  strata = NULL,
  value = maximum_variable_value(data[variables]),
  statistic = everything() ~ c("n", "N", "p"),
  denominator = NULL,
  fmt_fun = NULL,
  stat_label = everything() ~ default_stat_labels(),
  fmt_fn = deprecated(),
  ...
)
data |
( |
... |
Arguments passed to methods. |
variables |
( |
by, strata
|
(
Arguments may be used in conjunction with one another. |
value |
(named |
statistic |
( |
denominator |
( |
fmt_fun |
( |
stat_label |
( |
fmt_fn |
an ARD data frame of class 'card'
By default, the ard_tabulate() function returns the statistics "n", "N", and
"p", where little "n" are the counts for the variable levels, and big "N" is
the number of non-missing observations. The calculation for the
proportion is p = n/N.
However, it is sometimes necessary to provide a different "N" to use
as the denominator in this calculation. For example, in a calculation
of the rates of various observed adverse events, you may need to update the
denominator to the number of enrolled subjects.
In such cases, use the denominator argument to specify a new definition
of "N", and subsequently "p".
The argument expects one of the following inputs:
- a string: one of "column", "row", or "cell".
  - "column", the default, returns percentages where the sum is equal to
    one within the variable after the data frame has been subset with by/strata.
  - "row" gives 'row' percentages where by/strata columns are the 'top'
    of a cross table, and the variables are the rows. This is well-defined
    for a single by or strata variable, and care must be taken when there
    are more to ensure the results are as you expect.
  - "cell" gives percentages where the denominator is the number of non-missing
    rows in the source data frame.
- a data frame. Any columns in the data frame that overlap with the by/strata
  columns will be used to calculate the new "N".
- an integer. This single integer will be used as the new "N".
- a structured data frame. The data frame will include columns from by/strata.
  The last column must be named "...ard_N...". The integers in this column will
  be used as the updated "N" in the calculations.
When the p statistic is returned, it is a proportion bounded by [0, 1].
The default formatting function scales the proportion by 100, so the formatted
value is a percentage, which matches the default statistic label of '%'.
To get the formatted values, pass the ARD to apply_fmt_fun().
ard_tabulate_value(mtcars, by = vs, variables = c(cyl, am), value = list(cyl = 4))

mtcars |>
  dplyr::group_by(vs) |>
  ard_tabulate_value(
    variables = c(cyl, am),
    value = list(cyl = 4),
    statistic = ~"p"
  )
Returns the total N for the data frame.
The placeholder variable name returned in the object is "..ard_total_n..".
ard_total_n(data, ...)

## S3 method for class 'data.frame'
ard_total_n(data, ...)
data |
( |
... |
Arguments passed to methods. |
an ARD data frame of class 'card'
ard_total_n(ADSL)
Convert data frames to ARDs of class 'card'.
as_card(x, check = TRUE)
x |
( |
check |
(scalar |
an ARD data frame of class 'card'
data.frame(
  stat_name = c("N", "mean"),
  stat_label = c("N", "Mean"),
  stat = c(10, 0.5)
) |>
  as_card(check = FALSE)

dplyr::tibble(
  variable = "AGE",
  stat_name = c("N", "mean"),
  stat_label = c("N", "Mean"),
  stat = list(10, 0.5),
  fmt_fun = replicate(2, list()),
  warning = replicate(2, list()),
  error = replicate(2, list())
) |>
  as_card()
Add attributes to a function that specify the expected results.
It is used when ard_summary() or ard_mvsummary() errors and constructs
an ARD with the correct structure when the results cannot be calculated.
as_cards_fn(f, stat_names)

is_cards_fn(f)

get_cards_fn_stat_names(f)
f |
( |
stat_names |
( |
an ARD data frame of class 'card'
# When there is no error, everything works as if we hadn't used `as_cards_fn()`
ttest_works <-
  as_cards_fn(
    \(x) t.test(x)[c("statistic", "p.value")],
    stat_names = c("statistic", "p.value")
  )
ard_summary(
  mtcars,
  variables = mpg,
  statistic = ~ list(ttest = ttest_works)
)

# When there is an error and we use `as_cards_fn()`,
# we will see the same structure as when there is no error
ttest_error <-
  as_cards_fn(
    \(x) {
      t.test(x)[c("statistic", "p.value")]
      stop("Intentional Error")
    },
    stat_names = c("statistic", "p.value")
  )
ard_summary(
  mtcars,
  variables = mpg,
  statistic = ~ list(ttest = ttest_error)
)

# if we don't use `as_cards_fn()` and there is an error,
# the returned result is only one row
ard_summary(
  mtcars,
  variables = mpg,
  statistic = ~ list(ttest = \(x) {
    t.test(x)[c("statistic", "p.value")]
    stop("Intentional Error")
  })
)
as_nested_list(x)
x |
( |
a nested list
ard_summary(mtcars, by = "cyl", variables = c("mpg", "hp")) |>
  as_nested_list()
Wrapper for dplyr::bind_rows() with additional checks
for duplicated statistics.
bind_ard(
  ...,
  .distinct = TRUE,
  .update = FALSE,
  .order = FALSE,
  .quiet = FALSE
)
... |
( |
.distinct |
( |
.update |
( |
.order |
( |
.quiet |
( |
an ARD data frame of class 'card'
ard <- ard_tabulate(ADSL, by = "ARM", variables = "AGEGR1")

bind_ard(ard, ard, .update = TRUE)
See below for options available in the {cards} package
There are two rounding types in the {cards} package, implemented
in the label_round(), alias_as_fmt_fun(), and apply_fmt_fun() functions.
'round-half-up' (default): rounding method where values exactly halfway
between two numbers are rounded to the number larger in magnitude.
Rounding is implemented via round5().
'round-to-even': base R's default IEC 60559 rounding standard.
See round() for details.
To change the default rounding to use IEC 60559, this option must be set both
when the ARDs are created and when apply_fmt_fun() is run. This ensures that
any default formatting functions created with label_round() utilize the
specified rounding method and the method is used when aliases are converted
into functions (which occurs in apply_fmt_fun() when it calls alias_as_fmt_fun()).
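The difference between the two rounding types can be illustrated with a small base R sketch. The helper `round_half_up()` below is hypothetical and not part of {cards}; it only approximates the round-half-up behaviour described above, and the small epsilon term is an assumption made to guard against floating-point representation error.

```r
# Hypothetical helper (not part of {cards}): round halves away from zero
round_half_up <- function(x, digits = 0) {
  scaled <- abs(x) * 10^digits
  sign(x) * trunc(scaled + 0.5 + sqrt(.Machine$double.eps)) / 10^digits
}

x <- c(0.5, 1.5, 2.5, -0.5)
half_up <- round_half_up(x) # halves move to the number larger in magnitude
to_even <- round(x)         # base R (IEC 60559): halves move to the even digit

print(data.frame(x, half_up, to_even))
```

The two columns differ only for values that sit exactly halfway between two candidates, which is why the option matters mostly for tables that report values such as 0.5 or 2.5.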
Function tests the structure and returns notes when object does not conform to expected structure.
check_ard_structure(
  x,
  column_order = TRUE,
  method = TRUE,
  error_on_fail = FALSE
)
x |
( |
column_order |
(scalar |
method |
(scalar |
error_on_fail |
(scalar |
an ARD data frame of class 'card' (invisible)
ard_summary(ADSL, variables = "AGE") |>
  dplyr::select(-warning, -error) |>
  check_ard_structure()
compare_ard() compares columns of two ARDs row-by-row using a shared set
of key columns. Rows where the column values differ are returned.
The is_ard_equal() function accepts a compare_ard()
object, and returns TRUE or FALSE depending on whether the comparison
reported any differences. check_ard_equal() throws an error if the ARDs are not equal.
compare_ard(
  x,
  y,
  keys = c(all_ard_groups(), all_ard_variables(), any_of(c("variable", "variable_level", "stat_name"))),
  columns = any_of(c("stat_label", "stat", "stat_fmt")),
  tolerance = sqrt(.Machine$double.eps),
  check.attributes = TRUE
)

is_ard_equal(x)

check_ard_equal(x)
x |
( |
y |
( |
keys |
( |
columns |
( |
tolerance |
( |
check.attributes |
( |
a named list of class "ard_comparison" containing:
rows_in_x_not_y: data frame of rows present in x but not in y
(based on key columns)
rows_in_y_not_x: data frame of rows present in y but not in x
(based on key columns)
compare: a named list where each element is a data frame containing
the key columns, the compared column values from both ARDs, and a
difference column with the all.equal() description for rows where
values differ
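As a rough illustration of what the difference column holds, the base R sketch below calls all.equal() directly with the same default tolerance shown in the usage. The variable names are illustrative only, and this is not how compare_ard() is implemented internally.

```r
tol <- sqrt(.Machine$double.eps)

# A difference below the tolerance is treated as equal
same <- all.equal(1, 1 + 1e-10, tolerance = tol)

# A larger difference yields a character description rather than FALSE
differs <- all.equal(1, 1.1, tolerance = tol)

print(isTRUE(same))
print(differs)
```

Because all.equal() returns either TRUE or a character description, wrapping the result in isTRUE() is the usual way to turn it into a logical flag.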
base <- ard_summary(ADSL, by = ARM, variables = AGE)

compare <- ard_summary(
  dplyr::mutate(ADSL, AGE = AGE + 1),
  by = ARM,
  variables = AGE
)

compare_ard(base, compare)$compare$stat
Returns a named list of statistics labels
default_stat_labels()
named list
# stat labels
default_stat_labels()
Some functions have been deprecated and are no longer being actively
supported.
Renamed functions
ard_categorical() to ard_tabulate()
ard_continuous() to ard_summary()
ard_complex() to ard_mvsummary()
apply_fmt_fn() to apply_fmt_fun()
alias_as_fmt_fn() to alias_as_fmt_fun()
update_ard_fmt_fn() to update_ard_fmt_fun()
Deprecated functions
shuffle_ard()
This function ingests an ARD object and shuffles the information to prepare for analysis. Helpful for streamlining across multiple ARDs. Combines each group/group_level into 1 column, back fills missing grouping values from the variable levels where possible, and optionally trims statistics-level metadata.
ard_continuous(data, ...)

ard_categorical(data, ...)

ard_complex(data, ...)

ard_dichotomous(data, ...)

## S3 method for class 'data.frame'
ard_continuous(data, ...)

## S3 method for class 'data.frame'
ard_categorical(data, ...)

## S3 method for class 'data.frame'
ard_complex(data, ...)

## S3 method for class 'data.frame'
ard_dichotomous(data, ...)

apply_fmt_fn(...)

alias_as_fmt_fn(...)

update_ard_fmt_fn(...)

shuffle_ard(x, trim = TRUE)
data, ...
|
|
x |
( |
trim |
( |
a tibble
bind_ard(
  ard_tabulate(ADSL, by = "ARM", variables = "AGEGR1"),
  ard_tabulate(ADSL, variables = "ARM")
) |>
  shuffle_ard()
eval_capture_conditions()
Evaluates an expression while also capturing error and warning conditions.
Function always returns a named list list(result=, warning=, error=).
If there are no errors or warnings, those elements will be NULL.
If there is an error, the result element will be NULL.
Messages are neither saved nor printed to the console.
Evaluation is done via rlang::eval_tidy(). If errors and warnings are produced
using the {cli} package, the messages are processed with cli::ansi_strip()
to remove styling from the message.
captured_condition_as_message()/captured_condition_as_error()
These functions take the result from eval_capture_conditions() and return
errors or warnings as either messages (via cli::cli_inform()) or
errors (via cli::cli_abort()). These functions handle cases where the
condition messages may include curly brackets, which would typically cause
issues when processed with the cli::cli_*() functions.
Functions return the "result" from eval_capture_conditions().
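The capture behaviour described for eval_capture_conditions() can be sketched in base R. The function `capture_conditions()` below is a hypothetical stand-in written for illustration; it uses tryCatch() and withCallingHandlers() rather than rlang::eval_tidy(), and it does not strip {cli} styling.

```r
# Hypothetical stand-in (not the {cards} implementation)
capture_conditions <- function(expr) {
  warnings <- character(0)
  result <- NULL
  error <- NULL

  tryCatch(
    withCallingHandlers(
      result <- expr, # the lazily evaluated argument is forced here
      warning = function(w) {
        # save every warning, then continue evaluating
        warnings <<- c(warnings, conditionMessage(w))
        invokeRestart("muffleWarning")
      },
      # messages are neither saved nor printed
      message = function(m) invokeRestart("muffleMessage")
    ),
    # on error, the result stays NULL and the message is saved
    error = function(e) error <<- conditionMessage(e)
  )

  list(
    result = result,
    warning = if (length(warnings) > 0L) warnings else NULL,
    error = error
  )
}

res_warn <- capture_conditions({
  warning("Warning 1")
  warning("Warning 2")
  letters[1:2]
})
res_err <- capture_conditions(stop("Example Error!"))
str(res_warn)
str(res_err)
```

The calling handler muffles each warning after recording it, so evaluation continues and all warnings are collected; an error ends evaluation, leaving the result element NULL.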
eval_capture_conditions(expr, data = NULL, env = caller_env())

captured_condition_as_message(
  x,
  message = c("The following {type} occured:", x = "{condition}"),
  type = c("error", "warning"),
  envir = rlang::current_env()
)

captured_condition_as_error(
  x,
  message = c("The following {type} occured:", x = "{condition}"),
  type = c("error", "warning"),
  call = get_cli_abort_call(),
  envir = rlang::current_env()
)
expr |
An expression or quosure to evaluate. |
data |
A data frame, or named list or vector. Alternatively, a
data mask created with |
env |
The environment in which to evaluate |
x |
( |
message |
( |
type |
( |
envir |
Environment to evaluate the glue expressions in. |
call |
( |
a named list
# function executes without error or warning
eval_capture_conditions(letters[1:2])

# an error is thrown
res <- eval_capture_conditions(stop("Example Error!"))
res
captured_condition_as_message(res)

# if more than one warning is returned, all are saved
eval_capture_conditions({
  warning("Warning 1")
  warning("Warning 2")
  letters[1:2]
})

# messages are not printed to the console
eval_capture_conditions({
  message("A message!")
  letters[1:2]
})
This function is used to filter stacked hierarchical ARDs.
For the purposes of this function, we define a "variable group" as a combination of ARD rows
grouped by the combination of all their variable levels, but excluding any by variables.
filter_ard_hierarchical(
  x,
  filter,
  var = NULL,
  keep_empty = FALSE,
  quiet = FALSE
)
x |
( |
filter |
( |
var |
( |
keep_empty |
(scalar |
quiet |
( |
The filter argument can be used to filter out variable groups of a hierarchical
ARD which do not meet the requirements provided as an expression.
Variable groups can be filtered on the values of any of the possible
statistics (n, p, and N) provided they are included at least once
in the ARD, as well as the values of any by variables.
Additionally, filters can be applied on individual levels of the by variable via the
n_XX, N_XX, and p_XX statistics, where each XX represents the index of the by
variable level to select the statistic from. For example, filter = n_1 > 5 will check
whether n values for the first level of by are greater than 5 in each row group.
Overall statistics for each row group can be used in filters via the n_overall, N_overall,
and p_overall statistics. If the ARD is created with parameter overall=TRUE, then these
overall statistics will be extracted directly from the ARD, otherwise the statistics will be
derived where possible. If overall=FALSE, then n_overall can only be derived if the n
statistic is present in the ARD for the filter variable, N_overall if the N statistic is
present for the filter variable, and p_overall if both the n and N statistics are
present for the filter variable.
By default, filters will be applied at the level of the innermost hierarchy variable, i.e.
the last variable supplied to variables. If filters should instead be applied at the level
of one of the outer hierarchy variables, the var parameter can be used to select a different
variable to filter on. When var is set to a different (outer) variable and a level of the
variable does not meet the filtering criteria then the section corresponding to that variable
level and all sub-sections within that section will be removed.
To illustrate how the function works, consider the typical example below where the AE summaries are provided by treatment group.
ADAE |>
dplyr::filter(AESOC == "GASTROINTESTINAL DISORDERS",
AEDECOD %in% c("VOMITING", "DIARRHOEA")) |>
ard_stack_hierarchical(
variables = c(AESOC, AEDECOD),
by = TRTA,
denominator = ADSL,
id = USUBJID
)
| SOC / AE | Placebo | Xanomeline High Dose | Xanomeline Low Dose |
| GASTROINTESTINAL DISORDERS | 11 (13%) | 10 (12%) | 8 (9.5%) |
| DIARRHOEA | 9 (10%) | 4 (4.8%) | 5 (6.0%) |
| VOMITING | 3 (3.5%) | 7 (8.3%) | 3 (3.6%) |
Filters are applied to the summary statistics of the innermost variable in the hierarchy by
default—AEDECOD in this case. If we wanted to filter based on SOC rates instead of AE
rates we could specify var = AESOC instead.
If any of the summary statistics meet the filter requirement for any of the treatment groups,
the entire row is retained.
For example, if filter = n >= 9 were passed, the criteria would be met for DIARRHOEA
as the Placebo group observed 9 AEs and as a result the summary statistics for the other
treatment groups would be retained as well.
Conversely, no treatment groups' summary statistics satisfy the filter requirement
for VOMITING so all rows associated with this AE would be removed.
In addition to filtering on individual statistic values, filters can be applied
across the treatment groups (i.e. across all by variable values) by using
aggregate functions such as sum() and mean(). For simplicity, it is suggested to use
the XX_overall statistics in place of sum(XX) in equivalent scenarios. For example,
n_overall is equivalent to sum(n).
A value of filter = sum(n) >= 18 (or filter = n_overall >= 18) retains AEs where the sum of
the number of AEs across the treatment groups is greater than or equal to 18.
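The "keep the whole row if any treatment group qualifies" rule can be sketched in base R using the counts from the table above. The helper `keep_groups()` and the data frame `ae` are hypothetical and only mimic the logic; they are not how filter_ard_hierarchical() is implemented.

```r
# AE counts taken from the illustrative table above
ae <- data.frame(
  AEDECOD = rep(c("DIARRHOEA", "VOMITING"), each = 3),
  TRTA = rep(c("Placebo", "Xanomeline High Dose", "Xanomeline Low Dose"), times = 2),
  n = c(9, 4, 5, 3, 7, 3)
)

# Hypothetical helper: keep every row of an AE if the predicate is TRUE
# for at least one treatment group within that AE
keep_groups <- function(data, predicate) {
  keep <- vapply(
    split(data, data$AEDECOD),
    function(d) any(predicate(d)),
    logical(1)
  )
  data[data$AEDECOD %in% names(keep)[keep], , drop = FALSE]
}

# analogous to filter = n >= 9: only DIARRHOEA has a group with 9 or more
kept_any <- keep_groups(ae, function(d) d$n >= 9)

# analogous to filter = sum(n) >= 18: evaluated across treatment groups
kept_sum <- keep_groups(ae, function(d) sum(d$n) >= 18)

print(kept_any)
print(kept_sum)
```

In both cases all three DIARRHOEA rows are retained, including the treatment groups that would not satisfy the condition on their own.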
If filter = n_overall >= 18 and var = AESOC then all rows corresponding to an SOC with an
overall rate less than 18 - including all AEs within that SOC - will be removed.
If ard_stack_hierarchical(overall=TRUE) was run, the overall column is not considered in
any filtering except for XX_overall statistics, if specified.
If ard_stack_hierarchical(over_variables=TRUE) was run, any overall statistics are kept regardless
of filtering.
Some examples of possible filters:
filter = n > 5: keep AEs where one of the treatment groups observed more than 5 AEs
filter = n == 2 & p < 0.05: keep AEs where one of the treatment groups observed exactly 2
AEs and one of the treatment groups observed a proportion less than 5%
filter = n_overall >= 4: keep AEs where there were 4 or more AEs observed across the treatment groups
filter = mean(n) > 4 | n > 3: keep AEs where the mean number of AEs across the
treatment groups is greater than 4 or one of the treatment groups observed more than 3 AEs
filter = n_2 > 2: keep AEs where the "Xanomeline High Dose" treatment group (second by variable
level) observed more than 2 AEs
an ARD data frame of class 'card'
# create a base AE ARD
ard <- ard_stack_hierarchical(
  ADAE,
  variables = c(AESOC, AEDECOD),
  by = TRTA,
  denominator = ADSL,
  id = USUBJID,
  overall = TRUE
)

# Example 1 ----------------------------------
# Keep AEs from TRTA groups where more than 3 AEs are observed across the group
filter_ard_hierarchical(ard, sum(n) > 3)

# Example 2 ----------------------------------
# Keep AEs where at least one level in the TRTA group has more than 3 AEs observed
filter_ard_hierarchical(ard, n > 3)

# Example 3 ----------------------------------
# Keep AEs that have an overall prevalence of greater than 5%
filter_ard_hierarchical(ard, sum(n) / sum(N) > 0.05)

# Example 4 ----------------------------------
# Keep AEs that have a difference in prevalence of greater than 3% between reference group with
# `TRTA = "Xanomeline High Dose"` and comparison group with `TRTA = "Xanomeline Low Dose"`
filter_ard_hierarchical(ard, abs(p_2 - p_3) > 0.03)

# Example 5 ----------------------------------
# Keep AEs from SOCs that have an overall prevalence of greater than 20%
filter_ard_hierarchical(ard, p_overall > 0.20, var = AESOC)
Returns the statistics from an ARD as a named list.
get_ard_statistics(x, ..., .column = "stat", .attributes = NULL)
x |
( |
... |
( |
.column |
( |
.attributes |
( |
named list
ard <- ard_tabulate(ADSL, by = "ARM", variables = "AGEGR1")

get_ard_statistics(
  ard,
  group1_level %in% "Placebo",
  variable_level %in% "65-80",
  .attributes = "stat_label"
)
Returns a function with the requested rounding and scaling schema.
label_round(digits = 1, scale = 1, width = NULL)
digits |
( |
scale |
( |
width |
( |
a function
label_round(2)(pi)
label_round(1, scale = 100)(pi)
label_round(2, width = 5)(pi)
For each column in the passed data frame, the function returns a named list
with the value being the largest/last element after a sort.
For factors, the last level is returned, and for logical vectors TRUE is returned.
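The rule described above can be approximated in base R. The helper `max_value()` is hypothetical and written only to illustrate the three cases; in particular, returning the last factor level as a character string is an assumption about the output type, not a statement about what maximum_variable_value() returns.

```r
# Hypothetical helper illustrating the per-column rule
max_value <- function(x) {
  if (is.factor(x)) {
    return(levels(x)[nlevels(x)]) # last level, regardless of observed values
  }
  if (is.logical(x)) {
    return(TRUE)
  }
  utils::tail(sort(unique(x)), 1) # largest/last element after a sort
}

df <- data.frame(
  num = c(3, 1, 2),
  chr = c("x", "z", "y"),
  lgl = c(FALSE, FALSE, NA),
  stringsAsFactors = FALSE
)
df$fct <- factor(c("low", "low", "mid"), levels = c("low", "mid", "high"))

max_values <- lapply(df, max_value)
str(max_values)
```

Note that the factor column yields "high" even though that level is never observed, and the logical column yields TRUE even though no TRUE value is present.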
maximum_variable_value(data)
data |
( |
a named list
ADSL[c("AGEGR1", "BMIBLGR1")] |> maximum_variable_value()
Create empty ARDs used to create mock tables or table shells.
Where applicable, the formatting functions are set to return 'xx' or 'xx.x'.
mock_categorical(
  variables,
  statistic = everything() ~ c("n", "p", "N"),
  by = NULL
)

mock_continuous(
  variables,
  statistic = everything() ~ c("N", "mean", "sd", "median", "p25", "p75", "min", "max"),
  by = NULL
)

mock_dichotomous(
  variables,
  statistic = everything() ~ c("n", "p", "N"),
  by = NULL
)

mock_missing(
  variables,
  statistic = everything() ~ c("N_obs", "N_miss", "N_nonmiss", "p_miss", "p_nonmiss"),
  by = NULL
)

mock_attributes(label)

mock_total_n()
variables |
( a named list for functions |
statistic |
( |
by |
(named |
label |
(named |
an ARD data frame of class 'card'
mock_categorical(
  variables =
    list(
      AGEGR1 = factor(c("<65", "65-80", ">80"), levels = c("<65", "65-80", ">80"))
    ),
  by = list(TRTA = c("Placebo", "Xanomeline High Dose", "Xanomeline Low Dose"))
) |>
  apply_fmt_fun()

mock_continuous(
  variables = c("AGE", "BMIBL"),
  by = list(TRTA = c("Placebo", "Xanomeline High Dose", "Xanomeline Low Dose"))
) |>
  # update the mock to report 'xx.xx' for standard deviations
  update_ard_fmt_fun(variables = c("AGE", "BMIBL"), stat_names = "sd", fmt_fun = \(x) "xx.xx") |>
  apply_fmt_fun()
This function is similar to tidyr::nest(), except that it retains
rows for unobserved combinations (and unobserved factor levels) of by
variables, and unobserved combinations of stratifying variables.
The levels are wrapped in lists so they can be stacked with levels of other classes.
nest_for_ard(
  data,
  by = NULL,
  strata = NULL,
  key = "data",
  rename_columns = TRUE,
  list_columns = TRUE,
  include_data = TRUE,
  include_by_and_strata = FALSE
)
data |
( |
by, strata
|
(
Arguments may be used in conjunction with one another. |
key |
( |
rename_columns |
( |
list_columns |
( |
include_data |
(scalar |
include_by_and_strata |
( |
a nested tibble
nest_for_ard(
  data =
    ADAE |>
      dplyr::left_join(ADSL[c("USUBJID", "ARM")], by = "USUBJID") |>
      dplyr::filter(AOCCSFL %in% "Y"),
  by = "ARM",
  strata = "AESOC"
)
Function parses the errors and warnings observed while calculating the statistics requested in the ARD and prints them to the console as messages.
print_ard_conditions(x, condition_type = c("inform", "identity"))
x |
( |
condition_type |
( |
returns invisible if check is successful, throws all condition messages if not.
# passing a character variable for numeric summary
ard_summary(ADSL, variables = AGEGR1) |>
  print_ard_conditions()
Functions process tidyselect arguments passed to functions in the cards package. The processed values are saved to the calling environment, by default.
process_selectors(): the arguments will be processed with tidyselect and
converted to a vector of character column names.
process_formula_selectors(): for arguments that expect named lists or
lists of formulas (where the LHS of the formula is a tidyselector). This
function processes these inputs and returns a named list. If a name is
repeated, the last entry is kept.
fill_formula_selectors(): when users override the default argument values,
it can be important to ensure that each column from a data frame is assigned
a value. This function checks that each column in data has an assigned
value, and if not, fills the value in with the default value passed here.
compute_formula_selector(): used in process_formula_selectors() to
evaluate a single argument.
check_list_elements(): used to check the class/type/values of the list
elements, primarily those processed with process_formula_selectors().
cards_select(): wraps tidyselect::eval_select() |> names(), and returns
better contextual messaging when errors occur.
process_selectors(data, ...)

process_formula_selectors(data, ...)

fill_formula_selectors(data, ...)

## S3 method for class 'data.frame'
process_selectors(data, ..., env = caller_env())

## S3 method for class 'data.frame'
process_formula_selectors(
  data,
  ...,
  env = caller_env(),
  include_env = FALSE,
  allow_empty = TRUE
)

## S3 method for class 'data.frame'
fill_formula_selectors(data, ..., env = caller_env())

compute_formula_selector(
  data,
  x,
  arg_name = caller_arg(x),
  env = caller_env(),
  strict = TRUE,
  include_env = FALSE,
  allow_empty = TRUE
)

check_list_elements(
  x,
  predicate,
  error_msg = NULL,
  arg_name = rlang::caller_arg(x)
)

cards_select(expr, data, ..., arg_name = NULL)
data |
( |
... |
(
|
env |
( |
include_env |
( |
allow_empty |
( |
x |
|
arg_name |
( |
strict |
( |
predicate |
( |
error_msg |
( |
expr |
( |
process_selectors(), fill_formula_selectors(), process_formula_selectors()
and check_list_elements() return NULL. compute_formula_selector() returns a
named list.
example_env <- rlang::new_environment()

process_selectors(ADSL, variables = starts_with("TRT"), env = example_env)
get(x = "variables", envir = example_env)

fill_formula_selectors(ADSL, env = example_env)

process_formula_selectors(
  ADSL,
  statistic = list(starts_with("TRT") ~ mean, TRTSDT = min),
  env = example_env
)
get(x = "statistic", envir = example_env)

check_list_elements(
  get(x = "statistic", envir = example_env),
  predicate = function(x) !is.null(x),
  error_msg = c(
    "Error in the argument {.arg {arg_name}} for variable {.val {variable}}.",
    "i" = "Value must be a named list of functions."
  )
)

# process one list
compute_formula_selector(ADSL, x = starts_with("U") ~ 1L)
Rename the grouping and variable columns to their original column names.
rename_ard_columns(
  x,
  columns = c(all_ard_groups("names"), all_ard_variables("names")),
  fill = "{colname}",
  fct_as_chr = TRUE,
  unlist = NULL
)
x |
( |
columns |
( |
fill |
(scalar/glue) |
fct_as_chr |
(scalar |
unlist |
data frame
# Example 1 ----------------------------------
ADSL |>
  ard_tabulate(by = ARM, variables = AGEGR1) |>
  apply_fmt_fun() |>
  rename_ard_columns() |>
  unlist_ard_columns()

# Example 2 ----------------------------------
ADSL |>
  ard_summary(by = ARM, variables = AGE) |>
  apply_fmt_fun() |>
  rename_ard_columns(fill = "Overall {colname}") |>
  unlist_ard_columns()
Functions for renaming group column names in ARDs.
rename_ard_groups_shift(x, shift = -1)

rename_ard_groups_reverse(x)
x |
( |
shift |
( |
an ARD data frame of class 'card'
ard <- ard_summary(ADSL, by = c(SEX, ARM), variables = AGE)

# Example 1 ----------------------------------
rename_ard_groups_shift(ard, shift = -1)

# Example 2 ----------------------------------
rename_ard_groups_reverse(ard)
When a statistical summary function errors, the "stat" column will be
NULL. It is, however, sometimes useful to replace these values with a
non-NULL value, e.g. NA.
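The idea of swapping NULL entries for a placeholder can be shown on a plain list in base R. The list `stat` below is a made-up stand-in for an ARD "stat" list column; the sketch does not use replace_null_statistic() itself.

```r
# Made-up stand-in for a "stat" list column where one statistic errored
stat <- list(5.1, NULL, 7.3)

# Identify the NULL entries and replace them with NA
is_null <- vapply(stat, is.null, logical(1))
stat[is_null] <- list(NA)

str(stat)
```

Assigning `list(NA)` keeps the list the same length; assigning a bare NULL to those positions would drop the elements instead.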
replace_null_statistic(x, value = NA, rows = TRUE)
x |
( |
value |
(usually a |
rows |
( |
an ARD data frame of class 'card'
# the quantile functions error because the input is character, while the median function returns NA
data.frame(x = rep_len(NA_character_, 10)) |>
  ard_summary(
    variables = x,
    statistic = ~ continuous_summary_fns(c("median", "p25", "p75"))
  ) |>
  replace_null_statistic(rows = !is.null(error))
Rounds the values in its first argument to the specified number of
decimal places (default 0). Importantly, round5() does not use Base R's
"round to even" default. Standard rounding methods are implemented, for example,
cards::round5(0.5) = 1, whereas base::round(0.5) = 0.
round5(x, digits = 0)
x |
( |
digits |
( |
Function inspired by janitor::round_half_up().
a numeric vector
x <- 0:4 / 2
round5(x) |> setNames(x)

# compare results to Base R
round(x) |> setNames(x)
These selection helpers match variables according to a given pattern.
all_ard_groups(): Function selects grouping columns, e.g. columns
named "group##" or "group##_level".
all_ard_variables(): Function selects variable columns, e.g. columns
named "variable" or "variable_level".
all_ard_group_n(): Function selects n grouping columns.
all_missing_columns(): Function selects columns that are all NA or empty.
all_ard_groups(types = c("names", "levels"))

all_ard_variables(types = c("names", "levels"))

all_ard_group_n(n, types = c("names", "levels"))

all_missing_columns()
types |
( |
n |
( |
tidyselect output
ard <- ard_tabulate(ADSL, by = "ARM", variables = "AGEGR1")

ard |> dplyr::select(all_ard_groups())
ard |> dplyr::select(all_ard_variables())
This function is used to sort stacked hierarchical ARDs.
For the purposes of this function, we define a "variable group" as a combination of ARD rows grouped by the
combination of all their variable levels, but excluding any by variables.
sort_ard_hierarchical(x, sort = everything() ~ "descending")
x |
( |
sort |
(
Defaults to |
an ARD data frame of class 'card'
If overall data is present in x (i.e. the ARD was created with ard_stack_hierarchical(overall=TRUE)), the
overall data will be sorted last within each variable group (i.e. after any other rows with the same combination of
variable levels).
ard_stack_hierarchical(
  ADAE,
  variables = c(AESOC, AEDECOD),
  by = TRTA,
  denominator = ADSL,
  id = USUBJID
) |>
  sort_ard_hierarchical(AESOC ~ "alphanumeric")

ard_stack_hierarchical_count(
  ADAE,
  variables = c(AESOC, AEDECOD),
  by = TRTA,
  denominator = ADSL
) |>
  sort_ard_hierarchical(sort = list(AESOC ~ "alphanumeric", AEDECOD ~ "descending"))
continuous_summary_fns() returns a named list of summary functions
for continuous variables. Some functions include slight modifications to
their base equivalents. For example, the min() and max() functions
return NA instead of Inf when an empty vector is passed.
Statistics "p25" and "p75" are calculated with quantile(type = 2),
which matches SAS's default value.
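The effect of these choices can be seen with base R alone. In the sketch below, `min_or_na()` is a hypothetical helper that mirrors the described NA-for-empty behaviour; it is not the function used inside continuous_summary_fns().

```r
x <- 1:4

# type = 2 averages the two neighbouring values at a discontinuity
q_type2 <- unname(quantile(x, probs = 0.25, type = 2))

# type = 7 is base R's default and interpolates linearly
q_type7 <- unname(quantile(x, probs = 0.25, type = 7))

# Hypothetical helper: return NA rather than Inf for an empty vector
min_or_na <- function(x) if (length(x) == 0L) NA else min(x)

print(c(type2 = q_type2, type7 = q_type7))
print(min_or_na(numeric(0)))
```

With only four observations the two quantile definitions already disagree, which is why fixing the type matters when results are compared against SAS output.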
continuous_summary_fns(
  summaries = c("N", "mean", "sd", "median", "p25", "p75", "min", "max"),
  other_stats = NULL
)
summaries |
( |
other_stats |
(named |
named list of summary statistics
# continuous variable summaries
ard_summary(
  ADSL,
  variables = "AGE",
  statistic = ~ continuous_summary_fns(c("N", "median"))
)
ARD functions for relocating columns and rows to the standard order.
tidy_ard_column_order() relocates columns of the ARD to the standard order.
tidy_ard_row_order() orders the rows of the ARD according to groups and
strata (group1, then group2, etc.), while retaining the column order of the input ARD.
tidy_ard_column_order(x, group_order = c("ascending", "descending"))

tidy_ard_row_order(x)
x |
( |
group_order |
( |
an ARD data frame of class 'card'
# order columns
ard <-
  dplyr::bind_rows(
    ard_summary(mtcars, variables = "mpg"),
    ard_summary(mtcars, variables = "mpg", by = "cyl")
  )

tidy_ard_column_order(ard) |>
  tidy_ard_row_order()
Unlist ARD Columns
unlist_ard_columns(
  x,
  columns = c(where(is.list), -any_of(c("warning", "error", "fmt_fun"))),
  fill = NA,
  fct_as_chr = TRUE
)
x |
( |
columns |
( |
fill |
(scalar) |
fct_as_chr |
(scalar |
a data frame
ADSL |>
  ard_tabulate(by = ARM, variables = AGEGR1) |>
  apply_fmt_fun() |>
  unlist_ard_columns()

ADSL |>
  ard_summary(by = ARM, variables = AGE) |>
  apply_fmt_fun() |>
  unlist_ard_columns()
Functions used to update ARD formatting functions and statistic labels.
This is a helper function to streamline the update process. If it does not exactly meet your needs, recall that an ARD is just a data frame and it can be modified directly.
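As a rough sketch of modifying an ARD directly, the example below overwrites one statistic label with plain dplyr verbs instead of the helper. It assumes that stat_label is an ordinary character column and that stat_name contains a "mean" row for ard_summary() output. The replacement label "Average" is arbitrary.

```r
library(cards)

ard <- ard_summary(mtcars, variables = mpg)

# overwrite the label of a single statistic with ordinary data frame tools
# (assumes `stat_label` is a character column)
ard_relabelled <- ard |>
  dplyr::mutate(
    stat_label = ifelse(stat_name == "mean", "Average", stat_label)
  )

# inspect the modified row
ard_relabelled |>
  dplyr::filter(stat_name == "mean") |>
  dplyr::select(variable, stat_name, stat_label)
```

The helper functions remain the more convenient route when the update must be restricted by variable or by a filter expression.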
update_ard_fmt_fun(
  x,
  variables = everything(),
  stat_names,
  fmt_fun,
  filter = TRUE,
  fmt_fn = deprecated()
)

update_ard_stat_label(
  x,
  variables = everything(),
  stat_names,
  stat_label,
  filter = TRUE
)
x |
( |
variables |
( |
stat_names |
( |
fmt_fun |
( |
filter |
( |
fmt_fn |
|
stat_label |
( |
an ARD data frame of class 'card'
ard_summary(ADSL, variables = AGE) |>
  update_ard_fmt_fun(stat_names = c("mean", "sd"), fmt_fun = 8L) |>
  update_ard_stat_label(stat_names = c("mean", "sd"), stat_label = "Mean (SD)") |>
  apply_fmt_fun()

# same as above, but only apply update to the Placebo level
ard_summary(
  ADSL,
  by = ARM,
  variables = AGE,
  statistic = ~ continuous_summary_fns(c("N", "mean"))
) |>
  update_ard_fmt_fun(stat_names = "mean", fmt_fun = 8L, filter = group1_level == "Placebo") |>
  apply_fmt_fun()