calc

Introduction

calc() is the analytic workhorse of rads. It provides a standardized method for obtaining most of what we usually want to calculate: means, medians, counts, confidence intervals, standard errors, relative standard errors (RSE), numerators, denominators, the number missing, and the proportion missing. calc() can be used with record data (e.g., vital statistics, census, enrollment numbers, etc.) as well as survey data (e.g., BRFSS, ACS PUMS, etc.).

calc() provides a unified interface built on top of common R packages, including data.table, which can make it faster than implementing custom functions with base R or dplyr in some cases. This makes it a convenient and efficient option, allowing APDE staff to use consistent syntax across different data sources. While all calc() functionality could be achieved with other packages directly, the primary benefit is standardization and ease of use—even if some specialized packages might occasionally offer more efficient implementations for specific analyses.

This vignette will provide some examples to introduce the calc() function by walking through basic analyses with vital statistics (birth data) and survey data (ACS PUMS). To get the most out of this vignette, we recommend that you type each and every bit of code into R. Doing so will almost definitely help you learn the syntax much faster than just reading the vignette or copying and pasting the code.

`calc()` arguments

Arguments are the values that we send to a function when it is called. The standard arguments for calc() are:

ph.data <- the name of the data.table/data.frame or survey object that you want to analyze.
what <- a character vector of the variable(s) for which you want to calculate the metrics. E.g., what = c("uninsured")
where <- think of this as a SQL WHERE clause, i.e., a filtering or subsetting of data. E.g., ages %in% c(0:17) & gender == "Female"
by <- a character vector of the variables that you want to computer the what by, i.e., the cross-tab variable(s). E.g., by = c("gender") would stratify results by gender
metrics <- a character vector of the metrics that you want returned. E.g., metrics = c("mean", "rse"). You can see a complete list of available metrics by typing metrics() and get detailed descriptions of what each metric means by typing ?metrics().
per <- an integer, which is the denominator when rate is selected as a metric. Metrics will be multiplied by this value. E.g., per = 1000. NOTE this is just a scalar. At present this does not calculate a true rate (i.e., the relevant population denominator is not used).
win <- an integer, which is the number of consecutive units of time (e.g., years, months, etc.) over which the metrics will be calculated, i.e., the 'window' for a rolling average, sum, etc. E.g. win = 5 will perform calculations over every 5 time unit window.
time_var <- a character, which is the name of the time variable in the dataset. Used in combination with the win argument to generate time windowed calculations.
fancy_time <- a logical (i.e., T or F) flag. If TRUE, a record of all the years going into the data is returned. If FALSE, just a simple range (where certain years within the range might not be represented in your data).
proportion <- a logical (i.e., T or F) flag determining whether the metrics should be calculated as a proportion. Currently only relevant for survey data. The default is FALSE.
ci <- a numeric value between 0 & 1 for the confidence level returned in the estimates. E.g., 0.95 == 95% confidence interval
verbose <- a logical used to toggle on/off printed warnings

There is no need to specify all of the arguments listed above. As you can see in the following example, the calc function simply needs a specified dataset (i.e., ph.data) and at least one what variable to return an intelligible result.

library(rads)
library(data.table)
data(mtcars)
calc(ph.data = mtcars, what = c("mpg"))[]

variable	mean	numerator	denominator	level	mean_se	mean_lower	mean_upper
mpg	20.09062	642.9	32	NA	1.065424	18.00243	22.17882

Note: The use of [] after calc() is used to print the output to the console. Typically, you would not print the results but would save them as an object. E.g., my.est <- calc().

Example vital statistics analyses

First get the birth data (cf. get_data vignette)

birth <- get_data_birth(cols = c("chi_year", "chi_sex", "chi_race_eth8", 
                                 "preterm", "birth_weight_grams", "mother_birthplace_state"), 
                        year = c(2013:2019), 
                        kingco = T)

Mean (proportion) for a binary over multiple years

calc(ph.data = birth, 
     what = c("preterm"), 
     metrics = c("mean", "rse", "numerator", "denominator"), 
     time_var = "chi_year")[]

chi_year	variable	mean	numerator	denominator	level	mean_se	mean_lower	mean_upper	rse
2013-2019	preterm	0.0905233	15806	174607	NA	0.0006867	0.0891774	0.0918691	0.7585532
Mean (proportion) for a binary over multiple years -- `where` newborn is male

calc(ph.data = birth, 
     what = c("preterm"), 
     where = chi_sex == "Male",
     metrics = c("mean", "rse", "numerator", "denominator"), 
     time_var = "chi_year")[]

chi_year	variable	mean	numerator	denominator	level	mean_se	mean_lower	mean_upper	rse
2013-2019	preterm	0.097169	8694	89473	NA	0.0009902	0.0952282	0.0991097	1.019051

Mean (proportion) for a binary over multiple years -- where newborn is male born to a Hispanic mother

calc(ph.data = birth, 
     what = c("preterm"), 
     where = chi_sex == "Male" & chi_race_eth8 == "Hispanic",
     metrics = c("mean", "rse", "numerator", "denominator"), 
     time_var = "chi_year")[]

chi_year	variable	mean	numerator	denominator	level	mean_se	mean_lower	mean_upper	rse
2013-2019	preterm	0.1105628	1277	11550	NA	0.002918	0.1048435	0.116282	2.639253

Mean (proportion) for a binary over individual years

birth[, cy := chi_year]
calc(ph.data = birth, 
     what = c("preterm"), 
     metrics = c("mean", "rse", "numerator", "denominator"), 
     time_var = "chi_year", 
     by = "cy")[]

cy	chi_year	variable	mean	numerator	denominator	level	mean_se	mean_lower	mean_upper	rse
2013	2013	preterm	0.0925978	2293	24763	NA	0.0018421	0.0889874	0.0962082	1.989329
2014	2014	preterm	0.0884756	2231	25216	NA	0.0017884	0.0849704	0.0919808	2.021357
2015	2015	preterm	0.0882979	2241	25380	NA	0.0017810	0.0848072	0.0917886	2.017038
2016	2016	preterm	0.0895684	2320	25902	NA	0.0017744	0.0860907	0.0930461	1.981016
2017	2017	preterm	0.0912052	2296	25174	NA	0.0018146	0.0876487	0.0947617	1.989553
2018	2018	preterm	0.0894487	2166	24215	NA	0.0018340	0.0858541	0.0930433	2.050369
2019	2019	preterm	0.0942939	2259	23957	NA	0.0018881	0.0905933	0.0979946	2.002371

Mean (proportion) for a binary over windowed years

calc(ph.data = birth, 
     what = c("preterm"), 
     metrics = c("mean", "rse", "numerator", "denominator"), 
     time_var = "chi_year", 
     win = 3)[]

chi_year	variable	mean	numerator	denominator	level	mean_se	mean_lower	mean_upper	rse
2013-2015	preterm	0.0897703	6765	75359	NA	0.0010413	0.0877294	0.0918112	1.159964
2014-2016	preterm	0.0887866	6792	76498	NA	0.0010284	0.0867710	0.0908023	1.158281
2015-2017	preterm	0.0896856	6857	76456	NA	0.0010334	0.0876602	0.0917109	1.152210
2016-2018	preterm	0.0900772	6782	75291	NA	0.0010434	0.0880322	0.0921221	1.158314
2017-2019	preterm	0.0916342	6721	73346	NA	0.0010653	0.0895462	0.0937221	1.162563

Mean for a continuous over windowed years

calc(ph.data = birth, 
     what = c("birth_weight_grams"), 
     metrics = c("mean", "rse"), 
     time_var = "chi_year", 
     win = 3)[]

chi_year	variable	mean	level	mean_se	mean_lower	mean_upper	rse
2013-2015	birth_weight_grams	3339.469	NA	2.122024	3335.310	3343.628	0.0635437
2014-2016	birth_weight_grams	3336.555	NA	2.087116	3332.465	3340.646	0.0625530
2015-2017	birth_weight_grams	3333.507	NA	2.068735	3329.452	3337.562	0.0620588
2016-2018	birth_weight_grams	3327.129	NA	2.073889	3323.065	3331.194	0.0623327
2017-2019	birth_weight_grams	3320.961	NA	2.088449	3316.868	3325.055	0.0628869

Mean for a continuous in 2019, by gender and race/eth

calc(ph.data = birth, 
     what = c("birth_weight_grams"), 
     chi_year == 2019,
     metrics = c("mean", "rse"), 
     by = c("chi_race_eth8", "chi_sex"))[]

chi_race_eth8	chi_sex	variable	mean	level	mean_se	mean_lower	mean_upper	rse
AIAN	Female	birth_weight_grams	3251.965	NA	79.070604	3096.989	3406.940	2.4314716
AIAN	Male	birth_weight_grams	3438.641	NA	84.306018	3273.404	3603.877	2.4517252
Asian	Female	birth_weight_grams	3145.902	NA	9.419157	3127.441	3164.363	0.2994104
Asian	Male	birth_weight_grams	3231.952	NA	9.784437	3212.775	3251.129	0.3027408
Black	Female	birth_weight_grams	3196.872	NA	18.683442	3160.253	3233.491	0.5844288
Black	Male	birth_weight_grams	3306.213	NA	19.794051	3267.418	3345.009	0.5986925
Hispanic	Female	birth_weight_grams	3270.074	NA	13.703834	3243.215	3296.933	0.4190681
Hispanic	Male	birth_weight_grams	3301.784	NA	14.310447	3273.736	3329.832	0.4334156
Multiple	Female	birth_weight_grams	3251.312	NA	24.365976	3203.556	3299.069	0.7494198
Multiple	Male	birth_weight_grams	3369.178	NA	23.359494	3323.394	3414.962	0.6933292
NHPI	Female	birth_weight_grams	3287.541	NA	49.322668	3190.871	3384.212	1.5002905
NHPI	Male	birth_weight_grams	3444.178	NA	43.324387	3359.264	3529.092	1.2579021
Other	Female	birth_weight_grams	3223.599	NA	30.670381	3163.487	3283.712	0.9514328
Other	Male	birth_weight_grams	3315.557	NA	33.357232	3250.178	3380.936	1.0060823
White	NA	birth_weight_grams	1873.500	NA	1321.500000	-14917.750	18664.750	70.5364291
White	Female	birth_weight_grams	3340.295	NA	7.280580	3326.026	3354.565	0.2179622
White	Male	birth_weight_grams	3460.298	NA	7.489711	3445.618	3474.977	0.2164470

Proportion of 2017-2019 births among each race/eth group

calc(ph.data = birth, 
     what = c("chi_race_eth8"), 
     chi_year %in% 2017:2019,
     metrics = c("mean", "rse", "obs", "numerator", "denominator"))[]

level	variable	denominator	obs	mean	mean_se	mean_lower	mean_upper	numerator	rse
AIAN	chi_race_eth8	73701	73701	0.0053052	0.0002676	0.0048059	0.0058561	391	5.0438189
Asian	chi_race_eth8	73701	73701	0.2291149	0.0015481	0.2260950	0.2321631	16886	0.6756696
Black	chi_race_eth8	73701	73701	0.0903651	0.0010561	0.0883165	0.0924564	6660	1.1686901
Hispanic	chi_race_eth8	73701	73701	0.1300796	0.0012391	0.1276703	0.1325275	9587	0.9525797
Multiple	chi_race_eth8	73701	73701	0.0420347	0.0007392	0.0406097	0.0435075	3098	1.7584788
NHPI	chi_race_eth8	73701	73701	0.0162956	0.0004664	0.0154064	0.0172352	1201	2.8619613
Other	chi_race_eth8	73701	73701	0.0186700	0.0004986	0.0177176	0.0196726	1376	2.6705534
White	chi_race_eth8	73701	73701	0.4681348	0.0018380	0.4645341	0.4717388	34502	0.3926283

2017-2019 rate per 100k of births among each race/eth group

calc(ph.data = birth, 
     what = c("chi_race_eth8"), 
     chi_year %in% 2017:2019,
     metrics = c("obs", "numerator", "denominator", "rate"), 
     per = 100000)[]

level	variable	denominator	obs	numerator	rate	rate_se	rate_lower	rate_upper	rate_per
AIAN	chi_race_eth8	73701	73701	391	530.522	26.75857	480.5929	585.6077	1e+05
Asian	chi_race_eth8	73701	73701	16886	22911.494	154.80599	22609.4981	23216.3131	1e+05
Black	chi_race_eth8	73701	73701	6660	9036.512	105.60883	8831.6537	9245.6411	1e+05
Hispanic	chi_race_eth8	73701	73701	9587	13007.965	123.91123	12767.0314	13252.7538	1e+05
Multiple	chi_race_eth8	73701	73701	3098	4203.471	73.91714	4060.9678	4350.7475	1e+05
NHPI	chi_race_eth8	73701	73701	1201	1629.557	46.63730	1540.6391	1723.5175	1e+05
Other	chi_race_eth8	73701	73701	1376	1867.003	49.85932	1771.7604	1967.2633	1e+05
White	chi_race_eth8	73701	73701	34502	46813.476	183.80296	46453.4068	47173.8775	1e+05

Number and proportion of missing gender by year

calc(ph.data = birth, 
     what = c("chi_sex"), 
     chi_year %in% 2017:2019,
     metrics = c("obs", "missing", "missing.prop"), 
     by = "chi_year")[]

chi_year	variable	obs	missing	missing.prop	level
2017	chi_sex	25274	0	0.0e+00	Female
2017	chi_sex	25274	0	0.0e+00	Male
2018	chi_sex	24337	0	0.0e+00	Female
2018	chi_sex	24337	0	0.0e+00	Male
2019	chi_sex	24090	2	8.3e-05	NA
2019	chi_sex	24090	2	8.3e-05	Female
2019	chi_sex	24090	2	8.3e-05	Male

Example survey analyses

Survey data must be 'set' to use the proper survey design. However, survey data accessed through rads::get_data() functions have already been set as data.table/dtsurvey objects and are ready for analysis with calc().

pums <- get_data_pums(year = 2021, kingco = F, records = 'person')

## Your data was survey set with the following parameters is ready for rads::calc():
##  - record type = person
##  - valid years = 2021
##  - replicate weights = pwgtp, pwgtp1 ... pwgtp9

Let's get on the same page re: the meaning of the columns from calc()

First, create a simple tabulation of the population of Seattle after sub-setting PUMS to King County.

test1 <- calc(ph.data = pums, 
             what = "nonseattle", 
             metrics = c('mean', 'numerator', 'denominator', 'obs', 'total'), 
             where = chi_geo_kc == "King County")
print(test1)

level	variable	denominator	obs	mean	mean_se	mean_lower	mean_upper	total	total_se	total_lower	total_upper	numerator
NonSeattle	nonseattle	21506	21506	0.6742587	0.0002685	0.6737242	0.6747932	1518730	539.2718	1517656.6	1519803.4	14962
Seattle	nonseattle	21506	21506	0.3257413	0.0002685	0.3252068	0.3262758	733714	783.3842	732154.7	735273.3	6544

mean: The proportion of the total population (i.e., King County) that lives in Seattle
total: The survey weighted population Seattle and NonSeattle residents in King County
numerator: The number of rows with people living in Seattle, compare to nrow(pums[nonseattle == 'Seattle])
denominator: The number of rows in the the dataset where the 'what' variable can be assessed because it is not missing, compare to nrow(pums[!is.na(nonseattle)])
obs: The number of rows in the dataset after filtering by the 'where' argument.

To highlight the difference between denominator and obs, in this next example we introduce 100 missing values to our 'what' variable (nonseattle), thereby lowering the denominator by 100 but leaving the obs the same as above.

pums2 <- copy(pums)
pums2 <- pums2[nonseattle == 'Seattle', nonseattle := ifelse(rowid(nonseattle) <= 100, NA, nonseattle)]
test2 <- calc(ph.data = pums2, 
             what = "nonseattle", 
             metrics = c('mean', 'numerator', 'denominator', 'obs', 'total'), 
             where = chi_geo_kc == "King County")
print(test2)

level	variable	denominator	obs	mean	mean_se	mean_lower	mean_upper	total	total_se	total_lower	total_upper	numerator
NonSeattle	nonseattle	21406	21506	0.6754666	0.0002912	0.6748871	0.6760462	1518730	539.2718	1517656.6	1519803.4	14962
Seattle	nonseattle	21406	21506	0.3245334	0.0002912	0.3239538	0.3251129	729686	876.1572	727942.1	731429.9	6444

As expected, the obs value is the same as in table test1 above (21,506), but the denominator value decreased by 100 to 21,406.

Mean (proportion) of those near poverty or disabled, by King County (vs. remainder of WA)

calc(ph.data = pums, 
     what = c("disability", "GEpov200"), 
     metrics = c("mean", "rse", "obs", "numerator", "denominator"), 
     proportion = T, 
     by = "chi_geo_kc")[]

chi_geo_kc	level	variable	denominator	obs	mean	mean_se	mean_lower	mean_upper	numerator	rse
NA	With a disability	disability	57022	57022	0.1474778	0.0018915	0.1437525	0.1512827	9374	1.2825730
NA	Without a disability	disability	57022	57022	0.8525222	0.0018915	0.8487173	0.8562475	47648	0.2218723
King County	With a disability	disability	21506	21506	0.0966941	0.0024221	0.0919792	0.1016236	2404	2.5048703
King County	Without a disability	disability	21506	21506	0.9033059	0.0024221	0.8983764	0.9080208	19102	0.2681330
NA	NA	GEpov200	55189	57022	0.7536707	0.0035095	0.7466186	0.7605894	42256	0.4656592
King County	NA	GEpov200	20903	21506	0.8172007	0.0050456	0.8069427	0.8270304	17450	0.6174295

Proportion in each CHNA age group, by disability status

calc(ph.data = pums, 
     what = c("age6"), 
     metrics = c("mean", "rse", "obs", "numerator", "denominator"), 
     proportion = F, 
     by = "disability")[]

disability	level	variable	denominator	obs	mean	mean_se	mean_lower	mean_upper	numerator	rse
Without a disability	18-24	age6	66750	66750	0.0900336	0.0006471	0.0887455	0.0913217	5545	0.7187710
Without a disability	25-44	age6	66750	66750	0.3084901	0.0011385	0.3062239	0.3107563	18734	0.3690608
Without a disability	45-64	age6	66750	66750	0.2398575	0.0009444	0.2379776	0.2417373	16968	0.3937522
Without a disability	65-74	age6	66750	66750	0.0886149	0.0006455	0.0873300	0.0898998	7519	0.7284496
Without a disability	75+	age6	66750	66750	0.0348890	0.0005355	0.0338231	0.0359550	3197	1.5349388
Without a disability	<18	age6	66750	66750	0.2381149	0.0006944	0.2367327	0.2394971	14787	0.2916310
With a disability	18-24	age6	11778	11778	0.0602630	0.0025721	0.0551434	0.0653827	610	4.2681494
With a disability	25-44	age6	11778	11778	0.1871306	0.0056229	0.1759384	0.1983228	1760	3.0048262
With a disability	45-64	age6	11778	11778	0.2668707	0.0053801	0.2561619	0.2775796	2959	2.0160022
With a disability	65-74	age6	11778	11778	0.1850895	0.0040475	0.1770332	0.1931457	2390	2.1867618
With a disability	75+	age6	11778	11778	0.2273166	0.0033096	0.2207290	0.2339043	3342	1.4559469
With a disability	<18	age6	11778	11778	0.0733296	0.0037293	0.0659066	0.0807525	717	5.0856353

Mean & median age in King County, by disability status

calc(ph.data = pums, 
     what = c("agep"),
     chi_geo_kc == "King County",
     metrics = c("mean", "median", "rse", "obs", "numerator", "denominator"), 
     by = "disability")[]

##              disability variable     mean median numerator denominator   obs  level    mean_se mean_lower mean_upper       rse
##                  <fctr>   <char>    <num>  <num>     <int>       <int> <int> <lgcl>      <num>      <num>      <num>     <num>
## 1: Without a disability     agep 36.57783     38    727379       19102 19102     NA 0.06941677   36.43966   36.71600 0.1897783
## 2:    With a disability     agep 56.32190     64    142658        2404  2404     NA 0.66249953   55.00323   57.64057 1.1762734

Example analyses with non-standard data

In the examples above we used standard public health data (birth vital statistics and the Census Bureau's ACS survey). However, since calc() is a generalized function, you can use it with nearly any dataset, as long as it is a data.frame, a data.table, or a survey object. To demonstrate this, we will use calc() with synthetic data that we will generate below.

Create the dataset

library(data.table)
set.seed(98121) 

mydt <- data.table(
  school = as.factor(sample(c("Alpha", "Beta", "Gamma", "Delta"), 2000, replace = T)),
  grades = as.factor(sample(c("A", "B", "C", "D"), 2000, replace = T)), 
  year = sample(2016:2021, 2000, replace = T))

head(mydt)

school	grades	year
Alpha	A	2017
Delta	B	2019
Gamma	C	2017
Beta	A	2016
Delta	C	2018
Alpha	D	2019

We see that we created a dataset of 2000 rows with grades in four schools between with 2016 and 2021.

Calculate the proportion of A's and B's in the Alpha and Beta schools

grades.distribution <- calc(
  ph.data = mydt, 
  school %in% c("Alpha", "Beta"), 
  what = "grades", 
  by = "school", 
  time_var = "year", 
  metrics = c("numerator", "denominator", "mean"), proportion = F)

print(grades.distribution[level %in% c("A", "B")])

school	level	year	variable	denominator	mean	mean_se	mean_lower	mean_upper	numerator
Alpha	A	2016-2021	grades	497	0.2857143	0.0202844	0.2477598	0.3269560	142
Alpha	B	2016-2021	grades	497	0.2555332	0.0195842	0.2191639	0.2956526	127
Beta	A	2016-2021	grades	491	0.2484725	0.0195215	0.2123012	0.2885490	122
Beta	B	2016-2021	grades	491	0.2566191	0.0197311	0.2199795	0.2970375	126

These results show that the Alpha School has a higher proportion of A's (0.286 vs 0.248) and a similar proportion of B's (0.256 vs 0.257).

Wait! We forgot the survey weights!

A colleague just reminded you that you forgot to add the survey weights. Let's add weights and survey set the data.

# create weights
set.seed(98121)
mydt[, mywghts := sample(50:1300, 2000, replace = T)]

# survey set the data
# This uses the dtsurvey package
# similar logic applies for survey and srvyr package
mydt[, `_id` := NULL] # remove id to make things play nice
mysvy <-dtsurvey::dtsurvey(data.table(mydt), weight = 'mywghts')

Using the survey information, again calculate the proportion of A's and B's in the Alpha and Beta schools

grades.distribution2 <- calc(
  ph.data = mysvy, 
  school %in% c("Alpha", "Beta"), 
  what = "grades", 
  by = "school", 
  time_var = "year", 
  metrics = c("numerator", "denominator", "mean"), proportion = FALSE)

print(grades.distribution2[level %in% c("A", "B")])

school	level	year	variable	denominator	mean	mean_se	mean_lower	mean_upper	numerator
Alpha	A	2016-2021	grades	497	0.2819952	0.0228060	0.2371869	0.3268036	142
Alpha	B	2016-2021	grades	497	0.2460211	0.0215780	0.2036254	0.2884167	127
Beta	A	2016-2021	grades	491	0.2361366	0.0214262	0.1940380	0.2782351	122
Beta	B	2016-2021	grades	491	0.2685865	0.0229556	0.2234831	0.3136900	126

You'll note that using the survey design caused small changes in the results. For example, the proportion of A's in the Alpha school changed from 0.286 to 0.282.

Example analyses with resampled data/multiple imputation

There may be a few instances where a variable you want to compute some metrics using (multiple) imputed data. calc can work with these sorts of situations.

The following code creates a fake survey of 100 people and then creates 10 different iterations of the survey with differing values of v (to represent the imputed variable).

# Create some fake data
library('data.table')
base = data.table::data.table(id = 1:100, bin = sample(0:1, 100, T), psu = sample(1:3, 100, T), weight = runif(100, 0,2))
base = dtsurvey::dtsurvey(base, psu = 'psu', weight = 'weight')
midat = lapply(1:10, function(i){
  r = data.table::copy(base)[, v := sample(1:3, 100, T)]
})

To use the imputation version of calc, the ph.data argument must be an imputationList object. You can turn your list of plausible datasets into an imputationList via the mitools::imputationList function. You may need to install the mitools package for this to work.

midat = mitools::imputationList(midat)
class(midat)

## [1] "imputationList"

Now that midat is the right type/class of object, you can pass it to the calc function as you would any other dataset. The slight exception is that the metrics argument must be explicitly specified. For fun, a version of the results from one imputation is also computed.

Results using MI combining methods

withmi = calc(ph.data =midat, what = 'bin', by = 'v', metrics = 'mean', proportion = T)
print(withmi)

v	variable	mean	mean_se	mean_lower	mean_upper
1	bin	0.5051381	0.1390337	0.2071490	0.8031272
2	bin	0.5369530	0.1380729	0.2437939	0.8301120
3	bin	0.5218571	0.1491685	0.2154041	0.8283100

Results of one iteration

nomi = calc(ph.data = midat$imputations[[1]], what = 'bin', by = 'v')
print(nomi)

v	variable	mean	mean_se	mean_lower	mean_upper
1	bin	0.4273156	0.0143969	0.3653709	0.4892602
2	bin	0.6505122	0.0025634	0.6394829	0.6615416
3	bin	0.4560479	0.0659792	0.1721622	0.7399336

Some additional notes:

If there is no variation (or imputation) in the what or by variables, you should get the same result as doing just one iteration.
The proportion argument is ignored when ph.data is an imputationList object. This is because the proportion argument governs what methods are used to create confidence intervals – and the special proportion methods are not currently implemented for imputed data estimates. As such, you may find confidence intervals that are outside the normal [0-1] constraints.

Knowing is half the battle ... but only half

You've been introduced to calc(), but you'll only become competent at using it by using it. Try it out with your favorite dataset. Play with it. Try to break it and, if you're successful, submit a GitHub issue. Enjoy!

-- `Updated April 16, 2025 (rads v1.3.5)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

calc

Introduction

`calc()` arguments

Example vital statistics analyses

Example survey analyses

Example analyses with non-standard data

Example analyses with resampled data/multiple imputation

Knowing is half the battle ... but only half

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Clone this wiki locally

calc

Introduction

calc() arguments

Example vital statistics analyses

Example survey analyses

Example analyses with non-standard data

Example analyses with resampled data/multiple imputation

Knowing is half the battle ... but only half

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Clone this wiki locally

`calc()` arguments