I have created a function that performs a last observation carried forward that takes account of how long ago the last observation was recorded. Code for the function below.
library(tibble)
library(lubridate)
library(zoo)
library(testthat)
locf_window <- function(x, by, date, window, unit = "hours"){
# Perform last observation carried forward (LOCF) based on the time difference
# to the last measured observation. Allow for stratification by identifier
# (e.g.patient ID)
#
# Parameters
# x : character or numeric
# vector of measurements on which to perform LOCF
# by : character or numeric
# vector indicating the group to stratify by
# date : datetime
# vector of times at which the value in 'x' was attempted to be measured.
# window : numeric
# length of the time window
# units : character
# units of the time window
if (is.character(x)) {
placeholder <- "-Infinity"
} else if (is.numeric(x)) {
placeholder <- -Inf
} else {
stop("vector 'x' must either be character or numeric")
}
x <- if_else(is.na(x) & by != lag(by), placeholder, x)
date_measure <- as_datetime(ifelse(!is.na(x), date, NA))
date_measure <- zoo::na.locf(date_measure)
n_measure <- unlist(tapply(!is.na(x), by, cumsum))
date_measure <- as_datetime(ifelse(n_measure != 0, date_measure, NA))
x <- if_else(is.na(x) & !is.na(date_measure) &
time_length(lag(date_measure) %--% date, unit = unit) < window,
zoo::na.locf(x, na.rm = FALSE), x, x)
x[x == placeholder] <- NA
x
}
test_locf <- tribble(
~patid, ~start_date , ~value,
1, ymd_hms("2010-01-05 12:00:00"), 5,
1, ymd_hms("2010-01-05 13:00:00"), NA,
1, ymd_hms("2010-01-05 15:59:59"), NA,
1, ymd_hms("2010-01-05 17:00:00"), NA,
1, ymd_hms("2010-01-05 18:00:00"), 10,
2, ymd_hms("2010-01-05 13:00:00"), NA,
2, ymd_hms("2010-01-05 14:00:00"), NA,
2, ymd_hms("2010-01-05 15:00:00"), 2,
2, ymd_hms("2010-01-05 15:31:01"), NA,
2, ymd_hms("2010-01-06 16:00:00"), NA
) %>% as.data.table()
with(test_locf, {
expect_identical(locf_window(value, start_date, 4, by = patid), c(5, 5, 5, NA, 10, NA, NA, 2, 2, NA))
expect_error(locf_window(list(2), ymd_hms("2010-01-06 16:00:00"), 4, by = patid))
})
The function generally seems to work, but it seems overly complicated. Does anyone have an idea of how to simplify or speed up the function?
Hi all,
I have created a function that performs a last observation carried forward that takes account of how long ago the last observation was recorded. Code for the function below.
The function generally seems to work, but it seems overly complicated. Does anyone have an idea of how to simplify or speed up the function?
P