Creates an object of class c("annex", "data.frame") required to calculate the statistics.
Usage
annex(formula, data, tz, duplicate.action = NULL, meta = NULL, verbose = FALSE)
# S3 method for annex
head(x, ...)
# S3 method for annex
tail(x, ...)
# S3 method for annex
subset(x, ...)
# S3 method for annex_stats
head(x, ...)
# S3 method for annex_stats
tail(x, ...)
# S3 method for annex_stats
subset(x, ...)
Arguments
- formula
the formula specifying how the data set is set up. See 'Details' for more information.
- data
data.frame containing the observations/data.
- tz
character, time zone definition (e.g., "Europe/Berlin" or "UTC"); required. OlsonNames() returns the names of the available time zones. The correct time zone is important to properly calculate month and time of day.
- duplicate.action
NULL or a function which returns a single numeric value. Used to handle possible duplicates, see section 'Duplicates'.
- meta
NULL (default) or a list with information about study, home, and room (see 'Details').
- verbose
logical, defaults to FALSE. Can be set to TRUE to increase verbosity.
- x
object of class annex.
- ...
arguments to be passed to or from other methods.
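Why the time zone matters for month and time of day can be illustrated with base R alone. The following is a minimal sketch that does not use annex; the time stamp is a made-up example value.

```r
# One instant in time, stored in UTC (hypothetical example value).
x <- as.POSIXct("2021-12-31 23:30:00", tz = "UTC")

# The same instant expressed in two different time zones.
month_utc    <- format(x, "%m", tz = "UTC")
month_berlin <- format(x, "%m", tz = "Europe/Berlin")
hour_utc     <- format(x, "%H", tz = "UTC")
hour_berlin  <- format(x, "%H", tz = "Europe/Berlin")

print(c(month_utc = month_utc, month_berlin = month_berlin))
#> month_utc month_berlin
#>      "12"         "01"
print(c(hour_utc = hour_utc, hour_berlin = hour_berlin))
#> hour_utc hour_berlin
#>     "23"        "00"

# A time zone name can be checked against the names known to the system.
"Europe/Berlin" %in% OlsonNames()
```

With the wrong time zone, this observation would be counted towards December and late evening instead of January and night.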
Details
If the data set provided via data contains data from only one study, home, and room, the formula has two parts, e.g.:
T + RH ~ datetime
The left-hand side of the formula (left of ~) specifies the names of the variables of the observations to be processed; the right-hand side is the name of the variable containing the time information (must be of class POSIXt). In this case, the meta argument is required to provide information about the study, home, and room.
If the grouping information is already in the data set, the analysis can be performed by group, typically:
T + RH ~ datetime | study + home + room
The latter allows observations from different studies, homes, and/or rooms to be processed in one go.
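To make the two formula layouts concrete, the sketch below takes both apart with base R. It only shows which variable names sit in which part of the formula; it is not the parsing code used by annex, and the helper name split_annex_formula is hypothetical.

```r
# Hypothetical helper (not part of annex): split a formula of the form
#   variables ~ datetime              or
#   variables ~ datetime | groups
# into its parts, using base R only.
split_annex_formula <- function(f) {
  stopifnot(inherits(f, "formula"), length(f) == 3L)
  lhs <- f[[2]]
  rhs <- f[[3]]
  # The right-hand side is a call to `|` only if groups are specified.
  has_groups <- is.call(rhs) && identical(rhs[[1]], as.name("|"))
  list(
    variables = all.vars(lhs),
    datetime  = all.vars(if (has_groups) rhs[[2]] else rhs),
    groups    = if (has_groups) all.vars(rhs[[3]]) else character(0)
  )
}

# Two-part formula: no grouping, so `meta` would be needed.
str(split_annex_formula(T + RH ~ datetime))

# Formula with grouping information taken from the data set.
str(split_annex_formula(T + RH ~ datetime | study + home + room))
```

In the first case the groups element is empty, which corresponds to the situation where study, home, and room have to be supplied via meta.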
Duplicates
Duplicated records can distort the statistics and should be handled properly.
For each unique combination of study, home, and room, only one observation (row) should exist for a specific date and time.
As there is no general way to deal with such duplicates, the function annex (as well as annex_prepare) by default throws a warning if such duplicates exist (duplicate.action = NULL; default argument).
However, the package allows the user to provide a custom duplicate.action function, e.g., mean, min, max, ... This function must return one single numeric value (or an NA) when applied to a vector. If a function is provided, the annex function does the following:
Checks whether there are any duplicates. If not, no changes are made. Otherwise ...
Checks whether the function is valid (returns a single numeric value or NA). If not, an error is thrown.
Takes the measurements of each duplicate; if all values are missing, NA is returned. Otherwise the user's duplicate.action is applied to all remaining non-missing values. E.g., if duplicate.action = mean, the average of all non-missing values is used.
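The rule described above (drop missing values, then apply duplicate.action, or return NA if nothing is left) can be mimicked with base R. This is a sketch of the described behaviour, not the package's implementation; the helper resolve_duplicates and the toy data are hypothetical.

```r
# Hypothetical helper (not part of annex): collapse the measurements
# belonging to one duplicated time stamp into a single value.
resolve_duplicates <- function(x, duplicate.action = mean) {
  x <- x[!is.na(x)]                       # keep non-missing values only
  if (length(x) == 0L) return(NA_real_)   # all values missing
  res <- duplicate.action(x)
  # The function must return one single numeric value (or NA).
  stopifnot(length(res) == 1L, is.numeric(res) || is.na(res))
  res
}

# Toy data: the first time stamp occurs twice, the second three times
# with missing values only.
dat <- data.frame(
  datetime = as.POSIXct(c("2022-01-01 00:00", "2022-01-01 00:00",
                          "2022-01-01 01:00", "2022-01-01 01:00",
                          "2022-01-01 01:00"), tz = "UTC"),
  T = c(20, 22, NA, NA, NA)
)

# One row per unique time stamp; na.action = na.pass keeps rows with
# missing values so that the helper gets to see them.
agg <- aggregate(T ~ datetime, data = dat, FUN = resolve_duplicates,
                 na.action = na.pass)
print(agg)
```

The first time stamp is reduced to the mean of 20 and 22, while the second one stays NA because no non-missing value is left. Passing a different function, e.g. resolve_duplicates(c(20, 22, NA), max), changes only how the remaining values are combined.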
Examples
# Create artificial data set for testing; typically this information is read
# from a file or any other data connection.
data <- data.frame(datetime = as.POSIXct("2022-01-01 00:00", tz = "Europe/Berlin") + -10:10 * 3600,
                   T = round(rnorm(21, mean = 20, sd = 2), 2),
                   RH = round(runif(21, 40, 100), 2))
res1 <- annex(T + RH ~ datetime, data = data,
              meta = list(study = "example", home = "ex", room = "BED"),
              tz = "Europe/Berlin")
head(res1, n = 3)
#> datetime study room home year month tod T RH
#> 1 2021-12-31 14:00:00 example BED ex 2021 12 07-23 17.20 78.50
#> 2 2021-12-31 15:00:00 example BED ex 2021 12 07-23 20.51 79.62
#> 3 2021-12-31 16:00:00 example BED ex 2021 12 07-23 15.13 45.76
# The meta information can also be added to `data`, which removes the need
# to specify the `meta` argument and allows mixing data from different
# studies and rooms. Appending study, room, and home to `data`:
data <- transform(data,
                  study = "example",
                  home = "ex",
                  room = "BED")
head(data)
#> datetime T RH study home room
#> 1 2021-12-31 14:00:00 17.20 78.50 example ex BED
#> 2 2021-12-31 15:00:00 20.51 79.62 example ex BED
#> 3 2021-12-31 16:00:00 15.13 45.76 example ex BED
#> 4 2021-12-31 17:00:00 19.99 85.94 example ex BED
#> 5 2021-12-31 18:00:00 21.24 86.18 example ex BED
#> 6 2021-12-31 19:00:00 22.30 99.44 example ex BED
res2 <- annex(T + RH ~ datetime | study + home + room,
              data = data, tz = "Europe/Berlin")
head(res2, n = 3)
#> datetime study home room year month tod T RH
#> 1 2021-12-31 14:00:00 example ex BED 2021 12 07-23 17.20 78.50
#> 2 2021-12-31 15:00:00 example ex BED 2021 12 07-23 20.51 79.62
#> 3 2021-12-31 16:00:00 example ex BED 2021 12 07-23 15.13 45.76