Creates an object of class c("annex", "data.frame") required to calculate the statistics.

Usage

annex(formula, data, tz, duplicate.action = NULL, meta = NULL, verbose = FALSE)

# S3 method for annex
head(x, ...)

# S3 method for annex
tail(x, ...)

# S3 method for annex
subset(x, ...)

# S3 method for annex_stats
head(x, ...)

# S3 method for annex_stats
tail(x, ...)

# S3 method for annex_stats
subset(x, ...)

Arguments

formula

the formula to specify how the data set is set up. See 'Details' for more information.

data

data.frame containing the observations/data.

tz

character, time zone definition (e.g., "Europe/Berlin" or "UTC"); required. OlsonNames() returns the names of the available time zones. The correct time zone is needed to calculate month and time of day properly.

duplicate.action

NULL or a function which returns a single numeric value. Used to handle possible duplicates, see section 'Duplicates'.

meta

NULL (default) or a list with information about study, home, and room (see 'Details').

verbose

logical, defaults to FALSE. Can be set to TRUE to increase verbosity.

x

object of class annex.

...

arguments to be passed to or from other methods.

Details

If the data set provided on data only contains data from one study, home, and room, the formula has two parts, e.g.:

  • T + RH ~ datetime

The left hand side of the formula (left of ~) specifies the names of the variables containing the observations to be processed; the right hand side is the name of the variable containing the time information (must be of class POSIXt). In this case, the meta argument is required to provide information about the study, home, and room.
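Since the time variable must inherit from POSIXt, raw time stamps usually need to be converted first. A minimal base R sketch, assuming the time stamps arrive as character strings in a hypothetical column `timestamp` (the data frame `raw` and its values are made up for illustration):

```r
# Hypothetical raw input: time stamps stored as character strings
raw <- data.frame(timestamp = c("2022-01-01 00:00", "2022-01-01 01:00"),
                  T  = c(20.1, 19.8),
                  RH = c(45.0, 47.5))

tz <- "Europe/Berlin"

# Check that the time zone name is known on this system
if (!tz %in% OlsonNames()) warning("time zone '", tz, "' not found in OlsonNames()")

# Convert to POSIXct so the column can be used on the right hand side of the formula
raw$datetime <- as.POSIXct(raw$timestamp, format = "%Y-%m-%d %H:%M", tz = tz)

# POSIXct inherits from POSIXt, as required
stopifnot(inherits(raw$datetime, "POSIXt"))
```

The resulting `raw` could then be passed on with a formula such as T + RH ~ datetime, together with the meta argument.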

If the grouping information is already in the data set, the analysis can be performed depending on the group information, typically:

  • T + RH ~ datetime | study + home + room

The latter allows processing observations from different studies, homes, and/or rooms all in one go.
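As an illustration of the grouped form, the following sketch stacks observations from two homes into one data.frame so that the grouping columns can be used in the formula. The home labels "ex1" and "ex2" and all values are made up; the call to annex is only executed if the package is installed.

```r
# Common time axis for both (hypothetical) homes
time <- as.POSIXct("2022-01-01 00:00", tz = "Europe/Berlin") + 0:2 * 3600

home1 <- data.frame(datetime = time, T = c(19.5, 19.2, 19.0), RH = c(48, 49, 50),
                    study = "example", home = "ex1", room = "BED")
home2 <- data.frame(datetime = time, T = c(21.0, 21.4, 21.1), RH = c(42, 41, 43),
                    study = "example", home = "ex2", room = "BED")

# Stack both data sets; study/home/room identify the group of each row
both <- rbind(home1, home2)

# Grouped formula: no `meta` argument needed
if (requireNamespace("annex", quietly = TRUE)) {
    res <- annex::annex(T + RH ~ datetime | study + home + room,
                        data = both, tz = "Europe/Berlin")
}
```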

Duplicates

Duplicated records can distort the statistics and should be handled properly. For each unique combination of study, home, and room, only one observation (row) should exist for a specific date and time.

As there is no general way to deal with such duplicates, the function annex (as well as annex_prepare) by default issues a warning if such duplicates exist (duplicate.action = NULL; default argument).

However, the package allows the user to provide a custom duplicate.action function, e.g., mean, min, max, ... This function must return one single numeric value (or an NA) when applied to a vector. If a function is provided, the annex function does the following:

  • Checks if there are any duplicates. If not, no changes are made. Else ...

  • Checks if the function is valid (returns a single numeric value or NA). If not, an error is thrown.

  • Takes the measurements of each duplicate; if all values are missing, NA is returned. Else the user's duplicate.action is applied to all remaining non-missing values. I.e., if duplicate.action = mean, the average of all non-missing values is used.
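The steps above can be sketched in base R. This only illustrates the described logic and is not the package's internal implementation; the helper `resolve_duplicates` and the toy data are hypothetical.

```r
# Hypothetical helper mirroring the described steps (not the annex internals)
resolve_duplicates <- function(x, duplicate.action) {
    # Validity check on a small test vector: single numeric value or NA expected
    test <- duplicate.action(c(1, 2, 3))
    if (length(test) != 1L || !(is.numeric(test) || is.na(test)))
        stop("'duplicate.action' must return a single numeric value or NA")
    x <- x[!is.na(x)]                       # keep non-missing values only
    if (length(x) == 0L) return(NA_real_)   # all values missing: return NA
    duplicate.action(x)                     # apply to the remaining values
}

# Toy data: the first two records share study, home, room, and datetime
d <- data.frame(
    datetime = as.POSIXct(c("2022-01-01 00:00", "2022-01-01 00:00",
                            "2022-01-01 01:00"), tz = "UTC"),
    study = "example", home = "ex", room = "BED",
    T  = c(20, NA, 21),
    RH = c(50, 60, 40))

keys <- c("datetime", "study", "home", "room")

# Step 1: check whether any duplicates exist
has_duplicates <- anyDuplicated(d[, keys]) > 0

# Steps 2 and 3: validate the function and collapse each group of duplicates
if (has_duplicates) {
    d <- aggregate(d[, c("T", "RH")], by = d[, keys],
                   FUN = resolve_duplicates, duplicate.action = mean)
}
```

With duplicate.action = mean, the duplicated time stamp collapses to one row: T is taken from the single non-missing value, and RH is the average of both values.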

See also

annex_prepare, annex_stats

Author

Reto Stauffer

Examples

# Create artificial data set for testing; typically this information is read
# from a file or any other data connection.
data <- data.frame(datetime = as.POSIXct("2022-01-01 00:00", tz = "Europe/Berlin") + -10:10 * 3600,
                   T  = round(rnorm(21, mean = 20, sd = 2), 2),
                   RH = round(runif(21, 40, 100), 2))

res1 <- annex(T + RH ~ datetime, data = data,
              meta = list(study = "example", home = "ex", room = "BED"),
              tz = "Europe/Berlin")
head(res1, n = 3)
#>              datetime   study room home year month   tod     T    RH
#> 1 2021-12-31 14:00:00 example  BED   ex 2021    12 07-23 17.20 78.50
#> 2 2021-12-31 15:00:00 example  BED   ex 2021    12 07-23 20.51 79.62
#> 3 2021-12-31 16:00:00 example  BED   ex 2021    12 07-23 15.13 45.76

# The meta information can also be added to `data`, which removes the need
# to specify the `meta` argument and allows mixing data from different
# studies and rooms. Appending study, room, and home to `data`:
data <- transform(data,
                  study = "example",
                  home  = "ex",
                  room  = "BED")
head(data)
#>              datetime     T    RH   study home room
#> 1 2021-12-31 14:00:00 17.20 78.50 example   ex  BED
#> 2 2021-12-31 15:00:00 20.51 79.62 example   ex  BED
#> 3 2021-12-31 16:00:00 15.13 45.76 example   ex  BED
#> 4 2021-12-31 17:00:00 19.99 85.94 example   ex  BED
#> 5 2021-12-31 18:00:00 21.24 86.18 example   ex  BED
#> 6 2021-12-31 19:00:00 22.30 99.44 example   ex  BED
res2 <- annex(T + RH ~ datetime | study + home + room,
              data = data, tz = "Europe/Berlin")
head(res2, n = 3)
#>              datetime   study home room year month   tod     T    RH
#> 1 2021-12-31 14:00:00 example   ex  BED 2021    12 07-23 17.20 78.50
#> 2 2021-12-31 15:00:00 example   ex  BED 2021    12 07-23 20.51 79.62
#> 3 2021-12-31 16:00:00 example   ex  BED 2021    12 07-23 15.13 45.76