stage1 {DTVEM} R Documentation

Stage 1 DTVEM

Description

PLEASE USE THE LAG FUNCTION RATHER THAN THIS ONE UNLESS YOU WOULD LIKE TO SPECIFY THIS STAGE MANUALLY. This is the function for stage 1 of DTVEM. It manipulates the data to create a vector that is in both wide and long formats.

Usage

stage1(
  differentialtimevaryingpredictors = differentialtimevaryingpredictors,
  outcome = outcome,
  predictionstart = predictionstart,
  predictionsend = predictionsend,
  predictionsinterval = predictionsinterval,
  namesofnewpredictorvariables = namesofnewpredictorvariables,
  laglongreducedummy = laglongreducedummy,
  gamma = gamma,
  numberofknots = numberofknots,
  k3 = k3,
  controlvariables = controlvariables,
  debug = debug,
  minimumpracticalsignificance = minimumpracticalsignificance,
  minimumpracticalsignificanceneg = minimumpracticalsignificanceneg,
  intermediatestage = FALSE,
  lengthcovariates = lengthcovariates,
  blockdata = blockdata
)

Arguments

differentialtimevaryingpredictors

The variables that will be modeled as varying coefficients of differential time (i.e. the lagged predictors for which you want to know at what time differences they predict the outcome). This must be specified as a character vector, e.g. c("X","Y"). (REQUIRED)

outcome

The outcome variable(s). Specify as outcome="outcomevariablename" for a single variable or outcome=c("outcomevariablename1","outcomevariablename2") for several. (REQUIRED)

predictionstart

The differential time value to start with. The default is NULL, in which case the lowest time difference in the time series is used (set a lower value if you are interested in predictions at a smaller interval), e.g. predictionstart = 1. If this is not specified and you are using a continuous time model, make sure to set blockdata = TRUE so that it will be chosen automatically. (OPTIONAL)

predictionsend

The differential time value to end with, i.e. the largest time difference you want to consider in the study (e.g. if you wanted to allow predictions up to 24 hours and your time intervals were specified in hours, you would set predictionsend = 24). If this is not specified and you are using a continuous time model, make sure to set blockdata = TRUE so that it will be chosen automatically. (OPTIONAL)

predictionsinterval

The interval between the differential time points to predict. With discrete time, set this to 1 to evaluate every discrete interval. If this is not specified and you are using a continuous time model, make sure to set blockdata = TRUE so that it will be chosen automatically. (OPTIONAL)
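
As a rough illustration, predictionstart, predictionsend and predictionsinterval can be read as defining a regular grid of differential times. The sketch below is an assumption about how the three values combine, not the package's internal code; the hourly values and the name lag_grid are hypothetical.

```r
# Hypothetical settings: time measured in hours.
predictionstart <- 1      # smallest time difference of interest
predictionsend <- 24      # largest time difference of interest
predictionsinterval <- 1  # spacing between evaluated time differences

# Assumed reading: a regular grid of differential times (lags).
lag_grid <- seq(from = predictionstart, to = predictionsend,
                by = predictionsinterval)

length(lag_grid)  # number of candidate time differences
head(lag_grid)
```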

namesofnewpredictorvariables

The names of the predictors.

laglongreducedummy

This is the long data output from the data manipulation.
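
To make the wide and long layouts concrete, here is a minimal base R sketch that lags one variable for a single subject. It is an illustration only: the column names (Time, X, X_lag1, timediff, predictor) and the layout are hypothetical and need not match what DTVEM actually produces.

```r
# Hypothetical single-subject series measured at integer times.
dat <- data.frame(Time = 1:5, X = c(2, 4, 1, 3, 5))
max_lag <- 2

# Wide format: one extra column per lag of X.
wide <- dat
for (lag in seq_len(max_lag)) {
  wide[[paste0("X_lag", lag)]] <- c(rep(NA, lag), head(dat$X, -lag))
}

# Long format: one row per (outcome time, time difference) pair.
long <- do.call(rbind, lapply(seq_len(max_lag), function(lag) {
  data.frame(Time = dat$Time,
             outcome = dat$X,
             timediff = lag,
             predictor = wide[[paste0("X_lag", lag)]])
}))

# Drop rows where no earlier observation exists for that lag.
long <- long[!is.na(long$predictor), ]
```

In the long layout every row pairs an outcome with one earlier predictor value and the time difference between them, which is the shape a smooth over differential time would need.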

gamma

This can be used to change the wiggliness of the model. This can be useful if the model is too smooth (i.e. flat). The lower the number, the more wiggly the fit will be (see ?gam in the mgcv package for more information). The default is 1. (OPTIONAL, UNCOMMONLY SPECIFIED)

numberofknots

The number of k selection points used in the model for stage 1 (see ?choose.k in the mgcv package for more details). Note that this applies to the raw data; k2 refers to the k for the re-blocked data. The default is 10. The ideal k is the maximum number of data points per person, but this slows down DTVEM and is often not required. (OPTIONAL)
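
The effect of gamma and k can be seen directly in mgcv. The sketch below fits a plain smooth to simulated data; it is not the DTVEM model, and the data and variable names are hypothetical. It only shows the two mgcv arguments that the gamma and numberofknots descriptions refer to.

```r
library(mgcv)

set.seed(1)
# Simulated data (hypothetical): the outcome depends smoothly on a time difference.
n <- 200
timediff <- runif(n, 0, 10)
y <- sin(timediff) + rnorm(n, sd = 0.3)

# k sets the basis dimension of the smooth;
# gamma > 1 penalizes effective degrees of freedom more heavily.
fit_default  <- gam(y ~ s(timediff, k = 10), gamma = 1)
fit_smoother <- gam(y ~ s(timediff, k = 10), gamma = 5)

# Effective degrees of freedom: lower values mean a smoother (less wiggly) fit.
sum(fit_default$edf)
sum(fit_smoother$edf)
```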

k3

The number of k selection points used in the model for the time spline (NOTE THAT THIS CONTROLS FOR TIME TRENDS OF THE POPULATION) (see ?choose.k in mgcv package for more details). Default is 3. (OPTIONAL)

controlvariables

The variables to be controlled for (not lagged). These are traditional covariates in the analysis and are controlled for in a stationary fashion. Specify as controlvariables = c("list","here"). (OPTIONAL)

debug

This will print more useless information as it goes along. Only useful for troubleshooting problems. (OPTIONAL, UNCOMMONLY SPECIFIED)

minimumpracticalsignificance

This can be used to set a minimum amount to pass on from DTVEM stage 1 to stage 2, and stage 1.5 to stage 2. This can be useful if too many variables come back as significant, but they would not meet your criteria for practical significance. Set this to a numerical value (e.g. minimumpracticalsignificance=.2). (OPTIONAL, UNCOMMONLY SPECIFIED)
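
One way to picture this threshold is as a filter on the stage 1 estimates. The sketch below assumes the rule is "keep an estimate if its magnitude reaches the threshold"; that rule, the estimates and the lag names are all hypothetical and are not taken from the package source.

```r
# Hypothetical stage 1 estimates for four lags.
estimates <- c(lag1 = 0.35, lag2 = 0.05, lag3 = -0.25, lag4 = -0.10)
minimumpracticalsignificance <- 0.2

# Assumed rule: keep estimates whose magnitude reaches the threshold.
keep <- abs(estimates) >= minimumpracticalsignificance
passed_on <- names(estimates)[keep]
passed_on
```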

minimumpracticalsignificanceneg

minimumpracticalsignificanceneg*-1

intermediatestage

Should not be changed unless running from DTVEM function.

lengthcovariates

The number of covariates (+2).

blockdata

IMPORTANT FOR CONTINUOUS-TIME DATA. This re-organizes the raw data into blocks after an exploratory first stage. Default = FALSE. TRUE = automatic re-organization of the data based on the minimum lag number and the time between two lag peaks/valleys. Supplying a numeric value instead will re-block the data into chunks at that specific interval. (REQUIRED FOR CONTINUOUS DATA, OPTIONAL OTHERWISE)
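
The idea of re-blocking at a fixed interval can be sketched in a few lines of base R. This assumes that blocking means rounding each continuous time difference to the nearest multiple of the block width, which may differ from what DTVEM does internally; the values and the names block_width and blocked are hypothetical.

```r
# Hypothetical irregular time differences (e.g. in hours).
timediff <- c(0.4, 0.9, 1.2, 1.8, 2.1, 2.6, 3.3)
block_width <- 1  # hypothetical numeric value, as one might pass to blockdata

# Assumed rule: round each time difference to the nearest multiple of block_width.
blocked <- round(timediff / block_width) * block_width

# Number of observations that fall into each block.
table(blocked)
```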

...

A list of variable names used in the function e.g. "X","Y" (REQUIRED)

Value

The output of this function is: The output from the first stage of DTVEM


[Package DTVEM version 1.0010 Index]