Out-of-sample state decoding using a frozen HMM — run_out_of_sample

This function applies the parameters of a previously fitted hidden Markov model (an epiquest_hmm object) to new data. The HMM parameters (transition probabilities, emission distributions, and initial state distribution) are held fixed, and the hidden states for the new data are decoded without re-estimating the model.

Usage

run_out_of_sample_decoding(obs_data_new, hmm_frozen, seasonal = FALSE)

Arguments

obs_data_new: A data.frame containing the new observations. Must contain an index column and the same response columns (e.g., rate or num/denom) that were used to fit hmm_frozen. If seasonal = TRUE, it must also contain a season column.
hmm_frozen: An object of class epiquest_hmm. This is the fitted model object containing the parameters to be applied to the new data.
seasonal: A logical. If TRUE, the function respects season boundaries in obs_data_new to prevent transitions between different seasons.

Value

An object of class epiquest_decode. This is a list containing:

states: The input obs_data_new with added columns for the predicted state and method used (see Details below).
probs: For methods 'local' and 'filtering', the posterior probabilities for each of the states.

Details

The function initializes a new depmix model using the structure of obs_data_new and then uses depmixS4::setpars to hard-code the parameters from hmm_frozen.
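
The freezing step described above can be sketched directly with depmixS4. This is a minimal illustration, not the package's internal code: the simulated data frames train and new_obs, the single Gaussian response rate, and the two-state structure are all assumptions made for the example.

```r
library(depmixS4)

set.seed(1)
# Hypothetical training series: a low regime followed by a high regime
train <- data.frame(rate = c(rnorm(60, mean = 1, sd = 0.5),
                             rnorm(40, mean = 5, sd = 0.5)))
# Hypothetical new observations to classify
new_obs <- data.frame(rate = c(rnorm(5, mean = 1, sd = 0.5),
                               rnorm(5, mean = 5, sd = 0.5)))

# 1. Fit the historical model once
fitted_mod <- fit(depmix(rate ~ 1, data = train, nstates = 2,
                         family = gaussian()),
                  verbose = FALSE)

# 2. Build an unfitted model with the same structure on the new data
frozen_mod <- depmix(rate ~ 1, data = new_obs, nstates = 2,
                     family = gaussian())

# 3. Copy the fitted parameters across; fit() is never called on frozen_mod
frozen_mod <- setpars(frozen_mod, getpars(fitted_mod))

# 4. Run the forward-backward pass under the frozen parameters;
#    gamma holds the smoothed state probabilities, one row per new observation
fb <- forwardbackward(frozen_mod)
smoothed <- fb$gamma
```

Because step 3 replaces every parameter and no estimation follows, the new observations affect only the decoded states, never the model itself.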

This is particularly useful for prospective surveillance, where you want to classify new weekly data using a model trained on historical seasons without the new data points influencing the previously trained model.

The states can be decoded according to different methods:

'global': Uses the Viterbi algorithm to find the single most likely sequence of states across the entire time series. This approach outputs a single state for each data point. There are no associated probabilities.
'local': Assigns, at each time point, the state with the highest smoothed posterior probability, i.e. the probability of each state given the entire observed series.
'filtering': Similar to 'local', but performs real-time decoding, where the state at time t is determined only by observations from time 1 to t.
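
To make the difference between the three methods concrete, the sketch below decodes a short series with a toy two-state Gaussian HMM in base R. The parameters and observations are hand-picked for illustration, and the code is a textbook forward/backward/Viterbi implementation, not the routine the function uses internally.

```r
# Toy two-state HMM with hypothetical, hand-picked parameters
init  <- c(0.5, 0.5)                                  # initial state distribution
trans <- matrix(c(0.9, 0.1,
                  0.1, 0.9), nrow = 2, byrow = TRUE)  # trans[i, j] = P(state j next | state i now)
means <- c(0, 3)                                      # Gaussian emission means per state
sds   <- c(1, 1)                                      # Gaussian emission sds per state
y     <- c(0.1, -0.3, 1.6, 2.9, 3.2, 0.2)             # observations

n <- length(y)
k <- length(init)
# dens[i, s] = density of y[i] under state s
dens <- sapply(seq_len(k), function(s) dnorm(y, means[s], sds[s]))

# 'filtering': normalised forward recursion, row i uses y[1..i] only
filt <- matrix(0, n, k)
a <- init * dens[1, ]
filt[1, ] <- a / sum(a)
for (i in 2:n) {
  a <- as.vector(filt[i - 1, ] %*% trans) * dens[i, ]
  filt[i, ] <- a / sum(a)
}

# 'local': backward pass that updates the filtered probabilities with later data
smooth <- filt
for (i in (n - 1):1) {
  pred <- as.vector(filt[i, ] %*% trans)   # P(state at i + 1 | y[1..i])
  smooth[i, ] <- filt[i, ] * as.vector(trans %*% (smooth[i + 1, ] / pred))
}

# 'global': Viterbi in log space, the single most likely state sequence
ldelta <- matrix(-Inf, n, k)
back   <- matrix(0L, n, k)
ldelta[1, ] <- log(init) + log(dens[1, ])
for (i in 2:n) {
  for (j in seq_len(k)) {
    cand <- ldelta[i - 1, ] + log(trans[, j])
    back[i, j]   <- which.max(cand)
    ldelta[i, j] <- max(cand) + log(dens[i, j])
  }
}
path <- integer(n)
path[n] <- which.max(ldelta[n, ])
for (i in (n - 1):1) path[i] <- back[i + 1, path[i + 1]]

decoded <- data.frame(
  y         = y,
  global    = path,
  local     = apply(smooth, 1, which.max),
  filtering = apply(filt, 1, which.max)
)
```

Only filt and smooth carry probabilities; path is a bare sequence of state labels, which matches the note that 'global' has no associated probabilities. At the last time point the filtered and smoothed probabilities coincide, since no later data exist.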

The 'local' and 'filtering' methods each have advantages and disadvantages. Filtering assigns the state for a given week without using data from any later week, so it does not use all available information. In exchange, when filtering is recomputed every week as new data arrive, the state assignments for earlier weeks are stable. Local state assignments can change: they are only stable once the data for the following week are available.
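
The stability trade-off can be checked on a toy example. The sketch below, in base R with hand-picked parameters and observations (all hypothetical), decodes five weeks of data, then decodes again after a sixth week arrives, and compares the probabilities for week 5 across the two runs.

```r
# Filtered probabilities: row i depends on y[1..i] only
filter_probs <- function(y, init, trans, means, sds) {
  dens <- sapply(seq_along(init), function(s) dnorm(y, means[s], sds[s]))
  dens <- matrix(dens, nrow = length(y))   # keep matrix shape for short series
  filt <- matrix(0, length(y), length(init))
  a <- init * dens[1, ]
  filt[1, ] <- a / sum(a)
  for (i in seq_along(y)[-1]) {
    a <- as.vector(filt[i - 1, ] %*% trans) * dens[i, ]
    filt[i, ] <- a / sum(a)
  }
  filt
}

# Smoothed ('local') probabilities: backward pass over the filtered ones
smooth_probs <- function(filt, trans) {
  smooth <- filt
  for (i in rev(seq_len(nrow(filt) - 1))) {
    pred <- as.vector(filt[i, ] %*% trans)
    smooth[i, ] <- filt[i, ] * as.vector(trans %*% (smooth[i + 1, ] / pred))
  }
  smooth
}

# Hypothetical frozen parameters for a two-state Gaussian HMM
init  <- c(0.5, 0.5)
trans <- matrix(c(0.9, 0.1,
                  0.1, 0.9), nrow = 2, byrow = TRUE)
means <- c(0, 3)
sds   <- c(1, 1)

weeks_1_to_5 <- c(0.1, -0.3, 0.2, 0.4, 1.6)   # week 5 is ambiguous
weeks_1_to_6 <- c(weeks_1_to_5, 3.1)          # week 6 clearly favours state 2

filt_5   <- filter_probs(weeks_1_to_5, init, trans, means, sds)
filt_6   <- filter_probs(weeks_1_to_6, init, trans, means, sds)
smooth_5 <- smooth_probs(filt_5, trans)
smooth_6 <- smooth_probs(filt_6, trans)

# State probabilities for week 5, before and after week 6 arrives
week_5 <- rbind(
  filtering_before = filt_5[5, ],
  filtering_after  = filt_6[5, ],
  local_before     = smooth_5[5, ],
  local_after      = smooth_6[5, ]
)
```

In week_5 the two filtering rows are identical, while the local rows differ: once week 6 is observed, the smoothed probabilities for week 5 shift toward state 2.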

The plots generated by create_hmm_plots() and create_loop_plots() all use 'local' state assignments.