Last week I was a little brisk with my treatment of the smoothed regime probabilities in a Markov Regime Switching Regression model.

The basic idea with smoothing is this: the Markov Regime Switching model treats the regime as unobservable and uses the data we do observe to find likelihood maximizing parameter values. Two things we are generally interested in when estimating a Markov Regime Switching Model are:

  1. The likelihood maximizing regime-dependent parameters, and
  2. Making inferences about the prevailing regime at particular points in time

Inferring the regime at time t is where the smoother comes in. To conduct this inference we want to use all the data available to us, and the optimized parameters of model, to say which regime was most likely at each time period.

The basic conceptual difference: the filter says, “given all information up to time t-1 what regime do we think is prevailing at time t. The smoother says, “given all the information available up to time T, what regime do we think was prevailing in time T-k.

The smoothing algorithm is due to Kim (1994) and works like this:

Consider the joint probability that s_{t}=j and s_{t+1}=k based on full information:

P(s_{t}=j,s_{t+1}=k|\Omega_{T})=\frac{P(s_{t+1}=k|\Omega_{T}) P(s_{t}=j,s_{t+1}=k|\Omega_{t})}{P(s{t+1}=k|\Omega{t})}

=\frac{P(s_{t+1}=k|\Omega_{T})P(s_{t}=j|\Omega_{t})P(s_{t+1}=k|s_{t}=j,\Omega_{t})}{P(s_{t+1}=k|\Omega{t})}

The mechanics of the smoother work like this:

  • From the last iteration of the smoother we get the updated regime probability P(s_{T}|\Omega_{T}). In our earlier notation this was \xi_{T|T}
  • We also have the updated probability from the filter loop at T-1, P(s_{T-1}|\Omega_{T-1}), which we called \xi_{T-1|T-1}
  • We also have the transition probabilities P(s_{T}=1|s_{T-1}=1) and P(s_{T}=2|s{T-1}=2), which are the model parameters p_{11}, p_{22}.

Given these quantities we can calculate the smoothed regime probability,

P(s_{T-1}=0,s_{T}=0|\Omega_{T})=\frac{P(s_{T}=0|\Omega_{T})P(s_{T-1}=0|\Omega_{T-1})P(s_{T}=0|s_{T-1}=0)}{P(s_{T}=0|\Omega_{T-1})}

and,

P(s_{T-1}=0,s_{T}=1|\Omega_{T})=\frac{P(s_{T}=1|\Omega_{T})P(s_{T-1}=0|\Omega_{T-1})P(s_{T}=1|s_{T-1}=0)}{P(s_{T}=1|\Omega_{T-1})}

now the smoothed probability that the prevailing regime at time t=T-1 was s=0 is,

P(S_{T-1}=0|\Omega_{T})= P(s_{T-1}=0,s_{T}=0) + P(s_{T-1}=0,s_{T}=1).

It is reasonably straightforward then to get,

P(S_{T-1}=1|\Omega_{T})= P(s_{T-1}=1,s_{T}=1) + P(s_{T-1}=1,s_{T}=0)

and stepping backwards in time we can work our way back to,

P(S_{1}=1|\Omega_{T})= P(s_{1}=1,s_{2}=1) + P(s_{1}=1,s_{2}=0)

The code to get this done is integrated with the Hamilton Filter routine in my GitHub repository called Cool Time Series Stuff but, just for the sake of completeness, here is the chunk that specifically runs the smoother for the Markov Regime Switching model where,

y_{t}=\alpha_{1} + \alpha_{3}x_{t} if regime 1 prevails at time t,
y_{t}=\alpha{2} + \alpha_{4}x_{t} if regime 2 prevails at time t.

################################################################################
################################################################################
################################################################################
#The Hamilton Smoother:  the smoothed probabilities are obtained by:

#1.  using the Hamilton Filter to get maximum likelihood parameter estimates
#2.  using the maximum likelihood parameter estimates to run the Filter again...
#2a  but in the smoothing step we work from T-1 back to t=1

#==============================================================
#==============================================================
#to get the smoothed probabilities we run the filter again using
# the maximum likelihood values but we need to save both the
# forecasted state probabilities (St given info in t-1) and
# the updated state probabilities (St updated to reflect info in time t)
#=================================================================
#=================================================================
#get the maximum likelihood estimates

ham.smooth<-function(theta,y,x){
  alpha1 <- theta[1]
  alpha2 <- theta[2]
  alpha3 <- theta[3]
  alpha4 <- theta[4]
  alpha5 <- theta[5]
  alpha6 <- theta[6]

  #  p11 <- 1/(1+exp(-theta[7]))
  #  p22 <- 1/(1+exp(-theta[8]))

  p11 <- theta[7]
  p22 <- theta[8]

  #in order to make inference about what state we are in in period t we need the conditional
  # densities given the information set through t-1
  f1 <- (1/(alpha5*sqrt(2*pi)))*exp(-((y-alpha1-(alpha3*x))^2)/(2*(alpha5^2)))
  f2 <- (1/(alpha6*sqrt(2*pi)))*exp(-((y-alpha2-(alpha4*x))^2)/(2*(alpha6^2)))
  f <- matrix(cbind(f1,f2),nc=2)

  #S.forecast is the state value looking forward conditional on info up to time t
  #S.inf is the updated state value
  S.forecast <- rep(0,2*length(y))
  S.forecast <- matrix(S.forecast,nrow=(length(y)),ncol=2)

  S.inf <- S.forecast
  o.v <- c(1,1)

  P<-matrix(c(p11,(1-p11),(1-p22),p22),nr=2,nc=2)
  model.lik <- rep(0,length(y))

  S.inf[1,] <- (c(p11,p22)*f[1,])/(o.v %*% (c(p11,p22)*f[1,]))

  for(t in 1:(length(y)-1)){
    #in time t we first make our forecast of the state in t+1 based on the
    # data up to time t, then we update that forecast based on the data
    # available in t+1
    S.forecast[t+1,] <- P%*%S.inf[t,]
    S.inf[t+1,] <- (S.forecast[t+1,]*f[t+1,])/(S.forecast[t+1,] %*% f[t+1,])
    model.lik[t+1] <- o.v%*%(S.forecast[t+1,]*f[t+1,])
  }

  #the smoother works kind of like running the filter in reverse
  # we start with the last value of S from the filter recursion...
  # this is the value of S in the last period, updated to reflect
  # information available in that period...we then work backwards from
  # there
  T<- length(y)
  P.smooth <- data.frame(s1=rep(0,T),s2=rep(0,T))
  P.smooth[T,] <- S.inf[T,]
  for(is in (T-1):1){

    #for clarity we can think of this problem has having 4 states...
    #note we word these as current period and next period because
    # the smoother works from T back to 1

    #1. probability that we observe S(t)=1, S(t+1)=1
    #2. probability that we observe S(t)=1, S(t+1)=2
    #3. probability that we observe S(t)=2, S(t+1)=1
    #4. probability that we observe S(t)=2, S(t+1)=2

    #for #1 P[S(t)=1,S(t+1)=1|I(t+1)] = {P[S(t+1)=1|I(t+1)]*P[S(t)=1|I(t)]*P[S(t+1)=1|S(t)=1]}/P[S(t+1)=1|I(t)]
    p1 <- (S.inf[is+1,1]*S.inf[is,1]*p11)/S.forecast[is+1,1]

    #for #2 we have
    p2 <- (S.inf[is+1,2]*S.inf[is,1]*(1-p11))/S.forecast[is+1,2]

    #for #3 we have
    p3 <- (S.inf[is+1,1]*S.inf[is,2]*(1-p22))/S.forecast[is+1,1]

    #for #4 we have
    p4 <- (S.inf[is+1,2]*S.inf[is,2]*p22)/S.forecast[is+1,2]

    P.smooth[is,1] <- p1 + p2
    P.smooth[is,2] <- p3 + p4

  }
return(P.smooth)
}
#==========================================================

And as a little bonus today, here’s a few lines of code to plot the time-series along with shaded bars indicating the regimes.

####################################################################################
####################################################################################
####################################################################################
#Now let's reproduce the plots that the MSwM package makes by adding 'recession bars' to
# indicate which regime my Hamilton Filter/Hamilton Smoother says we are in...

#first get maximum likelihood parameter estimates
theta.start <- c(0.1,-0.01,0.4,0.3,0.1,0.2,0.5,0.5)
opt.mams <- optim(theta.start,mrs.est,x=lnoil,y=lnng,hessian=T,control=list(maxit=20000))

#now use these estimates to get smoothed regime probs
ham.smooth <- c(opt.mams$par[1:6],1/(1+exp(-opt.mams$par[7])),1/(1+exp(-opt.mams$par[8]))) # assign each time period to a regime based on the smoothed probabilities...here I  # use a rule saying if the estimated smoothed probability of regime 1 > 0.5 then
# the system is in regime 1...
tmp <- tbl_df(data.frame(ham.smooth(theta=theta.smooth,y=lnng,x=lnoil))) %>%
        mutate(regime=ifelse(s1>0.5,1,2),switch=ifelse(regime!=lag(regime),1,0))
tmp$date <- lng$date 

#create a separate data frame with the starting and ending dates of regime 2
switching <- tmp %>% filter(switch==1) %>%
  mutate(startdate=date,enddate=lead(date)) %>%
  filter(regime==2)
#fix the last date in series
switching$enddate[which(is.na(switching$enddate))] <- as.Date('2016-01-01')
switching <- data.frame(switching)

#plot the Natural Gas time series and overlay bars for the periods estimated to be
# regime 2
ggplot(lng,aes(date,lng)) + geom_line()  +
   geom_rect(data=switching, aes(x=NULL,y=NULL,xmin=startdate,xmax=enddate,
                   ymin=ymin,ymax=ymax),fill="red",alpha=0.2) + guides(fill=FALSE) +
    ggtitle("Smoothed Probability of Being in Regime 2") + theme_bw()

smoProb2

Advertisements