Starting with this post I laid out the motivation for my recently renewed interest in time-series analysis. To recap, my basic motivating question is, “what do I do if I’m modeling a process I suspect is seasonal (monthly landings in a commercial fishery for example) but I also have reason to suspect the nature of seasonality might be changing?”

To be a bit more specific here, I’m dealing with several years’ worth of data on total monthly catches by a group of fishermen. The data exhibit some seasonal signals, with catches tending to be higher in the summer and early fall and lower in the late fall, winter, and early spring. I observe that a particular regulation went into effect in year $k$ and I suspect that regulation may have altered the seasonal pattern of landings. The complicating factor is that various regulations and environmental conditions have been influencing this process since the beginning of my data….so it is not immediately obvious whether I should expect this process to exhibit a discrete and notable change in seasonality at time $t=k$ or whether I should expect the seasonal pattern in the data to exhibit many (possibly subtle) changes.

In my reading of the literature I determined I have two intellectual paradigms to choose from:

1. Smooth transition/regime change models
• Markov Regime Switching Models, à la Hamilton 1989 and many others.
• Smooth transition regressions
2. Change-point detection models including

The conceptual difference between items #1 and #2 hinges on whether you are

1. dealing with a process that moves back and forth between a finite number of states (and the process behaves differently in each state)
• the canonical example here is a macro-economic time-series that may have a different mean and variance depending on whether we are in an expansionary or contractionary phase of the business cycle.
2. dealing with a process that, once it departs from its past behavior at some point in the time-series, never returns to exhibit similar properties as before.
• an example I’ve seen used here is the demand for red meat and poultry among American consumers around the 1970s. Somewhere in this time the relationship between demand for these two products was fundamentally and permanently altered as consumers became more health conscious. The implication being that the time-series here can really be modeled as two completely different series: those observations before 1978 and those observations after 1978.

In the interest of keeping this a little bit concise and digestible I’m going to, for the moment, abstract away from the discussion about how one decides on the right modeling approach and just focus on how to implement these approaches (I’ll circle back to the issue of which approach is the right one in a future post….maybe Time-series VII or VIII).

I’m going to start with the Markov Regime Switching Model because it’s one I’ve worked with before and I’m at least a little familiar with it. My first goal – the one I will focus on in this post – is just to understand the basic mechanics and properties of Markov Regime Switching Models.

#################################################################
#################################################################
#################################################################
First, a picture…because I really like pictures. Goldfeld and Quandt (1973), Hamilton (1989) and a bunch of other really smart dudes observed that certain macro-economic time-series like the unemployment rate, federal funds rate, new housing starts, etc. seem to behave differently depending on whether the economy is in an expansionary or contractionary phase.

To illustrate what this might look like, it’s not hard to imagine a financial asset that is more volatile during market downturns and less volatile during bull markets. Let $Y_{t}$ be the percent return on an asset.

If we were to generate data from the following process:

$Y_{t} \sim N(\mu_{0}=0.1,\sigma_{0}=0.05)$ when we are in regime 0, and

$Y_{t} \sim N(\mu_{1}=0.02,\sigma_{1}=0.1)$ when we are in regime 1

and we define periods 1-10, 16-30, and 40-60 to be in regime 1…we might produce something like this:
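A minimal sketch of how such a series could be generated, using only the Python standard library. The function name, the 60-period length, and the seed are my own assumptions for illustration; the means, standard deviations, and regime-1 periods are the ones listed above.

```python
import random

def simulate_two_regime_returns(n_periods=60, seed=0):
    """Draw Y_t from N(mu_0, sigma_0) or N(mu_1, sigma_1) depending on the regime."""
    # Parameters from the text; sigma is treated as a standard deviation.
    mu = {0: 0.1, 1: 0.02}
    sigma = {0: 0.05, 1: 0.1}
    # Periods (1-indexed, inclusive) assigned to regime 1 in the text.
    regime_1_spans = [(1, 10), (16, 30), (40, 60)]

    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    regimes, returns = [], []
    for t in range(1, n_periods + 1):
        s = 1 if any(lo <= t <= hi for lo, hi in regime_1_spans) else 0
        regimes.append(s)
        returns.append(rng.gauss(mu[s], sigma[s]))
    return regimes, returns

regimes, returns = simulate_two_regime_returns()
```

Note that the regimes here are fixed by hand rather than drawn from a Markov chain, which matches the illustration above but not the full model described later.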

#################################################################
#################################################################
#################################################################

#################################################################
#################################################################
#################################################################
Ok, so now that we’ve seen what a Markov Regime Switching Process might look like, let’s run through the discount version of what a Markov Regime Switching Model is….I can’t really improve on the many tutorials already out there so I’ll try to keep this brief:

I’m cribbing heavily from one of Erik Kole’s examples because I find it very clear. Suppose we have data on financial returns of a particular asset (call this $Y$) and these returns are governed by different processes depending on whether we are in economic expansion or economic recession. This means the distribution of returns at time $t$, $Y_{t}$, depends on the state of the economy at time $t$ (call this $S_{t}$).

Call this ‘state of the economy’ process the regime and let $S_{t}=0$ if we are in recession and $S_{t}=1$ if we are in expansion.

The return $Y_{t}$ is governed by the process,

$Y_{t} \sim N(\mu_{0},\sigma_{0})$ if $S_{t}=0$, and

$Y_{t} \sim N(\mu_{1},\sigma_{1})$ if $S_{t}=1$.

So $Y$ is distributed normal with average returns $\mu_{0}$ and standard deviation $\sigma_{0}$ (variance $\sigma_{0}^{2}$) when we are in regime 0…and likewise for regime 1 with average returns $\mu_{1}$ and standard deviation $\sigma_{1}$.

The kicker here is that we don’t directly observe the regime in time $t$ so we need a model that will allow us to make inferences on the unobserved regimes based on the data we do observe. The inference we want takes the form of two probabilities:

$\xi_{jt}=Pr[S_{t}=j|\Omega_{t};\theta]$

where $j=0,1$ and $\Omega_{t}=(y_{t},y_{t-1},...y_{1},y_{0})$ denotes the information available to us at time $t$ and $\theta$ is a parameter vector.

In this case we are assuming the unobserved state (regime) follows a Markov Process – meaning that all the information we need to predict next period’s value is contained in last period’s value. So now we introduce the regime-transition parameters. Let:

$p_{ij}=Pr[S_{t}=i|S_{t-1}=j]$.

The probability model for the unobserved state can now be formed using Bayes’ Rule:

$Pr[S_{t}=0|Y_{t}=y_{t};\theta]=\frac{Pr[Y_{t}=y_{t}|S_{t}=0]Pr[S_{t}=0]}{Pr[Y_{t}=y_{t}]}$

Here, it is worth highlighting a few things:

1. $Pr[S_{t}=0]=Pr[S_{t-1}=0]*p_{00} + Pr[S_{t-1}=1]*(1-p_{11})$

That is, once inference has been made about the probability of being in each of the two regimes in period $t-1$, we can use the transition probabilities that are part of the parameter vector $\theta$ to forecast the regime probabilities for period $t$.

2. $Pr[Y_{t}=y_{t}|S_{t}=0]=f(y_{t}|S_{t}=0,\Omega_{t-1};\theta)=\frac{1}{(2 \pi \sigma_{0}^{2})^{0.5}}\exp\left[-\frac{(y_{t}-\mu_{0})^{2}}{2 \sigma_{0}^{2}}\right]$.

The conditional density of Y given the current regime, past information, and parameter vector is normal with parameters $\mu_{i}, \sigma_{i}$ depending on the regime…this is by assumption of the model.
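To make the two pieces above concrete, here is a small single-period sketch: the prior $Pr[S_{t}=0]$ comes from the transition probabilities, the likelihoods come from the normal densities, and Bayes’ Rule combines them. All the numbers (the previous-period probability 0.7, $p_{00}=0.9$, $p_{11}=0.8$, and the observed return 0.12) are made up for illustration, and the function names are my own.

```python
import math

def normal_pdf(y, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def posterior_regime0(y, prior0, mu, sigma):
    """Pr[S_t = 0 | Y_t = y] via Bayes' Rule in the two-regime model."""
    like0 = normal_pdf(y, mu[0], sigma[0])  # f(y | S_t = 0)
    like1 = normal_pdf(y, mu[1], sigma[1])  # f(y | S_t = 1)
    marginal = prior0 * like0 + (1.0 - prior0) * like1  # Pr[Y_t = y]
    return prior0 * like0 / marginal

# Hypothetical inputs, chosen only for illustration.
prev0 = 0.7          # Pr[S_{t-1} = 0]
p00, p11 = 0.9, 0.8  # probabilities of staying in regime 0 and regime 1

# Item 1: prior for regime 0 from last period's probabilities.
prior0 = prev0 * p00 + (1.0 - prev0) * (1.0 - p11)

# Item 2 plus Bayes' Rule: update the prior after observing y_t = 0.12.
post0 = posterior_regime0(0.12, prior0, mu=(0.1, 0.02), sigma=(0.05, 0.1))
```

With these inputs the observation sits close to $\mu_{0}$, so the posterior probability of regime 0 ends up above the prior.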

We can do a couple things to represent this a little more compactly:

First, let

P=$\begin{bmatrix} p_{00} & 1-p_{11} \\ 1-p_{00} & p_{11} \end{bmatrix}$

and let,

$Pr[S_{t}=j|\Omega_{t};\theta]=\xi_{t|t}=\begin{pmatrix}\ \xi_{0t}\\ \xi_{1t}\end{pmatrix}$

finally, denote

$f_{t}=\begin{pmatrix}\ f(y_{t}|S_{t}=0,\Omega_{t-1};\theta) \\ f(y_{t}|S_{t}=1,\Omega_{t-1};\theta) \end{pmatrix}$

Now the series of inference and forecast probabilities regarding the unobserved state (regime) that we need can be expressed as,

$\xi_{t|t}=\frac{1}{\xi_{t|t-1}'f_{t}}\xi_{t|t-1} \odot f_{t}$ and
$\xi_{t+1|t}=P\xi_{t|t}$
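
The two recursions above can be sketched in a few lines of plain Python. This is only an illustration under some assumptions of mine: the starting vector $\xi_{1|0}$ is set to (0.5, 0.5), the parameter values are taken as known rather than estimated, and the short series of returns is made up.

```python
import math

def normal_pdf(y, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def filter_regime_probs(ys, mu, sigma, p00, p11, xi_init=(0.5, 0.5)):
    """Return the filtered probabilities xi_{t|t} for each observation."""
    xi_pred = xi_init  # xi_{1|0}: an assumed starting guess
    filtered = []
    for y in ys:
        # f_t: density of y under each regime
        f = (normal_pdf(y, mu[0], sigma[0]), normal_pdf(y, mu[1], sigma[1]))
        # Element-wise product xi_{t|t-1} (.) f_t
        joint = (xi_pred[0] * f[0], xi_pred[1] * f[1])
        denom = joint[0] + joint[1]  # xi_{t|t-1}' f_t
        xi_filt = (joint[0] / denom, joint[1] / denom)
        filtered.append(xi_filt)
        # Forecast step xi_{t+1|t} = P xi_{t|t}, with the columns of P summing to 1
        xi_pred = (p00 * xi_filt[0] + (1.0 - p11) * xi_filt[1],
                   (1.0 - p00) * xi_filt[0] + p11 * xi_filt[1])
    return filtered

# Made-up returns; the third one is far from mu_0 and should favor regime 1.
ys = [0.10, 0.12, -0.15, 0.09]
probs = filter_regime_probs(ys, mu=(0.1, 0.02), sigma=(0.05, 0.1),
                            p00=0.9, p11=0.8)
```

Each entry of `probs` is a pair (probability of regime 0, probability of regime 1). A more careful version would guard against `denom` underflowing to zero for extreme observations.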

This probability model can be solved for the optimal values of $\theta=[\mu_{0},\sigma_{0},\mu_{1},\sigma_{1},p_{00},p_{11}]$ using maximum likelihood by noting that the conditional likelihood function is:

$L(y_{1},y_{2},...y_{T};\theta)=\prod_{t=1}^{T}Pr[Y_{t}=y_{t}|\Omega_{t-1};\theta]$,

and noting that $Pr[Y_{t}=y_{t}|\Omega_{t-1};\theta]=\xi_{t|t-1}'f_{t}$ the log likelihood function can be written,

$l(y_{1},y_{2},...y_{T};\theta)=\sum_{t=1}^{T} log(\xi_{t|t-1}'f_{t})$.
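
As a rough sketch, the log likelihood can be accumulated inside the same recursion, adding $log(\xi_{t|t-1}'f_{t})$ at each step. The ordering of `theta`, the starting vector (0.5, 0.5), and the data below are assumptions for illustration only.

```python
import math

def normal_pdf(y, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def log_likelihood(theta, ys, xi_init=(0.5, 0.5)):
    """Sum over t of log(xi_{t|t-1}' f_t) for the two-regime model."""
    mu0, sigma0, mu1, sigma1, p00, p11 = theta
    xi0, xi1 = xi_init  # assumed xi_{1|0}
    total = 0.0
    for y in ys:
        f0 = normal_pdf(y, mu0, sigma0)
        f1 = normal_pdf(y, mu1, sigma1)
        pred_density = xi0 * f0 + xi1 * f1  # xi_{t|t-1}' f_t
        total += math.log(pred_density)
        # Filtered probabilities xi_{t|t}
        filt0 = xi0 * f0 / pred_density
        filt1 = xi1 * f1 / pred_density
        # Forecast xi_{t+1|t} = P xi_{t|t}
        xi0 = p00 * filt0 + (1.0 - p11) * filt1
        xi1 = (1.0 - p00) * filt0 + p11 * filt1
    return total

# Made-up returns and two candidate parameter vectors.
ys = [0.12, 0.09, 0.11, -0.15, 0.20, -0.05, 0.10, 0.08]
theta_a = (0.1, 0.05, 0.02, 0.1, 0.9, 0.8)   # close to the example process
theta_b = (0.5, 0.05, -0.5, 0.1, 0.9, 0.8)   # means far from the data
ll_a = log_likelihood(theta_a, ys)
ll_b = log_likelihood(theta_b, ys)
```

Here `ll_a` comes out larger than `ll_b`, which is the comparison a numerical optimizer would exploit when searching over $\theta$ (typically by minimizing the negative of this function).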

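A popular way to evaluate this function, so that it can then be maximized over $\theta$ with a numerical optimizer, is through the use of the Hamilton Filter, which is the recursion for $\xi_{t|t}$ and $\xi_{t+1|t}$ given above. I’ll post some code for that tomorrow.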

#################################################################
#################################################################
#################################################################