This package demonstrates how to use Stan to fit dynamic linear models of the form

$$
\begin{aligned}
x_{k+1} &= A(\theta) x_k + B(\theta) u_k + w_k, \quad w_k \sim N(0, Q(\theta)) \\
y_k &= H(\theta) x_k + v_k, \quad v_k \sim N(0, R(\theta)).
\end{aligned}
$$
That is, we fit the static parameters θ of a linear state space model, possibly including parameters of the model and observation noise covariances. In addition, we show how to efficiently sample the states given the parameters, inspired by the blog post at http://www.juhokokkala.fi/blog/posts/kalman-filter-style-recursion-to-marginalize-state-variables-to-speed-up-stan-inference/ (here we consider a more general case than in the blog post).
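As a quick illustrative sketch (not part of the package), data from such a linear state space model can be simulated with plain numpy; all matrices and dimensions below are made up for illustration:

```python
# Simulate x[k+1] = A x[k] + B u[k] + w[k], y[k] = H x[k+1] + v[k],
# with w ~ N(0, Q) and v ~ N(0, R). Values here are illustrative only.
import numpy as np

rng = np.random.default_rng(0)

T = 50                      # number of time steps
A = np.array([[0.9]])       # state transition matrix
B = np.array([[0.5]])       # forcing matrix
H = np.array([[1.0]])       # observation matrix
Q = np.array([[0.1]])       # model (process) noise covariance
R = np.array([[0.2]])       # observation noise covariance
u = np.ones((T, 1))         # known forcing input

x = np.zeros((T + 1, 1))    # states, x[0] is the initial state
y = np.zeros((T, 1))        # observations
for k in range(T):
    x[k + 1] = A @ x[k] + B @ u[k] + rng.multivariate_normal(np.zeros(1), Q)
    y[k] = H @ x[k + 1] + rng.multivariate_normal(np.zeros(1), R)
```

The simulated `y` (and `u`) would then be passed to Stan as data, while `x` plays the role of the latent states to be recovered.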
Note that there is also a function already available in Stan for DLM fitting, see https://mc-stan.org/docs/2_26/functions-reference/gaussian-dynamic-linear-models.html. The difference here is that we include the forcing term $B(\theta)u_k$ and also consider random sampling of the states given the parameters.
The code runs with CmdStanPy, which is a thin Python wrapper around CmdStan, the command line interface to Stan. In addition to CmdStanPy, you'll need some standard Python packages like numpy, pandas, and matplotlib. Note that you also need to install CmdStan itself, which can easily be done with `cmdstanpy.install_cmdstan()`.
For a linear state space system, the likelihood of the parameters can be calculated efficiently by integrating out the state variables with a Kalman filter recursion. By the chain rule of joint probability,

$$
p(y_{1:T} \mid \theta) = p(y_1 \mid \theta)\prod_{k=2}^{T} p(y_k \mid y_{1:k-1}, \theta).
$$
The individual predictive distributions can be calculated recursively in the linear case as follows (dropping the parameter dependency of the matrices for simplicity), starting from $x_0 \sim N(m_0, P_0)$:

$$
\begin{aligned}
\hat{m}_k &= A m_{k-1} + B u_{k-1} \\
\hat{P}_k &= A P_{k-1} A^T + Q \\
p(y_k \mid y_{1:k-1}, \theta) &= N(y_k \mid H \hat{m}_k, \; H \hat{P}_k H^T + R) \\
K_k &= \hat{P}_k H^T (H \hat{P}_k H^T + R)^{-1} \\
m_k &= \hat{m}_k + K_k (y_k - H \hat{m}_k) \\
P_k &= (I - K_k H) \hat{P}_k
\end{aligned}
$$
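The forward recursion above can be sketched in numpy as follows; this is an illustrative reference implementation, not the package's Stan code, and the function name and signature are made up:

```python
# Kalman filter forward pass: accumulates log p(y_{1:T} | theta) from the
# one-step predictive densities and stores the filtering means/covariances.
import numpy as np

def kalman_loglik(y, u, A, B, H, Q, R, m0, P0):
    m, P = m0, P0
    ms, Ps = [], []
    loglik = 0.0
    for k in range(len(y)):
        # prediction step
        m_pred = A @ m + B @ u[k]
        P_pred = A @ P @ A.T + Q
        # one-step predictive density of y[k], a Gaussian
        S = H @ P_pred @ H.T + R
        r = y[k] - H @ m_pred
        loglik += -0.5 * (len(r) * np.log(2 * np.pi)
                          + np.linalg.slogdet(S)[1]
                          + r @ np.linalg.solve(S, r))
        # update step
        K = P_pred @ H.T @ np.linalg.inv(S)
        m = m_pred + K @ r
        P = (np.eye(len(m)) - K @ H) @ P_pred
        ms.append(m)
        Ps.append(P)
    return loglik, ms, Ps
```

The stored `ms` and `Ps` are exactly what the backward sampling pass below needs.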
Now the final likelihood is the combination of the above densities, and we can run MCMC to sample the posterior for θ. Alongside the posterior sampling, we can also get samples of the states $x_{1:T}$ using the following backward recursion:
- sample $x_T$ from $p(x_T \mid y_{1:T}, \theta) = N(m_T, P_T)$, which is just the filtering distribution for the last state,
- sample $x_{T-1}$ from $p(x_{T-1} \mid x_T, y_{1:T-1}, \theta)$,
- sample $x_{T-2}$ from $p(x_{T-2} \mid x_{T-1}, y_{1:T-2}, \theta)$,
- ...
That is, we end up having to sample from densities like

$$
p(x_k \mid x_{k+1}, y_{1:T}, \theta) = p(x_k \mid x_{k+1}, y_{1:k}, \theta) \propto p(x_{k+1} \mid x_k, \theta) \, p(x_k \mid y_{1:k}, \theta),
$$

where the first equality follows from the Markovian model (conditional independence) and the second step is the Bayes formula. The components in the final expression are Gaussians:

$$
p(x_{k+1} \mid x_k, \theta) = N(x_{k+1} \mid A x_k + B u_k, Q), \qquad p(x_k \mid y_{1:k}, \theta) = N(x_k \mid m_k, P_k).
$$
We can think of the first one as the "likelihood" with $x_{k+1}$ as the "data" and the latter as the prior, which gives us the posterior $N(x_k \mid s_k, S_k)$, where

$$
\begin{aligned}
G_k &= P_k A^T (A P_k A^T + Q)^{-1} \\
s_k &= m_k + G_k (x_{k+1} - A m_k - B u_k) \\
S_k &= P_k - G_k A P_k.
\end{aligned}
$$
That is, if we store all the results from the Kalman filter forward pass, we can compute a random sample of the states given θ using the pre-calculated results.
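The backward sampling pass can be sketched as follows, assuming the filtering means `ms` and covariances `Ps` have been stored from a forward pass; the function name and indexing convention are illustrative, not the package's API:

```python
# Backward sampling of the states x_{1:T} given the filtering results.
import numpy as np

def backward_sample(ms, Ps, u, A, B, Q, rng):
    T = len(ms)
    xs = [None] * T
    # last state: sample from the filtering distribution N(m_T, P_T)
    xs[T - 1] = rng.multivariate_normal(ms[T - 1], Ps[T - 1])
    for k in range(T - 2, -1, -1):
        # combine the "likelihood" N(x_{k+1} | A x_k + B u_k, Q) with the
        # "prior" N(x_k | m_k, P_k); the gain form below only inverts
        # (A P A^T + Q), not the filtering covariance itself
        G = Ps[k] @ A.T @ np.linalg.inv(A @ Ps[k] @ A.T + Q)
        mean = ms[k] + G @ (xs[k + 1] - A @ ms[k] - B @ u[k])
        cov = Ps[k] - G @ A @ Ps[k]
        xs[k] = rng.multivariate_normal(mean, cov)
    return np.array(xs)
```

Drawing one such sample per MCMC draw of θ yields posterior samples of the full state trajectory.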
The general Stan code for sampling the parameters, and the states given the parameters, using the above trick is given in https://github.com/solbes/dlmstan/blob/main/dlm.stan. The code does not run on its own: the user needs to write the functions build_A, build_B etc. for the application in question, which can be done using the functions block in Stan, see the examples below. The user also needs to feed in data in a standard format, see again the examples.
Some comments about the code:
- The user writes the `functions` block for the problem at hand, which is then combined with the general code `dlm.stan`, see the examples.
- If matrix `B` is not present in the application, the user still needs to feed in a dummy zero matrix, see the first example.
- The code separates two types of parameters: normal "model parameters" and noise parameters that enter the covariance matrices `Q` and `R`. Normal priors are given for both, and the prior parameters can be given in the data, see the examples.
- Initial values for the states need to be given in the data, see the examples.
- In the DLM code, the matrix inversion lemma is used to avoid explicitly inverting the filtering covariance matrices.
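The matrix inversion lemma trick mentioned above can be checked numerically: the conditional covariance $(P^{-1} + A^T Q^{-1} A)^{-1}$ equals the gain form $P - P A^T (A P A^T + Q)^{-1} A P$, which avoids inverting the filtering covariance $P$ itself. A small sanity check with made-up matrices:

```python
# Numeric check of the matrix inversion (Woodbury) lemma used in dlm.stan.
import numpy as np

rng = np.random.default_rng(1)
n = 3
A = rng.standard_normal((n, n))
# make P and Q symmetric positive definite
P = np.eye(n) + 0.1 * rng.standard_normal((n, n)); P = P @ P.T
Q = np.eye(n) + 0.1 * rng.standard_normal((n, n)); Q = Q @ Q.T

direct = np.linalg.inv(np.linalg.inv(P) + A.T @ np.linalg.inv(Q) @ A)
lemma = P - P @ A.T @ np.linalg.inv(A @ P @ A.T + Q) @ A @ P

assert np.allclose(direct, lemma)
```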
Examples:

- Nile river, estimating 3 noise parameters: https://github.com/solbes/dlmstan/blob/main/examples/stan_dlm_nile.ipynb
- Beer cooling, 2 model parameters and 2 noise parameters: https://github.com/solbes/dlmstan/blob/main/examples/stan_dlm_beer.ipynb