
Conversation

@weikang9009
Member

This PR is to extend functions for Markov classes to deal with non-ergodic Markov chains.

Instead of changing the behavior of existing functions like giddy.ergodic.steady_state and giddy.ergodic.fmpt, which explicitly state that they are only suitable for regular Markov chains, I added two functions, giddy.ergodic.steady_state_general and giddy.ergodic.fmpt_general, which can deal with reducible Markov chains. For instance, if the Markov chain has two communicating classes, giddy.ergodic.steady_state_general will return an array of two vectors (see the example in the inline docstring), and giddy.ergodic.fmpt_general will return an array containing np.inf, since it is impossible to reach a state in class 1 if the system starts at a state in class 2 (see the example in the inline docstring).
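To make the idea concrete, here is a small sketch (illustrative only, not the giddy implementation) of what a steady_state_general-style result looks like for a reducible chain: each communicating class of recurrent states contributes its own stationary distribution, embedded into a full-length vector. The example chain and the helper `stationary` are hypothetical.

```python
import numpy as np

# Hypothetical reducible chain with two communicating classes,
# {0, 1} and {2, 3} (not from the giddy source).
P = np.array([[0.5, 0.5, 0. , 0. ],
              [0.2, 0.8, 0. , 0. ],
              [0. , 0. , 0.4, 0.6],
              [0. , 0. , 0.3, 0.7]])

def stationary(block):
    """Stationary distribution of an irreducible stochastic matrix:
    the left eigenvector for eigenvalue 1, normalized to sum to 1."""
    vals, vecs = np.linalg.eig(block.T)
    v = np.real(vecs[:, np.argmax(np.real(vals))])
    return v / v.sum()

classes = [[0, 1], [2, 3]]  # communicating classes (known here by construction)
out = np.zeros((len(classes), P.shape[0]))
for k, c in enumerate(classes):
    out[k, c] = stationary(P[np.ix_(c, c)])
# out is an array of two stationary vectors, one per class
```

Each row of `out` satisfies pi = pi @ P, and entries outside the corresponding class are zero, mirroring the "array of two vectors" behavior described above.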

These two functions replace giddy.ergodic.steady_state and giddy.ergodic.fmpt in the classes Markov, Spatial_Markov, FullRank_Markov, and GeoRank_Markov for estimating the steady state distributions and first mean passage times (spatially conditional or not).

Another enhancement is an optional parameter fill_diag added to each of these Markov-related classes, which lets users decide whether a stochastic transition probability matrix (each row summing to 1) is desired.
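A minimal sketch of what a fill_diag-style option might do (the names and mechanics here are illustrative, not the giddy API): a state that is never observed yields a row of zeros in the estimated transition matrix, which is then not stochastic; placing a 1 on the diagonal of each empty row treats that state as absorbing and restores row sums of 1.

```python
import numpy as np

# Hypothetical transition counts where state 1 was never observed.
counts = np.array([[10, 5],
                   [ 0, 0]])
row_sums = counts.sum(axis=1, keepdims=True)
# Row-normalize, leaving zero rows as zeros instead of dividing by 0.
P = np.divide(counts, row_sums,
              out=np.zeros_like(counts, dtype=float),
              where=row_sums > 0)

# Fill the diagonal entry of each empty row with 1, so every row
# sums to 1 and the empty state becomes absorbing.
P_filled = P.copy()
empty = np.where(P.sum(axis=1) == 0)[0]
P_filled[empty, empty] = 1.0
```

This also illustrates the trade-off discussed below: after filling, it is no longer visible from P_filled alone which states had no observations.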

@weikang9009 weikang9009 added this to the release 2.3.0 milestone Jul 9, 2019
@weikang9009 weikang9009 requested review from ljwolf and sjsrey July 9, 2019 21:50
@weikang9009 weikang9009 changed the title Extend functions for Markov classes to deal with non-ergodic Markov chains (ENH) Extend functions for Markov classes to deal with non-ergodic Markov chains Jul 9, 2019
Member

@ljwolf ljwolf left a comment


Not sure if this matters since I'm not a maintainer, but since I've been tagged, here is some feedback. My main concerns are:

  1. Why are we not filling empty states by default?
  2. We should use clearer language than fill_diag, since "filling the diagonal" has a pretty different meaning elsewhere.
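The name clash being alluded to is presumably NumPy's own np.fill_diagonal, which writes a value onto the main diagonal of an existing array in place for every row, regardless of whether the row is empty, which is quite different from filling only the empty rows of a transition matrix:

```python
import numpy as np

# np.fill_diagonal modifies the array in place, setting every
# main-diagonal entry, not just those of zero rows.
a = np.zeros((3, 3))
np.fill_diagonal(a, 1.0)
# a is now the 3x3 identity matrix
```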

[0. , 0. , 0. , 0. , 1. ]])
>>> sm.S[2]
array([0.03607655, 0.22667277, 0.25883041, 0.43607249, 0.04234777])
Member


Given this discussion, shouldn't we always fill the empty states? i.e. shouldn't this always be True?

Member Author


Two reasons for not filling empty states by default:

  • Right now, the tests for spatial dependence in the Markov chain model ignore the rows full of 0 estimates. Filling empty states by default could have significant implications for these tests.
  • Users might want to know where they have no observations.

Although the returned estimated transition probability matrix could have rows full of 0, when estimating the steady state distributions (for absorbing Markov chains) and first mean passage times, the transition probability matrix is internally adjusted to be stochastic; otherwise, these quantities could not be estimated: https://github.com/weikang9009/giddy/blob/limiting/giddy/markov.py#L226

@ljwolf
Member

ljwolf commented Jul 10, 2019

Oh, I should also say: this is great, and it's really significant that it's been added 😄 I'm really glad to see this issue being addressed.

@weikang9009 weikang9009 changed the title (ENH) Extend functions for Markov classes to deal with non-ergodic Markov chains (WIP, ENH) Extend functions for Markov classes to deal with non-ergodic Markov chains Jul 15, 2019
@weikang9009
Member Author

The current implementation for first mean passage times cannot properly deal with absorbing Markov chains like:

import numpy as np

P = np.array([[1. , 0. , 0. , 0. , 0. ],
              [0. , 1. , 0. , 0. , 0. ],
              [0. , 0. , 0. , 0.5, 0.5],
              [0. , 0.5, 0. , 0. , 0.5],
              [0.5, 0. , 0. , 0.5, 0. ]])

where 0 and 1 are absorbing states and 2, 3, and 4 are transient states.
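As a quick sanity check (illustrative code, not from giddy), the absorbing states can be read off the diagonal: a state i is absorbing exactly when P[i, i] == 1. In this particular chain, every remaining state happens to be transient, so the complement gives the transient set.

```python
import numpy as np

P = np.array([[1. , 0. , 0. , 0. , 0. ],
              [0. , 1. , 0. , 0. , 0. ],
              [0. , 0. , 0. , 0.5, 0.5],
              [0. , 0.5, 0. , 0. , 0.5],
              [0.5, 0. , 0. , 0.5, 0. ]])

# A state is absorbing iff its diagonal entry is 1.
absorbing = np.where(np.isclose(np.diag(P), 1))[0]
# In this chain the non-absorbing states are all transient (this is
# not true of every chain: non-absorbing states can also be recurrent).
transient = np.setdiff1d(np.arange(P.shape[0]), absorbing)
```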

@sjsrey
Member

sjsrey commented Jul 15, 2019

The current implementation for first mean passage times cannot properly deal with absorbing Markov chains like:

P = np.array([[1. , 0. , 0. , 0. , 0. ],
              [0. , 1. , 0. , 0. , 0. ],
              [0. , 0. , 0. , 0.5, 0.5],
              [0. , 0.5, 0. , 0. , 0.5],
              [0.5, 0. , 0. , 0.5, 0. ]])

where 0 and 1 are absorbing states and 2, 3, and 4 are transient states.

Can you say a little more about what "cannot properly" means here?

@weikang9009
Member Author

P = np.array([[1. , 0. , 0. , 0. , 0. ],
              [0. , 1. , 0. , 0. , 0. ],
              [0. , 0. , 0. , 0.5, 0.5],
              [0. , 0.5, 0. , 0. , 0.5],
              [0.5, 0. , 0. , 0.5, 0. ]])

The Markov chain with transition probability matrix P has four communicating classes, [0], [1], [2], and [3, 4], within each of which every state communicates with every other. The current implementation of first mean passage times for this reducible chain applies giddy.ergodic.fmpt to each of these four classes and assigns infinite values to passages across classes, leading to the fmpt matrix:

array([[1.        ,        inf,        inf,        inf,        inf],
       [       inf, 1.        ,        inf,        inf,        inf],
       [       inf,        inf, 1.        ,        inf,        inf],
       [       inf,        inf,        inf, 2.        , 1.33333333],
       [       inf,        inf,        inf, 1.33333333, 2.        ]])
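For reference, the four communicating classes above can be recovered mechanically as the strongly connected components of the directed graph with an edge i -> j whenever P[i, j] > 0. A sketch using scipy (not necessarily how giddy computes them):

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

P = np.array([[1. , 0. , 0. , 0. , 0. ],
              [0. , 1. , 0. , 0. , 0. ],
              [0. , 0. , 0. , 0.5, 0.5],
              [0. , 0.5, 0. , 0. , 0.5],
              [0.5, 0. , 0. , 0.5, 0. ]])

# Strongly connected components of the transition graph are exactly
# the communicating classes: here [0], [1], [2], and [3, 4].
n, labels = connected_components(csr_matrix((P > 0).astype(int)),
                                 connection='strong')
```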

However, it is clearly quite possible to pass from state 4 to state 0, and once state 0 is reached, the system will stay there forever. Therefore, fmpt[4,0] should not be infinite, and the current implementation is unable to deal with such Markov chains.

The reason is that this is an absorbing Markov chain in which 2, 3, and 4 are transient states and 0 and 1 are absorbing states. Since the communicating classes [2] and [3, 4] contain transient states, the function giddy.ergodic.fmpt, which is based on ergodic Markov theory, cannot be applied to these classes.

We need to draw on absorbing chain theory to deal with such Markov chains, in addition to the current implementation, which can only handle reducible Markov chains whose communicating classes are all ergodic sets.
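One standard piece of that absorbing chain theory, sketched here for the matrix above (this is textbook material, not a proposal for the eventual giddy implementation): partition P into Q (transient-to-transient) and R (transient-to-absorbing). The fundamental matrix N = inv(I - Q) gives the expected number of visits to each transient state; N @ 1 gives expected steps until absorption, and N @ R gives the probability of ending up in each absorbing state.

```python
import numpy as np

P = np.array([[1. , 0. , 0. , 0. , 0. ],
              [0. , 1. , 0. , 0. , 0. ],
              [0. , 0. , 0. , 0.5, 0.5],
              [0. , 0.5, 0. , 0. , 0.5],
              [0.5, 0. , 0. , 0.5, 0. ]])
T, A = [2, 3, 4], [0, 1]            # transient and absorbing states
Q = P[np.ix_(T, T)]                 # transient-to-transient block
R = P[np.ix_(T, A)]                 # transient-to-absorbing block

# Fundamental matrix: expected visits to transient states.
N = np.linalg.inv(np.eye(len(T)) - Q)
steps = N.sum(axis=1)               # expected steps to absorption: [3, 2, 2]
absorb = N @ R                      # absorption probabilities per absorbing state
```

For example, starting from state 4 the chain is absorbed in an expected 2 steps, reaching state 0 with probability 2/3 and state 1 with probability 1/3, which makes precise why treating the passage from 4 to 0 as simply impossible is unsatisfactory.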