Markov models

2008-08-17 19:19

There is no homework to read or submit this week. However, I highly recommend that you experiment with markov.scm, which implements probabilistic inference using hidden Markov models.

The model is defined in hmm. Please change it to describe your favorite process: office work, politics, baseball, whatever. The Scheme functions best-path and path-best perform two different kinds of inference. The best-path function finds the most likely sequence of states, which is appropriate when the utility function gives no partial credit for even a slightly incorrect state sequence. The path-best function finds the sequence of most likely states, which is appropriate when the utility function gives partial credit for each correct state in the inferred sequence.
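
To make the distinction concrete, here is a minimal brute-force sketch on a two-state toy model. Everything in it is an assumption made for illustration: the association-list representation, the model itself, and the names brute-best-path and brute-path-best are hypothetical and need not match what markov.scm does.

```scheme
;; Hypothetical toy model stored as association lists.
;; markov.scm may represent hmm quite differently.
(define states '(a b))
(define initial '((a . 0.6) (b . 0.4)))
(define transition '((a (a . 0.4) (b . 0.6))
                     (b (a . 0.0) (b . 1.0))))
(define emission '((a (x . 0.5) (y . 0.5))
                   (b (x . 0.5) (y . 0.5))))

(define (prob table key) (cdr (assq key table)))

;; Joint probability of a state path together with the observations.
(define (joint path obs)
  (let loop ((prev #f) (path path) (obs obs) (p 1.0))
    (if (null? path)
        p
        (let ((s (car path)))
          (loop s (cdr path) (cdr obs)
                (* p
                   (if prev (prob (prob transition prev) s) (prob initial s))
                   (prob (prob emission s) (car obs))))))))

;; All state sequences of length n (exponentially many, on purpose).
(define (all-paths n)
  (if (= n 0)
      '(())
      (let ((rest (all-paths (- n 1))))
        (apply append
               (map (lambda (s) (map (lambda (p) (cons s p)) rest))
                    states)))))

;; Element of items with the largest score (the first one wins ties).
(define (argmax score items)
  (let loop ((best (car items))
             (best-score (score (car items)))
             (rest (cdr items)))
    (cond ((null? rest) best)
          ((> (score (car rest)) best-score)
           (loop (car rest) (score (car rest)) (cdr rest)))
          (else (loop best best-score (cdr rest))))))

;; Most likely sequence of states: maximize the joint over whole paths.
(define (brute-best-path obs)
  (argmax (lambda (path) (joint path obs)) (all-paths (length obs))))

;; Sequence of most likely states: at each position, pick the state whose
;; total probability, summed over all paths through it, is largest.
(define (brute-path-best obs)
  (let ((paths (all-paths (length obs))))
    (let loop ((i 0) (result '()))
      (if (= i (length obs))
          (reverse result)
          (loop (+ i 1)
                (cons (argmax
                       (lambda (s)
                         (apply + (map (lambda (path)
                                         (if (eq? (list-ref path i) s)
                                             (joint path obs)
                                             0.0))
                                       paths)))
                       states)
                      result))))))
```

For the observations (x x), the four paths (a a), (a b), (b a), (b b) have joint probabilities 0.06, 0.09, 0 and 0.1, so (brute-best-path '(x x)) picks (b b). Position by position, however, state a is more likely at the first step (0.15 versus 0.1) and state b at the second (0.19 versus 0.06), so (brute-path-best '(x x)) gives (a b).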

Here are some exercises to try and questions to ask:

Write a Scheme function to sample a sequence of states and observations from the probability distribution specified by hmm.
Using the function suggested above, implement inefficient (exponential-time) algorithms for best-path and path-best that nevertheless return the correct answer.
How do the outputs of best-path and path-best match or contradict your expectations as you vary the model and the sequence of observations?
Can changing a single observation affect all inferred states?
Do best-path and path-best ever infer two different state sequences?
What is the time and space complexity of best-path and path-best in terms of the number of states in the model, the number of observations possible in each state, and the duration of the observation sequence?
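
As a starting point for the sampling exercise, here is one possible sketch. It assumes a hypothetical association-list model rather than the actual representation in markov.scm, and it carries its own small linear congruential generator (next-uniform!, also an assumption of this sketch) because standard Scheme does not specify a random-number procedure.

```scheme
;; Hypothetical toy model stored as association lists.
;; markov.scm may represent hmm quite differently.
(define initial '((rainy . 0.6) (sunny . 0.4)))
(define transition '((rainy (rainy . 0.7) (sunny . 0.3))
                     (sunny (rainy . 0.4) (sunny . 0.6))))
(define emission '((rainy (walk . 0.1) (shop . 0.4) (clean . 0.5))
                   (sunny (walk . 0.6) (shop . 0.3) (clean . 0.1))))

;; Tiny linear congruential generator, good enough for experiments only.
(define seed 42)
(define (next-uniform!)
  (set! seed (modulo (+ (* 1103515245 seed) 12345) 2147483648))
  (/ seed 2147483648))            ; exact rational in [0, 1)

;; Pick a key from a list of (key . probability) pairs, given u in [0, 1):
;; walk the cumulative distribution until it exceeds u.
;; The last key is returned if rounding leaves the total just below u.
(define (pick dist u)
  (let loop ((dist dist) (acc 0))
    (let ((acc (+ acc (cdar dist))))
      (if (or (< u acc) (null? (cdr dist)))
          (caar dist)
          (loop (cdr dist) acc)))))

;; Sample n steps; returns a list of (state . observation) pairs.
(define (sample-hmm n)
  (let loop ((i 0) (prev #f) (result '()))
    (if (= i n)
        (reverse result)
        (let* ((s (pick (if prev (cdr (assq prev transition)) initial)
                        (next-uniform!)))
               (o (pick (cdr (assq s emission)) (next-uniform!))))
          (loop (+ i 1) s (cons (cons s o) result))))))
```

Calling (sample-hmm 10) returns ten (state . observation) pairs. If your Scheme provides its own random-number procedure, it can replace next-uniform!.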

The remaining three exercises address how the implementation fails to scale to longer sequences.

The probability of a sequence of observations or states decays roughly exponentially with the duration of the sequence. For example, doubling the duration roughly squares the probability. Confirm this statement with a concrete example.
Because the probability gets so small so quickly as the duration increases, the computer may approximate the probability of every state sequence by zero, which makes it impossible to check which sequence is most likely. Demonstrate this risk with a concrete example.
To avoid this risk, in practice we represent probabilities by their logarithms, so that products of probabilities become sums of logarithms. Change the code to do so.
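
The underflow risk and the logarithm remedy can be previewed without touching markov.scm. The sketch below is independent of that file, and all of its names (product, log-product, repeat, log-add) are made up for illustration. It assumes ordinary double-precision floating point.

```scheme
;; Multiply a list of probabilities directly.
(define (product ps)
  (let loop ((ps ps) (acc 1.0))
    (if (null? ps) acc (loop (cdr ps) (* acc (car ps))))))

;; Same quantity in log space: a sum of logarithms instead of a product.
(define (log-product ps)
  (let loop ((ps ps) (acc 0.0))
    (if (null? ps) acc (loop (cdr ps) (+ acc (log (car ps)))))))

;; A list containing n copies of x.
(define (repeat x n)
  (if (= n 0) '() (cons x (repeat x (- n 1)))))

;; Two 500-step sequences whose per-step probabilities are 0.1 and 0.2.
(define unlikely (repeat 0.1 500))
(define likelier (repeat 0.2 500))

;; Directly multiplied, both underflow to zero and cannot be compared:
;;   (product unlikely)  and  (product likelier)  are both zero.
;; In log space the comparison still works:
;;   (< (log-product unlikely) (log-product likelier))  is true.

;; Sums of probabilities need care in log space. This computes
;; log(exp(a) + exp(b)) by factoring out the larger term.
;; It does not handle a probability of exactly zero (log of zero).
(define (log-add a b)
  (let ((hi (max a b)) (lo (min a b)))
    (+ hi (log (+ 1.0 (exp (- lo hi)))))))
```

The log-add helper matters for path-best in particular, since choosing the most likely state at each position involves adding probabilities as well as multiplying them.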

Please tell me about any discoveries or problems you encounter!