There is no homework to read or submit this week. However, I highly recommend that you experiment with markov.scm, which implements probabilistic inference using hidden Markov models.
The model is defined in hmm. Please change it to describe your favorite process: office work, politics, baseball, whatever. The Scheme functions best-path and path-best perform two different kinds of inference. The best-path function finds the most likely sequence of states, which is appropriate when the utility function gives no partial credit for even a slightly incorrect state sequence. The path-best function finds the sequence of most likely states, which is appropriate when the utility function gives partial credit for each correct state in the inferred sequence.
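To see how "the most likely sequence" and "the sequence of most likely states" can come apart, here is a minimal sketch that does not use hmm or markov.scm at all. It assumes a hand-made, hypothetical probability distribution over two-step state sequences, and the names posterior, argmax, marginal, most-likely-sequence and sequence-of-most-likely-states are made up for this illustration.

```scheme
;; Hypothetical distribution over two-step state sequences.
;; These numbers are invented for illustration; they do not come from markov.scm.
(define posterior
  '(((a a) . 0.4)
    ((a b) . 0.0)
    ((b a) . 0.3)
    ((b b) . 0.3)))

;; Return the element of the non-empty list xs with the highest score.
(define (argmax score xs)
  (let loop ((best (car xs)) (rest (cdr xs)))
    (cond ((null? rest) best)
          ((> (score (car rest)) (score best)) (loop (car rest) (cdr rest)))
          (else (loop best (cdr rest))))))

;; Most likely sequence: maximize the probability of the whole sequence.
(define (most-likely-sequence dist)
  (car (argmax cdr dist)))

;; Marginal probability that position i (0-based) holds state s.
(define (marginal dist i s)
  (let loop ((entries dist) (total 0.0))
    (cond ((null? entries) total)
          ((eq? (list-ref (car (car entries)) i) s)
           (loop (cdr entries) (+ total (cdr (car entries)))))
          (else (loop (cdr entries) total)))))

;; Sequence of most likely states: maximize each position separately.
(define (sequence-of-most-likely-states dist states len)
  (let loop ((i 0))
    (if (= i len)
        '()
        (cons (argmax (lambda (s) (marginal dist i s)) states)
              (loop (+ i 1))))))

(display (most-likely-sequence posterior))
(newline)
(display (sequence-of-most-likely-states posterior '(a b) 2))
(newline)
```

With these numbers the first call yields (a a), while the second yields (b a), a sequence whose own probability is only 0.3. Whether best-path and path-best disagree in the same way on your model is something to check by experiment.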
Here are some questions to ask:
- Write a Scheme function to sample a sequence of states and observations from the probability distribution specified by hmm.
- Using the function suggested above, implement inefficient (exponential-time) algorithms for best-path and path-best that nevertheless return the correct answer.
- How does the output of best-path and path-best match or mismatch your expectations as you vary the model and the sequence of observations?
- Can changing a single observation affect all inferred states?
- Do best-path and path-best ever infer two different state sequences?
- What is the time and space complexity of best-path and path-best in terms of the number of states in the model, the number of observations possible in each state, and the duration of the observation sequence?
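For the exponential-time versions mentioned in the list above, one possible starting point is to enumerate every candidate state sequence and then score each one. The following is only a sketch under an assumption: it treats the states as a plain list of symbols, which may not match how markov.scm represents them, and all-sequences is a hypothetical helper name.

```scheme
;; All sequences of length n over the given list of states.
;; The result has (expt (length states) n) elements, hence the
;; exponential running time of any algorithm that walks through it.
(define (all-sequences states n)
  (if (= n 0)
      '(())
      (let ((tails (all-sequences states (- n 1))))
        (let outer ((ss states))
          (if (null? ss)
              '()
              (append (map (lambda (tail) (cons (car ss) tail)) tails)
                      (outer (cdr ss))))))))

(display (all-sequences '(a b) 2))
(newline)
(display (length (all-sequences '(a b c) 4)))
(newline)
```

Two states and two time steps give four sequences, and three states over four time steps already give 81.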
The remaining three questions address how the implementation fails to scale to longer sequences.
- The probability of a sequence of observations or states decreases roughly exponentially with the duration of the sequence. For example, doubling the duration roughly squares the probability. Confirm this statement with a concrete example.
- Because the probability gets so small so quickly as the duration increases, the computer may approximate the probability of every state sequence by zero, which makes it impossible to check which sequence is most likely. Demonstrate this risk with a concrete example.
- To avoid this risk, in practice we represent probabilities by their logarithms. Change the code to do so.
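The scaling problem can be seen without any model at all. The sketch below is a stand-in, not the code in markov.scm: it assumes that every step of a path contributes the same hypothetical probability p, so that a path of n steps has probability p multiplied by itself n times. The names path-probability and path-log-probability are made up for this illustration.

```scheme
;; Probability of n steps that each have probability p, computed as a
;; running product. This stands in for the product of transition and
;; observation probabilities along one path.
(define (path-probability p n)
  (let loop ((i 0) (acc 1.0))
    (if (= i n)
        acc
        (loop (+ i 1) (* acc p)))))

;; The same quantity in the log domain: a running sum of logarithms.
(define (path-log-probability p n)
  (let loop ((i 0) (acc 0.0))
    (if (= i n)
        acc
        (loop (+ i 1) (+ acc (log p))))))

;; Doubling the duration squares the probability.
(display (path-probability 0.5 10))
(newline)
(display (path-probability 0.5 20))
(newline)

;; Over 2000 steps both products underflow to zero in floating point,
;; so the more likely path can no longer be told apart from the less likely one.
(display (path-probability 0.5 2000))
(newline)
(display (path-probability 0.4 2000))
(newline)

;; The log-domain values stay finite and remain comparable.
(display (path-log-probability 0.5 2000))
(newline)
(display (path-log-probability 0.4 2000))
(newline)
```

In the log domain the two 2000-step values are roughly -1386 and -1833, so comparing them with > still picks out the more likely path. Products of probabilities become sums of logarithms, which is the main change the real code would need.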
Please tell me about any discoveries or problems you encounter!