Given the hidden Markov model and Viterbi algorithm of Section 9.3.6, perform a full trace, including setting up the appropriate back pointers, showing how the observation sequence #, n, iy, t, # would be processed.
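To check a hand trace, the Viterbi computation with back pointers can be sketched as below. The transition and emission probabilities here are illustrative placeholders, not the values from Section 9.3.6; substitute the book's tables before comparing against your trace.

```python
states = ["#", "n", "iy", "t"]
obs = ["#", "n", "iy", "t", "#"]   # the observation sequence from the exercise

# Hypothetical model: each state emits its own symbol with probability 0.9,
# the remaining mass spread over the others; transitions are uniform.
def emit(s, o):
    return 0.9 if s == o else 0.1 / (len(states) - 1)

trans = 1.0 / len(states)          # uniform transition probability

# Each column maps state -> (best path probability, back pointer).
V = [{s: ((1.0 if s == "#" else 0.0) * emit(s, obs[0]), None) for s in states}]
for t_idx in range(1, len(obs)):
    col = {}
    for s in states:
        # With uniform transitions the best predecessor is simply the
        # most probable state in the previous column.
        best_prev = max(states, key=lambda p: V[-1][p][0] * trans)
        col[s] = (V[-1][best_prev][0] * trans * emit(s, obs[t_idx]), best_prev)
    V.append(col)

# Recover the best state sequence by following back pointers from the end.
path = [max(states, key=lambda s: V[-1][s][0])]
for col in reversed(V[1:]):
    path.append(col[path[-1]][1])
path.reverse()
print(path)
```

With these placeholder probabilities the recovered path simply mirrors the observations; the interesting part of the hand trace is filling the columns and back pointers with the book's actual numbers.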
Hand-run the robot described in the Markov decision process example of Section 13.3.3. Use the same reward mechanism, and choose probability values for a and b for the decision processing.
a. Run the robot again with different values for a and b. Which policies give the robot the best chance of reward?
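One way to compare policies under different a and b values is value iteration. The sketch below assumes a model resembling the classic two-state recycling-robot MDP, where a and b are the probabilities of keeping a high or low battery charge while searching; the state names, actions, and reward values are assumptions to be checked against Section 13.3.3.

```python
a, b = 0.8, 0.4              # chosen probability values; vary these
r_search, r_wait = 2.0, 1.0  # assumed rewards for searching and waiting
gamma = 0.9                  # discount factor

# transitions[state][action] = list of (probability, next_state, reward)
transitions = {
    "high": {
        "search": [(a, "high", r_search), (1 - a, "low", r_search)],
        "wait":   [(1.0, "high", r_wait)],
    },
    "low": {
        "search":   [(b, "low", r_search), (1 - b, "high", -3.0)],  # -3: rescue penalty
        "wait":     [(1.0, "low", r_wait)],
        "recharge": [(1.0, "high", 0.0)],
    },
}

# Value iteration: repeatedly back up the best expected return per state.
V = {s: 0.0 for s in transitions}
for _ in range(200):
    V = {s: max(sum(p * (r + gamma * V[s2]) for p, s2, r in outs)
                for outs in transitions[s].values())
         for s in transitions}

# Greedy policy with respect to the converged values.
policy = {s: max(transitions[s],
                 key=lambda act: sum(p * (r + gamma * V[s2])
                                     for p, s2, r in transitions[s][act]))
          for s in transitions}
print(policy)
```

Rerunning with different (a, b) pairs and comparing the resulting policies answers part (a): as b drops, searching from the low state becomes riskier and the policy shifts toward waiting or recharging.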