CS 440/ECE 448
Fall 2024
Margaret Fleck
Quiz 6 skills list
The quiz will cover material through partial-order planning.
MP practicalities
Questions related to what you built in MP 10.
Markov Decision Processes
- Model and terminology for an MDP
- Quantized representation of continuous state variables via randomized actions
- Bellman equation
- Methods of solving the Bellman equation
- Value iteration
- Policy iteration
- Asynchronous dynamic programming
- How to choose a policy?
Reinforcement Learning
- Basic setup for reinforcement learning (e.g. main loop)
- Model-based reinforcement learning
- Model-free reinforcement learning
- Q-learning version of Bellman equation (expressing Q in terms of itself, without reference to the utility or
transition probability functions)
- TD update algorithm
- SARSA update algorithm
- How do TD and SARSA differ?
- Selecting an action
- Deriving a policy from utility values or from Q values.
- Incorporating exploration
- Online learning, offline learning, experience replay
Constraint satisfaction problems
Historical trivia and key examples
- Waltz line labelling
- 4-color theorem (proved here!)
- N-queens problem
- Map/graph coloring
- Graph coloring is NP-complete.
Hill-climbing
- how it works (high-level idea only)
- how it differs from backtracking search
Backtracking search (DFS)
- Variable assignments can be done in any order, search is to a known depth
- Why does DFS work well? Why isn't looping a worry?
- Heuristics for variable and value selection
- most constrained/most constraining variable
- least constraining value
- exploit any symmetries in the problem
- Forward checking, constraint propagation
- AC-3 algorithm
- How to incorporate constraint propagation into backtracking search
Planning (Part 1)
- Representing a planning problem
- State and action representation
- Preconditions and effects (postconditions)
- Approaches to planning
- Sussman anomaly
- Situation space planning vs. plan space planning
- Partial order (plan space) planning:
- the idea
- plan modification operations
- plan defects (open preconditions, threats) and how to fix them