CS 440/ECE 448
Fall 2023
Margaret Fleck
Quiz 6 skills list
The quiz will cover material through the lectures on constraint satisfaction
problems.
Markov Decision Processes
- Model and terminology for an MDP
- Quantized (grid) representation of continuous state variables, modelling motion between cells as stochastic (randomized) actions
- Bellman equation
- Methods of solving the Bellman equation
- Value iteration
- Policy iteration
- Asynchronous dynamic programming
- How to choose a policy from the solved utility values?
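The Bellman equation and value iteration bullets above can be made concrete with a small sketch. Everything here (the two-state toy MDP, its rewards and transition probabilities, the discount factor) is invented for illustration, not taken from the lectures:

```python
GAMMA = 0.9  # discount factor (arbitrary choice for this toy example)

# A made-up 2-state MDP: P[s][a] = list of (prob, next_state, reward).
P = {
    "s0": {"stay": [(1.0, "s0", 0.0)],
           "go":   [(0.8, "s1", 1.0), (0.2, "s0", 0.0)]},
    "s1": {"stay": [(1.0, "s1", 2.0)],
           "go":   [(1.0, "s0", 0.0)]},
}

def value_iteration(P, gamma=GAMMA, tol=1e-6):
    U = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            # Bellman update: U(s) = max_a sum_s' P(s'|s,a) [R + gamma U(s')]
            best = max(sum(p * (r + gamma * U[s2]) for p, s2, r in outcomes)
                       for outcomes in P[s].values())
            delta = max(delta, abs(best - U[s]))
            U[s] = best
        if delta < tol:
            return U

def greedy_policy(P, U, gamma=GAMMA):
    # Choosing a policy: one-step lookahead on the converged utilities.
    return {s: max(P[s], key=lambda a: sum(p * (r + gamma * U[s2])
                                           for p, s2, r in P[s][a]))
            for s in P}

U = value_iteration(P)
pi = greedy_policy(P, U)
```

Updating `U` in place while sweeping the states, as above, is the asynchronous (Gauss-Seidel) flavor of dynamic programming; a synchronous version would compute all new utilities from the old ones before overwriting.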
Reinforcement Learning
- Basic setup for reinforcement learning (e.g. main loop)
- Model-based reinforcement learning
- Model-free reinforcement learning
- Q-learning version of Bellman equation (expressing Q in terms of itself, without reference to the utility or transition probability functions)
- TD update algorithm
- SARSA update algorithm
- How do TD and SARSA differ?
- Selecting an action
- Deriving a policy from utility values or from Q values
- Incorporating exploration
- Online learning, offline learning, experience replay
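The TD-style updates above fit in a few lines each. This is a hedged sketch with made-up constants (`ALPHA`, `GAMMA`, `EPS`) and a dictionary Q-table, not the notation from lecture; the key contrast is the bootstrap term:

```python
import random
from collections import defaultdict

ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1  # learning rate, discount, exploration rate

def q_learning_update(Q, s, a, r, s2, actions):
    # Off-policy: bootstrap from the BEST next action, whatever we actually do.
    target = r + GAMMA * max(Q[(s2, a2)] for a2 in actions)
    Q[(s, a)] += ALPHA * (target - Q[(s, a)])

def sarsa_update(Q, s, a, r, s2, a2):
    # On-policy: bootstrap from the next action a2 the policy actually chose.
    target = r + GAMMA * Q[(s2, a2)]
    Q[(s, a)] += ALPHA * (target - Q[(s, a)])

def epsilon_greedy(Q, s, actions, eps=EPS):
    # Incorporating exploration: sometimes act randomly, otherwise exploit Q.
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(s, a)])
```

The two updates differ only in that single term: Q-learning maxes over next actions regardless of the behavior policy, while SARSA plugs in the action the (exploring) policy actually took, which is exactly the TD-vs-SARSA distinction the quiz asks about.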
Constraint satisfaction problems
Historical trivia:
- Waltz line labelling
- 4-color theorem
Key examples (be familiar with basic rules):
- n-queens, map/graph coloring, Sudoku, Cryptarithmetic, Waltz line labelling, task scheduling
- Graph coloring is NP-complete.
Hill-climbing
- how it works (high-level idea only)
- how it differs from backtracking search
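As one concrete instance of the high-level idea, here is a min-conflicts-style hill-climbing sketch for n-queens (one queen per column). The step limit and helper names are my own choices; note that, unlike backtracking, it keeps a complete assignment at all times and just repairs it:

```python
import random

def conflicts(board, col, row):
    # Queens attacking square (col, row); board[c] = row of the queen in column c.
    return sum(1 for c, r in enumerate(board)
               if c != col and (r == row or abs(r - row) == abs(c - col)))

def hill_climb_queens(n, max_steps=10000, seed=0):
    rng = random.Random(seed)
    board = [rng.randrange(n) for _ in range(n)]  # full, probably bad, assignment
    for _ in range(max_steps):
        bad = [c for c in range(n) if conflicts(board, c, board[c]) > 0]
        if not bad:
            return board  # no conflicts left: a solution
        col = rng.choice(bad)
        # Move that queen to the row minimizing conflicts (min-conflicts move).
        board[col] = min(range(n), key=lambda r: conflicts(board, col, r))
    return None  # stuck on a plateau / local optimum
```

The `return None` branch is the weakness hill-climbing has and backtracking does not: it can get stuck and must restart, but it never maintains a search tree.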
Backtracking search (DFS)
- Variable assignments can be done in any order; the search runs to a known depth (one variable per level)
- Why does DFS work well? Why isn't looping a worry?
- Heuristics for variable and value selection
- most constrained/most constraining variable
- least constraining value
- Forward checking, constraint propagation
- AC-3 algorithm
- How to incorporate constraint propagation into backtracking search
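The backtracking ideas above (DFS over variable assignments, most-constrained-variable ordering, forward checking) can be sketched for graph coloring. The function and variable names are illustrative, and full AC-3 constraint propagation is omitted; only the one-step forward-checking prune is shown:

```python
def backtrack_color(graph, colors, assignment=None, domains=None):
    # graph: {node: set(neighbors)}; colors: list of allowed colors.
    if assignment is None:
        assignment = {}
        domains = {v: set(colors) for v in graph}
    if len(assignment) == len(graph):
        return assignment
    # Most-constrained variable: pick the one with the smallest remaining domain.
    var = min((v for v in graph if v not in assignment),
              key=lambda v: len(domains[v]))
    for color in sorted(domains[var]):
        # Forward checking: this color would be pruned from unassigned neighbors.
        pruned = [n for n in graph[var]
                  if n not in assignment and color in domains[n]]
        if any(len(domains[n]) == 1 for n in pruned):
            continue  # a neighbor's domain would be wiped out; don't recurse
        for n in pruned:
            domains[n].discard(color)
        assignment[var] = color
        result = backtrack_color(graph, colors, assignment, domains)
        if result:
            return result
        del assignment[var]          # backtrack: undo the assignment
        for n in pruned:
            domains[n].add(color)    # and restore the pruned domains
    return None
```

On a triangle graph, three colors succeed immediately, while with only two colors forward checking exposes the dead end near the root and the search returns None without exploring further; AC-3 would go further by re-propagating each pruning through the whole constraint graph.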