CS 440/ECE 448
Fall 2021
Margaret Fleck
Quiz 4 skills list
The quiz will be on Wednesday, October 27th, covering material through Linear Classifiers.
It will be available on Moodle from 7am to noon Central time, and you will have 30 minutes to do it.
Classifier Overview
General design
- Uses for classification: labelling objects, making decisions
- Multi-layer systems
- What can we tune?
- parameters (e.g. weights)
- hyper-parameters (e.g. tuning constants)
- design, network topology
- Challenges with determining the correct answer
- how specific/general should the class label be?
- unfamiliar objects, unfamiliar words
- context may affect best label to choose
- deciding what's important in complex scenes, extended sentences
- Data for supervised training
- "gold" answers
- Noise in "correct" answers/annotation
- Annotators with limited training
- Data scraped off the web
- Data available only for final output of system
- Workarounds for limited training data
- Re-purposing layers trained for another purpose
- Creating training pairs by removing information
- Self-supervised, semi-supervised, unsupervised methods
- Batch vs. incremental training
K-nn and Decision Trees
Specific techniques
- k-nearest neighbors (how it works, what happens if you change k)
- L1 vs. L2 norm
- Decision trees, random forests
- Entropy: definition, how it relates to evaluating possible splits in a decision tree (see the sketch after this list)
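To make these concrete, here is a minimal Python sketch, not taken from the lecture code, of k-nearest-neighbor classification with a choice of L1 or L2 distance, plus the entropy formula used to score candidate splits in a decision tree. All function and variable names are illustrative.

```python
from collections import Counter
import math

def distance(x, y, norm="L2"):
    """L1 (sum of absolute differences) or L2 (Euclidean) distance."""
    if norm == "L1":
        return sum(abs(a - b) for a, b in zip(x, y))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def knn_classify(query, examples, k=3, norm="L2"):
    """examples: list of (feature_vector, label) pairs.
    Label the query with the majority class among its k nearest neighbors."""
    neighbors = sorted(examples, key=lambda ex: distance(query, ex[0], norm))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

def entropy(labels):
    """H = -sum_i p_i * log2(p_i), where p_i is the fraction of examples in class i.
    Lower entropy in the subsets produced by a split means the split is purer."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Example: increasing k smooths the decision boundary but can blur small classes.
data = [((0, 0), "A"), ((1, 0), "A"), ((5, 5), "B"), ((6, 5), "B")]
print(knn_classify((0.5, 0.2), data, k=3, norm="L1"))   # "A"
print(entropy(["A", "A", "B", "B"]))                    # 1.0 bit
```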
Perceptrons
- "Linearly separable"
- Basics of how perceptrons work
- Replacing bias with an extra weight
- Overall training algorithm (e.g. epochs, random processing order)
- Rule for updating perceptron weights (sketched in code after this list)
- Limitations of perceptrons and ways to address them
- Multi-class perceptrons
- Comparison to Naive Bayes
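Below is a minimal sketch, in my own notation rather than the course's, of the perceptron training loop: the bias is replaced by an extra weight on a constant feature of 1, examples are processed in a random order each epoch, and the weights are updated only on mistakes.

```python
import random

def train_perceptron(examples, epochs=10):
    """examples: list of (feature_vector, label) pairs with label in {+1, -1}.
    Returns a weight vector whose last entry plays the role of the bias."""
    examples = list(examples)                # avoid shuffling the caller's list
    dim = len(examples[0][0]) + 1            # +1 for the bias feature
    w = [0.0] * dim
    for _ in range(epochs):
        random.shuffle(examples)             # random processing order each epoch
        for x, y in examples:
            x = list(x) + [1.0]              # constant 1 replaces an explicit bias
            activation = sum(wi * xi for wi, xi in zip(w, x))
            prediction = 1 if activation > 0 else -1
            if prediction != y:              # update only on mistakes: w <- w + y*x
                w = [wi + y * xi for wi, xi in zip(w, x)]
    return w

# Converges only if the data are linearly separable, e.g. an AND-like problem:
data = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
print(train_perceptron(data))
```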
Linear Classifiers
- Sample activation functions. Know the equations for sigmoid and ReLU (see the code sketch after this list).
- Sample loss functions (e.g. 0/1, L1, L2, cross-entropy)
- What are we minimizing when we adjust the weights? (composition of weighted feature sum, activation function, loss function)
- Adjusting weights for a differentiable unit using gradient descent
- Main update equation (not details of all the derivatives)
- Why do we need the activation and loss functions to be differentiable?
- One-hot representations
- Softmax
- Regularization
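The following sketch, again illustrative rather than taken from lecture, ties several of these items together: the sigmoid and ReLU equations, softmax, one-hot targets, and one gradient-descent weight update for a sigmoid unit with cross-entropy loss and L2 regularization.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))            # sigma(z) = 1 / (1 + e^(-z))

def relu(z):
    return max(0.0, z)                           # ReLU(z) = max(0, z)

def softmax(zs):
    """Turn raw scores into probabilities that sum to 1."""
    exps = [math.exp(z - max(zs)) for z in zs]   # shift by max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def one_hot(label, num_classes):
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def gd_step(w, x, y, lr=0.1, lam=0.01):
    """One gradient-descent update for a sigmoid unit with cross-entropy loss and
    L2 regularization.  For this combination the gradient of the loss with respect
    to the weights works out to (a - y) * x + lam * w, where a = sigmoid(w . x)."""
    a = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    return [wi - lr * ((a - y) * xi + lam * wi) for wi, xi in zip(w, x)]

# Example: one update moves the weights toward predicting y = 1 for this x.
w = [0.0, 0.0]
x = [1.0, 2.0]
print(gd_step(w, x, y=1))                        # [0.05, 0.1]
print(softmax([2.0, 1.0, 0.1]), one_hot(0, 3))
```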