MP3: Face detection using AdaBoost¶

In this lab, you'll use the Viola-Jones algorithm to detect faces in images.

In order to make sure everything works, you might want to go to the command line, and run

pip install -r requirements.txt

This will install the modules that are used on the autograder, including numpy, h5py, and the Gradescope utilities.


Part 0: Reading the data¶

The data this time is provided in a set of image files, and in a JSON file called data.json. This file describes 8 rectangles per image.

  • The first four rectangles outline the face. These will be the positive examples for training our face detector. We will call this class 0.
  • The next four rectangles are chosen at random locations in the image. These will be negative examples for training our face detector. We will call this class 1.
In [1]:
import numpy as np
import json

with open('data.json') as f:
    data = json.load(f)

print('There are',len(data['test']),'testing data')
print('There are',len(data['train']),'training data')
print('The first one looks like this:\n',data['train'][0])
There are 32 testing data
There are 124 training data
The first one looks like this:
 {'filename': 'data/AF1_35D12460.jpg', 'rectangles': [[[54, 10, 151, 167], [459, 12, 159, 176], [55, 258, 151, 178], [424, 262, 168, 169]], [[165, 110, 88, 97], [38, 306, 83, 66], [528, 32, 184, 114], [545, 312, 135, 81]]]}

A rectangle is a 4-list, [x,y,width,height]. To make it easier to plot a rectangle on an axis, we can define a function:

In [2]:
def plotrect(rect,ax,color):
    xcoords = rect[0] + np.array([0,0,1,1,0])*rect[2]
    ycoords = rect[1] + np.array([0,1,1,0,0])*rect[3]
    ax.plot(xcoords,ycoords,color)

The JPG images are all in color, but we won't use the colors for this MP, so we'll use Pillow's convert('L') function to convert each image to grayscale as it is loaded.

In [ ]:
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt

img = np.array(Image.open(data['train'][0]['filename']).convert('L'))

fig, ax = plt.subplots(1,1,figsize=(10,10))
ax.imshow(img,cmap='gray')
for y,color in enumerate(['c--','m-']):
    for r in range(4):
        plotrect(data['train'][0]['rectangles'][y][r],ax,color)

Part 1: Integral Images¶

The Viola-Jones face detector is computationally efficient because it takes advantage of an intermediate representation called an integral image. If the grayscale image is $I[y,x]$, then the integral image is

$$J[y,x]=\sum_{x'=0}^{x-1}\sum_{y'=0}^{y-1} I[y',x']$$

In order to make the rest of the project computationally efficient, the first function you should write is submitted.integral_images:

In [ ]:
import importlib, submitted
importlib.reload(submitted)
help(submitted.integral_images)
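
If you get stuck, here is a minimal sketch of one possible vectorized approach (the function name is illustrative, not the required implementation): pad one row and one column of zeros, so that J[0,:]=J[:,0]=0, then take cumulative sums along both spatial axes.

import numpy as np

def integral_images_sketch(images):
    # J[i,y,x] = sum of images[i,:y,:x]; row 0 and column 0 stay zero.
    npix, m, n = images.shape
    ii = np.zeros((npix, m + 1, n + 1))
    ii[:, 1:, 1:] = np.cumsum(np.cumsum(images, axis=1), axis=2)
    return ii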

Before we run integral_images, let's read in all of the training image files, and convert them all into a single grayscale numpy array called images:

In [ ]:
npix = len(data['train'])
m,n = img.shape
images = np.empty((npix,m,n))
for i in range(len(data['train'])):
    filename = data['train'][i]['filename']
    images[i,:,:] = np.array(Image.open(filename).convert('L'))

fig, ax = plt.subplots(1,1,figsize=(10,10))
ax.imshow(images[0,:,:],cmap='gray')
for y,color in enumerate(['c--','m-']):
    for r in range(4):
        plotrect(data['train'][0]['rectangles'][y][r],ax,color)
   

Now we can try running integral_images:

In [6]:
importlib.reload(submitted)
ii = submitted.integral_images(images)

fig, ax = plt.subplots(1,1,figsize=(10,10))
ax.imshow(ii[0,:,:],cmap='gray')
for y,color in enumerate(['c--','m-']):
    for r in range(4):
        plotrect(data['train'][0]['rectangles'][y][r],ax,color)

To make sure your code is working correctly, check that:

  • The image is dark in the upper left corner, and bright in lower right.
  • The intensity changes fastest in the right-center part of each sub-image, the area where the original image was brightest.

You can also check your answer against my answer, for the first 10 rows and columns or so:

In [7]:
print('Integral image has shape',ii.shape)
print('and its first 10x10 subimage is:\n')
print(ii[0,:10,:10])
Integral image has shape (124, 481, 721)
and its first 10x10 subimage is:

[[   0.    0.    0.    0.    0.    0.    0.    0.    0.    0.]
 [   0.    9.   17.   26.   38.   63.  131.  274.  481.  711.]
 [   0.   14.   28.   47.   76.  136.  287.  592. 1027. 1488.]
 [   0.   15.   30.   54.   95.  185.  414.  876. 1534. 2226.]
 [   0.   18.   35.   61.  105.  209.  494. 1086. 1938. 2862.]
 [   0.   30.   56.   87.  133.  246.  578. 1288. 2319. 3475.]
 [   0.   38.   70.  105.  154.  279.  663. 1498. 2716. 4105.]
 [   0.   39.   72.  108.  160.  302.  747. 1718. 3137. 4759.]
 [   0.   40.   74.  112.  171.  335.  848. 1964. 3592. 5447.]
 [   0.   43.   79.  119.  182.  362.  933. 2182. 4007. 6097.]]

Part 2: Compute Haar-like Features¶

The Viola-Jones detector uses AdaBoost to learn a kind of two-layer neural network classifier. The input of the classifier is a selected vector of Haar-like features, which are simple convolutions that can be computed very efficiently, for the entire dataset, from the stack of integral images.

Basically, a Haar-like feature is the sum of all pixel intensities in a rectangle, or the difference of several such sums. If $I[y,x]$ is an image, $r=[r_1,r_2,r_3,r_4]$ is a rectangle, and $\phi=[\phi_1,\phi_2,\phi_3,\phi_4,o_1,o_2]$ specifies fractions ($0\le\phi_1,\phi_2,\phi_3,\phi_4\le 1$) and filter orders ($1\le o_1$, $1\le o_2$, $o_1+o_2\le 4$), then the corresponding Haar-like feature is defined as

$$f=\sum_{y=\phi_2 r_4}^{(\phi_2+\phi_4)r_4-1}\;\sum_{x=\phi_1 r_3}^{(\phi_1+\phi_3)r_3-1} I[r_2+y,\,r_1+x]\,(-1)^{\mathrm{int}\left(o_2\frac{y-\phi_2 r_4}{\phi_4 r_4}\right)}\,(-1)^{\mathrm{int}\left(o_1\frac{x-\phi_1 r_3}{\phi_3 r_3}\right)}$$

The definition is not very obvious from that equation. It's easier to understand it if I show you a picture (CC-SA-3.0). In the following figure, the sum of the pixels which lie within the white blocks is subtracted from the sum of the pixels which lie within the black blocks. Subfigure (1) has order o1=2,o2=1, subfigure (2) has o1=1,o2=2, subfigure (3) has o1=3,o2=1, and subfigure (4) has o1=2,o2=2:

[Figure: four example Haar-like feature templates, each a subrectangle divided into black and white blocks (CC-SA-3.0).]

In order to make the homework gradeable, let's set a standard that matches the equation but is slightly different from this image: the upper-left block (block (0,0)) is always multiplied by +1, and the signs of the remaining blocks alternate in a checkerboard pattern, so that block $(k,l)$ is multiplied by $(-1)^{k+l}$.
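
To make that convention concrete, here is a quick illustrative check of the sign assigned to each block (this snippet is just for intuition; it is not part of the assignment):

import numpy as np

# Sign of block (k,l) under our standard: (-1)**(k+l), i.e., +1 in the
# upper left, alternating in a checkerboard pattern from there.
for (o1, o2) in [(2, 1), (1, 2), (3, 1), (2, 2)]:
    signs = (-1) ** np.add.outer(np.arange(o2), np.arange(o1))  # (o2,o1): rows=l, cols=k
    print('o1=%d, o2=%d:' % (o1, o2))
    print(signs)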

If a Haar-like feature is computed directly from the original image, each feature requires $O(r_3 r_4)$ summations. If we're computing many features, that becomes a lot of computation --- remember that $r_3$ and $r_4$ are of the same order as the size of the image, i.e., hundreds of pixels.

Viola and Jones showed that the same features can be very efficiently computed from the integral image. For example, the summation of all pixels within the block shown below can be computed as J(A)+J(C)−J(B)−J(D) in the following image, where J(⋅) is the integral image at point ⋅ (Public Domain image):

[Figure: a rectangular block whose corner points are labeled A, B, C, and D in the integral image (public domain).]

If we use b[k,l] to denote the (k,l)th block within a subrectangle, then the total feature can be computed as:

$$s_1=r_1+\phi_1 r_3,\qquad s_2=r_2+\phi_2 r_4,\qquad s_3=\phi_3 r_3/o_1,\qquad s_4=\phi_4 r_4/o_2$$

$$b[k,l]=J[\lfloor s_2+ls_4\rfloor,\lfloor s_1+ks_3\rfloor]+J[\lfloor s_2+(l+1)s_4\rfloor,\lfloor s_1+(k+1)s_3\rfloor]-J[\lfloor s_2+ls_4\rfloor,\lfloor s_1+(k+1)s_3\rfloor]-J[\lfloor s_2+(l+1)s_4\rfloor,\lfloor s_1+ks_3\rfloor]$$

$$f=\sum_{k=0}^{o_1-1}\sum_{l=0}^{o_2-1}(-1)^{k+l}\,b[k,l]$$

In order to make AdaBoost computationally efficient, you will want to compute a single feature (for a single specified value of the parameters $\phi=[\phi_1,\phi_2,\phi_3,\phi_4,o_1,o_2]$) for all of the rectangles in all of the images, all at once, as specified in the docstring for haarlike_features:

In [8]:
importlib.reload(submitted)
help(submitted.haarlike_features)
Help on function haarlike_features in module submitted:

haarlike_features(ii, rects, xfrac, yfrac, wfrac, hfrac, xorder, yorder)
    Compute Haar-like features with given params for all rects.
    Feature is computed as follows:
    (1) In each rectangle, find the subrectangle that starts at pixel
    (y+yfrac*h,x+xfrac*w), and ends at pixel 
    (y+(yfrac+hfrac)*h-1,x+(xfrac+wfrac)*w-1).
    (2) Divide the subrectangle into (yorder by xorder) blocks.
    (3) Add the pixels in positive blocks, and subtract the pixels 
    in negative blocks, where the top-left block is always positive.
    (But you shouldn't actually add pixels.  Instead, use the 
    integral image (ii) in a smart way).
    
    @param:
    ii (npix,m+1,n+1): a stack of npix integral images
    rects (2,npix,4,4): for each class, for each image, 4 rects
    xfrac (real): subrectangle start (fraction of rects[:,:,:,2])
    yfrac (real): subrectangle start (fraction of rects[:,:,:,3])
    wfrac (real): subrectangle width (fraction of rects[:,:,:,2])
    hfrac (real): subrectangle height (fraction of rects[:,:,:,3])
    xorder (int): subrectangle contains this many blocks across
    yorder (int): subrectangle contains this many blocks down
    
    @return:
    features (2,npix,4): features[y,i,r] is feature for class y,
    for the i'th image, for the r'th rectangle in the image
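
If the docstring alone isn't enough, here is a hedged sketch of one way haarlike_features might be vectorized (the name and structure are illustrative; any implementation matching the docstring is fine). It evaluates b[k,l] for every (class, image, rectangle) triple at once by fancy-indexing the stack of integral images:

import numpy as np

def haarlike_features_sketch(ii, rects, xfrac, yfrac, wfrac, hfrac, xorder, yorder):
    nclass, npix, nrect, _ = rects.shape
    x, y, w, h = rects[..., 0], rects[..., 1], rects[..., 2], rects[..., 3]
    s1 = x + xfrac * w            # subrectangle left edge
    s2 = y + yfrac * h            # subrectangle top edge
    s3 = wfrac * w / xorder       # width of one block
    s4 = hfrac * h / yorder       # height of one block
    # image index for each (class, image, rectangle) triple:
    img = np.broadcast_to(np.arange(npix)[None, :, None], (nclass, npix, nrect))
    f = np.zeros((nclass, npix, nrect))
    for k in range(xorder):
        for l in range(yorder):
            c0 = np.floor(s1 + k * s3).astype(int)        # left column
            c1 = np.floor(s1 + (k + 1) * s3).astype(int)  # right column
            r0 = np.floor(s2 + l * s4).astype(int)        # top row
            r1 = np.floor(s2 + (l + 1) * s4).astype(int)  # bottom row
            # b[k,l] from the four corners of the integral image:
            b = ii[img, r0, c0] + ii[img, r1, c1] - ii[img, r0, c1] - ii[img, r1, c0]
            f += (-1) ** (k + l) * b
    return f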

First, let's debug by using the top-left 6×6 rectangle in images[0].

In [9]:
print(ii[0,:7,:7])
[[  0.   0.   0.   0.   0.   0.   0.]
 [  0.   9.  17.  26.  38.  63. 131.]
 [  0.  14.  28.  47.  76. 136. 287.]
 [  0.  15.  30.  54.  95. 185. 414.]
 [  0.  18.  35.  61. 105. 209. 494.]
 [  0.  30.  56.  87. 133. 246. 578.]
 [  0.  38.  70. 105. 154. 279. 663.]]

We'll use np.tile to create 8 copies of that rectangle for every image in the dataset:

In [10]:
dummyrects = np.tile(np.array([0,0,6,6]),(2,len(images),4,1))
print('dummyrects has shape',dummyrects.shape)
dummyrects has shape (2, 124, 4, 4)

First, let's vary $(\phi_1,\phi_2,\phi_3,\phi_4)$ to compute the order $(o_1=1,o_2=1)$ features for various sub-blocks:

In [11]:
importlib.reload(submitted)
allblocks=[(0,1),(0,0.5),(0.5,0.5)]
for (xfrac,wfrac) in allblocks:
    for (yfrac,hfrac) in allblocks:
        f = submitted.haarlike_features(ii,dummyrects,xfrac,yfrac,wfrac,hfrac,1,1)
        print('(%g,%g,%g,%g): feature=%g'%(xfrac,yfrac,wfrac,hfrac,f[0,0,0]))
(0,0,1,1): feature=663
(0,0,1,0.5): feature=414
(0,0.5,1,0.5): feature=249
(0,0,0.5,1): feature=105
(0,0,0.5,0.5): feature=54
(0,0.5,0.5,0.5): feature=51
(0.5,0,0.5,1): feature=558
(0.5,0,0.5,0.5): feature=360
(0.5,0.5,0.5,0.5): feature=198

Now let's compute the feature for every possible feature order, and check by hand to make sure it makes sense:

In [12]:
importlib.reload(submitted)

for xorder in range(1,4):
    for yorder in range(1,5-xorder):
        features=submitted.haarlike_features(ii,dummyrects,0,0,1,1,xorder,yorder)
        print('xorder=%d,yorder=%d,feature=%d'%(xorder,yorder,features[0,0,0]))
xorder=1,yorder=1,feature=663
xorder=1,yorder=2,feature=165
xorder=1,yorder=3,feature=249
xorder=2,yorder=1,feature=-453
xorder=2,yorder=2,feature=-159
xorder=3,yorder=1,feature=495

Let's convert data['train'][:]['rectangles'] to an array called rects, and test to make sure we can compute a feature from it.

In [13]:
npix = ii.shape[0]
rects = np.empty((2,npix,4,4))
for i in range(npix):
    for y in range(2):
        for r in range(4):
            rects[y,i,r,:]=np.array(data['train'][i]['rectangles'][y][r])

importlib.reload(submitted)
features=submitted.haarlike_features(ii,rects,0,0,1,1,1,1)
print('features have shape',features.shape)
print('features of class 0 first image:',features[0,0,:])
print('features of class 1 first image:',features[1,0,:])
features have shape (2, 124, 4)
features of class 0 first image: [2859162. 3622397. 3276614. 4199616.]
features of class 1 first image: [ 333009.  896903. 3715836. 2166603.]

Part 3: Find the best threshold and sign for a given feature¶

AdaBoost creates a strong classifier based on the weighted voting of a large number of weak classifiers. The $j$th weak classifier is defined as:

$$h_j(x)=\begin{cases}1 & p_j f_j(x) < p_j\theta_j\\ 0 & \text{otherwise,}\end{cases}$$

where $p_j\in\{-1,+1\}$ is a sign, and $\theta_j\in\mathbb{R}$ is a threshold.

In AdaBoost, each training token (each tuple $x=(y,i,r)$ where $y$ is a class label, $i$ is an image index, and $r$ is a rectangle) has a weight $w_j(x)$, and the weights all add to 1. The error rate can be computed by just adding up the weights of the misclassified tokens. Thus the error rate of the $j$th weak classifier is:

$$\epsilon_j=\sum_x w_j(x)\,|y(x)-h_j(x)|$$

If the sign is positive ($p_j=+1$), then

$$\epsilon_j=\sum_{f_j(x)<\theta_j} w_j(x)\,(1-y(x)) + \sum_{f_j(x)\ge\theta_j} w_j(x)\,y(x)$$

If the sign is negative ($p_j=-1$), then

$$\epsilon_j=\sum_{f_j(x)\le\theta_j} w_j(x)\,y(x) + \sum_{f_j(x)>\theta_j} w_j(x)\,(1-y(x))$$

Notice that the optimum value of $\theta_j$ is always equal to the feature value of one of the training tokens. We can therefore find the optimum threshold in $O(\textrm{npix})$ time, as follows:

  • sort all of the labels $y(x)$ and weights $w_j(x)$ in ascending order of the feature $f_j(x)$
  • create the cumulative sums of $w_j(x)y(x)$ and $w_j(x)(1-y(x))$ over these sorted arrays
  • find the $x$ for which $\theta_j=f_j(x)$ minimizes $\epsilon_j$ (a sketch of this procedure follows this list).
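Here is a minimal sketch of that procedure (names illustrative; it ignores ties between equal feature values):

import numpy as np

def weak_classifier_sketch(features, weights, ind):
    y = ind[0].astype(float)      # class label of the k'th smallest feature
    w = weights[ind]              # weights in the same sorted order
    wy = np.cumsum(w * y)         # sum over j<=k of w_j * y_j
    wny = np.cumsum(w * (1 - y))  # sum over j<=k of w_j * (1 - y_j)
    epsilon = np.empty((2, len(w)))
    # sign=-1: h=1 iff f > theta; errors are y=1 tokens with f <= theta
    # plus y=0 tokens with f > theta:
    epsilon[0, :] = wy + (wny[-1] - wny)
    # sign=+1: h=1 iff f < theta; errors are y=0 tokens with f < theta
    # plus y=1 tokens with f >= theta:
    epsilon[1, :] = (wny - w * (1 - y)) + (wy[-1] - (wy - w * y))
    return epsilon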
In [14]:
importlib.reload(submitted)
help(submitted.weak_classifier)
Help on function weak_classifier in module submitted:

weak_classifier(features, weights, ind)
    Find the best weak classifier for a set of features.
    The weak classifier is:
    h(a,b) = 1 if sign*f < sign*threshold, otherwise h(a,b)=0.
    
    @param:
    features (2,npix,4): features
      features[0,:,:] are the features of class 0 (faces)
      features[1,:,:] are the features of class 1 (nonfaces)
    weights (2,npix,4): importance weight of each feature
      such that np.sum(weights)=1
    ind (tuple): the result of running
      np.unravel_index(np.argsort(features,axis=None),features.shape)
      ind[0][k] is the class label,
      ind[1][k] is the image index,
      ind[2][k] is the rectangle index of the k'th smallest feature.
      features[ind] gives features sorted in ascending order,
      weights[ind] gives weights sorted in the same order.
    
    @return:
    epsilon (2,8*npix): 
      epsilon[0,k]=error if sign=-1 and threshold=features[ind][k]
      epsilon[1,k]=error if sign=1 and threshold=features[ind][k]

First, let's test np.argsort, to see how it sorts the features.

In [15]:
ind=np.unravel_index(np.argsort(features,axis=None),features.shape)

import matplotlib.pyplot as plt
fig, ax = plt.subplots(2,1,figsize=(14,12))
ax[0].plot(features[ind])
ax[0].set_title('Sorted features',fontsize=18)
ax[1].plot(ind[0])
ax[1].set_title('Class labels of sorted features',fontsize=18)
fig.tight_layout()

We can see that the features are, indeed, sorted. We can see also that the small-value features tend to be of class 1 (nonface), and the high-value features tend to be of class 0 (face). Remember that this is feature (0,0,1,1,1,1), i.e., the sum of all pixels in the rectangle. Apparently, faces tend to have larger pixel-sums than nonfaces. There are two possible reasons for this:

  • Maybe faces are brighter than the background. This would be a useful thing to discover.
  • Since faces are close-up in this database, maybe this database tends to have large face rectangles. Since nonface rectangles are generated at random, maybe they are often smaller. This is not a generalizable fact about faces; it is just an artifact of the way in which the database was created! Still, it might be useful for this MP, at least.

Anyway, let's go ahead and run weak_classifier to find out the training error of this classifier. We'll start out by giving equal weight to every training token.

In [16]:
weights = np.ones(features.shape)/np.prod(features.shape)
importlib.reload(submitted)
epsilon=submitted.weak_classifier(features,weights,ind)

print('p=-1, theta=%d, eps=%5.5g'%(features[ind][0],epsilon[0,0]))
print('p=+1, theta=%d, eps=%5.5g'%(features[ind][0],epsilon[1,0]))
print('p=-1, theta=%d, eps=%5.5g'%(features[ind][-1],epsilon[0,-1]))
print('p=+1, theta=%d, eps=%5.5g'%(features[ind][-1],epsilon[1,-1]))
p=-1, theta=28583, eps=0.50101
p=+1, theta=28583, eps=  0.5
p=-1, theta=7801254, eps=  0.5
p=+1, theta=7801254, eps=0.50101

Notice that:

  • with the smallest threshold and a positive sign, ϵ=0.5, because all of the y=1 tokens are classified as h=0.
  • with the smallest threshold and a negative sign, ϵ>0.5. The smallest token has true label y=1, but is misclassified as h=0. All larger tokens are classified as h=1, including all of the tokens that should have been y=0.

The opposite reasoning holds true for the largest threshold.

Now let's plot the error as a function of sort-index.

In [17]:
import matplotlib.pyplot as plt
fig, ax = plt.subplots(1,1,figsize=(14,7))
line0,=ax.plot(epsilon[0,:])
line1,=ax.plot(epsilon[1,:])
ax.legend([line0,line1],['sign=-1','sign=+1'],fontsize=18)
fig.tight_layout()

The +1 classifier achieves pretty good error, because it classifies small features as h=1. Error of the -1 classifier is pretty bad, because it classifies small features as h=0. Let's find the best threshold:

In [18]:
p,k = np.unravel_index(np.argmin(epsilon,axis=None),epsilon.shape)
theta = features[ind][k]
print('Best sign is',p,'and best threshold is',theta)
Best sign is 1 and best threshold is 1743865.0

Part 4: Train AdaBoost¶

AdaBoost's strong classifier is a weighted vote among many weak classifiers. The weak classifiers are chosen one by one, so that each weak classifier can be targeted to try to fix the errors made by the previous weak classifier. Specifically, AdaBoost training repeats the following steps:

  • For t=1,…,T:
    1. Scale the weights so they sum to 1
    2. Out of all possible weak classifiers, choose the one with the smallest weighted error, ϵt=minjϵj
    3. Decrease the weights of the tokens that were correctly classified: multiply their weights by $\beta_t=\epsilon_t/(1-\epsilon_t)$

The result is a sequence of weak classifiers, $[h_1,\ldots,h_T]$, and their corresponding weights $[\beta_1,\ldots,\beta_T]$, where $h_t=[\phi_t,p_t,\theta_t]$ consists of a set of feature parameters ($\phi_t=[\phi_{1,t},\ldots,\phi_{4,t},o_{1,t},o_{2,t}]$) together with the corresponding optimum sign, $p_t$, and threshold, $\theta_t$.

In [103]:
importlib.reload(submitted)
help(submitted.train_adaboost)
Help on function train_adaboost in module submitted:

train_adaboost(ii, rects, weights)
    Perform one iteration of adaboost training.
    1. Normalize weights so they sum to one
    2. Find the best weak classifier, computed by
    exhaustive search among all classifiers 
    in the set:
      xfrac in {0,1/6,...,5/6}
      yfrac in {0,1/6,...,5/6}
      wfrac in {1/6,2/6,...,1-xfrac}
      hfrac in {1/6,2/6,...,1-yfrac}
      xorder in {1,...,3}
      yorder in {1,...,4-xorder}
      sign in {-1,+1}
      threshold in 
        haarlike_features(ii,rects,xfrac,...,yorder)
    3. Downweight correctly classified tokens by beta
    
    @param:
    ii (npix,m+1,n+1): a stack of npix integral images
    rects (2,npix,4,4): for each class, for each image, 4 rects
    weights (2,npix,4): importance weight of each feature.
      May not sum to 1.
    
    @return:
    h (list): xfrac,yfrac,wfrac,hfrac,xorder,yorder,sign,threshold
      of the best weak classifier
    beta (real): eps/(1-eps), where eps is the weighted error
    newweights (2,npix,4): new weights, with weights of correctly
      classified tokens downweighted by beta.  Need not sum to
      one.

The only way to find the best weak classifier is with an exhaustive search over every possible combination of parameters. That may take a while, so, if you wish, you can have your function print status updates every now and then. I recommend making this conditional on the parameter verbose, because you may not want it to be verbose during the longer training run that comes next. Mine prints a status update for each new combination of xfrac and wfrac, if verbose==True.
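
If the structure of the search is unclear, here is a minimal sketch of one AdaBoost iteration, assuming haarlike_features and weak_classifier behave as documented above (the loop bounds match the search grid in the docstring; everything else is illustrative):

import numpy as np
import submitted

def train_adaboost_sketch(ii, rects, weights, verbose=False):
    weights = weights / np.sum(weights)          # step 1: normalize
    besteps, besth, bestf = np.inf, None, None
    for x6 in range(6):                          # xfrac = x6/6
        for w6 in range(1, 7 - x6):              # wfrac = w6/6
            if verbose:
                print('Searching xfrac=%d/6, wfrac=%d/6' % (x6, w6))
            for y6 in range(6):
                for h6 in range(1, 7 - y6):
                    for xorder in range(1, 4):
                        for yorder in range(1, 5 - xorder):
                            f = submitted.haarlike_features(
                                ii, rects, x6/6, y6/6, w6/6, h6/6, xorder, yorder)
                            ind = np.unravel_index(np.argsort(f, axis=None), f.shape)
                            eps = submitted.weak_classifier(f, weights, ind)
                            p, k = np.unravel_index(np.argmin(eps), eps.shape)
                            if eps[p, k] < besteps:      # step 2: best weak classifier
                                besteps = eps[p, k]
                                sign = [-1, 1][p]
                                besth = [x6/6, y6/6, w6/6, h6/6,
                                         xorder, yorder, sign, f[ind][k]]
                                bestf = f
    beta = besteps / (1 - besteps)
    sign, theta = besth[6], besth[7]
    hx = (sign * bestf < sign * theta).astype(int)   # weak decisions for all tokens
    labels = np.arange(2).reshape(2, 1, 1)           # true class of each row
    newweights = weights * np.where(hx == labels, beta, 1.0)  # step 3: downweight
    return besth, beta, newweights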

Notice that, since the first step of train_adaboost is to normalize the weights, we can just initialize it by setting weights to an all-ones matrix.

In [115]:
importlib.reload(submitted)

weights = np.ones((2,ii.shape[0],4))
h,beta,newweights=submitted.train_adaboost(ii,rects,weights,True)

print('best classifier is',h,', and beta is',beta)
Searching xfrac=0/6, wfrac=1/6
Searching xfrac=0/6, wfrac=2/6
Searching xfrac=0/6, wfrac=3/6
Searching xfrac=0/6, wfrac=4/6
Searching xfrac=0/6, wfrac=5/6
Searching xfrac=0/6, wfrac=6/6
Searching xfrac=1/6, wfrac=1/6
Searching xfrac=1/6, wfrac=2/6
Searching xfrac=1/6, wfrac=3/6
Searching xfrac=1/6, wfrac=4/6
Searching xfrac=1/6, wfrac=5/6
Searching xfrac=2/6, wfrac=1/6
Searching xfrac=2/6, wfrac=2/6
Searching xfrac=2/6, wfrac=3/6
Searching xfrac=2/6, wfrac=4/6
Searching xfrac=3/6, wfrac=1/6
Searching xfrac=3/6, wfrac=2/6
Searching xfrac=3/6, wfrac=3/6
Searching xfrac=4/6, wfrac=1/6
Searching xfrac=4/6, wfrac=2/6
Searching xfrac=5/6, wfrac=1/6
best classifier is [0.0, 0.0, 1.0, 0.6666666666666666, 1, 2, -1, -111920.0] , and beta is 0.1481481481481506

Now let's make sure that some of the weights were changed. The newweights should have two values now: one value for tokens that were incorrectly classified (about ϵ=β/(1+β)=0.13 of the tokens), and a lower value for tokens that were correctly classified (the remaining 87%). The weights don't need to sum to one.
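
As a quick illustrative check, you can also list the distinct weight values directly:

# newweights should contain exactly two distinct values: the original
# (normalized) weight, and the beta-downweighted one.
print('Distinct weight values:', np.unique(np.round(newweights, 12)))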

In [116]:
fig, ax = plt.subplots(1,1,figsize=(14,4))
line0,=ax.plot(newweights[0,:,:].flatten())
line1,=ax.plot(newweights[1,:,:].flatten())
ax.legend([line0,line1],['class 0','class 1'],fontsize=18)
fig.tight_layout()
print('Sum of newweights is',np.sum(newweights))
Sum of newweights is 0.25806451612903447

It looks like most of the tokens were downweighted, because they were correctly classified.

Now that train_adaboost seems to be working, let's try running it 40 times, to get 40 weak classifiers. Each weak classifier will focus its attention on tokens that were misclassified by previous weak classifiers, so that, over time, a weighted vote of these classifiers gets stronger and stronger.

Warning: this will take a lot of time.

In [120]:
import json

weights = np.ones((2,ii.shape[0],4))
betalist = []
hlist = []
for t in range(40):
    h,beta,newweights=submitted.train_adaboost(ii,rects,weights)
    print('Finished iteration',t)
    print('Its error rate was %4.4g'%(beta/(1+beta)))
    print('newweights sum to %4.4g'%(np.sum(newweights)))
    hlist.append([float(x) for x in h])
    betalist.append(float(beta))
    with open('trained_model.json','w') as f:
        json.dump({'hlist':hlist,'betalist':betalist},f)    
    print('Saved %d weak classifiers'%(len(hlist)))
    weights = newweights
    
Finished iteration 0
Its error rate was 0.129
newweights sum to 0.2581
Saved 1 weak classifiers
Finished iteration 1
Its error rate was 0.1959
newweights sum to 0.3918
Saved 2 weak classifiers
Finished iteration 2
Its error rate was 0.215
newweights sum to 0.4301
Saved 3 weak classifiers
Finished iteration 3
Its error rate was 0.2211
newweights sum to 0.4422
Saved 4 weak classifiers
Finished iteration 4
Its error rate was 0.2429
newweights sum to 0.4857
Saved 5 weak classifiers
Finished iteration 5
Its error rate was 0.2404
newweights sum to 0.4808
Saved 6 weak classifiers
Finished iteration 6
Its error rate was 0.2913
newweights sum to 0.5825
Saved 7 weak classifiers
Finished iteration 7
Its error rate was 0.2728
newweights sum to 0.5457
Saved 8 weak classifiers
Finished iteration 8
Its error rate was 0.2924
newweights sum to 0.5847
Saved 9 weak classifiers
Finished iteration 9
Its error rate was 0.2969
newweights sum to 0.5938
Saved 10 weak classifiers
Finished iteration 10
Its error rate was 0.3031
newweights sum to 0.6063
Saved 11 weak classifiers
Finished iteration 11
Its error rate was 0.2824
newweights sum to 0.5647
Saved 12 weak classifiers
Finished iteration 12
Its error rate was 0.3034
newweights sum to 0.6067
Saved 13 weak classifiers
Finished iteration 13
Its error rate was 0.3267
newweights sum to 0.6535
Saved 14 weak classifiers
Finished iteration 14
Its error rate was 0.3177
newweights sum to 0.6354
Saved 15 weak classifiers
Finished iteration 15
Its error rate was 0.3219
newweights sum to 0.6438
Saved 16 weak classifiers
Finished iteration 16
Its error rate was 0.3041
newweights sum to 0.6082
Saved 17 weak classifiers
Finished iteration 17
Its error rate was 0.2939
newweights sum to 0.5878
Saved 18 weak classifiers
Finished iteration 18
Its error rate was 0.2861
newweights sum to 0.5722
Saved 19 weak classifiers
Finished iteration 19
Its error rate was 0.2997
newweights sum to 0.5995
Saved 20 weak classifiers
Finished iteration 20
Its error rate was 0.3206
newweights sum to 0.6413
Saved 21 weak classifiers
Finished iteration 21
Its error rate was 0.3276
newweights sum to 0.6553
Saved 22 weak classifiers
Finished iteration 22
Its error rate was 0.2937
newweights sum to 0.5874
Saved 23 weak classifiers
Finished iteration 23
Its error rate was 0.3194
newweights sum to 0.6388
Saved 24 weak classifiers
Finished iteration 24
Its error rate was 0.3067
newweights sum to 0.6134
Saved 25 weak classifiers
Finished iteration 25
Its error rate was 0.3277
newweights sum to 0.6554
Saved 26 weak classifiers
Finished iteration 26
Its error rate was 0.2974
newweights sum to 0.5949
Saved 27 weak classifiers
Finished iteration 27
Its error rate was 0.3253
newweights sum to 0.6506
Saved 28 weak classifiers
Finished iteration 28
Its error rate was 0.3286
newweights sum to 0.6572
Saved 29 weak classifiers
Finished iteration 29
Its error rate was 0.337
newweights sum to 0.674
Saved 30 weak classifiers
Finished iteration 30
Its error rate was 0.3161
newweights sum to 0.6322
Saved 31 weak classifiers
Finished iteration 31
Its error rate was 0.306
newweights sum to 0.612
Saved 32 weak classifiers
Finished iteration 32
Its error rate was 0.3045
newweights sum to 0.609
Saved 33 weak classifiers
Finished iteration 33
Its error rate was 0.3122
newweights sum to 0.6243
Saved 34 weak classifiers
Finished iteration 34
Its error rate was 0.3346
newweights sum to 0.6691
Saved 35 weak classifiers
Finished iteration 35
Its error rate was 0.3304
newweights sum to 0.6607
Saved 36 weak classifiers
Finished iteration 36
Its error rate was 0.3304
newweights sum to 0.6608
Saved 37 weak classifiers
Finished iteration 37
Its error rate was 0.328
newweights sum to 0.656
Saved 38 weak classifiers
Finished iteration 38
Its error rate was 0.3263
newweights sum to 0.6526
Saved 39 weak classifiers
Finished iteration 39
Its error rate was 0.3322
newweights sum to 0.6644
Saved 40 weak classifiers

Generally speaking, as the hardest tokens get a larger and larger fraction of the overall weight, the error rates get larger. The error rates should always remain less than 0.5, however.

In [122]:
epsilon = [ beta/(1+beta) for beta in betalist ]
fig, ax = plt.subplots(1,1,figsize=(14,4))
ax.plot(epsilon)
ax.set_title('weighted error vs. iteration number',fontsize=18)
Out[122]:
Text(0.5, 1.0, 'weighted error vs. iteration number')

Part 5: Test AdaBoost¶

In order to test an AdaBoost classifier, you need to take a weighted vote among the decisions of all of the weak classifiers. The decision of the strong classifier is

$$h(x)=\begin{cases}1 & \sum_t \alpha_t h_t(x) \ge \frac{1}{2}\sum_t \alpha_t\\ 0 & \text{otherwise}\end{cases}$$

where $\alpha_t=\log\frac{1}{\beta_t}$.

In [19]:
importlib.reload(submitted)
help(submitted.test_adaboost)
Help on function test_adaboost in module submitted:

test_adaboost(ii, rects, hlist, betalist)
    Test a trained adaboost classifier.
    
    @param:
    ii (ntest,m+1,n+1): a stack of ntest integral images
    rects (2,ntest,4,4): for each class, for each image, 4 rects
    hlist (list of lists): 
      [[xfrac,yfrac,wfrac,hfrac,xorder,yorder,sign,threshold]
      for each weak classifier.]
    betalist (list of reals): 
      [weighted_error/(1-weighted_error) for each weak classifier]
    
    @return:
    alpha (len(betalist)): the alpha weights
    hweak (2,ntest,4,len(betalist)): predictions of weak classifiers
    hstrong (2,ntest,4): predictions of strong classifier
      hstrong = (np.dot(hweak,alpha)>=np.sum(alpha)/2).astype('int')
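
Here is a minimal sketch of how test_adaboost might be assembled from the pieces above (assuming haarlike_features as documented; the last line matches the formula in the docstring, and the names are illustrative):

import numpy as np
import submitted

def test_adaboost_sketch(ii, rects, hlist, betalist):
    T = len(betalist)
    alpha = np.log(1.0 / np.array(betalist))     # alpha_t = log(1/beta_t)
    hweak = np.empty(rects.shape[:3] + (T,))
    for t, (xf, yf, wf, hf, xo, yo, sign, theta) in enumerate(hlist):
        f = submitted.haarlike_features(ii, rects, xf, yf, wf, hf, xo, yo)
        hweak[:, :, :, t] = (sign * f < sign * theta)   # weak decision h_t
    hstrong = (np.dot(hweak, alpha) >= np.sum(alpha) / 2).astype('int')
    return alpha, hweak, hstrong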

OK, so first let's load the test images, compute their integral images, and convert data['test'][:]['rectangles'] to a numpy array:

In [20]:
ntest = len(data['test'])
m,n = img.shape
testim = np.empty((ntest,m,n))
testii = np.empty((ntest,m+1,n+1))
for i in range(ntest):
    filename = data['test'][i]['filename']
    testim[i,:,:]=np.array(Image.open(filename).convert('L'))

testii = submitted.integral_images(testim)
testre = np.empty((2,ntest,4,4))
for i in range(ntest):
    for y in range(2):
        for r in range(4):
            testre[y,i,r,:]=np.array(data['test'][i]['rectangles'][y][r])

Now we can try running test_adaboost:

In [21]:
with open('trained_model.json') as f:
    trained_model = json.load(f)
hlist = trained_model['hlist']
betalist = trained_model['betalist']
# Convert xorder, yorder, and sign back to integers!
for t in range(len(hlist)):
    hlist[t][4] = int(hlist[t][4])
    hlist[t][5] = int(hlist[t][5])
    hlist[t][6] = int(hlist[t][6])
In [22]:
importlib.reload(submitted)
alpha,hwk,hstr=submitted.test_adaboost(testii,testre,hlist,betalist)

errorcount = np.sum(hstr[0,:,:]!=0)+np.sum(hstr[1,:,:]!=1)
errorrate = errorcount / np.prod(hstr.shape)
print('Error rate of strong classifier is',errorrate)
Error rate of strong classifier is 0.0703125

Thus the strong classifier is not perfect, but it is better than any one of the weak classifiers alone!


Extra Credit¶

For extra credit, figure out how to run your trained face classifier as a face detector. In each input image, test every possible rectangle within that image, to see which one has the highest face detection score, where you can define the score to be a weighted sum of the binary decisions from weak classifiers:

$$s(x)=\sum_t \alpha_t h_t(x)$$

To avoid outrageous computational complexity, you will not search the entire image, but only the part specified in the function call. Notice that, even with only 10 possibilities in each range, you are still searching 10,000 possible rectangles.
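
Here is a hedged sketch of how such a search might look, reusing haarlike_features by packing each candidate rectangle into the (2,npix,4,4) shape it expects (the names are illustrative, and this quadruple loop is written for simplicity rather than speed):

import numpy as np
import submitted

def detect_faces_sketch(ii, xrange_, yrange_, wrange_, hrange_, hlist, betalist):
    alpha = np.log(1.0 / np.array(betalist))
    ii1 = ii[np.newaxis, :, :]            # treat the one image as a stack of 1
    score = np.zeros((len(xrange_), len(yrange_), len(wrange_), len(hrange_)))
    for a, x in enumerate(xrange_):
        for b, y in enumerate(yrange_):
            for c, w in enumerate(wrange_):
                for d, h in enumerate(hrange_):
                    rect = np.tile([x, y, w, h], (2, 1, 4, 1))  # shape (2,1,4,4)
                    s = 0.0
                    for t, (xf, yf, wf, hf, xo, yo, sign, theta) in enumerate(hlist):
                        f = submitted.haarlike_features(
                            ii1, rect, xf, yf, wf, hf, xo, yo)
                        s += alpha[t] * float(sign * f[0, 0, 0] < sign * theta)
                    score[a, b, c, d] = s   # s(x) = sum_t alpha_t * h_t(x)
    return score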

In [25]:
import extra, importlib
importlib.reload(extra)
help(extra.detect_faces)
Help on function detect_faces in module extra:

detect_faces(ii, xrange, yrange, wrange, hrange, hlist, betalist)
    Use a trained Adaboost face detector to find a face
    in an image.  You should look only in the set of rectangles
    (x,y,w,h) for x in xrange for y in yrange for w in wrange
    for h in hrange.
    
    @param:
    ii (m+1,n+1): integral image for just one input image
    xrange (iterable): possible upper-left corners of the rectangle
    yrange (iterable): possible upper-left corners of the rectangle
    wrange (iterable): possible widths of the rectangle
    hrange (iterable): possible heights of the rectangle
    hlist (list of lists): 
      [[xfrac,yfrac,wfrac,hfrac,xorder,yorder,sign,threshold]
      for each weak classifier.]
    betalist (list of reals): 
      [weighted_error/(1-weighted_error) for each weak classifier]
    
    @return:
    score (len(xrange),len(yrange),len(wrange),len(hrange)):
      strong-classifier score of each candidate rectangle

In [40]:
importlib.reload(extra)
xrange = range(160,180,2)
yrange = range(20,40,2)
wrange = range(130,150,2)
hrange = range(170,190,2)
score = extra.detect_faces(testii[0,:,:],xrange,yrange,wrange,hrange,hlist,betalist)
In [41]:
best_ind = np.unravel_index(np.argmax(score),score.shape)
bestx = xrange[best_ind[0]]
besty = yrange[best_ind[1]]
bestw = wrange[best_ind[2]]
besth = hrange[best_ind[3]]
bestscore = score[best_ind]
print('Best rectangle is',bestx,besty,bestw,besth)
print('with score',bestscore)
Best rectangle is 170 38 130 178
with score 21.53589260657187
In [42]:
fig, ax = plt.subplots(1,1,figsize=(10,10))
ax.imshow(testim[0,:,:],cmap='gray')
for y,color in enumerate(['c--','m-']):
    for r in range(4):
        plotrect(testre[y,0,r,:],ax,color)
plotrect([bestx,besty,bestw,besth],ax,'y-')
ax.set_title('Cyan: true, Magenta: false, Yellow: detected')
Out[42]:
Text(0.5, 1.0, 'Cyan: true, Magenta: false, Yellow: detected')
In [ ]: