Lab 11 Section 2: Assignment

Lab 11 Assignment

Part A. DCT
Use MATLAB to read in lena.raw and process each 8x8 subblock of lena.raw separately. MATLAB has a special function, blkproc, that lets you process each subblock of size (m x n) separately.

Take the 2-D discrete cosine transform (DCT) of each 8x8 subblock. With the "blkproc" command, the two-dimensional DCT of every block can be computed simply by
dct_coef = blkproc(image,[8 8],'dct2');
Type "help blkproc" at the MATLAB prompt to get more information about the usage of this function.

Having found the DCT coefficients of each block of the image, compute the average energy of each of the 64 coefficients across blocks. To do that, square the coefficient at a chosen location in each block, and sum these values across blocks (so that energies at the same location in different blocks are summed). Then divide these sums by the number of blocks to find the average energy of each of the 64 coefficients.

Now we are going to "prune" some of the coefficients in each block, so we need to decide which coefficients to keep. We will keep those with the highest average energy and discard the rest (this is where the compression takes place). You should see that most of the image's energy is concentrated around the zeroth frequency, which corresponds to the upper left corner of each block after taking the DCT. At first, try keeping only 10 coefficients (the 10 with the highest average energy) and see how the reconstructed image looks after taking the 2-D inverse DCT.
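The procedure described above can be sketched as follows. This is only an illustration under stated assumptions: it uses a small synthetic 64x64 image instead of lena.raw, and it builds the orthonormal DCT matrix D by hand so that it runs without blkproc or dct2 (D*X*D' plays the role of dct2(X) and D'*C*D the role of idct2(C)). The names img, coef, D, B and k are chosen for this example.

```matlab
% Sketch: blockwise DCT, average energy, zonal mask, reconstruction.
% Assumption: a synthetic test image stands in for lena.raw.
B = 8;                                  % block size
k = 10;                                 % number of coefficients to keep
[c, r] = meshgrid(0:63, 0:63);
img = 128 + 60*cos(2*pi*r/64) + 40*sin(2*pi*c/32);
[H, W] = size(img);

% Orthonormal DCT-II matrix: rows are frequencies u, columns are samples n.
[n, u] = meshgrid(0:B-1, 0:B-1);
D = sqrt(2/B) * cos(pi*(2*n + 1).*u / (2*B));
D(1, :) = D(1, :) / sqrt(2);

% Forward transform of every block, accumulating the squared coefficients.
coef = zeros(H, W);
energy = zeros(B, B);
nblocks = (H/B) * (W/B);
for i = 1:B:H
    for j = 1:B:W
        C = D * img(i:i+B-1, j:j+B-1) * D';
        coef(i:i+B-1, j:j+B-1) = C;
        energy = energy + C.^2;
    end
end
energy = energy / nblocks;              % average energy per location

% Zonal mask: 1 at the k locations with the highest average energy.
[~, order] = sort(energy(:), 'descend');
map = zeros(B, B);
map(order(1:k)) = 1;

% Discard the other coefficients and invert the transform block by block.
decompressed = zeros(H, W);
for i = 1:B:H
    for j = 1:B:W
        C = coef(i:i+B-1, j:j+B-1) .* map;
        decompressed(i:i+B-1, j:j+B-1) = D' * C * D;
    end
end
```

Because D is orthonormal, the squared reconstruction error of each block equals the summed squares of the coefficients discarded in that block, which gives a convenient sanity check on the results.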

Next, vary the number of coefficients kept and see how many coefficients you can discard until the image degradation is perceptually significant. Note that the subblock size matters: try several sizes and see how the quality varies (see below for the specific block sizes and numbers of retained coefficients you need to try for the report).

This procedure uses zonal masking, which means the same set of coefficients is kept for each mxn subblock (as opposed to a "locally adaptive" method of choosing which coefficients to keep on a subblock-by-subblock basis).
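The difference between the two strategies can be seen on a toy example. The sketch below uses two made-up 2x2 coefficient blocks, C1 and C2, which are illustrative values and not lab data. The zonal mask is derived once from the average energy, while the adaptive masks are chosen per block from that block's own coefficient magnitudes.

```matlab
% Sketch: zonal mask versus locally adaptive masks (hypothetical values).
C1 = [9 1; 2 0];                 % coefficients of a first block
C2 = [8 0; 1 5];                 % coefficients of a second block
k = 2;                           % coefficients kept per block

% Zonal: one mask for all blocks, from the average energy per location.
energy = (C1.^2 + C2.^2) / 2;
[~, order] = sort(energy(:), 'descend');
zonal = zeros(2, 2);
zonal(order(1:k)) = 1;           % gives [1 0; 0 1]

% Locally adaptive: each block keeps its own k largest magnitudes.
[~, o1] = sort(abs(C1(:)), 'descend');
adaptive1 = zeros(2, 2);
adaptive1(o1(1:k)) = 1;          % gives [1 0; 1 0]
[~, o2] = sort(abs(C2(:)), 'descend');
adaptive2 = zeros(2, 2);
adaptive2(o2(1:k)) = 1;          % gives [1 0; 0 1]

kept_zonal    = C1 .* zonal;     % [9 0; 0 0], the coefficient 2 is lost
kept_adaptive = C1 .* adaptive1; % [9 0; 2 0]
```

The adaptive choice keeps more energy in the first block, but a real coder would then have to transmit a separate mask for every block, whereas the zonal mask is sent once.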

You will be using the MATLAB command "blkproc" (and its variant "newblkproc" if you prefer, see the Coding section) extensively throughout this lab. For this part, you are using the two-dimensional discrete cosine transform and its inverse. In MATLAB, the forward and inverse DCT are built in for two dimensions as "dct2" and "idct2" respectively. You are going to use these functions with "blkproc".

For this part, prepare a MATLAB function called lab11a.m. Design this function such that it has 3 input and 4 output parameters.
Input parameters :
1) image (512x512 matrix)
2) block_size (typically the block size is a power of two, in the example explained above, we were using an 8x8 block)
3) coef_number (this is the number of the coefficients that we are going to keep for each block, in the example explained above the number of coefficients we are keeping for each block is 10)
Output parameters :
1) decompressed (the decompressed image which is the output after taking the 2D IDCT)
2) map (an index map that indicates the locations of the coefficients to keep in each block. This output is a binary matrix of size "block_size x block_size", which consists of values 0 and 1 only. If map(i,j) = 1 at a certain location, this means that we are keeping the coefficient at the (i,j)th location in every block. If map(i,j) = 0, this means that we are discarding the (i,j)th coefficient in every block).
3) energy (a matrix which gives the average energies at each location of the block. The way to compute the average energy for each location is explained above. This matrix should also be of size "block_size x block_size").
4) MSE/pixel (the mean squared error per pixel between the original image and the decompressed image)

Plot the contents of the matrix "energy" (in 2-D). Do the described experiments for block_sizes 8 and 16. You may want to try different values for the number of coefficients to keep. Give the average block energy matrix, the block coefficient mask matrix, and the resulting MSE per pixel for the following cases in your report :
1) block size = 8, 10 coefficients kept
2) block size = 8, 5 coefficients kept
3) block size = 16, 40 coefficients kept
4) block size = 16, 10 coefficients kept
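The MSE per pixel requested for each case is the average of the squared pixel differences between the original and the decompressed image. A minimal sketch on a made-up 2x2 example follows; the variable names original and mse_per_pixel are assumptions of this example. The conversion to double matters when the image is stored as uint8, because uint8 subtraction in MATLAB saturates at 0 instead of producing negative values.

```matlab
% Sketch: mean squared error per pixel (hypothetical 2x2 images).
original     = uint8([10 20; 30 40]);
decompressed = uint8([12 20; 30 36]);

% Convert to double first so that negative differences are not clipped.
err = double(original) - double(decompressed);
mse_per_pixel = mean(err(:).^2);     % (4 + 0 + 0 + 16) / 4 = 5
```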

Part B. KLT
The KLT is useful for compression assuming that the image is stationary and we can reliably estimate its statistical parameters (namely the mean and the correlation).

In this part of the lab, we are going to apply the KLT under these assumptions. We will apply the KLT to image subblocks. Then, as in the previous part, we are going to discard the coefficients with low energies and keep the coefficients with high energies. In fact, this part is very similar to Part A in terms of implementation, except that you are not going to work with a fixed transform that is built into MATLAB. You are going to find the KL transform yourself by estimating the corresponding statistical parameters and doing an eigenvector decomposition of the autocovariance matrix, as explained in class. The algorithmic summary of this part is as follows :
1) Read in the image in MATLAB.
2) Estimate the mean of subblocks. In order to do this simply take the arithmetic mean of all the blocks you have.
3) Subtract the mean matrix found from each block to make each block zero mean under our stationarity assumption.
4) Find the autocovariance matrix of the blocks. To find it, simply compute the autocorrelation matrix of the "zero mean" blocks. First stack all pixels within each zero-mean block into a vector (use a particular order, for example raster scan; the ordering, i.e. the way you form the 1D vector from each 2D block, does not affect the results as long as it is used consistently). Having "reshaped" each subblock as a vector, the autocorrelation matrix is given by :
R = (1/N) sum (x*x')
where R is the resulting autocorrelation matrix, N is the total number of blocks in the image, and x is the 1D vector formed by stacking all the elements of a particular block in a certain order. "x" is assumed to be a column vector and "x'" is the transpose of "x", so x*x' is an outer product and R has one row and one column per pixel of a block. The summation is carried out over all vectors "x", i.e. over all blocks of the zero-mean image. Observe that the autocorrelation matrix, R, of the zero-mean image is equal to the autocovariance of the original image.
5) Having found the autocovariance matrix R, carry out the eigenvector decomposition of this matrix by using the command "eig" in MATLAB. The transpose of the eigenvector matrix found gives you the KL transform.
6) Apply this KL transform to each block of the original image (actually you should apply this to each block in a way that all the elements within a particular block are stacked in the vector form using the methodology you used above).
7) Having found the transform coefficients, find the average energy of each coefficient in the block. (Recall Part A - DCT, use the same methodology)
8) Keep a certain number of coefficients within each block, at the same locations in every block, such that these coefficients are the ones with the highest average energies. Discard the rest. (Recall Part A).
9) Take the inverse KL transform. Now you have the decompressed image. (Note that the inverse KL transformation matrix is just the transpose of the forward KL transformation matrix. Therefore the inverse KL transformation matrix is the eigenvector matrix of the autocovariance matrix you found, similarly the forward KL transformation matrix is its transpose).
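
Steps 2 through 9 can be sketched as follows. This is an illustration under stated assumptions, not a reference solution: it uses a small synthetic 32x32 image with 4x4 blocks instead of lena.raw, it stacks each block in MATLAB's column-major order via blk(:) rather than raster scan, and names such as X, Z, T and keep are chosen for this example. As in step 6, the transform is applied to the blocks of the original image.

```matlab
% Sketch: blockwise KLT with a zonal mask (synthetic stand-in image).
B = 4;  k = 4;
[c, r] = meshgrid(0:31, 0:31);
img = 128 + 60*cos(2*pi*r/32) + 40*sin(2*pi*c/16) + 5*sin(r.*c/7);
[H, W] = size(img);
N = (H/B) * (W/B);                   % number of blocks

% Stack every block into one column of X (B^2 rows, N columns).
X = zeros(B*B, N);
idx = 0;
for i = 1:B:H
    for j = 1:B:W
        idx = idx + 1;
        blk = img(i:i+B-1, j:j+B-1);
        X(:, idx) = blk(:);
    end
end

m = mean(X, 2);                      % step 2: mean block, as a vector
Z = X - repmat(m, 1, N);             % step 3: zero-mean blocks
R = (Z * Z') / N;                    % step 4: (1/N) sum of x*x'
R = (R + R') / 2;                    % guard against rounding asymmetry
[V, ~] = eig(R);                     % step 5: columns of V are eigenvectors
T = V';                              % forward KL transform

Y = T * X;                           % step 6: coefficients of each block
energy = mean(Y.^2, 2);              % step 7: average energy per coefficient
[~, order] = sort(energy, 'descend');
keep = zeros(B*B, 1);
keep(order(1:k)) = 1;                % step 8: same locations in every block
Xhat = V * (Y .* repmat(keep, 1, N));   % step 9: inverse transform is V

% Put the reconstructed vectors back into image blocks.
decompressed = zeros(H, W);
idx = 0;
for i = 1:B:H
    for j = 1:B:W
        idx = idx + 1;
        decompressed(i:i+B-1, j:j+B-1) = reshape(Xhat(:, idx), B, B);
    end
end
map = reshape(keep, B, B);           % mask in block_size x block_size form
```

Since R is symmetric, eig returns an orthonormal eigenvector matrix, which is why the inverse transform is simply the transpose of the forward one.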

For this part, prepare a MATLAB function called lab11b.m. Design this function such that it has 3 input and 4 output parameters.
Input parameters :
1) image (512x512 matrix)
2) block_size (typically the block size is a power of two, in the example explained above we were using an 8x8 block)
3) coef_number (this is the number of coefficients that we are going to keep in each block)
Output parameters :
1) decompressed (the decompressed image which is the output after taking the inverse KL transform)
2) map (an index map which indicates the places of the coefficients to keep in each block. This output is supposed to be a matrix of size "block_size x block_size", which consists of values 0 and 1 only. If map(i,j) = 1 at a certain location, this means that we are keeping all the coefficients in each block that have this location. If map(i,j) = 0, this means that we are discarding each (i,j)th coefficient within every block) .
3) energy (a matrix which gives the average energies of each location of the block. The way to compute the average energy for each location is explained above. This matrix should also be of size "block_size x block_size").
4) MSE/pixel

Plot the contents of the matrix "energy" (in 2-D). Do these experiments for block_sizes 8 and 16. You may want to try different values for the number of coefficients to pick. Give the average block energy matrix, the block coefficient mask matrix, and the resulting MSE per pixel for the following cases in your report :
1) block size = 8, 10 coefficients kept
2) block size = 8, 5 coefficients kept
3) block size = 16, 40 coefficients kept
4) block size = 16, 10 coefficients kept

Part C. JPEG
JPEG, an image compression standard, uses a method that is similar to the one used in Part A but is much more complicated. Due to the complexity of the algorithm, we do not implement a JPEG scheme of our own here; instead, experiment with JPEG files using John Bradley's program "xv". Type "xv image_filename" and save the file as a JPEG file. You can use lena.gif (see the Coding section of this lab), or any other "continuous-tone" image. A color image would also be good.

When you try to save a file using xv, xv will ask you for "Quality" and "Smoothing" ratios. Save many versions of your original image using various values of each parameter and observe how your saved JPEG images appear. How does the appearance depend on each parameter?

You DO NOT need to submit any code or results for this part. It is for your own benefit to observe how the quality depends on the parameters. As the quality factor decreases, you should be able to see blocking artifacts.
