Lab 11 Assignment
Part A. DCT
Use MATLAB to read in lena.raw and process each 8x8 subblock of
lena.raw separately. In MATLAB, there is a special function:
blkproc
that lets you process each subblock of size (m x n) separately.
Take the 2-D discrete cosine transform (DCT) of each 8x8 subblock.
With the "blkproc" command, the two-dimensional DCT of every 8x8 block is
computed by
dct_coef = blkproc(image,[8 8],'dct2');
Type "help blkproc" at the MATLAB prompt for more information about
the usage of this function.
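Before any block processing, the image has to be loaded. The following is a minimal sketch of reading a raw file with fopen/fread. It assumes (this is not stated above) that the file is headerless, 8 bits per pixel, and stored row by row. To stay runnable on its own, it writes and reads back a tiny stand-in file; for the lab you would set fname to 'lena.raw' and n to 512, and skip the writing step.

```matlab
% Assumption: headerless file, 8 bits per pixel, pixels stored row by row.
n = 4;                                  % use n = 512 for lena.raw
fname = [tempname '.raw'];              % hypothetical stand-in for 'lena.raw'

% Create a small demo file so that this sketch runs on its own.
demo = reshape(0:n*n-1, n, n)';         % row 1 holds 0..n-1, row 2 holds n..2n-1, ...
fid = fopen(fname, 'w');
fwrite(fid, demo', 'uint8');            % fwrite goes column by column, so transpose
fclose(fid);

% Read the file back into an n-by-n matrix of doubles.
fid = fopen(fname, 'r');
if fid == -1
    error('Could not open %s', fname);
end
img = fread(fid, [n n], 'uint8')';      % fread fills columns first, so transpose
fclose(fid);
delete(fname);
```

The transpose matters only if the file is stored row by row; if the displayed image appears flipped about its diagonal, remove it.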
Having found the DCT coefficients of each block of the image,
compute the average energy of each of the 64 coefficients across blocks.
In order to do that, square the coefficient at a chosen location
in each block,
and sum up these values across blocks (so that energies at the same location in
different blocks are summed up). Then normalize these values by the number of
the blocks to find the average energy of each of the 64 coefficients.
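As an illustration of the averaging described above, here is a minimal sketch that computes the average energy at each of the b x b locations without an explicit loop. The coefficient matrix used here is made-up data standing in for the output of blkproc; the variable names are assumptions.

```matlab
b = 8;                                        % block size
% Hypothetical stand-in for dct_coef = blkproc(image,[8 8],'dct2');
dct_coef = reshape(1:(32*32), 32, 32);
[rows, cols] = size(dct_coef);
nblocks = (rows/b) * (cols/b);                % number of blocks in the image

% Split each axis into (position within block, block index):
% c4(i, I, j, J) = dct_coef(i + (I-1)*b, j + (J-1)*b)
c4 = reshape(dct_coef, b, rows/b, b, cols/b);

% Square every coefficient, sum over the two block-index dimensions,
% then normalize by the number of blocks.
energy = squeeze(sum(sum(c4.^2, 2), 4)) / nblocks;   % b-by-b matrix
```

A double loop over blocks that accumulates block.^2 gives the same result and may be easier to read.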
Now, we are going to "prune" some of the coefficients in each block.
Hence, we need to decide which coefficients to
keep. We will keep those with the highest average energy, and discard
the rest (this is where the compression takes place.)
You should see
that most of the image's energy is concentrated around the zeroth
frequency, which corresponds to the upper left corner of each block after
taking the DCT. At first, try keeping only 10 coefficients (the 10 with
the highest average energy) and see how the reconstructed image looks
after taking the 2-D inverse DCT.
Next, vary the number of coefficients kept and see how many
coefficients you can discard until the image degradation is
perceptually significant. Note that the subblock size matters: try
several sizes and see how the quality varies (see below for the specific block sizes and numbers of retained coefficients required for the report).
This procedure uses zonal masking, which means the same
set of coefficients is kept for each m x n subblock (as opposed to a
"locally adaptive" method of choosing which coefficients to keep
on a subblock-by-subblock basis).
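The zonal mask can be sketched as follows: sort the average energies, mark the coef_number largest ones, and tile the resulting mask over the whole coefficient matrix. The energy values and coefficient matrix below are made up for illustration only.

```matlab
b = 8;                                   % block size
coef_number = 10;                        % number of coefficients to keep per block

% Hypothetical energy matrix that decays away from the upper left corner.
[u, v] = meshgrid(0:b-1);
energy = 1 ./ (1 + u + v).^2;

% Mark the coef_number locations with the highest average energy.
[~, idx] = sort(energy(:), 'descend');
map = zeros(b, b);
map(idx(1:coef_number)) = 1;

% Apply the same mask to every block by tiling it over the image.
dct_coef = ones(32, 32);                 % hypothetical coefficient matrix
[rows, cols] = size(dct_coef);
masked = dct_coef .* repmat(map, rows/b, cols/b);
```

The reconstructed image is then obtained by applying the inverse DCT to each block of masked.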
You will be using the MATLAB command "blkproc" (or its variant "newblkproc" if you prefer, see the Coding section) extensively throughout this
lab. For this part, you are using the two-dimensional discrete cosine transform
and its inverse. In MATLAB, the forward and inverse 2-D DCT are built in as
"dct2" and "idct2", respectively. You
are going to use these functions with "blkproc".
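A minimal round-trip sketch of the forward and inverse block DCT is shown below. It assumes the Image Processing Toolbox is available (blkproc, dct2 and idct2 live there) and uses a small synthetic image rather than lena.

```matlab
% Requires the Image Processing Toolbox (blkproc, dct2, idct2).
[c, r] = meshgrid(1:32);
img = 128 + 60 * sin(r/4) .* cos(c/6);        % hypothetical 32x32 test image

dct_coef = blkproc(img, [8 8], 'dct2');       % forward DCT of each 8x8 block
% (pruning of low-energy coefficients would go here)
rec = blkproc(dct_coef, [8 8], 'idct2');      % inverse DCT of each 8x8 block

% With no coefficients discarded, the round trip is lossless up to round-off.
max_err = max(abs(rec(:) - img(:)));
```

In recent MATLAB releases blkproc is marked as not recommended in favor of blockproc, whose function argument receives a struct; the equivalent call would be along the lines of blockproc(img,[8 8],@(blk) dct2(blk.data)).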
For this part, prepare a MATLAB function called lab11a.m. Design this
function such that it has 3 input and 4 output parameters.
Input parameters :
1) image (512x512 matrix)
2) block_size (typically the block size is a power of two, in the example
explained above, we were using an 8x8 block)
3) coef_number (this is the number of the coefficients that we are going to
keep for each block, in the example explained above the number of coefficients
we are keeping for each block is 10)
Output parameters :
1) decompressed (the decompressed image which is the output after taking the
2D IDCT)
2) map (an index map that indicates the locations of coefficients to keep
in each block. This output is a binary matrix of size "block_size x
block_size", which consists of values 0 and 1 only. If map(i,j) = 1 at a
certain location, this means that we are keeping the coefficient at the
(i,j)th location in every block. If map(i,j) = 0, this means that we are
discarding the (i,j)th coefficient in every block).
3) energy (a matrix which gives the average energies at each location of the
block. The way to compute the average energy for each location is explained
above. This matrix should also be of size "block_size x block_size").
4) MSE/pixel (the mean squared error per pixel between the original image
and the decompressed image)
Plot the contents of the matrix "energy" (in 2-D). Do the described
experiments for block sizes 8 and 16. You may want to try different
values for the number of coefficients to keep. Give the average block energy matrix, the block coefficient mask matrix, and the resulting MSE per pixel for the following cases in your report:
1) block size = 8, 10 coefficients kept
2) block size = 8, 5 coefficients kept
3) block size = 16, 40 coefficients kept
4) block size = 16, 10 coefficients kept
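The MSE per pixel requested above can be computed in one line. The sketch below uses two tiny made-up matrices in place of the original and decompressed images.

```matlab
% Hypothetical 2x2 "images"; in the lab these are 512x512 matrices.
original     = [10 20; 30 40];
decompressed = [12 20; 30 36];

% Mean of the squared pixel differences. The double() casts guard against
% integer arithmetic if the images were loaded as uint8.
err = double(original) - double(decompressed);
mse = mean(err(:).^2);
```

Here the squared errors are 4, 0, 0 and 16, so mse is 5.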
Part B. KLT
The KLT is useful for compression assuming that the image is stationary and we
can reliably estimate its statistical parameters (namely the mean and the
correlation).
In this part of the lab, we are going to apply the KLT transform under these
assumptions.
We will apply the KLT to image subblocks. Then, as in the previous part, we are
going to discard the coefficients with low energies and keep the coefficients
with high energies. As a matter of fact, this part is very similar to Part A
(in terms of implementation) except for the fact that you are not going to work
with a fixed transform that is built into MATLAB. You are going to find the KL
transform yourself by estimating the corresponding statistical parameters and
doing eigenvector decomposition on the autocovariance matrix as explained in
class.
The algorithmic summary of this part is as follows :
1) Read in the image in MATLAB.
2) Estimate the mean of subblocks. In order to do this simply take the
arithmetic mean of all the blocks you have.
3) Subtract the mean matrix found from each block to make each block zero mean
under our stationarity assumption.
4) Find the autocovariance matrix of the blocks. In order to find this, you
will simply find the autocorrelation matrix of the "zero mean" blocks. You
should first stack all pixels within each zero mean block into a vector (use a
particular order, for example raster scan. Note that the way of ordering, i.e.
the way you form the 1D vector from each 2D block, does not affect the
results as long as you use the same ordering throughout). Having "reshaped"
each subblock as a vector, the autocorrelation matrix is given by:
R = (1/N) sum (x*x')
where R is the resulting autocorrelation matrix, N is the total number of
blocks in the image, and x is the 1D vector formed by stacking all the elements
of a particular block in a certain order. "x" is assumed to be a
column vector and "x'" is the transpose of "x", so x*x' is an outer product
(a matrix product, not the elementwise ".*"). The summation is carried out
over all vectors x, i.e. over all blocks of the zero-mean image. Observe
that the autocorrelation matrix, R, of the zero-mean image is equal to the
autocovariance of the original image.
5) Having found the autocovariance matrix R, carry out eigenvector
decomposition of this matrix by using the command "eig" in MATLAB. The
transpose of the eigenvector matrix found gives you the KL transform.
6) Apply this KL transform to each block of the original image
(actually you should apply this to each block in a way that all the elements
within a particular block are stacked in the vector form using the methodology
you used above).
7) Having found the transform coefficients, find the average energy of each
coefficient in the block. (Recall Part A - DCT, use the same methodology)
8) Keep a certain number of coefficients at the same locations within each
block, such that these coefficients are the ones with the highest energies.
Discard the rest. (Recall Part A).
9) Take the inverse KL transform. Now you have the decompressed image.
(Note that the inverse KL transformation matrix is just the transpose of
the forward KL transformation matrix. Therefore the inverse KL
transformation matrix is the eigenvector matrix of the autocovariance
matrix you found, similarly the forward KL transformation matrix is its
transpose).
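Steps 2 through 9 can be sketched compactly in base MATLAB as follows. This is only an illustration under stated assumptions: it uses a small synthetic image and 4x4 blocks instead of lena, stacks each block column by column (any fixed ordering works), and all variable names are hypothetical.

```matlab
b = 4;  coef_number = 4;                    % small values, for illustration only
[c, r] = meshgrid(1:32);
img = 100 + 50*sin(r/5) + 30*cos(c/7) + 10*sin(r.*c/40);   % hypothetical image
[rows, cols] = size(img);
nbr = rows/b;  nbc = cols/b;  N = nbr*nbc;  % N = number of blocks

% Stack every b-by-b block as one column of X (size b^2-by-N).
X = reshape(img, b, nbr, b, nbc);           % dims: (i, I, j, J)
X = permute(X, [1 3 2 4]);                  % dims: (i, j, I, J)
X = reshape(X, b*b, N);

% Steps 2-3: mean block and zero-mean blocks.
mu = mean(X, 2);
Z = X - repmat(mu, 1, N);

% Step 4: R = (1/N) * sum of x*x' over all zero-mean blocks.
R = (Z * Z') / N;
R = (R + R') / 2;                           % guard against round-off asymmetry

% Step 5: eigenvectors of R; their transpose is the forward KL transform.
[V, ~] = eig(R);
T = V';

% Steps 6-7: transform the original blocks, then average energy per coefficient.
Y = T * X;
energy = mean(Y.^2, 2);

% Step 8: keep the coef_number coefficients with the highest average energy.
[~, idx] = sort(energy, 'descend');
keep = false(b*b, 1);
keep(idx(1:coef_number)) = true;
Y(~keep, :) = 0;
map = reshape(double(keep), b, b);          % mask in block layout

% Step 9: inverse transform (V is the transpose of T), then undo the stacking.
Xhat = V * Y;
D4 = permute(reshape(Xhat, b, b, nbr, nbc), [1 3 2 4]);
decompressed = reshape(D4, rows, cols);
mse = mean((img(:) - decompressed(:)).^2);
```

Because the transform is orthonormal, the resulting mse equals the sum of the discarded average energies divided by the number of pixels per block, which is a convenient sanity check on an implementation.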
For this part, prepare a MATLAB function called lab11b.m. Design this
function such that it has 3 input and 4 output parameters.
Input parameters :
1) image (512x512 matrix)
2) block_size (typically the block size is a power of two, in the example
explained above we were using an 8x8 block)
3) coef_number (this is the number of coefficients that we are going to
keep in each block)
Output parameters :
1) decompressed (the decompressed image which is the output after taking the
inverse KL transform)
2) map (an index map which indicates the places of the coefficients to keep
in each block. This output is supposed to be a matrix of size "block_size x
block_size", which consists of values 0 and 1 only. If map(i,j) = 1 at a
certain location, this means that we are keeping all the coefficients in each
block that have this location. If map(i,j) = 0, this means that we are
discarding each (i,j)th coefficient within every block) .
3) energy (a matrix which gives the average energies of each location of the
block. The way to compute the average energy for each location is explained
above. This matrix should also be of size "block_size x block_size").
4) MSE/pixel (the mean squared error per pixel between the original image
and the decompressed image)
Plot the contents of the matrix "energy" (in 2-D). Do these
experiments for block sizes 8 and 16. You may want to try different
values for the number of coefficients to keep. Give the average block energy matrix, the block coefficient mask matrix, and the resulting MSE per pixel
for the following cases in your report:
1) block size = 8, 10 coefficients kept
2) block size = 8, 5 coefficients kept
3) block size = 16, 40 coefficients kept
4) block size = 16, 10 coefficients kept
Part C. JPEG
JPEG, an image compression standard, uses a method that is similar to the one
used in Part A but is much more complicated. Due to the complexity of the
algorithm, we do not implement a JPEG scheme of our own here; instead,
experiment with JPEG files using John Bradley's program "xv". Type "xv
image_filename" and save the file as a JPEG file. You can use lena.gif (see the Coding section of this lab), or any other "continuous-tone" image. A color
image would also be good.
When you try to save a file using xv, xv will ask you for "Quality" and
"Smoothing" ratios. Save many versions of your original image using various
values of each parameter and observe how your saved JPEG
images appear. How does the appearance depend on each parameter?
You DO NOT need to submit any code or results for this part. It is for your
own benefit to observe how the quality depends on the parameters. As the
quality factor decreases, you should be able to see blocking artifacts.