University of Illinois at Urbana-Champaign
Department of Electrical and Computer Engineering

ECE 511: Computer Architecture

Fall 2004

Homework 1

Purpose

The main purpose of this assignment is to give you experience building and using the simulator software.

Assignment

In class on Monday Aug 30 we discussed the performance improvements that one gets from faster transistors versus the performance improvements that come from architectural improvements. I showed you a graph and claimed that it demonstrated that of the ~72x performance improvement in Pentium processors over the last 10 years, 8x came from faster transistors and 9x came from "architecture."

The question then came up, "of the 9x performance improvement from architecture, how much (if any) was due to 'easy' architecture (just making the caches bigger)?" My assertion was that average memory latency has been increasing for the last 10 years, even though caches are getting much larger. This is because, while logic transistors have sped up by a factor of about 8, DRAM speeds have increased by only about a factor of 2. Thus a DRAM access today takes about 4 times as many gate delays as it did in 1994.

Let's test my assertion. After you unpack, build, and run the simulator (instructions below), make the following change. Currently the simulator has a structure somewhat similar to a circa 1994 Pentium: two-way in-order with simple branch prediction, a 16K L1 cache, no L2 cache, and a DRAM access cost of 30 cycles. The data cache model in the simulator is in the file exec_unit.cc.

Modify the simulator so that it models an L2 cache and has a DRAM access cost of 120 cycles (similar to assuming that we took a c. 1994 Pentium netlist, added an L2 cache to take advantage of our larger transistor budget, and fabbed it on a state-of-the-art 2004 fab). You'll have to make some assumptions about L2 hit latency. One reasonable assumption is that SRAM latency in cycles is sqrt(size/16384), with size in bytes. Thus a 64K L2 would have a 2 cycle latency, 256K => 4 cycles, 1M => 8 cycles, 4M => 16 cycles, 16M => 32 cycles.

What happens as you change your data cache organization? Does the number of cycles to execute the program increase or decrease, and by how much?

What to turn in

You may work alone or in groups of 2, 3 or 4. Last year I observed that people working in groups of 2 got the most out of the assignments (and the course). Each group should turn in a short (1-2 pages for assignment 1) report on what you implemented and what you learned. You should address questions like the following: What assumptions did you make? What did you try that didn't work? Which of the changes you made were most and least effective? For those features that weren't very effective, why do you think they didn't work very well?

The Simulator

The simulator is guaranteed to work only on Linux machines. It has been tested on EWS Linux Lab machines in Engineering Hall, and on a machine running RedHat Linux 9.0.

Step 1

Download the tarball for the simulator: ece511-hw1.tar.gz

Step 2

Unzip and untar the simulator by running the command:

This will create a source directory named "ece511". The source directory has the following structure:

ece511
  +---- apps
  |       +-------- lzw (spec benchmark)
  |       +-------- gcc (spec benchmark)
  |
  +---- sim   (the simulator)
  +---- utils (currently has just one binary: robjdump)

Step 3

Run make in the sim directory:

This compiles the simulator.

Step 4

Compile and run a program on the simulator:

lzw is a version of the Unix compress program. It creates a buffer of random text, then compresses and decompresses the text. While it is running, it prints out the random text and then some statistics:

The compressed/uncompressed size is: 802
Files both have length 802
Characters match.
processor halted

Number of cycles run: 2859871
Number of instructions: 952102
utilization: 0.332918

doubled clock gives equivalent speed to single-pump IPC of 0.665836

cache hits: 196000, cache misses: 22214
branches: 103858, mispredicted 7390
mispredictiction rate: 0.071155

Step 5

Make changes to the simulator. The simulator code is in the directory ece511/sim. For the first assignment you will mainly be modifying the file exec_unit.cc. To rebuild the simulator go to the directory ece511/sim and type make.