Contributor: Charles Clancy tclancy@uiuc.edu

Q: Which of the following dependencies can cause potential problems for an out-of-order processor, but not an in-order pipelined processor?

Q: Refer to "Coming Challenges in Microarchitecture and Architecture." What is instruction reuse?

Contributor: Weining Gu wgu@uiuc.edu

Q: Which of the following statements are true?

Q: Which of the following statements are true?

Contributor: Daniel Herring dherring@uiuc.edu

Q: Given a banked cache design of M banks and an N-bit data/instruction fetch per cycle, what is the minimum bank width in bits? Justify.

Q: Do the following arguments validly support a single-chip-single-processor design? Explain.

Contributor: Geoff Kent gkent@uiuc.edu

Q: Given an N-bit global history register and a 2^N entry counter table for branch prediction, what is used to index the table if g-share is the chosen indexing method?
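For reference, the g-share indexing idea can be sketched as follows. This is a minimal illustration, not a description of any particular processor: the function name `gshare_index` and the 2-bit shift (which assumes 4-byte aligned instructions) are assumptions introduced here.

```python
def gshare_index(pc: int, ghr: int, n_bits: int) -> int:
    """Index into a 2**n_bits entry counter table under g-share.

    The low-order branch address bits are XORed with the global
    history register, and the result is truncated to n_bits.
    """
    mask = (1 << n_bits) - 1
    # Assumption: instructions are 4-byte aligned, so the two lowest
    # PC bits carry no information and are dropped before the XOR.
    return ((pc >> 2) ^ ghr) & mask


# Two different branches that see the same global history
# land on different counter-table entries.
index_a = gshare_index(pc=0x400, ghr=0b1010, n_bits=4)
index_b = gshare_index(pc=0x404, ghr=0b1010, n_bits=4)
```

With a plain global-history index, both branches above would share one counter; the XOR is what separates them.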

Q: If a 3-bit history table is used in a two-level local history branch prediction scheme, which of the following loop branch taken pattern pairs will cause branch prediction interference in the counter table?

Contributor: Tyler Ralston tralston@uiuc.edu

Q: What are the advantages of Dynamic Scheduling?

Q: Some of the challenges of multiple issue with Tomasulo's algorithm are:

Contributor: Amit Patel ajpatel2@uiuc.edu

Q: Memory-memory instruction set architectures, such as VAX, are rarely used or implemented today for the following reasons:

Q: Negative interference among Global History table entries is a major problem. Which of the following approaches decreases negative interference or avoids its effect?

Contributor: Galen Rasche rasche@uiuc.edu

Q: Which of the following statements about cache performance is/are true?

Q: Which of the following directly influences the number of stalls in a pipeline?

Contributor: David Schmitt deschmit@uiuc.edu

Q: Intel's trace-based instruction caching patent describes how cached instructions are placed in a data array. In the example given in the patent, predecessor and successor trace segments are placed in sets ((x-1) modulo S) and ((x+1) modulo S) respectively (x is the set number and S is the total number of sets). The predecessor and successor segments may also be placed in any one of 4 possible ways (or banks). To do this, the tag must hold four extra bits: two bits point to the predecessor way and two bits point to the successor way. Why not save space and restrict the predecessor and successor ways to ((current_way - 1) modulo 4) and ((current_way + 1) modulo 4) respectively?

Q: If DRAM addresses are divided into four groups of bits, what is the order?

Contributor: Nicholas Wang nwang@crhc.uiuc.edu

Q: What kind of branch predictor (local or global history) are each of the branches in the code fragment below best suited for? In other words, local and global history branch predictors are designed to take advantage of certain properties of branches. Which branch predictor is designed to take advantage of the properties exhibited by the branches below? Assume that all "if" and "while" statements are branches.

Q: Increasing the length of the global history register in a global branch prediction scheme will eventually decrease performance. Why? Assume that all the bits in the GHR are used to index the pattern history table.

Contributor: Ilyas Ayub ayub@uiuc.edu

Q: We have two processors A and B. B's frequency is twice that of A, and B's operating voltage is half that of A. Everything else about these two processors is the same. What is the power consumption of B relative to A?
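One way to approach the question above is through the usual first-order model of dynamic power. This is a sketch under the assumption that only dynamic (switching) power matters and that the switched capacitance C and activity factor \alpha are identical for A and B; leakage is ignored.

```latex
P = \alpha C V^{2} f
\qquad\Longrightarrow\qquad
\frac{P_B}{P_A}
  = \left(\frac{V_B}{V_A}\right)^{2} \cdot \frac{f_B}{f_A}
  = \left(\frac{1}{2}\right)^{2} \cdot 2
  = \frac{1}{2}
```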

Contributor: Lee Baugh leebaugh@students.uiuc.edu

Q: Between two adjacent levels of cache, Lk and Lk+1, what differences *could* affect the inclusion properties of the memory system?

Q: Given a 4-way set-associative unified cache, which of the following statements is true?

Contributor: Steve Lindemann slindema@students.uiuc.edu

Q: In "And Now a Case for More Complex Instruction Sets", Flynn et al. conclude that adding support for (some) register-memory instructions and half-length instructions is useful. What are the primary reasons they give for this?

Q: Characteristics of a "Billion Transistor Processor" include:

Contributor: Rob Mihalko rjmihalk@students.uiuc.edu

Q: A 1.80 GHz Intel Pentium 4 processor operates at 1.75 V, is manufactured using a 0.18 micron process, and consumes 66.1 W of power. Find the savings in power consumption if the operating voltage were reduced to 1.65 V.
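The arithmetic behind this kind of question can be sketched in a few lines. The sketch assumes that all of the quoted power is dynamic power scaling with the square of the voltage, and that the clock frequency stays unchanged; the helper name `scaled_power` is introduced here for illustration only.

```python
def scaled_power(power_w: float, v_old: float, v_new: float) -> float:
    """Power after a voltage change, assuming P scales with V**2.

    Assumption: frequency, capacitance and activity are unchanged,
    and static (leakage) power is ignored.
    """
    return power_w * (v_new / v_old) ** 2


new_power = scaled_power(66.1, v_old=1.75, v_new=1.65)
savings = 66.1 - new_power
print(f"new power: {new_power:.2f} W, savings: {savings:.2f} W")
```

Under these assumptions the savings come to roughly 11% of the original power.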

Contributor: Naveen Neelakantam neelakan@students.uiuc.edu

Q: Which of the following design philosophies characterize a RISC?

Q: You are charged with designing the memory interface for a new shared-bus multiprocessor computer system. Your goal is to tune performance for a given set of applications. What application characteristics will help you make your decision? Assume that total memory bandwidth is fixed.

Contributor: Russell Schreiber rschreib@uiuc.edu

Q: Increasing the length of cache lines has the following results:

Q: Ways to improve trace cache performance (effective fetch rate) by a significant amount are:

Contributor: Fabrice Stevens

Q: What's the difference between a 1-address machine and a 2-address machine?

Q: If we have loop branches with taken patterns (1001)^n and (0011)^n, what is the smallest number of history bits we have to use in order to avoid interference and conflicts in the counter table? (We're working with local history tables.)
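A question of this kind can be checked mechanically. The sketch below is one possible model, not the only one: it assumes the counter table is indexed by the local history bits alone, that patterns are written oldest outcome first, and that a conflict means the same history is followed by different outcomes in steady state. The function names are hypothetical.

```python
def next_outcomes(pattern: str, k: int) -> dict:
    """Map each k-bit history in the repeating pattern to the
    set of outcomes that follow it (steady state only)."""
    n = len(pattern)
    table = {}
    for i in range(n):
        # k consecutive outcomes starting at position i, wrapping around.
        history = "".join(pattern[(i + j) % n] for j in range(k))
        outcome = pattern[(i + k) % n]
        table.setdefault(history, set()).add(outcome)
    return table


def min_history_bits(patterns, max_bits=16):
    """Smallest k for which every k-bit history predicts a single
    outcome across all patterns, or None if none is found."""
    for k in range(1, max_bits + 1):
        merged = {}
        for pattern in patterns:
            for history, outcomes in next_outcomes(pattern, k).items():
                merged.setdefault(history, set()).update(outcomes)
        if all(len(outcomes) == 1 for outcomes in merged.values()):
            return k
    return None


bits_needed = min_history_bits(["1001", "0011"])
```

Printing `next_outcomes(pattern, k)` for small values of k shows exactly which histories collide.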

Contributor: Charles Vitu cvitu@students.uiuc.edu

Q: According to David Patterson, why did CISCs dominate early architectures?

Q: Given equally sized counter tables: A global history branch prediction scheme is better than a local history branch prediction scheme in which of the following ways:

Contributor: Ritu Gupta rgupta5@crhc.uiuc.edu

Q: In general, the size of the I-cache is determined by the cost of implementation for a particular application. However, can the I-cache size be continually increased? Is the same trend seen with the trace cache?

Q: The three main factors that affect the delivery of the fetch mechanism are effective fetch rate, total cache miss penalty, and total branch misprediction penalty. Are all three affected positively by introducing partial fetch and inactive issue? (This question can also be formulated as a multiple-choice question.)

Contributor: Chao Huang chuang10@uiuc.edu

Q: In gshare, what is the advantage of XORing part of the branch address (PC) with part of the history buffer to index the pattern history table?

Q: Which of the following is/are true about victim cache?

Contributor: Brian Lam lam1@students.uiuc.edu

Q: Instruction supply can be improved by reducing the number of stalls or by increasing the fetch bandwidth. The following techniques reduce the number of stalls by reducing the branch misprediction penalty:

Q: Which of the following are common characteristics associated with RISC?

Contributor: Qilun Liu qilunliu@students.uiuc.edu

Q: 2-bit branch prediction can: ____

Q: The key differences between the scoreboard and Tomasulo's algorithm are:

Contributor: Esther Resendiz eresendi@uiuc.edu

Q: Which of the following is(are) TRUE of Register Renaming?

Q: Which of the following is(are) TRUE of DRAMs?

Contributor: Jeff Stine jstine@uiuc.edu

Q: The advantages of using a first level victim cache are:

Q: Well-established methods for reducing microprocessor power consumption include:

Contributor: Ying Wang

Q: Choose the right statement about two-level adaptive branch prediction.

Q: Choose the statements you think are reasonable for a BTB.

Contributor: Anand Shukla ashukla@uiuc.edu

Q: Superscalars and VLIW processors have their own advantages and pitfalls. As a designer, one might have to choose one over the other based on certain constraints:

Q: Bank conflict problem: Consider a non-blocking cache having 4 banks, where the bank for a given address is decided by looking at the last 2 bits in the address. Now consider a two dimensional array A with dimensions 4x8. The elements A[0][0], A[1][0], A[2][0], A[3][0] would end up hitting the same bank. This problem will not arise if:
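The conflict described above can be reproduced with a short sketch. It assumes row-major storage, addresses counted in array elements starting at 0, and a bank chosen by the low 2 bits of that address. The padded row length of 9 is only a hypothetical illustration of how a different stride changes the mapping, not necessarily the intended answer.

```python
NUM_BANKS = 4


def bank_of(row: int, col: int, row_length: int) -> int:
    """Bank selected by the low 2 bits of a row-major element address.

    Assumption: the array starts at address 0 and each element
    occupies one addressable unit.
    """
    address = row * row_length + col
    return address & (NUM_BANKS - 1)


# Column 0 of the 4x8 array: every row starts at a multiple of 8,
# so the low 2 address bits are always 00.
column0_banks = [bank_of(r, 0, row_length=8) for r in range(4)]

# Hypothetical variant: rows padded to 9 elements.
padded_banks = [bank_of(r, 0, row_length=9) for r in range(4)]

print(column0_banks, padded_banks)
```

Any row length that is a multiple of the number of banks reproduces the conflict for column accesses.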

Q: Modern processors need instructions supplied at a high rate. This is accomplished by: