## CENG3420 Homework 2

**Due**: Mar. 21, 2021

Please submit a PDF or Word document directly on Blackboard. DO NOT SUBMIT A COMPRESSED ZIP OR TARBALL.

## Q1 (10%)

- Write down the bit pattern in the fraction of value 1/3 assuming a floating-point format that uses binary numbers in the fraction. Assume there are 24 bits, and you do not need to normalize. Is this representation exact?
- Write down the binary representation of the decimal number 63.25, assuming it is stored in the single-precision IBM format (base 16, instead of base 2, with 7 bits of exponent).
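
The fraction bit pattern of a value between 0 and 1 can be generated by repeated doubling: each doubling shifts out one bit, and a nonzero remainder after the last bit means the stored pattern is not exact. The following is a minimal Python sketch of that procedure. The function name `fraction_bits` and the example value 1/5 are illustrative assumptions, not part of the assignment.

```python
from fractions import Fraction


def fraction_bits(value, nbits=24):
    """Return (bits, exact) for the unnormalized binary fraction of 0 <= value < 1."""
    remainder = Fraction(value)
    if not 0 <= remainder < 1:
        raise ValueError("value must lie in [0, 1)")
    bits = []
    for _ in range(nbits):
        remainder *= 2              # shift the next bit into the integer position
        bit = int(remainder >= 1)
        bits.append(str(bit))
        remainder -= bit            # keep only the part still to be encoded
    # The pattern is exact only if nothing is left over after nbits bits.
    return "".join(bits), remainder == 0


# Hypothetical example value (1/5), chosen only for illustration.
bits, exact = fraction_bits(Fraction(1, 5))
print(bits, exact)
```

Using `Fraction` rather than `float` keeps the remainder exact, so the `exact` flag reflects the arithmetic and not the host machine's own rounding.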

## Q2 (15%)

Examine the difficulty of adding a proposed `lwi.d rd, rs1, rs2` ("Load With Increment") instruction to RISC-V. Interpretation: `Reg[rd] = Mem[Reg[rs1] + Reg[rs2]]`

1. Which new functional blocks (if any) do we need for this instruction?
2. Which existing functional blocks (if any) require modification?
3. Which new data paths (if any) do we need for this instruction?
4. What new signals do we need (if any) from the control unit to support this instruction?

## Q3 (15%)

Add NOP instructions to the code below so that it will run correctly on a pipeline that does not handle data hazards.

    addi x11, x12, 5
    add  x13, x11, x12
    addi x14, x11, 15
    add  x15, x13, x12
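
The padding procedure behind this kind of question can be sketched in Python. The sketch assumes a pipeline with no forwarding or hazard detection in which the register file is written in the first half of a cycle and read in the second half, so a consumer must issue at least three slots after the producer of any register it reads. The function `insert_nops`, the `(text, dest, sources)` tuple layout, and the two-instruction example are all hypothetical and are not taken from the assignment.

```python
def insert_nops(program, min_distance=3):
    """Pad a list of (text, dest, sources) instructions with NOPs.

    Assumption: a consumer must issue at least `min_distance` slots after
    the producer of any register it reads (no forwarding, register file
    written in the first half of a cycle and read in the second half).
    """
    scheduled = []      # instruction text, including inserted NOPs
    last_write = {}     # register name -> slot index of its latest producer
    for text, dest, sources in program:
        for reg in sources:
            if reg in last_write:
                # Pad until this consumer sits far enough behind the producer.
                while len(scheduled) - last_write[reg] < min_distance:
                    scheduled.append("nop")
        if dest is not None and dest != "x0":
            last_write[dest] = len(scheduled)
        scheduled.append(text)
    return scheduled


# Hypothetical two-instruction example (not the sequence from the question).
example = [
    ("addi x5, x6, 1", "x5", ["x6"]),
    ("add x7, x5, x6", "x7", ["x5", "x6"]),
]
print("\n".join(insert_nops(example)))
```

Changing `min_distance` models a different register-file timing assumption; the value 3 applies only under the split-cycle assumption stated above.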

## Q4 (15%)

In this exercise, we examine in detail how an instruction is executed in a single-cycle datapath. Problems in this exercise refer to a clock cycle in which the processor fetches the following instruction word: 0x00c6ba23.

1. What are the values of the ALU control unit inputs for this instruction?
2. What is the new PC address after this instruction is executed? Highlight the path through which this value is determined.
3. For each mux, show the values of its inputs and outputs during the execution of this instruction. List values that are register outputs as Reg[xn].
4. What are the input values for the ALU and the two add units?
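
Answering these questions starts with splitting the 32-bit instruction word into its fields, which sit at fixed bit positions in the base RISC-V formats. The Python sketch below shows one way to do that. The helper name `decode_fields` and the example word are illustrative assumptions, and how each field is interpreted (register number or immediate bits) depends on the instruction format selected by the opcode.

```python
def decode_fields(word):
    """Split a 32-bit RISC-V instruction word into its fixed-position fields."""
    def bits(hi, lo):
        # Extract bits hi..lo (inclusive) of the instruction word.
        return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

    return {
        "opcode": bits(6, 0),
        "rd": bits(11, 7),       # holds imm[4:0] in S-type, imm[4:1|11] in B-type
        "funct3": bits(14, 12),
        "rs1": bits(19, 15),
        "rs2": bits(24, 20),     # low bits of the immediate in I-type
        "funct7": bits(31, 25),  # high immediate bits in I-, S-, and B-type
    }


# Hypothetical example word: addi x11, x0, 5 (not the word from the question).
fields = decode_fields(0x00500593)
print({name: bin(value) for name, value in fields.items()})
```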

## Q5 (15%)

If we change load/store instructions to use a register (without an offset) as the address, these instructions no longer need to use the ALU. As a result, the MEM and EX stages can be overlapped and the pipeline has only four stages.

1. How will the reduction in pipeline depth affect the cycle time?
2. How might this change improve the performance of the pipeline?
3. How might this change degrade the performance of the pipeline?

## Q6 (30%)

Problems in this exercise refer to the following sequence of instructions, and assume that it is executed on a five-stage pipelined datapath:

    add x15, x12, x11
    ld  x13, 4(x15)
    ld  x12, 0(x2)
    or  x13, x15, x13
    sd  x13, 0(x15)

1. If there is no forwarding or hazard detection, insert NOPs to ensure correct execution.
2. Now, change and/or rearrange the code to minimize the number of NOPs needed. You can assume register x17 can be used to hold temporary values in your modified code.
3. If the processor has forwarding, but we forgot to implement the hazard detection unit, what happens when the original code executes?
4. If there is forwarding, for the first seven cycles during the execution of this code, specify which signals are asserted in each cycle by the hazard detection and forwarding units in the figure.
5. If there is no forwarding, what new input and output signals do we need for the hazard detection unit in the following figure? Using this instruction sequence as an example, explain why each signal is needed.

