[SOLVED] CS compiler computer architecture scheme Pipelining 2

$25

File Name: CS_compiler_computer_architecture_scheme_Pipelining_2.zip
File Size: 499.26 KB

5/5 - (1 vote)

Pipelining 2
CS 154: Computer Architecture Lecture #15
Winter 2020
Ziad Matni, Ph.D.
Dept. of Computer Science, UCSB

Administrative
Talk is on MONDAY, MARCH 9th in our usual class Will take attendance
Final Exam info:
Tuesday, March 17th at 12:00 (not 12:30!!!) in this classroom

Arrive 10 mins early randomized seating
Cumulative Exam
Will allow some notes exact details to follow
Study guide/example Qs will be issued by this weekend
3/4/2020
Matni, CS154, Wi20 2

Re: Labs
Lab 7 still due on Sunday Lab 8 will be issued soon There IS a lab THIS Friday Re: lab NEXT Friday
3/4/2020 Matni, CS154, Wi20 3

Lecture Outline
Data Hazards in Pipelining: Forwarding vs Stalling
Control Hazards: Branch Prediction
3/4/2020 Matni, CS154, Wi20 4

Control Lines for the Last 3 Pipeline Stages
Same control signals that we learned earlier, but this time they are ferried across the pipelines
See tables in Fig. 4.48, 4.49 in textbook
These are derived from the instruction
3/4/2020 Matni, CS154, Wi20 5

Pipelined Datapath Showing Control Signals
3/4/2020 Matni, CS154, Wi20 6

Another Look at Data Hazards
Consider this sequence: sub $2, $1, $3
and $12, $2, $5 or $13, $6, $2 add $14, $2, $2 sw $15, 100($2)
All of the instructions after sub are dependent on the result in register $2 of the first instruction.
3/4/2020 Matni, CS154, Wi20 7

Lets say that reg. $2 has value 10 at first, but that the sub instruction changes that to -20.
This diagram shows that the instructions that would get the correct value of 20 are add and sw, while the AND and OR instructions would get the incorrect value 10!
3/4/2020 Matni, CS154, Wi20 8
We could resolve these hazards using forwarding
But how do we detect when to forward?

Forwarding vs. Stalling
We could stall using bubbles (inefficient)
or we could forward the data as soon as it is available to any units that need it before it is available to read in the WB stage
Lets only consider forwarding to an operation in the EX stage
That is, either an ALU operation or an address calculation
3/4/2020
Matni, CS154, Wi20
9

A Word on Some Notation to Use
Example: ID/EX.RegisterRs
Refers to the number of a register whose value is found in the pipeline register ID/EX
The 1st part is the name of the pipeline register (ID/EX) the 2nd part is the name of the field in that register (Rs)
ALU operand register numbers in EX stage are given by ID/EX.RegisterRs and ID/EX.RegisterRt
3/4/2020 Matni, CS154, Wi20 10

Data Hazards Occur When
1a. EX/MEM.RegisterRd = ID/EX.RegisterRs 1b. EX/MEM.RegisterRd = ID/EX.RegisterRt
2a. MEM/WB.RegisterRd = ID/EX.RegisterRs 2b. MEM/WB.RegisterRd = ID/EX.RegisterRt
1st hazard here is on register $2, between the result of sub and AND instructions.
This hazard can be detected when the AND instruction is in the EX stage and the sub instruction is in the MEM stage, so this is hazard 1a:
EX/MEM.RegisterRd = ID/EX.RegisterRs = $2
3/4/2020 11
Matni, CS154, Wi20

Detecting the Need to Forward
These comparisons are not enough, though! Some instructions dont write to registers
Solution: see if the RegWrite control signal will be active Additionally, check that:
1a. EX/MEM.RegisterRd = ID/EX.RegisterRs = $zero 1b. EX/MEM.RegisterRd = ID/EX.RegisterRt = $zero
2a. MEM/WB.RegisterRd = ID/EX.RegisterRs = $zero
2b. MEM/WB.RegisterRd = ID/EX.RegisterRt = $zero
3/4/2020 Matni, CS154, Wi20 12

Forwarding Paths (simplified)
See Fig. 4.55 in textbook for full explanation of the mux selects ForwardA and ForwardB
3/4/2020 Matni, CS154, Wi20 13

Forwarding Conditions
3/4/2020 Matni, CS154, Wi20 14

Double Data Hazards!
Consider the sequence:
add $1, $1, $2 add $1, $1, $3 add $1, $1, $4
Both hazards conditions occur at once!
We want to use the most recent value in $1
We have to revise the MEM hazard condition Only forward if EX hazard condition isnt true
3/4/2020 Matni, CS154, Wi20 15

Revised Forwarding Condition
3/4/2020 Matni, CS154, Wi20 16

Pipelined Datapath Modified
To Resolve Hazards Via Forwarding
3/4/2020 Matni, CS154, Wi20 17

Load-Use Data Hazard
A case where forwarding will not work is when an instruction tries to read a register following a load instruction that writes the same register.
3/4/2020 Matni, CS154, Wi20 18

Hazard Detection
We also need a Hazard Detection Unit!
It operates during the ID stage so that it can insert the stall
between the load and its use.
ALU operand register numbers in ID stage are given by
IF/ID.RegisterRs, IF/ID.RegisterRt Has one thing to check:
3/4/2020 Matni, CS154, Wi20 19

How to Stall the Pipeline
Force control values in ID/EX register to 0 EX, MEM and WB do a nop (no-operation)
Prevent update of PC and IF/ID register Current instruction is decoded again
Following instruction is fetched again
1-cycle stall allows MEM to read data for lw Can subsequently forward to EX stage
3/4/2020 Matni, CS154, Wi20 20

How Stalls Are Actually Done
3/4/2020 Matni, CS154, Wi20 21

Compiler vs Hardware Stalls
Compilers usually rely on the hardware to resolve hazards
Sometimes compilers take measures to avoid some types of hazards
BUT a compiler must understand the pipeline to achieve the best performance.
Otherwise, unexpected stalls will reduce the performance of the compiled code.
3/4/2020
Matni, CS154, Wi20 22

Pipelined Datapath Modified
To Resolve Hazards Via Forwarding OR Stalling
3/4/2020 Matni, CS154, Wi20 23

Control Hazards
Pipeline hazards involving branches
An instruction must be fetched at every clock cycle to sustain the pipeline
Buy the decision about whether to branch doesnt occur until the MEM pipeline stage
Stalling until the branch is complete is too slow
Control hazards occur less frequently than data hazards
We end up using simpler schemes.
3/4/2020 Matni, CS154, Wi20 24

Prediction Scheme 1: Assume branch not taken
Continue execution down the sequential instruction stream.
If the branch is taken, the instructions that are being fetched and decoded must now be discarded (flushed)
Execution continues at the branch target.
If the branch is untaken half the time, and if it costs little to discard the instructions, this optimization halves the cost of control hazards
3/4/2020 Matni, CS154, Wi20 25

Prediction Scheme 1.5: Reduce the Delays
That is, reduce the cost of the taken branch
Main idea: if we move the branch execution earlier in the pipeline (from MEM), then fewer instructions need be discarded.
Many branches rely only on simple tests (equality or sign) Such tests do not require a full ALU operation
Can be done with at most a few gates.
For more complex branch decisions use a separate instruction that uses an ALU to perform a comparison
3/4/2020 Matni, CS154, Wi20 26

What Needs to be Done?
2 actions have to occur:
Computing the branch target address earlier
Easy fix: we move the branch adder from the EX stage to the ID stage
Evaluating the branch decision earlier
Harder to do
For branch equal, we would compare the two registers read during the ID stage to see if they are equal.
Can be done with 1 XOR and 1 OR gate (32b gates).
Implies additional forwarding and hazard detection hardware
3/4/2020 Matni, CS154, Wi20 27

Example
Consider this code:
Assume that, in this example, the branch will be taken
3/4/2020 Matni, CS154, Wi20 28

3/4/2020 Matni, CS154, Wi20 29

Using Simple Forwarding in Branches
If a comparison register is a destination of a 2nd or 3rd preceding ALU instruction
Can resolve using forwarding
3/4/2020 Matni, CS154, Wi20 30

Dynamic Branch Prediction
In deeper and superscalar pipelines, branch penalty is more significant, so dynamic prediction is used, needing:
Branch prediction buffer (aka branch history table)
Indexing by recent branch instruction addresses
Store outcome (taken/not taken)
To execute a branch, then:
Check table, expect the same outcome
Start fetching from fall-through or target If wrong, flush pipeline and flip prediction
3/4/2020 Matni, CS154, Wi20 31

1-bit Predictor
A branch prediction buffer is a small memory indexed by the lower portion of the address of the branch instruction.
The memory contains a bit that says whether the branch was recently taken or not.
Very simple
Remember: a prediction is just an educated guess a hint we hope is
correct
If the hint turns out to be wrong:
the incorrectly predicted instructions are deleted the prediction bit is inverted and stored back
the proper sequence is fetched and executed.
3/4/2020 Matni, CS154, Wi20 32

Downsides to a 1-bit Predictor
You could get a sequence of wrong predictions!
Example:
Mis-predict taken on last iteration of the inner loop
Then mis-predict (as in do not take) on first iteration of
inner loop next time around
3/4/2020 Matni, CS154, Wi20 33

A Better Way? Use a 2-bit Predictor
Only change prediction on two successive mis-predictions
3/4/2020 Matni, CS154, Wi20 34

Final Pipelined Datapath Showing Hazard and Forwarding Detection
3/4/2020 Matni, CS154, Wi20 35

YOUR TO-DOs for the Week
Finish Lab 7 by Sunday
New Lab 8 (last one!) will be issued Thursday
Due next week, which is last week of classes
3/4/2020 Matni, CS154, Wi20 36

3/4/2020 Matni, CS154, Wi20 37

Reviews

There are no reviews yet.

Only logged in customers who have purchased this product may leave a review.

Shopping Cart
[SOLVED] CS compiler computer architecture scheme Pipelining 2
$25