[SOLVED] MIPS Consider the following code for a MIPS like pipelined processor with 5-stage pipeline, which computes Yi = (a.Xi + b).Xi + c for i=0..N-1 where N is the number of elements in a vector. For better readability, in the following I have named the registers as Rx, Ry, Ra, Rb, Rc, etc. Assume that the register R3 points to an array that contains 4*N, a, b, and c.

Brand: Assignment Chef
SKU: 5002975003
Price: 25 USD
Availability: InStock
Rating: 5 (1 reviews)

$25

File Name: MIPS_ Consider_the_following_code_for_a_MIPS_like_pipelined_processor_with_5_stage_pipeline__which_computes_Yi____a_Xi___b__Xi___c_for_i_0__N_1_where_N_is_the_number_of_elements_in_a_vector__For_better_readability__in_the_following_I_have_named_the_registers_as_Rx__Ry__Ra__Rb__Rc__etc___Assume_that_the_register_R3_points_to_an_array_that_contains_4_N__a__b__and_c_.zip
File Size: 3447.72 KB

SKU: 5002975003 Category: Programming Tags: AI, algorithm, Android, ARM, C, case study, compiler, Computer Architecture, concurrency, data mining, data science, data structure, database, decision tree, deep learning, distributed system, ER, file system, finance, GPU, GUI, Haskell, interpreter, Java, Javascript, kernel, Matlab, MIPS, PROLOG, Python, Scheme, SQL, x86

Description
Reviews (0)

5/5 - (1 vote)

Consider the following code for a MIPS-like pipelined processor with a 5-stage pipeline, which computes Yi = (a*Xi + b)*Xi + c for i = 0..N-1, where N is the number of elements in a vector. For better readability, the registers are named Rx, Ry, Ra, Rb, Rc, etc. Assume that register R3 points to an array that contains 4*N, a, b, and c.

      Load  R1, 0(R3)      // Load 4*N (byte offset just past the last element of array X)
Loop: Sub   R1, R1, #4     // In the first iteration, this ensures that we start with the last element of the array
      Load  Rx, 400(R1)    // Load Xi (Note: array X starts at address 400)
      Load  Ra, 4(R3)      // Load a
      Mul   Ry, Rx, Ra     // Multiply by a
      Load  Rb, 8(R3)      // Load b
      Add   Ry, Ry, Rb     // Add b
      Mul   Ry, Ry, Rx     // Multiply by Xi
      Load  Rc, 12(R3)     // Load c
      Add   Ry, Ry, Rc     // Add c
      Store Ry, 800(R1)    // Store Yi (Note: array Y starts at address 800)
      BNZ   R1, Loop
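As a cross-check on what the loop computes, the listing can be mirrored line by line in Python. This is only a minimal sketch: the names `run_loop` and `make_memory` are hypothetical, memory is modeled as a dict from byte address to value, the parameter block is placed at address 0 for convenience, and the load from 12(R3) is read as the load of c.

```python
def run_loop(mem, r3):
    """Mirror the listing on a dict `mem` mapping byte address -> value."""
    r1 = mem[r3]                  # Load R1, 0(R3): 4*N
    while True:
        r1 -= 4                   # Sub R1, R1, #4
        rx = mem[400 + r1]        # Load Rx, 400(R1)
        ra = mem[r3 + 4]          # Load Ra, 4(R3)
        ry = rx * ra              # Mul Ry, Rx, Ra
        rb = mem[r3 + 8]          # Load Rb, 8(R3)
        ry = ry + rb              # Add Ry, Ry, Rb
        ry = ry * rx              # Mul Ry, Ry, Rx
        rc = mem[r3 + 12]         # Load c from 12(R3)
        ry = ry + rc              # Add Ry, Ry, Rc
        mem[800 + r1] = ry        # Store Ry, 800(R1)
        if r1 == 0:               # BNZ R1, Loop falls through when R1 is zero
            break
    return mem


def make_memory(xs, a, b, c, r3=0):
    """Build a memory image: parameter block at r3 (assumed), X at address 400."""
    mem = {r3: 4 * len(xs), r3 + 4: a, r3 + 8: b, r3 + 12: c}
    for i, x in enumerate(xs):
        mem[400 + 4 * i] = x
    return mem


mem = run_loop(make_memory([1, 2, 3], a=2, b=3, c=4), r3=0)
print([mem[800 + 4 * i] for i in range(3)])  # [9, 18, 31]
```

Like the listing, the sketch walks the array from the last element down to the first and assumes N is at least 1. Any restructured or unrolled version of the loop should leave the same values in array Y.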
Assume that all ALU operations take one cycle, except for multiply, which takes 4 cycles. Assume that result forwarding is used, so that the only additional stalls are as follows: (a) one additional cycle is needed to use the result of a load, and (b) without any prediction, the branch must stall for 2 cycles before the next instruction can be executed. Also assume 2 adder units and 2 multiply units.
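The latency rules above can be turned into a small data-hazard stall counter. The sketch below is one possible reading, not a definitive model: it assumes a single-issue, in-order pipeline with forwarding, in which a dependent instruction waits 1 extra cycle after a load and 3 extra cycles after a 4-cycle multiply, while independent instructions keep flowing. It does not model branch stalls, functional-unit limits, or a multi-cycle stage blocking later instructions. The function name and the `(kind, dest, sources)` tuple format are hypothetical.

```python
# Extra cycles a dependent instruction must wait, beyond normal forwarding.
# Assumption: a 4-cycle multiply delays its consumer by 3 cycles.
EXTRA_LATENCY = {"alu": 0, "load": 1, "mul": 3, "store": 0, "branch": 0}


def count_data_stalls(program):
    """Count data-hazard stall cycles for an in-order, single-issue pipeline.

    `program` is a list of (kind, dest, sources) tuples, where `dest` is a
    register name or None and `sources` is a list of register names.
    """
    ready = {}      # register -> earliest cycle a consumer can issue
    cycle = 0       # cycle at which the next instruction would issue
    stalls = 0
    for kind, dest, sources in program:
        earliest = cycle
        for reg in sources:
            earliest = max(earliest, ready.get(reg, 0))
        stalls += earliest - cycle          # cycles spent waiting for operands
        cycle = earliest
        if dest is not None:
            ready[dest] = cycle + 1 + EXTRA_LATENCY[kind]
        cycle += 1
    return stalls


# A load followed immediately by its consumer costs one stall cycle.
print(count_data_stalls([("load", "Rx", ["R1"]), ("alu", "Ry", ["Rx"])]))  # 1
# One independent instruction between a multiply and its consumer hides one of the 3 stalls.
print(count_data_stalls([
    ("mul", "Ry", ["Rx", "Ra"]),
    ("alu", "R9", ["R8"]),
    ("alu", "Ry", ["Ry", "Rb"]),
]))  # 2
```

Encoding a candidate instruction order as such a list gives a quick way to compare schedules before drawing the full pipeline diagram.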
(a) Restructure the code to (i) move any unnecessary operations outside the loop, and (ii) minimize stalls by reordering instructions and suitably changing memory addresses. Explain your restructuring.
(b) Following the restructuring, show the timings for straight pipelined execution of the program. That is, if any pipeline stage needs n>1 cycles, the following instructions are delayed by an additional n-1 cycles. You can use diagrams like those in Figures C.31/C.32 of HP-CO. Indicate where the stalls happen and the number of stall cycles. Compute the number of cycles it takes to go through the loop once and issue the BNZ instruction.
(c) What additional stalls can you remove if you unroll the loop once? Show the result of unrolling and code restructuring.
(d) Assuming N=10, how many total cycles does the loop in (c) take if the branch is always predicted to be taken and it takes 5 additional cycles to clean up when the prediction is wrong?
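The total-cycle question reduces to simple arithmetic once a per-iteration cycle count is known. The sketch below assumes that an always-taken prediction is wrong exactly once, on the final fall-through, and it ignores pipeline fill and any instructions before the loop. The function name is hypothetical, and the per-iteration count of 20 is a placeholder, not the answer to the exercise.

```python
def total_loop_cycles(iterations, cycles_per_iteration, mispredict_penalty=5):
    """Total cycles for a loop whose back-edge is always predicted taken.

    Assumption: every branch is predicted correctly except the last one,
    which falls through and pays `mispredict_penalty` cleanup cycles.
    """
    mispredictions = 1 if iterations > 0 else 0
    return iterations * cycles_per_iteration + mispredictions * mispredict_penalty


# With N = 10 and the loop unrolled once, the branch executes 10 / 2 = 5 times.
# 20 cycles per iteration is a placeholder value.
print(total_loop_cycles(iterations=5, cycles_per_iteration=20))  # 105
```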

For the code above, now consider execution using Tomasulo's algorithm. Keep the same restrictions as above.
(a) Indicate the cycle in which each instruction is issued (i.e., ready to execute), the cycle in which it actually starts execution, and the cycle in which it completes. Indicate how many cycles it takes to issue the BNZ instruction the first time around the loop. You do not need to draw the full diagram that you saw in the lecture notes; only indicate the dependencies and clock cycle numbers to justify your answer.
(b) Repeat (a) under the assumption that there is only 1 multiply unit.
