[Solved] EE451 Homework5- Matrix Multiplication

1 Matrix Multiplication

In the lecture and discussion, we covered two approaches to computing the matrix product C = AB with CUDA: (1) an unoptimized implementation that uses global memory only and (2) block matrix multiplication that uses shared memory.
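As a rough illustration of approach (1), the sketch below assigns one thread to each element of C and reads A and B straight from global memory. It is a minimal sketch, not the course's reference solution: the kernel name `matmul_global`, the `float` element type, the row-major layout, and the omission of error checking are all assumptions. The sizes and fill values follow the assignment specification for Approach 1.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define N 1024  // matrix dimension from the assignment

// Hypothetical kernel name. Each thread computes one element of C,
// reading every operand directly from global memory.
__global__ void matmul_global(const float *A, const float *B, float *C) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < N && col < N) {
        float sum = 0.0f;
        for (int k = 0; k < N; ++k)
            sum += A[row * N + k] * B[k * N + col];  // row-major layout (assumed)
        C[row * N + col] = sum;
    }
}

int main() {
    const size_t bytes = (size_t)N * N * sizeof(float);
    float *hA = (float *)malloc(bytes);
    float *hB = (float *)malloc(bytes);
    float *hC = (float *)malloc(bytes);
    for (int i = 0; i < N * N; ++i) {
        hA[i] = 1.0f;  // every element of A is 1
        hB[i] = 2.0f;  // every element of B is 2
    }

    float *dA, *dB, *dC;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice);

    dim3 block(16, 16);                   // 16 x 16 threads per block
    dim3 grid(N / block.x, N / block.y);  // 64 x 64 blocks
    matmul_global<<<grid, block>>>(dA, dB, dC);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost);
    printf("C[451][451] = %.0f\n", hC[451 * N + 451]);

    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    free(hA); free(hB); free(hC);
    return 0;
}
```

Because every product term is 1 × 2 and each dot product has 1024 terms, every element of C, including C[451][451], should equal 2048. The file can be built with something like `nvcc p1.cu -o p1`.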

In this assignment, your task is to implement 1024 × 1024 matrix multiplication using both approaches.

  • Approach 1 (unoptimized implementation using global memory only):
    • Name this program p1.cu
    • The value of each element of A is 1
    • The value of each element of B is 2
    • Thread block configuration: 16 × 16
    • Grid configuration: 64 × 64
    • After computation, print the value of C[451][451]
  • Approach 2 (block matrix multiplication using shared memory):
    • Name this program p2.cu
    • The value of each element of A is 1
    • The value of each element of B is 2
    • Thread block configuration: 32 × 32
    • Grid configuration: 32 × 32
    • More details of this algorithm can be found in the paper "Matrix Multiplication with CUDA" under the Readings category on Blackboard.
    • After computation, print the value of C[451][451]
  • Report: measure the kernel execution time of Approach 1 and of Approach 2, then briefly discuss your observations.
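
The Approach 2 requirements and the timing step can be sketched as follows. This is one possible shape under stated assumptions, not the algorithm exactly as the referenced paper presents it: the kernel name `matmul_shared`, the `TILE` macro, the `float` element type, and the use of CUDA events for timing are choices made here for illustration, and error checking is omitted. Each 32 × 32 thread block stages one tile of A and one tile of B in shared memory, accumulates the partial products, then moves on to the next pair of tiles.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define N 1024   // matrix dimension from the assignment
#define TILE 32  // tile width, equal to the 32 x 32 thread block (assumed name)

// Hypothetical kernel name. Each block computes one TILE x TILE tile of C,
// staging matching tiles of A and B in shared memory. Assumes N % TILE == 0.
__global__ void matmul_shared(const float *A, const float *B, float *C) {
    __shared__ float As[TILE][TILE];
    __shared__ float Bs[TILE][TILE];

    int row = blockIdx.y * TILE + threadIdx.y;
    int col = blockIdx.x * TILE + threadIdx.x;
    float sum = 0.0f;

    for (int t = 0; t < N / TILE; ++t) {
        // Each thread loads one element of each tile.
        As[threadIdx.y][threadIdx.x] = A[row * N + t * TILE + threadIdx.x];
        Bs[threadIdx.y][threadIdx.x] = B[(t * TILE + threadIdx.y) * N + col];
        __syncthreads();  // tiles fully loaded before use

        for (int k = 0; k < TILE; ++k)
            sum += As[threadIdx.y][k] * Bs[k][threadIdx.x];
        __syncthreads();  // finish reading before the next load overwrites
    }
    C[row * N + col] = sum;
}

int main() {
    const size_t bytes = (size_t)N * N * sizeof(float);
    float *hA = (float *)malloc(bytes);
    float *hB = (float *)malloc(bytes);
    float *hC = (float *)malloc(bytes);
    for (int i = 0; i < N * N; ++i) { hA[i] = 1.0f; hB[i] = 2.0f; }

    float *dA, *dB, *dC;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice);

    dim3 block(TILE, TILE);        // 32 x 32 threads per block
    dim3 grid(N / TILE, N / TILE); // 32 x 32 blocks

    // Time only the kernel, using CUDA events recorded around the launch.
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    cudaEventRecord(start);
    matmul_shared<<<grid, block>>>(dA, dB, dC);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);
    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);

    cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost);
    printf("C[451][451] = %.0f\n", hC[451 * N + 451]);
    printf("Kernel time: %.3f ms\n", ms);

    cudaEventDestroy(start); cudaEventDestroy(stop);
    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    free(hA); free(hB); free(hC);
    return 0;
}
```

The same event pattern can wrap the Approach 1 kernel launch, so both measurements cover the kernel alone and exclude host-device transfers. With A filled with 1 and B with 2, the printed element should again be 2048. The first launch in a process may include one-time startup overhead, so averaging several runs is a reasonable precaution when comparing the two timings.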
