Objective
The purpose of this lab is to introduce you to the CUDA API by implementing vector addition. You will implement vector addition by writing the GPU kernel code as well as the associated host code.
All parts of this lab will be submitted as one zipped file through eclass. Details for submission are at the end of the lab.
Instructions
Edit the code where the TODOs are specified and perform the following:
- Allocate device memory
- Copy host memory to device
- Initialize thread block and kernel grid dimensions
- Invoke CUDA kernel
- Copy results from device to host
- Free device memory
- Write the CUDA kernel
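The steps above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the lab's skeleton code: the names vecAdd, vecAddOnDevice and CUDA_CHECK, the block size of 256, and the test data in main are all hypothetical, and the provided project will have its own function names and file I/O.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro: abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error: %s (%s:%d)\n",               \
                    cudaGetErrorString(err_), __FILE__, __LINE__);    \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Kernel (hypothetical name): each thread adds one pair of elements.
__global__ void vecAdd(const float *in1, const float *in2, float *out, int len) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < len) {  // guard: the last block may have threads past the end
        out[i] = in1[i] + in2[i];
    }
}

// Hypothetical host wrapper that walks through the host-side steps in order.
void vecAddOnDevice(const float *hostIn1, const float *hostIn2,
                    float *hostOut, int len) {
    size_t bytes = (size_t)len * sizeof(float);
    float *devIn1 = nullptr, *devIn2 = nullptr, *devOut = nullptr;

    // Allocate device memory
    CUDA_CHECK(cudaMalloc((void **)&devIn1, bytes));
    CUDA_CHECK(cudaMalloc((void **)&devIn2, bytes));
    CUDA_CHECK(cudaMalloc((void **)&devOut, bytes));

    // Copy host memory to device
    CUDA_CHECK(cudaMemcpy(devIn1, hostIn1, bytes, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(devIn2, hostIn2, bytes, cudaMemcpyHostToDevice));

    // Initialize thread block and kernel grid dimensions.
    // 256 threads per block is an assumed value; the grid size is rounded up
    // so that every element is covered.
    dim3 block(256);
    dim3 grid((len + block.x - 1) / block.x);

    // Invoke CUDA kernel and check for launch errors
    vecAdd<<<grid, block>>>(devIn1, devIn2, devOut, len);
    CUDA_CHECK(cudaGetLastError());

    // Copy results from device to host (this call waits for the kernel to finish)
    CUDA_CHECK(cudaMemcpy(hostOut, devOut, bytes, cudaMemcpyDeviceToHost));

    // Free device memory
    CUDA_CHECK(cudaFree(devIn1));
    CUDA_CHECK(cudaFree(devIn2));
    CUDA_CHECK(cudaFree(devOut));
}

int main() {
    const int len = 1000;  // deliberately not a multiple of the block size
    float *a = (float *)malloc(len * sizeof(float));
    float *b = (float *)malloc(len * sizeof(float));
    float *c = (float *)malloc(len * sizeof(float));
    if (!a || !b || !c) return EXIT_FAILURE;

    // Made-up test data: small integers, so the float sums are exact.
    for (int i = 0; i < len; ++i) {
        a[i] = (float)i;
        b[i] = (float)(2 * i);
    }

    vecAddOnDevice(a, b, c, len);

    // Compare against a CPU reference.
    int mismatches = 0;
    for (int i = 0; i < len; ++i) {
        if (c[i] != a[i] + b[i]) ++mismatches;
    }
    printf("mismatches: %d\n", mismatches);

    free(a);
    free(b);
    free(c);
    return mismatches == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

The bounds check in the kernel matters because the grid size is rounded up: when the vector length is not a multiple of the block size, the last block contains threads whose index falls beyond the end of the arrays.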
Local Setup Instructions
Steps:
- Download Lab2.zip.
- Unzip the file.
- Open the Visual Studio solution in Visual Studio 2013.
- Build the project. Note that the project has three configurations:
- Test
- Debug
- Submission
For testing, it is recommended that you run the Debug configuration.
However, make sure the Submission configuration is selected when you prepare your final submission.
- Run the program by pressing the following button:
Do not try to run the program when the Test configuration is selected.
Vector Add Testing
- The Debug configuration will show false in the last line of the program's output if your code is incorrect.
The output vector can be seen in DatasetVectorAddTest[0-9]. For example, DatasetVectorAddTest myOutput.raw. The first line of the file is the size of the array. The Debug configuration runs the first test, DatasetVectorAddTest .
- You can also run the program from the Command Prompt (cmd).
VectorAdd -e <expected.raw> -i <input1.raw>,<input2.raw> -o <output.raw> -t vector
Make sure you are in the directory with the executable before trying to run the command.
- To run all tests, use the Test configuration: select it and build the program without starting the debugger.
Build -> Build Solution
In the Build Output window you should see the following:
Vector Add Testing Test 0
COMMAND
Same
Note that if a test fails, you will see Different or error instead of Same.
Alternatively, you can run the provided Test.bat file.
Using NSIGHT To Analyze Performance Instructions
This section is complemented by the file Guide on Debugging, Testing, Submitting and Profiling CUDA Project on e-class. Please read that file if you have not already done so.
In Application Setting, enter:
Application: <Pathtotheproject>TestVectorAdd.exe
Arguments: -e output.raw -i input0.raw,input1.raw -o myOutput.raw -t vector
Working Directory: <Pathtotheproject>DatasetVectorAddTest<Test Number>
Then follow the remaining steps as outlined in the guide file.
Submit an image called cuda_summary.jpg containing a screenshot of the CUDA Summary page that you get from running NSIGHT. For example:
Questions
Assume that the input vectors to your program have length N. Answers to the following questions must be expressed in terms of N.
- How many floating-point operations are being performed in your vector add kernel? EXPLAIN.
- How many global memory reads are being performed by your vector add kernel? EXPLAIN.
- How many global memory writes are being performed by your vector add kernel? EXPLAIN.
- In the vector add project, how many bytes are transferred from the Host to the Device? EXPLAIN.
- In the vector add project, how many bytes are transferred from the Device to the Host? EXPLAIN.