COMP5811M
Parallel & Concurrent Programming 2019
Coursework 2
20 marks (20% of module assessment)
Issue date: 26 November 2019, 10:00
Due date: 12 December 2019, 10:00
This coursework involves implementing an algorithm to run efficiently on the GPU.
Marks will be awarded for (a) correctness, (b) effective use of GPU resources and CUDA features,
and (c) professionalism — a well-structured program with comments.
Problem Description: Image blurring kernel
Many algorithms that operate over images are well-suited for execution on the GPU. An image can be thought of as simply a rectangular array of pixels (image formats are a different matter, but for this coursework you will be assuming that an image is a 2D array of pixels). Your task is to compute an image blurring kernel.
Mathematically, an image blurring function calculates the value of an output image pixel as a weighted sum of a patch of pixels encompassing the pixel in the input image. For this assignment, we will be taking a simplified approach for this blurring operation: each pixel is simply the average of a patch of pixels that surrounds it (i.e. we will not place a weight on the value of each pixel, which is typical in Gaussian blur).
For an M×N input image, consider a 3×3 patch as shown below. In this case, the value of the highlighted pixel A2,2 is equal to the average of the nine pixels outlined in red. Similarly, for every pixel at position (row, col), we would average a 3×3 patch spanning three rows (row-1, row, row+1) and three columns (col-1, col, col+1).
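M x N input image:

A1,1  A1,2  A1,3  ...  A1,N
A2,1  A2,2  A2,3  ...  A2,N
A3,1  A3,2  A3,3  ...  A3,N
...
AM,1  AM,2  AM,3  ...  AM,N

(The 3×3 patch centred on A2,2 covers rows 1 to 3 and columns 1 to 3.)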
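The averaging rule described above can be checked against a plain C reference that runs on the CPU. The following is a minimal sketch, not the expected GPU solution: the function name `blur_reference` is hypothetical, and it assumes that border pixels average only the neighbours that fall inside the image, since the text above does not say how borders should be handled. The image layout (interleaved 8-bit red, green, blue values in row-major order) follows the description of the provided PPM code.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference for the 3x3 averaging blur on an interleaved 8-bit RGB image.
 * ASSUMPTION: border pixels average only the neighbours that lie inside the
 * image; the coursework text does not specify border handling.
 * The average uses integer division, so results are truncated. */
void blur_reference(const unsigned char *in, unsigned char *out,
                    int width, int height)
{
    assert(in != NULL && out != NULL && width > 0 && height > 0);
    const int channels = 3; /* red, green, blue */

    for (int row = 0; row < height; ++row) {
        for (int col = 0; col < width; ++col) {
            for (int c = 0; c < channels; ++c) {
                int sum = 0;   /* at most 9 * 255, fits easily in int */
                int count = 0; /* number of in-bounds neighbours */
                for (int dr = -1; dr <= 1; ++dr) {
                    for (int dc = -1; dc <= 1; ++dc) {
                        int r = row + dr;
                        int k = col + dc;
                        if (r < 0 || r >= height || k < 0 || k >= width)
                            continue; /* skip positions outside the image */
                        sum += in[((size_t)r * width + k) * channels + c];
                        ++count;
                    }
                }
                out[((size_t)row * width + col) * channels + c] =
                    (unsigned char)(sum / count);
            }
        }
    }
}
```

Comparing a GPU result against such a reference, pixel by pixel, is one way to gain confidence in correctness before optimising.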
Your task is to implement this algorithm using CUDA, and to optimise your implementation with efficient use of the GPU memory model.
The mark scheme has been designed so that you can pass the coursework without this optimisation step, but to obtain a first-class mark, you will need to optimise your code.

IN ADDITION TO YOUR CODE, you will need to submit a short report (maximum 1-page A4) detailing the following items:
• Briefly explain your general approach in terms of dividing input into thread blocks.
• Report your kernel run-time for 3 different grid sizes. Include your GPU's compute capability and
number of SMs.
• Report any steps you have taken to optimise the code, and why you expect your design decisions to
improve performance beyond a naive solution.
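For the first report item, one common way to divide an image among thread blocks is to give each block a fixed rectangle of pixels and round the number of blocks up so that the whole image is covered. The sketch below shows only that arithmetic in plain C; the helper name `blocks_for` is hypothetical, and any particular block size (for example 16×16) is an assumption rather than a requirement of this coursework.

```c
#include <assert.h>

/* Number of thread blocks needed to cover `extent` pixels along one axis
 * when each block covers `block` pixels. Uses ceiling division, so the
 * last block may extend past the edge of the image. */
int blocks_for(int extent, int block)
{
    assert(extent > 0 && block > 0);
    return (extent + block - 1) / block;
}
```

With this rounding, a 641-pixel-wide image and 16-pixel-wide blocks give 41 blocks along that axis. Threads in the last block that fall outside the image need a bounds check so that they do not read or write out of range.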
How to get started
You are given C code for reading and writing PPM (Netpbm color image format) image files, to get you started. You are also given a sample PPM image “valve.ppm” that you may use as a first test case. The input image is saved in memory pixel by pixel in a row major order. For each pixel there are 3 values for red, green, and blue channels, each stored in 8 bits. You do not need to investigate alternative formats.
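The memory layout described above (row-major order, three 8-bit values per pixel) implies a simple index calculation. The following is a sketch under that assumption; the helper name `pixel_offset` and the constant `CHANNELS` are hypothetical and are not taken from the provided C code.

```c
#include <assert.h>
#include <stddef.h>

enum { CHANNELS = 3 }; /* red = 0, green = 1, blue = 2 */

/* Offset of one colour value inside an interleaved, row-major RGB buffer.
 * `width` is the number of pixels per image row. */
size_t pixel_offset(int row, int col, int channel, int width)
{
    assert(row >= 0 && col >= 0 && col < width);
    assert(channel >= 0 && channel < CHANNELS);
    return ((size_t)row * (size_t)width + (size_t)col) * CHANNELS
           + (size_t)channel;
}
```

For example, in an image 4 pixels wide, the blue value of the pixel at row 0, column 1 sits at offset 5, directly after the three values of the first pixel and the red and green values of the second.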
Generate test cases. You are provided with one input image. However, you should test your program with more cases and images of (up to 3) different sizes. You can use any input image for testing. To convert images to PPM format on the school Linux machines you can use the following command:
$ convert input_image.* output.ppm
Marking Scheme
Marks will be allocated as follows:
• A correct GPU implementation: 6 marks
• Optimization on GPU: 8 marks
• Written report: 6 marks
Submission
• Your work should be submitted through Minerva. An entry will be created for Coursework 2 submission. You should submit a short report in PDF format and a single CUDA source file. If you are submitting multiple kernels, allow users to choose which kernel to run with a command line argument. To submit your files, you can ZIP them or use a Unix TAR file. Other formats, e.g. RAR or BZIP, are NOT acceptable and will not be marked, resulting in a grade of 0 for this coursework.
• Your code must compile successfully with the CUDA compilers available on the SCHOOL LINUX BASED MACHINES, otherwise you will lose marks related to the coding element.
• Your main function MUST accept the input file name as a parameter with the same format as the provided C code and possibly a choice of which kernel to run.
Plagiarism
Your attention is drawn to the University’s rules regarding plagiarism. You are not allowed to assist other students in implementing a solution to the coursework, or to seek assistance from elsewhere, e.g. programming sites such as “stackoverflow”. We monitor such sites, and have colleagues who can and will alert us to attempts to solicit help. We reserve the right to interview some or all students to confirm that work submitted is indeed their own work.
Questions
Any questions on the coursework should be directed to Mai Elshehaly, email: [email protected]
