The purpose of this assignment is to give you experience at writing a program using SIMD programming techniques. This assignment will give you anopportunity to demonstrate your understanding of:structures of arrays;the where there is one there is many approach (WTIOTIM);the use of SIMD for calculation; andthe translation of conditional statements into SIMD.A .zip (or .rar) containing a Visual Studio Solution containing a project for each attempted stage of the assignment (in the format provided in thedownloadable materials provided below).A document containing:A table of timing information comparing the original single-threaded code (from assignment 1), the provided base code times, and theStage 1 & 2 SIMD implementation on each scene file (all running with 8 threads).An analysis of the above timing data.You do not need to (and shouldnt) submit executables, temporary object files, or images. In particular, you must delete the .vsdiretory before submission as it just Visual Studio temporary files and 100s of MBs. Do not however delete the Scenes folder or the Outputsfolder (but do delete the images within this one).Task/Topic Marks1. Triangles 80%A. Structures of Arrays 15%Correct declarations of data structures for SoA SIMD form of Triangle container data 5%Correct code to convert AoS container to dynamically declared SoA structures (conversion should happen after the scene has beenloaded)5%Correct rewriting of isTriangleIntersected function and the calls to it to use SoA code 5%B. Where there is One there is Many (WTIOTIM) 10%Correct implementation of WTIOTIM approach to isTriangleIntersected function to calculate distance to nearest object to beused in the objectIntersection function (in Intersection.cpp)5%Correct implementation of WTIOTIM approach to create a short-circuiting isTriangleIntersected function to return (as soon aspossible) whether any sphere intersects to be used in the isInShadow function (in Lighting.cpp)5%C. SIMD Conversion of short-circuiting isTriangleIntersected function 45%Correct conversion of scalar inputs to SIMD 5%Correct and efficient intersection point t0 calculation 5%Correct and efficient (e.g. no use of if-statements) handling of cases where no intersection occurs due to ray being parallel(ish)with triangle (i.e. first if-statement)5%Correct and efficient (e.g. no use of if-statements) handling of cases where no intersection occurs due to point being outside oftriangle (i.e. second and third if-statements)5%Correct and efficient (e.g. no use of if-statements) handling of cases where intersection does occur (i.e. fourth if-statement) 5%Correct and efficient (e.g. no use of if-statements, for-loops, or scalar code) calculation of intersection data 5%Correct declaration of loop constants before loop 5%Correct handling of sphere list length not being divisible by 8 5%No remaining scalar code (except for-loop) and correct final output of SIMD code 5%D. SIMD Conversion of non-short-circuiting isTriangleIntersected function 10%Correct and efficient (e.g. no use of if-statements) SIMD calculation of closest intersection (i.e. t) 5%Correct and efficient (e.g. no use of if-statements) SIMD calculation of corresponding triangle index (i.e. index) 5%2. Lighting 30%A. Structures of Arrays 10%Correct declarations of data structures for SoA SIMD form of Light container data 5%Correct code to convert AoS container to dynamically declared SoA structures (conversion should happen after the scene has beenloaded)5%B. SIMD Conversion of lighting functions 20%Correct SIMD conversion of applyLighting function (NOTE: the calls to isInShadow can be done with a scalar loop) 5%Correct SIMD conversion of applySpecular function (i.e. so it handles 8 lights) 5%Correct and efficient (e.g. no use of switch-statements) SIMD conversion of applyDiffuse function and applyCheckboard,applyCircles, and applyWood functions (i.e. to make it handle 8 lights)10%Documentation 10%Outputs showing timing information for Stages 1 & 2 (with 8 threads) on all applicable scene files 5%Analysis of data (comparison between Stages 1 & 2 and the base code) 5%PenaltiesFailure to comply with submission instructions (eg. no cover sheet, incorrect submission of files, abnormal solution/projectstructure, etc.)-10%Poor programming style (eg. insufficient / poor comments, poor variable names, poor indenting, obfuscated code withoutdocumentation, compiler warnings, etc.)up to -20%Lateness (-20% for up to 24 hours, -50% for up to 7 days, -100% after 7 days) up to -100%MarkingThis assignment will be marked out of 100 (NOTE: there are 120 marks available, but a maximum of 100). The following is the breakdown of marks:Programming StyleThis assignment is not focussed on programming style (although it is concerned with efficient code), but you should endeavour to follow goodprogramming practices. You should, for example:comment your code;use sensible variables names;use correct and consistent indenting; andinternally document (with comments) any notable design decisions.[NOTE: any examples in the provided assignment materials that dont live up to the above criteria, should be considered to be deliberate examples ofwhat not to do and are provided to aid your learning ;P]The Assignment TaskYou are to modify the (square-based) multithreaded implementation of a simple raytracer from the first assignment to take advantage of SIMDinstructions. As time is tight for this assignment you only need to SIMD-ify the triangle/ray intersection code and the lighting code. This requireschanges across multiple files.To aid you in this conversion, the sphere/ray intersection code has already been translated into SIMD code (by following the techniques required to dostage 1 of the assignment).From the provided (square-based) multithreaded raytracer implementation, for Stage 1 of the assignment you will create multiple subsequent versionsthat modify the triangle implementation as follows:A. Translation of array-of-structures (AoS) with structures-of-arrays (SoA) for the triangle container.B. Rewritten functions for the triangle/ray intersection tests in a where there is one there is many approach.C. Optimisation of the triangle/ray intersection test to take advantage of SIMD.For Stage 2, you will do most of that again, but for lights and the lighting calculations.Implementation (Stage 1)The following section describes in detail the steps required to complete Stage 1 of the assignment. If you complete this step, only then should youattempt Stage 2 and do similar steps for lights.A. Structures of ArraysThis stage involves modifying the Scene struct (Scene.h) to store a duplication of the triangle data via structure of arrays rather than the currentlyexisting array of structures (currently in the triangleContainer variable).In order to complete this step you will need to:Create a SoA copy of the data that is stored in triangleContainer in the Scene struct.NOTE: this means the program has two copies of the triangle data, the original one in AoS form and a second one in SoA form. As thisassignment predominately only requires changes to isTriangleIntersected (and the sites of the calls to this function) the rest of the codewill still use the original AoS version of the data.Fill this SoA copy of the sphere data after the Scene struct has been loaded (dynamically allocating memory for the SoA representation).Rewrite the isTriangleIntersected function to make use of the SoA.At times this code update may require conversions to and/or from Vector structs to the equivalently stored data in the SoA.At the end of this stage the program should still produce the same results as the base code. Be thorough in your testing (i.e. test all the scenes) toensure that everything works correctly before progressing.B. Where there is One there is ManyThis stage involves replacing the isTriangleIntersected function with two new versions that use the where there is one there is many paradigm.In order to complete this step you will need to:Write two functions that take the entire triangles container (or at least a pointer/reference) as an argument and will perform whatever actiontook place at the calling site. There are two versions as the isInShadow function (in Lighting.cpp) uses a short-circuited approach, exiting assoon as any triangle collision is discovered, and the objectIntersection function (in Intersection.cpp) finds the closest triangle intersected with(which means it must examine them all).A version of isTriangleIntersected that returns a boolean depending on whether or not a triangle intersects with the ray (i.e. stopping assoon as one is found).A version of isTriangleIntersected that returns void and updates its t parameter to be the closest triangle that intersects with the ray.C. SIMD Conversion of isTriangleIntersected functionThis stage involves converting the first of the two new version of the isTriangleIntersected function from Stage 2 (the bool returning one) into a SIMDimplementation.In order to complete this step you will need to:Convert many of the input parameters (or parts of them) into SIMD-ready values (e.g. the ray starting location and direction need to beconverted into a SIMD format).Step through the SoA in chunks proportional to the number of value stored in each SIMD variable (i.e. 8).Convert the calculation to use SIMD. Care must be taken to correctly deal with the various conditional expressions in the loop. There are four ifstatements that must be converted to SIMD code via the use of appropriate masking statements.Carefully deal with the situation where the triangle count isnt equally divisible by the number in each SIMD calculation (e.g. 9 triangle with 4 ineach SIMD value).NOTE: this is likely the hardest step of the assignment.D. SIMD Conversion of isTriangleIntersected functionThis stage involves converting the latter of the two new version of the isTriangleIntersected function from Stage 2 (the void returning one) into a SIMDimplementation.Most of this conversion is identical to step 3, and the code can just be copied from that function. It does however require the addition of:New variables to keep track of the closest intersection found, and which triangle index it was found at.Consolidation of the values calculated using SIMD into a single scalar return value. This should be done after the loop finishes.Hints / TipsThe techniques required to complete each stage rely heavily on work done in the SIMD tutorial refer to them often.When implementing the SIMD stage, its best to SIMD-ify small portions of code at a time (e.g. even a single line is often a lot) and then performthe rest of the calculation through the existing scalar code, or even compare the result of the SIMD version to the existing scalar code.Again for SIMD, its helpful to render scenes at a greatly reduced size for testing, with an equally small block size, one sample per pixel, and withone thread (e.g. try using allMaterialsZoom -size 8 8 -blockSize 8 -samples 1 -threads 1). It may even be helpful to fix the number of rays cast(MAX_RAYS_CAST) to be 1, so that reflection and refraction dont occur. This doesnt produce a very nice image, but images can be easilycompared visually for problems (the example is only 64 pixels) and printf statements can be used to verify the correctness of calculationswithout outputting a huge amount of data (the use of debuggers is somewhat perilous in a multithreaded environment, but that shouldnt be anissue here).When converting conditional statements to masks it can be helpful to do this with scalars first (although C/C++ implementations often use 1rather than 0xFFFFFFFF for true, so the calculations may be different).Write functions to output all the elements in a SIMD value for easier printf debugging (I am not a fan of how these types are shown in thedebugger).TestAverage Time (Over 5 Runs) (Milliseconds)Base Multi-Threaded Base SIMD Spheres Stage 1SIMD TrianglesStage 2SIMD Lightingcornell1024x1024x1(TODO)233mscornell1024x1024x4(TODO)3433mscornell500x300x1(TODO)607mscornell-256lights512x512x1(TODO)607msallmaterials1024x1024x1(TODO)1575ms5000spheres960x540x1(TODO)33856msbunny500.txt1024x1024x1(TODO)4336msbunny10k.txt256x256x1(TODO)114316msDocumentationWhen completing either stage of the assignment you should provide:timing information for each scene file for the average time taken over 5 runs for a render using 8 threads and a block size of 16. See a latersection for an example format for this timing table; andan explanation of the results (e.g. why theres no difference between the performance of stage 1 compared to the base code, or why a particularimplementation works well (or poorly) on a particular scene, etc.).Sample Timing Table FormatThe following table is an example of how the timing information could be presented in an easily comparable fashion.See the following sections for more information about the scene files / tests to be used.Provided MaterialsThe materials provided with this assignment contain:the source code of the base multi-threaded version of the raytracer;a set of scene files to be supplied to the program.Download the materials as an ZIP file.Source CodeSource CodeThe provided code consists of 20 source files.Raytracing logic:Raytrace.cpp: this file contains the main function which reads the supplied scene file, begins the raytracing, and writes the output BMP file.The main render loop, ray trace function, and handling of reflection and refraction is also in this file.Intersection.h and Intersection.cpp: these files define a datastructure for capturing relevant information at the point of intersectionbetween a ray and a scene object and functions for testing for individual ray-object collisions and ray-scene collisions.Lighting.h and Lighting.cpp: these files provide functions to apply a lighting calculation at a single intersection point.Texturing.h and Texturing.cpp: these files provide functions for the reading points from 3D procedural textures.Constants.h: this header provide constant definitions used in the raytracing.Basic types:Primitives.h: this header contains definitions for points, vector, and rays. It also provides functions and overloaded operators forperforming calculations with vectors and points.PrimitivesSIMD.h: this header contains operators for many common SIMD functions as well as a definition for Vector8 (as a representationfor 8 vectors). It also provides functions and overloaded operators for performing calculations with Vector8s.SceneObjects.h: this header file provides definitions for scene objects (ie. materials, lights, spheres, and triangles).Colour.h: this header defines a datastructure for representing colours (with each colour component represented as a float) and simpleoperations on colours, including conversions to/from the standard BGR pixel format.Scene definition and I/O:Scene.h and Scene.cpp: the header file contains the datastructure to represent a scene and a single function that initialises thisdatastructure from a file. The scene datastructure itself consists of properties of the scene and lists of the various scene objects asdescribed above. The implementation file contains many functions to aide in the scene loading process. Scene loading relies upon thefunctionality provided by the Config class.Config.h and Config.cpp: this class provide facilities for parsing the scene file.SimpleString.h: this is helper string class used by the Config class.Image I/O:ImageIO.h and ImageIO.cpp: these files contain the definitions of functions to read and write BMP files.Miscellaneous:Timer.h: this class provides a simple timer that makes use of different system functions depending on whether TARGET_WINDOWS,TARGET_PPU, or TARGET_SPU is defined (we dont use the latter two, but I left this file unchanged in case anyone wanted to see how suchcross-platform stuff can be handled).ExecutingThe program has the following functionality:By default it will attempt to load the scene ../Scenes/cornell.txt and render it at 10241024 with 11 samples.By default it will output a file named ../Outputs/[scenefile-name]_[width]x[height]x[sample-level]_[executable-filename].bmp (e.g. with allthe default options, ../Outputs/cornell.txt_1024x1024x1_RayTracerAss1.exe.bmp)It takes command line arguments that allow the user to specify the width and height, the anti-aliasing level (must be a power of two), the nameof the source scene file, the name of the destination BMP file, and the number of times to perform the render (to improve the timinginformation).Additionally it accepts some arguments (that are currently unused) for setting the number of threads, whether each thread will tint the area thatit renders, and the size of the block to render (only applicable for stage 3).It loads the specified scene.It renders the scene (as many times as requested).It produces timing information for the average time taken to produce a render ignoring all file IO.It outputs the rendered scene as a BMP file.For example, running the program at the command line with no arguments would perform the first test (as described in the scene file section):On execution this would produce output similar to the following (as well as writing the resultant BMP file to../Outputs/cornell.txt_1024x1024x1_RayTracerAss1.exe.bmp):average time taken (1 run(s)): 890msScene File Width Height Anti-aliasing Imagecornell.txt 1024 1024 1cornell.txt 1024 1024 4cornell.txt 500 300 1cornell-256lights.txt 512 512 1allmaterials.txt 1024 1024 15000spheres.txt 960 540 1bunny500.txt 1024 1024 1bunny10k.txt 256 256 1Scene File Width Height Anti-aliasing Threads Imagebunny500.txt 8 8 1 1cornell-256lights.txt 8 8 1 1Scene Files / TestsYour testing should use the following scene files and parameters for the image size and anti-aliasing level:You also might find the follow Scene / parameters useful for testing Triangle SIMD code conversion (using cornell-256lights at 8x8x1 is also somewhatuseful for Stage 2):
KIT408
[Solved] KIT408 Assignment2- SIMD
$25
File Name: KIT408_Assignment2__SIMD.zip
File Size: 226.08 KB
Only logged in customers who have purchased this product may leave a review.

![[Solved] KIT408 Assignment2- SIMD](https://assignmentchef.com/wp-content/uploads/2022/08/downloadzip.jpg)

![[Solved] KIT408 Assignment3-OpenCL](https://assignmentchef.com/wp-content/uploads/2022/08/downloadzip-1200x1200.jpg)
Reviews
There are no reviews yet.