
High Performance Computing
Course Notes

GPU and CUDA I

Dr Ligang He

Computer Science, University of Warwick

GPU

Graphics processing unit

Contains a large number of ALUs

For example, 2560 ALUs (stream processors, which NVIDIA calls
CUDA cores) in the NVIDIA GeForce GTX 1080

Is a PCI-e peripheral device
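
As a rough worked example of what a large ALU count means, peak throughput can be estimated as ALUs x clock rate x floating-point operations per ALU per cycle. The sketch below uses the ALU count quoted for the GeForce GTX 1080; the 1.733 GHz boost clock and the factor of 2 (one fused multiply-add counted as two operations) are assumptions taken from commonly published specifications, not from these notes, and `peak_gflops` is a hypothetical helper name.

```python
def peak_gflops(num_alus: int, clock_ghz: float, flops_per_cycle: int = 2) -> float:
    """Estimate peak single-precision throughput in GFLOP/s.

    peak = ALUs x clock (GHz) x floating-point operations per ALU per cycle.
    flops_per_cycle=2 assumes one fused multiply-add per ALU per cycle.
    """
    return num_alus * clock_ghz * flops_per_cycle


# 2560 ALUs is the figure from the notes; 1.733 GHz is an assumed boost clock.
gtx1080 = peak_gflops(2560, 1.733)
print(f"{gtx1080:.0f} GFLOP/s")  # prints "8873 GFLOP/s"
```

This is a theoretical upper bound; real applications reach it only if every ALU is kept busy on every cycle.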

Figure: PCI-e slot

Performance Trend

In peak floating-point throughput, a many-core GPU can be
up to roughly 100x more powerful than a multicore CPU

Why is there such a performance gap?

Because of the differences in the design
between GPU and CPU


Design of CPU

The design objective of a CPU is to optimize the
performance of sequential code

Has a complicated control unit

Obtains instructions from memory

Interprets the instructions

Figures out what data the instructions need and where
they are stored

Issues signals to ask other functional units (ALUs) to run the
instructions

The complicated control unit enables

instructions from a single thread to execute out of their
sequential order or in parallel within a core
(instruction-level parallelism)

branch prediction

data forwarding

Has a large cache to reduce instruction and data
access latencies

Has powerful ALUs


Design Objective of CPU

Latency-oriented design

Large on-chip caches

Complicated control unit

Complicated arithmetic logic unit

These come at the cost of increased chip area
and power

Applications with one or
very few threads achieve
higher performance on a CPU

Figure: NAND gate with transistors


Motivation of GPU Design

The video game industry needs to perform a massive
number of floating-point calculations per video
frame

This motivates GPU vendors to maximize the chip area
and power dedicated to floating-point
calculations

Each calculation is simple; therefore simple control
logic and simple ALUs suffice

Calculation is more important than cache; therefore the
cache is small, and memory accesses are allowed to have
long latency


GPU Design

A GPU has a large number of ALUs on a chip to
increase the total throughput

The application is run with a large number of parallel
threads

While some threads are waiting for long-latency
operations (e.g., memory access), the GPU can
almost always find other ready threads to run because
there are so many threads

Throughput-oriented design: maximize the total
throughput of a large number of threads, allowing
individual threads to take a longer time

GPU adopts the throughput-oriented design
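
The latency-hiding idea above can be illustrated with a toy simulation. This is a minimal sketch under simplifying assumptions that are not from the notes: a single ALU, threads that alternate a fixed burst of compute cycles with a fixed memory stall, and zero-cost switching between threads. The function name `alu_utilization` and the numbers (4 compute cycles, 96 stall cycles) are hypothetical.

```python
def alu_utilization(num_threads: int, compute_cycles: int,
                    memory_latency: int, total_cycles: int = 10_000) -> float:
    """Fraction of cycles in which one ALU does useful work.

    Each thread repeats: compute_cycles of ALU work, then a memory
    access that stalls it for memory_latency cycles.
    """
    ready_at = [0] * num_threads                 # first cycle each thread can run
    remaining = [compute_cycles] * num_threads   # work left in the current burst
    busy = 0
    for cycle in range(total_cycles):
        # Run the first thread that is not waiting on memory, if any.
        for t in range(num_threads):
            if ready_at[t] <= cycle:
                remaining[t] -= 1
                busy += 1
                if remaining[t] == 0:
                    # Burst finished: the thread now waits on memory.
                    ready_at[t] = cycle + 1 + memory_latency
                    remaining[t] = compute_cycles
                break
        # If no thread was ready, the ALU idles for this cycle.
    return busy / total_cycles


for n in (1, 10, 25, 50):
    print(n, round(alu_utilization(n, 4, 96), 2))
# prints: 1 0.04, 10 0.4, 25 1.0, 50 1.0 (one pair per line)
```

With one thread the ALU is idle 96% of the time; once there are enough threads to cover the stall (here 25), the ALU is always busy, even though each individual thread still waits just as long for memory.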


GPU vs. CPU in Architecture
