COMP90025 Parallel and Multicore Computing
OpenMP
Lachlan Andrew
School of Computing and Information Systems The University of Melbourne
2023 Semester II
Directives and Clauses
Execution Model
Nested Regions
Work Sharing
Data Sharing
OpenMP Specification
https://www.openmp.org/specifications/
- These slides draw significantly from the OpenMP 5.1 Specification (Nov 2020). You are encouraged to read the specification as well.
- The OpenMP Specification defines a parallel programming model that abstracts a single-process, multi-threaded program execution on a (single) machine:
  - an abstract view of the machine’s resources, including multi-socket, multi-core processors, threads and thread synchronization, memory hierarchy, SIMD, NUMA and devices, e.g., offloading to GPUs;
  - an API providing high-level parallel programming constructs that consistently encapsulate the machine’s resources, e.g., by transparently making use of the OS threading library, processor SIMD vector registers and instructions, and compiling of target code for supported devices.
- It is not an implementation. Individual hardware and compiler vendors are required to support and implement the OpenMP Specification.
- An OpenMP program should run consistently across different machines/compilers; however, some aspects of the specification remain implementation dependent, and not all hardware/compiler implementations support or distinguish every part of the specification. Similarly, the specification may not cover every kind of machine implementation that exists.
Getting started
OpenMP is commonly supported on commercial OSes and hardware. Linux distributions support it via the gcc compiler. You can therefore readily develop and test OpenMP programs. On HPC facilities, the OpenMP compiler implementation should be vendor specific and support everything that the machine offers; e.g., using an Intel compiler for Intel processors.
- If you use the gcc compiler then gcc-11 is required for OpenMP 5:
  - you can build the compiler if necessary, following e.g. the instructions at https://iamsorush.com/posts/build-gcc11/
  - compiling the compiler with offloading support for GPUs is more involved, however.
- A minimal simple compile line for your OpenMP program might be:
  g++-11.2 -fopenmp prog.cpp -o prog
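  For example, a minimal prog.cpp to go with that compile line might be the following sketch, in which each thread in the team reports its ID:

    #include <cstdio>
    #include <omp.h>

    int main() {
        // The parallel directive creates a team; every thread executes the block.
        #pragma omp parallel
        {
            std::printf("Hello from thread %d of %d\n",
                        omp_get_thread_num(), omp_get_num_threads());
        }
        return 0;
    }

  Running ./prog prints one line per thread; with no clauses or environment settings, the team size defaults to the nthreads-var ICV (usually the number of logical cores).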
- You could install some other useful stuff:
  - libnuma – for C++ NUMA support functions (may need the -lnuma compile option); will allow you to find out which NUMA node a thread is running on
  - hwloc-ls and numactl – utilities to inspect and control NUMA on your OS; will give you a definitive overview of the processor resources, such as RAM, cache, cores, hyperthreads and NUMA characteristics
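  For example, a sketch combining OpenMP with libnuma to report which NUMA node each thread is currently running on (compile with -fopenmp -lnuma; numa_node_of_cpu and sched_getcpu are standard libnuma/glibc calls). hwloc-ls and numactl --hardware give the corresponding machine-wide picture from the shell.

    #include <cstdio>
    #include <numa.h>    // libnuma: numa_available(), numa_node_of_cpu()
    #include <sched.h>   // sched_getcpu()
    #include <omp.h>

    int main() {
        if (numa_available() < 0) {
            std::printf("libnuma reports no NUMA support on this system\n");
            return 1;
        }
        #pragma omp parallel
        {
            int cpu  = sched_getcpu();          // logical CPU this thread is on right now
            int node = numa_node_of_cpu(cpu);   // NUMA node that CPU belongs to
            std::printf("thread %d: CPU %d, NUMA node %d\n",
                        omp_get_thread_num(), cpu, node);
        }
        return 0;
    }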
Runtime Environment Variables
OMP_NUM_THREADS=32  OMP_THREAD_LIMIT=64  OMP_PLACES="{0:8},{0:8}"
[Diagram: environment variables set the Internal Control Variables (ICVs), which Compiler Directives and Runtime Library Routines then use and modify.]
- Runtime Environment Variables allow you to set the value of OpenMP internal control variables (ICVs) before the program starts executing. E.g. in bash-like shells, using export OMP_NUM_THREADS=16 prior to running your program or, if your program is called myprog, then OMP_NUM_THREADS=16 ./myprog will set the variable for that run only. Some ICVs must be set this way.
- Compiler Directives allow specifying parallelism without changing the semantics of the base language. In theory, if the directives were removed the program would produce the same output, sequentially. In practice, there can be differences because parallel computation can introduce artefacts.
- Runtime Library Routines allow you to inspect and modify ICVs, and to interact with the OpenMP implementation, e.g., to allocate memory with specified properties or to control parallel constructs.
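  For example, a sketch using standard runtime routines to set and read the thread-related ICVs:

    #include <cstdio>
    #include <omp.h>

    int main() {
        omp_set_num_threads(8);   // sets the nthreads-var ICV for subsequent regions
        std::printf("nthreads-var     = %d\n", omp_get_max_threads());
        std::printf("thread-limit-var = %d\n", omp_get_thread_limit());

        #pragma omp parallel
        {
            #pragma omp single
            std::printf("team size        = %d\n", omp_get_num_threads());
        }
        return 0;
    }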
- At any point in time, an OpenMP program’s context defines traits that …
Synchronization, SIMD and NUMA
- Synchronization between threads, which would ordinarily be done by the programmer using primitive concurrency-control techniques such as locks and signals, is provided via directives.
- A thread team consists of a number of threads. Some synchronization operations are limited to the team of threads, i.e., they do not affect other teams, while others affect all teams.
- A barrier directive marks a point of execution of a program encountered by a team of threads, beyond which no thread in the team may execute until all threads in the team have reached the barrier, and all explicit tasks generated by the team have executed to completion.
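  For example, a sketch in which a barrier separates a write phase from a read phase, so no thread reads a slot before every thread has written its own:

    #include <cstdio>
    #include <omp.h>

    int main() {
        const int N = 4;
        int data[N] = {0, 0, 0, 0};
        #pragma omp parallel num_threads(N)
        {
            int id = omp_get_thread_num();
            data[id] = id * 10;            // phase 1: each thread writes its own slot

            #pragma omp barrier            // no thread continues until all slots are written

            int neighbour = (id + 1) % N;  // phase 2: safe to read another thread's slot
            std::printf("thread %d sees data[%d] = %d\n", id, neighbour, data[neighbour]);
        }
        return 0;
    }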
- A flush directive tells a thread to enforce consistency between its view and other threads’ view of memory.
- A critical directive ensures that only one thread can execute a given structured code block at a time.
- An atomic directive ensures that memory reads, writes, and updates are done by a thread without interference from other threads. It is faster than critical, but applies only to certain simple operations.
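  For example, a sketch contrasting critical and atomic on a simple shared counter (both give the same result; atomic typically maps to a single hardware instruction and is cheaper):

    #include <cstdio>
    #include <omp.h>

    int main() {
        long hits_critical = 0, hits_atomic = 0;
        #pragma omp parallel
        {
            for (int i = 0; i < 1000; ++i) {
                #pragma omp critical      // one thread at a time executes this block
                { hits_critical += 1; }

                #pragma omp atomic        // uninterruptible update of one memory location
                hits_atomic += 1;
            }
        }
        std::printf("%ld %ld\n", hits_critical, hits_atomic);
        return 0;
    }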
- A simd directive makes use of SIMD processor instructions.
- Synchronization hints can enable processor optimization techniques like speculative execution to be used.
- Memory management and thread affinity provide for optimization over NUMA architectures, using an abstract model of placement, memory categories and partitioning.
- Programmers can specify how the available processing units on the architecture are partitioned into places.
- Programmers can specify how and where memory is to be allocated, and what type of memory is used.
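  For example, a sketch assuming OMP_PLACES=cores and OMP_PROC_BIND=spread have been exported in the shell; omp_get_place_num, omp_get_num_places, omp_alloc and omp_free are standard routines, though the high-bandwidth memory space may silently fall back to ordinary memory on implementations without it:

    #include <cstdio>
    #include <omp.h>

    int main() {
        // proc_bind(spread) asks for the team to be spread out over the places
        // defined by OMP_PLACES (here: one place per physical core).
        #pragma omp parallel proc_bind(spread)
        {
            std::printf("thread %d is bound to place %d of %d\n",
                        omp_get_thread_num(), omp_get_place_num(), omp_get_num_places());
        }

        // Memory of a requested category: allocate from the high-bandwidth space.
        double *buf = static_cast<double *>(omp_alloc(1024 * sizeof(double),
                                                      omp_high_bw_mem_alloc));
        omp_free(buf, omp_high_bw_mem_alloc);
        return 0;
    }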
[Example program, partly lost here: it writes "…, parallel world." to standard output and then reaches #pragma omp parallel num_threads(3) shared(x); a reconstruction follows.]
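  Based on the fragment above and the walk-through on the next slide (a line of text written to standard output, an automatic variable initialised to 0, then a parallel directive), a plausible reconstruction is the following sketch; the body of the parallel region is a guess:

    #include <iostream>
    #include <omp.h>

    int main() {
        std::cout << "Hello, parallel world.\n";   // text written to standard output
        int x = 0;                                 // automatic variable, initialised to 0
        #pragma omp parallel num_threads(3) shared(x)
        {
            // a team of three threads (including the primary thread) executes this
            // block; all of them see the same shared variable x
            #pragma omp atomic
            x += 1;
        }
        std::cout << "x = " << x << "\n";          // 3, assuming three threads were created
        return 0;
    }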
Directives and Clauses Execution Model
Program Execution
- Understanding an OpenMP program execution is about understanding what happens when a thread of execution encounters a directive.
- In the program above, the main or primary thread of the program starts executing code that is part of its task.
- It writes some text to the standard output.
- It allocates an automatic variable and initializes it to 0.
- It encounters a parallel directive…
Directives and Clauses Execution Model
The parallel directive and common clauses
https://www.openmp.org/spec-html/5.1/openmpse14.html#x59-590002.6
- Any thread that encounters the parallel directive will create a parallel region and create a thread team in that region, with itself as the primary thread, where every thread in the team executes the structured block. An implicit barrier exists at the end of the parallel region.

#pragma omp parallel [clause[[,] clause]…] new-line
structured-block
- num_threads(integer-expression)
- Overrides the nthreads-var ICV. The actual number of threads created is limited by the thread-limit-var ICV and other conditions.
- The OMP_NUM_THREADS environment variable sets the nthreads-var and optionally also the max-active-levels-var ICVs. Programmers can use omp_set_num_threads(int num_threads) to set nthreads-var at runtime, and omp_get_max_threads() to obtain its value (int omp_get_num_threads() instead returns the size of the current team).
- Similarly OMP_THREAD_LIMIT, OMP_DYNAMIC and OMP_MAX_ACTIVE_LEVELS affect the number of threads created and can be controlled at runtime.
- For details see https://www.openmp.org/spec-html/5.1/openmpsu40.html#x60-600002.6.1
- private(list) – a list of variables for which each thread in the team gets its own, separately allocated copy
- shared(list) – a list of program variables that will be shared among the threads in the team
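  For example, a sketch of the difference: scratch is private (each thread works on its own copy, which starts uninitialised), while counter is shared (one copy, so concurrent updates need an atomic directive):

    #include <cstdio>
    #include <omp.h>

    int main() {
        int counter = 0;
        int scratch = 0;
        #pragma omp parallel num_threads(4) private(scratch) shared(counter)
        {
            // scratch: every thread has its own (uninitialised) copy
            scratch = omp_get_thread_num() * 100;

            // counter: one copy, visible to the whole team, so updates must be atomic
            #pragma omp atomic
            counter += 1;

            std::printf("thread %d: scratch = %d\n", omp_get_thread_num(), scratch);
        }
        std::printf("counter = %d\n", counter);   // 4, if four threads were created
        return 0;
    }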
Directives and Clauses Execution Model
#pragma omp syntax
In C/C++:
#pragma omp directive-name [clause[[,] clause]… ] new-line
In C++11 and higher, with C++ attribute specifiers:
[[omp::directive(directive-name [[,] clause[[,] clause]…])]]
or
[[using omp : directive(directive-name [[,] clause[[,] clause]…])]]
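  For example, the following sketch expresses the same parallel region in both forms; compiler support for the attribute form is still patchy, so it is shown only for illustration:

    #include <cstdio>

    int main() {
        // Pragma form
        #pragma omp parallel num_threads(2)
        {
            std::printf("pragma form\n");
        }

        // C++11 attribute form (OpenMP 5.1); compilers without attribute support
        // will typically ignore the attribute and run the block sequentially
        [[omp::directive(parallel num_threads(2))]]
        {
            std::printf("attribute form\n");
        }
        return 0;
    }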
- Only one directive-name can be specified per directive, but multiple directives can be applied to one following block.
- The order in which clauses appear on directives is not significant. Clauses on directives may be repeated as needed, subject to the restrictions listed in the description of each clause or the directives on which they can appear.
- Some directives are stand-alone, in that they instruct the runtime to do something, and some directives must be followed by a structured block or structured block sequence.