OBJECT-ORIENTED OPTIMISATION II
Introduction
Encapsulation as an aid to optimisation
Optimisation of low level types
Java optimisation
Optimising OO code
The OO programming style is intended to improve overall code maintainability
Performance optimisation is intended to improve the performance of the code on a particular hardware/software environment
The two are therefore often in conflict
However, most of a program's run-time comes from a very small fraction of the code
Code optimisation techniques should be applied to this small fraction only
The rest of the code should be written for maintainability
Memory layout
Because each object is implemented as a contiguous region of memory, your choice of object structure constrains the data memory layout.
However memory layout is usually very significant for code performance.
This often makes existing OO code hard to optimise.
The solution is to encapsulate the performance hot-spots
Encapsulation
Encapsulation is the key concept of OO programming
Particularly encapsulation of data structures.
Data structures are private
Only visible within the type itself
Rest of the code only interacts with type via its public interface.
This makes it easy to modify the data structures
Only the owning type needs to be changed
Provided the external interface is unchanged (or only extended), the rest of the code remains the same.
Performance Encapsulation
Usually most of the run-time of a program comes from a very small fraction of the code (the hot-spot).
If you can encapsulate this code fraction (and the data it works on) in a class then optimisation is now easier not harder.
As long as the external interface is preserved, the internal data layout can be restructured in whatever way is necessary for performance.
May have to resort to arrays rather than low level objects, but only in the time-critical code sections.
If the problem's hot-spot is known in advance, try to design in this encapsulation from the start.
However, the first implementation should be written for correctness, not speed: an optimised version is easier to write if you have a reference version to compare results with.
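A minimal Java sketch of performance encapsulation. The class name `ParticleSet` and its methods are hypothetical, chosen only for illustration. The data and the time-critical loop sit behind one public interface, so the private layout (here one flat array per quantity) could be changed without touching the callers.

```java
// Hypothetical example: the hot-spot data and the loop over it live in one class.
final class ParticleSet {
    // Private layout: one flat array per quantity. Because these fields are
    // private, they could be replaced by an interleaved array or by
    // per-particle objects without changing any calling code.
    private final double[] x;
    private final double[] v;

    ParticleSet(int n) {
        x = new double[n];
        v = new double[n];
    }

    int size() { return x.length; }

    void set(int i, double position, double velocity) {
        x[i] = position;
        v[i] = velocity;
    }

    double position(int i) { return x[i]; }

    // The time-critical loop is a method of the class, so it can work
    // directly on the internal arrays.
    void advance(double dt) {
        for (int i = 0; i < x.length; i++) {
            x[i] += dt * v[i];
        }
    }
}
```

A simple, obviously correct version of such a class can serve as the reference implementation against which a later optimised version is checked.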
OO and Hardware acceleration.
This approach is particularly useful for acceleration hardware
GPGPU, FPGA
Accelerators typically have their own private memory spaces
Data needs to be copied in/out
Objects also have private data structures
Again data copied in/out
With careful design a good OO interface can completely encapsulate the use of accelerators.
Low level types
For HPC applications the main problem is likely to be heavily used small classes
C++ handles this well, though it is a good idea to use concrete classes and default constructors.
Less successful in Java: 1000 complex number objects may have 1000 words taken up with vtable-pointers/class-references
Adding 1000 complex numbers may also take 1000 method calls
This is unfortunate as small classes can be very useful
In Java try to define higher level classes
E.g. corresponding to an array of complex numbers
Better still a physically meaningful concept like pressure field
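As a sketch of a higher level class, the following Java code represents a whole vector of complex numbers as a single object. The name `ComplexVector` and its methods are assumptions made for this example. Real and imaginary parts are held in two primitive arrays, so there is one object header for the vector rather than one per number, and adding two vectors takes a single method call.

```java
// Hypothetical higher level class: one object for a whole vector of
// complex numbers instead of one object per number.
final class ComplexVector {
    private final double[] re;
    private final double[] im;

    ComplexVector(int n) {
        re = new double[n];
        im = new double[n];
    }

    int length() { return re.length; }

    void set(int i, double real, double imag) {
        re[i] = real;
        im[i] = imag;
    }

    double real(int i) { return re[i]; }
    double imag(int i) { return im[i]; }

    // Element-wise sum stored into this vector. No objects are created
    // inside the loop.
    void setToSum(ComplexVector a, ComplexVector b) {
        if (a.length() != length() || b.length() != length()) {
            throw new IllegalArgumentException("length mismatch");
        }
        for (int i = 0; i < re.length; i++) {
            re[i] = a.re[i] + b.re[i];
            im[i] = a.im[i] + b.im[i];
        }
    }
}
```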
Immutable Objects
Immutable Objects don't change their internal state after construction.
Java Number classes work like this.
Operations on Numbers always produce a new Immutable Object, leaving the arguments unchanged.
Prevents many types of subtle programming error.
Not so good for performance when objects hold large amounts of internal state
Lots of additional Object creation
Lots of data copying.
Need to adjust programming/design style to accommodate performance requirements.
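To illustrate the immutable style, here is a small Java sketch. The class `ImmutableComplex` is hypothetical and is not a standard library type. Every multiplication allocates a new object and leaves both operands unchanged, which is safe but means one allocation per operation.

```java
// Hypothetical immutable value class: all fields are final and every
// operation returns a new object.
final class ImmutableComplex {
    private final double re;
    private final double im;

    ImmutableComplex(double re, double im) {
        this.re = re;
        this.im = im;
    }

    double re() { return re; }
    double im() { return im; }

    // (a + bi)(c + di) = (ac - bd) + (ad + bc)i
    ImmutableComplex times(ImmutableComplex other) {
        return new ImmutableComplex(
                re * other.re - im * other.im,
                re * other.im + im * other.re);
    }
}
```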
Functional languages
Compare with functional languages
Everything is immutable
There are no variables and no assignment
Just definitions that define new immutable values as functions of others.
In functional languages the programmer has no control over memory layout
Instead the Compiler controls data location and lifetime
Scientific OO
Scientific problems are often naturally expressed in a functional or operator notation
A = f(B)
A = B * C
A = B
This does not always map efficiently or cleanly onto normal OO code
If implemented as methods on B
Constructor needs to be called to generate a new A for each call
Loses symmetry between B and C for binary operations
Consider implementing as methods on A
Objects can be created at a higher level and live longer
Encapsulation still holds where result and arguments are the same type
Non-intuitive
Objects can't be immutable.
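A sketch of A = B * C implemented as a method on the result A, in Java. The class `MutableComplex` and the method name `setToProduct` are assumptions for this example. The result object is created once by the caller and reused, at the cost of being mutable.

```java
// Hypothetical mutable class where the operation is a method on the result.
final class MutableComplex {
    private double re;
    private double im;

    MutableComplex(double re, double im) {
        this.re = re;
        this.im = im;
    }

    double re() { return re; }
    double im() { return im; }

    // this = b * c. B and C are treated symmetrically as arguments and no
    // new object is created.
    MutableComplex setToProduct(MutableComplex b, MutableComplex c) {
        // Compute into locals first so that a call such as
        // x.setToProduct(x, y) still gives the correct answer.
        double r = b.re * c.re - b.im * c.im;
        double i = b.re * c.im + b.im * c.re;
        re = r;
        im = i;
        return this;
    }
}
```

The private fields of `b` and `c` are accessible here only because the result and the arguments are the same type, which is the condition under which encapsulation still holds.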
Java Optimisation
As with other languages obtaining good performance from Java requires careful consideration
A number of standard performance optimisations can be applied to Java codes
e.g. loop unrolling, common subexpression elimination.
JIT compilers are usually very good at this.
OO Performance optimisations should also be applied e.g. minimise object creation
There are also a number of Java specific optimisations to consider
Java Optimisation
Java bytecode is typically unoptimised
Performance often comes down to the choice of JVM
Use of a good JIT is essential
JIT compilers have potential advantages over static compilers
Can use profile information to identify hotspots
Full knowledge of dynamically loaded classes
On the other hand, compilation speed is more important, so the highest levels of code optimisation may not be attempted
Java Arrays
Java only implements one-dimensional arrays and arrays of arrays (see previous lecture)
Many scientific codes naturally map to multi-dimensional arrays
Arrays of arrays can have performance problems
Need multiple dereferences
Increased memory use
Less control over data access pattern
Multidimensional arrays
Solution is to use a one-dimensional array and implement methods to perform the index calculations
public final double getData(int i, int j, int k) {
    return data[i + i_size * (j + j_size * k)];
}
This is all a bit low level
Ugly syntax
Efficient (methods should in-line)
Refactoring the data layout is now a local change, much easier than index re-ordering in C/Fortran
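The accessor above can be placed in a small self-contained class. This is a sketch only: the class name `Grid3D` and the `setData` and `index` methods are assumptions added for illustration. The index formula is the one from the slide, with `i` varying fastest.

```java
// Hypothetical 3D grid stored in a single one-dimensional array.
final class Grid3D {
    private final int iSize;
    private final int jSize;
    private final int kSize;
    private final double[] data;

    Grid3D(int iSize, int jSize, int kSize) {
        this.iSize = iSize;
        this.jSize = jSize;
        this.kSize = kSize;
        this.data = new double[iSize * jSize * kSize];
    }

    // The only place that knows the memory layout. Changing the ordering
    // of i, j and k in memory means editing this one method.
    int index(int i, int j, int k) {
        return i + iSize * (j + jSize * k);
    }

    double getData(int i, int j, int k) {
        return data[index(i, j, k)];
    }

    void setData(int i, int j, int k, double value) {
        data[index(i, j, k)] = value;
    }
}
```

Note that this sketch does no range checking on the individual indices; an out-of-range `i` can silently map to a valid element belonging to a different `j`.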
Data Structures
Data structures (Collections) are considerably slower than simple arrays in Java
Standard libraries are still typically much faster than writing your own equivalents.
java.util.ArrayList and java.util.HashMap introduced in 1.2
Adding is often faster than with java.util.Vector and java.util.Hashtable, as no synchronisation is present
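One reason collections cost more than simple arrays for numeric data is that a `List<Double>` holds references to boxed `Double` objects, whereas a `double[]` holds the values directly. The Java sketch below shows the two forms side by side; the class `SumComparison` and its method names are hypothetical, and no timings are claimed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper comparing a primitive array with a collection.
final class SumComparison {
    // Primitive array: values are stored contiguously, with no
    // per-element objects.
    static double sumArray(double[] values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    // Collection: each element is a reference to a boxed Double that is
    // unboxed on every read.
    static double sumList(List<Double> values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    // Copies an array into an unsynchronised ArrayList, boxing each value.
    static List<Double> toList(double[] values) {
        List<Double> list = new ArrayList<>(values.length);
        for (double v : values) {
            list.add(v);
        }
        return list;
    }
}
```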
Synchronisation
Synchronised methods and classes are often slower than unsynchronised methods and classes
Even sequentially
Overhead associated with synchronized methods also influences scaling of parallel code
General guidelines
Encapsulate the code hot-spot
This ensures you are free to optimise without impacting on the rest of the code.
Wherever possible be as restrictive as possible
Declare methods as final
Declaring a method as final helps the compiler to inline the methods
This is good programming practice as it reduces dependency between different parts of the code
Static vs instance variables
Declaring variables as constants (static final) allows the compiler to carry out more optimisations
Instance variables are initialised every time a new object is created; static variables are initialised once
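The guidelines on `final` methods and `static final` constants can be sketched in Java as follows. The class `Lattice` and its members are hypothetical names used only for this example.

```java
// Hypothetical class illustrating restrictive declarations.
class Lattice {
    // Compile-time constant shared by all instances; initialised once.
    static final int WIDTH = 4;

    // Instance field: allocated and initialised each time a Lattice
    // is created.
    private final double[] cells = new double[WIDTH * WIDTH];

    // final methods cannot be overridden in a subclass, so a call always
    // resolves to this implementation, which makes inlining straightforward.
    final double get(int row, int col) {
        return cells[row * WIDTH + col];
    }

    final void set(int row, int col, double value) {
        cells[row * WIDTH + col] = value;
    }
}
```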
Performance Case Study
Simple case study
Multiply a vector of 102400 complex numbers
Implement in C, F90 and Java on Sun platform
Three Java versions
Naive version
Each complex number a separate object
New object created for each multiply
Simple version
Each complex number a separate object
Method on result object
Vector version
Single object to represent the vector
Method on result object
Performance Case Study
Performance
Full optimisation flags used throughout
[Results chart: series labelled Java naive, Java simple, Java vector; measured values not recoverable]
Conclusions
Java JIT compilers seem to be as good at optimising code as conventional static compilers
Java performance issues are OO performance issues, not interpreted language issues
A range of OO specific performance issues exist
A range of Java specific performance issues exist