IBM p690

Compilers

Use the xl compilers whenever possible as they provide far superior performance.

xl Compilers

Documentation

Invocation

All the xl compilers have an _r flavour that generates thread-safe code. For example, xlc_r and xlf_r.

Compiler Flags

Various flags that might be useful:

-q32 or -q64
32-bit or 64-bit. 64-bit isn't automatically faster. If the problem always fits in a 32-bit address space, -q32 is faster because pointers and long integers are half the size, so the cache is better used. If the problem requires 64-bit integers for array access, then -q64 will be faster.
32-bit objects must be linked with 32-bit libraries.
64-bit objects must be linked with 64-bit libraries.
Hint: if your indices exceed 2 billion (2^31 - 1, the largest signed 32-bit integer), then use 64-bit.
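
The hint above can be turned into a quick check. The following is a minimal sketch, not part of the original notes: the helper names index_fits_32bit and pointer_bits are made up for illustration. The first tests whether an element count is still representable as a signed 32-bit integer; the second reports the pointer width of the current build, which should be 32 under -q32 and 64 under -q64.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if n_elements is representable as a
 * signed 32-bit integer (so 32-bit indices suffice), 0 otherwise. */
int index_fits_32bit(uint64_t n_elements)
{
    return n_elements <= (uint64_t)INT32_MAX;
}

/* Hypothetical helper: width of a pointer in bits for this build. */
int pointer_bits(void)
{
    return (int)(sizeof(void *) * CHAR_BIT);
}
```

If index_fits_32bit returns 0 for the largest array in the program, that is a sign the code needs 64-bit indices and should be built with -q64.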

-qsmp -qthreaded
Enables automatic parallelization. Always use the xl*_r invocation.
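
As a rough illustration of what automatic parallelization can and cannot do, here is a sketch with two loops. The function names are invented for this example, and whether the compiler actually parallelizes a given loop depends on its own analysis. In general, loops whose iterations are independent are candidates; loops where each iteration needs the result of the previous one are not.

```c
/* Iterations are independent: each out[i] depends only on in[i].
 * This is the kind of loop an auto-parallelizer can usually split
 * across threads. */
void scale(double *out, const double *in, double factor, int n)
{
    int i;
    for (i = 0; i < n; i++) {
        out[i] = factor * in[i];
    }
}

/* Loop-carried dependence: data[i] needs the updated data[i - 1].
 * A loop like this generally has to stay serial. */
void prefix_sum(double *data, int n)
{
    int i;
    for (i = 1; i < n; i++) {
        data[i] += data[i - 1];
    }
}
```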

-qsmp=omp
Enables parallelization according to the explicit user-supplied OpenMP pragma directives. Always use the xl*_r invocation.
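
For reference, an explicit OpenMP directive looks like the sketch below. The function sum_of_squares is a made-up example, not code from this site. The pragma asks for the loop to be split across threads, with each thread's partial sum combined at the end by the reduction clause. If the file is compiled without OpenMP enabled, the pragma is ignored and the loop simply runs serially.

```c
/* Illustrative example: sum of i*i for i = 0 .. n-1.
 * The reduction clause gives each thread a private copy of total
 * and adds the copies together when the loop finishes. */
double sum_of_squares(int n)
{
    double total = 0.0;
    int i;
#pragma omp parallel for reduction(+:total)
    for (i = 0; i < n; i++) {
        total += (double)i * (double)i;
    }
    return total;
}
```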

-qarch=auto -qtune=auto -qcache=auto
Generates code optimized for the architecture, instruction scheduling and cache layout of the machine on which you compile.

-qstrict
Optimization levels -On with n > 2 include optimizations that can change the semantics of the program, so the generated code might not do what you expect. -qstrict disables those optimizations.
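
One common reason aggressive optimization can change results is that floating-point addition is not associative, so reordering a sum can change its value. The sketch below assumes that this kind of reordering is among the transformations -qstrict guards against; the function names are invented for illustration. It shows the two groupings producing different answers for the same three inputs.

```c
/* Adds left to right: (a + b) + c. */
double sum_left(double a, double b, double c)
{
    return (a + b) + c;
}

/* Same three values, regrouped: a + (b + c). */
double sum_right(double a, double b, double c)
{
    return a + (b + c);
}
```

With a = 1e20, b = -1e20 and c = 1.0, sum_left gives 1.0 while sum_right gives 0.0, because 1.0 is lost when added to -1e20 first. If your results are sensitive to effects like this, compile with -qstrict.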

-qipa (enabled by -O4 & -O5), -qnoipa
Interprocedural Analysis. A nifty optimization that sometimes goes awry. If it does, use -qnoipa.

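-qpdf1, -qpdf2 & -qfdpr
-qpdf1 and -qpdf2 are used for the two-step process of profile-directed feedback: compile with -qpdf1, run the program on representative input to collect a profile, then recompile with -qpdf2. -qfdpr is a related flag that prepares object files for the separate FDPR (Feedback Directed Program Restructuring) tool.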

-qlibansi
Assumes that all functions with the same name as an ANSI library function are indeed the ANSI library functions. Bit odd, this one, because it sounds to me like an error should be generated otherwise.

Recommended compiler flags

-O5 -qsmp -qthreaded -qarch=auto -qtune=auto -qcache=auto

Note: the -q64 flag generates 64-bit code. When creating libraries, use ar -X32_64 r; otherwise the 64-bit object files will be ignored.