HINTS IN VECTORC
To help you and the optimizer take advantage of features of the latest processors, you can add hints to your code. The optimizer may ignore these hints, and they do not change the meaning of your source code.
Experiment with the hints described here. In some situations a hint can slow your program down, but getting the values right can give a significant speed benefit.
Floating-point precision
__hint__((precision(12)))
This can be placed before the division operator (/) or square root function
(sqrt) to tell the optimizer that it can use the lower precision floating-point
instructions of 3DNow! and SSE. It does not force the optimizer to use these
instructions, though.
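As a minimal sketch of where this hint might go, the function below divides every element of an array by a constant and places the hint directly before the division operator. The HINT wrapper macro and the USE_VECTORC_HINTS switch are hypothetical names introduced here so that the same file still builds as ordinary C on other compilers; they are not part of VectorC. The value 12 is simply the one shown above.

```c
#include <stddef.h>

/* Hypothetical wrapper: define USE_VECTORC_HINTS when building with VectorC.
   On any other compiler the hint expands to nothing. */
#ifdef USE_VECTORC_HINTS
#define HINT(x) __hint__(x)
#else
#define HINT(x)
#endif

/* Divide each element of in[] by d and store the result in out[]. */
void divide_all(float *out, const float *in, size_t n, float d)
{
    size_t i;
    for (i = 0; i < n; i++) {
        /* Hint placed before the division operator: a lower-precision
           divide is acceptable for this expression. */
        out[i] = in[i] HINT((precision(12))) / d;
    }
}
```

Because the optimizer is free to ignore the hint, code written this way should not rely on the result being either exact or approximate.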
Prefetching
__hint__((prefetch))
Used in a declaration, this tells the optimizer to try to use prefetches when
accessing this memory. When placed at the start of a variable declaration, it
applies to accesses to that variable. When placed before the * in a pointer declaration, it applies to accesses to the data that the pointer points to. Prefetches are only used when an array is accessed in sequence and the target processor supports them.
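The two placements described above might look like the following sketch. The array name, the function, and the HINT wrapper macro (with its USE_VECTORC_HINTS switch) are illustrative assumptions; the wrapper only exists so the code also compiles where __hint__ is not available.

```c
#include <stddef.h>

/* Hypothetical wrapper: expands to __hint__ only when USE_VECTORC_HINTS
   is defined, and to nothing otherwise. */
#ifdef USE_VECTORC_HINTS
#define HINT(x) __hint__(x)
#else
#define HINT(x)
#endif

/* Hint at the start of a variable declaration: applies to accesses
   to the variable table itself. */
HINT((prefetch)) int table[1024];

/* Hint before the '*': applies to the data that src points to. */
long sum_sequential(const int HINT((prefetch)) *src, size_t n)
{
    long total = 0;
    size_t i;
    /* Sequential array access, the only case in which the text says
       prefetches are used. */
    for (i = 0; i < n; i++)
        total += src[i];
    return total;
}
```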
__hint__((prefetch (address)));
This tells the optimizer to try to issue a prefetch instruction here to prefetch
the data at the given address. The value of address is a pointer value (use 'p'
if p is a pointer, not '*p'). The optimizer may remove this instruction.
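A sketch of the statement form follows. It passes a pointer value, not the pointed-to data, as the text requires. The prefetch distance of 16 elements is an arbitrary assumption that would need tuning, and the HINT wrapper and USE_VECTORC_HINTS switch are hypothetical names used only to keep the code portable.

```c
#include <stddef.h>

/* Hypothetical wrapper: expands to __hint__ only when USE_VECTORC_HINTS
   is defined, and to nothing otherwise. */
#ifdef USE_VECTORC_HINTS
#define HINT(x) __hint__(x)
#else
#define HINT(x)
#endif

#define AHEAD 16 /* assumed prefetch distance, in elements */

/* Multiply every element of p[] by k. */
void scale_in_place(float *p, size_t n, float k)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (i + AHEAD < n) {
            /* Pointer value (p + i + AHEAD), not *(p + i + AHEAD). */
            HINT((prefetch(p + i + AHEAD)));
        }
        p[i] *= k;
    }
}
```

The bounds check keeps the hinted address inside the array. Whether it is worth its cost is something to measure, since the optimizer may remove the prefetch anyway.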
Non-temporal stores
__hint__((nontemporal))
Used in a declaration, this tells the optimizer to try to write to this memory
using non-temporal writes, so that the processor does not cache the data being written. This can speed up programs by reducing cache pollution. Only use this hint if the memory being written will not be read back until after the cache has been filled. Caches on modern processors are large, so only use this hint when writing out more than 128K-512K. Non-temporal writes are currently only available on Pentium III processors. Used inappropriately, this hint can slow your code down significantly, so you will probably need to experiment to find the situations where it helps.
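The sketch below marks a large output buffer as a non-temporal target. The text does not say where in the declaration the hint goes, so placing it before the * (by analogy with the prefetch hint) is an assumption, as are the function name and the HINT wrapper with its USE_VECTORC_HINTS switch.

```c
#include <stddef.h>

/* Hypothetical wrapper: expands to __hint__ only when USE_VECTORC_HINTS
   is defined, and to nothing otherwise. */
#ifdef USE_VECTORC_HINTS
#define HINT(x) __hint__(x)
#else
#define HINT(x)
#endif

/* Fill a large buffer that will not be read back soon.
   Assumed placement: before the '*', so the hint applies to the
   data that dst points to. */
void fill_buffer(unsigned char HINT((nontemporal)) *dst, size_t n,
                 unsigned char value)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] = value;
}
```

A call such as fill_buffer(buf, 1024 * 1024, 0) writes well over the size suggested above. For small buffers the hint is more likely to hurt than help.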
Loop unrolling
__hint__((unroll (n)));
Used in a loop, this tells the optimizer to unroll the loop by factor n. This
can be useful when you know that a particular unroll factor is appropriate. For
example, if your loop deals with 16-bit values and you know that the loop would
work well with MMX, try a factor of 4 (MMX can process four 16-bit values at a time), or try 8 to do two MMX operations at once. Without this hint, VectorC decides for itself which loop unroll factor, if any, is appropriate.
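The 16-bit case described above could be sketched as follows. Putting the hint as the first statement of the loop body is an assumption about what "used in a loop" means, and the HINT wrapper and USE_VECTORC_HINTS switch are hypothetical names that let the code build without VectorC.

```c
#include <stddef.h>

/* Hypothetical wrapper: expands to __hint__ only when USE_VECTORC_HINTS
   is defined, and to nothing otherwise. */
#ifdef USE_VECTORC_HINTS
#define HINT(x) __hint__(x)
#else
#define HINT(x)
#endif

/* Add two arrays of 16-bit values element by element. */
void add_i16(short *dst, const short *a, const short *b, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        /* Factor 4: four 16-bit values fit in one 64-bit MMX register.
           Try 8 to cover two MMX operations per iteration. */
        HINT((unroll(4)));
        dst[i] = (short)(a[i] + b[i]);
    }
}
```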
Vectorization
__hint__((vectorize))
Used in a declaration, this is a hint to the vectorizer to combine this variable with another inside a vector register. This is only used for vectorizing throughout an entire loop and across basic blocks. Vectorizing within basic blocks is entirely automatic.
OK to Read/Write Addresses
__hint__ ((okread (value)));
This is a statement that tells VectorC that a particular value can be read from
memory, even though no reference is made to it in this section of code. This is
used where an optimization could be applied, but only if a value can be read from memory without causing an exception. Examples of such optimizations are turning a conditional read into a conditional move from a memory location, or vectorizing a read of 3 bytes into a single 4-byte read.
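Both cases mentioned above are sketched below: a conditional read that could become a conditional move, and a 3-byte read next to a readable fourth byte. The function names and the pixel layout are illustrative assumptions, and the HINT wrapper with its USE_VECTORC_HINTS switch is a hypothetical portability aid.

```c
/* Hypothetical wrapper: expands to __hint__ only when USE_VECTORC_HINTS
   is defined, and to nothing otherwise. */
#ifdef USE_VECTORC_HINTS
#define HINT(x) __hint__(x)
#else
#define HINT(x)
#endif

/* Return *p if use is nonzero, otherwise fallback.
   The caller must pass a valid pointer even when use is zero. */
int pick(const int *p, int use, int fallback)
{
    /* *p may be read unconditionally, which permits a conditional move. */
    HINT((okread(*p)));
    return use ? *p : fallback;
}

/* Sum the first three bytes of a 4-byte pixel; the fourth is padding. */
unsigned sum_rgb(const unsigned char *px)
{
    /* px[3] is readable although it is never used below, so the three
       byte reads could be merged into one 4-byte read. */
    HINT((okread(px[3])));
    return (unsigned)px[0] + px[1] + px[2];
}
```

The hint is a promise made by the programmer. If px[3] or *p were not actually readable, an optimization relying on it could fault.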
__hint__((okwrite(value)));
This statement is similar to "okread" except that it allows a memory
address to be written to. However, this does not mean that the memory value can
be modified: the value written to this address must be the value that was already there. This can be useful in vectorizing 3 single-byte writes into a single 4-byte write.
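A sketch of the 3-byte write case follows, assuming a 4-byte pixel whose last byte must keep its value. The function name and the pixel layout are illustrative, and the HINT wrapper and USE_VECTORC_HINTS switch are hypothetical names that keep the code compilable without VectorC.

```c
/* Hypothetical wrapper: expands to __hint__ only when USE_VECTORC_HINTS
   is defined, and to nothing otherwise. */
#ifdef USE_VECTORC_HINTS
#define HINT(x) __hint__(x)
#else
#define HINT(x)
#endif

/* Store three color bytes into a 4-byte pixel, leaving px[3] unchanged. */
void set_rgb(unsigned char *px, unsigned char r, unsigned char g,
             unsigned char b)
{
    /* px[3] may be written, but only with the value it already holds,
       so the three byte stores could become one 4-byte store. */
    HINT((okwrite(px[3])));
    px[0] = r;
    px[1] = g;
    px[2] = b;
}
```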