The Intel® C++ Compiler implements the following groups of routines as extensions to the OpenMP* run-time library:
Get and set the execution environment
Get and set the stack size for parallel threads
Memory allocation
Get and set the thread sleep time for the throughput execution mode
The Intel extension routines described in this section can be used for low-level tuning and to verify that the library code and the application are functioning as intended. Other OpenMP-compliant compilers generally do not recognize these routines, so the link stage may fail when the code is built with another compiler. To execute these OpenMP* routines, use the [Q]openmp-stubs option.
In most cases, environment variables can be used in place of the extension library routines. For example, the stack size of the parallel threads may be set using the OMP_STACKSIZE environment variable rather than the kmp_set_stacksize_s() library routine.
Note
A run-time call to an Intel extension routine takes precedence over the corresponding environment variable setting.
Execution Environment Routines
Stack Size
Function | Description |
---|---|
| Returns the number of bytes that will be allocated for each parallel thread to use as its private stack. This value can be changed with the kmp_set_stacksize_s() routine prior to the first parallel region, or via the |
| Provided for backward compatibility only. Use the kmp_get_stacksize_s() routine for compatibility across different families of Intel processors. |
| Sets to size the number of bytes that will be allocated for each parallel thread to use as its private stack. This value can also be set via the |
| Provided for backward compatibility only. Use |
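
As a rough illustration of the stack-size routines, the sketch below sets the per-thread stack size and reads it back. It assumes the kmp_* routines are declared in the Intel-supplied omp.h with the signatures `void kmp_set_stacksize_s(size_t)` and `size_t kmp_get_stacksize_s(void)`. The compiler-macro guard and the helper name `request_thread_stack` are hypothetical and added only so the file still builds with compilers that do not provide these extensions.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: only Intel compilers are treated as providing the kmp_* routines. */
#if defined(__INTEL_COMPILER) || defined(__INTEL_LLVM_COMPILER)
#include <omp.h>                 /* assumed to declare the kmp_* extensions */
#define HAVE_KMP_EXTENSIONS 1
#else
#define HAVE_KMP_EXTENSIONS 0
#endif

/* Hypothetical helper: request `bytes` of private stack per parallel thread.
   Returns the size reported by the run-time library, or 0 when the
   extension routines are not available in this build. */
size_t request_thread_stack(size_t bytes)
{
    assert(bytes > 0);
#if HAVE_KMP_EXTENSIONS
    kmp_set_stacksize_s(bytes);      /* call before the first parallel region */
    return kmp_get_stacksize_s();    /* value the library will actually use */
#else
    (void)bytes;                     /* fall back to OMP_STACKSIZE in the environment */
    return 0;
#endif
}
```

A caller would typically invoke `request_thread_stack((size_t)8 * 1024 * 1024)` once at program start, before any parallel region runs. The value returned may differ from the value requested if the library adjusts it.
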
Memory Allocation
The Intel® C++ Compiler implements a group of memory allocation routines as an extension to the OpenMP* run-time library to enable threads to allocate memory from a heap local to each thread. These routines are: kmp_malloc(), kmp_calloc(), and kmp_realloc().
Memory allocated by these routines must be freed with the kmp_free() routine. You can allocate memory in one thread and free it in a different thread, but this mode of operation incurs a slight performance penalty.
Function | Description |
---|---|
| Allocates a memory block of size bytes from the thread-local heap. |
| Allocates an array of nelem elements of size elsize from the thread-local heap. |
| Reallocates the memory block at address ptr to size bytes from the thread-local heap. |
| Frees the memory block at address ptr from the thread-local heap. The memory must have been previously allocated with kmp_malloc(), kmp_calloc(), or kmp_realloc(). |
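
The allocate, grow, and free pairing described above might look like the following sketch. It assumes the signatures `void *kmp_calloc(size_t nelem, size_t elsize)`, `void *kmp_realloc(void *ptr, size_t size)`, and `void kmp_free(void *ptr)` from the Intel-supplied omp.h. The `TL_*` macros, the compiler-macro guard, and the helper `fill_and_sum` are hypothetical; on other compilers the macros fall back to the standard C allocator so the example still builds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumption: only Intel compilers are treated as providing the kmp_* routines. */
#if defined(__INTEL_COMPILER) || defined(__INTEL_LLVM_COMPILER)
#include <omp.h>
#define TL_CALLOC(n, sz)  kmp_calloc((n), (sz))
#define TL_REALLOC(p, sz) kmp_realloc((p), (sz))
#define TL_FREE(p)        kmp_free(p)
#else
#define TL_CALLOC(n, sz)  calloc((n), (sz))
#define TL_REALLOC(p, sz) realloc((p), (sz))
#define TL_FREE(p)        free(p)
#endif

/* Hypothetical helper: allocate n ints, grow the block to 2*n ints,
   fill it with 0..2n-1, and return the sum. Returns -1 on allocation failure.
   Every block obtained from TL_CALLOC/TL_REALLOC is released with TL_FREE. */
long fill_and_sum(size_t n)
{
    int *buf = (int *)TL_CALLOC(n, sizeof *buf);
    if (buf == NULL)
        return -1;
    for (size_t i = 0; i < n; ++i)
        buf[i] = (int)i;

    int *grown = (int *)TL_REALLOC(buf, 2 * n * sizeof *grown);
    if (grown == NULL) {
        TL_FREE(buf);                /* original block is still valid */
        return -1;
    }
    for (size_t i = n; i < 2 * n; ++i)
        grown[i] = (int)i;

    long sum = 0;
    for (size_t i = 0; i < 2 * n; ++i)
        sum += grown[i];

    TL_FREE(grown);                  /* freed by the same thread that allocated it */
    return sum;
}
```

Calling `fill_and_sum` from inside a parallel region keeps each allocation and its matching free on the same thread, which avoids the cross-thread penalty mentioned above.
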
Thread Sleep Time
In the throughput OpenMP* Support Libraries, threads wait for new parallel work at the end of each parallel region and go to sleep after a specified period of time. This time interval can be set with the KMP_BLOCKTIME environment variable or the kmp_set_blocktime() function.
Function | Description |
---|---|
| Returns the number of milliseconds that a thread should wait, after completing the execution of a parallel region, before sleeping, as set either by the |
| Sets the number of milliseconds that a thread should wait, after completing the execution of a parallel region, before sleeping. This routine affects the block time setting for the calling thread and any OpenMP* team threads formed by the calling thread. The routine does not affect the block time for any other threads. |
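
A minimal sketch of adjusting the block time from code follows. It assumes the signatures `int kmp_get_blocktime(void)` and `void kmp_set_blocktime(int)` from the Intel-supplied omp.h. The compiler-macro guard and the helper name `swap_blocktime` are hypothetical and serve only to keep the example buildable where the extensions are missing.

```c
#include <assert.h>

/* Assumption: only Intel compilers are treated as providing the kmp_* routines. */
#if defined(__INTEL_COMPILER) || defined(__INTEL_LLVM_COMPILER)
#include <omp.h>
#define HAVE_KMP_EXTENSIONS 1
#else
#define HAVE_KMP_EXTENSIONS 0
#endif

/* Hypothetical helper: set the block time (in milliseconds) for the calling
   thread and the teams it forms, and return the previous setting.
   Returns -1 when the extension routines are not available in this build. */
int swap_blocktime(int millis)
{
    assert(millis >= 0);
#if HAVE_KMP_EXTENSIONS
    int previous = kmp_get_blocktime();
    kmp_set_blocktime(millis);       /* other threads keep their own setting */
    return previous;
#else
    (void)millis;                    /* fall back to KMP_BLOCKTIME in the environment */
    return -1;
#endif
}
```

Saving the value returned by `swap_blocktime` lets the caller restore the earlier setting after a latency-sensitive phase, for example with a second call that passes the saved value back in.
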