JEP 197: Segmented Code Cache
Summary
Divide the code cache into distinct segments, each of which contains compiled code of a particular type, in order to improve performance and enable future extensions.
Goals
- Separate non-method, profiled, and non-profiled code
- Shorter sweep times due to specialized iterators that skip non-method code
- Improve execution time for some compilation-intensive benchmarks
- Better control of JVM memory footprint
- Decrease fragmentation of highly-optimized code
- Improve code locality because code of the same type is likely to be accessed close in time
  - Better iTLB and iCache behavior
- Establish a base for future extensions
  - Improved management of heterogeneous code; for example, Sumatra (GPU code) and AOT compiled code
  - Possibility of fine-grained locking per code heap
  - Future separation of code and metadata (see JDK-7072317)
Non-Goals
The segmented code cache only provides a base for future extensions such as fine-grained locking; it does not yet implement any of these improvements.
Success Metrics
- Separation of different code types
- Shorter sweep time
- Lower execution time
- Decreased fragmentation of highly optimized code
- Reduced number of iTLB and iCache misses
Motivation
The organization and maintenance of compiled code has a significant impact on performance. Performance regressions by several factors have been reported when the code cache takes the wrong actions. With the introduction of tiered compilation the role of the code cache has become even more important, since the amount of compiled code increases by a factor of 2X to 4X compared to non-tiered compilation. Tiered compilation also introduces a new compiled code type: instrumented compiled code (profiled code). Profiled code has different properties from non-profiled code; one important difference is that profiled code has a predefined, limited lifetime, while non-profiled code potentially remains in the code cache forever.
The current code cache is optimized to handle homogeneous code, i.e., only one type of compiled code. The code cache is organized as a single heap data structure on top of a contiguous chunk of memory. Therefore, profiled code, which has a predefined, limited lifetime, is mixed with non-profiled code, which potentially remains in the code cache forever. This leads to various performance and design problems. For example, the method sweeper has to scan the entire code cache while sweeping, even if some entries are never flushed or contain non-method code.
Description
Instead of having a single code heap, the code cache is segmented into distinct code heaps, each of which contains compiled code of a particular type. Such a design enables us to separate code with different properties. There are three different top-level types of compiled code:
- JVM internal (non-method) code
- Profiled code
- Non-profiled code
The corresponding code heaps are:
- A non-method code heap containing non-method code, such as compiler buffers and the bytecode interpreter. This code type will stay in the code cache forever.
- A profiled code heap containing lightly optimized, profiled methods with a short lifetime.
- A non-profiled code heap containing fully optimized, non-profiled methods with a potentially long lifetime.
The non-method code heap has a fixed size of 3 MB to account for the VM internals plus additional space for the compiler buffers. This additional space is adjusted according to the number of C1/C2 compiler threads. The remaining code cache space is distributed evenly among the profiled and the non-profiled code heaps.
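The sizing rule above can be sketched as a small calculation. This is only an illustration, not the HotSpot implementation: the class name, the method name, and the per-thread buffer size passed in are hypothetical, and the sketch assumes the compiler-buffer space is simply added on top of the 3 MB base.

```java
public final class CodeHeapSizing {
    static final long MB = 1024L * 1024L;

    // From the text: 3 MB are set aside for the VM internals.
    static final long NON_METHOD_BASE = 3 * MB;

    /**
     * Returns {nonMethod, profiled, nonProfiled} sizes in bytes.
     * bufferBytesPerThread is an assumed, illustrative parameter; the text
     * only says the extra space depends on the number of compiler threads.
     */
    static long[] split(long reservedBytes, int compilerThreads, long bufferBytesPerThread) {
        long nonMethod = NON_METHOD_BASE + compilerThreads * bufferBytesPerThread;
        if (nonMethod > reservedBytes) {
            throw new IllegalArgumentException("code cache too small for the non-method heap");
        }
        // The remaining space is distributed evenly among the two method heaps.
        long remaining = reservedBytes - nonMethod;
        long profiled = remaining / 2;
        long nonProfiled = remaining - profiled; // absorbs an odd byte, if any
        return new long[] { nonMethod, profiled, nonProfiled };
    }

    public static void main(String[] args) {
        long[] sizes = split(240 * MB, 4, 512 * 1024);
        System.out.printf("non-method=%d profiled=%d non-profiled=%d%n",
                sizes[0], sizes[1], sizes[2]);
    }
}
```

With a small total size the non-method share dominates, which is the situation the Risks and Assumptions section is concerned with.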
The following command-line switches are introduced to control the sizes of the code heaps:
- -XX:NonProfiledCodeHeapSize: Sets the size in bytes of the code heap containing non-profiled methods.
- -XX:ProfiledCodeHeapSize: Sets the size in bytes of the code heap containing profiled methods.
- -XX:NonMethodCodeHeapSize: Sets the size in bytes of the code heap containing non-method code.
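One way to observe the effect of these switches from inside a running application is to list the JVM's non-heap memory pools through the standard java.lang.management API. The sketch below is an assumption-laden illustration: it filters pools by the substring "code" in their name, which is a heuristic, since pool names are implementation-specific and are not defined by this JEP. The class and method names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

public final class CodeHeapPools {
    /** Returns the non-heap pools whose name mentions "code" (heuristic filter). */
    static List<MemoryPoolMXBean> codePools() {
        List<MemoryPoolMXBean> result = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            boolean nonHeap = pool.getType() == MemoryType.NON_HEAP;
            boolean looksLikeCode = pool.getName().toLowerCase().contains("code");
            if (nonHeap && looksLikeCode) {
                result.add(pool);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : codePools()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            if (usage != null) {
                // getMax() reports -1 when the maximum is undefined.
                System.out.printf("%-40s used=%d max=%d%n",
                        pool.getName(), usage.getUsed(), usage.getMax());
            }
        }
    }
}
```

Running the program with different values for the three size switches and comparing the reported maxima gives a quick check that the settings took effect.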
The interface and implementation of the code cache are adapted to support multiple code heaps. Because the code cache is a central component of the JVM, many other components are affected by these changes, including the following:
- Code cache sweeper: Now only iterates over the method-code heaps
- Tiered compilation policy: Sets compilation thresholds according to free space in code heaps
- Java Flight Recorder (JFR): Events related to the code cache
- Indirect references from:
  - Serviceability Agent: Java interface to code-cache internals
  - DTrace ustack helper script (jhelper.d): Resolving of names of compiled Java methods
  - Pstack support library (libjvm_db.c): Stack tracing of compiled Java methods
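To illustrate the sweeper change listed above, the following toy model keeps one list of code blobs per code type and lets the sweep visit only the two method heaps. It is a conceptual sketch only; the class, enum, and method names are hypothetical and do not correspond to HotSpot's C++ sources.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public final class SegmentedCacheModel {
    // The three top-level code types named in the Description.
    enum CodeType { NON_METHOD, PROFILED, NON_PROFILED }

    // One "code heap" per type, modeled as a list of blob names.
    private final Map<CodeType, List<String>> heaps = new EnumMap<>(CodeType.class);

    SegmentedCacheModel() {
        for (CodeType type : CodeType.values()) {
            heaps.put(type, new ArrayList<>());
        }
    }

    void add(CodeType type, String blobName) {
        heaps.get(type).add(blobName);
    }

    /** Blobs a sweeper would visit: the method heaps only, never non-method code. */
    List<String> sweepCandidates() {
        List<String> visited = new ArrayList<>();
        visited.addAll(heaps.get(CodeType.PROFILED));
        visited.addAll(heaps.get(CodeType.NON_PROFILED));
        return visited;
    }

    public static void main(String[] args) {
        SegmentedCacheModel cache = new SegmentedCacheModel();
        cache.add(CodeType.NON_METHOD, "interpreter");
        cache.add(CodeType.PROFILED, "Foo.bar (profiled)");
        cache.add(CodeType.NON_PROFILED, "Foo.bar (optimized)");
        System.out.println(cache.sweepCandidates());
    }
}
```

In a single-heap layout the same sweep would have to walk all three entries and filter out the interpreter blob on every pass.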
Alternatives
An alternative implementation would define logical memory regions into which different code types are preferably allocated. If there is free space, we allocate into the preferred memory region and if there is no free space left, we allocate somewhere else.
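A minimal sketch of this alternative, assuming each logical region is tracked only by its remaining free bytes, might look as follows. The class name, the region indexing, and the "first other region that fits" fallback order are illustrative assumptions, not part of any proposed implementation.

```java
public final class PreferredRegionAllocator {
    // Remaining free bytes per logical region.
    private final long[] free;

    PreferredRegionAllocator(long... capacities) {
        this.free = capacities.clone();
    }

    /**
     * Allocates size bytes, trying the preferred region first and then any
     * other region with enough space. Returns the index of the region used,
     * or -1 if no region can hold the request.
     */
    int allocate(int preferred, long size) {
        if (free[preferred] >= size) {
            free[preferred] -= size;
            return preferred;
        }
        for (int i = 0; i < free.length; i++) {
            if (i != preferred && free[i] >= size) {
                free[i] -= size;
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        PreferredRegionAllocator allocator = new PreferredRegionAllocator(10, 10);
        System.out.println(allocator.allocate(0, 8)); // fits in the preferred region
        System.out.println(allocator.allocate(0, 8)); // falls back to the other region
        System.out.println(allocator.allocate(0, 8)); // no region has 8 bytes left
    }
}
```

The fallback avoids refusing an allocation while space remains elsewhere, at the cost of mixing code types again once a region fills up.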
Testing
Intensive correctness testing using JPRT, Nashorn + Octane, SPECjbb2013, SPECjbb2005, SPECjvm2008.
We need to make sure that there is no performance degradation, especially for embedded usage with small code-cache sizes.
Testing of affected components including Serviceability Agent, DTrace, Pstack, Java Flight Recorder.
Risks and Assumptions
Having a fixed size per code heap leads to a potential waste of memory when one code heap is full while there is still space in another. Especially for very small code-cache sizes, the compilers may be shut off even though space is still available. To solve this problem, an option will be added to turn off the segmentation for small code-cache sizes.
The size of the non-method code depends on the Java application, the underlying platform, and the JVM settings. It is therefore hard to determine the required space in the non-method code heap at JVM startup.
Future versions of this patch may implement dynamic resizing (supported by the sweeper) or different allocation strategies to lower the risk of wasting memory.