It’s complicated… Relationship verification is very important, but can we be a little lazy about it?

Java bytecode verification entails several processes, one of which is class relationship verification. Verification is in turn one of the steps of Java Virtual Machine (JVM) startup, which covers all JVM and application setup that precedes the actual execution of the program. Consequently, the longer verification takes, the longer startup takes, and the longer the application waits to run.

Fast startup times are beneficial across use cases, but are particularly important in serverless architectures and in continuous deployment workflows. In serverless computing, applications tend to be smaller and are often spun up and shut down at higher frequencies than traditional applications.

Similarly, continuous deployment involves delivering software in short, frequent cycles, where startup time can impact the speed at which software can be released and redeployed. Any delay in startup time translates to a delay in running the application. As such, it is important to reduce JVM startup time as much as possible, so that the cost of starting and restarting applications is minimized.

If we take a different approach to bytecode verification than what currently exists, we can improve the way that classes and interfaces are checked for validity, such that VM startup time can be reduced. How? Lazy relationship verification!

Recently added to OpenJ9 (v0.17.0 release), the new command line option -XX:+ClassRelationshipVerifier enables a modified verification technique that enhances JVM startup performance.

Let’s start with a bit of background on verification; then, we’ll talk about how this new feature works.

Background

Class Files

Java class files are produced as a result of compiling JVM language source files. Each class file contains Java bytecodes which define a class, interface or module. The JVM loads these files and executes the bytecodes – but first, it needs to confirm that the class files are properly formed.

After checking that each class file follows the expected class file format, bytecode verification occurs. These checks are necessary to ensure the integrity of the files before execution, since JVMs can be asked to load class files that may have been tampered with, that were generated by an unreliable compiler, or that do not satisfy required constraints as detailed by the JVM Specification.

Essentially, verification is the inspection of class files, with the purpose of preventing the Java interpreter from receiving malicious bytecode.

StackMapTable

The StackMapTable is a component of Java class files that is used for type checking during verification. Each table holds a number of stack map frame entries, with each frame specifying the types that correspond to a particular bytecode offset in the class file.

In other words, stack map frames indicate the expected types at each point in the bytecode. Verification involves using these explicit types to compare against the inferred types to ensure that they are compatible. We’ll look into an example below.

From this point on, classes and interfaces will be referred to as ‘classes’ to reduce verbosity, except when a distinction between the two is important.

Loading, Linking, Initialization

Java classes are dynamically loaded, linked and initialized by the JVM. Loading involves the creation of a class from its binary representation in a class file. During the linking stage, a loaded class is then integrated with the JVM runtime state so that it can be used. Finally, the class undergoes initialization, where the class’s initialization method (aka <clinit>) is invoked by the JVM.

It is a requirement of the JVM Specification for a class to be completely loaded before it is verified and to be completely verified (and prepared) before it is initialized. As long as this order is maintained, the point in time during which these processes occur (during startup, runtime, or a combination thereof) can be manipulated.
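To see that loading and initialization really are separate steps that can happen at different times, here is a minimal sketch. The class names LoadingDemo and Lazy are made up for illustration; the only API relied on is the standard Class.forName(String, boolean, ClassLoader), whose second argument controls whether the class is initialized.

```java
class LoadingDemo {
    // Set by Lazy's static initializer (<clinit>) the first time Lazy is initialized.
    static boolean lazyInitialized = false;

    static class Lazy {
        static {
            lazyInitialized = true;
        }

        static int value() {
            return 42;
        }
    }

    /** Returns {initialized after loading only, initialized after first use}. */
    static boolean[] run() {
        String name = LoadingDemo.class.getName() + "$Lazy";
        try {
            // initialize=false: load the class without running its static initializer.
            Class.forName(name, false, LoadingDemo.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        boolean afterLoad = lazyInitialized;

        // The first static method call triggers initialization.
        Lazy.value();
        boolean afterUse = lazyInitialized;

        return new boolean[] {afterLoad, afterUse};
    }

    public static void main(String[] args) {
        boolean[] result = run();
        System.out.println("initialized after load: " + result[0]);
        System.out.println("initialized after use:  " + result[1]);
    }
}
```

On a first run in a fresh JVM, the class is loaded without being initialized, and initialization only happens once the class is actually used.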

To kick off these three processes, a class loader creates an initial class, which is then linked and initialized, followed by the invocation of the main method, i.e. public static void main(String[]). The application class loader is responsible for loading the main class, which is the class that contains the main method. From there, further loading, linking and initialization of classes occurs.

Let's say we have the following classes A, B and C.

// A.java
public class A {
    public static void main(String[] args) {
        B b = new B();
        acceptC(b);
    }
    public static void acceptC(C c) {}
}
// B.java
public class B extends C {}
// C.java
public class C {}

A undergoes loading, linking and initialization, which then triggers the same chain of events for class B and class C.

As we can see, acceptC() (line 7) expects to be provided an object of type C, as defined in its method signature. acceptC(b) in the main method (line 5) is represented in class A’s stack map table. The relevant stack map frames specify that C is the expected type for acceptC(); thus, the passed-in object of type B must be assignable to class C.

Verification seeks to prove that b can be passed as a C, which triggers the loading of classes B and C. Then, verification requires that C is either a superclass of B, or that C is an interface (whether or not B implements it – implementation is checked at runtime). As long as one of these cases is true, verification will pass.

Since B is defined to extend C, i.e. C is in fact a superclass of B, verification is successful. Thus, the object passed to acceptC() is compatible with the expected type indicated in the stack map table. If B were not defined to extend C (that is, if the two classes were unrelated to one another), then verification would fail and a java.lang.VerifyError would be thrown.
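The rule described above can be modelled in a few lines of Java using reflection. This is a simplified sketch for illustration only, not the verifier's actual code: the name verifierAccepts and the Unrelated class are invented here, and arrays, primitives and null are ignored.

```java
class AssignabilitySketch {
    static class C {}
    static class B extends C {}
    static class Unrelated {}

    /**
     * Simplified model of the check: the target must be the source itself,
     * a superclass of the source, or an interface.
     */
    static boolean verifierAccepts(Class<?> source, Class<?> target) {
        if (target.isInterface()) {
            // Whether the source implements the interface is left to a runtime check.
            return true;
        }
        // Walk up the superclass chain of the source, looking for the target.
        for (Class<?> c = source; c != null; c = c.getSuperclass()) {
            if (c == target) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(verifierAccepts(B.class, C.class));                // true
        System.out.println(verifierAccepts(Unrelated.class, C.class));        // false
        System.out.println(verifierAccepts(Unrelated.class, Runnable.class)); // true
    }
}
```

Note that walking the superclass chain requires both classes to be loaded, which is exactly the cost discussed next.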

Inefficiencies

Since classes need to be loaded before they are verified, the verification of one class often triggers the loading of other classes, which can then trigger class loading for further classes, and so on.

In a basic example like the one above, there is not much overhead caused by having to load and verify classes, due to the simple nature of the code, the small number of classes, and the lack of complex or lengthy class hierarchies. However, in more sophisticated applications, we can imagine the extent to which class loading can be propagated, resulting in a far more inflated overhead.

Lazily doing some of the class loading during verification can be an effective technique to improve startup time performance by avoiding these inefficiencies as much as possible. Let’s look at how we can more efficiently approach verification, by being a bit lazier.


Class Relationship Verifier

The Idea

The particular area of verification that we will look at is type checking (i.e. class compatibility) – that is, verifying that a class’s type is assignable to the expected type. Verification of class relationships requires a class and its expected class type to be loaded in order to confirm that their class hierarchy is compatible. As mentioned earlier, verification tends to propagate, which impacts startup time due to the chain of class loading that is triggered.

Currently, the JVM experiences the cost of loading each class upfront (during startup) in order to verify that each class relationship is correct. Is it really necessary to load and verify every class encountered in the bytecode? What if we spend all this time loading and verifying classes that we never end up using?

For example, it’s typical to load and verify a variety of Exception classes which may be used at some point in the application. However, we wouldn’t be using these classes except in cases where an exception occurs, so why spend time verifying them during startup? Why not defer verification until the classes are actually requested, if they are ever requested? Well, let’s get lazy!

The Implementation

Class compatibility is verified between two classes: the source class (child class; the inferred type) and the target class (parent class; the expected type). In the example provided earlier, class B is the source class and class C is the target class. There are two general cases to consider at the point where compatibility between two classes is to be checked during verification:

  • Case 1: Both classes are loaded
    • Both of the classes have already been loaded.
  • Case 2: At least one of the two classes is not loaded
    • Only one of the classes is already loaded, or neither of the classes has been loaded.

For Case 1, where both classes are already loaded, we proceed to verification as usual. Initialization of the source class is still triggered when the class needs to be used during execution.

Otherwise, in Case 2, in place of loading the unloaded class(es) and verifying the relationship between source and target on the spot, we instead record the relationship. Other relationships may be recorded in the meantime, between the source class and other target classes, or between different classes altogether. Once the source class is requested during program execution, the class is loaded. Then, we verify that each recorded relationship between the source and its targets holds. Next, the source class is initialized and made available for use.

By recording a relationship between a source class and a target class instead of loading them on the spot (during startup), we defer class loading and verification until the class is required by execution (during runtime); or, if the class is never used, then we avoid class loading and verification altogether. This “being lazy” thing is sounding pretty beneficial…
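The two cases can be sketched as a toy model in Java. This is an illustration under stated assumptions, not OpenJ9's implementation: classes are represented by plain names, the hierarchy is a hypothetical name-to-superclass map standing in for class files, interface targets are left out, and all method names are invented for this example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class DeferredRelationshipSketch {
    // Toy hierarchy: class name -> superclass name (hypothetical stand-in for class files).
    private final Map<String, String> superclassOf;
    private final Set<String> loaded = new HashSet<>();
    // Recorded relationships: source class -> targets it must be assignable to.
    private final Map<String, Set<String>> recorded = new HashMap<>();

    DeferredRelationshipSketch(Map<String, String> superclassOf) {
        this.superclassOf = superclassOf;
    }

    boolean isLoaded(String name) {
        return loaded.contains(name);
    }

    int recordedCount(String source) {
        return recorded.getOrDefault(source, Collections.emptySet()).size();
    }

    /** Called while verifying code that passes a source where a target is expected. */
    void checkDuringVerification(String source, String target) {
        if (loaded.contains(source) && loaded.contains(target)) {
            requireSubclass(source, target); // Case 1: verify on the spot
        } else {
            // Case 2: record the relationship instead of loading anything
            recorded.computeIfAbsent(source, k -> new LinkedHashSet<>()).add(target);
        }
    }

    /** Called when execution first needs the source class. */
    void loadOnFirstUse(String source) {
        // In this model, loading a class also loads its superclasses.
        for (String c = source; c != null; c = superclassOf.get(c)) {
            loaded.add(c);
        }
        Set<String> targets = recorded.remove(source);
        if (targets == null) {
            return; // nothing was deferred for this class
        }
        for (String target : targets) {
            requireSubclass(source, target);
        }
    }

    private void requireSubclass(String source, String target) {
        for (String c = source; c != null; c = superclassOf.get(c)) {
            if (c.equals(target)) {
                return;
            }
        }
        throw new VerifyError(source + " is not assignable to " + target);
    }

    public static void main(String[] args) {
        Map<String, String> hierarchy = new HashMap<>();
        hierarchy.put("B", "C"); // B extends C
        DeferredRelationshipSketch verifier = new DeferredRelationshipSketch(hierarchy);

        verifier.checkDuringVerification("B", "C"); // startup: recorded, nothing loaded
        System.out.println("B loaded at startup: " + verifier.isLoaded("B"));
        verifier.loadOnFirstUse("B");               // runtime: load, then validate
        System.out.println("B loaded after first use: " + verifier.isLoaded("B"));
    }
}
```

If loadOnFirstUse is never called for a class, its recorded relationships are simply never checked, which models the case where an unused class costs nothing.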

If a source class and its target class are already loaded when encountered during startup (Case 1), verification progresses as follows:

Case 1: Both Current and Optimized Verification Processes

  • During startup, class compatibility is verified between already loaded classes.
  • At runtime, the requested class is initialized and made available.

Current and Optimized Verification

The following diagrams portray the class verification process for Case 2:

Case 2: Current Verification Process

  • During startup, each class encountered in the bytecode is loaded (if not already loaded) and verified.
  • At runtime, classes that are requested are initialized and made available.

Current Verification Process

Case 2: Optimized Verification Process using -XX:+ClassRelationshipVerifier

  • During startup, if the source class and/or target class is not already loaded, then the relationship between source and target is recorded, but neither class loading nor verification occurs.
  • At runtime, requested classes are loaded (if not already loaded), verified (which includes validating class relationships), initialized and made available.

Optimized Verification Process

Essentially, the time impact of loading and verifying a class is deferred to runtime if the class ends up being used; however, we avoid the impact altogether if the class is never used. In the diagram above, if a class is never used, then none of the steps in the runtime section occur.

If you’re curious about implementation details, you can check out the eclipse/openj9 pull request for -XX:[+|-]ClassRelationshipVerifier.


Performance

What impact does this feature have on JVM performance? Where are the numbers?! Here they are:

Sample Application

PetClinic (v2.1.0), a Spring Boot application that simulates a veterinary clinic, was used to observe JVM startup performance. More details about the application can be found here.

PetClinic was run with three different configurations:

  • None: no verification, using the command-line option -Xverify:none.
  • Default: default verification.
  • CRV: verification with Class Relationship Verification (CRV) enabled, by specifying the command-line option -XX:+ClassRelationshipVerifier.

Measuring Startup

Startup measurements were taken from the output of PetClinic:

Started PetClinicApplication in X seconds (JVM running for Y)

The startup time reported in the following tables is time Y, converted to milliseconds.
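For anyone wanting to repeat the measurement, extracting Y from that log line could look like the sketch below. The class and method names are invented for this example, and the sample timings in main are made up, not measured results.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StartupLineParser {
    // Matches "Started <AppName> in X seconds (JVM running for Y)".
    private static final Pattern STARTED = Pattern.compile(
            "Started \\S+ in (\\d+(?:\\.\\d+)?) seconds \\(JVM running for (\\d+(?:\\.\\d+)?)\\)");

    /** Returns time Y in milliseconds, or empty if the line is not a startup line. */
    static OptionalLong jvmRunningMillis(String logLine) {
        Matcher m = STARTED.matcher(logLine);
        if (!m.find()) {
            return OptionalLong.empty();
        }
        double seconds = Double.parseDouble(m.group(2));
        return OptionalLong.of(Math.round(seconds * 1000.0));
    }

    public static void main(String[] args) {
        // Hypothetical log line; the numbers are illustrative only.
        String line = "Started PetClinicApplication in 4.321 seconds (JVM running for 5.012)";
        System.out.println(jvmRunningMillis(line).getAsLong() + " ms");
    }
}
```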

Note that the command-line option -Xshareclasses:none was used in each run, which disables class data sharing. This feature was disabled in order to more clearly observe the impact of using Class Relationship Verification.

Number of Classes Loaded

The command-line option -Xdump:java:events=vmstop was used to generate a javacore file for each run, which indicates the number of classes loaded.
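Putting the options mentioned so far together, the three configurations might be launched with command lines like the ones this sketch assembles. The exact commands used for the measurements are not given in this post, so the ordering of options and the jar path (petclinic.jar) are assumptions; only the option strings themselves come from the text above.

```java
import java.util.ArrayList;
import java.util.List;

class PetClinicRunConfigs {
    enum Config { NONE, DEFAULT, CRV }

    /** Builds a java command line for one configuration; jarPath is supplied by the caller. */
    static List<String> commandFor(Config config, String jarPath) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-Xshareclasses:none");        // disable class data sharing
        cmd.add("-Xdump:java:events=vmstop");  // javacore at VM stop
        switch (config) {
            case NONE:
                cmd.add("-Xverify:none");
                break;
            case CRV:
                cmd.add("-XX:+ClassRelationshipVerifier");
                break;
            case DEFAULT:
                break; // default verification needs no extra option
        }
        cmd.add("-jar");
        cmd.add(jarPath);
        return cmd;
    }

    public static void main(String[] args) {
        for (Config config : Config.values()) {
            // "petclinic.jar" is a placeholder path.
            System.out.println(config + ": " + String.join(" ", commandFor(config, "petclinic.jar")));
        }
    }
}
```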

Results

The following values were recorded for JVM startup when running PetClinic:

PetClinic JVM Startup Typical Results

Comparing against the measured startup with verification disabled gives us a rough idea of what verification overhead looks like during startup:

PetClinic JVM Startup Results Compared with Verification Disabled

There is a clear improvement in JVM startup performance with the use of -XX:+ClassRelationshipVerifier. Loading 997 fewer classes than default verification translates directly into reduced startup time, as seen below:

PetClinic JVM Startup Results with CRV


Usage

This feature was first made available in the September 17th, 2019 Nightly Builds for OpenJDK 8, 11 and 13 and was shipped with the v0.17.0 release for OpenJDK 8, 11 and 13 on October 18th, 2019: Option to record class relationships in the verifier. Head to AdoptOpenJDK to download the latest release!

The Class Relationship Verifier feature is enabled using the java command-line option -XX:+ClassRelationshipVerifier and is an opt-in feature, meaning it is not enabled by default.
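Because the feature is opt-in, it can be handy to confirm from inside an application that the option was actually passed. A minimal sketch using the standard RuntimeMXBean API follows. It only reports what was requested on the command line, not whether the JVM honoured it, and it assumes that the right-most occurrence of the option takes effect; the class and method names are made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

class CrvFlagCheck {
    static final String ENABLE = "-XX:+ClassRelationshipVerifier";
    static final String DISABLE = "-XX:-ClassRelationshipVerifier";

    /** Assumption: when the option appears several times, the last occurrence wins. */
    static boolean requested(List<String> jvmArgs) {
        boolean on = false; // the feature is off unless explicitly enabled
        for (String arg : jvmArgs) {
            if (arg.equals(ENABLE)) {
                on = true;
            } else if (arg.equals(DISABLE)) {
                on = false;
            }
        }
        return on;
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself, not to the application's main method.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("ClassRelationshipVerifier requested: " + requested(jvmArgs));
    }
}
```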

Additional v0.17.0 release information and links to binaries can be found here: Eclipse OpenJ9 v0.17.0.

Limitations

-XX:[+|-]ClassRelationshipVerifier is incompatible with -Xfuture, since -Xfuture enables strict class file format checks.

Note that -Xverify:all also enables -Xfuture, so it cannot be used with this feature either.

Additionally, class relationships are not recorded for class files with major version less than 51 (Java 7). Stack Maps were introduced in Java 6 (class file version 50) as part of bytecode verification, but were optional until Java 7. Prior to the introduction of Stack Maps, class files were verified by type inference.

The StackMapTable attribute of a class file is used for type checking during verification. Each table holds a number of stack map frame entries, with each frame specifying the types that correspond to a particular bytecode offset. Because the feature relies on these stack map frames, the Class Relationship Verifier is only applicable to class files of version 51 and later.
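To check which side of that line a given class falls on, the major version can be read straight from the class file header: a 4-byte magic number (0xCAFEBABE), then a 2-byte minor version, then a 2-byte major version. The sketch below does this for its own class file; the class and method names are invented for illustration, and the threshold of 51 is the one stated above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

class ClassFileVersion {
    /** Reads major_version from the first 8 bytes of a class file. */
    static int majorVersion(byte[] classBytes) {
        // ByteBuffer is big-endian by default, matching the class file format.
        ByteBuffer buf = ByteBuffer.wrap(classBytes);
        if (buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        buf.getShort();                 // minor_version (unused here)
        return buf.getShort() & 0xFFFF; // major_version as an unsigned value
    }

    /** Relationships are only recorded for class file version 51 (Java 7) and later. */
    static boolean eligibleForRelationshipRecording(int majorVersion) {
        return majorVersion >= 51;
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = ClassFileVersion.class.getResourceAsStream("ClassFileVersion.class")) {
            if (in == null) {
                throw new IOException("class file not found on the class path");
            }
            int major = majorVersion(in.readAllBytes());
            System.out.println("major version: " + major
                    + ", eligible: " + eligibleForRelationshipRecording(major));
        }
    }
}
```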


Conclusion

OpenJ9 has always put a strong emphasis on JVM performance, continuously building new features and iterating on existing ones to provide fast startup. -XX:[+|-]ClassRelationshipVerifier is a recent addition to that story, leveraging the art of laziness to enhance startup.

In many situations, being proactive is an effective way to stay on top of things. However, in our case, loading and verifying all these classes upfront isn’t the most efficient way to go about verification. Sometimes, being lazy is the way to go.

If you’re interested in other ways OpenJ9 seeks to improve JVM performance, check out our other blog posts on: