Tuning the JVM for Performance

Overview

ColdFusion is a versatile, feature-rich platform for building web and mobile applications. Since version 6, ColdFusion has been powered by Java. This has opened up a wealth of resources for performance tuning and scaling ColdFusion to meet enterprise-grade demand. In this guide, we will introduce some foundational concepts of the Java Virtual Machine and how we can leverage its built-in capabilities to resolve bottlenecks in your ColdFusion application.

It is important to understand that performance tuning is a process, not a destination. Tuning your application for speed should be part of your broader maintenance strategy. We recommend including performance tests in your regular release pipeline. Small changes in your code can alter your application's memory footprint and affect JVM performance.

Java Virtual Machine: The Beating Heart of ColdFusion

ColdFusion has been a Java-based server since version 6 (also known as MX). As with most Java applications, at its core is the Java Virtual Machine (or JVM). This is where your tuning effort will start. Just as a Formula One car cannot win races with a misfiring engine, if the JVM is not tuned adequately, nothing else in your application will work optimally.

Sun Microsystems created Java as a high-level programming language that should run on as many operating systems and hardware variations as possible. The goal of “write once, run anywhere” required a predictable, encapsulated runtime environment. The JVM is a multi-threaded environment abstracting the actual computer operating system from the application code. Adobe previously deployed ColdFusion on its own Java application server, JRun, but switched to Tomcat in ColdFusion 10.

ColdFusion Stack Diagram

The default configuration of the JVM that ships with ColdFusion is intended to satisfy the broadest number of requirements. This is a good thing, as it means your application will work right out of the box. However, if you have not tuned your JVM to the needs of your particular application and infrastructure, there is a high likelihood you are running a sub-optimal configuration. Fortunately, the JVM ships with everything you need to begin inspecting the performance of your application.

First, it is important to understand some basic principles of how the JVM operates. The JVM segments its allocated memory into Generations.

ColdFusion JVM

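Memory is subdivided into three Generations: Young, Old, and Permanent. In very general terms, think of the Young and Old Generations as containing your application code, and the Permanent Generation as containing the classes needed to run the ColdFusion server. This is a slight simplification for the purposes of this guide. The actual distribution of memory is much more complicated. In particular, on Java 8 and later the Permanent Generation has been replaced by Metaspace, which holds class metadata in native memory and is sized with options such as -XX:MaxMetaspaceSize.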

When code is executed in ColdFusion, new objects are created in the Eden Space of the Young Generation. There are two other spaces in the Young Generation: S0 and S1 (the Survivor Spaces). During the execution of requests, objects that survive a collection are copied from Eden into a Survivor Space and then back and forth between the two Survivor Spaces. This work is performed by a process known as the Garbage Collector (GC). Objects which are no longer referenced by any other objects are considered unused, and the GC releases the memory they occupy. An object that is still referenced survives each collection of the Young Generation. Once it has survived a set number of collections (the tenuring threshold, set with -XX:MaxTenuringThreshold and capped at 15 in current HotSpot JVMs), or sooner if the Survivor Spaces fill up, it is moved to the Old Generation.
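The spaces described above can be observed from inside a running JVM. The following is a minimal sketch using the standard java.lang.management API; the class name MemoryPools is our own, and the pool names it prints depend on the collector and Java version. With the Parallel collector on Java 8 you would typically see names such as "PS Eden Space", "PS Survivor Space", "PS Old Gen", and "Metaspace".

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class (not part of ColdFusion) that lists JVM memory pools.
class MemoryPools {

    // Returns one descriptive line per memory pool reported by the running JVM.
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            long usedKb = usage.getUsed() / 1024;
            lines.add(pool.getName() + " (" + pool.getType() + "): " + usedKb + " KB used");
        }
        return lines;
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
    }
}
```

Note that the code reports on whichever JVM executes it. Run standalone, it describes its own small JVM rather than the ColdFusion server.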

Why would the objects in the Young Generation not get collected? Usually, because there are persistent references to them in the application code. Shared scope variables (session, application, server) are an example of objects that retain references for an extended period of time and wind up in the Old Generation. The more shared scope variables your application employs, the larger your Old Generation will grow. Objects that never release their references will become stuck in the Old Generation. This can cause memory usage to slowly grow over time and is commonly referred to as a memory leak.

There be Dragons Here

A final word of caution before you proceed. Settings discussed in the following sections alter how the underlying JVM operates. It is possible to put the JVM into an unresponsive state. Please make sure you have current backups of your system before making any changes.

As an Adobe Solution Partner, Convective has performed hundreds of Performance, Analysis, and Tuning engagements to improve server responsiveness. On-site or remote, our dedicated team of tuning professionals can get your servers up to speed fast.


Configuring the JVM

Changes to the configuration of the JVM are managed in the jvm.config file. This file can be found in {cf_root}/runtime/bin or {jrun_root}/bin depending on how you installed ColdFusion.

Make a backup of the jvm.config file before you proceed any further.

Open the jvm.config file in the text editor of your choice and find the line that begins with java.args. This entry controls many of the configuration options for the JVM.

The default java.args configuration looks like this, although yours may vary slightly.

java.args=-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5005 -Xms256m -Xmx1024m -XX:MaxMetaspaceSize=192m -XX:+UseParallelGC -Xbatch -Dcoldfusion.home={application.home} -Djava.awt.headless=true -Duser.language=en -Dcoldfusion.rootDir={application.home} -Djava.security.policy={application.home}/lib/coldfusion.policy -Djava.security.auth.policy={application.home}/lib/neo_jaas.policy -Dcoldfusion.classPath={application.home}/lib/updates,{application.home}/lib,{application.home}/lib/axis2,{application.home}/gateway/lib/,{application.home}/wwwroot/WEB-INF/cfform/jars,{application.home}/wwwroot/WEB-INF/flex/jars,{application.home}/lib/oosdk/lib,{application.home}/lib/oosdk/classes -Dcoldfusion.libPath={application.home}/lib -Dorg.apache.coyote.USE_CUSTOM_STATUS_MSG_IN_HEADER=true -Dcoldfusion.jsafe.defaultalgo=FIPS186Random -Dorg.eclipse.jetty.util.log.class=org.eclipse.jetty.util.log.JavaUtilLog -Djava.util.logging.config.file={application.home}/lib/logging.properties

Note that it is one long line, with individual options separated by a space and prefixed with a dash.
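Because java.args is a single line of space-separated options, it is easy to inspect programmatically, for example to confirm the heap bounds before and after a change. The sketch below is illustrative only: the class JavaArgs and its methods are hypothetical names, and the simple whitespace split assumes that no option value (such as a file path) contains a space.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for inspecting a java.args entry from jvm.config.
class JavaArgs {

    // Splits a java.args line into individual options.
    // Assumption: option values contain no spaces.
    static List<String> options(String line) {
        String value = line.trim();
        if (value.startsWith("java.args=")) {
            value = value.substring("java.args=".length());
        }
        List<String> result = new ArrayList<>();
        for (String token : value.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                result.add(token);
            }
        }
        return result;
    }

    // Returns the text following the given prefix (e.g. "1024m" for "-Xmx"),
    // or null if no option starts with that prefix.
    static String valueOf(List<String> options, String prefix) {
        for (String option : options) {
            if (option.startsWith(prefix)) {
                return option.substring(prefix.length());
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Shortened sample line; a real entry carries many more options.
        String line = "java.args=-Xms256m -Xmx1024m -XX:MaxMetaspaceSize=192m -XX:+UseParallelGC";
        List<String> opts = options(line);
        System.out.println("Initial heap: " + valueOf(opts, "-Xms"));
        System.out.println("Maximum heap: " + valueOf(opts, "-Xmx"));
        System.out.println("Parallel GC:  " + opts.contains("-XX:+UseParallelGC"));
    }
}
```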

It is critical that you performance test your application under simulated loads both before and after you introduce a change to your JVM. These tests should be conducted in a controlled, non-production environment. Otherwise, you will have no means of determining if response times have actually improved. Iterate through your changes with additional performance testing to arrive at optimal settings for your server.

Tip: Make sure you do not introduce any line breaks into java.args. This can cause ColdFusion to fail to start.

Tuning the Garbage Collector

Often considered a “low-hanging fruit” for performance tuning, the GC is a good first target for your tuning effort. The JVM has four different GC options. Only one can be used at a time. The options are:

  • Serial Garbage Collector
  • Parallel Garbage Collector
  • Concurrent Mark Sweep (CMS) Garbage Collector
  • G1 Garbage Collector

The default GC for ColdFusion is the Parallel Collector. This collector prioritizes throughput: it uses multiple threads to perform minor collections in parallel, which shortens the time spent in garbage collection on multiprocessor hardware. It is usually the right GC for applications with medium to large datasets running on multiprocessor or multithreaded hardware. However, the Parallel Collector stops all application threads while it collects, so your system may become unresponsive (pause) for brief intervals during a garbage collection event. These pauses can sometimes last 1 second or longer. Once garbage collection is complete, the system resumes processing requests.

Two collectors worth considering are the so-called “mostly concurrent collectors”: CMS and G1. These collectors run concurrently with your application to minimize pauses at the cost of introducing a slight overhead to your applications. These may be a good option for applications where response time is more important than overall throughput. To switch to one of these GCs, update java.args by removing the Parallel Collector (-XX:+UseParallelGC) and adding one of the mostly concurrent collectors (-XX:+UseConcMarkSweepGC to enable the CMS collector or -XX:+UseG1GC for G1). Note that CMS was deprecated in Java 9 and removed in Java 14, so it is only an option on older JVMs.

The rest of this tuning guide will assume the system is running Parallel Garbage Collection. This can be verified by checking that -XX:+UseParallelGC is specified in java.args.
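Besides reading jvm.config, the active collector can be confirmed from inside the JVM itself. The following sketch uses the standard GarbageCollectorMXBean and RuntimeMXBean interfaces; the class name ActiveCollector is our own. With the Parallel collector on a Java 8 era JVM, the reported collector names are typically "PS Scavenge" (Young Generation) and "PS MarkSweep" (Old Generation), but treat the exact names as version-dependent.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that reports which garbage collectors the JVM is using.
class ActiveCollector {

    // Names of the collectors registered in the running JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    // True if the JVM was started with exactly this argument.
    static boolean hasFlag(String flag) {
        return ManagementFactory.getRuntimeMXBean().getInputArguments().contains(flag);
    }

    public static void main(String[] args) {
        System.out.println("Collectors: " + collectorNames());
        System.out.println("-XX:+UseParallelGC passed: " + hasFlag("-XX:+UseParallelGC"));
    }
}
```

As with any management-bean code, this describes the JVM that runs it, so to check the ColdFusion server it would need to execute inside that server's JVM.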

The main goal of tuning the JVM is to keep at least 30% of the heap free while keeping full garbage collections to a minimum. The best way to achieve this is to collect objects while they are still in the Young Generation.

The minimum and maximum memory reservations are set with two arguments in your jvm.config. -Xms is the lower bound and -Xmx is the upper. Follow the 30/60 rule:

  • If less than 30% of your heap is free (unoccupied), the JVM will trigger more frequent GC cycles, which can reduce performance.
  • If more than 60% of your heap is free, the JVM will trigger fewer GC cycles. However, these infrequent cycles will be longer than necessary, causing longer pause times, which can also reduce performance.
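The 30/60 rule can be expressed as a small check against live heap figures. This is a sketch under stated assumptions: the class FreeHeapCheck and its messages are hypothetical, and free heap is measured against the maximum heap (falling back to the committed size if the JVM reports no maximum).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper applying the 30/60 rule to heap usage figures.
class FreeHeapCheck {

    // Percentage of the heap that is unoccupied.
    static double freePercent(long usedBytes, long maxBytes) {
        return 100.0 * (maxBytes - usedBytes) / maxBytes;
    }

    // Maps a free-heap percentage onto the 30/60 rule.
    static String classify(double freePercent) {
        if (freePercent < 30.0) {
            return "below 30% free: expect more frequent GC cycles";
        }
        if (freePercent > 60.0) {
            return "above 60% free: expect fewer but longer GC cycles";
        }
        return "within the 30/60 band";
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        // getMax() returns -1 when no maximum is defined.
        long max = heap.getMax() > 0 ? heap.getMax() : heap.getCommitted();
        double free = freePercent(heap.getUsed(), max);
        System.out.printf("Heap free: %.1f%% (%s)%n", free, classify(free));
    }
}
```

A single reading says little on its own, since usage rises and falls between collections. Sampling under load and looking at the peaks gives a more useful picture.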

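We recommend setting your max heap size (-Xmx) to at least 43% more than the maximum occupancy (total required heap size) of your application. The 43% figure follows from the 30% goal: if occupancy is to fill no more than 70% of the heap, the heap must be at least 1 / 0.7, or roughly 1.43, times the occupancy. For an application with a max occupancy of 120 MB, a maximum heap size of 172 MB (-Xmx172m) would be adequate, based on the following calculation:

120 + (120 * 0.43) = 171.6, rounded up to 172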
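The same calculation can be sketched as a small helper. The class and method names here are our own, and integer arithmetic is used so that the result always rounds up to a whole megabyte.

```java
// Hypothetical helper that sizes -Xmx from an observed maximum occupancy.
class HeapSizing {

    // Adds 43% headroom to the occupancy and rounds up to a whole MB.
    static long recommendedMaxHeapMb(long maxOccupancyMb) {
        return (maxOccupancyMb * 143 + 99) / 100;
    }

    // Percentage of the heap left free at the given occupancy.
    static double freePercent(long occupancyMb, long heapMb) {
        return 100.0 * (heapMb - occupancyMb) / heapMb;
    }

    public static void main(String[] args) {
        long occupancy = 120; // MB, observed under load testing
        long xmx = recommendedMaxHeapMb(occupancy);
        System.out.println("-Xmx" + xmx + "m");
        System.out.printf("Free at peak occupancy: %.1f%%%n", freePercent(occupancy, xmx));
    }
}
```

For the 120 MB example this yields -Xmx172m, which leaves a little over 30% of the heap free at peak occupancy.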

The maximum occupancy of your application can be determined by observation under controlled load testing. As load is applied to the application, observe the heap size using a tool such as FusionReactor.

Tip: See here for more information on viewing JVM metrics.

It is also important to note that as heap usage approaches 70% of the available heap, more and more full GC cycles will run, which greatly reduces performance because request processing is paused each time a full GC runs. Seeing your heap usage at 70% or more is a good indication that a performance problem is right around the corner, and that the JVM settings need to be changed to allocate more RAM from the host server.

Step 1: Enable GC Logging

In order to inspect the GC, we need to enable GC logging.

Append the following options to java.args:
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=3 -XX:GCLogFileSize=10240k -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -verbose:gc -Xloggc:convectiveGC.log
Next, restart the server. Verify that a new log file convectiveGC.log has been created in {cf_root}/runtime/bin or {jrun_root}/bin (the location will depend on your installation).

Step 2: Analyze The Resulting Log

We recommend the GCViewer tool for GC log analysis. After installing the tool, open the convectiveGC.log file within GCViewer. This will present graphical and textual statistics on the log file for inspection. The view below depicts log data collected while running the CMS GC.

ColdFusion JVM Tuning

Recall our goal from above is a free heap of no less than 30% while keeping full garbage collections to a minimum.

We see from this output that there were no Full Garbage Collections (good), but overall the JVM heap is at 73.3% used and the Old (Tenured) Generation is at 68.2% used. That leaves only 26.7% of the heap free, short of our 30% goal. Based on this output, we would advise adding a bit more system RAM so we can increase the JVM heap to meet our goal.

Conclusion

As you begin to tune your ColdFusion application, consider the JVM as a focal point. It is the often invisible engine that drives your applications. It will quietly chug along even when choked for resources. Tuning this engine can yield great dividends.

This, of course, is only the beginning. ColdFusion has many native settings that can be tuned, in addition to the operating system, network, and your code itself. Performance tuning is a process that must be woven into your development and maintenance regimen.

Further reading: Performance Tuning with Java Virtual Machine Parameters