G1GC Performance Tuning Parameters

In this post, I will list some common observations from G1GC logs, and the parameters you can try to tune the performance.

To print GC logs, please refer to my blog post on how to print GC logs.

Threads Related

In JDK 9, with -Xlog:gc*=info or -Xlog:gc+cpu=info, you get log entries like:

[12.420s][info][gc,cpu ] GC(0) User=0.14s Sys=0.03s Real=0.02s

This gives you some indication of the CPU utilization during the GC pause. For example, this entry shows that for this pause the total user CPU time is 0.14s, the system time is 0.03s, and the wall-clock time is 0.02s. The ratio User/Real can be used as an estimate of the number of GC threads you need. In this case User/Real is 0.14/0.02 = 7, so about 7 GC threads should be enough. If you see long termination times and a User/Real ratio lower than the number of GC threads, you can try reducing the number of GC threads. Here is a list of thread-related parameters:

Name	Parameter	Description	Default
ParallelGCThreads	-XX:ParallelGCThreads	Threads for parallel work during a GC pause	# of CPUs if 8 or fewer, otherwise 8 + (# of CPUs - 8) * 5/8
Parallel Marking Threads	-XX:ConcGCThreads	Threads for concurrent marking	Max((ParallelGCThreads+2)/4, 1)
G1 Main Concurrent Mark Thread		Controller of marking work	1
G1 Concurrent Refinement Threads	-XX:G1ConcRefinementThreads	Update the remembered set	Max: ParallelGCThreads+1
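
The User/Real estimate and the default thread formulas from the table can be sketched in Java as follows. This is only a minimal sketch: the class and method names are made up for illustration, the regular expression assumes the gc+cpu log format shown above, and the formulas are the ones listed in the table, not values read from the JVM.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcThreadEstimate {
    // Matches the User/Sys/Real fields of a gc+cpu log entry (assumed format).
    private static final Pattern CPU = Pattern.compile(
            "User=([0-9.]+)s Sys=([0-9.]+)s Real=([0-9.]+)s");

    /** Returns User/Real for one log line, or NaN if the line does not match or Real is 0. */
    public static double userToReal(String line) {
        Matcher m = CPU.matcher(line);
        if (!m.find()) {
            return Double.NaN;
        }
        double user = Double.parseDouble(m.group(1));
        double real = Double.parseDouble(m.group(3));
        // Very short pauses are often logged as Real=0.00s; no ratio can be computed then.
        return real > 0 ? user / real : Double.NaN;
    }

    /** Default ParallelGCThreads per the table: all CPUs up to 8, then 5/8 of the rest. */
    public static int defaultParallelGcThreads(int cpus) {
        return cpus <= 8 ? cpus : 8 + (cpus - 8) * 5 / 8;
    }

    /** Default ConcGCThreads per the table: max((ParallelGCThreads + 2) / 4, 1). */
    public static int defaultConcGcThreads(int parallelGcThreads) {
        return Math.max((parallelGcThreads + 2) / 4, 1);
    }

    public static void main(String[] args) {
        String line = "[12.420s][info][gc,cpu ] GC(0) User=0.14s Sys=0.03s Real=0.02s";
        System.out.println("User/Real = " + Math.round(userToReal(line)));

        int cpus = Runtime.getRuntime().availableProcessors();
        int parallel = defaultParallelGcThreads(cpus);
        System.out.println("cpus=" + cpus
                + " ParallelGCThreads=" + parallel
                + " ConcGCThreads=" + defaultConcGcThreads(parallel));
    }
}
```

A single pause is a noisy sample, so in practice you would probably want to average the ratio over many pauses before changing -XX:ParallelGCThreads.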

Remembered Set Related:

If you see long remembered set related phases in the pause (Update RS/Scan RS), you can try to push more work to the concurrent phase by increasing G1ConcRefinementThreads, or by manually adjusting the zones to wake up more concurrent refinement threads. You can also examine the remembered set memory usage: if there is a lot of coarsening, you can try to add entries to the fine table.

-XX:G1RSetUpdatingPauseTimePercent=10

This determines the percentage of total garbage collection time G1 should spend in the Update RS phase updating any remaining remembered sets. G1 controls the amount of concurrent remembered set updates using this setting.

-XX:G1SummarizeRSetStatsPeriod=0

This is the period, in number of GCs, at which G1 generates remembered set summary reports. Set it to zero to disable them. Generating these reports is a costly operation, so use it only if necessary, and with a reasonably high value. Nothing is printed unless gc+remset=trace logging is enabled (-Xlog:gc+remset=trace).

`-XX:-G1UseAdaptiveConcRefinement` `-XX:G1ConcRefinementGreenZone=<n>` `-XX:G1ConcRefinementYellowZone=<n>` `-XX:G1ConcRefinementRedZone=<n>` `-XX:G1ConcRefinementThreads=<n>`	Disable the ergonomics for concurrent refinement threads and set the zones and thread count manually. Use with caution.
`-XX:G1RSetRegionEntries`	The number of entries in the remembered set fine table. If the remembered set statistics show a lot of coarsening, try increasing this number. Note that the memory footprint of the remembered set could increase.
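
Before changing any of these, it can help to see which values the running JVM actually ended up with. The sketch below uses the HotSpotDiagnosticMXBean from the JDK's com.sun.management package to print a few of the flags discussed so far. It assumes a HotSpot JVM; the class name and the list of flags are only illustrative, and a flag that does not exist in a given JDK build is reported instead of failing.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

public class PrintG1Flags {
    /** Returns "name=value (origin)", or a note if this JVM does not know the flag. */
    public static String describe(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            VMOption opt = bean.getVMOption(name);
            // The origin tells whether the value is a default, set by ergonomics,
            // or given on the command line.
            return name + "=" + opt.getValue() + " (" + opt.getOrigin() + ")";
        } catch (IllegalArgumentException e) {
            // getVMOption throws this when the flag does not exist in this JVM.
            return name + " is not available on this JVM";
        }
    }

    public static void main(String[] args) {
        String[] flags = {
            "ParallelGCThreads",
            "ConcGCThreads",
            "G1ConcRefinementThreads",
            "G1RSetUpdatingPauseTimePercent",
            "G1RSetRegionEntries",
        };
        for (String flag : flags) {
            System.out.println(describe(flag));
        }
    }
}
```

Run it with the same GC options as the application under test, otherwise the ergonomic values it prints may differ from the ones the application gets.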

Other Ergonomics Related:

Parameters related to gc pause, young gen size, humongous objects, dynamic ihop, reference processing, mixed gc...

Parameter	Description	Default
-XX:MaxGCPauseMillis	Target maximum GC pause time, in milliseconds	200
-XX:G1NewSizePercent	Minimum young gen size as a percentage of heap size	5
-XX:G1MaxNewSizePercent	Maximum young gen size as a percentage of heap size	60
-XX:G1HeapRegionSize	G1 heap region size. This determines the humongous object size	A power of 2 between 1m and 32m. The goal is to have about 2048 regions.
-XX:G1HeapWastePercent	Percentage of the heap that G1 is willing to leave uncollected	5
-XX:G1ReservePercent	Percentage of the heap G1 tries to keep free to reduce the risk of to-space exhaustion	10
-XX:G1MixedGCCountTarget	After each marking cycle, G1 tries to collect the candidate old regions incrementally over this number of mixed GCs	8
-XX:G1MixedGCLiveThresholdPercent	Old regions whose live object occupancy is lower than this are considered candidates for mixed GC	85
-XX:G1UseAdaptiveIHOP	Enable adjusting IHOP dynamically	TRUE
-XX:InitiatingHeapOccupancyPercent	Heap occupancy percentage that starts a marking cycle. With adaptive IHOP this is the initial value	45
-XX:ParallelRefProcEnabled	Enable parallel reference processing	FALSE
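
To make the G1HeapRegionSize default more concrete, here is a rough Java sketch of how a region size could be derived from the heap size. It is a simplification and not the actual HotSpot code: the class and method names are hypothetical, the target of 2048 regions is an assumption based on my understanding of the HotSpot heuristic, and the JVM may base the calculation on a different heap size (for example a mix of initial and maximum heap). The humongous threshold of half a region is also the commonly described rule, stated here as an assumption.

```java
public class G1RegionSize {
    public static final long MB = 1024L * 1024L;
    // Assumed target number of regions; not taken from a running JVM.
    public static final long TARGET_REGIONS = 2048;

    /** Estimated region size: heap / target, rounded down to a power of 2, kept within 1m..32m. */
    public static long regionSizeBytes(long heapBytes) {
        long size = Math.max(heapBytes / TARGET_REGIONS, MB);
        size = Long.highestOneBit(size); // round down to a power of 2
        return Math.min(size, 32 * MB);
    }

    /** Objects larger than this are treated as humongous (assumed: half a region). */
    public static long humongousThresholdBytes(long regionSizeBytes) {
        return regionSizeBytes / 2;
    }

    public static void main(String[] args) {
        long[] heapsInGb = {1, 4, 6, 128};
        for (long gb : heapsInGb) {
            long region = regionSizeBytes(gb * 1024 * MB);
            System.out.println(gb + "g heap: region=" + region / MB + "m"
                    + " humongous threshold=" + humongousThresholdBytes(region) / 1024 + "k");
        }
    }
}
```

If the logs show many humongous allocations just above the threshold, setting -XX:G1HeapRegionSize to the next power of 2 is one thing to try, at the cost of having fewer regions.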
