This post has 2659 words. Reading it will take approximately 13 minutes.
I’m working on some Java code I’ve inherited. I’m doing some speed improvements in critical places, and to test my improvements, I need test cases to compare different implementations. Unfortunately, manually generating test cases is way too time consuming (need need thousands and generating one manually takes minutes or even hours). Unfortunately, randomly generated test cases don’t work either, as my test cases are (propositional) LTL formulae, and a randomly generated one of the size I expect to occur from realistic use of the program, is very unlikely to be satisfiable, which real-life formulae are expected to be, in part helped by continuous integration and test by the tool.
To fix this, I devised a simple plan: generate test cases and sort them based on whether they are satisfiable or not, and add a third category for formulae that are not easily categorizable (i.e., I run out of memory when doing the test using a simple but fast and efficient implementation). This all results in me generating tens and hundred of thousands of formulae, test whether they are satisfiable, collect a few of them and discard the rest.
This should all be very simple, but I noticed weird behavior of my program when running: after executing, it was slowing down and the easily categorizable formulae became more frequent. In one example, I had no satisfiable formulae, 8000 unsatisfiable formulae and 1 or 2 failures. Letting the computation run, I had 0 satisfiable formulae, 9000 unsatisfiable formulae, and 20 failures. Then the program ran out of memory. This behavior should not be possible: I catch the OutOfMemoryException during generation, and all memory should be released afterwards, so I should never be able to run out of memory. Secondly, the formulae are generated completely randomly, so having 2 failures out of 8000 runs but 18 out of 1000 runs does not seem probable. All in all, this pointed towards something accumulating in memory.
Normally, you can just look at your code and pin-point where a memory leak occurs. In garbage collected languages, like Java, you typically encounter memory leaks in two ways: the garbage collector cannot see that some memory is no longer used or you forgot to remove some reference to an object you’ll not use anymore. The former is not very likely, as Java uses quite sophisticated techniques, including a mark and sweep algorithm traversing the entire heap (though this does not run very frequently as it is computationally expensive). It does not really matter which of the two is really the problem, as the fix is the same: make sure to set references you no longer use to null, so the garbage collector can figure out they are no longer used.
You may be interested in taking a look at the java.lang.ref package. which contains classes that let you interact with the garbage collector, in particular WeakReference, which references an object, but does not force it to stay in memory, so if there are only weak references to an object, it can be garbage collected, and SoftReference, which references objects that should be kept in memory but can be garbage collected if the program is running low on memory. Weak references are useful for implementing factories that ensure that if
a == b. This is an advantage if you do a lot of comparisons of complex objects, and can be implemented by adding to the factory a set of produced objects and making sure you always look up objects in that set and return the same instance if you are asked to create an equal one. Unfortunately, your set will keep the object alive unless you use a weak reference, causing a memory leak. Soft references are useful for creating in-memory caches of complex computations or vales obtained from slower storage. The value can easily be recomputed/reread, but not doing this is faster. Therefore the cached value should be kept in memory if possible, but if the application needs memory for other uses, it is more useful to reclaim the cache than swapping important values to disk.
Monitoring Your Program Using JConsole
The first thing to do is to figure out if the program really seems to be leaking memory. We can do that using JConsole, which ships with the JDK form Sun (or Oracle, but I don’t like them as they put the Java API on a slow server). We simply start up JConsole (on OS X or Unix, just type
jconsole at the prompt, on Windows I guess find it in the start menu or whatever). First we pick the process to monitor:
When JConsole starts, up, we switch to the Memory tab and monitor the memory. As we are interested in the long-time behavior, we select to only show the oldest generation of the heap objects. The younger generations fluctuate wildly, as I allocate and destroy objects like crazy, and we are only interested in whether anything accumulates, which means it will eventually be promoted to old status. For good measure, we can hit the Perform GC button once in a while to clean up memory. This should give a slight decrease in the memory used, but you should still get a general trend like this:
This is the memory profile of an unhealthy program: the size just increases. The 3 or 4 bumps result from me pressing the Perform GC button. When memory consumptions reaches 2 GB, it flattens (as I’ve forced Java to use at most 2 GB memory). When the memory hits 2 GB I also see a noticeable decrease in performance, and my program now stops at short intervals to perform garbage collection. This is definitely bad.
Generating a Heap Image
Now, we’ve established that my code contained a memory leak. How do we continue from there? One way is to go thru our code and remove references at random until you no longer see the erroneous behavior, but that is a bit ad-hoc and difficult to know when you’re done. Instead, we will use a memory profiler. Luckily, the JDK comes with all the tools you need built right in. All you have to do is create a heap image and analyze that for orphaned objects.
You can generate the heap in basically two ways, each having advantages and disadvantages: using
hprof or using
hprof is much slower, but provides you with more information, such as class names and allocation stacks, whereas
jmap runs your program at full speed but only provides names for some classes. This section is in part based on this blog-post, but mine makes the distinctions between the options and phases clearer and provides pictures.
Simply start your program with
java -Dcom.sun.management.jmxremote \ -Dcom.sun.management.jmxremote.port=9000 \ -Dcom.sun.management.jmxremote.authenticate=false \ -Dcom.sun.management.jmxremote.ssl=false \ -agentlib:hprof=heap=dump,file=hprof.bin,format=b,depth=10 \ -jar java_app.jar
This basically starts the profiler, disables all security and tells it to dump to a file called
hprof.bin. If your program is not in a jar-file, you can simply replace the last line by the main class of your application. Now exercise your program, and when you’re done, press Ctrl-C to dump the heap and quit your application.
Start your program normally, either from the command line, from your file manager (Finder or Windows Explorer or whatever abomination Lunix users use nowadays), or from your IDE. Then find the process id of your program. You can do this in several different ways, but the easier one is just to look at the JConsole, where this is available. In the example above, the process id is 59735 (see the PID column on the process selection screen or the title bar when monitoring a process). Now exercise your program.
Then dump the image using
jmap -dump:format=b,file=hprof.bin 59735
Remember to replace 59735 with the process id you found using JConsole.
Exercising Your Program
We’re now merging from two different branches. Readers that took the branch you didn’t had a much more pleasant reading experience, but what have you? Well, we need to exercise our programs. That is, execute the operations we believe lead to errors. In my case, I should just start the program and let it run, but you may need to load a file, print it or whatever your program does. Do it a couple times so errors are more obvious in the heap.
Analyzing the Heap Image
When we have generated a heap image using either method, it is time to analyze it. Luckily, JDK comes with such a tool, jhat (I don’t know who comes up with these names, but they should be shot and then killed). Start
jhat -J-Xmx2G hprof.bin
Note there is no space between
-Xmx2G. We set this option to allow
jhat to use 2 GB of RAM, which it’ll probably need to analyze our rather large heap image.
jhat will process (which may take very long), and start a web-server on port 7000 on your local machine, because why the fuck not. To view the generated data, point your browser to localhost:7000. You’ll get something like this:
Very 90s. Immediately scroll to the bottom:
Pick the Show heap histogram link (note to dimwitted readers: your browser may not have a red rectangle pointing out the correct link, seeing as I added that in post-processing to point out the link to illiterate readers). Get the über-cool purple unicorn status bar for Safari here. The histogram, for my application, looks like this if I used
or like this if I used
The numbers differ because with
hprof the time is determined by how long I’m willing to wait and for
jmap it’s determined by how fast I can dump the image. Note that the
jmap output not always displays nice class names. In the following, I’ll only show the
hprof output as it’s nicer, but anything could have been done using the
jmap output as well.
This output should be read like a normal profiler output: focus on the ones you have a lot of. In my case, I’ve got a whole bunch of Formula and HashMap$Entry classes. In fact I’ve got several hundred more of these than anything else. It makes sense, I have a bunch of Formula classes, as I deal with formulae, but not over a million. While the data-structure is recursive, I may use maybe a couple thousand for a complex formula, not a million. Also, the HashMap$Entry intrigues me. This indicates something is stored in a HashMap, and comparing the numbers, that something may just be Formulas. Looking at the Formula class reveals no HashMap, so I’m a bit at a loss: what contains all these Formulas?
To investigate, let’s look at all of those Formulas. We click on Formula and get this screen:
Quickly hammer the References summary by type link, because it’s started loading a long-ass list containing all over a million references, which may just kill your browser. We then see this:
This tells us two thing: Formula is clearly a recursive datatype (most instances are referred by other Formula objects) and the rest are referred by HashMap$Entry objects. The Formula array in the third place is ok, as it contains the templates from which my formulae are generated (I can see this by following the link, but I’ve not shown it here). Let’s dab on the HashMap$Entry and see where that takes us. It takes us here:
So, they’re either referenced from arrays or other HashMap$Entry objects. Guess Java’s hash tables are implemented using an array of entries using linked lists of HashMap$Entry objects to handle collisions. 18000 collisions is not really that good… Anyway, let’s look at the arrays:
Unsurprisingly, these live in HashMaps. Following:
These are used in many places, but my code only shows up in a few places. The Nodes makes perfect sense: I do have some graphs referring to nodes, and the number of Nodes seem small enough to make sense. The SinglePropertyGraph also contains formulas, and once again the number of objects (1) makes perfect sense. The DefaultFormulaFactory is a bit weird, though. Why should it have a map of formulas? Also, it’s the only object that lives thru the execution of my code. Fire up Eclipse and load the factory:
Seems like we’ve found our culprit. The factory canonicalizes the Formula objects. This is obviously bad when you generate millions of them. The reason for canonicalizing the objects is to allow quick comparison, so I could fix this using WeakReferences, but instead I made the Formula object immutable (impossible to modify), and added precalculation of the hash value and an efficient equality function:
Now, I just got rid of canonicalization and instead rely on my more efficient hash function and equality test. I then ran my regression tests to make sure I didn’t break anything (regression tests suck to create, but they really rock when you do something like this).
Verifying Memory-leak is Gone
To verify this really fixed the memory-leak, I tried out JConsole again, and it now showed me something like this:
We no longer have a steady increase in memory consumption, even over time. The spikes are ok.
To ensure I’ve not missed anything, I then start my program again. I let run for a few seconds and create a dump (using
jmap) and let the program continue for a minute or two and create a secondary dump (to a new file).
Then we run
jhat again, this time like
jhat -J-Xmx2G -baseline heap1.bin heap2.bin
heap1.bin is the first heap image I generated after fixing the bug and
heap2.bin is the image I created after letting the program run for a couple minutes. The
-baseline option tells
jhat to use the first heap image as baseline and the mark any objects that are in both images as “old” and any objects that are only in the latter image as “new”. Launching out browser, we now pick the Show instance counts link neat the bottom:
This shows how many instances and how many of those are new:
This shows that while I still have a lot of Formula instances, all of these are new, and I have thus not leaked any. The same goes for any HashMap related problems. In fact, I see that almost all objects are new, which is unusual but makes perfect sense for my program, which allocates a bunch of formulae, tests them, dumps them, and then discards them.
Here we have seen how to use JDK supplied tools to squash a memory-leak in a program. My program is a bit special, as it dumps almost everything from memory when time progresses, making detecting memory-leaks especially easy. Still, the technique used to verify that the memory-leak has been found and removed can be used in general cases as well: for example, start your program and get it in a stable state. Then make a dump. Now exercise your program and bring it back to the stable state. Then make a new dump, and compare it to the old dump. You should see barely any new object – new objects correspond to leaks.
You can also explore some of the options of the programs; for example,
jmap has a
-histo option, which shows immediately the memory historgram in your console, which is much faster than making and analyzing a dump file.
Other than that, it’s just about doing profiling of your code when it is nearly done, both for time and space, and making sure you have a solid regression test suite before making changes to a working program for performance reasons.
Time person of the year 2006, Nobel Peace Prize winner 2012.