26 January 2008

Perceived jemalloc memory footprint

For the past couple of months I have been working with the Mozilla folks to integrate jemalloc into Firefox. This past week, Stuart has been doing lots of performance testing to make sure jemalloc is actually an improvement, and he ran into an interesting problem on Windows: jemalloc appears to use more memory than the default allocator, because Windows' task manager reports mapped memory rather than actual working set. As near as we could tell, jemalloc was actually reducing the working set a bit, but the perception from looking at the task manager statistics was that jemalloc was a huge pessimization. This is because jemalloc manages memory in chunks, and leaves each chunk mapped until it is completely empty. Unfortunately, even though there is a way to tell Windows that unused pages can be discarded if memory becomes tight, appearances make it seem as if jemalloc is hogging memory. Well, appearances do matter, so I have been working frantically the past few days to come up with a solution. The upshot is that I may have ended up with a solution to related problems for jemalloc in FreeBSD, its native setting.

In FreeBSD, there is an optional runtime flag that tells malloc to call madvise(2) for pages that are still mapped, but for which the data are no longer needed. This would be great, but madvise() is quite expensive to call, which leaves us with little choice but to disable those calls by default. What that means is that when memory becomes tight and the kernel needs to free up some RAM, it has to swap out the junk in those pages, just as if the junk were critical data. The repercussions are system-wide, since pretty much every application has those madvise() calls disabled.
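For readers who have not used madvise() directly, here is a minimal sketch of what "telling the kernel the data are no longer needed" looks like. It is not jemalloc code: discard_pages() and discard_demo() are hypothetical names, and the fallback to MADV_DONTNEED is an assumption made so the sketch also runs on systems that lack or reject MADV_FREE.

```c
#define _DEFAULT_SOURCE /* exposes madvise() and MAP_ANON on glibc */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: tell the kernel that the contents of
 * [addr, addr + len) are junk. The range stays mapped either way.
 */
static int discard_pages(void *addr, size_t len)
{
#ifdef MADV_FREE
    /* Preferred hint: pages may be reclaimed lazily without swapping. */
    if (madvise(addr, len, MADV_FREE) == 0)
        return 0;
#endif
    /* Assumed fallback; its exact semantics differ between systems. */
    return madvise(addr, len, MADV_DONTNEED);
}

/* Map four pages, dirty them, then discard the middle two. */
static int discard_demo(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *p = mmap(NULL, 4 * page, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANON, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xa5, 4 * page);              /* every page is now dirty */
    int rc = discard_pages(p + page, 2 * page);

    p[page] = 1;            /* still mapped, so this store must not fault */
    assert(p[0] == 0xa5);   /* pages outside the range are untouched */

    munmap(p, 4 * page);
    return rc;
}
```

Every call to discard_pages() is a system call, which is where the cost comes from when it is issued each time a run of pages is freed.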

The solution is pretty straightforward: rather than calling madvise() as soon as pages of memory can be discarded, simply make a note that those pages are dirty. Then, if the amount of dirty discardable memory exceeds some threshold, march down through memory and call madvise() until the amount of dirty memory has been brought under control. This tends to vastly reduce the number of madvise() calls, without ever leaving very much dirty memory lying around.
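The bookkeeping described above can be sketched roughly as follows. This is a toy model, not the jemalloc implementation: arena_t, NPAGES, DIRTY_MAX, the "purge down to half the threshold" target, and the counter standing in for the real madvise() call are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NPAGES    64 /* hypothetical arena size, in pages */
#define DIRTY_MAX 8  /* hypothetical dirty-page threshold */

typedef struct {
    unsigned char dirty[NPAGES]; /* 1 = discardable, not yet madvise()d */
    size_t ndirty;               /* number of dirty pages */
    size_t npurge_calls;         /* stands in for real madvise() calls */
} arena_t;

/*
 * March down from the highest page, purging each contiguous dirty run
 * with a single (simulated) madvise() call, until at most DIRTY_MAX / 2
 * dirty pages remain.
 */
static void arena_purge(arena_t *a)
{
    size_t i = NPAGES;
    while (i > 0 && a->ndirty > DIRTY_MAX / 2) {
        i--;
        if (!a->dirty[i])
            continue;
        size_t end = i + 1;
        while (i > 0 && a->dirty[i - 1])
            i--;                     /* extend the run downward */
        /* Real code would call madvise() on pages [i, end) here. */
        for (size_t j = i; j < end; j++)
            a->dirty[j] = 0;
        a->ndirty -= end - i;
        a->npurge_calls++;
    }
}

/* Called when a page's contents become junk: note it, purge lazily. */
static void arena_page_free(arena_t *a, size_t page)
{
    assert(page < NPAGES);
    if (!a->dirty[page]) {
        a->dirty[page] = 1;
        a->ndirty++;
    }
    if (a->ndirty > DIRTY_MAX)
        arena_purge(a);
}
```

In this model, freeing nine adjacent pages one at a time costs a single simulated madvise() call instead of nine, because the pages have coalesced into one run by the time the threshold is crossed. Scattered dirty pages still cost one call per run, so the savings depend on how well freed pages cluster.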

I still need to do a bunch of performance analysis before integrating this change into FreeBSD, but my expectation is that as an indirect result of trying to make jemalloc look good on Windows, FreeBSD is going to benefit.