Jan 25 2006

I just completed reworking the GSlice and g_malloc() memory debugging hooks. With this, GLib-2.10 (or the next GLib-2.9 development version due this weekend) will support G_SLICE=always-malloc and G_DEBUG=gc-friendly without recompilation. The former causes all GSlice allocations to go through g_malloc() and g_free() instead, and the latter causes all released memory to be memset() to 0. This helps memory checkers to trace pointer lists and can be used to debug g_malloc()/g_free() and g_slice_new()/g_slice_free() allocation code to find and plug memory leaks:

	G_SLICE=always-malloc G_DEBUG=gc-friendly valgrind --leak-check=full mybuggyprogram

Or, in combination with MALLOC_CHECK_, it can be used to catch and debug invalid allocation/release calls:

	G_SLICE=always-malloc G_DEBUG=gc-friendly MALLOC_CHECK_=2 gdb mybuggyprogram

Dec 20 2005

The other day, Tommi Komulainen pointed out to me that GSlice is using more memory than memchunks for him after bootup of the N770. Now, GSlice is supposed to be faster than memchunks, yes. And it's supposed to scale far better across multiple threads. In the long run it is also meant to be more efficient in terms of memory consumption.
However, that is mostly due to the fact that many code portions which use GMemChunk are keeping their own never-freeing trash stacks. Such code should now be migrated to the GSlice API and the trash stacks should be removed. GSlice maintains its own working set memory in per-thread trash stacks. Home grown trash stacks will just waste memory and clutter up the cache lines.
Also, using separate GMemChunks prevents chunks of equal sizes from being shared across a program; this again wastes memory and clutters up the cache lines.
Because of the long-term wastage that GMemChunk application tends to build up, significant memory savings from GSlice are actually only to be expected for longer running programs which certainly is not a scenario met directly after N770 boot up 😉
That being said, the original check-in of the slab allocator in the GSlice code does behave a bit greedily. Basically, it allocates a new page (4096 bytes on IA32) for every distinct chunk size that memory is allocated at (chunk sizes are aligned to 8 bytes on IA32). That means that initially allocating 8 + 16 + 24 + 32 + 40 bytes in separate chunks requires opening up 5 caches, so it already uses 5 * 4096 bytes = 20KB. I’ve now tuned the most recent CVS version to do more economical caching, so the above scenario ends up at roughly the power-of-2 sums of 8*8, 16*8, 24*8, 32*8 and 40*8, which is approximately 1.6KB and can all be allocated from a single memory page. While this is a significant memory saver, it does have some performance impact. However, in all test scenarios on my machine, the GSlice performance didn’t drop by more than 5%. That’s probably bearable, considering how significant the savings are.
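The cost of the original one-page-per-size scheme can be checked with a few lines of C. This is only a sketch of the arithmetic, not GSlice code: the function names are made up for illustration, and the page size and alignment are the IA32 values quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values, taken from the text: 4096-byte pages, 8-byte alignment (IA32). */
#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_ALIGNMENT 8u

/* Round a requested size up to the next multiple of the alignment. */
size_t aligned_chunk_size(size_t size)
{
    assert(size > 0);
    return (size + SKETCH_ALIGNMENT - 1) / SKETCH_ALIGNMENT * SKETCH_ALIGNMENT;
}

/* Bytes reserved when every distinct aligned chunk size opens its own page. */
size_t page_per_size_cost(const size_t *sizes, size_t n)
{
    size_t distinct = 0;
    for (size_t i = 0; i < n; i++) {
        size_t a = aligned_chunk_size(sizes[i]);
        int seen = 0;
        /* Count this size only if no earlier request mapped to the same chunk size. */
        for (size_t j = 0; j < i; j++)
            if (aligned_chunk_size(sizes[j]) == a)
                seen = 1;
        if (!seen)
            distinct++;
    }
    return distinct * SKETCH_PAGE_SIZE;
}
```

Calling page_per_size_cost() with the sizes 8, 16, 24, 32 and 40 gives 5 * 4096 = 20480 bytes, the 20KB mentioned above. The sketch does not try to reproduce the tuned caching of the newer CVS version.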

Dec 02 2005

I finally managed to finish up the new g_slice_*() allocator for GLib CVS and commit it. For most platforms, it should be a lot faster than malloc(), and on my machine it performs 20-30% better than allocating with mem-chunks. But most importantly, it shares equally sized chunks across a program. Mem-chunks didn’t do that, so they opened up several independent heaps of equally sized chunks, scattered all over the program and causing large wastage.
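To illustrate what sharing equally sized chunks means, here is a toy size-class allocator in plain C. It is a hypothetical sketch and not the GSlice implementation (no magazine cache, no slabs, no thread safety), and all names in it are invented. The only point is that a chunk released by one caller can be reused by any other caller that asks for the same aligned size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_ALIGN   8u
#define SKETCH_CLASSES 16u   /* this sketch covers chunk sizes 8..128 bytes */

typedef struct FreeChunk { struct FreeChunk *next; } FreeChunk;

/* One free list per size class, shared by every caller in the program. */
static FreeChunk *free_lists[SKETCH_CLASSES];

/* Map a requested size to the index of its 8-byte-aligned size class. */
size_t size_class(size_t size)
{
    assert(size > 0 && size <= SKETCH_ALIGN * SKETCH_CLASSES);
    return (size + SKETCH_ALIGN - 1) / SKETCH_ALIGN - 1;
}

/* Reuse a released chunk of the same class if there is one, else malloc(). */
void *sketch_alloc(size_t size)
{
    size_t c = size_class(size);
    FreeChunk *chunk = free_lists[c];
    if (chunk != NULL) {
        free_lists[c] = chunk->next;
        return chunk;
    }
    return malloc((c + 1) * SKETCH_ALIGN);
}

/* Push the chunk onto the shared free list of its size class. */
void sketch_free(size_t size, void *mem)
{
    size_t c = size_class(size);
    FreeChunk *chunk = mem;
    chunk->next = free_lists[c];
    free_lists[c] = chunk;
}
```

With separate per-type heaps, a freed 16-byte chunk of one type would sit unused while another type of the same size allocates fresh memory. Here both requests hit the same list.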
I would like to thank Matthias Clasen for an initial version of the magazine cache and Stefan Westerfeld for helping me with optimizing the common code paths.

Aug 01 2005

Today, I started working at imendio. We’re doing some pretty interesting hacking here, so I hope the job is going to be fun.

And, the hopefully last incarnation of the atomic ref-counting patch for GObject I blogged about earlier finally made it into CVS. With atomic ref-counting, GObject users still have to make sure they don’t modify or read out the same object concurrently from different threads (such as using g_object_set_qdata() on it or using object methods). They may, however, invoke g_object_ref() and g_object_unref() concurrently without having to ensure mutual exclusion. The same goes for GParamSpec and GClosure.
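For readers unfamiliar with the idea, atomic ref-counting can be sketched with C11 atomics roughly like this. This is not the GObject patch and does not use GLib; SketchObject and its functions are hypothetical names. It only shows why ref and unref need no mutex, while access to the payload still does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

typedef struct {
    atomic_uint ref_count;  /* safe to change from several threads at once */
    int payload;            /* NOT protected: callers must serialize access */
} SketchObject;

/* Create an object holding one reference; returns NULL if malloc() fails. */
SketchObject *sketch_object_new(int payload)
{
    SketchObject *obj = malloc(sizeof *obj);
    if (obj == NULL)
        return NULL;
    atomic_init(&obj->ref_count, 1);
    obj->payload = payload;
    return obj;
}

/* Take an additional reference; the increment is one atomic operation. */
SketchObject *sketch_object_ref(SketchObject *obj)
{
    unsigned old = atomic_fetch_add(&obj->ref_count, 1);
    assert(old > 0);  /* the caller must already hold a reference */
    return obj;
}

/* Drop a reference; returns 1 if this call freed the object, else 0. */
int sketch_object_unref(SketchObject *obj)
{
    unsigned old = atomic_fetch_sub(&obj->ref_count, 1);
    assert(old > 0);
    if (old == 1) {   /* we released the last reference */
        free(obj);
        return 1;
    }
    return 0;
}
```

Because the decrement returns the previous count atomically, exactly one thread observes the transition to zero and frees the object, even if several threads unref at the same time.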