Clone wiki

InplaceArray / Home

Welcome

This is simple proof of concept library for Dlang. The main aim is to reduce GC usage and improve data locality by replacing dynamic array.

Problem

It is common to work with small (less than 20 bytes) strings. Many programs are affected by issues highlighted here. It is ineffective to use dynamic memory and GC memory for small arrays. Dlang uses 2 * size_t (pointer + size) for any dynamic array. Heap and GC also require additional memory consumption per allocated block. Inplace arrays allow to store up to 3 * size_t -1 bytes as value type without heap at all.

Design

The main ideas:

  • Use value type for small array and dynamic array for other

  • Use union for smooth switching

  • Cache hash code for dynamic array to improve memory locality (equals & hash code methods).

Key code:

  union InplaceArray(T){
    size_t[3] a;
    align (1): struct{
        T[maxInplaceLength!T()] data; //3*size_t.sizeof - 1
        Status status; //one byte field
    }
  }

 struct IArray(T){
   union{
    InplaceArray!(T) iarr;
    struct {
        T[] arr;
        size_t cachedHash;
    }
}

Union

Inplace array layout:

[.....array_size.....][....pointer......][....hash_code.....]
union
[.....array_items...................................][status]

There is possibility to use one bit from hash code as mark to distinguish inplace array and dynamic array.

Drawbacks

It is mix between value type and ref type. As result this approach is recommended only for immutable data. Also it has 20% performace gap (at least now, in word counter test)

Demo

I include small demo - word counter. This program builds AA array (word => counter) for given text file in UTF-8. It is possible switch between builtin array/string and inplace array by changing one line code. I used:

Here results. Buildin string:

Done 83163128. Inplace: 0; total: 5624511; 0
GC summary: 1355 MB, 12 GC 5659 ms, Pauses 3116 ms < 1436 ms

Inplace array:

Done 83163128. Inplace: 5592138; total: 5624511; 99.4244
GC summary: 715 MB, 22 GC 5281 ms, Pauses 4207 ms < 938 ms

As you can see it is possible noticable reduce memory consumption (1355Mb -> 715Mb) and max pause time (1436ms -> 938ms).

The same test with original gzip:

Buildin string:

Done 83162236. Inplace: 0; total: 5624510; 0
GC summary: 672 MB, 46 GC20140 ms, Pauses16019 ms < 1474 ms

Inplace array:

Done 83163128. Inplace: 5592138; total: 5624511; 99.4244
GC summary: 494 MB, 52 GC15264 ms, Pauses13732 ms < 926 ms

Have fun!

Updated