Background

The background is that we have very limited memory available for our new board, around 2.2MB.

  • The upper layer like UI would have a lot of allocations&frees while the system is running.
  • As time goes , there will be fragments that cannot be freed, and the system will finally have no available memory.
  • The underline allocator consumes at least 32 bytes for one allocation, because it has to maintain additional information about the allocation, the previous node, the next node, the size, etc.

First we can do is cutting the code and functionalities.

To reduce the memory consumption, the cp team splits allocations to one-time allocations and dynamic allocations, and handles them with different strategies.

Also the cp team hoped that the upper layer can reduce allocations and memory consumption.

Thus I took the burden to implement a memory pool for UI to achieve the goals above.

Design

My design is that:

  • different align sizes have different pools
  • allocate a block with 1024 bytes (configurable)
  • track allocations&frees in a single block with bitmap
  • insert&sort blocks by raw pointer address with binary search
  • maintain double linked lists for different align sizes
  • maintain the free node of a pool for quick path allocations
  • each pool keeps at least one node

The data structures look like:

struct block_t {
    block_t* prev;
    block_t* next;
    uint8_t align_size;
    bitmap_t bitmap;
    void* raw;
};

struct pool_t {
        block_t* head;
        block_t* free;
        uint8_t align_size;
};

The benefits of the design are:

  • it can quickly determine if a pointer belongs to the pool or not with binary search by raw pointers of blocks
  • in a block, it can quickly determine the position by modulo of align size
  • in a block, it can quickly determine the given address is valid or not by checking modulo of align size
  • in a block, it can quickly check the address is freed or not by bitmap

The allocation steps are straightforward:

  • calc the align size for the allocation to determine it should be handled by our pool
  • find the pool for the align size
  • check if the free node is available and have available spaces
  • if not, check if any blocks have available spaces
  • if not, allocate a new block from the underline allocator and do the allocation

Performance

Performance on Windows PC with single thread

PC details:

Intel cpu i7-4790 3.6GHZ
DDR3 1600Mhz 8GB

Perf:

alloc (20176653.6/s)
free (20470494.1/s)

Performance on our board

alloc (199480.0/s)
free (387777.1/s)