Sunday, January 30, 2011

Zero-Copy Memory

Looking at my code paths, I've seen many methods that read from a data-structure to a buffer and then write back it to another data structure, and so on...

For example read from disk, write to a block cache, read a piece of data and put it in another other data-structure for processing. 

One way to avoid all this memcpy and data duplication is to use a copy-on-write data-structure. For strings or "single-block-data" is pretty easy. As you can see on picture above, you've a chunk of memory that is referenced from different objects. Each object keeps a pointer to the data and if needed an offset and a length to reference just a part of the data.

If someone call write() on one of those objects, internally data-blob is duplicated and modified. In this way the modified objects points to a new block and the others points to the old block.

This is a really simple copy-on-write technique and the string class is really suitable for that. But looking at my code, this case doesn't happen often... In the common case, I've object that references two or more blocks and sometimes happens that i need to remove or inject data in the middle of the buffer.

Extent-tree that i use for the Flow Objects in RaleighFS plus copy-on-write on blocks has the behavior that I need to avoid all this memcpy and data duplication.

Extent-Tree allows you to read from a list of blocks as a single block. Each block as an offset and a length, and blocks are sorted by offset, in the tree. This allows you to reach your specified offset fast and insert or remove at any specified offset.

Using this extent-tree has the advantage that you can avoid to memcpy large blocks from a data-structure to another, you can merge two extent tree or do other fancy thing without allocate new memory and copy data to it. But insert and split require a malloc, so you've to tune your node and extent allocation with an object pool, and this is pretty simple due to the fixed size of those object.

I've made a draft implementation that you can find on my github repo.

No comments:

Post a Comment