Friday, April 20, 2012

Embedded Store, under the hood...

This week I found an interesting bug that can be summarized this way: the user has no idea of what happens under the hood, and his main usage pattern always goes against your design.

To give you more context, the bug was related to embedded storage systems: something like bsddb, SQLite, or just your simple B+Tree or on-disk hash table.

So, how is an embedded storage system designed?
As you can see from the simplified schema on the right:
  • The lowest level is the raw access to the on-disk data structure (e.g. B+Tree, HashTable, ...), so each request goes directly to disk.
  • On top of that, to speed things up, you add a cache to avoid fetching data from disk all the time.
  • And everything is packed in a nice API that provides some sort of get and put functionality, at maximum speed thanks to the cache.
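The three layers can be sketched in a few lines of Python. This is only an illustration: `DiskStore` and `EmbeddedStore` are hypothetical names, and a pickled dict stands in for the real on-disk B+Tree or hash table.

```python
import os
import pickle

class DiskStore:
    """Lowest layer: raw access to the on-disk structure.
    A pickled dict stands in for a real B+Tree or hash table."""
    def __init__(self, path):
        self.path = path
        if not os.path.exists(path):
            self._dump({})

    def _dump(self, data):
        with open(self.path, "wb") as f:
            pickle.dump(data, f)

    def read(self, key):
        with open(self.path, "rb") as f:   # every request goes to disk
            return pickle.load(f).get(key)

    def write(self, key, value):
        with open(self.path, "rb") as f:
            data = pickle.load(f)
        data[key] = value
        self._dump(data)

class EmbeddedStore:
    """Top layer: a nice get/put API, with a cache in the middle."""
    def __init__(self, path):
        self.disk = DiskStore(path)
        self.cache = {}                    # middle layer: avoids disk on hot keys

    def get(self, key):
        if key not in self.cache:          # miss: fetch from disk once
            self.cache[key] = self.disk.read(key)
        return self.cache[key]             # hit: no disk access at all

    def put(self, key, value):
        self.disk.write(key, value)
        self.cache[key] = value            # keep our own cache in sync
```

Note that the cache only stays correct because this one object sees every write that touches the file. Keep that assumption in mind.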

Everything seems fine. You can build an application on top of your library that handles tons of requests without touching the disk, thanks to the cache layer. You can even think of building a service to query your storage from a remote host... and here the problems begin.


Your first super happy user arrives and decides to build his super scalable infrastructure on top of your embedded storage.
...and what is the easy way to get a super scalable service? Obviously, adding some threads... but threads are not a problem, because the user has learned the lesson and knows that he should not use shared variables. So the brilliant solution is that each thread gets its own instance of the "storage object"; to be clear, each thread does something like db.open("super-storage.db").

Everything seems fine... but after a couple of days the user starts crying... sometimes data is missing, the logs contain strange "page not found" messages, some parts of the store are corrupted, and so on...

Can you spot the problem? Yes, it is the cache...
No instance is aware of the changes made to the file by the others: every thread uses its own cache, and the first request for a block that is not in cache ends up creating some bad state.
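The whole bug fits in a few lines. Here `CachedDB` is a hypothetical stand-in for the handle that each thread gets from db.open("super-storage.db"): one value on disk, a private cache in front of it.

```python
import os
import tempfile

class CachedDB:
    """Hypothetical per-thread handle: a private cache in front of the file,
    the same pattern as each thread calling db.open(...)."""
    def __init__(self, path):
        self.path = path
        self.cached = None

    def get(self):
        if self.cached is None:            # first miss reads from disk...
            with open(self.path) as f:
                self.cached = f.read()
        return self.cached                 # ...after that, disk is never re-read

    def put(self, value):
        with open(self.path, "w") as f:
            f.write(value)
        self.cached = value                # only THIS instance's cache is updated

path = os.path.join(tempfile.mkdtemp(), "super-storage.db")
with open(path, "w") as f:
    f.write("v1")

a = CachedDB(path)    # "thread" A opens the store
b = CachedDB(path)    # "thread" B opens the same file
a.get()               # A caches "v1"
b.put("v2")           # B writes "v2" to disk and to B's own cache only
print(a.get())        # prints "v1": A serves the stale value from its cache
```

A real store caches pages instead of one value, but the failure is the same: A keeps answering from a cache that no longer matches the file.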

So the solution for the user is to use the library as you designed it, with just one thread/process/whatever accessing the store file.
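One way to keep the user's threads and still respect the design is to give the store a single owner: one thread holds the only instance (and therefore the only cache), and every other thread submits requests through a queue. A minimal sketch, with a plain dict standing in for the real store (`StoreServer` is a made-up name):

```python
import queue
import threading

class StoreServer:
    """Single-owner access: one thread owns the store and its cache;
    other threads talk to it only through a request queue."""
    def __init__(self):
        self.data = {}                    # stands in for the real store + cache
        self.requests = queue.Queue()
        threading.Thread(target=self._serve, daemon=True).start()

    def _serve(self):
        while True:                       # the ONLY code that touches the store
            op, key, value, reply = self.requests.get()
            if op == "put":
                self.data[key] = value
                reply.put(None)
            else:                         # "get"
                reply.put(self.data.get(key))

    def put(self, key, value):
        reply = queue.Queue()
        self.requests.put(("put", key, value, reply))
        reply.get()                       # wait until the owner thread is done

    def get(self, key):
        reply = queue.Queue()
        self.requests.put(("get", key, None, reply))
        return reply.get()
```

Since every read and write goes through the same instance, there is a single cache and a single, consistent view of the file.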

But if you want to slow down your super cool library and make the user happy, you can always add an ID to the super-block: every time the user requests something, you fetch the super-block from disk, compare it with the one in cache, and if they differ you just invalidate all the caches...
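A sketch of that trick, assuming a generation counter in the super-block that every writer bumps (`SharedStore` and the pickled layout are made up for illustration):

```python
import os
import pickle

class SharedStore:
    """Super-block trick: a generation ID lives with the data; every request
    re-reads it from disk and drops the whole cache when another instance
    has written. Correct again, but every get pays a disk read."""
    def __init__(self, path):
        self.path, self.cache, self.seen_gen = path, {}, -1
        if not os.path.exists(path):
            self._dump(0, {})

    def _dump(self, gen, data):
        with open(self.path, "wb") as f:
            pickle.dump((gen, data), f)

    def _load(self):
        with open(self.path, "rb") as f:
            return pickle.load(f)

    def get(self, key):
        gen, data = self._load()           # fetch the super-block from disk
        if gen != self.seen_gen:           # someone else wrote since our last look
            self.cache.clear()             # invalidate every cached entry
            self.seen_gen = gen
        if key not in self.cache:
            self.cache[key] = data.get(key)
        return self.cache[key]

    def put(self, key, value):
        gen, data = self._load()
        data[key] = value
        self._dump(gen + 1, data)          # bump the generation ID
        self.cache[key] = value
        self.seen_gen = gen + 1
```

Every get now pays an extra disk read just to check the ID, and any write throws away everyone else's entire cache: exactly the "slow down your library" trade-off described above.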

3 comments:

  1. Pretty common problem)) Interesting that the person did not take the multithreaded case into consideration. The solution is also pretty common: a super-bus with an arbiter in front of the cache, or several caches with some sort of cache-coherency algorithm.

    UNIX® Systems for Modern Architectures: Symmetric Multiprocessing and Caching for Kernel Programmers

    covers all of this, but from the CPU point of view.

  2. Is it a problem with caching or flushing?

    Replies
    1. The big problem here is the cache inside the library, since the library always reads from its own cache without re-reading from disk.
      Read/write consistency with multiple writers is another problem.
