Sunday, June 24, 2012

Data-center Rolling Upgrades coordinated by ZooKeeper

Still playing around trying to improve the daily deploy work in the data-centers.

The idea is to replace a sequential, semi-manual process with something more automatic that doesn't need human intervention unless a failure happens.

Services and Deploy rules:
  • Services have dependencies (Service B depends on Service A), so deploy order matters!
  • You can't bring down all the machines at the same time!
  • One or more machines can be unreachable during the deploy (network problems, hardware failures, ...).
  • Each machine needs to be self-sufficient!
Must-haves (Monitoring):
  • Current service state of each machine (online/offline, service v1, v2)
  • Current "deploy" state (Ready to roll?)

The idea is quite simple: use ZooKeeper to keep track of each service (A, B, ..., K) with the list of machines available (ephemeral znodes), and to keep track of the deploy state ("staging").
  • /dc/current: Contains a list of services, each with the list of online machines (and their service version).
  • /dc/staging: Contains a list of services, each with the list of machines ready to roll.
  • /dc/deploy: The deploy order queue; each node represents a service to upgrade.
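
To make the layout above a bit more concrete, here is a minimal sketch of how the znode paths could be built. Only the three roots (/dc/current, /dc/staging, /dc/deploy) come from the description; the per-service and per-machine children, the helper names, and the zero-padded queue prefix (which mimics the 10-digit counter ZooKeeper appends to sequential znodes) are assumptions made for illustration.

```python
DC_ROOT = "/dc"


def current_path(service, machine):
    """Path of the (ephemeral) znode marking `machine` as online for `service`."""
    return f"{DC_ROOT}/current/{service}/{machine}"


def staging_path(service, machine=None):
    """Path of a service in staging, or of a machine that is ready to roll."""
    base = f"{DC_ROOT}/staging/{service}"
    return base if machine is None else f"{base}/{machine}"


def deploy_path(position, service):
    """Path of a deploy-queue entry.

    The zero-padded prefix is an assumption: it makes the lexicographic
    order of the children equal to the deploy order.
    """
    return f"{DC_ROOT}/deploy/{position:010d}-{service}"


if __name__ == "__main__":
    print(current_path("A", "machine-1"))
    print(staging_path("A", "machine-1"))
    print(deploy_path(0, "A"))
```

With this naming, listing the children of /dc/deploy and sorting them gives the upgrade order (Service A before Service B, and so on).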
When you're ready to deploy something new, you create the new znodes:
  • Add services to "staging" with the useful metadata (version, download path, ...)
  • Define a deploy "order" queue
Each service is notified about the new staging version and starts downloading (see the "data-center deploy using torrent and mlock()" post). Once the download is complete, the service registers itself in the "staging" queue.
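
Since znode data is just a byte array, the staging metadata has to be serialized somehow. The sketch below assumes JSON; the field names `version` and `download_path` follow the list above, while the encoding, the function names, and the example path are hypothetical.

```python
import json


def encode_staging_metadata(version, download_path):
    """Serialize the staging metadata into bytes suitable for znode data."""
    payload = {"version": version, "download_path": download_path}
    return json.dumps(payload, sort_keys=True).encode("utf-8")


def decode_staging_metadata(data):
    """Inverse of encode_staging_metadata: bytes back to a dict."""
    return json.loads(data.decode("utf-8"))


if __name__ == "__main__":
    # The path below is a made-up example, not a real location.
    blob = encode_staging_metadata("v2", "/releases/service-a-v2.torrent")
    print(decode_staging_metadata(blob))
```

A machine that gets notified about a new staging entry would decode this blob, download from `download_path`, and only then register itself in the staging queue.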

Now the tricky part: when can I start switching to the new version? The idea is to specify a quorum for each service. The first machine in the "staging" queue for the first service in the "deploy" queue checks for the quorum and, when it is time, shuts itself down and restarts with the new service. Once that is done, it adds itself to the "current" list and removes itself from the staging queue.
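
The decision each machine takes can be sketched as a pure function over a snapshot of the three znode trees. This is not a ZooKeeper client, just the logic. What the quorum counts is an assumption here: the sketch treats it as the minimum number of machines that are either waiting in staging or already running the new version. The function and parameter names are made up for illustration.

```python
def may_upgrade(me, service, deploy_queue, staging, current, new_version, quorum):
    """Return True if machine `me` should restart `service` into `new_version` now.

    deploy_queue: ordered list of services still to upgrade
    staging:      {service: ordered list of machines ready to roll}
    current:      {service: {machine: running version}}
    quorum:       assumed to be the minimum number of machines that are
                  ready to roll or already upgraded
    """
    # Only the service at the head of the deploy queue may be upgraded.
    if not deploy_queue or deploy_queue[0] != service:
        return False
    # Only the machine at the head of the staging queue may move.
    ready = staging.get(service, [])
    if not ready or ready[0] != me:
        return False
    upgraded = sum(1 for v in current.get(service, {}).values() if v == new_version)
    return len(ready) + upgraded >= quorum


if __name__ == "__main__":
    deploy_queue = ["A", "B"]
    staging = {"A": ["m1", "m2"], "B": ["m3"]}
    current = {"A": {"m1": "v1", "m2": "v1"}, "B": {"m3": "v1"}}
    print(may_upgrade("m1", "A", deploy_queue, staging, current, "v2", quorum=2))
    print(may_upgrade("m2", "A", deploy_queue, staging, current, "v2", quorum=2))
```

Because only the head of the staging queue is allowed to move, at most one machine per service restarts at a time, which is what keeps the rest of the machines online.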

One by one, each machine upgrades itself until the deploy queue is empty. If a machine was down during the deploy, it checks the "current" node when it comes back to find which version is the most popular, and starts that version of the service.
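
Picking the "most popular" version is a simple count over the machines listed under the service in "current". A minimal sketch, assuming the versions have already been read into a dict (the function name and the dict shape are hypothetical):

```python
from collections import Counter


def most_popular_version(current_versions):
    """Return the version run by most machines, or None if nobody is online.

    current_versions: {machine: running version} for a single service.
    On a tie, the version seen first in the dict wins.
    """
    counts = Counter(current_versions.values())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    online = {"m1": "v2", "m2": "v2", "m3": "v1"}
    print(most_popular_version(online))
```

A machine that missed the deploy would start whatever this returns, so it joins the majority instead of coming back with a stale version.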

