In this post, I'll share how I cut the number of servers behind our web service roughly tenfold while handling about three times the traffic. Put together, that makes our system roughly 30 times more efficient. For anyone who has struggled to scale a service, an improvement of this size is tremendous, and it saves us a lot of money. There is still more we can improve. We have all seen plenty of posts and articles about tweaking an Nginx configuration to make it serve more requests; now I have seen the effect first-hand.
Earlier, we ran 23-25 servers for 8,000 users/sec. Now we have served more than 25,000 users/sec with only four machines. (We also tried a single server at around 12,000 users/sec, but CPU utilization ran a little high.) We have even served around 45,000 users/sec with just six servers. The machines we used earlier were also higher-spec than the ones we use now.
How did we modify our Nginx configuration?
Earlier, we used a fairly standard configuration, with most parameters left at their default or recommended values. My team told me it used to handle a lot of requests, but something changed recently that gave all of us nightmares. We looked hard for the problem everywhere we could think of: we improved our code and tried changing Nginx and database parameter values. Even so, we were still running 22 machines for a modest load.
Honestly, I no longer cared much about the old configuration and wanted to write a new one from scratch. I started with the book Mastering Nginx by Dimitri Aivaliotis, which is a great book; I mainly read chapters 2, 4 and 5. I also read these articles: http://tweaked.io/guide/nginx/, http://www.aosabook.org/en/nginx.html, https://www.nginx.com/blog/inside-nginx-how-we-designed-for-performance-scale/, http://highscalability.com/blog/2014/4/30/10-tips-for-optimizing-nginx-and-php-fpm-for-high-traffic-si.html .
I started writing my own Nginx configuration. I used the loader.io and ab load-testing tools throughout to find the right values for the parameters. I increased timeout values from 15s to 300s and adjusted other values as well. I then implemented caching (microcaching) in Nginx, along with caching of open file descriptors, and moved the cache storage into RAM. After that, I switched the connection to the upstream servers from a Unix socket to TCP. I saw a drastic improvement in the loader.io and ab tests. Then we went live with the new configuration, and things really had improved a lot. Surprisingly, we could handle the same load with just four machines (we even tried one machine, though with higher CPU usage).
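To make those changes concrete, here is a minimal sketch of what such a configuration could look like. It is not the configuration we actually run: the paths, zone name, sizes, cache durations, backend address, and the choice of which timeouts to raise are all assumptions for illustration, and it assumes a PHP FastCGI backend and a RAM-backed /dev/shm.

```nginx
# Illustrative sketch only. Every path, name, size, and duration here is an
# assumed placeholder, not a value from the real setup.
events {
    worker_connections 1024;
}

http {
    # Microcache stored in RAM (assumes /dev/shm is a tmpfs mount).
    fastcgi_cache_path /dev/shm/nginx_microcache levels=1:2
                       keys_zone=microcache:10m max_size=256m inactive=10m;
    fastcgi_cache_key "$scheme$request_method$host$request_uri";

    # Cache open file descriptors and file metadata.
    open_file_cache max=10000 inactive=30s;
    open_file_cache_valid 60s;
    open_file_cache_min_uses 2;
    open_file_cache_errors on;

    # Upstream reached over TCP instead of a Unix socket (assumed address).
    upstream php_backend {
        server 127.0.0.1:9000;
    }

    server {
        listen 80;
        root /var/www/html;

        location ~ \.php$ {
            include fastcgi_params;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
            fastcgi_pass php_backend;

            # 300s timeouts; applying them to the FastCGI read/send
            # timeouts is a guess at which values were raised.
            fastcgi_read_timeout 300s;
            fastcgi_send_timeout 300s;

            # Microcaching: keep successful responses for one second.
            fastcgi_cache microcache;
            fastcgi_cache_valid 200 1s;
            fastcgi_cache_use_stale updating error timeout;
            fastcgi_cache_lock on;
        }
    }
}
```

Even a one-second cache lifetime means that, under heavy load, most requests for the same page are answered from memory rather than hitting the backend, which is where microcaching gets its effect. The right values depend on the workload, so they are worth tuning with a load-testing tool rather than copying as-is.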
We were so happy that we were shouting with joy.
Later, I replaced php5-fpm with HHVM, which brought the number of machines down to just one.
Now we are all flying, and we still see more improvements to make.
Thank you, everyone. 🙂