Categories
aws optimization php static

Optimization – Store Static Files on AWS S3 with Git Hooks, AWS CLI tool

In this tutorial, we will learn how to store static files such as JS, CSS, JSON, and images on AWS S3. This can noticeably reduce the load on your web servers. The reason is that every page load triggers many follow-up requests for such static files. These requests need no server-side processing (which is why the files can live on S3, a storage service that does no processing), yet they still hit your web servers in significant numbers. Serving them from S3 frees up a lot of web server capacity. We are going to use Git hooks and the AWS CLI tool to achieve our goal.

The challenge is the repeated upload of your static files to S3, which is tedious when done manually. Here we can make Git hooks and the AWS CLI tool work together to automate the syncing of your static files.

Git Hooks

Git hooks, in simple terms, are scripts that Git runs automatically at certain points of a git command. For example, the pre-push hook runs on every git push command, before the code is actually pushed.

AWS CLI

The AWS CLI tool, in simple terms, is a command line interface tool (that is what CLI stands for) with various commands to communicate with AWS services like S3, EC2, etc.

Here we can put an AWS CLI command in the pre-push hook.

The process is as follows: define a constant ASSETS_URL that holds the base URL of the static files. For example, in your test environment it would be http://localhost/project/, and in production it would be your S3 address (or a CloudFront URL in front of S3, such as https://cdn.example.com/). The static file URLs would then look like ASSETS_URL.'assets/img/a.png', ASSETS_URL.'assets/css/a.css', ASSETS_URL.'assets/js/a.js', etc.

The testing and development process stays the same, as copies of all the files remain on your local server. In the production environment, however, the application looks for these files at the CDN address provided. So, before sending your code to the production servers, you need to upload all new and modified files to S3.

This is where Git hooks come into the picture. One of the Git hooks is the pre-push hook, which you can create or edit at .git/hooks/pre-push. For a sample pre-push hook, see https://stackoverflow.com/a/14672883/2560576.

In the pre-push hook, you can add an AWS CLI command that syncs your local assets folder to your S3 assets folder. For example:

aws s3 sync assets/ s3://bucket/assets/ --profile aws_credential_profile --acl public-read

So now, when development is complete, you can execute git push to push your code to the remote repository as usual. With the help of the Git pre-push hook, all the static files are synced to your S3 bucket's assets folder just before the actual push.
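The steps above could be wired together roughly as follows. This is only a minimal sketch of a pre-push hook, not a definitive implementation: the bucket, profile, and folder are the placeholder values from the example command, and the function names sync_assets and main are made up for illustration. It relies on the fact that Git aborts the push when the hook exits with a non-zero status.

```shell
#!/bin/sh
# Sketch of .git/hooks/pre-push; make it executable with: chmod +x .git/hooks/pre-push
# Bucket, profile and folder are placeholders taken from the example command.

ASSETS_DIR="assets/"
S3_TARGET="s3://bucket/assets/"
AWS_PROFILE_NAME="aws_credential_profile"

# Upload new or modified files from the local assets folder to S3.
sync_assets() {
    aws s3 sync "$ASSETS_DIR" "$S3_TARGET" \
        --profile "$AWS_PROFILE_NAME" --acl public-read
}

# Report the result; a non-zero return value makes Git abort the push.
main() {
    if sync_assets; then
        echo "pre-push: assets synced to $S3_TARGET"
    else
        echo "pre-push: asset sync failed, push aborted" >&2
        return 1
    fi
}

# Run only when invoked as the hook itself, so the functions can be sourced elsewhere.
case "$0" in
    *pre-push) main "$@" || exit 1 ;;
esac
```

To preview what would be uploaded without changing the bucket, the same aws s3 sync command can be run by hand with the --dryrun flag added.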

Now only processing requests are made to your web server, and all static file requests are routed to S3.

Hope this helps someone. Please share your feedback if anything can be improved or added.


Thanks!

Categories
access-log aws elasticsearch elk filter grok grok-debugger gui input kibana log-format logging logrotate logstash monitoring nginx optimization output s3 webbrowser

ELK : Configure Elasticsearch, Logstash, Kibana to visualize and analyze the logs.

This post is about how to configure Elasticsearch, Logstash, and Kibana together to visualize and analyze logs. There are situations when you want to see what's happening on your server, so you switch on your access log. Of course you can tail -f or grep that file, but analyzing logs directly from the log file is really cumbersome.

What if I told you there is a super cool tool, Kibana, which lets you analyze your logs and gives you all kinds of information in just a few clicks? Kibana is a GUI tool designed for analyzing large log files. When used properly together with Logstash and Elasticsearch, it can be a boon for developers.

Logstash is a tool used to move logs from one file or storage (such as S3) to another, parsing them along the way. Elasticsearch is a search server that stores the logs in an indexed, searchable format.

Here is the picture in short: Logstash brings the logs from S3, formats them, and sends them to Elasticsearch. Kibana fetches the logs from Elasticsearch and shows them to you. With that, let's see the thing that actually matters, the code. I assume you have everything installed (Elasticsearch, Logstash, Kibana) and that S3 is the source of our logs.

First we configure Logstash. A Logstash configuration has three main blocks: input, filter, and output. In the input block we specify the source of the logs, in the filter block we format the logs the way we want them stored in Elasticsearch, and in the output block we specify the destination.

Code:

Open up a terminal and open the Logstash configuration file:

nano /etc/logstash/conf.d/logstash.conf

Edit it so that it contains the following:

input {
  s3 {
    bucket => "bucket_name_containing_logs"
    credentials => ["ACCESS_KEY", "SECRET_KEY"]
    region_endpoint => "whichever_probably_us-east-1"
    codec => json {
      charset => "ISO-8859-1"
    }
  }
}

filter {
  grok {
    match => { "message" => "grok_pattern" }
  }
}

output {
  #stdout {
  #  codec => json
  #}
  elasticsearch_http {
    host => "localhost"
    port => "9200"
    codec => json {
      charset => "ISO-8859-1"
    }
  }
}

Explanation :

In the input block, we specify that our logs come from S3. We provide the credentials needed to access the S3 bucket, and we specify the charset of the input.

In the filter block, we use grok to parse each log line into named fields, which then show up as custom fields in Kibana. This requires writing a pattern that matches your log format. You can use Grok Debugger to test your pattern against a given input.

In the output block, we specify Elasticsearch and its address as the destination for the output. We also specify the charset of the output.

You can uncomment the stdout block in the output block to print the data to the console.

Elasticsearch

We don't need to change anything in the Elasticsearch configuration for now. If you are curious, you can find it at /etc/elasticsearch/elasticsearch.yml. One thing to keep in mind is that this ELK system needs a well-provisioned machine; otherwise you might run into various errors when Elasticsearch gets full. One workaround is to clear Elasticsearch out whenever it is full.

The following command removes all indices, and with them all stored logs, bringing Elasticsearch back to its initial state.

curl -XDELETE 'http://localhost:9200/_all/'
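A gentler alternative to wiping everything is to delete only the indices you no longer need. The sketch below assumes that Logstash writes to daily indices named logstash-YYYY.MM.dd (the default naming of older Logstash Elasticsearch outputs, which may not match your setup) and that Elasticsearch listens on localhost:9200. The function names are hypothetical helpers, not part of any tool.

```shell
#!/bin/sh
# Sketch: inspect indices and delete a single day's index instead of everything.
# Assumes daily indices named logstash-YYYY.MM.dd and Elasticsearch on localhost:9200.

ES_URL="http://localhost:9200"

# Build the index name for a date given as YYYY-MM-DD.
index_for_date() {
    printf 'logstash-%s\n' "$(printf '%s' "$1" | sed 's/-/./g')"
}

# List all indices together with their document counts and sizes.
list_indices() {
    curl -s "$ES_URL/_cat/indices?v"
}

# Delete only the index that holds the logs of the given day.
delete_index_for_date() {
    curl -s -XDELETE "$ES_URL/$(index_for_date "$1")"
}
```

For example, after checking the output of list_indices, calling delete_index_for_date 2015-01-31 would remove only the index logstash-2015.01.31.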

You can read here to optimize elastic search for memory.

Kibana

You don't have to do much in Kibana's configuration file; just make sure it's listening on the correct port number. You can find the configuration file at /opt/kibana/config/kibana.yml

Now open your browser and enter the IP address and port number of the machine where Kibana is set up, or whatever URL you specified in kibana.yml.

Now you can see something like this:

[Image: configure elasticsearch]

You can now explore a lot of things from here: visualize logs, compare fields, and more. Go ahead and check out the different settings in Kibana.

That’s it for now.

You are welcome to let me know once you have configured the Elasticsearch, Logstash, Kibana combo pack.

Cheerio 🙂

Categories
ab-testing caching configuration hhvm loader.io mastering-nginx microcaching nginx optimization php5-fpm

Super Nginx Configuration : Feeling the 100000 r/s capability of Nginx.

In this post, I will share how I managed to reduce the number of servers for our web service tenfold while handling three times more traffic. In effect, I could increase our system's efficiency 30 times. For people who have struggled with scaling, I tell you, this much improvement is tremendous. We are able to save a lot of money, and there are still more improvements that can be made. We have all seen plenty of posts and articles on how to tweak your Nginx configuration to make it serve a lot of requests; now I have actually felt it.

Earlier, for 8000 users/sec, we used to have 23-25 servers running. Now we have served more than 25000 users/sec with only four machines (we also tried one server for around 12000 users/sec, but CPU utilization was a little high). We have even served around 45000 users/sec with just 6 servers. Also, the machines used earlier were of a higher configuration than the current ones.

How did we modify our nginx configuration?

Earlier, we used a normal configuration with most of the parameters at their default or recommended values. My team told me that it used to handle a lot of requests, but something had happened recently that gave all of us nightmares. We were trying hard to find the problem in every possible area. We improved our code, and we tried modifying nginx parameter values and db parameter values. We were still running on 22 machines for not much load.

Honestly, I didn't care much about the old configuration anymore, and wanted to write a whole new one. I started with the book Mastering Nginx by Dimitri Aivaliotis. This is a great book. I read mainly the 2nd, 4th and 5th chapters. Then I also read these articles:

http://tweaked.io/guide/nginx/

http://www.aosabook.org/en/nginx.html

https://www.nginx.com/blog/inside-nginx-how-we-designed-for-performance-scale/

http://highscalability.com/blog/2014/4/30/10-tips-for-optimizing-nginx-and-php-fpm-for-high-traffic-si.html

I started writing my own Nginx configuration. I used the loader.io and ab testing tools all along the way to find the right values for the parameters. I increased the timeout values from 15s to 300s, and I altered other values as well. Then I implemented caching (microcaching) in Nginx, along with caching of open file descriptors. I also moved the cache storage into RAM. Then I switched the connections to the upstream servers from Unix sockets to TCP. I observed a drastic improvement in the loader.io and ab tests. Then we went live with the changed configuration, and things really had improved a lot. Surprisingly, we could handle the same load with just four machines (we even tried a single machine, but with higher CPU usage).
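The kind of ab run used for this tuning can be scripted so that each configuration change is measured the same way. This is only a rough sketch under stated assumptions: TARGET_URL, the request count, and the concurrency levels are hypothetical values rather than the ones used here, and the function names are made up for illustration. It uses the -n (number of requests), -c (concurrency) and -k (keep-alive) options of ab and reads the "Requests per second" line from its report.

```shell
#!/bin/sh
# Sketch: compare throughput at several concurrency levels with ab (ApacheBench).
# TARGET_URL and REQUESTS are hypothetical values; adjust them to your setup.

TARGET_URL="http://localhost/"
REQUESTS=10000

# Run ab once at the given concurrency and print only the requests-per-second figure.
requests_per_second() {
    ab -k -n "$REQUESTS" -c "$1" "$TARGET_URL" 2>/dev/null \
        | awk '/^Requests per second:/ { print $4 }'
}

# Print one result line per concurrency level passed as an argument.
benchmark() {
    for concurrency in "$@"; do
        echo "c=$concurrency rps=$(requests_per_second "$concurrency")"
    done
}
```

Running benchmark 50 100 200 before and after a configuration change gives a quick side-by-side view of its effect at each concurrency level.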

We were so happy, we were shouting like anything.

Further, I replaced php5-fpm with hhvm, which has actually brought the number of machines down to just one.

Now we are all flying, and we still see more improvements to be made.

Thank you, everyone. 🙂