
ELK : Configure Elasticsearch, Logstash, Kibana to visualize and analyze the logs.

This is about how to configure Elasticsearch, Logstash and Kibana together to visualize and analyze your logs. There are situations when you want to see what's happening on your server, so you switch on your access log. Of course you can tail -f it or grep through it, but I can tell you that analyzing logs straight from the log file is really cumbersome.

What if I told you there is a super cool tool, Kibana, which lets you analyze your logs and gives you all kinds of information in just a few clicks? Kibana is a GUI tool designed for analyzing large log files, and when used properly with Logstash and Elasticsearch it can be a boon for developers.

Now, Logstash is a tool used to move logs from one file or storage (such as S3) to another. Elasticsearch is a search server used to store the logs in a structured format.

Here is the picture in short: Logstash will pull the logs from S3, format them and send them to Elasticsearch; Kibana will fetch the logs from Elasticsearch and display them to you. With that, let's get to the thing that actually matters, the code. I assume you have everything installed (the ELK stack) and that S3 is the source of our logs (the logrotate post below explains why).

So first we configure Logstash. A Logstash config has three main blocks: input, filter and output. In the input block we specify the source of the logs, in the filter block we format the logs the way we want them stored in Elasticsearch, and in the output block we specify the destination.

Code :

Open up a terminal:

nano /etc/logstash/conf.d/logstash.conf

and edit it to contain the following:

input {
  s3 {
    bucket => "bucket_name_containing_logs"
    credentials => ["ACCESS_KEY", "SECRET_KEY"]
    region_endpoint => "whichever_probably_us-east-1"
    codec => json {
      charset => "ISO-8859-1"
    }
  }
}

filter {
  grok {
    match => { "message" => "grok_pattern" }
  }
}

output {
  # stdout {
  #   codec => json
  # }
  elasticsearch_http {
    host => "localhost"
    port => "9200"
    codec => json {
      charset => "ISO-8859-1"
    }
  }
}

Explanation :

In the input block, we specify that our logs come from S3 and provide the credentials needed to access the bucket. We also specify the charset for the input.

In the filter block, we use grok to parse each log line into custom fields (which show up in Kibana) by writing a pattern that matches the input logs. You can use the Grok Debugger to test your pattern against a sample input.
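For example, if your incoming lines are in the standard nginx/Apache combined access-log format, a filter block along these lines should do (a sketch, assuming the built-in COMBINEDAPACHELOG pattern fits your logs; otherwise build your own pattern in the debugger):

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}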

In the output block, we specify Elasticsearch as the destination, along with its address and the charset for the output.

You can uncomment the stdout block in the output section to also print events to the console, which is handy for debugging.
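If you want to sanity-check the file before restarting Logstash, a config test along these lines is handy (a sketch; the binary path and exact flags depend on your Logstash version):

/opt/logstash/bin/logstash agent --configtest -f /etc/logstash/conf.d/logstash.conf
# newer versions drop "agent" and use: logstash --config.test_and_exit -f /etc/logstash/conf.d/logstash.conf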

Elasticsearch

We don't need to change anything in the Elasticsearch configuration for now, though if you are curious you can find it at /etc/elasticsearch/elasticsearch.yml. One thing to keep in mind is that an ELK setup needs a reasonably powerful machine; otherwise you may run into all sorts of errors once Elasticsearch fills up. One workaround is to clear it out whenever it gets full.

The following command removes all indices and brings Elasticsearch back to its initial state.

curl -XDELETE 'http://localhost:9200/_all/'
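If you would rather not wipe everything, you can first check how much space each index takes and then delete only the older ones (a sketch, assuming the default logstash-YYYY.MM.dd index naming and that wildcard deletes are allowed on your cluster):

curl 'http://localhost:9200/_cat/indices?v'
curl -XDELETE 'http://localhost:9200/logstash-2015.01.*'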

You can also read up on how to optimize Elasticsearch's memory usage.
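One common tweak, for instance, is pinning the Elasticsearch heap to roughly half the machine's RAM (a sketch, assuming the Debian/Ubuntu package layout of older Elasticsearch releases; newer releases set -Xms/-Xmx in config/jvm.options instead):

# /etc/default/elasticsearch
ES_HEAP_SIZE=2g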

Kibana

You don't have to do much in its configuration file either; just make sure it is listening on the correct port. You can find the configuration file at /opt/kibana/config/kibana.yml.
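For reference, the relevant settings look roughly like this (a sketch for Kibana 4, which the /opt/kibana path suggests; later versions rename the keys to server.port, server.host and elasticsearch.hosts):

# /opt/kibana/config/kibana.yml
port: 5601
host: "0.0.0.0"
elasticsearch_url: "http://localhost:9200"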

Now open a browser and go to the IP of the machine where Kibana is set up, plus the port number (or whatever host and port you specified in kibana.yml).

You should now see something like this: [screenshot: the Kibana dashboard]

From here you can explore a lot: visualize logs, compare fields, and so on. Go ahead and check out the different settings in Kibana.

That’s it for now.

You're welcome, and do let me know how it goes when you configure the Elasticsearch, Logstash and Kibana combo.

Cheerio 🙂


Logrotate : Switch on your access log

This is about how to configure the logrotate tool for your access/error log.
A log is a record of some kind of activity, usually kept to track or analyze that activity. An access log is a record of the activity on your server: what requests are made, how the server responds to particular requests, who accesses the server, and so on. Storing access logs is a good habit and very important. So why do we often keep it off (in nginx, access_log off)? Because on a busy production server the access log grows fast and can consume the whole disk, with catastrophic consequences.
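For reference, switching the access log on is a single directive in the nginx config (a sketch using nginx's predefined combined format):

# inside the http or server block of /etc/nginx/nginx.conf
access_log /var/log/nginx/access.log combined;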

So it is a bad idea to keep your access log only on your machine. Amazon's S3 (Simple Storage Service) is quite cheap for this kind of thing; it just stores data, with no processing. So we can use it to store the access logs.

This article is about how to configure logrotate to rotate the logs on your server out to S3. We will use Ubuntu's logrotate tool, which is built precisely for this purpose: rotating large log files periodically. We also use the s3cmd tool to sync the files from the server to S3.

Enough chit-chat, let's see some real code. The technologies involved are Ubuntu 14.04, nginx (some stable version, I don't remember and am not in the mood to check), S3 and s3cmd.
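Before the rotation script can push anything to S3, s3cmd needs to be installed and configured with your AWS keys (a sketch for Ubuntu; run the configuration as the user logrotate runs as, typically root, since s3cmd reads that user's ~/.s3cfg):

apt-get install s3cmd
s3cmd --configure               # prompts for ACCESS_KEY and SECRET_KEY, writes ~/.s3cfg
s3cmd ls s3://s3_bucket_name/   # quick check that the bucket is reachable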

Open up your terminal:

nano /etc/logrotate.d/nginx

and change the file according to the following:

/var/log/nginx/access.log {
    copytruncate
    size 100k
    missingok
    rotate 10
    compress
    delaycompress
    notifempty
    create 0640 www-data adm
    sharedscripts
    dateext
    dateformat -%Y-%m-%d-%s
    postrotate
        INSTANCE=`curl --silent http://instance-data/latest/meta-data/local-ipv4`
        /usr/bin/s3cmd sync /var/log/nginx/access.*.gz s3://s3_bucket_name/${INSTANCE}/
    endscript
}

Explanation : I will focus on the main directives.

/var/log/nginx/access.log – specify the log file to be rotated.

copytruncate – used in order to avoid reloading nginx after each rotate

size 100k – logs are rotated when the size of the logs exceeds 100k

dateext – append date at the end of the file name in s3

dateformat – specify the date format

You can skip the first line of the postrotate script if you are not on EC2; it only fetches the instance's private IP so that each instance's logs land in their own folder in the bucket.

/usr/bin/s3cmd – syncs the compressed, rotated access logs to the S3 bucket.

Now you can rotate your log file with the following command,

logrotate /etc/logrotate.d/nginx;
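If nothing happens because the log has not reached 100k yet, you can dry-run or force a rotation to verify the setup (standard logrotate flags):

logrotate -d /etc/logrotate.d/nginx   # debug/dry-run: shows what would be done
logrotate -f /etc/logrotate.d/nginx   # force a rotation regardless of size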

We can automate the rotation by scheduling a cron job at a fixed interval.

Open up the terminal:

nano /etc/cron.d/logrotate

and paste:

*/1 * * * * root /usr/sbin/logrotate /etc/logrotate.conf

Save the file and exit the editor; cron picks up new entries in /etc/cron.d on its own.

Now you have a cron job that runs every minute and rotates the logs whenever they exceed 100k.

Cron cannot schedule anything at sub-minute intervals; for that you can write a small bash script, which is very easy (see the sketch below).
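For example, a tiny wrapper like this (a sketch; adjust the sleep to taste), started once a minute from cron instead of calling logrotate directly, effectively gives you a 30-second rotation schedule:

#!/bin/bash
# run logrotate twice within the minute
/usr/sbin/logrotate /etc/logrotate.conf
sleep 30
/usr/sbin/logrotate /etc/logrotate.conf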

So, yeah, by now you should be able to configure logrotate to rotate your access log to S3 every minute.

Ok Bye.

Note: don't put any spaces around the = in the INSTANCE=`command` line, otherwise the shell assignment won't work.

Ask me if you have any queries. 🙂


The Dark Knight : How to configure nginx to serve static pages of your website from Amazon S3 storage during a 502 Bad Gateway error?

Nginx is widely used as a front-end proxy to a PHP back-end, which means that when a client makes a request, nginx receives it first and then passes it to the PHP (back-end) server for processing. Nginx can comfortably handle on the order of 10,000 concurrent connections.

Now, let's talk about the topic.
There are situations when you may receive server errors (502 Bad Gateway, 503 Service Unavailable, or the like). They are serious nightmares if you have ever faced them, and they create a bad impression with your users. I have faced them, so I can tell you it is really embarrassing. You can lose a lot of your users, and other disasters might follow.

The cause of such errors can range from bad code to high load (high load is a good thing, but not being able to handle it is not).

This is one solution, where you configure your system to save yourself from that embarrassment. I named it the Dark Knight because you shouldn't need it, but it still guards your website.

Solution:

Suppose the URL to back up has the following form: www.example.com/a/b.

Suppose the static page for www.example.com/a/b is stored in S3 as bucket/a/b/a_b.html. The URL for it would then be https://s3.amazonaws.com/bucket/a/b/a_b.html.
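Uploading the pre-rendered page to that key with s3cmd (as used in the logrotate post above) would look something like this, using the example bucket and path from above:

s3cmd put a_b.html s3://bucket/a/b/a_b.html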

Now the following changes can be added to your nginx configuration:

location @static {
    rewrite ^ $request_uri;
    rewrite /(.*)/(.*) /bucket/$1/$2/$1_$2.html;
    proxy_pass https://s3.amazonaws.com;
}

location /index.php {
    error_page 502 =200 @static;
    fastcgi_intercept_errors on;
    # body
}

Explanation:

The usual cause of a 502 error is PHP failing to handle any more requests.

fastcgi_intercept_errors tells nginx to intercept error responses coming back from PHP, and error_page redirects the request to another location when a particular error code (502 here) is returned, changing the response code sent to the client to 200.

Now in @static location,

At this point $uri is /index.php and $request_uri is /a/b, so we rewrite it into the form of our static page's URL in the S3 bucket.

The first rewrite changes $uri from /index.php -> /a/b.

The second rewrite changes $uri from /a/b -> /bucket/a/b/a_b.html.

The proxy_pass directive then fetches the content from the resulting URL (the static page's URL) and returns it without changing the URL the client sees.
Note that you can't append a URI after the URL in proxy_pass inside a named location (nginx will refuse to start).
Also make sure the S3 bucket grants the proper read permissions for access, for example as sketched below.
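With s3cmd, for instance, you can make the backup pages publicly readable in one go (a sketch; a bucket policy restricted to your own infrastructure would be the safer option):

s3cmd setacl --acl-public --recursive s3://bucket/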

Cheers. You have now learnt how to configure nginx to handle a 502 error. Similar things can be done for other server errors.

There are a few related questions on Stack Overflow as well.