Log file analysis helps you understand how search engines crawl a website and how that crawling affects SEO. These insights help you improve your crawlability and SEO performance.
With these data, you can analyze crawl behavior and determine some interesting metrics like:
The great thing is that you can also do it for free: OnCrawl offers an open-source log analyzer.
It will help you spot:
How does it work?
Install Docker Toolbox.
Launch the Docker Quickstart Terminal.
Copy the IP address it displays, 192.168.99.100. You will need it later.
Then download the oncrawl-elk release: https://github.com/cogniteev/oncrawl-elk/archive/1.1.zip
Add these lines in the terminal to create a directory and unzip the file:
And then, add:
Docker Compose will download all necessary images from Docker Hub, which may take a few minutes. Once the Docker container has started, enter the following address in your browser: http://DOCKER-IP:9000. Make sure to replace DOCKER-IP with the IP address you copied earlier.
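Because the images can take a while to download, it may be handy to poll the dashboard address instead of refreshing the browser by hand. The following is only a minimal sketch: the helper names `dashboard_url` and `wait_for_dashboard`, the retry count and the delay are illustrative assumptions, not part of oncrawl-elk.

```python
import http.client
import time
import urllib.request


def dashboard_url(docker_ip, port=9000):
    """Build the dashboard address from the Docker IP copied earlier."""
    return f"http://{docker_ip}:{port}"


def wait_for_dashboard(url, attempts=30, delay=10.0):
    """Return True as soon as one HTTP request to url succeeds, else False.

    attempts and delay are arbitrary defaults chosen for illustration.
    """
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5):
                return True
        except (OSError, http.client.HTTPException):
            # Not reachable yet (connection refused, timeout, HTTP error...).
            if attempt < attempts - 1:
                time.sleep(delay)
    return False
```

For example, `wait_for_dashboard(dashboard_url("192.168.99.100"))` would keep trying for roughly five minutes before giving up.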
You should see the OnCrawl-ELK dashboard, but there are no data yet. Let’s get some data to analyze.
Importing data is as easy as copying access log files to the right folder. Logstash automatically indexes any file matching logs/apache/*.log or logs/nginx/*.log.
If your web server is Apache or Nginx, make sure it uses the combined log format. Log lines should look like this:
127.0.0.1 - - [28/Aug/2015:06:45:41 +0200] "GET /apache_pb.gif HTTP/1.0" 200 2326 "http://www.example.com/start.html" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
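Before importing, you may want to check that your lines really are in the combined format. Here is a rough sketch using a regular expression. The pattern and the `parse_combined` helper are assumptions written for this illustration, not the grok pattern Logstash itself uses, and the pattern does not handle escaped quotes inside the referer or user agent.

```python
import re

# Combined format: host ident user [time] "request" status size "referer" "user-agent"
COMBINED_RE = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def parse_combined(line):
    """Return a dict of fields if the line is in combined format, else None."""
    match = COMBINED_RE.match(line.strip())
    return match.groupdict() if match else None


sample = (
    '127.0.0.1 - - [28/Aug/2015:06:45:41 +0200] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
)
fields = parse_combined(sample)
```

A line in the shorter "common" format, which has no referer or user agent, makes `parse_combined` return `None`. That is a hint that the server's log format needs changing before the import.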
Drop your .log files into the logs/apache or logs/nginx directory accordingly.
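If you have many files to drop, the copy can be scripted. This is a minimal sketch: `drop_logs` and its parameters are hypothetical names, and `project_dir` is assumed to be the folder where you unzipped oncrawl-elk. Only files ending in .log are copied, since that is the pattern Logstash watches.

```python
import shutil
from pathlib import Path


def drop_logs(source_dir, project_dir, server="apache"):
    """Copy every *.log file from source_dir into <project_dir>/logs/<server>.

    Returns the sorted list of copied file names.
    """
    if server not in ("apache", "nginx"):
        raise ValueError("server must be 'apache' or 'nginx'")
    target = Path(project_dir) / "logs" / server
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for log_file in sorted(Path(source_dir).glob("*.log")):
        shutil.copy2(log_file, target / log_file.name)
        copied.append(log_file.name)
    return copied
```

Rotated files such as access.log.1 do not match *.log, so they would need renaming first.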
Go back to http://DOCKER-IP:9000. You should now see figures and graphs. Congrats!
You can also combine these data with crawl data to get a complete view of your SEO performance. You will be able to detect active orphan pages, check crawl ratio by depth or page group, and much more. To learn more about combined analysis, you can check this page.