Alex Johnson
alex (at) white.net | @alex_cestrian
USING SERVER LOGS TO YOUR ADVANTAGE
@alex_cestrian #OptimiseOxford
What are server logs?
A server log is a simple text file which records activity on a server.
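What a single entry contains varies by server setup, but a typical line in the common Apache/Nginx "combined" format records the client IP, timestamp, request, status code, response size, referrer and user agent. The values below are made up for illustration:

```
66.249.66.1 - - [12/Jun/2016:06:25:24 +0100] "GET /dresses/summer HTTP/1.1" 200 8452 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
```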
So why bother looking at server logs?
There is only one resource that tells you what search engines are looking for on a domain: web server logs…
…including stuff they found 13 years ago.
How do we analyse all that data?
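Dedicated log analysis tools exist, but a short script goes a long way. As a minimal sketch, assuming logs in the combined format shown earlier, you might parse each line in Python and keep only GoogleBot's requests (access.log is a placeholder path):

```python
import re

# Matches the Apache/Nginx "combined" log format shown earlier.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def googlebot_hits(log_path):
    """Yield (url, status) for every request whose user agent claims to be GoogleBot."""
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            match = LOG_PATTERN.match(line)
            if match and "Googlebot" in match.group("agent"):
                yield match.group("url"), int(match.group("status"))

if __name__ == "__main__":
    for url, status in googlebot_hits("access.log"):
        print(status, url)
```

Note that anyone can claim to be GoogleBot in the user agent string, so verify suspicious hits with a reverse DNS lookup before trusting them.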
2 SCENARIOS
Scenario 1: IDENTIFY ORPHAN PAGES
An orphan is a page that is not linked to by another page on the site.
[Diagram: the Homepage links to Dresses, Skirts and Our offers; the seasonal Summer 2016 Offers page is no longer linked from anywhere, leaving it orphaned.]
Why are orphan pages bad?
• There may be a lot of them, and they may be competing with your ‘live’ content
• They waste GoogleBot’s crawl budget for your domain
So how do we find orphan pages using log files?
Upload a crawl of your website (from Screaming Frog, DeepCrawl etc)
Look for URLs in the logs that return a 200 status code ✅ but don't appear in the crawl of your site
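As a minimal sketch of that comparison, assuming the crawl is exported as a plain list of URLs (a hypothetical crawl_urls.txt, one URL per line) and reusing the googlebot_hits() helper from the earlier parsing sketch:

```python
from urllib.parse import urlsplit

def load_crawled_paths(crawl_export):
    """Paths the site crawler reached (hypothetical one-URL-per-line export)."""
    with open(crawl_export, encoding="utf-8") as handle:
        return {urlsplit(line.strip()).path for line in handle if line.strip()}

def orphan_candidates(log_path, crawl_export):
    """URLs GoogleBot fetched with a 200 that the crawler never found."""
    crawled = load_crawled_paths(crawl_export)
    # googlebot_hits() is the log-parsing helper sketched earlier
    fetched_ok = {urlsplit(url).path for url, status in googlebot_hits(log_path) if status == 200}
    return sorted(fetched_ok - crawled)

for path in orphan_candidates("access.log", "crawl_urls.txt"):
    print(path)
```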
What to do with the orphan pages you find:
• Redundant content of little value → 404/410 status code
• Relevant, valuable but out-of-date content → 301 redirect to a relevant live page
• Useful content that was orphaned accidentally → re-attach the page to the website
If GoogleBot is wasting lots of time in a specific folder full of orphan pages that hold no value, block it via robots.txt
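For example, if the logs show GoogleBot hammering a folder of valueless pages, a couple of lines in robots.txt stop the waste (the folder name here is just an illustration):

```
User-agent: Googlebot
Disallow: /summer-2016-offers/
```

Bear in mind robots.txt stops crawling, not indexing; pages already in the index may still appear in results.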
Scenario 2: IMPROVING CRAWL EFFICIENCY
Find where GoogleBot is wasting time
Find parameter-driven pages
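As a minimal sketch of one way to surface them, again reusing the googlebot_hits() helper from the earlier sketch: keep only requests with a query string and count hits per parameter name.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit

def parameter_hits(log_path):
    """Count GoogleBot requests per query-string parameter name."""
    counts = Counter()
    for url, _status in googlebot_hits(log_path):  # helper from the earlier sketch
        for name, _value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
            counts[name] += 1
    return counts

for name, hits in parameter_hits("access.log").most_common():
    print(f"{hits:>8}  ?{name}=")
```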
Block GoogleBot from crawling these URLs
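Google honours * and $ wildcards in robots.txt, so if a parameter adds nothing for search (a hypothetical sort parameter, say), rules like these keep GoogleBot out of those variants:

```
User-agent: Googlebot
Disallow: /*?sort=
Disallow: /*&sort=
```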
Find infrequently visited pages
Order by number of events: low to high
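A minimal sketch of that ordering, built on the same parsing helper: count GoogleBot events per path and list the least-crawled URLs first.

```python
from collections import Counter
from urllib.parse import urlsplit

def crawl_frequency(log_path):
    """GoogleBot request counts per URL path, least-crawled first."""
    counts = Counter(urlsplit(url).path for url, _status in googlebot_hits(log_path))
    return sorted(counts.items(), key=lambda item: item[1])

for path, events in crawl_frequency("access.log")[:50]:
    print(f"{events:>6}  {path}")
```

For each URL near the top of that list, ask the questions below.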
• Is this URL in the xml sitemap?
• Is the page too deep within the architecture?
• Is internal linking to this page optimal?
• Are links to this page travelling through multiple redirects?
• Can GoogleBot actually parse the links pointing to this page?
Find slow-loading pages
Look at all URLs and filter by average response time
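Standard combined-format logs don't record how long a response took, so this sketch assumes the log format has been extended with a time-taken field (%D in Apache, $request_time in Nginx) and that your parser exposes it as (path, duration) pairs; the averaging itself is then straightforward:

```python
from collections import defaultdict
from statistics import mean

def slowest_urls(records):
    """records: iterable of (path, duration) pairs from an extended log parser."""
    timings = defaultdict(list)
    for path, duration in records:
        timings[path].append(duration)
    averages = {path: mean(values) for path, values in timings.items()}
    # Slowest URLs first
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)

# e.g. slowest_urls(parsed_records)[:20] lists the 20 slowest URLs on average
```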
If time taken is consistently high, you need to look at how you can reduce the load time of the page
“See what GoogleBot is actually consuming. Improve GoogleBot’s diet.” (Oliver Mason, Brighton SEO 2016)
THANK YOU
ALEX JOHNSON | @alex_cestrian