TechSEO Boost 2017: Everything that Can Go Wrong, Will Go Wrong

Patrick Stox | @patrickstox #TechSEOBoost

Who is Patrick Stox

I write, mainly for Search Engine Land

I speak at some conferences like this one, SMX, Pubcon, etc.

Organizer for the Raleigh SEO Meetup (most successful in US)

We also run a conference, the Raleigh SEO Conference

Also Beer & SEO Meetup (because beer)

2017 US Search Awards Judge, 2017 UK Search Awards Judge

Some of you may know me from the Wix SEO Hero contest

(I was the one that got disqualified https://beanseohero.com/)

But First, The Most Important Takeaway

Bing processes JavaScript

The Typical Enterprise Environment

Multiple Points Of Failure

Multiple infrastructures

Multiple CMS

Middleware systems

Error handling

Redirect handling

Tags in HTTP Header, <head>, Sitemap

Coming from CMS, server, CDN, JavaScript, Tags, Theme, Plugin/Module

Real stuff.

Duplicate International Pages

Hreflang correct, country targeting set in GSC.

Google folded the duplicates together. An info:domain/page query for the other versions showed the canonicalized version.

This meant the wrong country-language versions showed in Google.

Solution: rework the whole strategy to have fewer, stronger pages and localize where needed.

Error Page Cleanup Project

Mapped all the redirects fine.

Put them in meta refreshes (not ideal to begin with).

Someone else had planned to put them in the HTTP header and had set a Location: header with no destination.

Users ended up in the correct location, but Google was redirected to nothing before it could see the meta refreshes.

Solution: move the redirects to the HTTP Header.
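
A minimal sketch of how this failure could be caught in an audit, assuming you already have the status code, headers, and body of a response. The function name `classify_redirect` and its return labels are hypothetical, and the meta refresh pattern is deliberately simplified.

```python
import re

# Simplified pattern for <meta http-equiv="refresh" ...>; real markup varies.
META_REFRESH = re.compile(
    r'<meta[^>]+http-equiv=["\']?refresh["\']?[^>]*>', re.IGNORECASE
)


def classify_redirect(status, headers, body):
    """Label how a response redirects, flagging an empty Location header."""
    # Header names are case-insensitive, so normalise them before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    if "location" in lowered:
        if not lowered["location"].strip():
            return "empty-location-header"
        if 300 <= status < 400:
            return "http-redirect"
    if META_REFRESH.search(body):
        return "meta-refresh"
    return "no-redirect"
```

A response that carries both an empty Location header and a working meta refresh is reported as `empty-location-header`, which is the case described above.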

Robots.txt Blocking Everything

Intermittent issue, in that sometimes we were blocking everything and other times everything was normal.

Turned out to be a shared cache.

Solution: Disabled caching of .txt files at CDN.
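
An intermittent problem like this is easier to spot with a repeatable check. The sketch below uses Python's standard `urllib.robotparser` on a robots.txt body you have already fetched; the helper name `blocks_everything` and the choice of "/" as the test path are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser


def blocks_everything(robots_txt, user_agent="Googlebot"):
    """Return True if this robots.txt body disallows the site root."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # For this sketch, a disallowed root is treated as "blocking everything".
    return not parser.can_fetch(user_agent, "/")
```

Running it against several fetches of the same file taken a few minutes apart would show whether different copies are being served.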

Removing Subdomain From Index

They added noindex on the pages, but blocked crawlers with a disallow in robots.txt.

Bots couldn’t crawl the page to see the noindex.

Solution: remove the disallow.
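
The conflict can be sketched as a check that combines both signals. This assumes you have the robots.txt body and the page HTML in hand; `noindex_is_unreachable` is a hypothetical helper, and the meta robots pattern is simplified (it expects `name` before `content`).

```python
import re
from urllib.robotparser import RobotFileParser

# Simplified pattern for <meta name="robots" content="...noindex...">.
NOINDEX_META = re.compile(
    r'<meta[^>]+name=["\']robots["\'][^>]+content=["\'][^"\']*noindex',
    re.IGNORECASE,
)


def noindex_is_unreachable(robots_txt, path, html, user_agent="Googlebot"):
    """True when a page carries noindex but robots.txt stops bots seeing it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    has_noindex = bool(NOINDEX_META.search(html))
    crawlable = parser.can_fetch(user_agent, path)
    return has_noindex and not crawlable
```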

Staging Site Indexed

There’s a canonical to the main website but then they blocked it from being crawled in robots.txt as well.

It’s the same environment, so we are limited in what we can do.

We should use authentication or IP whitelisting, but can’t in this case until we can split it.

Solution TBD. Options: wait until we can split; let it be crawled (they’re worried about crawl budget); remove it in GSC; noindex in robots.txt (unsupported and could cause issues with the canonical); or ignore it.

No Game In Robots.txt

I wanted to do this, and you can. In fact, you can put a whole website in there:

http://ohgm.co.uk/host-a-website-inside-robots-txt/

JavaScript Redirect Calling JS File

Rather than use the page location, this system processed the redirect in a referenced JavaScript file. This is the only time Ayima Redirect Path has ever failed me.

Failed To Redirect Index File

A hard coded index.php file was live after a migration and resulted in a huge drop in rankings for an important page. This version was indexed along with the newer version of the page, but the rankings dropped off.

Index files usually have special rules for redirecting, but it’s also important to redirect all versions of the URL.

URL Parameter Issue

One setting for a parameter in GSC was set to tell Google it just tracks usage, when in fact each page had different content.

This knocked several hundred thousand pages out of the index.

Links Followed In Redirect Chains

Looked into many long redirect chains to see whether the links reported in GSC included any pointing at previous URL versions, marked as “via intermediate link”. Standards recommend no more than 5 hops.

Found one as long as 10 hops that was followed and many that cut off at 6 hops.

Working on processing each in our redirect engine to update to final location.
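
Updating each redirect to its final location can be sketched as follows. This is an illustration only, not the redirect engine mentioned above: it assumes the redirects are available as a plain source-to-target mapping, and `flatten_redirects` and its `max_hops` default are hypothetical names and values.

```python
def flatten_redirects(redirect_map, max_hops=5):
    """Point every source URL straight at its final destination.

    redirect_map maps a source URL to its immediate target. Loops are
    reported and skipped; chains longer than max_hops are flattened and
    also reported.
    """
    flattened, problems = {}, {}
    for source in redirect_map:
        seen = {source}
        target = redirect_map[source]
        hops = 1
        while target in redirect_map:
            if target in seen:
                problems[source] = "loop"
                break
            seen.add(target)
            target = redirect_map[target]
            hops += 1
        else:
            # The chain ended on a URL that does not redirect any further.
            flattened[source] = target
            if hops > max_hops:
                problems[source] = f"{hops} hops"
    return flattened, problems
```

For a chain `/a` to `/b` to `/c` to `/d`, every source ends up pointing directly at `/d`.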

English Text In SERPs – Non-English Pages

Google only showed English text for what should have been other languages.

Investigation showed Googlebot was being redirected, and the redirect turned out to be based on browser language.

Solution (temporary): chose not to redirect Googlebot. This system was going away in 2 months and it wasn’t worth the fight for a more permanent solution.
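
A rough sketch of what a language redirect with a crawler exemption might look like. Everything here is an assumption for illustration: the function name, the `available` language list, matching Googlebot by a user-agent token, and reading only the first entry of the Accept-Language header (q-values are ignored).

```python
def language_redirect_target(accept_language, user_agent,
                             available=("en", "fr", "de")):
    """Return a language code to redirect to, or None to serve the page as is."""
    # Assumption: the crawler is recognised by a user-agent token and never redirected.
    if "googlebot" in user_agent.lower():
        return None
    if not accept_language:
        return None
    # Take the first language in the header, e.g. "fr-CA,fr;q=0.9" -> "fr".
    first = accept_language.split(",")[0].split(";")[0].strip().lower()
    primary = first.split("-")[0]
    return primary if primary in available else None
```

As noted above, this kind of exemption was a temporary measure for a system that was going away, not a recommended long-term design.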

Canadian English Page – SERP In French

We were seeing French language SERPs for a Canadian English page.

Going to the page we saw English, but Google’s cache was in French.

React JS

URL: us/en-us?technologyTopics%5B0%5D%5B0%5D=cat.topic%3AITInfrastructure

Links: href="javascript:void(0);"

So unfriendly URLs and uncrawlable links.

React-router created patterns to match /{country}/{locale}/{category}/{slug}

React JS kind of gives you full control of everything, so just help educate the devs on what needs to be done.
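
One way to show developers the link problem is a small audit of the rendered HTML. The sketch below uses the standard library's `html.parser`; the class and function names are hypothetical, and the rules for what counts as uncrawlable (missing, empty, "#", or `javascript:` hrefs) are a simplifying assumption.

```python
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Collect anchors whose href gives a crawler nothing to follow."""

    def __init__(self):
        super().__init__()
        self.uncrawlable = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = (dict(attrs).get("href") or "").strip()
        # Missing, empty, fragment-only, and javascript: hrefs carry no URL.
        if not href or href == "#" or href.lower().startswith("javascript:"):
            self.uncrawlable.append(href)


def find_uncrawlable_links(html):
    """Return the href values (possibly empty) of links a crawler cannot follow."""
    audit = LinkAudit()
    audit.feed(html)
    return audit.uncrawlable
```

It needs to run on the rendered DOM (for example, markup copied from inspect element), since view source on a React page may not contain the links at all.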

HTTP To HTTPS – Content Security Policy

Everyone read https://searchengineland.com/http-https-seos-guide-securing-website-246940?

Add a Content Security Policy (CSP), which you can set to upgrade-insecure-requests to automatically upgrade insecure requests and fix your mixed content issues.

Still may have issues with canonical tags and internal links.
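
A minimal sketch of adding the directive to a set of response headers, assuming the headers are held in a plain dict. The helper name is hypothetical; `Content-Security-Policy` and `upgrade-insecure-requests` are the real header and directive names.

```python
def with_upgrade_insecure_requests(headers):
    """Return a copy of the response headers with the CSP directive added."""
    updated = dict(headers)
    existing = updated.get("Content-Security-Policy", "").strip().rstrip(";")
    directive = "upgrade-insecure-requests"
    if directive not in existing:
        # Append to an existing policy rather than overwriting it.
        updated["Content-Security-Policy"] = (
            f"{existing}; {directive}" if existing else directive
        )
    return updated
```

This only affects how the browser requests resources. Canonical tags and internal links written with http:// still need to be fixed at the source.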

Also Referrer Policy

https://searchengineland.com/need-know-referrer-policy-276185

This lets you control how the referrer is sent and can pass it when going from HTTPS to HTTP websites, which may be important for some websites to show they sent traffic out.

We launched a new section set to no-referrer, which dropped the referrer even for internal traffic. It’s now no-referrer-when-downgrade, which still drops the referrer when going to an insecure page. It’s more of a problem for us because, unlike GA, we separate no-referrer and direct traffic in our analytics.
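
The difference between the two policy values mentioned above can be sketched as a small function. It is a simplification that compares URL schemes only and covers just these two values; the function name is hypothetical.

```python
from urllib.parse import urlsplit


def referrer_is_sent(policy, from_url, to_url):
    """Say whether a Referer header is sent, for two of the policy values."""
    if policy == "no-referrer":
        return False  # dropped for every request, internal ones included
    if policy == "no-referrer-when-downgrade":
        downgrade = (
            urlsplit(from_url).scheme == "https"
            and urlsplit(to_url).scheme != "https"
        )
        return not downgrade
    raise ValueError(f"policy not covered by this sketch: {policy}")
```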

Changed Links In CMS To Relative

Yep, all of them. Including the canonicals. Issue was fixed but pages need to be re-published and this is an older CMS without much support.

Why Are 404s Hard?

Throwing 200 instead of 404 on error pages.

Just blanket redirect everything that should 404.

^These are wrong, don’t do this.
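
A sketch of how the two mistakes above could be flagged in a test, assuming you request a URL that is known not to exist and record the status code and any Location value. The function name, the labels, and the `homepage` default are hypothetical.

```python
def classify_missing_page_response(status, location=None, homepage="/"):
    """Judge the response to a URL that is known not to exist."""
    if status in (404, 410):
        return "ok"
    if status == 200:
        return "soft-404"  # error page served with a success status
    if status in (301, 302, 307, 308):
        if location == homepage:
            return "blanket-redirect"
        return "redirect"
    return "other"
```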

What About 418?

https://www.google.com/teapot

UTF-8 Characters

Be sure to check your whole tech stack if using UTF-8 characters in the URL. Just because a CMS supports it doesn’t mean everything does.
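
To illustrate what can go wrong, the sketch below percent-encodes a path as UTF-8 and then decodes it two ways using `urllib.parse`. The example path and the helper names are made up; the second helper simulates a component in the stack that assumes a different encoding.

```python
from urllib.parse import quote, unquote


def survives_round_trip(path):
    """Percent-encode a path as UTF-8 and confirm it decodes back unchanged."""
    encoded = quote(path, safe="/")
    return encoded, unquote(encoded) == path


def decode_path(encoded_path, encoding="utf-8"):
    """Decode a percent-encoded path the way a given system might."""
    return unquote(encoded_path, encoding=encoding)
```

For example, `decode_path("/caf%C3%A9")` gives back `/café`, while `decode_path("/caf%C3%A9", "latin-1")` produces mojibake instead.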

301s, 302s, 307s

Everyone knows about 301s, but some people still think they can be removed after a period of time, even though they’re called “permanent”. Nope.

We use a lot of 302s to keep original URLs indexed when offloading to a different place or when doing temporary moves.

307s make it interesting. These are usually browser cached from HSTS and there could be a 301 or 302 behind them. You can use fetch in GSC or a browser with no history or incognito mode to see if it’s a 301 or 302 behind this. It matters.

AMP

Cable connection (70Mbps)

50 sites tested, ~1 second avg difference from the prerendering vs direct load.

Perceived speed is a lot faster.

https://searchengineland.com/the-amp-is-a-lie-278401

Why this is hard for an enterprise:

• Analytics support (IBM Digital Analytics is working on it).

• Legal requirements for hosting content.

• Could make our own cache, but wouldn’t get prerendering.

• Business/tracking reasons for scripts that would invalidate cache.

• Extending prerender spec to more than one page could kill AMP (Feature Policy could solve the issues with bandwidth, CPU, security, etc.)

Final Thoughts

You May Not Ever Fix Everything

There are lots of older systems that aren’t supported but happen to be cheaper to keep running than to replace or fix.

Get Ahead Of Problems

Look at any new system on the way.

Set up automated testing with server-side scripts, Selenium, Chrome DevTools extensions, etc.

Middleware to let multiple systems interact with each other.
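
As one example of the automated testing mentioned above, here is a sketch of a server-side check on a page's SEO tags, using only the Python standard library. The class and function names and the specific rules (canonical present, absolute, and on https; no noindex) are assumptions chosen for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class HeadTags(HTMLParser):
    """Pull the canonical URL and meta robots value out of a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.robots = ""

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.canonical = attrs.get("href")
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            self.robots = (attrs.get("content") or "").lower()


def audit_page(html):
    """Return a list of issues found in the page's SEO tags."""
    tags = HeadTags()
    tags.feed(html)
    issues = []
    if tags.canonical is None:
        issues.append("missing canonical")
    elif not urlsplit(tags.canonical).netloc:
        issues.append("relative canonical")
    elif urlsplit(tags.canonical).scheme != "https":
        issues.append("non-https canonical")
    if "noindex" in tags.robots:
        issues.append("noindex")
    return issues
```

Run on a schedule against key templates, a check like this would have flagged the relative canonicals described earlier soon after they were published.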

Technical SEO Isn’t Always The Answer

Pick your battles. Sometimes there’s more than one right answer or even if the way you want is technically right, it may not be right for the business needs. Know when to push and when to take a step back.

Look at things like page consolidation and content topics.

Will What You Ask For Make A Difference?

You really need to know this ahead of time. If there’s no noticeable outcome then everyone will be less likely to work with you in the future.

I just did a whole deck at Pubcon challenging conventional wisdom on hreflang tags and pointing out where even tools got it wrong.

https://www.slideshare.net/patrickstox/pubcon-vegas-2017-youre-going-to-screw-up-international-seo-patrick-stox

Tips To Troubleshoot

Info: operator can help with duplicates, hijacking, folded pages.

Add &filter=0 to the Google search URL to show more pages in the consideration set.

Site: operator for parameterized pages, sections you may not know about, etc.

Site:domain.com keyword to check who is eligible for featured snippets.

For JS, use inspect element and fetch and render, not view source and cache.

There’s more here:

https://searchengineland.com/tips-make-better-technical-seo-285329
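
The operator and parameter tips above can be combined when building search URLs by hand. A small sketch using `urllib.parse.urlencode`; the function name and the example domain are placeholders.

```python
from urllib.parse import urlencode


def google_query_url(query, unfiltered=False):
    """Build a Google search URL, optionally adding filter=0."""
    params = {"q": query}
    if unfiltered:
        params["filter"] = "0"
    return "https://www.google.com/search?" + urlencode(params)
```

For example, `google_query_url("site:example.com keyword", unfiltered=True)` builds a site: search with `&filter=0` appended.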

Machine Learning + Search

Talk to me later.

Useful:

Keyword Classifier (based on ontology/taxonomy and a lot of data points)

Duplicate Content (doc2vec that looks at several levels and goes well beyond phrase match duplicate content checkers)

Meh:

Mapping old content to current for redirects (site: + keyword was faster)

Finding cannibalization (again, a site: search was faster)

Finding internal link opportunities (again, site: search was faster)

Determining link relevance. (LDA)

Want To Work On These Issues Every Day?

We’ll be hiring again soon (early 2018).

Twitter https://twitter.com/patrickstox

LinkedIn https://www.linkedin.com/in/patrickstox

Ask for email (because spam)

Thank You All

Special thanks to Paul Shapiro, Catalyst, and all the sponsors like Stat, DeepCrawl, and Rio SEO