
SEO: FTW!

DESCRIPTION

An introduction to the world of SEO, SERP, SEM including tips, techniques and best practices. The slides include all the speaker notes.

Page 1: SEO: FTW!

SEO: FTW!
How to make the web a better place by improving your site positioning on search engines

In 2013 the Internet will celebrate 20 years of publicly available search engines, since the first of them (ALIWEB) was made available to the public through a user-friendly HTML page. SEO has come a long way since then and is today a well-established branch of the Web industry. But what has changed since the long-gone days of doorway pages, hidden text and forum spamming? Which rules are still valid and which new ones have been adopted recently? How will new SEO standards make the Web a better place? What is still to come in the near future?

Follow me as I walk through the evolution of SEO as an organic entity and analyze its inner workings and techniques, from the basics to the pro tips. Some new concepts will be introduced, some old rules will be confirmed, some myths busted.

And as Search-based functionalities sneak into our daily lives thanks to smarter and smaller Internet-enabled devices, new SEO challenges arise on the horizon of an extremely dynamic scenario; new concepts like Mobile SEO, SEO 2.0 and SEO for UGC (User Generated Content) will become familiar to you as we get to the final section of the presentation.

Federico "Lox" Lucignano

Senior Mobile & SEO Solutions Developer

Wikia sp.z.o.o.

Page 2: SEO: FTW!

Witam | Welcome | Benvenuti

✔ Who are you (or better: Who am I)?
– Federico “Lox” Lucignano

• Senior Mobile & SEO Solutions Engineer @ Wikia

I'm a member of that amazing team that is Wikia's Engineering; Wikia is a great place to work both as a developer and as an SEO/Mobile enthusiast.

Even though my personal interest in search engines and their inner workings dates back to the 20th century, my professional involvement in SEO began in 2005; over the last 6 years I have had the chance to work on SEO strategies for a wide variety of websites on different platforms, from small DVD rental services to big wedding portals to huge community services. Today I'm here to share this experience with you.

Before anyone comes up with two obvious questions let me clarify: YES! I'm Italian (probably my name and surname betrayed me) and...

[next slide]

Page 3: SEO: FTW!

Pasta ain't my favourite dish :P

...NO! Despite a common misconception that famous Italian movies from the 50's and 60's spread around the world, in Italy we DO NOT eat pasta every day :P

Page 4: SEO: FTW!

✔ Why are you here (or better: What is this all about)?
– SEO, SEM, ATO, [infinite list of obscure acronyms here]
• Evolution
• Techniques
• Standards

Witam | Welcome | Benvenuti

As mentioned, today we'll deal with SEO as an organic entity: we'll quickly go through recent developments, analyze the techniques SEO specialists and enthusiasts of any level are using with success on millions of websites, and take a sneak peek at what to expect in the near future.

This brings us to the next big question...

Page 5: SEO: FTW!

Why should we care about SEO?

Marketing and monetization are not the only reasons anymore, as search engines are sneaking into our daily activities; they have even started following us wherever we go...

How? Let's take a quick look at some of the most popular Internet-enabled applications/platforms.

In simple words: making money is not the main reason behind SEO anymore!

That is already a concept of the past; in today's Internet-connected world, search engines have become the main entry point to the Web, much as vinyl discs were the main entry point to the world of music for millions of fans until the introduction of tapes.

But how are Google & co. taking over our daily lives?

Simple: by introducing extremely easy-to-use search functionalities into popular software applications and electronic devices.

Let's take a look at some examples.

[next slide]

Page 6: SEO: FTW!

Web browsers: Internet Explorer

Versions 7 & 8
Search box in the upper right corner of the UI

Version 9
Search integrated in the address bar

The first step was to integrate Web search capabilities right into the UI of prime-time Web browsers, which are probably the most used pieces of software nowadays.

At first the search field was a standalone UI widget which would run a search only if the user interacted directly with it by inputting one or more keywords and then pressing a button (e.g. Internet Explorer 7 & 8, Firefox 3, Safari).

But in the latest versions this functionality got integrated into the URL bar; this means that if a user types anything there that is not a URL (even a mistyped or incomplete one), a search will be initiated on the (not always) user-customizable search engine.

This did the trick: search engines are now always there, waiting for your input!

Page 7: SEO: FTW!

Web browsers: Google Chrome

Omnibox
Serves both as the address bar and as a search field with realtime search results and suggestions

To this, Chrome adds realtime as-you-type search suggestions; basically this implies that every time you type a letter, the browser will ask the search engine for those suggestions.

Page 8: SEO: FTW!

Web browsers: Firefox

Awesome bar
Serves both as the address bar and as a search field

Search box
At the top right corner of the UI

Default homepage
A Mozilla-customized Google page

With the release of version 4, Mozilla couldn't make it more clear: they love Google.

By default the browser will let the user search the web via the old-fashioned stand-alone search field (on the right in the UI), via the so-called “Awesome bar” (which works in a pretty similar way to Chrome's “Omnibox” URL bar) and via the Firefox-themed default Google homepage. 3 in 1, who could resist such an offer?

I was amazed to discover that some of my friends are used to typing the domain name (without extension) into the first search field available to them, even though they know perfectly well the full URL of the website they want to reach, and THEN clicking the first result the search engine shows. If you think about it, it makes a lot of sense: not all computer users (and electronics consumers more generally) have a Master's degree in IT; to most of them what matters is getting what they want with the minimum effort (in this case the effort of typing or remembering the full address of a website).

It doesn't matter that this way they're triggering a double roundtrip (one to get the search result, another to get to the real site); in the end, who cares about such a thing in the era of high-speed Internet connections? Exactly: the search engines DO care, since this way they're able to know where users want to go (even when they don't really need to search for anything)!

Page 9: SEO: FTW!

Desktop, Laptop and netbooks

Desktop search box
Lets you find any file and run any action on your system; of course it lets you run searches on the web too

But there's a different breed of software that makes this integration even more pervasive: Desktop Search/Application Launchers.

These applications, which are pretty familiar to Linux and Mac users, let you run commands/tasks, open websites, search for files on your PC, play videos/music, make calculations and, most important of all, initiate Web searches with integrated results listings. All by just pressing a simple combination of keys and starting to type, whenever you want, no matter what other task your computer is already busy with (except games).

Page 10: SEO: FTW!

Mobile devices: iOS and Android

Android browser's omnibox
As in Chrome, serves both as the address bar and a search bar with realtime suggestions

Mobile Safari's search box
Always available at the top-right corner of the UI

Android search widget
Similar to Google Desktop search, lets you find any file and run actions on the system together with web searches

What was said for Web browsers and Desktop Search also applies to modern (and smarter) mobile devices.

The Android default browser and Mobile Safari on iOS behave exactly like Chrome (Omnibox) and Safari (standalone search field), while Android's Search Widget is a cut-down version of a Desktop Search client.

Here we go, the Search Engine can now follow you any time wherever you go, ready to help you find what you need.

Page 11: SEO: FTW!

And much more...

Google refrigerator
Shows you realtime suggestions about what you'd like to drink! :D

I'm skipping a long list of other Internet-enabled devices like gaming consoles (my Nintendo Wii runs Opera), Media Centers (which unfortunately were born already old) and new-generation TVs/content delivery systems (like Apple TV); anything that can connect to the Internet has the ability to run a Web search.

One day even your fridge! ;)

Page 12: SEO: FTW!

A new definition of User eXperience

UX: Accessible, Findable, Usable, Valuable, Credible, Desirable

With search functionalities becoming so prominent, users are starting to abuse them constantly as the main way to get to content, even content they already know how to reach.

This new habit of “finding” content defines a new set of attributes required for content to be found easily; this is what has recently been named “Findability”. It's one of these new “Internet slang” words, like Gamification... They sound so New Age, don't they?

This new set of attributes crosses the limits of plain SEO and requires all the parties involved in the Internet publishing industry (Design, Marketing, Copywriting, Development, etc.) to cooperate in a much deeper way than before; long gone are the days when a designer was not supposed to be aware of what markup would be required to lay out his sketches, and of what impact the position of a piece of content across the page would have on the relevancy given to it.

Findability is actually the main goal of what is being defined as “SEO 2.0”, or “Emotional SEO”; we'll come back to this later. First things first, let's start from the beginning: the title of this lecture...

[switch slide]

Page 13: SEO: FTW!

What's in an acronym?

✔ SEO
– White hats
– Black hats

✔ FTW
– For The Win!
– F#@$ The World!

SEO and FTW: how are they related?

•Both acronyms started to be used in the mid-90's

•Both acronyms are made up of 3 letters

•Both initially had a negative connotation

•Both evolved to outline positive concepts

•The difference is: the negative connotation of FTW never had any economic benefit, while SEO Black hat techniques still generate money (even though they only work for a short amount of time); that's why the latter still sticks around.

Page 14: SEO: FTW!

Before we start

✔ SEO
✔ SEM
✔ SERP
✔ PR
✔ IM
✔ PV
✔ CPC
✔ ATO
✔ Robot / crawler / spider
✔ Page title
✔ Meta description
✔ Meta keywords
✔ Bounce rate
✔ Conversion rate
✔ Keyword density
✔ Web directory
✔ Natural / organic results
✔ Spamdexing
✔ IRYD

Before we start, let's quickly review the meaning of some common terms/acronyms I will use during this talk.

TBD

If you don't know what the last item in the list stands for... that's good! It doesn't exist; I just put it there to check if you're paying attention :) Actually it's a quote from Transformers the movie: “I Rise, You Die” - Optimus Prime

Page 15: SEO: FTW!

A bit of history first: 20th century

✔ 1993 – ALIWEB, the first public search engine
✔ 1995 – Altavista, the first BIG search engine (later absorbed by Yahoo)
✔ 1997 – the SEO acronym appears for the first time on a web page
✔ 1997 – Search engines acknowledge webmasters' SEO efforts and start fighting back spamdexing
✔ 1998 – Page and Brin develop Backrub and later found Google

But first a bit of history (courtesy of WikiPedia)

Webmasters and content providers began optimizing sites for search engines in the mid-1990s, as the first search engines were cataloging the early Web. Initially, all webmasters needed to do was submit the address of a page, or URL, to the various engines which would send a "spider" to "crawl" that page, extract links to other pages from it, and return information found on the page to be indexed.

Site owners started to recognize the value of having their sites highly ranked and visible in search engine results, creating an opportunity for both white hat and black hat SEO practitioners.

The first documented use of the term Search Engine Optimization was by John Audette and his company Multimedia Marketing Group, as documented by a web page from the MMG site from August 1997 on the Internet Wayback Machine (Document Number 19970801004204).

Early versions of search algorithms relied on webmaster-provided information such as the keyword meta tag, or index files in engines like ALIWEB. Meta tags provide a guide to each page's content.

By relying so much on factors such as keyword density which were exclusively within a webmaster's control, early search engines suffered from abuse and ranking manipulation. To provide better results to their users, search engines had to adapt to ensure their results pages showed the most relevant search results, rather than unrelated pages stuffed with numerous keywords by unscrupulous webmasters.

Graduate students at Stanford University, Larry Page and Sergey Brin, developed "backrub," a search engine that relied on a mathematical algorithm to rate the prominence of web pages. The number calculated by the algorithm, PageRank, is a function of the quantity and strength of inbound links.

Page 16: SEO: FTW!

A bit of history first: 21st century

✔ 2004 – All the major Search Engines officially switch to a PageRank-like algorithm and start to partially disclose details through Webmaster-targeted portals

✔ 2005 – Google introduces personalized search results for logged in users

✔ 2009 – Google introduces history, location and realtime search features, Social “Bookmarking” gains consensus

✔ 2010 – The era of mobile search begins with the rise of smarter mobile devices (iOS, Android)

✔ 2011 – Social networks and recommendation services redefine the web landscape, Google announces +1 and Recipes Search, Facebook embraces microformats

By 2004, search engines had incorporated a wide range of undisclosed factors in their ranking algorithms to reduce the impact of link manipulation. Google says it ranks sites using more than 200 different signals. The leading search engines, Google, Bing, and Yahoo, do not disclose the algorithms they use to rank pages. Notable SEO service providers, such as Rand Fishkin, Barry Schwartz, Aaron Wall and Jill Whalen, have studied different approaches to search engine optimization, and have published their opinions in online forums and blogs. SEO practitioners may also study patents held by various search engines to gain insight into the algorithms.

In 2005 Google began personalizing search results for each user. Depending on their history of previous searches, Google crafted results for logged in users. In 2008, Bruce Clay said that "ranking is dead" because of personalized search. It would become meaningless to discuss how a website ranked, because its rank would potentially be different for each user and each search.

In 2007 Google announced a campaign against paid links that transfer PageRank.[16] On June 15, 2009, Google disclosed that they had taken measures to mitigate the effects of PageRank sculpting by use of the nofollow attribute on links. Matt Cutts, a well-known software engineer at Google, announced that Google Bot would no longer treat nofollowed links in the same way, in order to prevent SEO service providers from using nofollow for PageRank sculpting.[17] As a result of this change the usage of nofollow leads to evaporation of pagerank. In order to avoid the above, SEO engineers developed alternative techniques that replace nofollowed tags with obfuscated Javascript and thus permit PageRank sculpting. Additionally several solutions have been suggested that include the usage of iframes, Flash and Javascript. [18]

In December 2009 Google announced it would be using the web search history of all its users in order to populate search results.[19]

Real-time-search was introduced in late 2009 in an attempt to make search results more timely and relevant. Historically site administrators have spent months or even years optimizing a website to increase search rankings. With the growth in popularity of social media sites and blogs the leading engines made changes to their algorithms to allow fresh content to rank quickly within the search results.[20]

Page 17: SEO: FTW!

SEO evolution so far: a retrospective

Bruce Clay: “Something must have gone totally wrong”

Retrospective on SEO evolution so far

Who is Bruce Clay? International SEO consultant, Board of Directors member of SEMPO (Search Engine Marketing Professional Organization), member of the Web Analytics Association, the American Marketing Association and the International Internet Marketing Association, author of the SEO Code of Ethics (translated into 18 languages, published in 2001).

Back in 2005, when Google introduced per-user personalized search results, he declared “ranking is dead”, since the position of each result in SERPs would be different for each user.

Mr. Clay belongs to the old generation of SEO consultants, those old White hats who saw in a mathematically calculated PageRank the “panacea” for all the evils plaguing the rising SEO industry; to those people, and even more to Black hats, the idea that the ranking of a page would become more and more dependent on actions outside the direct reach of a webmaster is frightening.

It's normal: search engine policies are taking power away from the hands of those guys and putting it back into the hands of the users (and their friends; see the social developments of the upcoming Google +1 and StumbleUpon). If my salary depended entirely on it, I would be scared too.

Page 18: SEO: FTW!

IMHO: the best has yet to come

The ugly

1993 - 1997

The bad

1998 - 2004

The good

2005 – till now

The best

Yet to come

The best has yet to come: my personal opinion

I'm no big authority in the SEO industry as Mr. Clay is, but like many other “no-ones” who are good at SEO, I have my personal, strong opinion, which can't help but be totally opposite to that of these gurus of long-standing fame.

We went through the worst between 1993 (the first public search engines appear) and 1997 (Spamdexing spreads as a standard SEO practice, the Search Engines start to fight back).

Then the situation started to improve between 1998 (Backrub, the first PageRank driven Search algorithm is implemented) and 2004 (all the major search engines switch to a PageRank-like implementation, but still one that could be tricked in many ways).

Starting from 2005 the big players started to introduce new concepts and, more importantly, STANDARDS; they also started to share some detailed information with webmasters and SEO enthusiasts, and began to punish bad practices without mercy. The PageRank algorithm also became smarter and harder to trick, and it keeps improving day by day.

The best is yet to come: with Findability becoming a more prominent goal and social networks taking the stage, we'll probably see big changes in the not-so-far future; actually the first signs are appearing right now (Google +1, StumbleUpon, Digg, the new possibilities opened by the interaction between mobile and non-mobile devices and the environment [augmented reality] and, best of all, the SEMANTIC WEB [thanks to microformats]).

Page 19: SEO: FTW!

First things first: get crawled

✔ Be sure that your site has interesting content to attract crawlers and check the server uptime

✔ Prepare at least a standard XML Sitemap
✔ Have a robots.txt file at the root of your site
✔ Subscribe to Webmaster's services for the main search engines
✔ Setup an analytics service for monitoring your traffic
✔ Submit your sitemaps manually for fine-grained control

Before starting

Let's quickly talk about crawlers...

[next slide]

Page 20: SEO: FTW!

Know the beast

Meet the crawler... ...and his daily meal

I'm sometimes amazed that some “SEO-aware” people still don't know exactly how a crawler works; in particular, they have a really magical/mystical idea of what a crawler “sees” when doing its duty. So let me give you a clear description of it and its inner workings (even though I'm sure many in this room already know what I'm going to say).

A web crawler (or spider, or bot) is nothing more than an HTML parser on steroids: it's aware of the semantic meaning of the markup, can recognize tricks by watching out for tricky CSS/JS at a basic level, and is even able to interpret a series of binary files like PDFs, DOCs and TXTs.

Specialized crawlers can do even more; e.g. Googlebot-Image is able to analyze an image's size, extract the EXIF data, analyze the colours, recognize faces, etc., no matter the file format.

Once you've attracted a crawler's attention, keep in mind that each crawling pass has a time limit; the crawler won't go through all your content at once, so be sure to avoid it losing time on non-relevant and unimportant stuff (i.e. make good use of sitemaps and robots.txt; avoid the crawler getting stuck parsing a 25MB PDF, and have a description of the file somewhere in the page where the download link is instead).

Page 21: SEO: FTW!

First things first: get crawled

✔ Be sure that your site has interesting content to attract crawlers and check the server uptime

✔ Prepare at least a standard XML Sitemap

Point 1

Do not submit your “under construction” website on your testing server to search engines; in the early days of a new website the crawling rate is really low, and days could pass before your real content gets indexed and starts driving traffic.

Point 2

[next slide]

Page 22: SEO: FTW!

XML Sitemaps

Details @ http://sitemaps.org/

Google expanded the original Sitemaps protocol to include:

•Image data support in standard sitemaps (giving the Googlebot-Image crawler image data even before it starts crawling a site, license and caption included)

•News sitemaps (for websites accepted into Google News; Googlebot-News)

•Video sitemaps (same as images, the data can be included in the regular sitemap; the Googlebot-Video crawler will collect it before Googlebot starts crawling the site)

•Mobile sitemaps (this is a separate sitemap, even if your site uses the same URLs; Googlebot-Mobile and YahooSeeker/M1A1-R2D2)
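For reference, a minimal sitemap following the sitemaps.org protocol looks like this (a sketch; the URL and values are placeholders):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <!-- full URL of the page -->
    <loc>http://www.example.com/</loc>
    <!-- last modification date, W3C datetime format -->
    <lastmod>2011-04-01</lastmod>
    <!-- a hint about how often the page changes -->
    <changefreq>weekly</changefreq>
    <!-- relative priority within your own site, from 0.0 to 1.0 -->
    <priority>0.8</priority>
  </url>
</urlset>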

Page 23: SEO: FTW!

First things first: get crawled

✔ Have a robots.txt file at the root of your site

[next slide]

Page 24: SEO: FTW!

Robots.txt

User-agent: *
Disallow: /wikistats/
Disallow: /*action=history*
Allow: /Special:Sitemap*
Allow: /wiki/Special:Sitemap*
Disallow: /wiki/Special:*
Disallow: /Special:*

Sitemap: http://www.wikia.com/sitemap.xml

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">

Details @ http://robotstxt.org

A good usage of the allow/disallow directives is e.g. for login/protected content and for mobile content (disallow the normal crawler, allow the mobile crawler; we'll discuss it later).

WARNING: use it to disallow crawlers from reaching internal SERPs, as there's a high probability they would be flagged as search spam (at least by Google, which made an official announcement on the topic back in 2007).

Part of the robots.txt “protocol” is the ROBOTS HTML meta tag, which differs from the “rel=nofollow” directive introduced by Google for anchor tags.

Nofollowed links won't get any credit when Google ranks websites in the search results, thus removing the main incentive behind blog comment spam robots. This directive only affects ranking; the Google robot may still follow the links and index them.

There are two important considerations when using the robots <META> tag:

•robots can ignore your <META> tag. Especially malware robots that scan the web for security vulnerabilities, and email address harvesters used by spammers will pay no attention.

•the NOFOLLOW directive only applies to links on this page. It's entirely likely that a robot might find the same links on some other page without a NOFOLLOW (perhaps on some other site), and so still arrive at your undesired page.

Page 25: SEO: FTW!

First things first: get crawled

✔ Subscribe to Webmaster's services for the main search engines

✔ Setup an analytics service for monitoring your traffic

✔ Submit your sitemaps manually for fine-grained control

Point 1

Webmaster's services (or tools) are a great resource for monitoring the effects of your SEO efforts, submitting new content and tweaking the crawling rate. Always keep an eye on their documentation, as it gets continuously updated (without notice!!!)

Point 2

Analytics let you analyze where your traffic comes from, which keywords were used, how visitors use your site and where they spend time on it.

Point 3

Some search engines (e.g. Yahoo) won't let you access some information/stats derived from sitemaps discovered only via robots.txt.

REMEMBER TO VALIDATE YOUR SITEMAPS BEFORE SUBMISSION (you can use any XSD schema-based validator)
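For example, with the xmllint tool (part of libxml2) you can validate a sitemap against the official schema like this (sitemap.xml is a placeholder for your own file):

xmllint --noout --schema http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd sitemap.xml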

Page 26: SEO: FTW!

It's just the first step, do it right

Guess I'm getting all this crawling thing totally wrong...

Getting crawled by the major search engines is just a matter of following the simple steps I've described, but still many first-timers fail from the very beginning.

Be sure to start the right way ;)

(It's actually a real picture, don't ask me how the Giraffe got over there...)

Page 27: SEO: FTW!

One step further: be valuable

✔ Write original, quality content
✔ Avoid pages with duplicated content
✔ Be sure your markup makes the most out of HTML semantic structures

Point 1

Quality content: people want it, crawlers search for it, you'd better have some. If you think about it, it does make a lot of sense!

Point 2

Recently the term “duplicated content” was extended to cover what some content farms (now punished by the changes to Google's algorithm made in January 2011) do by copying and slightly changing quality content from other sources, but the original concept still applies: NEVER serve two pages with the same content and different URLs (unless you've done your URL normalization right, more about it later).

Point 3

[next slide]

Page 28: SEO: FTW!

Be brave, embrace HTML5

HTML5 enhanced semantic structures
Enable you to literally model data in a meaningful way

Using semantic structures helps in reinforcing the meaning and in giving priority to different pieces of content on the same page (e.g. it's possible to use more than one h1 per page by wrapping them in sections/articles/headers/footers).
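A minimal sketch of what this looks like in practice (the content is hypothetical; it's the structure that matters):

<article>
  <header>
    <!-- this h1 is scoped to the article, not to the whole document -->
    <h1>Reebok Air White review</h1>
  </header>
  <section>
    <!-- a subordinate heading for a section within the article -->
    <h2>Comfort</h2>
    <p>...</p>
  </section>
  <footer>Filed under <a href="/equipment/sport/running/">Running</a></footer>
</article>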

The era of “HTML constructs just for the sake of layout” is over; get rid of tables and multifunctional divs. “Web designers”, you have been warned: get an HTML5 book and start reading!

While Google announced they've just started supporting HTML5 tags in a preliminary way, our tests confirm it is working, and working well:

[mention to Wikia's Image SEO project]

One more thing: the semantic web is still not here (we're just at the beginning), but it will be soon; using semantic structures makes your site future-proof.

Page 29: SEO: FTW!

One step further: be valuable

✔ Use good quality images, enrich them with metadata and captions

[next slide]

Page 30: SEO: FTW!

Do your image homework

Athlete's epic failure at Beijing Olympic Games

Filename:
/pictures/sport/epic_failure-Olympic_Games.png

HTML:
<figure>
  <img src="..." width="235" height="300" alt="epic failure at Olympic Games" />
  <figcaption>Athlete's <em>epic failure</em> at Beijing <em>Olympic Games</em></figcaption>
</figure>

Caption

With image search becoming more and more popular (you would be impressed at how much traffic a blog gets from the images used in its posts' headlines), Image SEO becomes a need rather than a detail. Some useful tips:

•use a semantically meaningful folder hierarchy to physically store the file (the image crawler tries to make sense out of it)

•use keywords associated with the image content in the file name (separate multiple keywords/phrases with dashes), in the alt attribute (but totally avoid keyword stuffing, just use what describes the image briefly) and in the image caption

•use images in PNG, JPG and GIF formats, exactly in this order of preference (PNG gives the best results)

•use quality images (at least 235px * 300px, but the bigger the better; you can place a thumbnail in the page and link to the full-size version)

•once again: exploit HTML semantic structures to bind the caption to the image itself

•optionally, give emphasis to the keywords in the caption text

•always try to place the image in a block of text dealing with the same topic (or the reverse: avoid using images that have nothing in common with the surrounding text)

•keep EXIF metadata if your image has it (date, location, camera model, etc.); sooner or later it will turn out to be useful for advanced searches

Page 31: SEO: FTW!

One step further: be valuable

✔ Give specific content that extra dose of semantic meaning with Microformats

[next slide]

Page 32: SEO: FTW!

The “Micro” advantage

Choose your weapon
Microformats, Microdata or RDFa

It's all about adding semantic value to content
<footer class="vcard">This article has been written by <span class="fn">John Doe</span></footer>

But it doesn't work for everything
Reviews, People, Products, Businesses and organizations, Recipes, Events, Video

I find Microformats to be an amazing idea; it takes so little to make your (existing) content so much more semantically valuable!

And you have plenty of choice:

- Microformats (based on vcard/hcard standards)

- Microdata (exploiting HTML5's extremely flexible data attribute)

- RDFa (the most complex and verbose [XML based] but probably the most flexible and open)

With the recent launch of Google Recipes Search and with Facebook embracing hCard microformats on their event pages, this became the new focus in the SEO industry; the Semantic Web has never been so near...

[see the Google Recipes search result example at the bottom of the slide: it shows cooking time, ratings, reviews and calorie count; all of this is stored in the HTML markup of the target page via Microformats]
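To give an idea of how little markup it takes, here's a minimal hRecipe-style sketch (a hypothetical recipe, showing only a few of the available properties):

<div class="hrecipe">
  <!-- fn: the name of the recipe -->
  <h2 class="fn">Spaghetti alla carbonara</h2>
  <!-- each ingredient gets its own element -->
  <span class="ingredient">200g spaghetti</span>,
  <span class="ingredient">2 eggs</span>
  <!-- duration: machine-readable cooking time -->
  <span class="duration">15 minutes</span>
  <div class="instructions">Boil the pasta, then...</div>
</div>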

Page 33: SEO: FTW!

One step further: be valuable

✔ Keep on updating contents (ad libitum)

[next slide]

Page 34: SEO: FTW!

If you can, get a trained update monkey

Done, Boss! Weekly update pushed to server!

The reason why blogs are easier to position than corporate websites, and also the reason why Black hat SEO uses “splogs” (spam blogs) for backlinks, is that it's in the nature of this kind of platform to be continuously updated and to be less complex to crawl (thanks to a simplified site structure).

Crawlers love that; they crave fresh content more than a Transylvanian vampire craves virgin's blood.

Keep updating your contents, the best rate is at least twice per week.

Page 35: SEO: FTW!

Improve your position: be smart

✔ Keep the total number of links on a page below 50

[next slide]

Page 36: SEO: FTW!

PageRank, it's simpler than THEY think

PageRank
the Teletubbies' way

PageRank is partially calculated from real attributes/merits of a page (like content quality), but is partially based on a “weighted voting system”.

Each page is given the possibility to vote for other pages (internal or external, it doesn't matter); the total value of this vote is 1.0. This value gets divided among all the links on that page, but not equally: links using relevant keywords and placed higher in the page get more “weight”.

Understanding how this works makes it quite clear why a sane SEO strategy won't let any page have more than 50 links, and it's also the reason why, when requesting backlinks from other sites, you should never pick pages that cross that number.

The more “weighted” votes a page gets, the higher its PageRank will be.
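In its simplest published form (leaving aside the per-link weighting just described), the PageRank of a page A is:

PR(A) = (1 - d) + d * ( PR(T1)/C(T1) + ... + PR(Tn)/C(Tn) )

where T1...Tn are the pages linking to A, C(T) is the number of outgoing links on page T and d is a damping factor (commonly around 0.85). The vote splitting is right there in the formula: the fewer outgoing links a page has, the bigger the share of its vote each linked page receives.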

Page 37: SEO: FTW!

Improve your position: be smart

✔ Give a glimpse of the site structure and of the latest updates on your Homepage

✔ Cross link your content to provide more links to most important pages

✔ Normalize your URLs to avoid sequential content being flagged as duplicate content

✔ If your site is about a local business/activity remember to include information about the location at least on the homepage

✔ Mind your domain

Point 1

On your site, the homepage is the page whose vote is given the highest weight; whatever you link from there will have a higher chance to rank better, so using it to link to relevant/fresh content is a good way to give those pages a boost while helping new visitors start browsing your website. It also helps returning visitors get the latest updates quickly.

Point 2

The more a page is linked across a website, the more relevance will be given to it. Keep your cross links updated and use them for really important/relevant stuff.

Point 3

Paginated listings of items (be it a list of books to buy or the list of articles contained in a category) can lead to duplicate content, especially if the user can change the sorting order at will. Place a <link rel="canonical" href="[Main URL here]" /> in the page header to tell the crawler that it's just another URL to access the same content. This is the most common scenario but not the only one (e.g. content accessible via different categories).

Point 4

After Google announced the addition of location-based searches (whether via a geo-locatable device, Google Maps, or Google search itself), pages containing location information rank higher in that specific kind of SERP. Now that “personal search” is more than a reality, this is even more important.

Point 5

Try to get a domain name as short as possible; it should contain at least the main keyword you're targeting (this is getting harder over time, so when something new comes up, act quickly).

Page 38: SEO: FTW!

SEO like a PRO

✔ Be sure to include meta tags (for description and keywords) on each page

[next slide]

Page 39: SEO: FTW!

Use META for fine-grained control on SERP

Old stuff never dies, but it changes over time. Today's search engines pay little (Yahoo) or no (Google) attention at all to meta tags when granting PageRank. So you won't see those epic battles in big corporations anymore, as depicted in the picture with the two zebras; that is a thing of the past.

But those tools still have a use in focused SEO strategies: they give you more control over SERP summaries/snippets, letting you tell the search engine for which keywords the meta description content should be used as the snippet representing the page in the results listings. When these tags are not present (or the keyword used for searching doesn't match their contents), the search engine will extract a snippet directly from the page content, and most of the time it fails in a very bad way.

So the goal switched from attracting the attention of the crawler to attracting the attention of the user.

When choosing keywords you should consider using long-tail ones: the total amount of traffic generated by a “cluster” of those keywords is the same as that of one or two more general keywords (a technique also known as cluster optimization).
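For completeness, this is what the two tags look like in the page header (a sketch with placeholder content):

<meta name="description" content="A short, compelling summary of the page, used as the snippet in results listings" />
<meta name="keywords" content="seo, serp, some long tail keyword phrase" />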

Page 40: SEO: FTW!

SEO like a PRO

✔ Keep your Keyword Density below 3%
✔ Use domains at least 3 years old, and more than 1
✔ Model your URLs

Point 1

KD is the percentage of times a keyword or phrase appears on a web page compared to the total number of words on the page. The optimum keyword density is 1 to 3 percent; using a keyword more than that could be considered search spam. There are simple and specific equations to calculate KD for both single keywords and keyword phrases (you can find them on Wikipedia), but there are also numerous automated tools that let you analyze all the pages of a website with a single click.
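As a quick worked example (my own numbers, just to illustrate): for a single keyword,

KD = (occurrences / total words on the page) * 100

so a keyword appearing 6 times in a 300-word page gives KD = (6 / 300) * 100 = 2%, safely within the 1-3% range. For a keyword phrase, multiply the occurrences by the number of words in the phrase before dividing.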

Point 2

Older domains are given trust (think of those scam/phishing sites that appear and disappear)

Point 3

[next slide]

Page 41: SEO: FTW!

Make your URLs sexy

http://mysite.com/index.php?itemid=3346 http://mysite.com/angelina-jolie-bikini

Which link would you click on?

This is a classic SEO joke, but the problem is real and still very common.

It doesn't matter if you're writing your own Content Management System or using an existing one, you need to pay attention to the URLs that it generates.

Avoid dynamic formats; using URL rewriting you can achieve much more interesting page addresses that catch the attention of users (since they can understand them) and give valuable information to crawlers (even in a semantic way, e.g. mysite.com/equipment/sport/running/shoes/reebok-air-white).
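As an example, with Apache's mod_rewrite this kind of friendly URL can be mapped back to the dynamic one in a couple of lines (a sketch, e.g. in a .htaccess file; the parameter name is hypothetical and must match your own CMS):

RewriteEngine On
# serve /shoes/reebok-air-white through the dynamic script, invisibly to the user
RewriteRule ^shoes/([a-z0-9-]+)$ index.php?item=$1 [L,QSA]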

Page 42: SEO: FTW!

SEO like a PRO

✔ Be sure to include meta tags (description and keywords) on each page

✔ Keep your Keyword Density below 3%
✔ Use domains at least 3 years old, and more than 1
✔ Model your URLs
✔ Make a wise use of 301 redirects
✔ Link to and get linked from relevant resources
✔ Make it to Web Directories
✔ Keep monitoring and tweaking, your job is never done
✔ PubSubHub indexing, only for the good guys

Point 1

Let's clarify it once and for all: 302 redirects are meant only for temporary changes; 301 redirects automatically transfer all the attributes of a URL to the new one, PR included. Use them wisely, they're a powerful tool for link sculpting.
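In Apache, for instance, a permanent redirect takes a single line (a sketch; the paths are placeholders):

# permanently move /old-page, transferring its attributes (PR included) to the new URL
Redirect 301 /old-page http://www.example.com/new-page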

Point 2

In a link building strategy, getting backlinks from resources that are neither relevant nor related to your site's focus area is meaningless: no PR will come from there, and they will smell of paid link spamming no matter what. Watch out!

Point 3

As with the previous point, Web Directories are considered to be both credible and relevant to any topic (given their nature); try to get your site included in DMOZ's collection of links, and submit your site as early as possible for a review.

Point 4

The SEO industry is in continuous evolution: ranking algorithms get tweaked at a steady rate, new standards rise and fall and, most of all, the traffic reaching your site changes continuously. Keep monitoring the search keywords that bring visitors to your site and tweak your strategy accordingly.

Point 5

[next slide]

Page 43: SEO: FTW!

Are you the Chosen One?

PubSubHub Google Indexing
This option is offered only to a strict selection of websites

Details @ http://code.google.com/p/pubsubhubbub

Slide transcript TBD.

Page 44: SEO: FTW!

This will be your obsession

Google logo, in all its glorious variants

As you get more and more familiar (and involved) with advanced SEO principles, you'll be able to productively exploit them and dramatically improve your site's position in SERPs.

But there's still a lot you could do! SEO has a more “commercial” variant, SEM.

Page 45: SEO: FTW!

SEM: if you've got $$$ to spend

✔ Paid search campaigns✔ Paid links campaigns✔ ATO✔ Paid articles and featurettes (+backlink)✔ Paid blogs

Search Engine Marketing is about being able to apply SEO techniques on someone else's site to your advantage, and of course, as you can guess, this implies a consistent transfer of money.

But it does its dirty job pretty well.

Usually this is where you end up digging when you can't do anything more to improve your organic results and your previous SEO efforts have brought you an embarrassing amount of money. When you start doing this, you'll need to start dealing with the Dark Lord, a.k.a. ROI.

Just one simple rule: whatever you end up buying, be sure to do it on relevant resources related to your website's focus area (don't buy links on an automotive website if you sell vegetables, it's just a waste of money).

Page 46: SEO: FTW!

The power of Bling

If you've got enough budget then the answer to your SEM question is:

YES, YOU CAN!

In the end SEM is all about how big your budget is.

Page 47: SEO: FTW!

Black hats: meet the bad guys

✔ Keyword stuffing
✔ Hidden or invisible text
✔ Meta-tag stuffing
✔ Doorway pages
✔ Scraper sites
✔ Article spinning
✔ Link spam
✔ Link-building software
✔ Link farms
✔ Hidden links
✔ Sybil attack
✔ Spam blogs
✔ Page hijacking
✔ Buying expired domains
✔ Cookie stuffing
✔ Spam in blogs
✔ Comment spam
✔ Wiki spam
✔ Referrer log spamming
✔ Mirror websites
✔ URL redirection
✔ Cloaking

Scraper sites

Scraper sites are created using various programs designed to "scrape" search-engine results pages or other sources of content and create "content" for a website.

Article spinning

Article spinning involves rewriting existing articles. This process is undertaken by hired writers or automated using a thesaurus database or a neural network.

Sybil attack

A Sybil attack is the forging of multiple identities for malicious intent. A spammer may create multiple web sites at different domain names that all link to each other, such as fake blogs.

Page hijacking

Page hijacking is achieved by creating a rogue copy of a popular website which shows contents similar to the original to a web crawler but redirects web surfers to unrelated or malicious websites.

Cookie stuffing

Cookie stuffing involves placing an affiliate tracking cookie on a website visitor's computer without their knowledge, which will then generate revenue for the person doing the cookie stuffing.

Referrer log spamming

Referrer spam takes place when a spam perpetrator or facilitator accesses a web page (the referee), by following a link from another web page (the referrer), so that the referee is given the address of the referrer by the person's Internet browser.

Mirror websites

A mirror site is the hosting of multiple websites with conceptually similar content but using different URLs. Some search engines give a higher rank to results where the keyword searched for appears in the URL.

Page 48: SEO: FTW!

Know your enemy

The Google Almighty Search Patrolling Task Force Team member

If you'll decide to go for the “Dark side” then be aware of who you're gonna fight!

These guys spend their lives buried in the geekiest corner of the Googleplex; probably the last (and only) date they had was with their optometrist for a new pair of 80's-looking glasses. This should tell you how much they hate the rest of the world sitting on the other side of the screen, especially SEO smart-asses who go from club to club in their Porsche Carrera with the typical top model sitting on the passenger side...

They've been trained to kill and they've got a clearance from CIA to do that. If I were in your shoes, I would avoid catching their attention.

Page 49: SEO: FTW!

Sooner or later...

MHH?!? I sense a disturbance in the SERPs...

Sooner or later Search Engines catch the bad guys, here's usually what happens...

[next slide]

Page 50: SEO: FTW!

...this is what happens usually

Sorry man, we've just changed our algorithm

And this is usually the explanation! They get huge PageRank penalties (which bury them in the depths of the SERPs, where no man has ever been before) or, in the worst cases, their sites get removed from the indexes.

[see what happened to J.C. Penney (linkspamming and non-related paid backlinks) and BMW (doorway pages)]

Page 51: SEO: FTW!

SEO of the (near) future

✔ SEO for UGC✔ Mobile SEO✔ SEO and Social recommendations✔ SEO and AR

I left this part of the lecture a bit more open; I'll quickly describe each item on the list, but we can discuss them as you wish (this is valid if you're attending a live lecture/talk).

Point 1

Old way: the site staff imposes a manual/automated SEO strategy on community content. New way: involve the community in the SEO strategy via engaging/game mechanisms [explain SEO dashboard project]

Point 2

Meta.txt, Mobile Sitemaps, .mobi domain, Apps/Markets presence

Point 3

Facebook-Bing integration, Google +1, Stumble Upon, Digg

Point 4

[next slide]

Page 52: SEO: FTW!

Augmented Reality

It's the “environment” that determines the search results, through a combination of proximity detection, geolocation and image recognition

Slide transcript TBD.

Page 53: SEO: FTW!

Wait!

What about SEO 2.0?

Page 54: SEO: FTW!

Emotional SEO (SEO 2.0) 1/2

SEO
✔ Link building: manually adding links, submitting static websites to directories, link exchange, paying for links
✔ On-site optimization for spiders. Example: repetitive page titles concentrating (solely) on keywords
✔ Competition: you compete with others to be on the first page/in the Google top 10 for keywords
✔ Barter: you give me a link and only then I will give you one
✔ Hiding: "we're not doing SEO", "we can't show our client list publicly", the generic SEO company
✔ keywords

SEO 2.0
✔ Getting links via blogging, writing pillar content, creating link bait, socializing
✔ On-site optimization for users. Example: kick-ass post headlines
✔ Cooperation: you cooperate with each other, sharing fellow bloggers' posts on social media and linking to them
✔ Giving: I link to you regardless of whether you link back, but in most cases you will, more than once
✔ Being open: "welcome our new client xyz, we are proud to work together with them"; Rand Fishkin, Lee Odden, BlueGlass
✔ tags

Here's a brief comparison (courtesy of Tadeusz Szewczyk) of how SEO in the near future will differ from what we currently do and how we currently think.

Additional transcript TBD.

[continues on next slide]

Page 55: SEO: FTW!

Emotional SEO (SEO 2.0) 2/2

SEO
✔ optimization for links and rankings
✔ clicks, page views, visits
✔ DMOZ
✔ Main traffic sources: Google, Yahoo, MSN
✔ one-way communication
✔ top down, corporations and old media decide what succeeds
✔ undemocratic, who pays most is on top
✔ 50% automated, half of the SEO tasks can be done by SEO software
✔ technocratic

SEO 2.0
✔ optimization for traffic and engagement

✔ conversions, ROI, branding

✔ Delicious, Digg

✔ Main traffic sources: Facebook, Twitter, StumbleUpon, niche social news sites, blogs

✔ dialog, conversation

✔ bottom up, wisdom of crowds determines true popularity via participation

✔ democratic, who responds to popular demand is on top

✔ 10% automated, most SEO 2.0 tasks are about content and interaction

✔ emotional

Slide transcript TBD.

Page 56: SEO: FTW!

Links you might find interesting

Webmaster Tools
✔ http://www.google.com/webmasters/tools/
✔ https://siteexplorer.search.yahoo.com/mysites
✔ http://www.bing.com/webmaster/

References
✔ http://www.google.com/support/webmasters/
✔ http://help.yahoo.com/l/us/yahoo/search/webcrawler/index.html

News
✔ http://www.seroundtable.com/
✔ http://www.ysearchblog.com/
✔ http://www.bing.com/community/site_blogs/b/webmaster/default.aspx
✔ http://googlewebmastercentral.blogspot.com/

Here's a list of links definitely worth a mention (and a visit).

Page 57: SEO: FTW!

Before we finish...

Questions, anyone?

If you're not attending a live lecture/talk please feel free to contact me via:

Twitter: http://www.twitter.com/federico_lox

Facebook: http://www.facebook.com/flucignano

Tumblr: http://loxzone.tumblr.com

Page 58: SEO: FTW!

Thank you for your time

Thanks for your attention, I hope you enjoyed the lecture!

Page 59: SEO: FTW!

Thanks to

Wikia for letting me prepare this lecture

Politechnika Poznańska for hosting it

Dr Andrzej P. Urbański for organizing the event

DISCLAIMER

No animal, evil Dark Lord or U.S. President has been harmed during the preparation of this presentation.