Seek And Ye Shall Find
The Collected Wisdom Gleaned from the EdSeek Project
Enlightenments of the Glaringly Obvious Only After We Learned How Glaringly Obvious They Were.
BREAK HERE
Training Ground
• Eating your own dog food
• Try finding content on your own website in major search engines
1) School Lunch Menu: “shiloh lunch menu”
2) CNC Router: “cnc router”
Seekology Primary School:
Life on the Playground
Shiloh School April 2002
Shiloh School June 2002
How do “spiders” find pages
• “Robots” or “spiders” follow links
• They follow standard HTML links
• They DO NOT follow image maps, Java animated menus, etc.
• Therefore, you need standard links to all pages in your site
• Ideally include a SITEMAP.html file
– EdSeek.org: functional sitemap
– www.shiloh.k12.il.us: main navigation menu
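A minimal sketch of the idea above, written in Python as an illustration only. It assumes a very simple robot that reads standard `<a href>` links and nothing else; the sample page and its file names are made up. Real spiders are more elaborate.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href targets of standard <a> links, as a simple robot might."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(page_html):
    """Return the list of link targets a basic robot would see in page_html."""
    collector = LinkCollector()
    collector.feed(page_html)
    return collector.links


# Hypothetical page: two standard links, one image map, one script-built link.
SAMPLE_PAGE = """
<html><body>
<a href="/lunch-menu.html">Lunch Menu</a>
<a href="/sitemap.html">Site Map</a>
<img src="nav.gif" usemap="#nav">
<script>document.write('<a href="/hidden.html">Hidden</a>');</script>
</body></html>
"""

print(extract_links(SAMPLE_PAGE))
```

The link written by the script never shows up, because the parser treats script content as opaque text. That is the same reason script-driven menus can hide pages from robots.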
Categorized How?
• The search engine software parses and reads all TEXT in the page (script/comments ignored)
• It assigns priorities to the words
– based on location in page (Title, Header section, Body section)
– based on number of times it is used
– based on proximity of requested words to each other
• Priority is given to Meta-Tags
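A rough sketch of location-and-count weighting, in Python. The weights are invented for illustration (search engines do not publish theirs), the sample page is hypothetical, and proximity scoring is left out to keep the example short.

```python
import re

# Hypothetical weights: a word in the title counts more than one in the body.
WEIGHTS = {"title": 5, "header": 3, "body": 1}


def tokenize(text):
    """Lower-case the text and split it into simple words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(page, query_word):
    """Weight each occurrence of query_word by the section it appears in.

    page maps a section name ("title", "header", "body") to its text.
    """
    word = query_word.lower()
    total = 0
    for location, text in page.items():
        total += WEIGHTS.get(location, 0) * tokenize(text).count(word)
    return total


# Hypothetical page content, split by section.
page = {
    "title": "Shiloh School Lunch Menu",
    "header": "School lunch menu and calendar",
    "body": "The lunch menu changes weekly. Lunch is served at noon.",
}

print(score(page, "lunch"), score(page, "menu"), score(page, "cnc"))
```

A page that only mentions a word once in its body will score far below one that carries the word in its title, which is why titles deserve care.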
Description Meta-Tag Example
Smithsonian Institution
<html>
<head>
<title>Smithsonian Institution</title>
<meta NAME="description" CONTENT="The Smithsonian Institution is composed of sixteen museums and galleries and the National Zoo and numerous research facilities in the United States and abroad.">
</head>
Keyword Meta-Tag Example
IL-TCE Conference - IL-Ed&Tech Conference
<html>
<head>
<title>Illinois Technology Conference for Educators IL-TCE</title>
<meta name="keywords" content="education, educators, educational, youth, conference, opportunities, improve, alternative, program, training, equipment, illinois, technology, ideas, schools">
</head>
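To see what an indexer could pull out of tags like these, here is a small Python sketch. It is an assumption about how such reading might work, not how any particular engine does it. The description text in the sample is made up and the keyword list is shortened.

```python
from html.parser import HTMLParser


class MetaReader(HTMLParser):
    """Collects name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attr = dict(attrs)  # attribute names arrive lower-cased
            name = attr.get("name")
            content = attr.get("content")
            if name and content is not None:
                self.meta[name.lower()] = content


def read_meta(page_html):
    """Return a dict such as {"description": ..., "keywords": ...}."""
    reader = MetaReader()
    reader.feed(page_html)
    return reader.meta


# Hypothetical head section modeled on the examples above.
SAMPLE_HEAD = """
<html>
<head>
<title>Illinois Technology Conference for Educators IL-TCE</title>
<meta NAME="description" CONTENT="A conference for Illinois educators.">
<meta name="keywords" content="education, educators, conference, illinois, technology">
</head>
"""

meta = read_meta(SAMPLE_HEAD)
keywords = [k.strip() for k in meta["keywords"].split(",")]
print(meta["description"])
print(keywords)
```

Upper-case and lower-case attribute names are treated alike, so `NAME=` and `name=` both work.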
Taking Control of Which Pages Are Indexed
• “ROBOTS.TXT” file
• Placed at root of web server
– Its rules can start in any folder
What is invisible to search engines?
• Images (use alt attributes)
• Script (Java etc., some image maps)
• Comments/Scripts
• PDF & DOC files are not easily indexed
• Dynamically generated pages from databases
School Website Model
BREAK HERE
Seekology 101: Introduction to Seekology
A Primer for Uber-Geeks, Alpha-Geeks,
Neo-Geeks and Non-Geeks Who Seek
Enlightenment No. 1
Nothing is available on the global
network (web) unless someone puts it
there.
Enlightenment No. 2
If you put something on the global
network and don’t tell anyone that it’s there, it might as well not be on the
global network.
Enlightenment No. 3
The most fundamental unit of information on the
global network is the file.
Enlightenment No. 4
The most fundamental method of accessing the most fundamental unit of information is the
hypertext link.
Enlightenment No. 5
Humans access units of information by
following hypertext links using a process
colloquially called “clicking”.
Enlightenment No. 6
Web servers are software programs designed to listen
and respond to requests for files
(clicks) from clients on the global
network.
Enlightenment No. 7
If it cannot be “clicked”, it probably cannot be located by the average human
user, unless one knows the exact location (URL or
address).
Enlightenment No. 8
Search engines consist of 2 separate software programs:
“crawlers” and “indexers”
Enlightenment No. 9
“Crawling” is done by software
programs called “robots”
(aka “spiders” and “spidering bots”).
Enlightenment No. 10
“Robots” work the same way humans do…they “click” on hypertext links and
follow them from file to file.
Enlightenment No. 11
If there is no hypertext link to a
file on a web server, a “robot” cannot find
that file.
Links on your home page are your key to
seekology enlightenment
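The point can be shown with a toy crawl in Python. The site map below is entirely hypothetical: each page is listed with the pages it links to, and the robot starts at the home page.

```python
from collections import deque

# Hypothetical site: page -> pages it links to.
SITE = {
    "/index.html": ["/lunch-menu.html", "/sitemap.html"],
    "/sitemap.html": ["/index.html", "/staff.html"],
    "/lunch-menu.html": [],
    "/staff.html": [],
    "/orphan.html": [],  # exists on the server, but nothing links to it
}


def crawl(site, start):
    """Follow links from file to file, the way a robot does."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in site.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


reachable = crawl(SITE, "/index.html")
unreachable = sorted(set(SITE) - reachable)
print(unreachable)
```

The orphan page is on the server, yet the robot never reaches it because no link points to it.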
Enlightenment No. 12
Even if a robot finds a file on the web, it may not be able to parse (read) it.
Enlightenment No. 13
Basic spidering robots can “read”
ASCII. Only the most advanced robots can
read *.doc, *.xls, *.rtf, *.pdf, etc.
Enlightenment No. 14
Unless your web server publishes
“indexes” of files, any file that is not the
target of a hyperlink is invisible to robots
Example of Enlightenment No. 15
This is how one brand of web server shows a file
index
Enlightenment No. 16
The “indexing” software component of the search engine
parses files and stores keywords in a
database on the server.
Enlightenment No. 17
Your interaction with the search engine is in the form of a
keyword search of the database, from which it creates pages of
hyperlinks to the files that contain the keywords along with a brief listing of what those files
contain.
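Enlightenments 16 and 17 together describe a keyword index and a lookup against it. A minimal Python sketch follows, assuming a simple in-memory index rather than a real database. The file names and their text are invented.

```python
import re
from collections import defaultdict


def words_in(text):
    """Lower-case the text and split it into simple keywords."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(files):
    """Indexer: map each keyword to the set of files that contain it."""
    index = defaultdict(set)
    for url, text in files.items():
        for word in words_in(text):
            index[word].add(url)
    return index


def search(index, query):
    """Search: return the files that contain every keyword in the query."""
    words = words_in(query)
    if not words:
        return []
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return sorted(results)


# Hypothetical files already fetched by the robot.
FILES = {
    "/lunch-menu.html": "Shiloh lunch menu for April",
    "/shop/cnc.html": "CNC router project in the Shiloh shop class",
    "/staff.html": "Staff directory",
}

index = build_index(FILES)
print(search(index, "shiloh lunch menu"))
print(search(index, "cnc router"))
```

The search never looks at the web itself. It only looks up words in the index built earlier, so a file the robot never found can never appear in the results.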
Enlightenment No. 18
The robot and indexing software are designed to pay special attention to text found within meta tags in the
header area of web pages.
<meta name="description" content="info">
Enlightenment No. 19
Search Engines are not
intelligent.
Enlightenment No. 20
Search Engines are only as effective as the organization of the global network allows them to be.
Enlightenment No. 21
When material is placed on your web, make sure
there is a “clickable” path to find it.
Seekology 380: Optimization Strategies – Beyond the Primer
Enlightenments You Can Use on Monday.
Enlightenment No. 22
Use either the web server’s automatic
indexing system or a tool such as
“dir2html” to create hyperlinked indexes
of files.
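For those without such a tool, the same idea can be sketched in a few lines of Python. This is not how dir2html itself works. It is a stand-in that turns a list of file names into a page of standard links. The function names and sample file names are hypothetical.

```python
import html
import os
from urllib.parse import quote


def index_page(names, title="Index of files"):
    """Build an HTML page with one standard link per file name."""
    items = "\n".join(
        f'<li><a href="{quote(name)}">{html.escape(name)}</a></li>'
        for name in sorted(names)
    )
    return (
        "<html>\n"
        f"<head><title>{html.escape(title)}</title></head>\n"
        "<body>\n<ul>\n"
        f"{items}\n"
        "</ul>\n</body>\n</html>\n"
    )


def index_directory(directory):
    """Build the index page for the plain files in one folder."""
    names = [
        name
        for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    ]
    return index_page(names, title="Index of " + os.path.basename(os.path.abspath(directory)))


page = index_page(["menu april.txt", "board-minutes.txt"])
print(page)
```

Saving the output as a linked page in the folder gives robots a clickable path to every file listed.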
Enlightenment No. 23
Publish in ASCII when possible.
This can include plain text, html, asp, php, or other text-based coding
Advantages = small, easily parsed, no plugins
Enlightenment No. 24
Use META-tags abundantly for high-profile documents.
META-tag Generators make this easy.
Enlightenment No. 25
Use “ALT=” attributes to
describe graphics that carry meaningful content.
<img src="krebs-cycle.gif" alt="Diagram of the Krebs Cycle">
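A quick way to find images that still need a description is sketched below in Python. The sample page is hypothetical. Note that an empty `alt=""` is reasonable for purely decorative images, so treat the result as a list to review, not a list of errors.

```python
from html.parser import HTMLParser


class AltChecker(HTMLParser):
    """Records the src of every <img> with a missing or empty alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attr = dict(attrs)
            if not (attr.get("alt") or "").strip():
                self.missing.append(attr.get("src") or "")


def images_without_alt(page_html):
    """Return the src values of images a robot cannot describe."""
    checker = AltChecker()
    checker.feed(page_html)
    return checker.missing


# Hypothetical page with one described image and two undescribed ones.
SAMPLE = """
<img src="krebs-cycle.gif" alt="Diagram of the Krebs Cycle">
<img src="spacer.gif">
<img src="logo.gif" alt="">
"""

print(images_without_alt(SAMPLE))
```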
Enlightenment No. 26
JavaScript menus can mean dead ends to
robots.
Enlightenment No. 27
Deflect robots from your
sensitive server areas with the
“robots.txt” file
User-agent: *
Disallow: /search
Disallow: /groups
Disallow: /images
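To check what rules like these allow, Python’s standard `urllib.robotparser` module can read them directly. In this sketch the robot name `ExampleBot` and the host `www.example.org` are placeholders, and the page paths are made up.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the slide above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /groups
Disallow: /images
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path):
    """True if a well-behaved robot may fetch this path on the placeholder host."""
    return parser.can_fetch("ExampleBot", "http://www.example.org" + path)


print(allowed("/lunch-menu.html"))
print(allowed("/search"))
print(allowed("/images/logo.gif"))
```

Each Disallow line blocks everything beneath that folder, so `/images/logo.gif` is off limits along with `/images` itself. Only well-behaved robots honor the file.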
Seekology Graduate School:
Seekology Secrets 5010
Anyone can deploy a Search Engine.
Big Secret No. 1
Search Engine Anatomy
How It’s Done
Tools