|
Meta Tags to control indexing and crawling
::
Search engines dispatch search spiders
(robots) to visit your site and follow the
links on it. This is called 'crawling'.
On every page a spider visits, it collects
keywords and reports them back to base.
This is called 'indexing'. When a user
searches for a keyword on a search engine,
the search engine uses this database of
keywords collected by its spiders to show
the user relevant search results.
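The keyword database described above can be pictured as a
lookup table from keywords to pages. The following is only
a minimal sketch of that idea, not how any real search
engine is implemented. The page paths, page text and the
function names build_index and search are made up for
illustration.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each keyword to the set of page URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        # Split the page text into lowercase words (the "keywords").
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


def search(index, keyword):
    """Return the pages recorded for a keyword, in sorted order."""
    return sorted(index.get(keyword.lower(), set()))


# Hypothetical pages that a spider has already visited.
pages = {
    "/home": "Welcome to our garden shop",
    "/tools": "Garden tools and spades",
    "/contact": "Contact the shop by email",
}

index = build_index(pages)
print(search(index, "garden"))  # ['/home', '/tools']
print(search(index, "Shop"))    # ['/contact', '/home']
```

A search is then a simple lookup in the table rather than
a fresh visit to every page.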
Search spiders will visit every page that
is linked from the page they are on. Once
they have moved to a linked page, they
visit every link on that page in turn.
This process continues until every linked
page on your site has been visited and
indexed. The spiders then leave your site
and come back once in a while to index any
changes.
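The link-following process described above can be sketched
as a breadth-first walk over the links of a site. This is
an illustration under simplifying assumptions: the site is
a small in-memory table of made-up page paths, and the
function name crawl is hypothetical. Real spiders fetch
pages over the network and schedule visits in their own
ways.

```python
from collections import deque


def crawl(links, start):
    """Return pages in the order a breadth-first spider would visit them."""
    visited = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        # Follow every link on the current page that has not been seen yet.
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                visited.append(target)
                queue.append(target)
    return visited


# Hypothetical site: each page maps to the pages it links to.
site = {
    "/home": ["/about", "/products"],
    "/about": ["/home"],
    "/products": ["/products/spade", "/about"],
    "/products/spade": [],
    "/orphan": ["/home"],  # no page links to this one
}

print(crawl(site, "/home"))
# ['/home', '/about', '/products', '/products/spade']
```

Note that "/orphan" is never visited in this sketch,
because no page reachable from "/home" links to it.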
By default, spiders will index every page
and follow every link on each page. To
control this default action, you can use
Meta Tags.
Implementing the robots Meta Tags
::
Put the robots Meta Tag in the header of
your web page, between the <head> and
</head> tags.
The following are four examples of this
code:
<meta name="robots" content="index, follow">
<meta name="robots" content="noindex, follow">
<meta name="robots" content="index, nofollow">
<meta name="robots" content="noindex, nofollow">
As you can observe, they are different
combinations of index, noindex, follow
and nofollow. The meaning of each parameter
is explained below:
index - spiders will add the page to the
search index, so it can appear in search
results.
noindex - spiders will not add the page to
the search index.
follow - spiders will follow the links on
that page.
nofollow - spiders will not follow the
links on that page.
By default, all spiders will treat pages
without the robots meta tag as "index,
follow".
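To make the four parameters and the default concrete, here
is a rough sketch of how a spider might read the robots
Meta Tag from a page. It uses Python's standard
html.parser module. The names RobotsMetaParser and
robots_directives are made up for this example, and real
spiders may recognise more values than the four covered
here.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the index/follow settings from a robots meta tag."""

    def __init__(self):
        super().__init__()
        # Default when no robots meta tag is present: "index, follow".
        self.index = True
        self.follow = True

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        # The content is a comma-separated list such as "noindex, follow".
        for token in content.split(","):
            token = token.strip().lower()
            if token == "index":
                self.index = True
            elif token == "noindex":
                self.index = False
            elif token == "follow":
                self.follow = True
            elif token == "nofollow":
                self.follow = False


def robots_directives(html):
    """Return (index, follow) as two booleans for the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.index, parser.follow


page = '<head><meta name="robots" content="noindex, follow"></head>'
print(robots_directives(page))                     # (False, True)
print(robots_directives("<head><title>A Page</title></head>"))  # (True, True)
```

The second call shows the default: a page without the tag
is treated as "index, follow".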
Once you have chosen your combination of
index and follow parameters, insert the
robots meta tag into the header of your
HTML as follows:
<html>
<head>
<title>A Page</title>
<meta name="robots" content="noindex, nofollow">
</head>
That's it! The above example uses noindex
and nofollow, so spiders will neither index
the page nor follow its links, as if the
page were not there.
Special Notes:
If you have highly confidential material
that you do not want to be seen by just
anybody, the robots Meta Tag is not
adequate protection. To protect sensitive
data, implement password protection that
requires visitors to log in.
|