
What Exactly A Search Engine Robot Can See On Your Site


Search engine crawlers and robots are just software applications. That said, they are very powerful programs. All the major search engines crawl, copy and index billions of pages across the globe. They then perform incredibly complicated analysis, not just of the pages themselves but also of how they all link to each other. All this information is organised in very large data stores, which are replicated across global data centers to make the search service as fast for the user as current technology allows. As it stands, most users get responses to their search queries in tenths of a second. For most of us, it is the connection to the internet, not the search engine’s servers, that is the bottleneck.

Amazing as this is, there are limitations to what search engines can do. Being mechanical, the search engine software’s understanding of the content on a page can sometimes be limited. When visitors come to your site, they see the displayed text. When search engines crawl a page, however, they retrieve the raw HTML content. Using your browser’s “View page source” option will show you roughly how the search engines view your page.

Most HTML markup is ignored by the search engines, which focus on the text between the tags. This text is how they determine the ‘unique’ content of each page on your site.
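To make the idea of "text between the tags" concrete, here is a minimal sketch in Python using the standard library's `html.parser`. It is only an illustration of the principle, not how any real search engine works; the class name `VisibleTextExtractor`, the function `extract_text` and the sample page are all made up for this example.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text found between tags, skipping <script> and <style> bodies."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        # The tag and its attributes are discarded; we only note
        # whether we have entered a block whose body is not page text.
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(raw_html):
    """Return the text content of raw_html as one space-separated string."""
    parser = VisibleTextExtractor()
    parser.feed(raw_html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page used only for illustration.
sample = """<html><head><style>p { color: red; }</style></head>
<body><h1>Our Widgets</h1><p>Hand-made <b>blue</b> widgets.</p>
<script>var x = 1;</script></body></html>"""

print(extract_text(sample))  # Our Widgets Hand-made blue widgets.
```

The headings, bold tags and styling all disappear; what remains is the wording a crawler would have to work with.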

That said, there are some exceptions. The most obvious is the page title. This one element plays a fairly significant role in determining the overall ranking of your page. To visitors to your site, the page title is displayed in the blue line right at the top of the browser window, also known as the title bar. In most cases, this title tag is also used as the title of your page when it is shown in the search engine results pages.

This is not always the case, though. Some of the more common exceptions arise when a site has obtained a DMOZ or Yahoo directory listing. In such cases, the search engines sometimes use the page title from these listings instead of the one on the page. Using HTML meta tags, though, you can override this default behavior.
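The article does not name the meta tags involved. The directives historically documented for this purpose were `noodp` (do not use the DMOZ / Open Directory title) and `noydir` (do not use the Yahoo directory title), placed in a robots meta tag; treat that as an assumption here rather than something the text confirms. The sketch below, with the hypothetical names `RobotsMetaReader` and `robots_directives`, simply checks whether a page carries those directives.

```python
from html.parser import HTMLParser


class RobotsMetaReader(HTMLParser):
    """Collects the directives listed in <meta name="robots" content="...">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for part in (attr.get("content") or "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(raw_html):
    """Return the set of lower-cased robots directives found in raw_html."""
    reader = RobotsMetaReader()
    reader.feed(raw_html)
    reader.close()
    return reader.directives


# Hypothetical page head; noodp / noydir are the assumed directive names.
sample = (
    "<head><title>Hand-Made Blue Widgets</title>"
    '<meta name="robots" content="NOODP, NOYDIR"></head>'
)

found = robots_directives(sample)
print("noodp" in found, "noydir" in found)  # True True
```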

Additionally, the search engines will read any “meta keyword” tags you might have on the site. This is usually a list of keywords that you wish to use to further describe the page’s content, even though they may not themselves appear in the content as it is displayed to visitors. In the past, spammers abused this tag so much that its value has been significantly reduced and is now virtually negligible. Google in particular no longer uses this tag at all in determining its rankings, while Yahoo! and Bing do seem to still include it in their analysis. As a consequence, spending a lot of time on meta keywords is not going to result in significant improvement in your site’s SERP performance.

The “meta description” tag, on the other hand, plays no role in your page’s SERP rankings, but is often used as the description for your page in search results. A well-crafted “meta description”, which will be displayed to search engine users, can therefore influence whether users actually click through to your site from the results pages.
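The three head elements discussed so far (the title, the meta keywords and the meta description) can be pulled out of a page's raw HTML with a few lines of Python. This is a rough sketch for auditing your own pages, assuming well-formed markup; `HeadTagReader`, `read_head_tags` and the sample page are illustrative names, not part of any search engine's tooling.

```python
from html.parser import HTMLParser


class HeadTagReader(HTMLParser):
    """Reads the <title> text and the keywords/description meta tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            name = (attr.get("name") or "").lower()
            if name in ("keywords", "description"):
                self.meta[name] = attr.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text inside <title>...</title> is kept here.
        if self._in_title:
            self.title += data


def read_head_tags(raw_html):
    """Return the title, keywords and description found in raw_html."""
    reader = HeadTagReader()
    reader.feed(raw_html)
    reader.close()
    return {
        "title": reader.title.strip(),
        "keywords": reader.meta.get("keywords", ""),
        "description": reader.meta.get("description", ""),
    }


# Hypothetical page used only for illustration.
sample = """<html><head>
<title>Hand-Made Blue Widgets | Example Shop</title>
<meta name="keywords" content="blue widgets, hand-made widgets">
<meta name="description" content="Hand-made blue widgets, shipped worldwide.">
</head><body><p>Welcome.</p></body></html>"""

for field, value in read_head_tags(sample).items():
    print(f"{field}: {value}")
```

Running this against your own pages is a quick way to spot missing titles or empty descriptions before a crawler does.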

Understanding how search engines see your pages can therefore help you spend your limited SEO resources wisely. Without that understanding, you risk spending a lot of time and money optimizing HTML tags that have no effect whatsoever on your SERP rankings.