Tuesday, May 20, 2008

All about Googlebot

Robot: a machine that resembles a human and does mechanical, routine tasks on command.


So goes the dictionary definition. Well, Googlebot is the Google robot that actually visits all your pages and then indexes them. Right now the internet is so huge and vast that people say only 1/6th of it is indexed as of now. Though Google is just about the biggest knowledge database as far as the net is concerned, it still has a long way to go.


How has this come to happen? It happens because the crawlers that the Google guys initially wrote depend on the list of URLs they find embedded in HTML files to start with. This tree-like structure then grows and keeps adding new URLs every day. Googlebot needs your webpage to be linked from somewhere for it to come and access it. With the surprising revelations from The Google Story, you will come to know that the indexing algorithm they employ is one of a kind, and that they keep it a closely guarded secret together with the compression they are trying to achieve.
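To make the link-following idea concrete, here is a minimal sketch in Python. It is not Google's crawler, whose internals are secret; it only illustrates the general technique of starting from a list of known URLs and discovering new ones through embedded links. The `PAGES` dictionary, the `crawl` function and the example URLs are all made up for illustration, and the "web" here is an in-memory dictionary rather than real HTTP requests.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(pages, seeds):
    """Breadth-first discovery of pages reachable from the seed URLs.

    `pages` is a hypothetical stand-in for the web: a dict mapping
    each URL to its HTML. Returns the URLs in the order visited.
    """
    seen = set(seeds)      # every URL discovered so far
    queue = deque(seeds)   # URLs waiting to be visited
    visited = []
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:   # linked to, but no such page exists
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


# Made-up pages: the orphan page links out, but nothing links to it.
PAGES = {
    "http://a.example/": '<a href="http://b.example/">B</a> '
                         '<a href="http://c.example/">C</a>',
    "http://b.example/": '<a href="http://a.example/">A</a>',
    "http://c.example/": "<p>no links here</p>",
    "http://orphan.example/": '<a href="http://a.example/">A</a>',
}

if __name__ == "__main__":
    print(crawl(PAGES, ["http://a.example/"]))
```

Starting from the single seed, the sketch finds pages a, b and c but never reaches the orphan page, because no visited page links to it. That is the same reason a real page with no inbound links stays invisible to a link-following crawler.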


Another way to get your page into their index is the Google ping, which I will write about in a later article. You can do a little bit of searching to learn more about it. It gives you a place to add your URL. Now, there are hundreds and maybe millions of pages out there which don't have any backlinks from any site Google has in its database, so there is no way Google will come to know of these pages, and no way they will be indexed, unless Googlebot goes out of its way to access each and every individual server that hosts anything online. And once Google has it, we have it.


Googlebot comes in two types: the deep bot and the fresh bot. The deep bot does the whole indexing, together with the words and sentences in your articles, pages and blogs, once a month. Given that the number of pages is increasing exponentially every day, along with the number of people spamdexing, the deep bot really has to strike a balance between speed and correct indexing.


The fresh bot, on the other hand, caters to the need for immediate indexing of recently added pages.

Currently webmasters complain about Googlebot clogging their bandwidth by crawling one too many times. You can actually change the crawler's settings for your website in Webmaster Tools.

