What is the GoogleBot?
The GoogleBot is the automated spider program that Google uses to visit websites and report back with the content that goes into its search engine results. Understanding how the GoogleBot works and how it interprets your web pages can help you increase your search engine rankings.
There are many factors that Google considers when indexing web pages. Here are some questions and answers about how the GoogleBot works and how to get better search engine rankings.
What is host load limit?
Host load limit is the number of simultaneous connections that your web server allows at any given time.
On a web server that hosts a lot of busy websites, each website is limited in the number of connections it can have.
That is why it’s important to have a reliable host that does not overload its servers. With enough connections available, the GoogleBot can index your website quickly and entirely without being restricted.
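To make the idea of a connection limit concrete, here is a minimal sketch of a polite crawler that never opens more than a fixed number of connections to a host at once. The limit of 2, the function names and the simulated fetch are all assumptions for illustration; this is not how the GoogleBot itself is implemented, and no real network requests are made.

```python
import threading
import time

MAX_CONNECTIONS = 2  # hypothetical host load limit, chosen for illustration


def crawl(urls, max_connections=MAX_CONNECTIONS):
    """Simulate fetching urls and return the peak number of simultaneous fetches."""
    slots = threading.Semaphore(max_connections)
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def fetch_page(url):
        with slots:  # wait here until the host has a free connection slot
            with lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            time.sleep(0.01)  # stand-in for the real network request
            with lock:
                state["active"] -= 1

    threads = [threading.Thread(target=fetch_page, args=(u,)) for u in urls]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]
```

Calling crawl() with ten URLs and a limit of 2 never reports a peak above 2, so the remaining pages simply wait their turn. A lower limit means a slower crawl of the same site.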
What is crawl budget?
The number of web pages that Google will crawl on your website is roughly proportional to your PageRank. The lower the PageRank, the fewer pages of your website will be crawled. If you have a higher PageRank, most, if not all, of your pages will be crawled.
The better your PageRank, the deeper the GoogleBot will go when indexing your website.
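The effect of a limited budget on crawl depth can be shown with a toy model. The sketch below walks a made-up site breadth-first and stops once the budget runs out. The site structure, the budget values and the function name are assumptions, and Google does not publish how it actually sets a budget.

```python
from collections import deque

# Hypothetical site structure: each page maps to the pages it links to.
SITE_LINKS = {
    "/": ["/products", "/blog"],
    "/products": ["/products/a", "/products/b"],
    "/blog": ["/blog/post-1"],
    "/products/a": [],
    "/products/b": [],
    "/blog/post-1": [],
}


def crawl_with_budget(links, start, budget):
    """Breadth-first crawl that stops once `budget` pages have been visited."""
    visited = []
    seen = {start}
    queue = deque([start])
    while queue and len(visited) < budget:
        page = queue.popleft()
        visited.append(page)
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return visited
```

With a budget of 3, only the home page and the two pages it links to are visited. With a budget of 10, all six pages are reached, including the deepest ones.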
How does duplicate content affect crawl budget?
If the GoogleBot crawls three of your web pages and they all have the same content, Google will index only one of those pages and discard the other two. It will then assign a lower PageRank value because of the duplicate content, which results in fewer pages being indexed by Google.
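You can check your own site for this kind of duplication before a crawler finds it. The following is a minimal sketch that fingerprints each page's text and keeps only the first copy. The normalisation step and the function names are assumptions, and it only catches exact copies; Google's own duplicate detection is not public and is certainly more sophisticated.

```python
import hashlib


def fingerprint(text):
    """Hash the page text after normalising whitespace and case."""
    normalised = " ".join(text.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def deduplicate(pages):
    """Return (kept, dropped) URL lists, keeping the first copy of each text."""
    seen = set()
    kept, dropped = [], []
    for url, text in pages.items():
        key = fingerprint(text)
        if key in seen:
            dropped.append(url)
        else:
            seen.add(key)
            kept.append(url)
    return kept, dropped
```

Given three URLs with the same text, deduplicate() keeps the first and reports the other two as dropped, which mirrors the "index one, discard two" behaviour described above.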
Are Session IDs bad for my website?
Google does not recommend the use of session IDs in URLs for a number of reasons. They make addresses ugly and hard for users to remember, and fewer users are likely to click on a complicated link.
Google and Yahoo both have tools that let you tell the crawler to ignore unwanted variables in the URL. Any variables that do not add value to the user’s experience can be excluded from indexing.
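If you want to clean such URLs yourself, for example before printing them in links or a sitemap, a minimal sketch with Python's standard urllib.parse module looks like this. The parameter names in SESSION_PARAMS are assumptions; replace them with whatever your site really uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical session parameter names; adjust to match your own site.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid"}


def strip_session_params(url):
    """Return url with any session parameters removed from its query string."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(query), parts.fragment)
    )
```

For example, strip_session_params("http://example.com/shop?item=42&PHPSESSID=abc123") returns "http://example.com/shop?item=42", so the parameter that matters to the user stays and the session ID goes.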
Do 301 redirects pass on the link juice?
If you move a page or even change domain names, it is highly recommended to use 301 redirects. The link juice is carried over to the new page. If the original page has a PageRank of 3, the new redirected page would inherit it.
There is some decay in the PageRank value, but much of it is carried over.
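A 301 is simply an HTTP response with status code 301 and a Location header that points to the new address. Here is a minimal sketch using Python's built-in http.server module. The paths in REDIRECTS and the port number are made up for illustration, and on a real site you would more likely configure redirects in your web server or CMS.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical mapping of old paths to their new locations.
REDIRECTS = {
    "/old-page.html": "/new-page.html",
}


def resolve(path):
    """Return (status, location) for a request path."""
    if path in REDIRECTS:
        return 301, REDIRECTS[path]
    return 200, None


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, location = resolve(self.path)
        self.send_response(status)
        if location:
            # The Location header tells browsers and crawlers where the page moved.
            self.send_header("Location", location)
        self.end_headers()
        if status == 200:
            self.wfile.write(b"This page has not moved.\n")


def serve(port=8000):
    """Start the demo server on localhost (blocks until interrupted)."""
    HTTPServer(("127.0.0.1", port), RedirectHandler).serve_forever()
```

After calling serve(), a request for /old-page.html gets a 301 response pointing at /new-page.html, while any other path is answered normally.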
What is a canonical tag?
Google has a fantastic post, “Specify your canonical”, that describes everything about canonical tags, including the newer support for canonical tags that point across domain names.
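In short, a canonical tag is a link element with rel="canonical" in the head of a page, and its href names the preferred address for that content. To check which canonical address a page declares, a minimal sketch with Python's built-in html.parser module could look like this. The class and function names, and the example URL in the note below, are assumptions for illustration.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attributes.get("href")


def find_canonical(html):
    """Return the canonical URL declared in html, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical
```

For a page whose head contains a link tag with rel="canonical" and href="http://www.example.com/product", find_canonical() returns that href. It returns None when no canonical tag is present.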