Why Doesn’t Googlebot Crawl Enough Pages on Some Sites?
Google’s John Mueller discusses what determines how many pages on a website get crawled and why some pages don’t get crawled.

Google’s John Mueller was asked during a Google SEO Office Hours hangout why Google wasn’t crawling enough web pages. The person asking the question noted that Google was crawling at a pace that was too slow to keep up with an enormously large website. John Mueller explained why Google might not be crawling enough pages.
What Is a Google Crawl Budget?
Googlebot is the name of Google’s crawler, which visits web pages in order to index them for ranking purposes. But because the web is vast, Google’s strategy is to index only higher quality web pages and not index low quality web pages.
According to Google’s developer page for large websites:

“The amount of time and resources that Google devotes to crawling a site is commonly called the site’s crawl budget.

Note that not everything crawled on your site will necessarily be indexed; each page has to be evaluated, consolidated, and assessed to determine whether it will be indexed after it has been crawled.

Crawl budget is determined by two main elements: crawl capacity limit and crawl demand.”
Related: Google SEO 101: Website Crawl Budget Explained
What Determines the Googlebot Crawl Budget?
The person asking the question has a website with hundreds of thousands of pages. But Google was crawling only about 2,000 web pages per day, a rate that is too slow for such a large site.

The person asked the following question:

“Do you have any other recommendations for getting insight into the current crawl budget? Just because I feel like we’ve really been trying to improve but haven’t seen a jump in the pages crawled per day.”
Google’s Mueller asked how big the site is.

The person asking the question answered:

“There are hundreds of thousands of pages on our website. And we’ve noticed that about 2,000 pages per day are being crawled, even though there’s a backlog of over 60,000 pages that have been discovered but not yet indexed or crawled.”
John Mueller of Google answered:

“So, I see two main reasons why that happens in practice.

On the one hand, if the server is significantly slow, which is… the response time, I think you can see that in the crawl stats report as well.

That’s one area where… like, if I had to give you a number, I’d say aim for something below 300, 400 milliseconds, something like that on average.

Because that allows us to crawl as much as we need.

It’s not the same thing as page speed. So that’s… one thing to watch out for.”
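For a rough, do-it-yourself check against that figure, the sketch below times how long a page takes to start responding and compares the average to the roughly 300 to 400 millisecond range Mueller mentions. The URL and sample count are placeholder assumptions, and a client-side timing like this only approximates what the Crawl Stats report measures from Googlebot’s side.

```python
# Minimal sketch, assuming a hypothetical test URL: time how long the server
# takes to begin returning a page, averaged over a few requests, and compare
# the result to the ~300-400 ms figure mentioned above.
import time
import urllib.request

URL = "https://www.example.com/"  # placeholder; use a real page on your site
SAMPLES = 10

timings_ms = []
for _ in range(SAMPLES):
    start = time.monotonic()
    with urllib.request.urlopen(URL) as response:
        response.read(1)  # read the first byte so we measure time to first byte
    timings_ms.append((time.monotonic() - start) * 1000)

average_ms = sum(timings_ms) / len(timings_ms)
print(f"Average response time over {SAMPLES} requests: {average_ms:.0f} ms")
if average_ms > 400:
    print("Slower than the ~300-400 ms range suggested above.")
```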
Site Quality Can Impact the Googlebot Crawl Budget

Google’s John Mueller next raised the issue of site quality. Poor site quality can keep the Googlebot crawler from crawling a website.

Google’s John Mueller said:

“The other big reason why we don’t crawl a lot from websites is because we’re not convinced about the overall quality.

So that’s something where I sometimes see us struggle with that, especially with newer sites.

And I also sometimes see people thinking, well, since we have a database and we just put it online, it’s technically possible to create a website with a million pages.

And just by doing that, essentially from one day to the next we’ll find a lot of these pages, but we won’t be sure about the quality of these pages yet.

And we’ll be a bit more cautious about crawling and indexing them until we’re sure that the quality is actually good.”

Factors That Affect How Many Pages Google Crawls

Other factors that can affect how many pages Google crawls were not discussed.
For example, a website hosted on a shared server might be unable to serve pages to Google quickly enough because other sites on the server could be using excessive resources, slowing the server down for the hundreds of other sites hosted on it.

Another explanation might be that the server is being hit by rogue bots, slowing the website down.
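One way to spot rogue bot traffic, assuming you have access to the raw server access logs, is to tally requests by user agent and look for unfamiliar agents with unusually high hit counts. The log path and regular expression below are assumptions about a typical combined-format log and will need adjusting to your server.

```python
# Minimal sketch, assuming a combined-format access log: count requests per
# user agent so that unexpectedly busy, unfamiliar agents (possible rogue
# bots) stand out.
import re
from collections import Counter

LOG_PATH = "/var/log/nginx/access.log"  # placeholder path; adjust to your server
# In the combined log format, the user agent is the last quoted field on each line.
USER_AGENT_RE = re.compile(r'"([^"]*)"\s*$')

counts = Counter()
with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
    for line in log:
        match = USER_AGENT_RE.search(line)
        if match:
            counts[match.group(1)] += 1

# Show the ten busiest user agents by request count.
for agent, hits in counts.most_common(10):
    print(f"{hits:8d}  {agent}")
```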
John Mueller’s recommendation to pay attention to how quickly the server serves web pages is helpful. Just be sure to check it after hours, late at night, because many crawlers such as Google will crawl in the early morning hours, since that is generally a less disruptive time to crawl, when there are fewer visitors on the site.