How To Ensure Optimal Crawling Of Your Website?

It is one thing to create a website and put up some content, but quite another to get it noticed by Google. Often, the more content you have, the more of your pages get crawled and indexed by search engines. But that is not always the case: if the crawling process is not optimal, search engines might miss some of your content. Today, we have some guidelines for you from Google, explaining which fields in sitemaps are important, when to use XML Sitemaps and RSS/Atom feeds, and how to optimize them for Google.

XML Sitemaps or RSS feeds?

The first question you might ask is: which should you use, XML Sitemaps or RSS/Atom feeds? And should you use RSS/Atom feeds alongside XML Sitemaps? XML Sitemaps are an indispensable part of your site: they describe the complete set of URLs within it. RSS/Atom feeds, on the other hand, describe only the most recent changes.
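To make that concrete, here is a minimal sketch of an XML sitemap in the standard sitemaps.org format; the example.com URLs and timestamps are placeholders, not anything taken from Google's guidelines:

<?xml version="1.0" encoding="UTF-8"?>
<!-- Minimal XML sitemap: one <url> entry per page, each with its canonical
     address and last modification time (all values here are placeholders). -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://example.com/blog/xml-sitemaps-and-rss-feeds</loc>
    <lastmod>2014-10-20T18:30:00+00:00</lastmod>
  </url>
  <url>
    <loc>http://example.com/blog/optimal-crawling</loc>
    <lastmod>2014-10-18T09:15:00+00:00</lastmod>
  </url>
</urlset>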


The trade-off with XML Sitemaps is that they contain information about the whole site, which makes them much larger than RSS feeds, and that also means they are downloaded less frequently. So the question is not which one to use, but rather: why not use both? Each has its own role, and they complement each other.

XML sitemaps give Google information about all the pages on your site, while RSS/Atom feeds let Google know what has been most recently updated on your site. Google also adds that “submitting sitemaps or feeds does not guarantee the indexing of those URLs.”
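For comparison, a minimal Atom feed for the same (hypothetical) site lists only the latest changes, along the lines of the sketch below; again, every URL, title and date is a placeholder:

<?xml version="1.0" encoding="UTF-8"?>
<!-- Minimal Atom feed: only recently added or changed pages appear here,
     each with an <updated> timestamp (all values are placeholders). -->
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Blog</title>
  <id>http://example.com/</id>
  <link href="http://example.com/"/>
  <updated>2014-10-20T18:30:00Z</updated>
  <entry>
    <title>XML Sitemaps and RSS Feeds</title>
    <id>http://example.com/blog/xml-sitemaps-and-rss-feeds</id>
    <link href="http://example.com/blog/xml-sitemaps-and-rss-feeds"/>
    <updated>2014-10-20T18:30:00Z</updated>
  </entry>
</feed>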

Sitemap and RSS/Atom feed best practices

To optimize the crawl process, use XML Sitemaps together with RSS/Atom feeds. Here are some of Google's best practices for both.
  • The two most important pieces of information for Google are the URL itself and its last modification time.
  • Only include URLs that can be fetched by Googlebot (i.e., don’t include URLs blocked by robots.txt).
  • Only include canonical URLs.
  • Specify a last modification time for each URL in XML sitemaps and RSS/Atom feeds.
  • For a single XML sitemap, update it at least once a day and ping Google each time.
  • For a set of XML sitemaps, maximize the number of URLs in each XML sitemap. The limit is 50,000 URLs or a maximum size of 10MB uncompressed. Ping Google each time an XML sitemap in the set is updated (see the sitemap index sketch after this list).
  • When a new page is added or an existing page meaningfully changed, add the URL and the modification time to the RSS/Atom feed.
  • So that Google does not miss updates, the RSS/Atom feed should contain all the updates made since at least the last time Google downloaded it. The best way to achieve this is by using PubSubHubbub.
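When a site needs a set of XML sitemaps, the individual files are tied together by a sitemap index file; the sketch below uses placeholder example.com URLs and dates. At the time of writing, Google can be pinged about an updated sitemap (or sitemap index) with a simple GET request of the form http://www.google.com/ping?sitemap=<sitemap-url>.

<?xml version="1.0" encoding="UTF-8"?>
<!-- Sitemap index pointing at two child sitemaps, each of which may hold
     up to 50,000 URLs or 10MB uncompressed (all values are placeholders). -->
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://example.com/sitemap-posts.xml</loc>
    <lastmod>2014-10-20T18:30:00+00:00</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://example.com/sitemap-pages.xml</loc>
    <lastmod>2014-10-18T09:15:00+00:00</lastmod>
  </sitemap>
</sitemapindex>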
Good luck getting your webpages crawled quickly :)

