WordPress Search Engine Optimization: Google XML sitemaps
How do you let search engines know about all the pages on your site? How often are they updated? Is the copy indexed and cached by the search engine up to date with the current online version, or should the spider re-index it? For the past couple of years, the answer to these questions has been XML sitemaps.
An XML Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site (source: http://www.sitemaps.org/).
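To make the definition concrete, here is a minimal sketch in Python that writes such a file. The function name `build_sitemap` and the example.com URLs are made up for illustration; the element names (`urlset`, `url`, `loc`, `lastmod`, `changefreq`, `priority`) and the namespace come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(entries):
    """Return sitemap XML (as a string) for a list of URL entry dicts."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # <loc> is the only required child; the others are optional hints.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    return ET.tostring(urlset, encoding="unicode")

# Hypothetical URLs and values, for illustration only
sitemap_xml = build_sitemap([
    {"loc": "http://www.example.com/", "lastmod": "2009-01-15",
     "changefreq": "daily", "priority": 0.8},
    {"loc": "http://www.example.com/a-post/", "priority": 0.9},
])
print(sitemap_xml)
```

Only `loc` is mandatory for each URL; the other three values are hints that a search engine is free to ignore.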
If you want to develop your own XML sitemap, you might be interested in studying the XML sitemap protocol.
I implemented my own sitemap builder for several sites whose page counts range from 20,000 to 100,000. Because of the high number of URLs, the sitemap is split into several files, each containing the pages created during a given month. Every night a cron job rebuilds the “current month” XML file and pings search engines to notify them of the update.
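The monthly split can be sketched roughly as follows. This is not my production code, just an illustration of the idea in Python: the function names, the `sitemap-YYYY-MM.xml` file naming, the ping endpoint and the `sitemap` query parameter are all assumptions. The `sitemapindex` element is the one the protocol defines for listing several sitemap files.

```python
from collections import defaultdict
from datetime import date
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def group_by_month(pages):
    """Map 'YYYY-MM' -> list of URLs, for pages given as (url, created_date)."""
    months = defaultdict(list)
    for url, created in pages:
        months[created.strftime("%Y-%m")].append(url)
    return dict(months)

def build_index(base_url, month_keys):
    """Return a <sitemapindex> document pointing at one file per month."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for key in sorted(month_keys):
        sitemap = ET.SubElement(index, "sitemap")
        # File naming scheme is an assumption made for this sketch
        ET.SubElement(sitemap, "loc").text = f"{base_url}/sitemap-{key}.xml"
    return ET.tostring(index, encoding="unicode")

def ping_url(endpoint, sitemap_url):
    """Build the URL a nightly job would request to announce an update.

    The endpoint is a placeholder and the 'sitemap' parameter name is an
    assumption; check each search engine's documentation for the real ones.
    """
    return f"{endpoint}?{urlencode({'sitemap': sitemap_url})}"

# Hypothetical pages and dates
pages = [
    ("http://www.example.com/post-1/", date(2009, 1, 10)),
    ("http://www.example.com/post-2/", date(2009, 1, 28)),
    ("http://www.example.com/post-3/", date(2009, 2, 3)),
]
months = group_by_month(pages)
index_xml = build_index("http://www.example.com", months)
print(index_xml)
```

With this layout the nightly job only has to regenerate the file for the current month; older files never change, so spiders have no reason to download them again.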
But if you’re using WordPress, you don’t need to develop anything because there’s a very good plugin that does everything: XML-Sitemap.
I left almost every plugin setting to its default value except:
- Notify YAHOO about updates of your Blog: checked and updated with my application id
- Add sitemap URL to the virtual robots.txt file: unchecked, because I have my own robots.txt file with the sitemap directive
- Sitemap Content: I checked Include homepage, Include posts, Include static pages and Include tag pages, so that the sitemap does not include pages marked as duplicates in the All in one SEO plugin
- Minimum post priority (Even if auto calculation is enabled): 0.9. Yes, the home page is the most important page because it contains the list of the newest posts, but it is the post detail page that I want displayed in SERPs, so I chose to give posts a higher priority
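If you maintain your own robots.txt as I do, the sitemap directive is a single `Sitemap:` line followed by the full URL of the file. A small sketch in Python to check which sitemaps a robots.txt declares; the function name and the sample file content are made up for illustration.

```python
def sitemap_urls(robots_txt):
    """Return the URLs declared by 'Sitemap:' lines in a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        # Drop trailing comments (simplification: assumes no '#' in the URL)
        line = line.split("#", 1)[0].strip()
        # Split on the first ':' only, so the 'http://' in the value survives
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls

# Hypothetical robots.txt content
robots = """User-agent: *
Disallow: /wp-admin/
Sitemap: http://www.example.com/sitemap.xml
"""
print(sitemap_urls(robots))
```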
As soon as you have your own sitemap (you can see mine here), sign up for the Google Webmaster Tools program.
Google Webmaster Tools
The first thing you need to do is add your site url and verify it. Google wants to be sure you are the site owner, because if you are, it lets you know some very interesting information. If you need help during this process, leave a comment.
Then submit your sitemap in the “Sitemaps” section and then… wait. The sitemap takes a few hours to be analyzed, but unfortunately all the other data that Webmaster Tools provides might take weeks to be collected. Google is fast for users but very slow for webmasters :D
Helpful data Google Webmaster Tools provides
I suggest you have a look at every Webmaster Tools page and, of course, at the help section too, which hides some pretty cool nuggets.
These are the sections I use the most:
- Dashboard > Diagnostics > Web crawl: warns you about possible errors the spider came across
- Dashboard > Diagnostics > Content analysis: warns you about title tag and meta description problems; I have two meta descriptions that are too short, which were indexed before I installed the All in one SEO plugin
- Dashboard > Statistics > Top search queries > Impressions: while your statistics tool shows the search queries that brought visitors to your site, here you see the queries for which your site appeared in search results but was not chosen; it’s the best place to find title tags and meta descriptions that need to be more persuasive, provided the keyword phrase shown there is relevant to your page
- Dashboard > Statistics > What GoogleBot sees: the link text in <a> tags on external sites pointing to yours. It’s very important for two reasons: 1) you understand how your blog is perceived from outside; 2) the keywords contained in links to one of your pages are added to the ones written on the page itself, so they might make the page less relevant (try contacting the website owner) or help you (thank him)
- Sitemaps > Sitemap details: provides information about your submitted sitemap and compares the URLs submitted in the XML file with the ones actually indexed by Google
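The comparison in the last item can be reproduced by hand if you have a list of the URLs you know to be indexed. A rough sketch in Python, assuming you have collected that list yourself (the list, the function names and the URLs below are hypothetical; nothing here calls Webmaster Tools):

```python
import xml.etree.ElementTree as ET

# Tag prefix for the sitemaps.org namespace, in ElementTree's {uri}tag form
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def submitted_urls(sitemap_xml):
    """Return the set of <loc> values found in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return {loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc")}

def not_indexed(sitemap_xml, indexed):
    """Return submitted URLs that are missing from the indexed list."""
    return sorted(submitted_urls(sitemap_xml) - set(indexed))

# Hypothetical sitemap and indexed list
sitemap_xml = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/a-post/</loc></url>
</urlset>"""
missing = not_indexed(sitemap_xml, ["http://www.example.com/"])
print(missing)
```

The pages that end up in `missing` are the ones worth checking for back links and build quality.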
Yahoo! Site Explorer
Yahoo! Site Explorer is quite similar to Google Webmaster Tools. Once you submit and verify your website, you can see how many pages of your blog are indexed by Yahoo! Adding your feed URL in the “Feed” section might prompt the search engine to send a spider to check your latest posts.
Is it really worth it?
I often wonder if sitemaps are really worth the effort they take. My answer leans toward no. You can submit all the pages of your site, but if they have no back links or are not well built, they won’t appear in search results.
Does pinging search engines about a freshly updated sitemap bring the spider to your site so that your new or updated pages get indexed? No, the spider comes when the search engine wants. On the big sites I manage (where a lot of content is published daily), there are always a couple of spiders downloading pages, but they do not always download the newly created ones. They just follow links. The sentence at the end of sitemaps.org, repeated by every search engine that claims to support sitemaps, could not be more true: Using the Sitemap protocol does not guarantee that web pages are included in search engines, but provides hints for web crawlers to do a better job of crawling your site. But since we have a plugin that does all the work, producing sitemaps is not really an effort :)