Sitemaps
A map of a website: sitemaps portray the structure of a web presence, including all directories and subpages. They are not necessarily intended for visitors, who usually have a much clearer navigation tool at their disposal. That doesn’t mean website operators can neglect their sitemap, however. So what is this overview for, and what kinds of sitemaps are there?
What is a sitemap?
A sitemap lists all documents (in other words, webpages) of a website and presents them hierarchically, so the structure of the entire web presence is mirrored in this overview. To understand it, you should briefly familiarise yourself with the set-up of a website: a basic website consists of individual HTML documents, which are stored in various folders and interconnected via hyperlinks. Together, they are stored on the webspace. In the sitemap, each page is recorded along with its URL.
In the early days of the World Wide Web, the sitemap was principally created to make website navigation easier for users. Often inserted as a frame in addition to the main content, sitemap documents gave visitors the opportunity to move from one page to another at any time, without having to click through individual hyperlinks one after the other. Nowadays, navigation is usually solved much more elegantly, but the sitemap still has its place. For one thing, having this additional navigation tool can increase user-friendliness, and for another, search engines make use of these files.
XML vs. HTML: sitemap comparison
It is common to differentiate between two versions of the sitemap: XML sitemaps and HTML sitemaps. If you want to make the sitemap available to website visitors, you should make an HTML sitemap. This is essentially an additional document that is part of the website, and can be incorporated into the structure of its online presence just like any other HTML page. A sitemap that is created in the XML format, however, is primarily oriented toward search engines. XML is a markup language like HTML, but it is extensible: it can define its own elements, which lets a sitemap carry structured metadata about each URL.
Each format therefore has its own advantages and disadvantages. A navigation file in the HTML format can be used by visitors to the website without complications. Users can easily find their way around the site via the links when they are looking for something. In this way, the sitemap complements the search function and the navigation bar. The sitemap is thus an additional website component that increases user-friendliness. These days, the sitemap usually isn’t integrated as a frame. Instead, it is common to provide a link to the overview document in the header or footer of the website, for example.
If you create a sitemap in the XML format, you have the option to submit it in Google Search Console. This will allow the search engine to gain a better understanding of your entire website. XML also allows you to create a so-called video sitemap. It is difficult for Google and other search engines to read the content of video files, which makes them dependent on additional data, called metadata. If you have incorporated videos into your site and would like Google to integrate them into its video search, you should provide a video sitemap.
This requires creating an XML file that supplies data about the individual clips on the site. The data includes the title and description of the video file, the URL of the subpage on which the clip is shown, a link to a thumbnail picture, and the storage location of the video player you used. The same strategy also applies to images, so that they show up in image searches.
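The video metadata described above can be sketched in Python using only the standard library. This is a minimal illustration, not a complete implementation: the function name, the dictionary keys, and the example URLs are assumptions made for this sketch, while the element names follow Google's video sitemap extension.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
VIDEO_NS = "http://www.google.com/schemas/sitemap-video/1.1"


def build_video_sitemap(entries):
    """Return a video sitemap as an XML string.

    Each entry is a dict with the hypothetical keys page_url,
    thumbnail_url, title, description and player_url.
    """
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("video", VIDEO_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for entry in entries:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        # The subpage on which the clip is shown.
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = entry["page_url"]
        video = ET.SubElement(url, f"{{{VIDEO_NS}}}video")
        fields = (
            ("thumbnail_loc", "thumbnail_url"),
            ("title", "title"),
            ("description", "description"),
            ("player_loc", "player_url"),
        )
        for tag, key in fields:
            ET.SubElement(video, f"{{{VIDEO_NS}}}{tag}").text = entry[key]
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_video_sitemap([{
        "page_url": "http://one-test.website/our-projects",
        "thumbnail_url": "http://one-test.website/thumbs/intro.jpg",
        "title": "Intro clip",
        "description": "A short introduction to our projects.",
        "player_url": "http://one-test.website/player?video=intro",
    }]))
```

Building the document with an XML library rather than string concatenation means that special characters in titles or URLs are escaped automatically.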
As a webmaster, you luckily don’t need to choose between an XML and an HTML sitemap. Using both is possible; in fact, it provides the best results, for visitors to the website as well as for the web crawlers sent by Google and other search engines. Although the XML option is directly oriented toward the search engine, web crawlers also use HTML sitemaps as an easy way to take all pages into account when examining the website.
You can find more information about how to make a strong XML sitemap in our comprehensive article on the topic.
Sitemaps and SEO
Sitemaps play a large role in search engine optimisation (SEO). Why is this? Search engines send out programs – the well-known web crawlers, otherwise known as search bots – to sift through the Internet in order to understand and index it as completely as possible. When such a program arrives on a website, it follows the hyperlinks to discover the site’s content. There is no guarantee that the web crawler will be able to record all subpages, though. This is especially relevant to very extensive websites. A sitemap – in XML as well as in HTML format – simplifies the search engine bot’s examination process by providing it with an index of all the webpages.
Even when it comes to pages that are not very well connected to other pages, a sitemap is more than helpful. Web crawlers always follow hyperlinks to move through the World Wide Web. This is why every single page should be linked in a sitemap. Google unfortunately can’t guarantee that the bot will truly take each page into consideration, but the chances of this are at least higher. It is also relevant if the website is still fairly new and if few or no other websites link to its pages.
A strong sitemap in the XML format provides the search engine with additional data about each page: When was it last modified? How often is it updated? How important is it in relation to the other pages of the website?
Even though one can generally say that an HTML sitemap is oriented more toward users and its counterpart in XML more toward web crawlers, both are important in an SEO context. Sitemaps in an HTML format have just as much influence on ranking, because these documents are also considered during the examination of a site. When determining the ranking order of the search results, Google pays attention to websites’ user-friendliness, too. A clearly organised sitemap boosts usability and can lead to an improved ranking.
Creating a sitemap – explained with examples
Creating a sitemap is not a hard process, and using a sitemap generator makes it even easier. The best course of action depends on the format you are going for. The HTML sitemap is generally easier to create. It only requires knowing a few HTML ground rules, especially how to mark up links correctly. By using href attributes, you can compile a list of links. In practice, webmasters put more effort into the presentation of the sitemap, for example adapting the design of the navigation document to the rest of the website.
<li class="lpage"><a href="http://one-test.website/" title="Theme Preview – Previewing Another WordPress Blog">Theme Preview – Previewing Another WordPress Blog</a></li>
<li class="lpage"><a href="http://one-test.website/about-us" title="About us – Theme Preview">About us – Theme Preview</a></li>
<li class="lpage"><a href="http://one-test.website/our-projects" title="Our Projects – Theme Preview">Our Projects – Theme Preview</a></li>
<li class="lpage"><a href="http://one-test.website/sample-page" title="Sample Page – Theme Preview">Sample Page – Theme Preview</a></li>
<li class="lpage"><a href="http://one-test.website/shop" title="Products – Theme Preview">Products – Theme Preview</a></li>
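A list like the one above does not have to be typed by hand. The following Python sketch builds it from a list of (URL, title) pairs; the function name and the plain `<ul>` wrapper are assumptions made for this illustration, and styling attributes such as class names are left out.

```python
from html import escape


def build_html_sitemap(pages):
    """Return an HTML <ul> list linking to every (url, title) pair in pages."""
    items = []
    for url, title in pages:
        # Escape both values so characters such as & or " cannot break the markup.
        items.append(
            f'<li><a href="{escape(url, quote=True)}">{escape(title)}</a></li>'
        )
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


if __name__ == "__main__":
    print(build_html_sitemap([
        ("http://one-test.website/", "Home"),
        ("http://one-test.website/about-us", "About us"),
        ("http://one-test.website/shop", "Products"),
    ]))
```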
Creating the file in the XML format is somewhat more involved. The sitemap begins with a <urlset> tag, and the individual URLs are entered inside this element. Each URL is in turn embedded in a <url> tag, while the actual link to the subpage goes in a <loc> tag. While these elements always need to be included, the additional details about the frequency of page edits (<changefreq>), about the date of the last edit (<lastmod>), and about the importance of the page (<priority>) are optional.
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://one-test.website/</loc>
    <lastmod>2018-03-23T14:32:21+00:00</lastmod>
    <priority>1.00</priority>
  </url>
  <url>
    <loc>http://one-test.website/about-us</loc>
    <lastmod>2018-03-23T14:32:21+00:00</lastmod>
    <priority>0.80</priority>
  </url>
</urlset>
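The same structure can also be produced programmatically. Below is a minimal Python sketch using the standard library; the function name and the dictionary layout of the input are assumptions for illustration. The required `<loc>` element is always written, and the optional elements only when a value is supplied.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return an XML sitemap string.

    pages is a list of dicts with the key 'loc' and, optionally,
    'lastmod', 'changefreq' and 'priority'.
    """
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page["loc"]
        # Optional elements are only emitted when present in the input.
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, f"{{{SITEMAP_NS}}}{tag}").text = str(page[tag])
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap([
        {"loc": "http://one-test.website/",
         "lastmod": "2018-03-23T14:32:21+00:00", "priority": "1.00"},
        {"loc": "http://one-test.website/about-us", "priority": "0.80"},
    ]))
```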
Those who want to make it easier for themselves and avoid writing the entire sitemap manually can fall back on a sitemap generator. All you have to do when using these web services is simply enter the main URL of your own web presence – the sitemap generator will then search the entire website and create an index of all its pages in the process. These helpful online tools are available for XML as well as for HTML sitemaps. Some generators even create several variations at once for the user. For some content management systems, such as WordPress, plugins for the creation of sitemaps are available.
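What such a generator does internally can be sketched roughly as follows. This is a simplified illustration, not how any particular tool works: the names `LinkCollector`, `crawl` and `fetch` are hypothetical, and `fetch` is passed in as a function so that the sketch can run against an in-memory site instead of making real network requests.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href values of all <a> tags in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch):
    """Return a sorted list of pages on the same host reachable from start_url.

    fetch(url) must return the page's HTML as a string, or None if the
    page cannot be retrieved.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    found = []
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # broken link: do not list it in the sitemap
        found.append(url)
        parser = LinkCollector()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop #fragments.
            target, _fragment = urldefrag(urljoin(url, href))
            if urlparse(target).netloc == host and target not in seen:
                seen.add(target)
                queue.append(target)
    return sorted(found)
```

Called as `crawl("http://one-test.website/", site.get)` with a dictionary `site` that maps URLs to HTML strings, the function returns every internal page it can reach by following links from the start page.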
Google suggestions for sitemaps
Although you have a lot of freedom in deciding what the navigation document will look like, Google sets a few requirements for sitemaps that you should meet if you want to improve your search engine ranking. Specifically, the sitemap should be encoded in UTF-8, should not include more than 50,000 URLs, and should not be larger than 50 MB. The size limit applies to the uncompressed file. You can submit compressed versions of sitemaps to Google as well, but this doesn’t increase the maximum file size.
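These limits are easy to check before submitting a file. The following Python sketch assumes the sitemap is available as a string and that the number of URLs is already known; the function and parameter names are hypothetical. The size is measured on the UTF-8 encoded, uncompressed text.

```python
MAX_URLS = 50_000
MAX_BYTES = 50 * 1024 * 1024  # 50 MB, measured on the uncompressed file


def check_sitemap_limits(xml_text, url_count, max_urls=MAX_URLS, max_bytes=MAX_BYTES):
    """Return a list of problems; an empty list means the limits are met."""
    problems = []
    size = len(xml_text.encode("utf-8"))
    if url_count > max_urls:
        problems.append(f"too many URLs: {url_count} > {max_urls}")
    if size > max_bytes:
        problems.append(f"file too large: {size} bytes > {max_bytes} bytes")
    return problems
```

An empty result means the sitemap can be submitted as a single file; otherwise it needs to be split up.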
Google recommends the creation of several sitemaps for web presences that are especially extensive. After doing so, you must create an index file that references all other sitemaps, and submit this to the search engine.
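Splitting a large URL list and writing the index file could look roughly like this in Python. The function names and the example file names are assumptions for illustration; the `<sitemapindex>`, `<sitemap>` and `<loc>` elements come from the sitemap protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunk_urls(urls, size=50_000):
    """Split a list of URLs into chunks of at most `size` entries."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return an index file that references each sitemap in sitemap_urls."""
    ET.register_namespace("", SITEMAP_NS)
    index = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(index, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = sitemap_url
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical file names for the individual sitemaps.
    print(build_sitemap_index([
        "http://one-test.website/sitemap-1.xml",
        "http://one-test.website/sitemap-2.xml",
    ]))
```

Each chunk would be written to its own sitemap file, and only the index is then submitted to the search engine.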
You don’t necessarily need to include all pages of a website in the sitemap. For pages that can be accessed via several URLs, you only need to list the preferred address. The same applies to pages that are very similar (for example, pages that have the same content but were created for use on different devices). You only need to enter the so-called canonical page that Google is supposed to work with.
Finally, there are two options for putting the sitemap on Google’s radar. One is to submit the sitemap directly in Search Console, and the other is to add a reference to the file in robots.txt. This text file is intended for search engines and is typically the first file that web crawlers retrieve. By linking to the sitemap on the server, you tell the search bot where it should look next.
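The reference in robots.txt is a single line of the form `Sitemap: <URL>`. The Python sketch below adds such a line to existing robots.txt content if it is not there yet; the function name and the location `http://one-test.website/sitemap.xml` are assumptions for illustration.

```python
def add_sitemap_reference(robots_txt, sitemap_url):
    """Return robots.txt content that contains a Sitemap line for sitemap_url."""
    line = f"Sitemap: {sitemap_url}"
    existing = [entry.strip() for entry in robots_txt.splitlines()]
    if line in existing:
        return robots_txt  # already referenced, nothing to do
    if robots_txt and not robots_txt.endswith("\n"):
        robots_txt += "\n"
    return robots_txt + line + "\n"


if __name__ == "__main__":
    print(add_sitemap_reference(
        "User-agent: *\nDisallow:",
        "http://one-test.website/sitemap.xml",
    ))
```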