How To Build a Custom Sitemap for Your Gatsby.js Site

Gatsby.js lets you build static sites with React that are easy to deploy. These sites are fast and offer a smooth developer experience. Gatsby has been growing steadily, with developers using it for blogs, marketing sites, and e-commerce sites.

Every time you build a site, you can help search engine crawlers discover your content, which supports your presence in organic search. To do that, search engines like Google need to understand your site’s architecture and index it intelligently. Including a sitemap.xml file at the root of your site helps with this.

With the growing popularity of Gatsby.js and JAMstack sites in mind, in this tutorial you’ll automatically generate a custom XML sitemap file for your Gatsby-powered website.

XML Sitemap

A website is composed of several web pages like About, Contact, Blog, Subscribe, etc. A sitemap file maintains a list of all these pages to tell search engines like Google about the organization of your site content. Search engine web crawlers like Googlebot read through this file and crawl your site intelligently.

In the early days of the web, HTML sitemaps, which were manually generated bulleted lists, were the trend. As sitemaps grew in popularity and importance, they came to be published in XML instead of HTML, since their target audience is search engines and not people.

So, an XML sitemap file communicates with the search engines about all the pages that exist on your website.
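To make the format concrete, here is a minimal sketch of what such a file contains, built by hand in JavaScript. The buildSitemapXml helper and the example URLs are illustrative assumptions and not part of Gatsby or any plugin; the sketch only shows the urlset, url, and loc elements that a basic XML sitemap uses.

```javascript
// Hypothetical helper: builds a minimal XML sitemap string from a list of URLs.
// The function name and the example URLs are assumptions for illustration.
function buildSitemapXml(urls) {
  // Escape the characters that are special in XML so each URL stays valid.
  const escapeXml = (value) =>
    value
      .replace(/&/g, '&amp;')
      .replace(/</g, '&lt;')
      .replace(/>/g, '&gt;')
      .replace(/"/g, '&quot;')
      .replace(/'/g, '&apos;');

  // One <url> element per page, each holding the page address in <loc>.
  const entries = urls
    .map((url) => `  <url>\n    <loc>${escapeXml(url)}</loc>\n  </url>`)
    .join('\n');

  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    entries,
    '</urlset>',
  ].join('\n');
}

const xml = buildSitemapXml([
  'https://yoursite.com/',
  'https://yoursite.com/about/',
  'https://yoursite.com/blog/',
]);

console.log(xml);
```

In practice you would not write this by hand for a Gatsby site; the point is only to show the kind of file a crawler reads.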

Importance of Adding a Sitemap File

Sitemaps are an important part of Search Engine Optimization (SEO), although they do not directly affect your search rankings. Instead, if a web page has not been indexed, the sitemap tells search engines about that page so that it can be indexed appropriately.

Sitemaps are important for both new and old sites. If your site is relatively new, adding one is especially recommended, since it is difficult for search engines to find the posts and pages of a new site. You want to make the search engine’s job as easy as possible.

You will find sitemap.xml files on most websites. They help search engine bots keep track of updates and of every page on a site that should be indexed.

Adding a Sitemap in Gatsby

One key highlight of Gatsby is its growing collection of plugins, which implement Gatsby APIs and are distributed as npm packages.

To create a sitemap, you don’t have to write much code yourself. A Gatsby plugin called gatsby-plugin-sitemap generates the sitemap file for you.

You’ll need to have a Gatsby site up and running before continuing with this tutorial.

Installation

To install the gatsby-plugin-sitemap package, run the following command in the root folder:

    npm install --save gatsby-plugin-sitemap

Using the Plugin: gatsby-plugin-sitemap

Now that the plugin is installed, it’s time to add it to the gatsby-config.js file. As a quick reminder for those who are new to Gatsby: the gatsby-config.js file sits in the root folder of your project and holds all of the site configuration options.

One of these configuration options is plugins. It holds an array of plugins that implement Gatsby APIs. Some plugins are listed by name only, while others are listed together with options. gatsby-plugin-sitemap accepts options as well.

So, add the following lines of code in your gatsby-config.js file:

gatsby-config.js

    module.exports = {
      siteMetadata: {
        title: 'Your Site Title',
        siteUrl: 'https://yoursite.com',
      },
      plugins: ['gatsby-plugin-sitemap'],
    }

Make sure that you change the title and siteUrl inside siteMetadata to match your project details. The plugin uses siteUrl as the base for the URLs in the sitemap.

Generating A Sitemap File

To create a sitemap you need to generate a production build, because the plugin does not produce output in development mode. In your terminal, type the following command and press ENTER:

    npm run build

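Wait a few seconds for the build to finish, and you have a working sitemap for your Gatsby site. To preview it locally, serve the production build, for example with gatsby serve.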

Gif of the output from the `npm run build` command

By default, the sitemap is generated in the root of your website, which is the folder called public in your project. When you deploy your site, the sitemap is available at /sitemap.xml and lists all of your site’s pages that are currently accessible to users.

You can access the sitemap of your site with the following URL:

https://your-domain/sitemap.xml 
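Before deploying, it can help to confirm that the build actually wrote the file and to see which URLs it lists. The following is a minimal sketch of such a check as a Node.js script. The extractLocs helper is hypothetical, and the public/sitemap.xml location is an assumption based on the default described above; some versions of the plugin may write a differently named file, so adjust the path if yours does.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: pulls every <loc> value out of a sitemap XML string.
// A regular expression is enough for a quick check; it is not a full XML parser.
function extractLocs(xml) {
  const locs = [];
  const pattern = /<loc>([^<]+)<\/loc>/g;
  let match;
  while ((match = pattern.exec(xml)) !== null) {
    locs.push(match[1].trim());
  }
  return locs;
}

// Assumed location, based on the default output described in this tutorial.
const sitemapPath = path.join(process.cwd(), 'public', 'sitemap.xml');

if (fs.existsSync(sitemapPath)) {
  const locs = extractLocs(fs.readFileSync(sitemapPath, 'utf8'));
  console.log(`Found ${locs.length} URLs in ${sitemapPath}`);
  locs.forEach((loc) => console.log(loc));
} else {
  console.log(`No sitemap at ${sitemapPath}; run a production build first.`);
}
```

Run the script from the root folder of the project after npm run build finishes.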

The gatsby-plugin-sitemap plugin supports advanced custom options, so this default behavior can be changed. Let’s dig a little deeper into these options.

Advanced Options for gatsby-plugin-sitemap

The gatsby-plugin-sitemap plugin supports several advanced options that give you more control over your sitemap.xml file. Let’s take a look at some of these:

  • output: The default file name is sitemap.xml. Use the output option to change the name of the output file. For example, with output: '/some-other-sitemap.xml', the URL becomes https://your-domain/some-other-sitemap.xml.
  • exclude: Lists the paths that you want to leave out of the sitemap.
  • query: A custom GraphQL query that fetches the data the sitemap is built from, such as siteMetadata, siteUrl, and allSitePage.
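As a rough sketch, the output and exclude options from the list above could be combined in gatsby-config.js as follows. The option names are the ones listed above. The excluded paths are hypothetical, and passing exclude as an array of path patterns is an assumption that may vary between plugin versions, so check the plugin’s documentation for the version you have installed.

```javascript
// gatsby-config.js (sketch; the excluded paths are made up for illustration)
const config = {
  siteMetadata: {
    title: 'Your Site Title',
    siteUrl: 'https://yoursite.com',
  },
  plugins: [
    {
      resolve: 'gatsby-plugin-sitemap',
      options: {
        // Write the sitemap to a custom file name instead of /sitemap.xml.
        output: '/some-other-sitemap.xml',
        // Assumed shape: an array of path patterns to leave out of the sitemap.
        exclude: ['/admin/*', '/thank-you/'],
      },
    },
  ],
};

module.exports = config;
```

Note that the plugin is now listed as an object with resolve and options keys rather than by name alone, which is how Gatsby plugins receive their options.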

There are a couple of other handy options as well, covering sitemapSize and the sitemap index. Visit the official plugin repo for more information.

Customized Options Example

In this example, we customize the plugin’s options to generate data of our choice. Here, we’ve customized the GraphQL query and the serialize function:

    {
      resolve: `gatsby-plugin-sitemap`,
      options: {
        query: `{
          site {
            siteMetadata {
              siteUrlNoSlash
            }
          }
          allSitePage {
            edges {
              node {
                path
              }
            }
          }
          allMarkdownRemark {
            edges {
              node {
                fields {
                  slug
                }
              }
            }
          }
        }`,
        // Build one sitemap entry per site page and per markdown post.
        serialize: ({ site, allSitePage, allMarkdownRemark }) => {
          const pages = []
          allSitePage.edges.forEach(edge => {
            pages.push({
              url: site.siteMetadata.siteUrlNoSlash + edge.node.path,
              changefreq: `daily`,
              priority: 0.7,
            })
          })
          allMarkdownRemark.edges.forEach(edge => {
            pages.push({
              url: `${site.siteMetadata.siteUrlNoSlash}/${edge.node.fields.slug}`,
              changefreq: `daily`,
              priority: 0.7,
            })
          })

          return pages
        },
      },
    },

Here, we use the query option to fetch data for our site. From siteMetadata we read siteUrlNoSlash, which is a custom field, so it must be defined in siteMetadata in gatsby-config.js. Next, we query allSitePage to get the URL paths of all site pages, retrieving the path property of each node through the edges. Finally, we query allMarkdownRemark, which covers the files written in Markdown that are converted into HTML pages. From it we get the slug of each Markdown post from inside the fields property.

Towards the end, we define the serialize function, which receives the data fetched for allSitePage and allMarkdownRemark. For each page and post, it returns an entry with the page URL, changefreq: 'daily', and priority: 0.7.
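To see what serialize produces without running a full build, you can call the same logic on mock data. The sketch below is a standalone variation and not the plugin’s own code: the joinUrl helper is hypothetical and was added so that a slug with a leading slash does not produce a double slash, and the mock values are made up to match the shape of the query above.

```javascript
// Hypothetical helper: joins a base URL and a path with exactly one slash.
function joinUrl(base, pagePath) {
  return `${base.replace(/\/+$/, '')}/${pagePath.replace(/^\/+/, '')}`;
}

// Same idea as the serialize option above, written as a standalone function.
function serialize({ site, allSitePage, allMarkdownRemark }) {
  const base = site.siteMetadata.siteUrlNoSlash;
  const toEntry = (pagePath) => ({
    url: joinUrl(base, pagePath),
    changefreq: 'daily',
    priority: 0.7,
  });

  // Site pages first, then markdown posts, as in the configuration above.
  return [
    ...allSitePage.edges.map((edge) => toEntry(edge.node.path)),
    ...allMarkdownRemark.edges.map((edge) => toEntry(edge.node.fields.slug)),
  ];
}

// Mock query result shaped like the GraphQL query above (values are made up).
const mockResult = {
  site: { siteMetadata: { siteUrlNoSlash: 'https://yoursite.com' } },
  allSitePage: {
    edges: [{ node: { path: '/' } }, { node: { path: '/about/' } }],
  },
  allMarkdownRemark: {
    edges: [{ node: { fields: { slug: '/blog/hello-world/' } } }],
  },
};

const pages = serialize(mockResult);
console.log(pages.map((page) => page.url));
```

Testing serialize this way makes it easier to catch malformed URLs before they end up in the generated sitemap.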

This was one demonstration of the custom options for gatsby-plugin-sitemap. You can adapt them to the requirements of your project.

You can access a live sitemap demo for Gatsby.js here.

Gif showing the sitemap.xml

Conclusion

Gatsby.js makes generating sitemaps more manageable. Creating a sitemap.xml file with gatsby-plugin-sitemap is one example of how the developer experience of using Gatsby.js is improving.