PowerShell script to make an XML sitemap

A while back I wrote a post on how to create a sitemap in the standard sitemaps.org format using Python. This post does the same task using PowerShell. The solution presented here is an idiomatic PowerShell solution using pipes, not a direct translation of the Python code. I’ll introduce the script in pieces, then present the entire script at the end.

The final line of the script is

dir | wrap | out-file -encoding ASCII sitemap.xml

The heart of the script is the function wrap that wraps each file’s properties in the necessary XML tags. This function uses the pipeline, and so it has begin, process, and end blocks. The begin block prints out the XML header and the opening <urlset> tag. The end block prints out the closing </urlset> tag. In between is the process block that does most of the work.
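To see the three blocks in isolation, here is a minimal sketch with nothing sitemap-specific in it. The function name Add-Brackets and the input values are made up for illustration; they are not part of the script.

```powershell
# Hypothetical function, only to show when each block runs.
function Add-Brackets
{
    begin   { "start" }    # runs once, before the first pipeline item arrives
    process { "[$_]" }     # runs once per item; $_ is the item being processed
    end     { "end" }      # runs once, after the last item
}

$result = 1, 2, 3 | Add-Brackets
$result
```

The output is one "start" line, one bracketed line per input item, and one "end" line, which is the same shape as the sitemap: header, one entry per file, footer.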

Since a PowerShell function returns the value of every expression that is not assigned to a variable or otherwise consumed, the code is very clean. No need for print statements, just state the strings that make up the output. Variable interpolation helps keep the code succinct as well: simply use the name of a variable where you want to insert that variable’s value in a string. (Be sure to use double quotes if you want interpolation; single-quoted strings are taken literally.)
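Here is a small sketch of both points, implicit output and interpolation. The function name Get-Greeting and the variable $name are hypothetical.

```powershell
# Hypothetical function, only to illustrate implicit output and quoting.
function Get-Greeting
{
    $name = "world"      # an assignment produces no output
    "Hello, $name"       # double quotes: $name is replaced by its value
    'Hello, $name'       # single quotes: the text is emitted as written
}

$lines = Get-Greeting
$lines
```

The function returns two strings, and only the first has the variable’s value substituted in.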

The wrap function uses the automatic variable $_ which means “the current object in the pipeline.” Since we’re piping in the output of dir (alias for Get-ChildItem), $_ represents a FileSystemInfo object. We look at the Extension property on this object to see whether the file is one of the types we want to include in the sitemap, in this case .html, .htm, or .pdf. You can edit the value of the variable $extensions if you want to include different file types in your sitemap.
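The extension test can be tried on its own without touching the file system. This is only a sketch: the file names are invented, and it relies on the fact that a FileInfo object can be created for a file that does not exist.

```powershell
$extensions = ".htm", ".html", ".pdf"

# Hypothetical file names, cast to FileInfo so they have an Extension property.
$files = "index.html", "notes.txt", "paper.PDF" |
    ForEach-Object { [System.IO.FileInfo]$_ }

# -contains compares case-insensitively by default, so ".PDF" matches ".pdf".
$kept = $files |
    Where-Object { $extensions -contains $_.Extension } |
    ForEach-Object { $_.Name }

$kept
```

If you need a case-sensitive match, the -ccontains operator is the variant to use.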

Getting the file timestamp in the necessary format is particularly easy. The format specifier {0:s} causes the date and time to be written in the ISO 8601 format that the sitemap standard requires. The Z tacked on at the end indicates that the time is UTC rather than some other time zone, which is why the script reads the file’s LastWriteTimeUtc property rather than the local LastWriteTime.
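Here is the format specifier applied to a fixed date rather than a file timestamp. The date is arbitrary, and the [datetime]::new syntax assumes PowerShell 5 or later.

```powershell
# Arbitrary example timestamp, explicitly marked as UTC.
$stamp = [datetime]::new(2024, 3, 5, 14, 7, 9, [System.DateTimeKind]::Utc)

# {0:s} is the "sortable" pattern yyyy-MM-ddTHH:mm:ss; the Z is a literal appended by us.
$line = "<lastmod>{0:s}Z</lastmod>" -f $stamp
$line
# <lastmod>2024-03-05T14:07:09Z</lastmod>
```

Note that the s pattern itself carries no time zone information, so the Z is only truthful when the value being formatted really is in UTC.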

This script will produce a file sitemap.xml in the standard format. Once you upload the sitemap to your server, you need to tell the search engines how to find it. The simplest way to do this is to create a file called robots.txt in the root of your site containing one line, Sitemap: followed by the URL of your sitemap.

Sitemap: http://www.yourdomain.com/sitemap.xml

Now here’s the full script.

# Change this to your URL
$domain = "http://www.yourdomain.com"

# file extensions to include in sitemap
$extensions = ".htm", ".html", ".pdf"

# wrap file information in XML tags
function wrap
{
    begin
    {
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    }

    process
    {
        if ($extensions -contains $_.Extension)
        {
            "`t<url>"
            # Use .Name explicitly: a bare $_ can expand to the full path in newer PowerShell versions
            "`t`t<loc>$domain/$($_.Name)</loc>"
            "`t`t<lastmod>{0:s}Z</lastmod>" -f $_.LastWriteTimeUtc
            "`t</url>"
        }
    }

    end
    {
        "</urlset>"
    }
}

dir | wrap | out-file -encoding ASCII sitemap.xml

One thought on “PowerShell script to make an XML sitemap”

  1. At first I was a little confused as to how this script should be executed. I initially thought that the script ended with the last close-curly-brace. I expected to dot-source the code then execute the script via “dir | wrap | out-file -encoding ASCII sitemap.xml”.
    That did work, but I later realized that ALL your code should be put into a file (myScript.ps1) and I should execute the file via “.\myScript.ps1”.
    Very cool results, either way!
