Overview

Generate XML sitemaps compatible with Google and other search engines.

First, define lists or functions that supply URLInfo objects describing the URLs on your site:

from fresco_sitemap import URLInfo

# These can be simple lists:
page_urls = [URLInfo('index.html', changefreq=URLInfo.daily,
                     priority="0.9"),
             URLInfo('termsconditions.html', changefreq=URLInfo.yearly,
                     priority="0.1")]

# Or functions that return lists of URLInfo objects
def morepage_urls():
    return [URLInfo('red.html'),
            URLInfo('green.html'),
            URLInfo('blue.html')]

# Or functions that generate long lists of URLs, e.g. from a database.
# Here `db` and `urlfor` are assumed to be provided by your application.
def product_urls():
    query = db.execute("SELECT id, modified_at FROM products")
    for id, modified_at in query:
        yield URLInfo(
            urlfor('myapp.views.view_product', id=id),
            lastmod=modified_at
        )
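
For orientation, the fields passed to URLInfo above (location, lastmod, changefreq, priority) match the elements of a <url> entry in the sitemaps.org protocol. The following is a minimal standard-library sketch of what such an entry looks like. It is not fresco_sitemap's own implementation, and the helper names url_entry and build_urlset are hypothetical:

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace defined by the sitemaps.org protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def url_entry(loc, lastmod=None, changefreq=None, priority=None):
    """Build one <url> element (hypothetical helper, for illustration only)."""
    url = ET.Element("url")
    ET.SubElement(url, "loc").text = loc
    if lastmod is not None:
        # The protocol expects W3C datetime format, e.g. YYYY-MM-DD
        ET.SubElement(url, "lastmod").text = lastmod.isoformat()
    if changefreq is not None:
        ET.SubElement(url, "changefreq").text = changefreq
    if priority is not None:
        ET.SubElement(url, "priority").text = str(priority)
    return url


def build_urlset(entries):
    """Wrap <url> elements in a <urlset> root and serialize to a string."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    urlset.extend(entries)
    return ET.tostring(urlset, encoding="unicode")


xml_text = build_urlset([
    url_entry("http://www.example.org/index.html",
              lastmod=date(2024, 1, 15), changefreq="daily", priority="0.9"),
    url_entry("http://www.example.org/red.html"),
])
print(xml_text)
```

Only <loc> is required by the protocol, which is why URLInfo('red.html') with no further arguments is a reasonable entry.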

Now create an instance of SitemapViews:

from fresco_sitemap.views import SitemapViews
from fresco_sitemap import SitemapFile

sitemap_views = SitemapViews(

    # List the individual sitemap files you want to produce.
    # Notice that each sitemap file may contain more than one source of
    # URLs. These are positional arguments, so Python requires them to
    # come before the keyword arguments below.
    SitemapFile('sitemap-products.xml', product_urls),
    SitemapFile('sitemap-pages.xml', page_urls, morepage_urls),

    output_dir='htdocs/sitemaps',

    # Sitemap index filename. This can be None if you only have a single
    # sitemap file
    index='sitemap.xml',
)

# Finally, include this in your app configuration
from fresco import FrescoApp

app = FrescoApp()
app.include('/', sitemap_views)

This example would create three sitemap files:

  • /sitemap.xml (index file)
  • /sitemap-pages.xml
  • /sitemap-products.xml
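
To illustrate how the index file relates to the other two, here is a rough sketch of a sitemap index in the sitemaps.org format, built with the standard library only. The helper build_sitemap_index and the base URL are assumptions made for the example. The files fresco_sitemap actually writes may differ in detail:

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(base_url, filenames):
    """Return sitemap index XML listing each sitemap file (illustrative helper)."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for name in filenames:
        sitemap = ET.SubElement(index, "sitemap")
        # Each <sitemap> entry points at the absolute URL of one sitemap file
        ET.SubElement(sitemap, "loc").text = base_url.rstrip("/") + "/" + name
    return ET.tostring(index, encoding="unicode")


index_xml = build_sitemap_index(
    "http://www.example.org/",
    ["sitemap-products.xml", "sitemap-pages.xml"],
)
print(index_xml)
```

Search engines only need to be told about the index file; they discover the individual sitemap files from its <loc> entries.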

To update the sitemaps you will need to write a short Python script, for example:

import requests
# Assumes sitemap_views is defined in myapp alongside app
from myapp import app, sitemap_views

with app.requestcontext("http://www.example.org/"):
    sitemap_views._update()
    sitemap_url = app.urlfor(sitemap_views.sitemap_file,
                             filename=sitemap_views.index_file)

requests.get('http://www.google.com/ping', params={'sitemap': sitemap_url})

You will need to ensure that the user running this script has write access to the specified output_dir.
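
One way to catch a permissions problem early is to test the directory before running the update. This is a small sketch using only the standard library. The function name check_output_dir is hypothetical, and a temporary directory stands in for 'htdocs/sitemaps':

```python
import os
import tempfile


def check_output_dir(path):
    """Return True if `path` is an existing directory the current user can write to."""
    return os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)


# Demonstrate with a temporary directory standing in for 'htdocs/sitemaps'
with tempfile.TemporaryDirectory() as tmp:
    writable = check_output_dir(tmp)
    missing = check_output_dir(os.path.join(tmp, "does-not-exist"))

print(writable, missing)
```

A check like this is advisory only: permissions can change between the check and the write, so the update script should still be prepared for the write itself to fail.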

The script uses the requests library to notify Google once the sitemap has been updated successfully.