datasette-atom
Datasette plugin that adds support for generating Atom feeds with the results of a SQL query.
Installation
Install this plugin in the same environment as Datasette to enable the .atom output extension.
$ pip install datasette-atom
Usage
To create an Atom feed you need to define a custom SQL query that returns a required set of columns:
- atom_id - a unique ID for each row. This article has suggestions about ways to create these IDs.
- atom_title - a title for that row.
- atom_updated - an RFC 3339 timestamp representing the last time the entry was modified in a significant way. This can usually be the time that the row was created.
The following columns are optional:
- atom_content - content that should be shown in the feed. This will be treated as a regular string, so any embedded HTML tags will be escaped when they are displayed.
- atom_content_html - content that should be shown in the feed. This will be treated as an HTML string, and will be sanitized using Bleach to ensure it does not have any malicious code in it before being returned as part of a <content type="html"> Atom element. If both are provided, this will be used in place of atom_content.
- atom_link - a URL that should be used as the link that the feed entry points to.
- atom_author_name - the name of the author of the entry. If you provide this you can also provide atom_author_uri and atom_author_email with a URL and e-mail address for that author.
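To illustrate what "escaped" means for atom_content, here is a minimal sketch using Python's standard html.escape. It only demonstrates the general idea; the sample value is made up, and the plugin's own escaping happens inside its feed generation rather than through this exact call.

```python
from html import escape

# Hypothetical column value containing markup (illustration only).
value = "<p>Open <b>daily</b> & free</p>"

# Treated as a plain string: the tags are escaped, so a feed reader
# would display them literally instead of rendering them.
as_text = escape(value)
print(as_text)
```

The same value passed through atom_content_html would instead be kept as HTML, subject to the Bleach allow-list.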
A query that returns these columns can then be returned as an Atom feed by adding the .atom extension.
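Before adding the .atom extension it can help to check that a query's result columns include the required names. The helper below is a hypothetical sketch (missing_atom_columns is not part of datasette-atom) that compares a list of column names against the three required columns listed above.

```python
# The three column names the feed requires, as listed above.
REQUIRED_COLUMNS = {"atom_id", "atom_title", "atom_updated"}


def missing_atom_columns(columns):
    """Return the required Atom columns that are absent, in sorted order."""
    return sorted(REQUIRED_COLUMNS - set(columns))


# A query that forgot atom_updated:
print(missing_atom_columns(["atom_id", "atom_title", "atom_link"]))
```

An empty list means the query has every required column.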
Example
Here is an example SQL query which generates an Atom feed for new entries on www.niche-museums.com:
select
  'tag:niche-museums.com,' || substr(created, 0, 11) || ':' || id as atom_id,
  name as atom_title,
  created as atom_updated,
  'https://www.niche-museums.com/browse/museums/' || id as atom_link,
  coalesce(
    '<img src="' || photo_url || '?w=800&h=400&fit=crop&auto=compress">',
    ''
  ) || '<p>' || description || '</p>' as atom_content_html
from
  museums
order by
  created desc
limit
  15
You can try this query by pasting it in here - then click the .atom link to see it as an Atom feed.
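To see what the query produces without a running Datasette instance, it can be run against a throwaway SQLite database. This is a sketch under assumptions: the museums table layout is inferred from the columns the query references, and the two sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row

# Assumed schema: only the columns the example query refers to.
conn.execute(
    "create table museums ("
    "id integer primary key, name text, created text, "
    "photo_url text, description text)"
)
# Made-up sample rows; the second has no photo.
conn.executemany(
    "insert into museums values (?, ?, ?, ?, ?)",
    [
        (1, "Example Museum", "2019-11-20T10:00:00-08:00",
         "https://example.com/1.jpg", "A sample."),
        (2, "No Photo Museum", "2019-11-21T10:00:00-08:00",
         None, "Another sample."),
    ],
)

QUERY = """
select
  'tag:niche-museums.com,' || substr(created, 0, 11) || ':' || id as atom_id,
  name as atom_title,
  created as atom_updated,
  'https://www.niche-museums.com/browse/museums/' || id as atom_link,
  coalesce(
    '<img src="' || photo_url || '?w=800&h=400&fit=crop&auto=compress">',
    ''
  ) || '<p>' || description || '</p>' as atom_content_html
from museums
order by created desc
limit 15
"""

rows = [dict(row) for row in conn.execute(QUERY)]
for row in rows:
    print(row["atom_id"], "|", row["atom_content_html"])
```

Note how coalesce() handles the row with no photo: concatenating with a NULL photo_url yields NULL, so the image tag is replaced by an empty string and only the description paragraph remains.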
Using a canned query
Datasette's canned query mechanism is a useful way to configure feeds. If a canned query definition has a title, that will be used as the title of the Atom feed.
Here's an example, defined using a metadata.yaml file:
databases:
  browse:
    queries:
      feed:
        title: Niche Museums
        sql: |-
          select
            'tag:niche-museums.com,' || substr(created, 0, 11) || ':' || id as atom_id,
            name as atom_title,
            created as atom_updated,
            'https://www.niche-museums.com/browse/museums/' || id as atom_link,
            coalesce(
              '<img src="' || photo_url || '?w=800&h=400&fit=crop&auto=compress">',
              ''
            ) || '<p>' || description || '</p>' as atom_content_html
          from
            museums
          order by
            created desc
          limit
            15
Disabling HTML filtering
The HTML allow-list used by Bleach for the atom_content_html column can be found in the clean(html) function at the bottom of datasette_atom/__init__.py.
You can disable Bleach entirely for Atom feeds generated using a canned query. You should only do this if you are certain that no user-provided HTML could be included in that value.
Here's how to do that in metadata.json:
{
    "plugins": {
        "datasette-atom": {
            "allow_unsafe_html_in_canned_queries": true
        }
    }
}
Setting this to true will disable Bleach filtering for all canned queries across all databases.
You can disable Bleach filtering just for a specific list of canned queries like so:
{
    "plugins": {
        "datasette-atom": {
            "allow_unsafe_html_in_canned_queries": {
                "museums": ["latest", "moderation"]
            }
        }
    }
}
This will disable Bleach just for the canned queries called latest and moderation in the museums.db database.
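The two forms of the setting can be read as a simple rule: true applies everywhere, while a dictionary applies only to the listed queries in the listed databases. The function below is a hypothetical sketch of that rule, written to match the behavior described above; it is not the plugin's actual code, and the name unsafe_html_allowed is an assumption.

```python
def unsafe_html_allowed(plugin_config, database, canned_query):
    """Decide whether Bleach filtering is disabled for one canned query.

    plugin_config is the "datasette-atom" plugin configuration dict (or None).
    """
    setting = (plugin_config or {}).get("allow_unsafe_html_in_canned_queries")
    if setting is True:
        # true: disabled for all canned queries across all databases.
        return True
    if isinstance(setting, dict):
        # {"database": ["query", ...]}: disabled only for listed queries.
        return canned_query in setting.get(database, [])
    # Setting absent (or any other value): keep filtering enabled.
    return False


config = {"allow_unsafe_html_in_canned_queries": {"museums": ["latest", "moderation"]}}
print(unsafe_html_allowed(config, "museums", "latest"))
print(unsafe_html_allowed(config, "museums", "feed"))
```

Under this reading, any query not named in the configuration keeps Bleach filtering enabled, which is the safer default.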