An open source multi-tool for exploring and publishing data



Datasette is a tool for exploring and publishing data. It helps people take data of any shape or size, analyze and explore it, and publish it as an interactive website and accompanying API.

Datasette is aimed at data journalists, museum curators, archivists, local governments, scientists, researchers and anyone else who has data that they wish to share with the world. It is part of a wider ecosystem of tools and plugins dedicated to making working with structured data as productive as possible.

Try a demo and explore 33,000 power plants around the world, then take a look at some other examples of Datasette in action.

Then read how to get started with Datasette, subscribe to the monthly-ish newsletter and consider signing up for office hours for an in-person conversation about the project.

New: Datasette Desktop - a macOS desktop application for easily running Datasette on your own computer!

Exploratory data analysis

Import data from CSVs, JSON, database connections and more. Datasette will automatically show you patterns in your data and help you share your findings with your colleagues.

Instant data publishing

datasette publish lets you instantly publish your data to hosting providers like Google Cloud Run, Heroku or Vercel.

Rapid prototyping

Spin up a JSON API for any data in minutes. Use it to prototype and prove your ideas without building a custom backend.
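
As a rough sketch of what consuming that JSON API could look like, the Python below builds the URL for a table's JSON endpoint. The host, database and table names are hypothetical placeholders. The `.json` suffix and the `_shape=array` and `_size` query parameters follow Datasette's documented JSON options, though details may vary between versions.

```python
from urllib.parse import urlencode


def table_json_url(base_url, database, table, **params):
    """Build the URL for a Datasette table's JSON endpoint.

    Assumes the database and table names are already URL-safe.
    """
    query = {"_shape": "array"}  # ask for a plain JSON array of row objects
    query.update(params)
    return f"{base_url.rstrip('/')}/{database}/{table}.json?{urlencode(query)}"


# Hypothetical instance, database and table names
url = table_json_url("https://example.com", "mydb", "mytable", _size=10)
print(url)
```

The resulting URL can be fetched with any HTTP client, for example `urllib.request.urlopen(url)` followed by `json.load()`.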

Latest news

5th December 2021

Observable notebooks recently added a SQL cell type, allowing SQL queries to be executed as part of an interactive notebook workflow. Alex Garcia built a Datasette Client for these, which allows you to execute queries against any Datasette instance and explore and visualize the results using JavaScript code running in a notebook.

14th October 2021

Datasette 0.59 adds column descriptions in metadata, a new register_commands plugin hook, enhanced --cors support and a bunch of other fixes and documentation improvements. See also the annotated release notes.

8th September 2021

Datasette Desktop is a new macOS desktop application version of Datasette, which supports opening SQLite files on your computer, importing CSV files and installing plugins. I wrote more about how it works in Datasette Desktop—a macOS desktop application for Datasette.

28th July 2021

The Baked Data architectural pattern describes a pattern commonly used with Datasette where the content for a site is bundled inside a SQLite database file and included alongside templates and application code in a deployment to a serverless hosting provider.
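
A minimal sketch of the idea, using only Python's standard `sqlite3` module rather than Datasette itself: a build step writes the content into a SQLite file, and the deployed application opens that file read-only. The table name, columns and file name are hypothetical, chosen for illustration.

```python
import sqlite3
import tempfile
from pathlib import Path


def build_database(path, rows):
    """Build step: bake the site's content into a SQLite file."""
    conn = sqlite3.connect(path)
    with conn:  # commits on success
        conn.execute("CREATE TABLE articles (slug TEXT PRIMARY KEY, title TEXT)")
        conn.executemany("INSERT INTO articles VALUES (?, ?)", rows)
    conn.close()


def open_read_only(path):
    """Runtime step: open the bundled file read-only via a SQLite URI."""
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


# Hypothetical content, standing in for data fetched at build time
db_path = Path(tempfile.mkdtemp()) / "content.db"
build_database(db_path, [("hello", "Hello world"), ("about", "About this site")])

conn = open_read_only(db_path)
titles = [row[0] for row in conn.execute("SELECT title FROM articles ORDER BY slug")]
print(titles)
```

Because the file is opened with `mode=ro`, the running application cannot modify the content; updating the site means rebuilding the file and deploying again.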

15th July 2021

Datasette 0.58 has new plugin hooks, a huge performance improvement for faceting, support for Unix domain sockets and several other improvements. Read the annotated release notes for extra background and context on the release.

5th June 2021

Datasette 0.57 is out with an important security patch plus a number of new features and bug fixes. Datasette 0.56.1, also out today, provides the security patch for users who are not yet ready to upgrade to the latest version.

10th May 2021

Django SQL Dashboard is a new tool that brings a useful authenticated subset of Datasette to Django projects that are built on top of PostgreSQL.

28th March 2021

Datasette 0.56 has bug fixes and documentation improvements, plus some new documented internal APIs for plugin authors and SpatiaLite 5 bundled with the official Datasette Docker container.

18th February 2021

Datasette 0.55 adds support for cross-database SQL queries. You can now run datasette --crossdb one.db two.db and then run queries that join data from tables in both of those database files - see cross-database queries in the documentation for more details.

sqlite-utils 3.6 adds similar features: a db.attach(alias, filepath) Python API method and --attach alias filepath.db command-line option, both for attaching additional databases in order to execute cross-database queries.
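
Cross-database queries of this kind rest on SQLite's ATTACH DATABASE statement. The sketch below shows that underlying mechanism using Python's standard `sqlite3` module, not the Datasette or sqlite-utils APIs themselves. The file names echo the one.db and two.db example above; the tables and rows are hypothetical.

```python
import sqlite3
import tempfile
from pathlib import Path

workdir = Path(tempfile.mkdtemp())
one_path, two_path = workdir / "one.db", workdir / "two.db"

# Create two separate database files with hypothetical tables
one = sqlite3.connect(one_path)
with one:
    one.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
    one.execute("INSERT INTO authors VALUES (1, 'Ada')")
one.close()

two = sqlite3.connect(two_path)
with two:
    two.execute("CREATE TABLE books (title TEXT, author_id INTEGER)")
    two.execute("INSERT INTO books VALUES ('Notes', 1)")
two.close()

# Open the first file, attach the second under the alias "two",
# then join across both in a single query
conn = sqlite3.connect(one_path)
conn.execute("ATTACH DATABASE ? AS two", (str(two_path),))
rows = conn.execute(
    """
    SELECT authors.name, books.title
    FROM authors
    JOIN two.books AS books ON books.author_id = authors.id
    """
).fetchall()
print(rows)
```

Tables in the attached file are addressed by prefixing them with the alias, here `two.books`.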

7th February 2021

This new Video introduction to Datasette and sqlite-utils provides a full introduction to both Datasette and sqlite-utils in 17 minutes, including a live demo of creating a database from a CSV file and publishing it to Google Cloud Run.

3rd February 2021

Serving map tiles from SQLite with MBTiles and datasette-tiles. datasette-tiles is a new plugin that adds a tile server to Datasette, serving map tiles from databases that conform to the MBTiles specification. download-tiles is a tool for building these databases, and datasette-basemap is a plugin that bundles a 22MB SQLite database with OpenStreetMap tiles covering zoom levels 0-6 for the entire world.
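
To illustrate what serving a tile from an MBTiles database involves, here is a small sketch using Python's standard `sqlite3` module. The `tiles` table and its four columns come from the MBTiles specification, which stores rows in TMS order (origin at the bottom), so the y coordinate of a typical web map request has to be flipped. The `get_tile` helper is hypothetical and is not taken from datasette-tiles, whose implementation may differ.

```python
import sqlite3


def get_tile(conn, z, x, y):
    """Return the tile bytes for an XYZ-style request, or None if missing."""
    tms_row = (2 ** z - 1) - y  # convert XYZ y (top origin) to TMS row (bottom origin)
    row = conn.execute(
        "SELECT tile_data FROM tiles "
        "WHERE zoom_level = ? AND tile_column = ? AND tile_row = ?",
        (z, x, tms_row),
    ).fetchone()
    return row[0] if row else None


# In-memory stand-in for an .mbtiles file
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tiles ("
    "zoom_level INTEGER, tile_column INTEGER, tile_row INTEGER, tile_data BLOB)"
)
# Placeholder bytes, not a real image
conn.execute("INSERT INTO tiles VALUES (1, 0, 1, ?)", (b"fake-tile",))

tile = get_tile(conn, 1, 0, 0)     # XYZ y=0 at zoom 1 maps to TMS row 1
missing = get_tile(conn, 1, 0, 1)  # maps to TMS row 0, which is not stored
print(tile, missing)
```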

25th January 2021

Datasette 0.54 is out today. Highlights include the new _internal in-memory database exposing details of connected tables, plus support for JavaScript modules in plugins and add-on scripts. More commentary on this release is available in the annotated release notes.

24th January 2021

"Drawing shapes on a map to query a SpatiaLite database" introduces the new datasette-leaflet-freedraw plugin, which adds support for drawing shapes on a map to specify a GeoJSON MultiPolygon that can be used to query SpatiaLite databases.

7th January 2021

"APIs from CSS without JavaScript: the datasette-css-properties plugin" introduces datasette-css-properties, a highly experimental plugin that can output table rows and SQL query results as CSS stylesheets defining custom properties, which can then be used to customize a static HTML page.

19th December 2020

New on this site: a Datasette Tools directory and a search engine that covers documentation, tools, plugins, releases and more. The search engine uses Dogsheep Beta - I wrote about how that works in Building a search engine for datasette.io.


Latest releases

7th December 2021

git-history 0.6

  • Fixed critical bug where columns were incorrectly recorded as consistently toggling between null and their current value. #33
  • Documentation now includes links to live examples of databases created using this tool. #30
  • --wal option for turning on SQLite WAL mode - useful if you want to safely run queries against the database file while it is still being built. #31
  • Fixed bug where list and dict values were not correctly compared for equality. #32
  • The item_version_detail SQL view now includes a _changed_column JSON array of column names that changed in each version. #37
  • Nested packages can now be imported, for example --import xml.etree.ElementTree. #39
  • item_version._item is now an indexed column. #38

s3-credentials 0.8

  • s3-credentials create my-bucket --public option for creating public buckets, which allow anyone with knowledge of a filename to download that file. This works by attaching a public bucket policy to the bucket after it is created. #42
  • s3-credentials put-object now sets the Content-Type header on the uploaded object. The type is detected based on the filename, or can be specified using the new --content-type option. #43
  • s3-credentials policy my-bucket --public-bucket outputs the public bucket policy that would be attached to a bucket of that name. #44
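
Detecting a content type from a filename, as described for put-object above, can be sketched with Python's standard `mimetypes` module. This is an illustration of the general technique, not necessarily how s3-credentials implements it. The `guess_content_type` helper and its fallback value are assumptions.

```python
import mimetypes


def guess_content_type(filename, default="application/octet-stream"):
    """Guess a Content-Type from the filename extension.

    Falls back to a generic binary type (an assumed default) when the
    extension is not recognized.
    """
    content_type, _encoding = mimetypes.guess_type(filename)
    return content_type or default


html_type = guess_content_type("index.html")
unknown_type = guess_content_type("archive.unknownext123")
print(html_type, unknown_type)
```

An explicit override, like the --content-type option, remains useful for files whose extension is missing or misleading.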

3rd December 2021

git-history 0.5

  • The item_version table now only records values that have changed since the previous item version. A new item_changed many-to-many table records exactly which columns were changed in which item version, to compensate for ambiguous null values. #21
  • New --full-versions option for storing full copies of each version instead of storing just the columns that have changed.
  • Major backwards-incompatible schema changes - see README for details of the new schema.
  • New --dialect option for specifying a CSV dialect if you don't want to use auto-detection. #27
  • The history for multiple files can now be stored in the same database, using the new --namespace option. #13
  • --skip HASH, --start-at HASH and --start-after HASH options for skipping specific Git commits or starting processing at or after a specific hash. #26, #28
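
The first item above, recording only changed values plus a separate record of which columns changed, can be illustrated with a short sketch. This is not git-history's actual code: the `changed_columns` helper and the sample records are hypothetical.

```python
def changed_columns(previous, current):
    """Return {column: new_value} for columns that differ between versions."""
    keys = sorted(set(previous) | set(current))
    return {
        key: current.get(key)
        for key in keys
        if previous.get(key) != current.get(key)
    }


# Two hypothetical versions of the same item
v1 = {"id": 1, "name": "Ada", "email": "ada@example.com", "tags": ["a", "b"]}
v2 = {"id": 1, "name": "Ada", "email": None, "tags": ["a", "b"]}

sparse_row = changed_columns(v1, v2)  # only the values that changed
changed_names = list(sparse_row)      # which columns changed in this version
print(sparse_row, changed_names)
```

Here `email` changed to null. In a sparse row where unchanged columns are also left null, that null on its own could not be told apart from "unchanged", which is why a separate record of changed column names is needed.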

1st December 2021

github-to-sqlite 2.8.3

  • Minor documentation and inline help improvements.

30th November 2021

s3-credentials 0.7

  • New s3-credentials policy command, which prints to the terminal the JSON policy that would be used. #37
  • README now includes examples of the three different policies. #36
  • s3-credentials put-object and s3-credentials get-object commands for uploading and downloading files from an S3 bucket. #38

datasette 0.59.4

  • Fixed bug where columns with a leading underscore could not be removed from the interactive filters list. (#1527)
  • Fixed bug where columns with a leading underscore were not correctly linked to by the "Links from other tables" interface on the row page. (#1525)
  • Upgraded dependencies aiofiles, black and janus.

22nd November 2021

datasette-publish-vercel 0.12

  • New --generate-vercel-json option to generate the vercel.json that would be used and output it without running the deploy. #51
  • You can then edit that vercel.json file to add custom options, then pass it to the new --vercel-json option for a custom deployment. #51
  • --template-dir and --plugins-dir options now work, thanks Romain Clement. #41
  • DATASETTE_SECRET environment variable can now be used to set a persistent Datasette secret. Thanks, Romain Clement. #43

21st November 2021

git-history 0.4

  • Major changes to the database schema. Foreign keys now use integer primary key IDs rather than using lengthy item or commit hashes, which reduces the database size for large repositories by almost half. #12
  • Python generators can now be used in --convert functions. #16
  • Reserved columns are now marked by an underscore prefix, for example _id and _commit. #14

sqlite-utils 3.19

  • The table.lookup() method now accepts keyword arguments that match those on the underlying table.insert() method: foreign_keys=, column_order=, not_null=, defaults=, extracts=, conversions= and columns=. You can also now pass pk= to specify a different column name to use for the primary key. (#342)

20th November 2021

datasette 0.59.3

  • Fixed numerous bugs when running Datasette behind a proxy with a prefix URL path using the base_url setting. A live demo of this mode is now available at datasette-apache-proxy-demo.datasette.io/prefix/. (#1519, #838)
  • ?column__arraycontains= and ?column__arraynotcontains= table parameters now also work against SQL views. (#448)
  • ?_facet_array=column no longer returns incorrect counts if columns contain the same value more than once.
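
As a hedged illustration of the last two items, the sketch below uses SQLite's `json_each()` table-valued function, via Python's standard `sqlite3` module, to filter rows by array membership and to count facet values once per row. It assumes a SQLite build with JSON support, and the SQL is illustrative rather than the SQL Datasette generates. The table and data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, tags TEXT)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?)",
    [
        (1, '["python", "sqlite"]'),
        (2, '["python", "python"]'),  # same value more than once
        (3, '["maps"]'),
    ],
)

# Array-contains style filter: rows whose JSON array includes the value
matching = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM posts "
        "WHERE EXISTS (SELECT 1 FROM json_each(posts.tags) WHERE value = ?) "
        "ORDER BY id",
        ("python",),
    )
]

# Facet counts: DISTINCT ensures each row counts once per value,
# even when the array repeats that value
facet = conn.execute(
    """
    SELECT value, COUNT(*) FROM (
        SELECT DISTINCT posts.id, j.value AS value
        FROM posts, json_each(posts.tags) AS j
    )
    GROUP BY value
    ORDER BY value
    """
).fetchall()
print(matching, facet)
```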
