An open source multi-tool for exploring and publishing data

PyPI Changelog Python 3.x License

Annotated version of this introductory video

Datasette is a tool for exploring and publishing data. It helps people take data of any shape or size, analyze and explore it, and publish it as an interactive website and accompanying API.

Datasette is aimed at data journalists, museum curators, archivists, local governments and anyone else who has data that they wish to share with the world. It is part of a wider ecosystem of tools and plugins dedicated to making working with structured data as productive as possible.

Try a demo and explore 33,000 power plants around the world, then take a look at some other examples of Datasette in action.

Then read how to get started with Datasette, subscribe to the weekly-ish newsletter and consider signing up for office hours for an in-person conversation about the project.

Exploratory data analysis

Import data from CSVs, JSON, database connections and more. Datasette will automatically show you patterns in your data and help you share your findings with your colleagues.

Instant data publishing

datasette publish lets you instantly publish your data to hosting providers like Google Cloud Run, Heroku or Vercel.

Rapid prototyping

Spin up a JSON API for any data in minutes. Use it to prototype and prove your ideas without building a custom backend.

Latest news

5th June 2021 #

Datasette 0.57 is out with an important security patch plus a number of new features and bug fixes. Datasette 0.56.1, also out today, provides the security patch for users who are not yet ready to upgrade to the latest version.

10th May 2021 #

Django SQL Dashboard is a new tool that brings a useful authenticated subset of Datasette to Django projects that are built on top of PostgreSQL.

28th March 2021 #

Datasette 0.56 has bug fixes and documentation improvements, plus some new documented internal APIs for plugin authors and SpatiaLite 5 bundled with the official Datasette Docker container.

18th February 2021 #

Datasette 0.55 adds support for cross-database SQL queries. You can now run datasette --crossdb one.db two.db and then run queries that join data from tables in both of those database files - see cross-database queries in the documentation for more details.

sqlite-utils 3.6 adds similar features: a db.attach(alias, filepath) Python API method and --attach alias filepath.db command-line option, both for attaching additional databases in order to execute cross-database queries.

7th February 2021 #

This new Video introduction to Datasette and sqlite-utils provides a full introduction to both Datasette and sqlite-utils in 17 minutes, including a live demo of creating a database from a CSV file and publishing it to Google Cloud Run.

3rd February 2021 #

Serving map tiles from SQLite with MBTiles and datasette-tiles. datasette-tiles is a new plugin that adds a tile server to Datasette, serving map tiles from databases that conform to the MBTiles specification. download-tiles is a tool for building these databases, and datasette-basemap is a plugin that bundles a 22MB SQLite database with OpenStreetMap tiles covering zoom levels 0-6 for the entire world.

25th January 2021 #

Datasette 0.54 is out today. Highlights include the new _internal in-memory database exposing details of connected tables, plus support for JavaScript modules in plugins and add-on scripts. More commentary on this release is available in the annotated release notes.

24th January 2021 #

Drawing shapes on a map to query a SpatiaLite database introduces the new datasette-leaflet-freedraw plugin, which adds support for drawing shapes on a map to specify a GeoJSON MultiPolygon that can be used to query SpatiaLite databases.

7th January 2021 #

APIs from CSS without JavaScript: the datasette-css-properties plugin introduces datasette-css-properties, a highly experimental plugin that can output table rows and SQL query results as CSS stylesheets defining custom properties that can then be used to customize a static HTML page.

19th December 2020 #

New on this site: a Datasette Tools directory and a search engine that covers documentation, tools, plugins, releases and more. The search engine uses Dogsheep Beta - I wrote about how that works in Building a search engine for datasette.io.

13th December 2020 #

I wrote more about how this site works in Weeknotes: datasette.io, an official project website for Datasette.

10th December 2020 #

Datasette 0.53 - datasette serve --create option, ?column__arraynotcontains= table filter, ?_header=off CSV option and updated links to reflect the new project website.

New Datasette project website - three years after the first release Datasette finally has its own website! The site includes project news, a plugin directory, use-cases and examples. The site is implemented using Datasette custom templates, and the source code is available on GitHub.

28th November 2020 #

Datasette 0.52 - --config is now --setting, new database_actions plugin hook, datasette publish cloudrun --apt-get-install option and several bug fixes.

31st October 2020 #

Datasette 0.51 - A new visual design, plugin hooks for adding navigation options, better handling of binary data, URL building utility methods and better support for running Datasette behind a proxy. Annotated release notes.

All news

Latest releases

19th June 2021

sqlite-utils 3.10

This release introduces the sqlite-utils memory command, which can be used to load CSV or JSON data into a temporary in-memory database and run SQL queries (including joins across multiple files) directly against that data.

Also new: sqlite-utils insert --detect-types, sqlite-utils dump, table.use_rowid plus some smaller fixes.

sqlite-utils memory

This example of sqlite-utils memory retrieves information about the all of the repositories in the Dogsheep organization on GitHub using this JSON API, sorts them by their number of stars and outputs a table of the top five (using -t):

$ curl -s 'https://api.github.com/users/dogsheep/repos'\
  | sqlite-utils memory - '
      select full_name, forks_count, stargazers_count
      from stdin order by stargazers_count desc limit 5
    ' -t
full_name                            forks_count    stargazers_count
---------------------------------  -------------  ------------------
dogsheep/twitter-to-sqlite                    12                 225
dogsheep/github-to-sqlite                     14                 139
dogsheep/dogsheep-photos                       5                 116
dogsheep/dogsheep.github.io                    7                  90
dogsheep/healthkit-to-sqlite                   4                  85

The tool works against files on disk as well. This example joins data from two CSV files:

$ cat creatures.csv
species_id,name
1,Cleo
2,Bants
2,Dori
2,Azi
$ cat species.csv
id,species_name
1,Dog
2,Chicken
$ sqlite-utils memory species.csv creatures.csv '
  select * from creatures join species on creatures.species_id = species.id
'
[{"species_id": 1, "name": "Cleo", "id": 1, "species_name": "Dog"},
 {"species_id": 2, "name": "Bants", "id": 2, "species_name": "Chicken"},
 {"species_id": 2, "name": "Dori", "id": 2, "species_name": "Chicken"},
 {"species_id": 2, "name": "Azi", "id": 2, "species_name": "Chicken"}]

Here the species.csv file becomes the species table, the creatures.csv file becomes the creatures table and the output is JSON, the default output format.

You can also use the --attach option to attach existing SQLite database files to the in-memory database, in order to join data from CSV or JSON directly against your existing tables.

Full documentation of this new feature is available in Querying data directly using an in-memory database. (#272)

sqlite-utils insert --detect-types

The sqlite-utils insert command can be used to insert data from JSON, CSV or TSV files into a SQLite database file. The new --detect-types option (shortcut -d), when used in conjunction with a CSV or TSV import, will automatically detect if columns in the file are integers or floating point numbers as opposed to treating everything as a text column and create the new table with the corresponding schema. See Inserting CSV or TSV data for details. (#282)

Other changes

  • Bug fix: table.transform(), when run against a table without explicit primary keys, would incorrectly create a new version of the table with an explicit primary key column called rowid. (#284)
  • New table.use_rowid introspection property, see .use_rowid. (#285)
  • The new sqlite-utils dump file.db command outputs a SQL dump that can be used to recreate a database. (#274)
  • -h now works as a shortcut for --help, thanks Loren McIntyre. (#276)
  • Now using pytest-cov and Codecov to track test coverage - currently at 96%. (#275)
  • SQL errors that occur when using sqlite-utils query are now displayed as CLI errors.

13th June 2021

dogsheep-beta 0.10.2

  • Fixed bug where searches consisting only of whitespace caused a server error. #33

yaml-to-sqlite 1.0

  • New feature: yaml-to-sqlite dogs.db dogs.yaml --single-column=name can be used to import YAML files consisting of a list of items into a single table with a single primary key column. #5
  • Importing YAML files with new columns will now alter the table to add those columns. #4

markdown-to-sqlite 1.0

  • BREAKING CHANGE: The option order for this command is now markdown-to-sqlite database.db tablename file1.md file2.md - prior to version 1.0 the order was markdown-to-sqlite file1.md file2.md database.db tablename. This change was made for improved consistency with other x-to-sqlite tools.

sqlite-utils 3.9.1

  • Fixed bug when using table.upsert_all() to create a table with only a single column that is treated as the primary key. (#271)

datasette-geojson v0.3.0

  • Spatialite's geometry blob is now automatically parsed into GeoJSON, so the same query will output GeoJSON with and without Spatialite.

12th June 2021

sqlite-utils 3.9

11th June 2021

db-to-sqlite 1.4

  • New --postgres-schema option for specifying a different PostgreSQL schema, thanks Jeremy Dormitzer. #29
  • Correctly handles table names that need escaping, such as user in PostgreSQL. #25
  • Empty tables are now created in the SQLite database with a matching schema, rather than being skipped. #28
  • Connection strings starting with postgres:// are treated as if they started with postgresql://. #34

10th June 2021

datasette-mask-columns 0.2.1

  • No longer pins a specific version of Datasette.

datasette 0.58a0

All releases