An open source multi-tool for exploring and publishing data


Annotated version of this introductory video

Datasette is a tool for exploring and publishing data. It helps people take data of any shape, analyze and explore it, and publish it as an interactive website and accompanying API.

Datasette is aimed at data journalists, museum curators, archivists, local governments, scientists, researchers and anyone else who has data that they wish to share with the world. It is part of a wider ecosystem of 39 tools and 92 plugins dedicated to making working with structured data as productive as possible.

Try a demo and explore 33,000 power plants around the world, then follow the tutorial or take a look at some other examples of Datasette in action.

Then read how to get started with Datasette, subscribe to the monthly-ish newsletter and consider signing up for office hours for an in-person conversation about the project.

New: Datasette Desktop - a macOS desktop application for easily running Datasette on your own computer!

Exploratory data analysis

Import data from CSVs, JSON, database connections and more. Datasette will automatically show you patterns in your data and help you share your findings with your colleagues.

Instant data publishing

datasette publish lets you instantly publish your data to hosting providers like Google Cloud Run, Heroku or Vercel.

Rapid prototyping

Spin up a JSON API for any data in minutes. Use it to prototype and prove your ideas without building a custom backend.
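
As a minimal sketch of what prototyping against that API might look like, the Python below builds the URL for a table's JSON endpoint and fetches its rows. The host, database and table names are placeholders rather than a real instance, and the helper names are made up for this illustration. It also assumes the database and table names contain no special characters that would need escaping.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def table_json_url(base_url, database, table, **params):
    """Build the URL for a table's JSON endpoint on a Datasette instance."""
    # _shape=array asks for a plain list of row objects
    query = urlencode({"_shape": "array", **params})
    return f"{base_url.rstrip('/')}/{database}/{table}.json?{query}"


def fetch_rows(url):
    """Fetch and decode the JSON document at the given URL."""
    with urlopen(url) as response:
        return json.load(response)


if __name__ == "__main__":
    # Placeholder values: point these at an instance you actually run
    print(table_json_url("http://localhost:8001", "mydata", "mytable", _size=5))
```

Calling fetch_rows() on the resulting URL should return a list of dictionaries, one per row, which is often enough to drive a prototype front end.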

Latest news

30th June 2022

s3-ocr is a new tool which can run OCR (via Amazon Textract) against every PDF file in an S3 bucket and write the results to a searchable SQLite database, ready to use with Datasette. Read more about it in s3-ocr: Extract text from PDF files stored in an S3 bucket.

5th May 2022

Datasette Lite is a new way to run Datasette: entirely in your browser, thanks to the Pyodide project which provides a full Python environment compiled to WebAssembly. You can use it to explore any SQLite database file hosted on a CORS-enabled static hosting provider, which includes GitHub and GitHub Pages. Read more about this project in Datasette Lite: a server-side Python web application running in a browser.

12th April 2022

Datasette for geospatial analysis describes how Datasette can be used in conjunction with SpatiaLite to work with geospatial data, including details of several geospatial plugins and tools from the Datasette ecosystem.

23rd March 2022

Datasette 0.61 introduces two potentially backwards-incompatible changes in preparation for the forthcoming 1.0 release: hashed URL mode has been moved to a new plugin, and the way URLs are generated to tables or databases containing special characters such as . or / has changed. Datasette 0.61.1 fixes a small bug in that release. See also the annotated release notes for these two versions.

27th February 2022

The first two of an ongoing series of official Datasette tutorials are now available: Exploring a database with Datasette introduces the Datasette web interface and shows how it can be used to explore a new database, and Learn SQL with Datasette provides an introduction to SQL using Datasette as a learning environment.

13th January 2022

Datasette 0.60 adds a new filters_from_request plugin hook, new internal methods for writing to the database, better performance and various faceting improvements. See also the annotated release notes.

5th December 2021

Observable notebooks recently added a SQL cell type, allowing SQL queries to be executed as part of an interactive notebook workflow. Alex Garcia built a Datasette Client for these which allows you to execute queries against any Datasette instance and explore and visualize the results using JavaScript code running in a notebook.

14th October 2021

Datasette 0.59 adds column descriptions in metadata, a new register_command plugin hook, enhanced --cors support and a bunch of other fixes and documentation improvements. See also the annotated release notes.

8th September 2021

Datasette Desktop is a new macOS desktop application version of Datasette, which supports opening SQLite files on your computer, importing CSV files and installing plugins. I wrote more about how it works in Datasette Desktop—a macOS desktop application for Datasette.

28th July 2021

The Baked Data architectural pattern describes a pattern commonly used with Datasette where the content for a site is bundled inside a SQLite database file and included alongside templates and application code in a deployment to a serverless hosting provider.

15th July 2021

Datasette 0.58 has new plugin hooks, a huge performance improvement for faceting, support for Unix domain sockets and several other improvements. Read the annotated release notes for extra background and context on the release.

5th June 2021

Datasette 0.57 is out with an important security patch plus a number of new features and bug fixes. Datasette 0.56.1, also out today, provides the security patch for users who are not yet ready to upgrade to the latest version.

10th May 2021

Django SQL Dashboard is a new tool that brings a useful authenticated subset of Datasette to Django projects that are built on top of PostgreSQL.

28th March 2021

Datasette 0.56 has bug fixes and documentation improvements, plus some new documented internal APIs for plugin authors and SpatiaLite 5 bundled with the official Datasette Docker container.

18th February 2021

Datasette 0.55 adds support for cross-database SQL queries. You can now run datasette --crossdb one.db two.db and then run queries that join data from tables in both of those database files - see cross-database queries in the documentation for more details.

sqlite-utils 3.6 adds similar features: a db.attach(alias, filepath) Python API method and --attach alias filepath.db command-line option, both for attaching additional databases in order to execute cross-database queries.
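
To illustrate what a cross-database query is, the sketch below uses only Python's built-in sqlite3 module and SQLite's ATTACH DATABASE statement to join tables stored in two separate files. It does not show how Datasette or sqlite-utils implement the feature. The file names echo the one.db and two.db example above, while the tables, rows and helper names are hypothetical.

```python
import os
import sqlite3
import tempfile


def run_script(path, script):
    """Create or open the database at path and run a SQL script against it."""
    conn = sqlite3.connect(path)
    try:
        conn.executescript(script)
        conn.commit()
    finally:
        conn.close()


def build_example_files(directory):
    """Create two small, made-up databases for the demonstration."""
    one_path = os.path.join(directory, "one.db")
    two_path = os.path.join(directory, "two.db")
    run_script(one_path, """
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        INSERT INTO authors VALUES (1, 'Ada'), (2, 'Grace');
    """)
    run_script(two_path, """
        CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
        INSERT INTO books VALUES (1, 1, 'Notes'), (2, 2, 'Compilers');
    """)
    return one_path, two_path


def cross_database_join(one_path, two_path):
    """Join tables that live in two separate SQLite files."""
    conn = sqlite3.connect(one_path)
    try:
        # ATTACH makes the second file queryable under the alias "two"
        conn.execute("ATTACH DATABASE ? AS two", (two_path,))
        return conn.execute(
            """
            SELECT authors.name, books.title
            FROM authors
            JOIN two.books AS books ON books.author_id = authors.id
            ORDER BY books.title
            """
        ).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        print(cross_database_join(*build_example_files(directory)))
```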

All news

Latest releases

3rd July 2022

datasette-expose-env 0.1 - Datasette plugin to expose selected environment variables at /-/env for debugging

  • Initial release: configure this plugin to expose specific environment variables at /-/env. #1

datasette-upload-csvs 0.7.2 - Datasette plugin for uploading CSV files and converting them to database tables

  • Fixed bug where encoding of file was not correctly detected if non-ASCII characters occurred after the first 2KB. The tool now inspects the first 2MB of content (as originally intended) and also upgrades ASCII to latin-1 since ASCII is a complete subset of latin-1 and using latin-1 increases the chance of a successful import. #25
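
The effect of that fix can be sketched in a few lines of Python. This is not the plugin's code: detect_encoding is a hypothetical helper that uses only the standard library, and its fall back from UTF-8 to latin-1 is an assumption made to keep the example short. It shows why a small sample can misjudge a file as ASCII, and why treating ASCII as latin-1 still lets the import succeed.

```python
import codecs


def detect_encoding(data, sample_size=2 * 1024 * 1024):
    """Guess an encoding by inspecting the first sample_size bytes of data."""
    sample = data[:sample_size]
    try:
        sample.decode("ascii")
    except UnicodeDecodeError:
        pass
    else:
        # The sample is pure ASCII; report latin-1 instead, since every
        # ASCII byte decodes identically and later non-ASCII bytes still work
        return "latin-1"
    try:
        # final=False tolerates a multi-byte character cut off by the sample
        codecs.getincrementaldecoder("utf-8")().decode(sample, final=False)
        return "utf-8"
    except UnicodeDecodeError:
        return "latin-1"


if __name__ == "__main__":
    # A non-ASCII byte appears only after the first 2KB
    data = b"name\n" + b"x" * 3000 + b"\xe9\n"
    encoding = detect_encoding(data, sample_size=2048)
    print(encoding, len(data.decode(encoding)))
```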

datasette-packages 0.2 - Show a list of currently installed Python packages

datasette-graphql 2.1 - Datasette plugin providing an automatic GraphQL API for your SQLite databases

1st July 2022

datasette-edit-schema 0.5 - Datasette plugin for modifying table schemas

  • More human-friendly labels for column types. #29
  • edit-schema permission check now considers the database name. #32
  • Now depends on datasette>=0.59 #30 and sqlite-utils>=3.10. #33

30th June 2022

s3-ocr 0.4 - Tools for running OCR against files stored in S3

  • New command: s3-ocr inspect-job <job_id> returns information about the status of a specific job. #15
  • Added a live demo at s3-ocr-demo.datasette.io. #16

s3-credentials 0.12 - A tool for creating credentials for accessing S3 buckets

  • New --statement JSON option for both the s3-credentials create and s3-credentials policy commands, allowing one or more additional policy statements (provided as JSON strings) to be added to the generated IAM policy. #72

s3-ocr 0.3 - Tools for running OCR against files stored in S3

First non-alpha release.

  • Breaking change: the order of arguments for s3-ocr index <bucket> <database_file> has been swapped, for consistency with other commands. #9
  • Breaking change: the start command no longer defaults to processing every .pdf file in the bucket. It now accepts a list of keys; use the --all option to process every PDF file. #10
  • New s3-ocr fetch <bucket> <path> command for fetching the raw OCR JSON data for that file. #7
  • New s3-ocr text <bucket> <path> command for outputting just the extracted OCR text for a specified file. #8

29th June 2022

s3-ocr 0.2a0

  • New s3-ocr index database.db name-of-bucket command for creating a SQLite database containing the OCR results that have been written to the bucket. #2

s3-ocr 0.1a0

  • s3-ocr start <bucket> command for triggering OCR runs using Textract for every PDF file in a bucket. #1
  • s3-ocr status <bucket> command for checking on the status of the ongoing OCR tasks.

23rd June 2022

datasette-scale-to-zero 0.1.2 - Quit Datasette if it has not received traffic for a specified time period

  • No longer logs a traceback on server exit. #2

22nd June 2022

datasette-scale-to-zero 0.1.1

  • Reduced log output when server exits. #2

21st June 2022

datasette-scale-to-zero 0.1

  • Initial release. Can be configured to cause Datasette to exit if it has not received traffic in a specified time period. #1

17th June 2022

datasette-socrata 0.3 - Import data from Socrata into Datasette

  • Any errors that occur during an import are now stored in the errors column of the socrata_imports table, and displayed by the JavaScript progress indicator. #12
  • CSV imports with long field values no longer trigger an error. #13
  • Progress bars now show the number of records imported and remaining, and hint to refresh the page when the import is complete. #10
  • Checks for low disk space using the plugin hook provided by datasette-low-disk-space-hook. #4

15th June 2022

sqlite-utils 3.27 - CLI tool and Python utility functions for manipulating SQLite databases

  • Documentation now uses the Furo Sphinx theme. (#435)
  • Code examples in documentation now have a "copy to clipboard" button. (#436)
  • sqlite_utils.utils.rows_from_file() is now a documented API, see Reading rows from a file. (#443)
  • rows_from_file() has two new parameters to help handle CSV files with rows that contain more values than are listed in that CSV file's headings: ignore_extras=True and extras_key="name-of-key". (#440)
  • sqlite_utils.utils.maximize_csv_field_size_limit() helper function for increasing the field size limit for reading CSV files to its maximum, see Setting the maximum CSV field size limit. (#442)
  • table.search(where=, where_args=) parameters for adding additional WHERE clauses to a search query. The where= parameter is available on table.search_sql(...) as well. See Searching with table.search(). (#441)
  • Fixed bug where table.detect_fts() and other search-related functions could fail if two FTS-enabled tables had names that were prefixes of each other. (#434)
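
The ignore_extras and extras_key options mentioned in the rows_from_file() entry above deal with CSV rows that have more values than there are headings. The sketch below does not call sqlite-utils. It uses the restkey argument of Python's standard csv.DictReader to illustrate the two behaviours; read_rows is a hypothetical helper, and raising an error when neither option is set is a choice made for this example.

```python
import csv
import io

# Private sentinel key, so surplus values never clash with a real column name
_EXTRAS = object()


def read_rows(text, ignore_extras=False, extras_key=None):
    """Yield one dict per CSV row, handling rows longer than the header."""
    restkey = extras_key if extras_key is not None else _EXTRAS
    # DictReader stores any surplus values as a list under restkey
    for row in csv.DictReader(io.StringIO(text), restkey=restkey):
        if extras_key is None:
            extras = row.pop(_EXTRAS, None)
            if extras and not ignore_extras:
                raise ValueError(f"Row has extra values: {extras}")
        yield row


if __name__ == "__main__":
    text = "id,name\n1,Cleo\n2,Pancakes,extra,more\n"
    print(list(read_rows(text, ignore_extras=True)))
    print(list(read_rows(text, extras_key="rest")))
```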

All releases