Datasette is a tool for exploring and publishing data. It helps people take data of any shape, analyze and explore it, and publish it as an interactive website and accompanying API.
Datasette is aimed at data journalists, museum curators, archivists, local governments, scientists, researchers and anyone else who has data that they wish to share with the world. It is part of a wider ecosystem of 40 tools and 93 plugins dedicated to making working with structured data as productive as possible.
Try a demo and explore 33,000 power plants around the world, then follow the tutorial or take a look at some other examples of Datasette in action.
Then read how to get started with Datasette, subscribe to the monthly-ish newsletter and consider signing up for office hours for an in-person conversation about the project.
New: Datasette Desktop - a macOS desktop application for easily running Datasette on your own computer!
Exploratory data analysis
Import data from CSVs, JSON, database connections and more. Datasette will automatically show you patterns in your data and help you share your findings with your colleagues.
Instant data publishing
datasette publish lets you instantly publish your data to hosting providers like Google Cloud Run, Heroku or Vercel.
Rapid prototyping
Spin up a JSON API for any data in minutes. Use it to prototype and prove your ideas without building a custom backend.
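As a rough illustration of prototyping against such an API, the sketch below builds the URL of a table's JSON endpoint and fetches it with the Python standard library. The base URL, database name and table name are placeholders rather than a real deployment, and the helper names are made up for this example. The _shape=array parameter asks Datasette for a plain list of row objects.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def table_json_url(base_url, database, table, **params):
    """Build the URL of a table's JSON endpoint on a Datasette instance.

    Assumes the database and table names contain only letters, digits,
    "-" and "_"; other characters need Datasette's own URL encoding.
    """
    query = {"_shape": "array", **params}
    return f"{base_url.rstrip('/')}/{database}/{table}.json?{urlencode(query)}"


def fetch_rows(url):
    """Fetch and decode the JSON rows at the URL (needs a running server)."""
    with urlopen(url) as response:
        return json.load(response)


# Placeholder values: a local Datasette serving a hypothetical "plants" database.
url = table_json_url("http://localhost:8001", "plants", "power_plants", _size=5)
print(url)
```

Calling fetch_rows(url) against a running instance would return the rows as a list of dictionaries, ready to feed into a prototype front end.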
Latest news
14th August 2022
Datasette 0.62 introduces compatibility with Pyodide for Datasette Lite, and incorporates a number of bug fixes, plugin hook upgrades and other improvements.
31st July 2022
New tutorial and accompanying ten minute video: Cleaning data with sqlite-utils and Datasette.
30th June 2022
s3-ocr is a new tool which can run OCR (via Amazon Textract) against every PDF file in an S3 bucket and write the results to a searchable SQLite database, ready to use with Datasette. Read more about it in s3-ocr: Extract text from PDF files stored in an S3 bucket.
5th May 2022
Datasette Lite is a new way to run Datasette: entirely in your browser, thanks to the Pyodide project which provides a full Python environment compiled to WebAssembly. You can use it to explore any SQLite database file hosted on a CORS-enabled static hosting provider, which includes GitHub and GitHub Pages. Read more about this project in Datasette Lite: a server-side Python web application running in a browser.
12th April 2022
Datasette for geospatial analysis describes how Datasette can be used in conjunction with SpatiaLite to work with geospatial data, including details of several geospatial plugins and tools from the Datasette ecosystem.
23rd March 2022
Datasette 0.61 introduces two potentially backwards-incompatible changes in preparation for the forthcoming 1.0 release: hashed URL mode has been moved to a new plugin, and the way URLs are generated to tables or databases containing special characters such as . or / has changed. Datasette 0.61.1 fixes a small bug in that release. See also the annotated release notes for these two versions.
27th February 2022
The first two of an ongoing series of official Datasette tutorials are now available: Exploring a database with Datasette introduces the Datasette web interface and shows how it can be used to explore a new database, and Learn SQL with Datasette provides an introduction to SQL using Datasette as a learning environment.
13th January 2022
Datasette 0.60 adds a new filters_from_request plugin hook, new internal methods for writing to the database, better performance and various faceting improvements. See also the annotated release notes.
5th December 2021
Observable notebooks recently added a SQL cell type, allowing SQL queries to be executed as part of an interactive notebook workflow. Alex Garcia built a Datasette Client for these which allows you to execute queries against any Datasette instance and explore and visualize the results using JavaScript code running in a notebook.
14th October 2021
Datasette 0.59 adds column descriptions in metadata, a new register_command plugin hook, enhanced --cors support and a bunch of other fixes and documentation improvements. See also the annotated release notes.
8th September 2021
Datasette Desktop is a new macOS desktop application version of Datasette, which supports opening SQLite files on your computer, importing CSV files and installing plugins. I wrote more about how it works in Datasette Desktop—a macOS desktop application for Datasette.
28th July 2021
The Baked Data architectural pattern describes a pattern commonly used with Datasette where the content for a site is bundled inside a SQLite database file and included alongside templates and application code in a deployment to a serverless hosting provider.
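A minimal sketch of the idea, using only Python's sqlite3 module: a build step writes the site content into a SQLite file, and the deployed application opens that file read-only. The table layout, file name and function names here are invented for illustration and are not taken from the linked article.

```python
import sqlite3
import tempfile
from pathlib import Path


def bake(db_path, pages):
    """Build step: write site content into a fresh SQLite file."""
    conn = sqlite3.connect(db_path)
    with conn:  # commits on success
        conn.execute(
            "CREATE TABLE pages (slug TEXT PRIMARY KEY, title TEXT, body TEXT)"
        )
        conn.executemany("INSERT INTO pages VALUES (?, ?, ?)", pages)
    conn.close()


def load_page(db_path, slug):
    """Runtime: open the baked file read-only and look up one page."""
    uri = Path(db_path).as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        return conn.execute(
            "SELECT title, body FROM pages WHERE slug = ?", (slug,)
        ).fetchone()
    finally:
        conn.close()


# The database file would be shipped alongside templates and application code.
build_dir = Path(tempfile.mkdtemp())
db_file = build_dir / "content.db"
bake(db_file, [("about", "About", "Hello from a baked database.")])
print(load_page(db_file, "about"))
```

Because the file is only ever read at runtime, each deployment can carry its own copy, and updating the content means rebuilding the file and redeploying.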
15th July 2021
Datasette 0.58 has new plugin hooks, a huge performance improvement for faceting, support for Unix domain sockets and several other improvements. Read the annotated release notes for extra background and context on the release.
5th June 2021
Datasette 0.57 is out with an important security patch plus a number of new features and bug fixes. Datasette 0.56.1, also out today, provides the security patch for users who are not yet ready to upgrade to the latest version.
10th May 2021
Django SQL Dashboard is a new tool that brings a useful authenticated subset of Datasette to Django projects that are built on top of PostgreSQL.
Latest releases
18th August 2022
sqlite-diffable 0.5 - Tools for dumping/loading a SQLite database to diffable directory structure
- sqlite-diffable objects path-to/table.ndjson command for converting a newline-delimited file of JSON arrays into a sequence of JSON objects. #7
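The kind of conversion this describes can be pictured with a short sketch. This is not the tool's own code: it assumes the column names are available separately as a list, and the sample rows are invented.

```python
import json


def arrays_to_objects(ndjson_text, columns):
    """Turn newline-delimited JSON arrays into a list of dicts keyed by column."""
    objects = []
    for line in ndjson_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        values = json.loads(line)
        objects.append(dict(zip(columns, values)))
    return objects


# Hypothetical sample data, for illustration only.
sample = '[1, "Cleo", 4]\n[2, "Pancakes", 3]\n'
rows = arrays_to_objects(sample, ["id", "name", "age"])
print(json.dumps(rows))
```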
14th August 2022
datasette-sentry 0.2 - Datasette plugin for configuring Sentry
- Now uses the new handle_exception() plugin hook introduced in Datasette 0.62. #3
datasette-sentry 0.2a1
- Preview of 0.2 for final testing. #3
datasette 0.62 - An open source multi-tool for exploring and publishing data
Datasette can now run entirely in your browser using WebAssembly. Try out Datasette Lite, take a look at the code or read more about it in Datasette Lite: a server-side Python web application running in a browser.
Datasette now has a Discord community for questions and discussions about Datasette and its ecosystem of projects.
Features
- Datasette is now compatible with Pyodide. This is the enabling technology behind Datasette Lite. (#1733)
- Database file downloads now implement conditional GET using ETags. (#1739)
- HTML for facet results and suggested results has been extracted out into new templates _facet_results.html and _suggested_facets.html. Thanks, M. Nasimul Haque. (#1759)
- Datasette now runs some SQL queries in parallel. This has limited impact on performance, see this research issue for details.
- New --nolock option for ignoring file locks when opening read-only databases. (#1744)
- Spaces in the database names in URLs are now encoded as + rather than ~20. (#1701)
- <Binary: 2427344 bytes> is now displayed as <Binary: 2,427,344 bytes> and is accompanied by tooltip showing "2.3MB". (#1712)
- The base Docker image used by datasette publish cloudrun, datasette package and the official Datasette image has been upgraded to 3.10.6-slim-bullseye. (#1768)
- Canned writable queries against immutable databases now show a warning message. (#1728)
- datasette publish cloudrun has a new --timeout option which can be used to increase the time limit applied by the Google Cloud build environment. Thanks, Tim Sherratt. (#1717)
- datasette publish cloudrun has new --min-instances and --max-instances options. (#1779)
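The change to how spaces are encoded can be illustrated with a small sketch of a tilde-style encoding that matches the behaviour described above: a space becomes +, and other unsafe bytes become ~ followed by two hex digits. This is an illustrative reimplementation, not Datasette's own code, and the exact set of characters treated as safe is an assumption.

```python
import string

# Characters that pass through unchanged in this sketch (an assumption).
SAFE = set(string.ascii_letters + string.digits + "_-")


def tilde_encode_sketch(text):
    """Encode text for a URL path: space -> '+', other unsafe bytes -> ~XX."""
    out = []
    for byte in text.encode("utf-8"):
        char = chr(byte)
        if char in SAFE:
            out.append(char)
        elif char == " ":
            out.append("+")
        else:
            out.append(f"~{byte:02X}")
    return "".join(out)


print(tilde_encode_sketch("my database"))    # my+database
print(tilde_encode_sketch("data/2022.csv"))  # data~2F2022~2Ecsv
```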
Plugin hooks
- New plugin hook: handle_exception(), for custom handling of exceptions caught by Datasette. (#1770)
- The render_cell() plugin hook is now also passed a row argument, representing the sqlite3.Row object that is being rendered. (#1300)
- The configuration directory is now stored in datasette.config_dir, making it available to plugins. Thanks, Chris Amico. (#1766)
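As a hedged sketch of what the new row argument makes possible, the plugin below renders one column using a value from another column of the same row. The column names amount and currency are hypothetical, and the import fallback exists only so the example can run without Datasette installed.

```python
import sqlite3

try:
    from datasette import hookimpl
except ImportError:
    # Fallback so the sketch can run without Datasette installed.
    def hookimpl(func):
        return func


@hookimpl
def render_cell(row, value, column):
    """Append a unit taken from another column of the same row."""
    # "amount" and "currency" are hypothetical column names.
    if column == "amount" and "currency" in row.keys():
        return f"{value} {row['currency']}"
    return None  # fall back to the default rendering


# Exercise the hook directly with a sqlite3.Row from an in-memory database.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
example_row = conn.execute("SELECT 12.5 AS amount, 'EUR' AS currency").fetchone()
print(render_cell(row=example_row, value=example_row["amount"], column="amount"))
```

Returning None leaves a cell to Datasette's default rendering, so the plugin only affects the one column it targets.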
Bug fixes
- Don't show the facet option in the cog menu if faceting is not allowed. (#1683)
- ?_sort and ?_sort_desc now work if the column that is being sorted has been excluded from the query using ?_col= or ?_nocol=. (#1773)
- Fixed bug where ?_sort_desc was duplicated in the URL every time the Apply button was clicked. (#1738)
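For illustration, the combination covered by the first fix (sorting by a column that is not among the displayed columns) corresponds to a query string like the one built below. The helper function and the column names are made up for this example.

```python
from urllib.parse import urlencode


def table_query(columns=None, sort=None, sort_desc=None):
    """Build a query string combining column selection and sorting."""
    params = [("_col", name) for name in (columns or [])]
    if sort:
        params.append(("_sort", sort))
    if sort_desc:
        params.append(("_sort_desc", sort_desc))
    return urlencode(params)


# Show only "name" but order by the hidden, hypothetical "capacity" column.
print(table_query(columns=["name"], sort_desc="capacity"))
```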
Documentation
- Examples in the documentation now include a copy-to-clipboard button. (#1748)
- Documentation now uses the Furo Sphinx theme. (#1746)
- Code examples in the documentation are now all formatted using Black. (#1718)
- Request.fake() method is now documented, see Request object.
- New documentation for plugin authors: Registering a plugin for the duration of a test. (#903)
12th August 2022
s3-credentials 0.13 - A tool for creating credentials for accessing S3 buckets
- Documentation now lives on a dedicated documentation website: https://s3-credentials.readthedocs.io/ #71
- s3-credentials create ... --website --create-bucket now creates an S3 bucket that is configured to act as a website, with index.html as the index page and error.html as the page used for any errors. #21
- s3-credentials list-buckets --details now returns the bucket region and the URL to the website, if it is configured to act as a website. #77
- Fixed a bug where list-bucket would return an error if the bucket (or specified --prefix) was empty. #76
10th August 2022
s3-ocr 0.6.3 - Tools for running OCR against files stored in S3
- Pages with no OCR text on them are now recorded as rows with empty strings, instead of being skipped entirely. #23
9th August 2022
s3-ocr 0.6.2
- Fixed bug where commands were sometimes not properly registered. #26
s3-ocr 0.6.1
- Now pins to click>=8.0, which should avoid a bug where installing this on a machine with an older version of Click present would lead to the commands failing to register. #25
- s3-ocr --help now includes links to the documentation and changelog.
8th August 2022
datasette-nteract-data-explorer 0.4.1 - automatic visual data explorer for datasette
What's Changed
- feat: add demo site landing page by @hydrosquall in https://github.com/hydrosquall/datasette-nteract-data-explorer/pull/14
- feature: synchronize graph state to URL parameters, enabling permalinking + sharing graphs by @hydrosquall in https://github.com/hydrosquall/datasette-nteract-data-explorer/pull/19
- Added PR templates + Issue contributing guide
Full Changelog: https://github.com/hydrosquall/datasette-nteract-data-explorer/compare/0.3.1...0.4.1
datasette-nteract-data-explorer 0.4.0
ignore - see 0.4.1.
7th August 2022
s3-ocr 0.6 - Tools for running OCR against files stored in S3
- s3-ocr start now automatically pauses and then retries if Textract complains that there are too many jobs running. This can be turned into an early exit with an error message using the new --no-retry option. #21
- New s3-ocr start --dry-run option for displaying what would happen without starting the OCR process. #22
- Textract now runs in the same region as the S3 bucket it is writing to, avoiding an error. #24
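The pause-and-retry behaviour can be sketched generically as follows. This is not s3-ocr's implementation: the TooManyJobs exception, the function names and the 30 second pause are all assumptions made for the example.

```python
import time


class TooManyJobs(Exception):
    """Hypothetical error standing in for a 'too many jobs' response."""


def start_with_retry(start_job, no_retry=False, pause_seconds=30, sleep=time.sleep):
    """Call start_job(), pausing and retrying while the job limit is hit."""
    while True:
        try:
            return start_job()
        except TooManyJobs:
            if no_retry:
                raise  # early exit, mirroring the idea behind --no-retry
            sleep(pause_seconds)


# Demonstration with a fake job that fails twice before succeeding.
attempts = []


def fake_start():
    attempts.append(1)
    if len(attempts) < 3:
        raise TooManyJobs("limit reached")
    return "started"


pauses = []  # record pauses instead of actually sleeping
print(start_with_retry(fake_start, sleep=pauses.append))
```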
5th August 2022
datasette-scale-to-zero 0.2 - Quit Datasette if it has not received traffic for a specified time period
- New "max-age": "10h" configuration setting, which causes the server to exit after the specified amount of time whether or not it has received any traffic. #3
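A sketch of where such a setting might live, assuming it goes in the usual per-plugin block of Datasette's metadata (under "plugins", keyed by the plugin name). The snippet generates the JSON rather than quoting the plugin's documentation.

```python
import json

# Plugin settings conventionally sit under "plugins" -> plugin name.
# Placing "max-age" here is an assumption based on that convention.
metadata = {
    "plugins": {
        "datasette-scale-to-zero": {
            "max-age": "10h",
        }
    }
}

metadata_json = json.dumps(metadata, indent=4)
print(metadata_json)
```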
2nd August 2022
shot-scraper 0.14.3 - A command-line utility for taking automated screenshots of websites
- Improved example workflow in Optimizing PNGs using Oxipng.
- Fixed typos in README and documentation. #83
1st August 2022
s3-credentials 0.12.1 - A tool for creating credentials for accessing S3 buckets
- Using the --policy or --statement options now implies --user-permissions-boundary=none. Previously it was easy to use these options to accidentally create credentials that did not work as expected since they would have a default permissions boundary that locked them down to only being able to access S3. #74
- The s3-credentials.AmazonS3FullAccess role created by this tool in order to issue temporary credentials previously used the default MaxSessionDuration value of 3600, preventing it from creating credentials that could last more than an hour. This has been increased to 12 hours. See this issue comment for instructions on fixing your existing role if this bug is affecting your account. #75
31st July 2022
datasette-sqlite-fts4 0.3.2 - Datasette plugin exposing SQL functions from sqlite-fts4
- Now depends on sqlite-fts4 1.0.3