Datasette is a tool for exploring and publishing data. It helps people take data of any shape, analyze and explore it, and publish it as an interactive website and accompanying API.
Datasette is aimed at data journalists, museum curators, archivists, local governments, scientists, researchers and anyone else who has data that they wish to share with the world. It is part of a wider ecosystem of 43 tools and 115 plugins dedicated to making working with structured data as productive as possible.
Try a demo and explore 33,000 power plants around the world, then follow the tutorial or take a look at some other examples of Datasette in action.
Then read how to get started with Datasette, subscribe to the monthly-ish newsletter and consider signing up for office hours for an in-person conversation about the project.
New: Datasette Desktop - a macOS desktop application for easily running Datasette on your own computer!
Exploratory data analysis
Import data from CSVs, JSON, database connections and more. Datasette will automatically show you patterns in your data and help you share your findings with your colleagues.
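As a rough illustration of the CSV import step, the sketch below loads CSV text into a SQLite table using only Python's standard library. It is not how Datasette or its import tools are implemented; the `load_csv` helper, the `plants` table and the sample rows are all hypothetical, and every column is stored as TEXT for simplicity.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, csv_text):
    """Create `table` from CSV text, with one TEXT column per header field.

    Hypothetical helper for illustration only, not part of Datasette.
    """
    reader = csv.reader(io.StringIO(csv_text))
    headers = next(reader)  # first row holds the column names
    columns = ", ".join(f'"{name}" TEXT' for name in headers)
    placeholders = ", ".join("?" for _ in headers)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    # The remaining rows of the reader are inserted as-is.
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    conn.commit()
    return headers


# Made-up sample data; a real workflow would read from a file instead.
SAMPLE = "name,country,capacity_mw\nPlant A,FR,1200\nPlant B,US,850\n"

conn = sqlite3.connect(":memory:")
headers = load_csv(conn, "plants", SAMPLE)
rows = conn.execute('SELECT name, country FROM "plants" ORDER BY name').fetchall()
```

Writing to a file path instead of `:memory:` would produce a SQLite database file of the kind Datasette is designed to explore.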
Instant data publishing
datasette publish lets you instantly publish your data to hosting providers like Google Cloud Run, Heroku or Vercel.
Rapid prototyping
Spin up a JSON API for any data in minutes. Use it to prototype and prove your ideas without building a custom backend.
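For a sense of what calling such an API looks like, here is a minimal sketch that builds the URL for a table's JSON endpoint. The host `example.com`, the `power` database and the `plants` table are assumptions made up for this example, as is the `table_json_url` helper; the `/database/table.json` path layout and the `_shape` and `_size` query parameters follow Datasette's documented JSON API.

```python
from urllib.parse import quote, urlencode


def table_json_url(base_url, database, table, **params):
    """Build the JSON URL for one table on a Datasette instance.

    Hypothetical helper: it only assembles a string and makes no request.
    """
    path = f"{base_url.rstrip('/')}/{quote(database)}/{quote(table)}.json"
    # Sort the parameters so the resulting URL is deterministic.
    query = urlencode(sorted(params.items()))
    return f"{path}?{query}" if query else path


# example.com, "power" and "plants" are placeholders, not a real deployment.
url = table_json_url(
    "https://example.com", "power", "plants", _shape="array", _size=10
)
```

Fetching that URL with any HTTP client would then return rows as JSON, which is usually enough to back a prototype front end.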
Latest news
24th March 2023
I built a ChatGPT plugin to answer questions about data hosted in Datasette describes a new experimental Datasette plugin to enable people to query data hosted in a Datasette interface via ChatGPT, asking human language questions that are automatically converted to SQL and used to generate a readable response.
23rd February 2023
Using Datasette in GitHub Codespaces is a new tutorial showing how Datasette can be run in GitHub's free Codespaces browser-based development environments, using the new datasette-codespaces plugin.
28th January 2023
Examples of sites built using Datasette now includes screenshots of Datasette deployments that illustrate a variety of problems that can be addressed using Datasette and its plugins.
13th January 2023
Semantic search answers: Q&A against documentation with GPT3 + OpenAI embeddings shows how Datasette can be used to implement semantic search and build a system for answering questions against an existing corpus of text, using two new plugins: datasette-openai and datasette-faiss, and a new tool: openai-to-sqlite.
9th January 2023
Datasette 0.64 is out, and includes a strong warning against running SpatiaLite in production without disabling arbitrary SQL queries, plus a new --setting default_allow_sql off setting to make it easier to do that. See Datasette 0.64, with a warning about SpatiaLite for more about this release. A new tutorial, Building a location to time zone API with SpatiaLite, describes how to safely use SpatiaLite and Datasette to build and deploy an API for looking up time zones for a latitude/longitude location.
15th December 2022
Datasette 1.0a2: Upserts and finely grained permissions describes the new upsert API and much improved permissions capabilities introduced in the latest Datasette 1.0a2 alpha release.
2nd December 2022
Datasette’s new JSON write API: The first alpha of Datasette 1.0 introduces the new write API shipped in the first of the Datasette 1.0 alpha series of releases, including detailed descriptions of two demos that show how the API can be used.
27th October 2022
Datasette 0.63 is out. Here are the annotated release notes.
8th September 2022
Exploring the training data behind Stable Diffusion describes the process of building and deploying a 4GB searchable SQLite database using Datasette, starting with Parquet data that was used to train the Stable Diffusion image generation model. See also Exploring 12 Million of the 2.3 Billion Images Used to Train Stable Diffusion’s Image Generator.
21st August 2022
Analyzing ScotRail audio announcements with Datasette—from prototype to production provides a detailed walk-through of the process of constructing an initial rapid prototype using Datasette Lite, extending it with a custom plugin and then deploying it as a full Datasette instance using GitHub Actions and Vercel.
14th August 2022
Datasette 0.62 introduces compatibility with Pyodide for Datasette Lite, and incorporates a number of bug fixes, plugin hook upgrades and other improvements.
31st July 2022
New tutorial and accompanying ten minute video: Cleaning data with sqlite-utils and Datasette.
30th June 2022
s3-ocr is a new tool which can run OCR (via Amazon Textract) against every PDF file in an S3 bucket and write the results to a searchable SQLite database, ready to use with Datasette. Read more about it in s3-ocr: Extract text from PDF files stored in an S3 bucket.
5th May 2022
Datasette Lite is a new way to run Datasette: entirely in your browser, thanks to the Pyodide project which provides a full Python environment compiled to WebAssembly. You can use it to explore any SQLite database file hosted on a CORS-enabled static hosting provider, which includes GitHub and GitHub Pages. Read more about this project in Datasette Lite: a server-side Python web application running in a browser.
12th April 2022
Datasette for geospatial analysis describes how Datasette can be used in conjunction with SpatiaLite to work with geospatial data, including details of several geospatial plugins and tools from the Datasette ecosystem.
Latest releases
21st May 2023
sqlite-utils 3.32.1 - CLI tool and Python library for manipulating SQLite databases
- Examples in the CLI documentation can now all be copied and pasted without needing to remove a leading $. (#551)
- Documentation now covers Setting up shell completion for bash and zsh. (#552)
sqlite-utils 3.32
- New experimental sqlite-utils tui interface for interactively building command-line invocations, powered by Trogon. This requires an optional dependency, installed using sqlite-utils install trogon. There is a screenshot in the documentation. (#545)
- sqlite-utils analyze-tables command (documentation) now has a --common-limit 20 option for changing the number of common/least-common values shown for each column. (#544)
- sqlite-utils analyze-tables --no-most and --no-least options for disabling calculation of most-common and least-common values.
- If a column contains only null values, analyze-tables will no longer attempt to calculate the most common and least common values for that column. (#547)
- Calling sqlite-utils analyze-tables with non-existent columns in the -c/--column option now results in an error message. (#548)
- The table.analyze_column() method (documented here) now accepts most_common=False and least_common=False options for disabling calculation of those values.
8th May 2023
sqlite-utils 3.31
- Dropped support for Python 3.6. Tests now ensure compatibility with Python 3.11. (#517)
- Automatically locates the SpatiaLite extension on Apple Silicon. Thanks, Chris Amico. (#536)
- New --raw-lines option for the sqlite-utils query and sqlite-utils memory commands, which outputs just the raw value of the first column of every row. (#539)
- Fixed a bug where table.upsert_all() failed if the not_null= option was passed. (#538)
- Fixed a ResourceWarning when using sqlite-utils insert. (#534)
- Now shows a more detailed error message when sqlite-utils insert is called with invalid JSON. (#532)
- table.convert(..., skip_false=False) and sqlite-utils convert --no-skip-false options, for avoiding a misfeature where the convert() mechanism skips rows in the database with a falsey value for the specified column. Fixing this by default would be a backwards-incompatible change and is under consideration for a 4.0 release in the future. (#527)
- Tables can now be created with self-referential foreign keys. Thanks, Scott Perry. (#537)
- sqlite-utils transform no longer breaks if a table defines default values for columns. Thanks, Kenny Song. (#509)
- Fixed a bug where repeated calls to table.transform() did not work correctly. Thanks, Martin Carpenter. (#525)
- Improved error message if rows_from_file() is passed a non-binary-mode file-like object. (#520)
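The skip_false behaviour mentioned in the notes above can be illustrated with a small pure-Python sketch. This is not the sqlite-utils code; `convert_values` and `describe` are hypothetical names, and the sketch only mimics the described rule that falsey values (empty string, 0, None) are left untouched unless skipping is turned off.

```python
def convert_values(values, fn, skip_false=True):
    """Apply fn to each value, optionally leaving falsey values unchanged.

    Hypothetical stand-in for the behaviour described in the release notes.
    """
    converted = []
    for value in values:
        if skip_false and not value:
            # Falsey values ("", 0, None) pass through without conversion.
            converted.append(value)
        else:
            converted.append(fn(value))
    return converted


def describe(value):
    """Toy conversion function that wraps the value in angle brackets."""
    return f"<{value}>"


values = ["a", "", 0, None, "b"]
default_behaviour = convert_values(values, describe)
no_skip = convert_values(values, describe, skip_false=False)
```

With skipping on, a legitimate value such as 0 is never passed to the conversion function, which is why the notes call the default a misfeature.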
2nd May 2023
datasette-dashboards 0.5.2 - Datasette plugin providing data dashboards from metadata
30th April 2023
s3-credentials 0.15 - A tool for creating credentials for accessing S3 buckets
29th April 2023
openai-to-sqlite 0.3 - Save OpenAI API results to a SQLite database
- New openai-to-sqlite query data.db SQL command for executing SQL queries against a database with access to custom OpenAI SQL functions. #11
- New chatgpt(prompt) and chatgpt(prompt, system_prompt) SQL functions for use with openai-to-sqlite query.
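The idea of custom SQL functions with one-argument and two-argument forms can be sketched with sqlite3's `create_function`. This example does not call OpenAI and is not how openai-to-sqlite is implemented; `fake_chatgpt` is a hypothetical stub that returns a canned string so the SQL side can be shown on its own.

```python
import sqlite3


def fake_chatgpt(prompt, system_prompt=None):
    """Stub standing in for a model call; returns a canned reply.

    Hypothetical function: a real version would send the prompt to an API.
    """
    prefix = f"[{system_prompt}] " if system_prompt else ""
    return f"{prefix}stub reply to: {prompt}"


conn = sqlite3.connect(":memory:")
# SQLite allows the same function name to be registered once per argument
# count, so both the 1-argument and 2-argument forms can coexist.
conn.create_function("chatgpt", 1, fake_chatgpt)
conn.create_function("chatgpt", 2, fake_chatgpt)

one_arg = conn.execute("SELECT chatgpt('Hello')").fetchone()[0]
two_arg = conn.execute("SELECT chatgpt('Hello', 'Be brief')").fetchone()[0]
```

Once registered, the function can be used anywhere an expression is allowed, for example applied to every row of a table in a SELECT.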
27th April 2023
datasette 0.64.3 - An open source multi-tool for exploring and publishing data
shot-scraper 1.2 - A command-line utility for taking automated screenshots of websites
- New --omit-background option to the shot command to optionally create transparent PNGs. Thanks, Ben Welsh. #108
- Fixed bug that caused shot-scraper to fail to take screenshots on Windows. Thanks, Omer Rosenbaum. #104
- New --silent option for the shot, multi, pdf and html commands, to disable the default console output. #107
25th April 2023
datasette-dashboards 0.5.1 - Datasette plugin providing data dashboards from metadata
24th April 2023
datasette-explain 0.1a2 - Explain SQL queries executed using Datasette
- Fix for "You did not supply a value for binding parameter" error. #4
20th April 2023
datasette-dashboards 0.5.0 - Datasette plugin providing data dashboards from metadata
11th April 2023
swarm-to-sqlite 0.3.4 - Create a SQLite database containing your checkin history from Foursquare Swarm
- Fixed an error in the checkins_detail view. #15
4th April 2023
datasette-explain 0.1a1 - Explain SQL queries executed using Datasette
- :param parameters in SQL queries are now supported. #3
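For readers unfamiliar with :param style parameters, here is a minimal sketch using the standard-library sqlite3 module, which accepts the same named-placeholder syntax. It shows a named parameter being bound both for a query and for an EXPLAIN QUERY PLAN of that query; the `checkins` table and its rows are made up, and this is not the plugin's own code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE checkins (venue TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO checkins VALUES (?, ?)",
    [("Cafe One", "Oakland"), ("Book Shop", "Oakland"), ("Pier", "Brighton")],
)

# :city is a named parameter; its value is supplied separately as a dict.
sql = "SELECT venue FROM checkins WHERE city = :city ORDER BY venue"
params = {"city": "Oakland"}

rows = conn.execute(sql, params).fetchall()

# The same binding is needed when asking SQLite to explain the query.
plan = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
```

Leaving the dictionary out raises an error about a missing binding, which is the kind of failure the later 0.1a2 fix refers to.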
30th March 2023
datasette-dashboards 0.4.0 - Datasette plugin providing data dashboards from metadata
24th March 2023
datasette-chatgpt-plugin 0.1 - A Datasette plugin that turns a Datasette instance into a ChatGPT plugin
- Initial release. Install this plugin to expose the first database in your Datasette instance as a ChatGPT plugin, provided you have preview access to that feature. #1