Datasette is a tool for exploring and publishing data. It helps people take data of any shape, analyze and explore it, and publish it as an interactive website and accompanying API.
Datasette is aimed at data journalists, museum curators, archivists, local governments, scientists, researchers and anyone else who has data that they wish to share with the world. It is part of a wider ecosystem of 46 tools and 128 plugins dedicated to making working with structured data as productive as possible.
Try a demo and explore 33,000 power plants around the world, then follow the tutorial or take a look at some other examples of Datasette in action.
Then read how to get started with Datasette, subscribe to the monthly-ish newsletter and consider signing up for office hours for an in-person conversation about the project.
New: Datasette Desktop - a macOS desktop application for easily running Datasette on your own computer!
Exploratory data analysis
Import data from CSVs, JSON, database connections and more. Datasette will automatically show you patterns in your data and help you share your findings with your colleagues.
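As a rough illustration of the import step, the sketch below loads CSV text into a SQLite database using only Python's standard library, since Datasette serves data from SQLite files. The function name `csv_to_sqlite`, the table name `plants` and the sample rows are hypothetical and chosen for illustration. In practice a dedicated tool such as sqlite-utils would usually handle this, including type detection, which this sketch skips by storing every column as TEXT.

```python
import csv
import io
import sqlite3


def csv_to_sqlite(csv_text, conn, table):
    """Load CSV text into a new table, storing every column as TEXT.

    Hypothetical helper for illustration; not part of Datasette.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Quote identifiers so column names containing spaces still work.
    columns = ", ".join(
        '"{}" TEXT'.format(name.replace('"', '""')) for name in header
    )
    quoted_table = '"{}"'.format(table.replace('"', '""'))
    conn.execute("CREATE TABLE {} ({})".format(quoted_table, columns))
    placeholders = ", ".join("?" for _ in header)
    rows = list(reader)
    conn.executemany(
        "INSERT INTO {} VALUES ({})".format(quoted_table, placeholders), rows
    )
    conn.commit()
    return len(rows)


# Made-up sample data, not taken from any real dataset.
SAMPLE = "name,country\nExample Plant A,FR\nExample Plant B,US\n"

# Use a file path such as "data.db" instead of ":memory:" to keep the result.
conn = sqlite3.connect(":memory:")
inserted = csv_to_sqlite(SAMPLE, conn, "plants")
print(inserted, conn.execute("SELECT count(*) FROM plants").fetchone()[0])
```

Writing to a file rather than `:memory:` produces a database that can then be opened with `datasette data.db`.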
Instant data publishing
datasette publish lets you instantly publish your data to hosting providers like Google Cloud Run, Heroku or Vercel.
Rapid prototyping
Spin up a JSON API for any data in minutes. Use it to prototype and prove your ideas without building a custom backend.
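A minimal sketch of what talking to that JSON API can look like from Python. It only builds the URL for a table's `.json` endpoint; the host `datasette.example.com`, the database `mydb`, the table `plants` and the helper name `table_json_url` are all hypothetical. It assumes the database and table names are already URL-safe, and that the instance accepts the `_shape` and `_size` query string options.

```python
from urllib.parse import urlencode


def table_json_url(base_url, database, table, **params):
    """Build the URL for a table's JSON endpoint on a Datasette instance.

    Assumes database and table names need no extra URL escaping.
    """
    path = "{}/{}/{}.json".format(base_url.rstrip("/"), database, table)
    if params:
        # Keyword arguments become query string options, in the order given.
        path += "?" + urlencode(params)
    return path


# Hypothetical instance, database and table names.
url = table_json_url(
    "https://datasette.example.com/", "mydb", "plants", _shape="array", _size=5
)
print(url)
```

Against a real instance, the resulting URL could be fetched with `urllib.request.urlopen(url)` and parsed with `json.load`.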
Latest news
22nd August 2023
Datasette 1.0a4 has a fix for a security vulnerability in the Datasette 1.0 alpha series: the API explorer interface exposed the names of private databases and tables in public instances that were protected by a plugin such as datasette-auth-passwords, though not the actual content of those tables. See the security advisory for more details, including workarounds if you can't upgrade immediately. The latest edition of the Datasette Newsletter also talks about this issue.
15th August 2023
datasette-write-ui: a Datasette plugin for editing, inserting, and deleting rows introduces a new plugin adding add/edit/delete functionality to Datasette, developed by Alex Garcia. Alex built this for Datasette Cloud, and this post is the first announcement made on the new Datasette Cloud blog - see also Welcome to Datasette Cloud.
9th August 2023
Datasette 1.0a3 is an alpha release of Datasette that previews the new default JSON API design that’s coming in version 1.0 - the single most significant change planned for that 1.0 release.
1st July 2023
New tutorial: Data analysis with SQLite and Python. This tutorial, originally presented at PyCon 2023, includes a 2h45m video and an extensive handout that should be useful with or without the video. Topics covered include Python's sqlite3 module, sqlite-utils, Datasette, Datasette Lite, advanced SQL patterns and more.
24th March 2023
I built a ChatGPT plugin to answer questions about data hosted in Datasette describes a new experimental Datasette plugin to enable people to query data hosted in a Datasette interface via ChatGPT, asking human language questions that are automatically converted to SQL and used to generate a readable response.
23rd February 2023
Using Datasette in GitHub Codespaces is a new tutorial showing how Datasette can be run in GitHub's free Codespaces browser-based development environments, using the new datasette-codespaces plugin.
28th January 2023
Examples of sites built using Datasette now includes screenshots of Datasette deployments that illustrate a variety of problems that can be addressed using Datasette and its plugins.
13th January 2023
Semantic search answers: Q&A against documentation with GPT3 + OpenAI embeddings shows how Datasette can be used to implement semantic search and build a system for answering questions against an existing corpus of text, using two new plugins: datasette-openai and datasette-faiss, and a new tool: openai-to-sqlite.
9th January 2023
Datasette 0.64 is out, and includes a strong warning against running SpatiaLite in production without disabling arbitrary SQL queries, plus a new --setting default_allow_sql off setting to make it easier to do that. See Datasette 0.64, with a warning about SpatiaLite for more about this release. A new tutorial, Building a location to time zone API with SpatiaLite, describes how to safely use SpatiaLite and Datasette to build and deploy an API for looking up time zones for a latitude/longitude location.
15th December 2022
Datasette 1.0a2: Upserts and finely grained permissions describes the new upsert API and much improved permissions capabilities introduced in the latest Datasette 1.0a2 alpha release.
2nd December 2022
Datasette’s new JSON write API: The first alpha of Datasette 1.0 introduces the new write API shipped in the first of the Datasette 1.0 alpha series of releases, including detailed descriptions of two demos that show how the API can be used.
27th October 2022
Datasette 0.63 is out. Here are the annotated release notes.
8th September 2022
Exploring the training data behind Stable Diffusion describes the process of building and deploying a 4GB searchable SQLite database using Datasette, starting with Parquet data that was used to train the Stable Diffusion image generation model. See also Exploring 12 Million of the 2.3 Billion Images Used to Train Stable Diffusion’s Image Generator.
21st August 2022
Analyzing ScotRail audio announcements with Datasette—from prototype to production provides a detailed walk-through of the process of constructing an initial rapid prototype using Datasette Lite, extending it with a custom plugin and then deploying it as a full Datasette instance using GitHub Actions and Vercel.
14th August 2022
Datasette 0.62 introduces compatibility with Pyodide for Datasette Lite, and incorporates a number of bug fixes, plugin hook upgrades and other improvements.
Latest releases
28th November 2023
datasette-pretty-json 0.3 - Datasette plugin that pretty-prints any column values that are valid JSON objects or arrays
- Now renders with white-space: pre-wrap to avoid overly wide columns. #4
21st November 2023
datasette-sentry 0.4 - Datasette plugin for configuring Sentry
- Added support for Sentry Performance Monitoring with the new "enable_tracing": true setting. #5
6th November 2023
llm 0.12 - A CLI utility and Python library for interacting with Large Language Models, including OpenAI, PaLM and local models installed on your own machine.
- Support for the new GPT-4 Turbo model from OpenAI. Try it using llm chat -m gpt-4-turbo or llm chat -m 4t. #323
- New -o seed 1 option for OpenAI models which sets a seed that can attempt to evaluate the prompt deterministically. #324
llm 0.11.2
- Pin to version of OpenAI Python library prior to 1.0 to avoid breaking. #327
4th November 2023
datasette-edit-schema 0.7.1 - Datasette plugin for modifying table schemas
- Fixed a bug where editing a schema raised a 500 error if any of the table column names included a single quote. #43
sqlite-utils 3.35.2 - CLI tool and Python library for manipulating SQLite databases
- The --load-extension=spatialite option and find_spatialite() utility function now both work correctly on arm64 Linux. Thanks, Mike Coats. (#599)
- Fix for bug where sqlite-utils insert could cause your terminal cursor to disappear. Thanks, Luke Plant. (#433)
- datetime.timedelta values are now stored as TEXT columns. Thanks, Harald Nezbeda. (#522)
- Test suite is now also run against Python 3.12.
1st November 2023
shot-scraper 1.3 - A command-line utility for taking automated screenshots of websites
- New --bypass-csp option for bypassing any Content Security Policy on the page that prevents executing further JavaScript. Thanks, Brenton Cleeland. #116
- Screenshots taken using shot-scraper --interactive $URL - which allows you to interact with the page in a browser window and then hit <enter> to take the screenshot - no longer reload the page before taking the shot (which ignored your activity). #125
- Improved accessibility of documentation. Thanks, Paolo Melchiorre. #120
llm 0.11.1 - A CLI utility and Python library for interacting with Large Language Models, including OpenAI, PaLM and local models installed on your own machine.
- Fixed a bug where llm embed -c "text" did not correctly pick up the configured default embedding model. #317
- New plugins: llm-python, llm-bedrock-anthropic and llm-embed-jina (described in Execute Jina embeddings with a CLI using llm-embed-jina).
- llm-gpt4all now uses the new GGUF model format. simonw/llm-gpt4all#16
26th October 2023
datasette-edit-schema 0.7 - Datasette plugin for modifying table schemas
- Ability to add an index (or a unique index) to a column on a table. #27
- Ability to drop an index from a table.
25th October 2023
datasette-ripgrep 0.8.2 - Web interface for searching your code using ripgrep, built as a Datasette plugin
- Fix for issue where templates were not rendered with access to the current request. #29
8th October 2023
datasette-llm-embed 0.2 - llm_embed(model_id, text) SQL function for Datasette
- New llm_embed_decode(blob) function returning a string JSON array of floats. #4
datasette 0.64.5 - An open source multi-tool for exploring and publishing data
- Dropped dependency on click-default-group-wheel, which could cause a dependency conflict. (#2197)
datasette-llm-embed 0.1 - llm_embed(model_id, text) SQL function for Datasette
- First non-alpha release.
- llm_embed(model_id, value) function now works for models that require API keys, such as ada-002. The key can be configured as a plugin secret in the metadata.json or metadata.yaml file. #3
datasette-llm-embed 0.1a1
- New llm_embed_cosine(a, b) function to calculate cosine similarity between two binary blob vectors. #2
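To make "cosine similarity between two binary blob vectors" concrete, here is a small Python sketch of the calculation. It is not the plugin's implementation. It assumes each blob is a packed sequence of little-endian 32-bit floats, and the helper names `encode`, `decode` and `cosine_similarity` are chosen for illustration.

```python
import math
import struct


def encode(values):
    """Pack floats into a blob of little-endian 32-bit floats (assumed format)."""
    return struct.pack("<" + "f" * len(values), *values)


def decode(blob):
    """Unpack a blob of little-endian 32-bit floats back into a tuple."""
    return struct.unpack("<" + "f" * (len(blob) // 4), blob)


def cosine_similarity(a, b):
    """Cosine similarity of two blob-encoded vectors of equal length.

    Raises ZeroDivisionError if either vector is all zeros.
    """
    va, vb = decode(a), decode(b)
    if len(va) != len(vb):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(va, vb))
    norm_a = math.sqrt(sum(x * x for x in va))
    norm_b = math.sqrt(sum(y * y for y in vb))
    return dot / (norm_a * norm_b)


# Two made-up vectors pointing in the same direction.
print(cosine_similarity(encode([1.0, 2.0, 3.0]), encode([2.0, 4.0, 6.0])))
```

Vectors pointing in the same direction score close to 1, orthogonal vectors score 0, and opposite vectors score close to -1.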