Tool for running transformations on columns in a SQLite database.
⚠️ This tool is no longer maintainedI added a new tool to sqlite-utils called sqlite-utils convert which provides a super-set of the functionality originally provided here.
sqlite-transformis no longer maintained, and I recommend switching to usingsqlite-utils convertinstead.
pip install sqlite-transform
These subcommands will run all values in the specified column through dateutils.parser.parse() and replace them with the result, formatted as an ISO timestamp or ISO date.
For example, if a row in the database has an opened column which contains 10/10/2019 08:10:00 PM, running the following command:
sqlite-transform parsedatetime my.db mytable opened
Will result in that value being replaced by 2019-10-10T20:10:00.
Using the parsedate subcommand here would result in 2019-10-10 instead.
In the case of ambiguous dates such as 03/04/05 these commands both default to assuming American-style mm/dd/yy format. You can pass --dayfirst to specify that the day should be assumed to be first, or --yearfirst for the year.
The jsonsplit subcommand takes columns that contain a comma-separated list, for example a tags column containing records like "trees,park,dogs" and converts it into a JSON array ["trees", "park", "dogs"].
This is useful for taking advantage of Datasette's Facet by JSON array feature.
sqlite-transform jsonsplit my.db mytable tags
It defaults to splitting on commas, but you can specify a different delimiter character using the --delimiter option, for example:
sqlite-transform jsonsplit \
my.db mytable tags --delimiter ';'
Values within the array will be treated as strings, so a column containing 123,552,775 will be converted into the JSON array ["123", "552", "775"].
You can specify a different type for these values using --type int or --type float, for example:
sqlite-transform jsonsplit \
my.db mytable tags --type int
This will result in that column being converted into [123, 552, 775].
The lambda subcommand lets you specify Python code which will be executed against the column.
Here's how to convert a column to uppercase:
sqlite-transform lambda my.db mytable mycolumn --code='str(value).upper()'
The code you provide will be compiled into a function that takes value as a single argument. You can break your function body into multiple lines, provided the last line is a return statement:
sqlite-transform lambda my.db mytable mycolumn --code='value = str(value)
return value.upper()'
You can also specify Python modules that should be imported and made available to your code using one or more --import options:
sqlite-transform lambda my.db mytable mycolumn \
--code='"\n".join(textwrap.wrap(value, 10))' \
--import=textwrap
The --dry-run option will output a preview of the transformation against the first ten rows, without modifying the database.
Each of these commands accepts optional --output and --output-type options. These can be used to save the result of the transformation to a separate column, which will be created if the column does not already exist.
To save the result of jsonsplit to a new column called json_tags, use the following:
sqlite-transform jsonsplit my.db mytable tags \
--output json_tags
The type of the created column defaults to text, but a different column type can be specified using --output-type. This example will create a new floating point column called float_id with a copy of each item's ID increased by 0.5:
sqlite-transform lambda my.db mytable id \
--code 'float(value) + 0.5' \
--output float_id \
--output-type float
You can drop the original column at the end of the operation by adding --drop.
Sometimes you may wish to convert a single column into multiple derived columns. For example, you may have a location column containing latitude,longitude values which you wish to split out into separate latitude and longitude columns.
You can achieve this using the --multi option to sqlite-transform lambda. This option expects your --code function to return a Python dictionary: new columns well be created and populated for each of the keys in that dictionary.
For the latitude,longitude example you would use the following:
sqlite-transform lambda demo.db places location \
--code 'return {
"latitude": float(value.split(",")[0]),
"longitude": float(value.split(",")[1]),
}' --multi
The type of the returned values will be taken into account when creating the new columns. In this example, the resulting database schema will look like this:
CREATE TABLE [places] (
[location] TEXT,
[latitude] FLOAT,
[longitude] FLOAT
);The code function can also return None, in which case its output will be ignored.
You can drop the original column at the end of the operation by adding --drop.
By default each command will show a progress bar. Pass -s or --silent to hide that progress bar.