til: django_export-csv-from-django-admin.md
This data as json
| path | topic | title | url | body | html | shot | created | created_utc | updated | updated_utc | shot_hash | slug |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| django_export-csv-from-django-admin.md | django | Django Admin action for exporting selected rows as CSV | https://github.com/simonw/til/blob/main/django/export-csv-from-django-admin.md | I wanted to add an action option to the Django Admin for exporting the currently selected set of rows (or every row in the table) as a CSV file. I ended up using a pattern inspired by [this Django Snippet](https://djangosnippets.org/snippets/10767/), but with an added touch for more efficient exports. In order to avoid using up too much memory for the export, I use keyset pagination to fetch 500 rows at a time. The `keyset_pagination_iterator()` helper function accepts any queryset, orders it by the primary key and then repeatedly fetches 500 items. It then modifies the queryset to add a `WHERE id > $last_seen_id` clause. This is a relatively inexpensive way to paginate, so having an endpoint perform that query dozens or even hundreds of times should hopefully avoid adding too much load to the database. The action itself uses a pattern that combines `StringIO` and `csv.writer()` to stream out the results as a CSV file. Django's `StreamingHttpResponse` mechanism is really neat: it accepts a Python iterator or generator and returns a streaming response derived from that sequence. The Django documentation says "Streaming responses will tie a worker process for the entire duration of the response. This may result in poor performance" - this particular project runs on Google Cloud Run so I'm less concerned about tying up a worker than I would be normally, plus the export option is only available to trusted staff users with access to the Django Admin interface. To add the CSV export option to a `ModelAdmin` subclass, do the following: ```python from .admin_actions import export_as_csv_action @admin.register(County) class CountyAdmin(admin.ModelAdmin): actions = [export_as_csv_action()] ``` Here's `admin_actions.py`: ```python import csv from io import StringIO from django.http import StreamingHttpResponse def keyset_pagination_iterator(input_queryset, batch_size=500): all_queryset = input_queryset.order_by("pk") last_pk = None while True: queryset = all_queryset if last_pk is not None: queryset = all_queryset.filter(pk__gt=last_pk) queryset = queryset[:batch_size] for row in queryset: last_pk = row.pk yield row if not queryset: break def export_as_csv_action(description="Export selected rows to CSV"): def export_as_csv(modeladmin, request, queryset): def rows(queryset): csvfile = StringIO() csvwriter = csv.writer(csvfile) columns = [field.name for field in modeladmin.model._meta.fields] def read_and_flush(): csvfile.seek(0) data = csvfile.read() csvfile.seek(0) csvfile.truncate() return data header = False if not header: header = True csvwriter.writerow(columns) yield read_and_flush() for row in keyset_pagination_iterator(queryset): csvwriter.writerow(getattr(row, column) for column in columns) yield read_and_flush() response = StreamingHttpResponse(rows(queryset), content_type="text/csv") response["Content-Disposition"] = ( "attachment; filename=%s.csv" % modeladmin.model.__name__ ) return response export_as_csv.short_description = description return export_as_csv ``` | <p>I wanted to add an action option to the Django Admin for exporting the currently selected set of rows (or every row in the table) as a CSV file.</p> <p>I ended up using a pattern inspired by <a href="https://djangosnippets.org/snippets/10767/" rel="nofollow">this Django Snippet</a>, but with an added touch for more efficient exports. In order to avoid using up too much memory for the export, I use keyset pagination to fetch 500 rows at a time.</p> <p>The <code>keyset_pagination_iterator()</code> helper function accepts any queryset, orders it by the primary key and then repeatedly fetches 500 items. It then modifies the queryset to add a <code>WHERE id > $last_seen_id</code> clause. This is a relatively inexpensive way to paginate, so having an endpoint perform that query dozens or even hundreds of times should hopefully avoid adding too much load to the database.</p> <p>The action itself uses a pattern that combines <code>StringIO</code> and <code>csv.writer()</code> to stream out the results as a CSV file.</p> <p>Django's <code>StreamingHttpResponse</code> mechanism is really neat: it accepts a Python iterator or generator and returns a streaming response derived from that sequence.</p> <p>The Django documentation says "Streaming responses will tie a worker process for the entire duration of the response. This may result in poor performance" - this particular project runs on Google Cloud Run so I'm less concerned about tying up a worker than I would be normally, plus the export option is only available to trusted staff users with access to the Django Admin interface.</p> <p>To add the CSV export option to a <code>ModelAdmin</code> subclass, do the following:</p> <div class="highlight highlight-source-python"><pre><span class="pl-k">from</span> .<span class="pl-s1">admin_actions</span> <span class="pl-k">import</span> <span class="pl-s1">export_as_csv_action</span> <span class="pl-en">@<span class="pl-s1">admin</span>.<span class="pl-en">register</span>(<span class="pl-v">County</span>)</span> <span class="pl-k">class</span> <span class="pl-v">CountyAdmin</span>(<span class="pl-s1">admin</span>.<span class="pl-v">ModelAdmin</span>): <span class="pl-s1">actions</span> <span class="pl-c1">=</span> [<span class="pl-en">export_as_csv_action</span>()]</pre></div> <p>Here's <code>admin_actions.py</code>:</p> <div class="highlight highlight-source-python"><pre><span class="pl-k">import</span> <span class="pl-s1">csv</span> <span class="pl-k">from</span> <span class="pl-s1">io</span> <span class="pl-k">import</span> <span class="pl-v">StringIO</span> <span class="pl-k">from</span> <span class="pl-s1">django</span>.<span class="pl-s1">http</span> <span class="pl-k">import</span> <span class="pl-v">StreamingHttpResponse</span> <span class="pl-k">def</span> <span class="pl-en">keyset_pagination_iterator</span>(<span class="pl-s1">input_queryset</span>, <span class="pl-s1">batch_size</span><span class="pl-c1">=</span><span class="pl-c1">500</span>): <span class="pl-s1">all_queryset</span> <span class="pl-c1">=</span> <span class="pl-s1">input_queryset</span>.<span class="pl-en">order_by</span>(<span class="pl-s">"pk"</span>) <span class="pl-s1">last_pk</span> <span class="pl-c1">=</span> <span class="pl-c1">None</span> <span class="pl-k">while</span> <span class="pl-c1">True</span>: <span class="pl-s1">queryset</span> <span class="pl-c1">=</span> <span class="pl-s1">all_queryset</span> <span class="pl-k">if</span> <span class="pl-s1">last_pk</span> <span class="pl-c1">is</span> <span class="pl-c1">not</span> <span class="pl-c1">None</span>: <span class="pl-s1">queryset</span> <span class="pl-c1">=</span> <span class="pl-s1">all_queryset</span>.<span class="pl-en">filter</span>(<span class="pl-s1">pk__gt</span><span class="pl-c1">=</span><span class="pl-s1">last_pk</span>) <span class="pl-s1">queryset</span> <span class="pl-c1">=</span> <span class="pl-s1">queryset</span>[:<span class="pl-s1">batch_size</span>] <span class="pl-k">for</span> <span class="pl-s1">row</span> <span class="pl-c1">in</span> <span class="pl-s1">queryset</span>: <span class="pl-s1">last_pk</span> <span class="pl-c1">=</span> <span class="pl-s1">row</span>.<span class="pl-s1">pk</span> <span class="pl-k">yield</span> <span class="pl-s1">row</span> <span class="pl-k">if</span> <span class="pl-c1">not</span> <span class="pl-s1">queryset</span>: <span class="pl-k">break</span> <span class="pl-k">def</span> <span class="pl-en">export_as_csv_action</span>(<span class="pl-s1">description</span><span class="pl-c1">=</span><span class="pl-s">"Export selected rows to CSV"</span>): <span class="pl-k">def</span> <span class="pl-en">export_as_csv</span>(<span class="pl-s1">modeladmin</span>, <span class="pl-s1">request</span>, <span class="pl-s1">queryset</span>): <span class="pl-k">def</span> <span class="pl-en">rows</span>(<span class="pl-s1">queryset</span>): <span class="pl-s1">csvfile</span> <span class="pl-c1">=</span> <span class="pl-v">StringIO</span>() <span class="pl-s1">csvwriter</span> <span class="pl-c1">=</span> <span class="pl-s1">csv</span>.<span class="pl-en">writer</span>(<span class="pl-s1">csvfile</span>) <span class="pl-s1">columns</span> <span class="pl-c1">=</span> [<span class="pl-s1">field</span>.<span class="pl-s1">name</span> <span class="pl-k">for</span> <span class="pl-s1">field</span> <span class="pl-c1">in</span> <span class="pl-s1">modeladmin</span>.<span class="pl-s1">model</span>.<span class="pl-s1">_meta</span>.<span class="pl-s1">fields</span>] <span class="pl-k">def</span> <span class="pl-en">read_and_flush</span>(): <span class="pl-s1">csvfile</span>.<span class="pl-en">seek</span>(<span class="pl-c1">0</span>) <span class="pl-s1">data</span> <span class="pl-c1">=</span> <span class="pl-s1">csvfile</span>.<span class="pl-en">read</span>() <span class="pl-s1">csvfile</span>.<span class="pl-en">seek</span>(<span class="pl-c1">0</span>) <span class="pl-s1">csvfile</span>.<span class="pl-en">truncate</span>() <span class="pl-k">return</span> <span class="pl-s1">data</span> <span class="pl-s1">header</span> <span class="pl-c1">=</span> <span class="pl-c1">False</span> <span class="pl-k">if</span> <span class="pl-c1">not</span> <span class="pl-s1">header</span>: <span class="pl-s1">header</span> <span class="pl-c1">=</span> <span class="pl-c1">True</span> <span class="pl-s1">csvwriter</span>.<span class="pl-en">writerow</span>(<span class="pl-s1">columns</span>) <span class="pl-k">yield</span> <span class="pl-en">read_and_flush</span>() <span class="pl-k">for</span> <span class="pl-s1">row</span> <span class="pl-c1">in</span> <span class="pl-en">keyset_pagination_iterator</span>(<span class="pl-s1">queryset</span>): <span class="pl-s1">csvwriter</span>.<span class="pl-en">writerow</span>(<span class="pl-en">getattr</span>(<span class="pl-s1">row</span>, <span class="pl-s1">column</span>) <span class="pl-k">for</span> <span class="pl-s1">column</span> <span class="pl-c1">in</span> <span class="pl-s1">columns</span>) <span class="pl-k">yield</span> <span class="pl-en">read_and_flush</span>() <span class="pl-s1">response</span> <span class="pl-c1">=</span> <span class="pl-v">StreamingHttpResponse</span>(<span class="pl-en">rows</span>(<span class="pl-s1">queryset</span>), <span class="pl-s1">content_type</span><span class="pl-c1">=</span><span class="pl-s">"text/csv"</span>) <span class="pl-s1">response</span>[<span class="pl-s">"Content-Disposition"</span>] <span class="pl-c1">=</span> ( <span class="pl-s">"attachment; filename=%s.csv"</span> <span class="pl-c1">%</span> <span class="pl-s1">modeladmin</span>.<span class="pl-s1">model</span>.<span class="pl-s1">__name__</span> ) <span class="pl-k">return</span> <span class="pl-s1">response</span> <span class="pl-s1">export_as_csv</span>.<span class="pl-s1">short_description</span> <span class="pl-c1">=</span> <span class="pl-s1">description</span> <span class="pl-k">return</span> <span class="pl-s1">export_as_csv</span></pre></div> | <Binary: 91,616 bytes> | 2021-04-25T17:38:06-07:00 | 2021-04-26T00:38:06+00:00 | 2021-04-25T17:38:06-07:00 | 2021-04-26T00:38:06+00:00 | da3a8857be8af5bfa07a3e637e9929cc | export-csv-from-django-admin |