# Tailing Google Cloud Run request logs and importing them into SQLite

The `gcloud` CLI tool has [the alpha ability to tail log files](https://cloud.google.com/logging/docs/reference/tools/gcloud-logging#live-tailing) - but it's a bit of a pain to set up.

You have to install two extras for it. First, run this:

```
gcloud alpha logging tail
```

That installs the functionality, but as the documentation will tell you:

> To use `gcloud alpha logging tail`, you need to have Python 3 and the `grpcio` Python package installed.

Assuming you have Python 3, the problem you have to solve is *which Python* the `gcloud` tool uses to run. After digging around in the source code using `cat $(which gcloud)` I spotted the following:

```
CLOUDSDK_PYTHON=$(order_python python3 python2 python2.7 python)
```

So it looks like (on macOS at least) it prefers the `python3` binary if it can find it. So this works to install `grpcio` somewhere it can see it:

```
python3 -m pip install grpcio
```

Having done that, you can start running commands.
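The selection logic amounts to "first interpreter found on the path wins". Here's a minimal sketch of that behaviour (my reconstruction based on the wrapper script, not the Cloud SDK's actual code), plus a check that the chosen interpreter can import `grpc` - which is the import name of the `grpcio` package:

```shell
# Rough reconstruction of the gcloud wrapper's interpreter selection:
# return the first candidate that exists on PATH
order_python() {
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}

CLOUDSDK_PYTHON=$(order_python python3 python2 python2.7 python)
echo "Selected interpreter: $CLOUDSDK_PYTHON"

# The grpcio package installs a module named "grpc" - check it is visible
"$CLOUDSDK_PYTHON" -c 'import importlib.util; print("grpcio ok" if importlib.util.find_spec("grpc") else "grpcio missing")'
```

If the second line prints `grpcio missing`, the tail command will fail even once everything else is configured.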
`gcloud logging logs list` shows a list of logs:

```
~ % gcloud logging logs list
NAME
projects/datasette-222320/logs/cloudaudit.googleapis.com%2Factivity
projects/datasette-222320/logs/cloudaudit.googleapis.com%2Fdata_access
projects/datasette-222320/logs/cloudaudit.googleapis.com%2Fsystem_event
projects/datasette-222320/logs/cloudbuild
projects/datasette-222320/logs/clouderrorreporting.googleapis.com%2Finsights
projects/datasette-222320/logs/cloudtrace.googleapis.com%2FTraceLatencyShiftDetected
projects/datasette-222320/logs/run.googleapis.com%2Frequests
projects/datasette-222320/logs/run.googleapis.com%2Fstderr
projects/datasette-222320/logs/run.googleapis.com%2Fstdout
projects/datasette-222320/logs/run.googleapis.com%2Fvarlog%2Fsystem
```

Then you can use `gcloud alpha logging tail projects/datasette-222320/logs/run.googleapis.com%2Frequests` to start tailing that log. You also need the `CLOUDSDK_PYTHON_SITEPACKAGES=1` environment variable so that `gcloud` knows to look for the `grpcio` dependency:

```
CLOUDSDK_PYTHON_SITEPACKAGES=1 \
  gcloud alpha logging tail projects/datasette-222320/logs/run.googleapis.com%2Frequests
```

The default format is verbose YAML.
A log entry looks like this:

```yaml
httpRequest:
  latency: 0.123684963s
  remoteIp: 66.249.69.240
  requestMethod: GET
  requestSize: '510'
  requestUrl: https://www.niche-museums.com/browse/museums.json?_facet_size=max&country=United+States&_facet=osm_city&_facet=updated&_facet=osm_suburb&_facet=osm_footway&osm_city=Santa+Cruz
  responseSize: '6403'
  serverIp: 142.250.125.121
  status: 200
  userAgent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
insertId: 611171fe000a38a469d59595
labels:
  instanceId: 00bf4bf02dab164592dbbb9220b56c3ce64cb0f1c1f37812d1d61e851a931e9964ba539c2ede42886773c82662cc28aa858749d2697f537ff7a61e7b
  service: niche-museums
logName: projects/datasette-222320/logs/run.googleapis.com%2Frequests
receiveTimestamp: '2021-08-09T18:20:46.935658405Z'
resource:
  labels:
    configuration_name: niche-museums
    location: us-central1
    project_id: datasette-222320
    revision_name: niche-museums-00039-sur
    service_name: niche-museums
  type: cloud_run_revision
severity: INFO
timestamp: '2021-08-09T18:20:46.669860Z'
trace: projects/datasette-222320/traces/306a0d6e7e055ba66172003a74c926c2
```

I decided to import the entries into a SQLite database so I could use [Datasette](https://datasette.io/) to analyze the log files (hooray for facets).
Adding `--format json` switches the output to JSON - but it's a pretty-printed array of JSON objects, something like this:

```json
[
  {
    "httpRequest": {
      "latency": "0.112114537s",
      "remoteIp": "40.77.167.88",
      "requestMethod": "GET",
      "requestSize": "534",
      "requestUrl": "https://datasette.io/content/repos?forks=0&_facet=homepage&_facet=size&_facet=open_issues&open_issues=3&size=564&_sort=readme_html",
      "responseSize": "72757",
      "serverIp": "216.239.38.21",
      "status": 200,
      "userAgent": "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
    },
    "insertId": "6111722f000b5b4c4d4071e2",
    "labels": {
      "instanceId": "00bf4bf02d1d7fe4402c3aff8a34688d9a910e6ee6d2545ceebc1edefb99461481e6d9f9ae8de4e907e3d18b98ea9c7f57b2abb527c8857d9163ed193db766c349a1ee",
      "service": "datasette-io"
    },
    "logName": "projects/datasette-222320/logs/run.googleapis.com%2Frequests",
    "receiveTimestamp": "2021-08-09T18:21:36.061693305Z",
    "resource": {
      "labels": {
        "configuration_name": "datasette-io",
        "location": "us-central1",
        "project_id": "datasette-222320",
        "revision_name": "datasette-io-00416-coy",
        "service_name": "datasette-io"
      },
      "type": "cloud_run_revision"
    },
    "severity": "INFO",
    "timestamp": "2021-08-09T18:21:35.744268Z",
    "trace": "projects/datasette-222320/traces/016d640caf845fbf8709486bc8dff9c7"
  }
]
```

I want to stream the logs into `sqlite-utils` using [newline-delimited JSON](https://sqlite-utils.datasette.io/en/stable/cli.html#inserting-newline-delimited-json) since that can insert while the data is still being tailed.

I ended up using two new `jq` recipes:

```
cat example.json | jq -cn --stream 'fromstream(1|truncate_stream(inputs))'
```

This turns an `[{"array": "of objects"}, {"like": "this one"}]` into a stream of newline-delimited objects. I [found the recipe here](https://github.com/stedolan/jq/issues/1984#issuecomment-568918146) - I don't understand it.

As you can see above, the objects are nested.
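Before dealing with that nesting: if the `--stream` recipe looks like magic to you too, its net effect is easy to see in isolation - it explodes a top-level array into one compact JSON object per line. Here's the same transformation in plain Python for comparison (this version buffers the whole array in memory, which is why it wouldn't work for tailing):

```shell
# Explode a JSON array into newline-delimited JSON, the non-streaming way
echo '[{"array": "of objects"}, {"like": "this one"}]' | python3 -c '
import json, sys
# Load the whole array, then emit each element as one line of compact JSON
for obj in json.load(sys.stdin):
    print(json.dumps(obj, separators=(",", ":")))
'
```

This prints `{"array":"of objects"}` and `{"like":"this one"}` on separate lines. The jq version produces the same output but works incrementally on an endless stream, which is what makes it suitable for a tailed log.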
I want them as flat objects so that `sqlite-utils insert` will create a separate column for each nested value. I used [this recipe](https://til.simonwillison.net/jq/flatten-nested-json-objects-jq) for that.

The end result was this:

```
CLOUDSDK_PYTHON_SITEPACKAGES=1 gcloud alpha logging tail \
  projects/datasette-222320/logs/run.googleapis.com%2Frequests \
  --format json \
  | jq -cn --stream 'fromstream(1|truncate_stream(inputs))' \
  | jq -c '[leaf_paths as $path | { "key": $path | join("_"), "value": getpath($path) }] | from_entries' \
  | sqlite-utils insert /tmp/logs.db logs - --nl --alter --batch-size 1
```

That last line inserts the data into the `/tmp/logs.db` database file. `--nl` means "expect newline-delimited JSON", `--alter` means "add new columns if they are missing" and `--batch-size 1` means "commit after every record" (so I can see them in Datasette while they are streaming in).

**UPDATE:** [sqlite-utils 3.15](https://sqlite-utils.datasette.io/en/stable/changelog.html#v3-15) added a `--flatten` option which you can use instead of that second `jq` recipe, so this should work instead:

```
CLOUDSDK_PYTHON_SITEPACKAGES=1 gcloud alpha logging tail \
  projects/datasette-222320/logs/run.googleapis.com%2Frequests \
  --format json \
  | jq -cn --stream 'fromstream(1|truncate_stream(inputs))' \
  | sqlite-utils insert /tmp/logs.db logs - --nl --alter --batch-size 1 --flatten
```

The resulting schema looks like this (via `sqlite-utils schema /tmp/logs.db`):

```sql
CREATE TABLE [logs] (
   [httpRequest_latency] TEXT,
   [httpRequest_remoteIp] TEXT,
   [httpRequest_requestMethod] TEXT,
   [httpRequest_requestSize] TEXT,
   [httpRequest_requestUrl] TEXT,
   [httpRequest_responseSize] TEXT,
   [httpRequest_serverIp] TEXT,
   [httpRequest_status] INTEGER,
   [httpRequest_userAgent] TEXT,
   [insertId] TEXT,
   [labels_instanceId] TEXT,
   [labels_service] TEXT,
   [logName] TEXT,
   [receiveTimestamp] TEXT,
   [resource_labels_configuration_name] TEXT,
   [resource_labels_location] TEXT,
   [resource_labels_project_id] TEXT,
   [resource_labels_revision_name] TEXT,
   [resource_labels_service_name] TEXT,
   [resource_type] TEXT,
   [severity] TEXT,
   [timestamp] TEXT,
   [trace] TEXT,
   [httpRequest_referer] TEXT
);
```

Then I ran `datasette /tmp/logs.db` to start exploring the logs. Faceting by `resource_labels_service_name` was particularly useful.

<img width="1340" alt="Screenshot of logs in Datasette" src="https://user-images.githubusercontent.com/9599/128755995-fab7e478-82a5-4d80-a959-f89f7dd39209.png">

The `httpRequest_latency` column contains text values that look like `0.012572683s` - thankfully, if you cast one to a `float` the trailing `s` is ignored. Here's an example query showing the services with the highest average latency:

```sql
select
  resource_labels_service_name,
  avg(cast(httpRequest_latency as float)) as avg_latency,
  count(*)
from logs
group by resource_labels_service_name
order by avg_latency desc
```

## Using the Logs Explorer

Alternatively, you can use the Google Cloud Logs Explorer! It has pretty decent faceted search built in.

Here's a query showing results from that log file:

```
resource.type="cloud_run_revision"
log_name="projects/datasette-222320/logs/run.googleapis.com%2Frequests"
```

Run that at https://console.cloud.google.com/logs/query - or here's a link that executes it directly (for the last 7 days): https://console.cloud.google.com/logs/query;query=resource.type%3D%22cloud_run_revision%22%0Alog_name%3D%22projects%2Fdatasette-222320%2Flogs%2Frun.googleapis.com%252Frequests%22;timeRange=P7D;?project=datasette-222320
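As a footnote to the latency query earlier: the claim that casting ignores the trailing `s` is easy to verify without any log data, because SQLite's `CAST` parses the longest numeric prefix of a string and discards the rest. A quick check using Python's bundled `sqlite3` module:

```shell
# SQLite's CAST parses the leading numeric prefix, so '0.012572683s'
# casts to 0.012572683 and the trailing "s" is silently dropped
python3 -c '
import sqlite3
db = sqlite3.connect(":memory:")
print(db.execute("select cast(? as float)", ("0.012572683s",)).fetchone()[0])
'
```

This prints `0.012572683`.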