A jq idiom for nested log parsing
I keep re-deriving this. Application logs in JSON-lines are great, but
they nest deep, and jq calls turn into a wall of .foo.bar.baz?.
The trick is paths. Say I have:
{"req":{"id":"abc","headers":{"x-trace":"7af"}},"latency_ms":12}
{"req":{"id":"def","headers":{"x-trace":"8ab"}},"latency_ms":18}
To get every (id, x-trace, latency) tuple without typing the paths:
jq -r '[.req.id, .req.headers["x-trace"], .latency_ms] | @tsv'
That’s the obvious one. But when the schema drifts — say
half the rows lose headers — the missing-key error stops the
pipeline. The fix is ? on every step or // "-" after the lookup:
jq -r '[.req.id // "-", .req.headers["x-trace"]? // "-",
.latency_ms // "-"] | @tsv'
Three idioms I lean on:
to_entries[]to flatten a map of metrics into rows.select(.latency_ms > 100)for filtering after projection — reads like SQL.--slurpfilefor joining two log streams by ID (jq -n).
There’s mlr (Miller) for tabular data, but for JSONL,
jq plus column -t is still my default.