PostgreSQL EXPLAIN ANALYZE: the things I always forget
EXPLAIN ANALYZE is the only honest answer to “why is this query
slow.” I’ve been using it for a decade and still re-read the
output every time. A short list of things I always forget.
The actual time columns are (start, end). The first number is
when the node started producing rows (after startup cost), the second
is when it finished. A Sort node with actual time=200..200 is
fast; one with actual time=10..200 had startup cost.
Rows are per-loop, not total. When you see
actual time=0.001..0.003 rows=5 loops=10000, the actual row count
is 5 * 10000 = 50000. This is the number-one source of confused
“but the planner says X rows!” complaints.
Buffers: shared hit/read/dirtied is enabled by EXPLAIN (ANALYZE, BUFFERS). If read is large, you’re hitting disk.
If dirtied is non-zero on a SELECT, you’re writing hint bits
— common after a bulk insert.
BUFFERS, FORMAT YAML is the most readable layout for
nested plans. Pipe through less to navigate.
SET track_io_timing = on; EXPLAIN (ANALYZE, BUFFERS); adds
I/O Timings: read=... write=... per node. That tells you whether
slowness is CPU (no I/O time) or disk (lots of I/O time).
If you want a graphical view, explain.dalibo.com is the one I
trust. Paste the YAML output, get a tree with hot paths highlighted.
The fix is almost always an index, except when it isn’t.