Prepared statements are one of PostgreSQL’s most powerful features for query optimization. By parsing and planning queries once, then reusing those plans for subsequent executions, they can dramatically improve performance. But this optimization comes with a hidden danger: sometimes caching the same plan for every execution can lead to catastrophic memory exhaustion and performance degradation.
In this deep dive, we’ll explore how prepared statement plan caching works, when it fails spectacularly, and how PostgreSQL has evolved to address these challenges.
1. Understanding Prepared Statements and Plan Caching
When you execute a prepared statement in PostgreSQL, the database goes through several phases:
- Parsing: Converting the SQL text into a parse tree
- Planning: Creating an execution plan based on statistics and parameters
- Execution: Running the plan against actual data
The promise of prepared statements is simple: do steps 1 and 2 once, then reuse the results for repeated executions with different parameter values.
-- Prepare the statement
PREPARE get_orders AS
SELECT * FROM orders WHERE customer_id = $1;
-- Execute multiple times with different parameters
EXECUTE get_orders(123);
EXECUTE get_orders(456);
EXECUTE get_orders(789);
PostgreSQL uses a clever heuristic to decide when to cache plans. For the first five executions, it creates a custom plan specific to the parameter values. Starting with the sixth execution, it evaluates whether a generic plan (one that works for any parameter value) would be more efficient. If the average cost of the custom plans is close enough to the generic plan’s cost, PostgreSQL switches to reusing the generic plan.
2. The Dark Side: Memory Exhaustion from Plan Caching
Here’s where things can go catastrophically wrong. Consider a partitioned table:
CREATE TABLE events (
id BIGSERIAL,
event_date DATE,
user_id INTEGER,
event_type TEXT,
data JSONB
) PARTITION BY RANGE (event_date);
-- Create 365 partitions, one per day
CREATE TABLE events_2024_01_01 PARTITION OF events
FOR VALUES FROM ('2024-01-01') TO ('2024-01-02');
CREATE TABLE events_2024_01_02 PARTITION OF events
FOR VALUES FROM ('2024-01-02') TO ('2024-01-03');
-- ... 363 more partitions
Now consider this prepared statement:
PREPARE get_events AS
SELECT * FROM events WHERE event_date = $1;
The Problem: Generic Plans Can’t Prune Partitions
When PostgreSQL creates a generic plan for this query, it doesn’t know which specific date you’ll query at execution time. Without this knowledge, the planner cannot perform partition pruning the critical optimization that eliminates irrelevant partitions from consideration.
Here’s what happens:
- Custom plan (first 5 executions): PostgreSQL sees the actual date value, realizes only one partition is relevant, and creates a plan that touches only that partition. Fast and efficient.
- Generic plan (6th execution onward): PostgreSQL creates a plan that must be valid for ANY date value. Since it can’t know which partition you’ll need, it includes ALL 365 partitions in the plan.
The result: Instead of reading from 1 partition, PostgreSQL’s generic plan prepares to read from all 365 partitions. This leads to:
- Memory exhaustion: The query plan itself becomes enormous, containing nodes for every partition
- Planning overhead: Even though the plan is cached, initializing it for execution requires allocating memory for all partition nodes
- Execution inefficiency: The executor must check every partition, even though 364 of them will return zero rows
In extreme cases with thousands of partitions, this can consume gigabytes of memory per connection and bring your database to its knees.
3. Partition Pruning: The Critical Optimization and How It Works
Partition pruning is the process of eliminating partitions that cannot possibly contain relevant data based on query constraints. Understanding partition pruning in depth is essential for working with partitioned tables effectively.
3.1 What Is Partition Pruning?
At its core, partition pruning is PostgreSQL’s mechanism for avoiding unnecessary work. When you query a partitioned table, the database analyzes your WHERE clause and determines which partitions could possibly contain matching rows. All other partitions are excluded from the query execution entirely.
Consider a table partitioned by date range:
CREATE TABLE sales (
sale_id BIGINT,
sale_date DATE,
amount NUMERIC,
product_id INTEGER
) PARTITION BY RANGE (sale_date);
CREATE TABLE sales_2023_q1 PARTITION OF sales
FOR VALUES FROM ('2023-01-01') TO ('2023-04-01');
CREATE TABLE sales_2023_q2 PARTITION OF sales
FOR VALUES FROM ('2023-04-01') TO ('2023-07-01');
CREATE TABLE sales_2023_q3 PARTITION OF sales
FOR VALUES FROM ('2023-07-01') TO ('2023-10-01');
CREATE TABLE sales_2023_q4 PARTITION OF sales
FOR VALUES FROM ('2023-10-01') TO ('2024-01-01');
When you execute:
SELECT * FROM sales WHERE sale_date = '2023-05-15';
PostgreSQL performs partition pruning by examining the partition constraints. It determines that only sales_2023_q2 can contain rows with sale_date = ‘2023-05-15’, so it completely ignores the other three partitions. They never get opened, scanned, or loaded into memory.
3.2 The Two Stages of Partition Pruning
PostgreSQL performs partition pruning at two distinct stages in query execution, and understanding the difference is crucial for troubleshooting performance issues.
Stage 1: Plan Time Pruning (Static Pruning)
Plan time pruning happens during the query planning phase, before execution begins. This is the ideal scenario because pruned partitions never appear in the execution plan at all.
When it occurs:
- The query contains literal values in the WHERE clause
- The partition key columns are directly compared to constants
- The planner can evaluate the partition constraints at planning time
Example:
EXPLAIN SELECT * FROM sales WHERE sale_date = '2023-05-15';
Output might show:
Seq Scan on sales_2023_q2 sales
Filter: (sale_date = '2023-05-15'::date)
Notice that only one partition appears in the plan. The other three partitions were pruned away during planning, and they consume zero resources.
What makes plan time pruning possible:
The planner evaluates the WHERE clause condition against each partition’s constraint. For sales_2023_q2, the constraint is:
sale_date >= '2023-04-01' AND sale_date < '2023-07-01'
The planner performs boolean logic: “Can sale_date = ‘2023-05-15’ be true if the constraint requires sale_date >= ‘2023-04-01’ AND sale_date < ‘2023-07-01’?” Yes, it can. For the other partitions, the answer is no, so they’re eliminated.
Performance characteristics:
- No runtime overhead for pruned partitions
- Minimal memory usage
- Optimal query performance
- The execution plan is lean and specific
Stage 2: Execution Time Pruning (Dynamic Pruning)
Execution time pruning, also called runtime pruning, happens during query execution rather than planning. This occurs when the planner cannot determine which partitions to prune until the query actually runs.
When it occurs:
- Parameters or variables are used instead of literal values
- Subqueries provide the filter values
- Join conditions determine which partitions are needed
- Prepared statements with parameters
Example:
PREPARE get_sales AS
SELECT * FROM sales WHERE sale_date = $1;
EXPLAIN (ANALYZE) EXECUTE get_sales('2023-05-15');
With execution time pruning, the plan initially includes all partitions, but the output shows:
Append (actual rows=100)
Subplans Removed: 3
-> Seq Scan on sales_2023_q2 sales (actual rows=100)
Filter: (sale_date = '2023-05-15'::date)
The key indicator is “Subplans Removed: 3”, which tells you that three partitions were pruned at execution time.
How execution time pruning works:
During the initialization phase of query execution, PostgreSQL evaluates the actual parameter values and applies the same constraint checking logic as plan time pruning. However, instead of eliminating partitions from the plan, it marks them as “pruned” and skips their initialization and execution.
The critical difference:
Even though execution time pruning skips scanning the pruned partitions, the plan still contains nodes for all partitions. This means:
- Memory is allocated for all partition nodes (though less than full initialization)
- The plan structure is larger
- There is a small runtime cost to check each partition
- More complex bookkeeping is required
This is why execution time pruning, while much better than no pruning, is not quite as efficient as plan time pruning.
3.3 Partition Pruning with Different Partition Strategies
PostgreSQL supports multiple partitioning strategies, and pruning works differently for each.
Range Partitioning
Range partitioning is the most common and supports the most effective pruning:
CREATE TABLE measurements (
measurement_time TIMESTAMPTZ,
sensor_id INTEGER,
value NUMERIC
) PARTITION BY RANGE (measurement_time);
Pruning logic: PostgreSQL uses range comparison. Given a filter like measurement_time >= '2024-01-01' AND measurement_time < '2024-02-01', it identifies all partitions whose ranges overlap with this query range.
Pruning effectiveness: Excellent. Range comparisons are computationally cheap and highly selective.
List Partitioning
List partitioning groups rows by discrete values:
CREATE TABLE orders (
order_id BIGINT,
country_code TEXT,
amount NUMERIC
) PARTITION BY LIST (country_code);
CREATE TABLE orders_us PARTITION OF orders
FOR VALUES IN ('US');
CREATE TABLE orders_uk PARTITION OF orders
FOR VALUES IN ('UK');
CREATE TABLE orders_eu PARTITION OF orders
FOR VALUES IN ('DE', 'FR', 'IT', 'ES');
Pruning logic: PostgreSQL checks if the query value matches any value in each partition’s list.
SELECT * FROM orders WHERE country_code = 'FR';
Only orders_eu is accessed because ‘FR’ appears in its value list.
Pruning effectiveness: Very good for equality comparisons. Less effective for OR conditions across many values or pattern matching.
Hash Partitioning
Hash partitioning distributes rows using a hash function:
CREATE TABLE users (
user_id BIGINT,
username TEXT,
email TEXT
) PARTITION BY HASH (user_id);
CREATE TABLE users_p0 PARTITION OF users
FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE users_p1 PARTITION OF users
FOR VALUES WITH (MODULUS 4, REMAINDER 1);
-- ... p2, p3
Pruning logic: PostgreSQL computes the hash of the query value and determines which partition it maps to.
SELECT * FROM users WHERE user_id = 12345;
PostgreSQL calculates hash(12345) % 4 and accesses only the matching partition.
Pruning effectiveness: Excellent for equality on the partition key. Completely ineffective for range queries, pattern matching, or anything except exact equality matches.
3.4 Complex Partition Pruning Scenarios
Real world queries are often more complex than simple equality comparisons. Here’s how pruning handles various scenarios:
Multi Column Partition Keys
CREATE TABLE events (
event_date DATE,
region TEXT,
data JSONB
) PARTITION BY RANGE (event_date, region);
Pruning works on the leading columns of the partition key. A query filtering only on event_date can still prune effectively. A query filtering only on region cannot prune at all because region is not the leading column.
OR Conditions
SELECT * FROM sales
WHERE sale_date = '2023-05-15' OR sale_date = '2023-08-20';
PostgreSQL must access partitions for both dates (Q2 and Q3), so it keeps both and prunes Q1 and Q4. OR conditions reduce pruning effectiveness.
Inequality Comparisons
SELECT * FROM sales WHERE sale_date >= '2023-05-01';
PostgreSQL prunes partitions entirely before the date (Q1) but must keep all partitions from Q2 onward. Range queries reduce pruning selectivity.
Joins Between Partitioned Tables
SELECT * FROM sales s
JOIN products p ON s.product_id = p.product_id
WHERE s.sale_date = '2023-05-15';
If sales is partitioned by sale_date, that partition pruning works normally. If products is also partitioned, PostgreSQL attempts partitionwise joins where possible, enabling pruning on both sides.
Subqueries Providing Values
SELECT * FROM sales
WHERE sale_date = (SELECT MAX(order_date) FROM orders);
This requires execution time pruning because the subquery must run before PostgreSQL knows which partition to access.
3.5 Monitoring Partition Pruning
To verify partition pruning is working, use EXPLAIN:
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM sales WHERE sale_date = '2023-05-15';
What to look for:
Plan time pruning succeeded:
Seq Scan on sales_2023_q2
Only one partition appears in the plan at all.
Execution time pruning succeeded:
Append
Subplans Removed: 3
-> Seq Scan on sales_2023_q2
All partitions appear in the plan structure, but “Subplans Removed” shows pruning happened.
No pruning occurred:
Append
-> Seq Scan on sales_2023_q1
-> Seq Scan on sales_2023_q2
-> Seq Scan on sales_2023_q3
-> Seq Scan on sales_2023_q4
All partitions were scanned. This indicates a problem.
3.6 Why Partition Pruning Fails
Understanding why pruning fails helps you fix it:
- Query doesn’t filter on partition key: If your WHERE clause doesn’t reference the partition column(s), PostgreSQL cannot prune.
- Function calls on partition key:
WHERE EXTRACT(YEAR FROM sale_date) = 2023prevents pruning because PostgreSQL can’t map the function result back to partition ranges. UseWHERE sale_date >= '2023-01-01' AND sale_date < '2024-01-01'instead. - Type mismatches: If your partition key is DATE but you compare to TEXT without explicit casting, pruning may fail.
- Generic plans in prepared statements: As discussed in the main article, generic plans prevent plan time pruning and older PostgreSQL versions struggled with execution time pruning.
- OR conditions with non-partition columns:
WHERE sale_date = '2023-05-15' OR customer_id = 100prevents pruning because customer_id isn’t the partition key. - Volatile functions:
WHERE sale_date = CURRENT_DATEmay prevent plan time pruning (but should work with execution time pruning).
3.7 Partition Pruning Performance Impact
The performance difference between pruned and unpruned queries can be staggering:
Example scenario: 1000 partitions, each with 1 million rows. Query targets one partition.
With pruning:
- Partitions opened: 1
- Rows scanned: 1 million
- Memory for plan nodes: ~10KB
- Query time: 50ms
Without pruning:
- Partitions opened: 1000
- Rows scanned: 1 billion (returning ~1 million)
- Memory for plan nodes: ~10MB
- Query time: 45 seconds
The difference is not incremental; it’s exponential as partition count grows.
4. Partition Pruning in Prepared Statements: The Core Problem
Let me illustrate the severity with a real-world scenario:
-- Table with 1000 partitions
CREATE TABLE metrics (
timestamp TIMESTAMPTZ,
metric_name TEXT,
value NUMERIC
) PARTITION BY RANGE (timestamp);
-- Create 1000 daily partitions...
-- Prepared statement
PREPARE get_metrics AS
SELECT * FROM metrics
WHERE timestamp >= $1 AND timestamp < $2;
After the 6th execution, PostgreSQL switches to a generic plan. Each subsequent execution:
- Allocates memory for 1000 partition nodes
- Initializes executor state for 1000 partitions
- Checks 1000 partition constraints
- Returns data from just 1-2 partitions
If you have 100 connections each executing this prepared statement, you’re multiplying this overhead by 100. With connection poolers reusing connections (and thus reusing prepared statements), the problem compounds.
6. The Fix: Evolution Across PostgreSQL Versions
PostgreSQL has steadily improved partition pruning for prepared statements:
PostgreSQL 11: Execution-Time Pruning Introduced
PostgreSQL 11 introduced run-time partition pruning, but it had significant limitations with prepared statements. Generic plans still included all partitions in memory, even if they could be skipped during execution.
PostgreSQL 12: Better Prepared Statement Pruning
PostgreSQL 12 made substantial improvements:
- Generic plans gained the ability to defer partition pruning to execution time more effectively
- The planner became smarter about when to use generic vs. custom plans for partitioned tables
- Memory consumption for generic plans improved significantly
However, issues remained in edge cases, particularly with: • Multi level partitioning • Complex join queries involving partitioned tables • Prepared statements in stored procedures
PostgreSQL 13-14: Refined Heuristics
These versions improved the cost model for deciding between custom and generic plans:
- Better accounting for partition pruning benefits in the cost calculation
- More accurate statistics gathering on partitioned tables
- Improved handling of partitionwise joins
PostgreSQL 15-16: The Real Game Changers
PostgreSQL 15 and 16 brought transformative improvements:
PostgreSQL 15:
- Dramatically reduced memory usage for generic plans on partitioned tables
- Improved execution-time pruning performance
- Better handling of prepared statements with partition pruning
PostgreSQL 16:
- Introduced incremental sorting improvements that benefit partitioned queries
- Enhanced partition-wise aggregation
- More aggressive execution-time pruning
The key breakthrough: PostgreSQL now builds “stub” plans that allocate minimal memory for partitions that will be pruned, rather than fully initializing all partition nodes.
Workarounds for Older Versions
If you’re stuck on older PostgreSQL versions, here are strategies to avoid the prepared statement pitfall:
1. Disable Generic Plans
Force PostgreSQL to always use custom plans:
-- Set at session level
SET plan_cache_mode = force_custom_plan;
-- Or for specific prepared statement contexts
PREPARE get_events AS
SELECT * FROM events WHERE event_date = $1;
-- Before execution
SET LOCAL plan_cache_mode = force_custom_plan;
EXECUTE get_events('2024-06-15');
This sacrifices the planning time savings but ensures proper partition pruning.
2. Use Statement Level Caching Instead
Many ORMs and database drivers offer statement level caching that doesn’t persist across multiple executions:
# psycopg2 example - named cursors create server-side cursors
# but don't persist plans
cursor = connection.cursor()
cursor.execute(
"SELECT * FROM events WHERE event_date = %s",
(date_value,)
)
3. Adjust plan_cache_mode Per Query
PostgreSQL 12+ provides plan_cache_mode:
-- auto (default): use PostgreSQL's heuristics
-- force_generic_plan: always use generic plan
-- force_custom_plan: always use custom plan
SET plan_cache_mode = force_custom_plan;
For partitioned tables, force_custom_plan is often the right choice.
4. Increase Custom Plan Count
The threshold of 5 custom plans before switching to generic is hardcoded, but you can work around it by using different prepared statement names or by periodically deallocating and recreating prepared statements:
DEALLOCATE get_events;
PREPARE get_events AS SELECT * FROM events WHERE event_date = $1;
5. Partition Pruning Hints
In PostgreSQL 12+, you can sometimes coerce the planner into better behavior:
-- Using an explicit constraint that helps the planner
SELECT * FROM events
WHERE event_date = $1
AND event_date >= CURRENT_DATE - INTERVAL '1 year';
This additional constraint provides a hint about the parameter range.
Best Practices
- Monitor your query plans: Use
EXPLAIN (ANALYZE, BUFFERS)to check if partition pruning is happening:
EXPLAIN (ANALYZE, BUFFERS) EXECUTE get_events('2024-06-15');
Look for “Partitions removed” in the output.
- Check prepared statement statistics: Query
pg_prepared_statementsto see generic vs. custom plan usage:
SELECT name,
generic_plans,
custom_plans
FROM pg_prepared_statements;
- Upgrade PostgreSQL: If you’re dealing with large partitioned tables, the improvements in PostgreSQL 15+ are worth the upgrade effort.
- Design partitions appropriately: Don’t over-partition. Having 10,000 tiny partitions creates problems even with perfect pruning.
- Use connection pooling wisely: Prepared statements persist per connection. With connection pooling, long-lived connections accumulate many prepared statements. Configure your pooler to occasionally recycle connections.
- Benchmark both modes: Test your specific workload with both custom and generic plans to measure the actual impact.
Conclusion
Prepared statements are a powerful optimization, but their interaction with partitioned tables exposes a fundamental tension: caching for reuse versus specificity for efficiency. PostgreSQL’s evolution from version 11 through 16 represents a masterclass in addressing this challenge.
The key takeaway: if you’re using prepared statements with partitioned tables on PostgreSQL versions older than 15, be vigilant about plan caching behavior. Monitor memory usage, check execution plans, and don’t hesitate to force custom plans when generic plans cause problems.
For modern PostgreSQL installations (15+), the improvements are substantial enough that the traditional guidance of “be careful with prepared statements on partitioned tables” is becoming outdated. The database now handles these scenarios with far more intelligence and efficiency.
But understanding the history and mechanics remains crucial, because the next time you see mysterious memory growth in your PostgreSQL connections, you’ll know exactly where to look.