Best-Practice Patterns for pg_trickle
This guide covers common data modeling patterns and recommended configurations for pg_trickle stream tables. Each pattern includes worked SQL examples, anti-patterns to avoid, and refresh mode recommendations.
Version: v0.14.0+. Some features require recent versions — check SQL_REFERENCE.md for per-feature availability.
Table of Contents
- Pattern 1: Bronze / Silver / Gold Materialization
- Pattern 2: Event Sourcing with Stream Tables
- Pattern 3: Slowly Changing Dimensions (SCD)
- Pattern 4: High-Fan-Out Topology
- Pattern 5: Real-Time Dashboards
- Pattern 6: Tiered Refresh Strategy
- General Guidelines
- Replica Bootstrap & PITR Alignment (v0.27.0)
- Pattern 7: Transactional Outbox (v0.28.0)
- Pattern 8: Transactional Inbox (v0.28.0)
Pattern 1: Bronze / Silver / Gold Materialization
A multi-layer approach where raw data flows through progressively refined stream tables, similar to a medallion architecture.
Architecture
[raw_events] ← Bronze: raw ingest table (regular table)
↓
[events_cleaned] ← Silver: filtered, deduplicated, typed
↓
[events_aggregated] ← Gold: business-level aggregates
SQL Example
-- Bronze: regular PostgreSQL table (source of truth)
CREATE TABLE raw_events (
event_id BIGSERIAL PRIMARY KEY,
user_id INT NOT NULL,
event_type TEXT NOT NULL,
payload JSONB,
received_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- Silver: cleaned and deduplicated events
SELECT pgtrickle.create_stream_table(
'events_cleaned',
$$SELECT DISTINCT ON (event_id)
event_id,
user_id,
event_type,
(payload->>'amount')::numeric AS amount,
received_at
FROM raw_events
WHERE event_type IN ('purchase', 'refund', 'subscription')$$,
schedule => '5s',
refresh_mode => 'DIFFERENTIAL'
);
-- Gold: per-user purchase summary
SELECT pgtrickle.create_stream_table(
'user_purchase_summary',
$$SELECT user_id,
COUNT(*) AS total_purchases,
SUM(amount) AS total_spent,
AVG(amount) AS avg_order
FROM events_cleaned
WHERE event_type = 'purchase'
GROUP BY user_id$$,
schedule => 'calculated',
refresh_mode => 'DIFFERENTIAL'
);
Recommended Configuration
| Layer | Refresh Mode | Schedule | Tier |
|---|---|---|---|
| Silver | DIFFERENTIAL | 5s – 30s | hot |
| Gold | DIFFERENTIAL | calculated | hot |
Anti-Patterns
- Don't use FULL refresh for Silver. With frequent small inserts, DIFFERENTIAL is 10–100x faster.
- Don't skip the Silver layer. Joining raw tables directly in Gold queries produces wider joins and slower deltas.
- Don't use IMMEDIATE mode for Gold. Aggregate maintenance on every DML row is expensive — batched DIFFERENTIAL is more efficient.
When NOT to use this pattern
- Your data never changes after insert — a single stream table is simpler.
- The Bronze source is external (foreign table, `dblink`) — CDC triggers cannot be attached to foreign tables; use WAL CDC mode or FULL refresh.
- You have fewer than ~10,000 rows in Silver — the overhead of three layers is not justified; use one or two tables instead.
Pattern 2: Event Sourcing with Stream Tables
Use stream tables as projections of an append-only event log. The source table is the event store; stream tables materialize different read models.
SQL Example
-- Event store (append-only source)
CREATE TABLE events (
event_id BIGSERIAL PRIMARY KEY,
aggregate_id UUID NOT NULL,
event_type TEXT NOT NULL,
payload JSONB NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- Projection 1: Current state per aggregate
SELECT pgtrickle.create_stream_table(
'aggregate_state',
$$SELECT DISTINCT ON (aggregate_id)
aggregate_id,
event_type AS last_event,
payload AS current_state,
created_at AS last_updated
FROM events
ORDER BY aggregate_id, created_at DESC$$,
schedule => '2s',
refresh_mode => 'DIFFERENTIAL'
);
-- Projection 2: Event counts by type per hour
SELECT pgtrickle.create_stream_table(
'hourly_event_counts',
$$SELECT date_trunc('hour', created_at) AS hour,
event_type,
COUNT(*) AS event_count
FROM events
GROUP BY 1, 2$$,
schedule => '10s',
refresh_mode => 'DIFFERENTIAL'
);
Recommended Configuration
| Projection | Refresh Mode | Why |
|---|---|---|
| Current state | DIFFERENTIAL | Small delta per cycle; DISTINCT ON supported |
| Hourly counts | DIFFERENTIAL | Algebraic aggregate (COUNT), efficient delta |
| String aggregations | AUTO | GROUP_RESCAN aggs may benefit from FULL |
Anti-Patterns
- Don't DELETE from the event store. pg_trickle tracks changes via triggers; mixing append and delete on the source creates unnecessary delta complexity. Archive old events to a separate table.
- Don't use `append_only => true` with UPDATE/DELETE patterns. The `append_only` flag skips DELETE tracking in the change buffer — only use it when the source truly never updates or deletes.
When NOT to use this pattern
- Your event log is consumed and processed in real time by application code — a stream table adds latency without benefit.
- You need strict per-event ordering guarantees within a transaction — use `IMMEDIATE` mode with a single-row projection instead.
- Your events are multi-gigabyte payloads — stream tables replicate the whole row; store only metadata in the event log, not the payload.
Pattern 3: Slowly Changing Dimensions (SCD)
SCD Type 1: Overwrite
The stream table always reflects the current state. Source updates overwrite previous values.
-- Source: customer dimension table (updated in place)
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
name TEXT NOT NULL,
email TEXT,
tier TEXT DEFAULT 'standard',
updated_at TIMESTAMPTZ DEFAULT now()
);
-- SCD-1: current customer state enriched with order stats
SELECT pgtrickle.create_stream_table(
'customer_360',
$$SELECT c.customer_id,
c.name,
c.email,
c.tier,
COUNT(o.id) AS total_orders,
COALESCE(SUM(o.amount), 0) AS lifetime_value
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.name, c.email, c.tier$$,
schedule => '30s',
refresh_mode => 'DIFFERENTIAL'
);
SCD Type 2: History Tracking
For SCD-2, maintain a history table with valid-from/valid-to ranges. The stream table provides the current snapshot.
-- Source: customer history with validity ranges
CREATE TABLE customer_history (
customer_id INT NOT NULL,
name TEXT NOT NULL,
tier TEXT NOT NULL,
valid_from TIMESTAMPTZ NOT NULL,
valid_to TIMESTAMPTZ, -- NULL = current
PRIMARY KEY (customer_id, valid_from)
);
-- Current active records only
SELECT pgtrickle.create_stream_table(
'customers_current',
$$SELECT customer_id, name, tier, valid_from
FROM customer_history
WHERE valid_to IS NULL$$,
schedule => '10s',
refresh_mode => 'DIFFERENTIAL'
);
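The SCD-2 source above relies on the writer closing the open validity range before inserting each new version. A Python sketch of that maintenance step (illustrative; pg_trickle does not perform this write itself — your ETL does, and the stream table then picks up the change):

```python
def apply_scd2(history, customer_id, new_row, now):
    """Close the open validity range for customer_id (if any) and append new_row.

    history: list of dicts with customer_id, valid_from, valid_to (None = current).
    new_row: the new attribute values (e.g. {"name": ..., "tier": ...}).
    """
    for row in history:
        if row["customer_id"] == customer_id and row["valid_to"] is None:
            row["valid_to"] = now  # close the previously current version
    history.append({"customer_id": customer_id, **new_row,
                    "valid_from": now, "valid_to": None})
    return history
```

The `customers_current` stream table is then just the rows where `valid_to IS NULL`.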
Anti-Patterns
- Don't use FULL refresh for SCD-1 with large dimension tables. Customer tables with millions of rows but few changes per cycle are ideal for DIFFERENTIAL.
- Don't forget a partial index on `valid_to IS NULL` for SCD-2 sources (e.g. `CREATE INDEX ... WHERE valid_to IS NULL`). Without it, the delta scan touches all historical rows.
When NOT to use this pattern
- You already have a purpose-built slowly-changing-dimension ETL tool (e.g. dbt snapshots) — pg_trickle's SCD support is complementary, not a replacement, and duplicate ownership creates confusion.
- Your dimension table changes every row on every load — DIFFERENTIAL offers no benefit; use FULL refresh or rethink the source update pattern.
- You need Type 3 (add-a-column) or Type 6 SCD — those require schema evolution that pg_trickle does not automate today.
Pattern 4: High-Fan-Out Topology
When a single source table feeds many downstream stream tables.
Architecture
[orders]
↙ ↓ ↓ ↘
[daily_totals] [by_region] [by_product] [top_customers]
SQL Example
-- Single source feeding multiple views
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer_id INT NOT NULL,
region TEXT NOT NULL,
product_id INT NOT NULL,
amount NUMERIC(10,2) NOT NULL,
order_date DATE NOT NULL DEFAULT CURRENT_DATE
);
-- Fan-out: 4 stream tables on 1 source
SELECT pgtrickle.create_stream_table('daily_totals',
'SELECT order_date, SUM(amount) AS daily_total, COUNT(*) AS order_count
FROM orders GROUP BY order_date',
schedule => '5s', refresh_mode => 'DIFFERENTIAL');
SELECT pgtrickle.create_stream_table('by_region',
'SELECT region, SUM(amount) AS total, COUNT(*) AS cnt
FROM orders GROUP BY region',
schedule => '5s', refresh_mode => 'DIFFERENTIAL');
SELECT pgtrickle.create_stream_table('by_product',
'SELECT product_id, SUM(amount) AS total, COUNT(*) AS cnt
FROM orders GROUP BY product_id',
schedule => '5s', refresh_mode => 'DIFFERENTIAL');
SELECT pgtrickle.create_stream_table('top_customers',
'SELECT customer_id, SUM(amount) AS lifetime_value, COUNT(*) AS order_count
FROM orders GROUP BY customer_id',
schedule => '10s', refresh_mode => 'DIFFERENTIAL');
Recommended Configuration
- All fan-out targets share the same source change buffer — CDC overhead is paid once regardless of how many stream tables read from `orders`.
- Use `schedule => 'calculated'` on downstream STs when they chain from other stream tables.
- Consider `pg_trickle.max_concurrent_refreshes` if fan-out exceeds 8 (default: 4 concurrent refreshes).
Anti-Patterns
- Don't use IMMEDIATE mode on high-fan-out sources. Each DML row triggers N refreshes (one per downstream ST). Use DIFFERENTIAL with a batched schedule instead.
- Don't set different schedules on STs that should be consistent. If `daily_totals` and `by_region` must agree, give them the same schedule or use `diamond_consistency => 'atomic'`.
When NOT to use this pattern
- You only have one or two downstream stream tables — the fan-out pattern adds planning overhead that isn't justified below ~4 targets.
- Downstream queries have incompatible refresh modes (e.g. one needs IMMEDIATE, another needs FULL) — prefer separate source tables.
- All downstream STs will always be queried together — a single wider stream table may be simpler and faster.
Pattern 5: Real-Time Dashboards
For dashboards that need sub-second refresh latency.
SQL Example
-- Live order monitor (sub-second freshness)
SELECT pgtrickle.create_stream_table(
'order_monitor',
$$SELECT
date_trunc('minute', order_date) AS minute,
region,
COUNT(*) AS orders,
SUM(amount) AS revenue
FROM orders
WHERE order_date >= CURRENT_DATE
GROUP BY 1, 2$$,
schedule => '1s',
refresh_mode => 'DIFFERENTIAL'
);
-- For truly real-time needs, use IMMEDIATE mode (triggers on each DML)
SELECT pgtrickle.create_stream_table(
'live_counter',
$$SELECT region, COUNT(*) AS cnt, SUM(amount) AS total
FROM orders GROUP BY region$$,
schedule => 'IMMEDIATE',
refresh_mode => 'DIFFERENTIAL'
);
When to Use IMMEDIATE vs Scheduled DIFFERENTIAL
| Scenario | Mode | Why |
|---|---|---|
| Dashboard polls every 1s | 1s | Batched delta amortizes overhead |
| GraphQL subscription, < 100ms | IMMEDIATE | Triggers fire synchronously per DML |
| Aggregate with GROUP_RESCAN | 5s+ | Avoid per-row full rescans |
| High write throughput (>1K/s) | 2s–5s | IMMEDIATE adds latency to each INSERT |
Anti-Patterns
- Don't use IMMEDIATE for complex joins. Each INSERT/UPDATE/DELETE fires the full DVM delta SQL synchronously — multi-table joins in IMMEDIATE mode add significant latency to writes.
- Don't forget `pooler_compatibility_mode` with PgBouncer. Transaction pooling drops temp tables between transactions; enable this flag to avoid stale prepared statements.
When NOT to use this pattern
- The data source itself is the bottleneck (slow sensors, infrequent API polling) — a sub-second schedule on a stream table that changes once a minute burns CPU for nothing.
- You need consistency across several related tiles — schedule them together or use a single wider query rather than sub-second individual refreshes that can transiently disagree.
- Write throughput exceeds ~5,000 rows/s — IMMEDIATE mode adds latency to every write; profile with `pg_trickle.latency_percentiles()` first.
Pattern 6: Tiered Refresh Strategy
Assign refresh importance tiers to control scheduling priority.
-- Hot: real-time operational dashboard
SELECT pgtrickle.create_stream_table('live_metrics', ...);
SELECT pgtrickle.alter_stream_table('live_metrics', tier => 'hot');
-- Warm: hourly business reports (2x interval multiplier)
SELECT pgtrickle.create_stream_table('hourly_report', ...,
schedule => '1m');
SELECT pgtrickle.alter_stream_table('hourly_report', tier => 'warm');
-- Cold: daily analytics (10x interval multiplier)
SELECT pgtrickle.create_stream_table('daily_analytics', ...,
schedule => '5m');
SELECT pgtrickle.alter_stream_table('daily_analytics', tier => 'cold');
-- Frozen: archive/audit (skip refresh entirely)
SELECT pgtrickle.alter_stream_table('audit_log_summary', tier => 'frozen');
Tier Multipliers
| Tier | Schedule Multiplier | Use Case |
|---|---|---|
| hot | 1x | Operational dashboards, alerts |
| warm | 2x | Hourly reports, batch pipelines |
| cold | 10x | Daily analytics, low-priority STs |
| frozen | skip | Paused/archived, manual refresh |
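The multiplier table translates directly into an effective refresh interval. A small Python sketch of the arithmetic, with multiplier values taken from the table above:

```python
# Tier multipliers from the table above.
TIER_MULTIPLIER = {"hot": 1, "warm": 2, "cold": 10}

def effective_interval(base_seconds, tier):
    """Effective refresh interval in seconds, or None when the tier skips refresh."""
    if tier == "frozen":
        return None  # frozen tables are only refreshed manually
    return base_seconds * TIER_MULTIPLIER[tier]
```

For example, the `hourly_report` table above (`schedule => '1m'`, tier `warm`) refreshes every 120 seconds; `daily_analytics` (`5m`, `cold`) every 3000 seconds.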
When NOT to use this pattern
- All your stream tables are equally critical — don't introduce tier complexity just to have tiers; use a flat schedule instead.
- Your scheduler runs with a single worker — tiering helps multi-worker scheduling; it has no effect when `max_concurrent_refreshes = 1`.
- Tier multipliers change too frequently — tier is a static property; if freshness requirements change continuously, use SLA scheduling (`pg_trickle.sla_scheduling`) instead.
General Guidelines
Choosing a Refresh Mode
| Scenario | Recommended Mode |
|---|---|
| Source has < 5% change ratio per cycle | DIFFERENTIAL |
| Source changes > 50% per cycle | FULL |
| Query is a simple filter/projection | DIFFERENTIAL |
| Query has GROUP_RESCAN aggregates (MIN, MAX) | AUTO |
| Query joins 4+ tables | DIFFERENTIAL |
| Target table < 1000 rows | FULL |
| Need per-row latency guarantee | IMMEDIATE |
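The decision table can be approximated with a simple client-side heuristic. This Python sketch mirrors the table's rules; it is not the algorithm `recommend_refresh_mode()` uses internally:

```python
def suggest_refresh_mode(change_ratio, target_rows,
                         has_rescan_aggs=False, needs_per_row_latency=False):
    """Approximate the decision table above.

    change_ratio: fraction of source rows changed per refresh cycle (0.0–1.0).
    target_rows: approximate row count of the stream table.
    Returns one of 'IMMEDIATE', 'FULL', 'AUTO', 'DIFFERENTIAL'.
    """
    if needs_per_row_latency:
        return "IMMEDIATE"          # per-row latency guarantee
    if target_rows < 1000 or change_ratio > 0.5:
        return "FULL"               # tiny target or >50% churn per cycle
    if has_rescan_aggs:
        return "AUTO"               # MIN/MAX etc. may need GROUP_RESCAN
    return "DIFFERENTIAL"           # the default for low-churn workloads
```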
Use pgtrickle.recommend_refresh_mode() (v0.14.0+) for automated
analysis:
SELECT pgt_name, recommended_mode, confidence, reason
FROM pgtrickle.recommend_refresh_mode();
Monitoring Checklist
-- Check refresh efficiency across all stream tables
SELECT pgt_name, refresh_mode, diff_speedup, avg_change_ratio
FROM pgtrickle.refresh_efficiency()
ORDER BY total_refreshes DESC;
-- Find stream tables that might benefit from mode change
SELECT pgt_name, current_mode, recommended_mode, reason
FROM pgtrickle.recommend_refresh_mode()
WHERE recommended_mode != 'KEEP';
-- Check for error states
SELECT pgt_name, status, last_error_message
FROM pgtrickle.stream_tables_info
WHERE status IN ('ERROR', 'SUSPENDED');
-- Export definitions for backup
SELECT pgtrickle.export_definition(pgt_schema || '.' || pgt_name)
FROM pgtrickle.pgt_stream_tables;
Common Mistakes
- Using FULL refresh by default. Start with DIFFERENTIAL — it's correct for 80%+ of workloads. Switch to FULL only when `recommend_refresh_mode()` suggests it.
- Over-scheduling. A 1-second schedule on a table with 1-hour change cycles wastes CPU. Match the schedule to the actual data arrival rate.
- Ignoring `append_only`. If the source table is truly append-only (no UPDATEs, no DELETEs), set `append_only => true` to halve change buffer writes.
- Not using the `calculated` schedule for chained STs. When ST-B reads from ST-A, use `schedule => 'calculated'` on ST-B to avoid unnecessary refreshes. The scheduler automatically propagates ST-A changes downstream.
- Mixing IMMEDIATE and complex joins. IMMEDIATE mode fires delta SQL on every DML — an 8-table join in IMMEDIATE mode adds 50–200ms to each INSERT. Use scheduled DIFFERENTIAL for complex queries.
Replica Bootstrap & PITR Alignment (v0.27.0)
When bootstrapping a new replica or performing point-in-time recovery, stream tables need special handling because their state is derived from source data at a specific frontier (LSN + timestamp).
The problem
After a pg_basebackup or logical restore, stream table rows are present
but their frontiers may be stale. The next refresh would trigger a FULL
re-scan of all source data, which is expensive for large stream tables.
Solution: use snapshots for replica bootstrap
-- On the primary: export the stream table state
SELECT pgtrickle.snapshot_stream_table(
'public.orders_agg',
'pgtrickle.orders_agg_replica_init'
);
-- Dump only the snapshot table to the replica
pg_dump -t 'pgtrickle.orders_agg_replica_init' mydb | psql replica_db
-- On the replica: restore and align the frontier
SELECT pgtrickle.restore_from_snapshot(
'public.orders_agg',
'pgtrickle.orders_agg_replica_init'
);
-- Clean up the bootstrap snapshot
SELECT pgtrickle.drop_snapshot('pgtrickle.orders_agg_replica_init');
After restore_from_snapshot(), the frontier is set to the snapshot's
frontier and the next refresh is DIFFERENTIAL — only changes after the
snapshot creation time are fetched.
PITR alignment workflow
When performing PITR to a specific LSN:
1. Take a snapshot immediately before the target LSN.
2. Restore the database to the target LSN using `pg_basebackup` + WAL replay.
3. Run `restore_from_snapshot()` on each stream table to align frontiers.
-- Step 1: snapshot all stream tables (before PITR)
SELECT pgtrickle.snapshot_stream_table(
pgt_schema || '.' || pgt_name,
'pgtrickle.pitr_snapshot_' || pgt_name || '_' || extract(epoch from now())::bigint
)
FROM pgtrickle.pgt_stream_tables
WHERE status = 'ACTIVE';
-- Step 3 (after PITR): restore all snapshots
SELECT pgtrickle.restore_from_snapshot(
pgt_schema || '.' || pgt_name,
'pgtrickle.pitr_snapshot_' || pgt_name || '_<epoch>'
)
FROM pgtrickle.pgt_stream_tables;
Performance: Restoring a 1M-row stream table from a snapshot completes in < 5 seconds (bulk INSERT from local table). The frontier alignment ensures the first differential refresh fetches only new changes, not all rows.
Pattern 7: Transactional Outbox (v0.28.0)
Requires: v0.28.0+
The transactional outbox pattern reliably publishes stream table deltas to
external consumers — even if the consumer is temporarily offline. Each time the
stream table refreshes, pg_trickle writes a header row to a dedicated outbox
table. Consumers read from the outbox via poll_outbox(), process the delta,
then commit their offset.
Use this pattern when:
- You need to publish stream table changes to a message queue, webhook, or another service
- Consumers need at-least-once delivery guarantees
- Multiple independent consumers need to read the same stream independently
- You want replay / seek-to-offset for recovery
Architecture
orders (base table)
└─→ orders_agg (stream table)
└─→ pgt_outbox_orders_agg (outbox table)
├─→ Consumer group A: analytics pipeline
└─→ Consumer group B: notification service
SQL Example
-- 1. Create the stream table
SELECT pgtrickle.create_stream_table(
'public.orders_agg',
'SELECT customer_id, SUM(amount) AS total, COUNT(*) AS cnt FROM orders GROUP BY customer_id',
schedule_seconds => 5
);
-- 2. Enable the outbox
SELECT pgtrickle.enable_outbox('public.orders_agg', retention_hours => 48);
-- 3. Create consumer groups
SELECT pgtrickle.create_consumer_group('analytics', 'public.orders_agg', auto_offset_reset => 'latest');
SELECT pgtrickle.create_consumer_group('notifications', 'public.orders_agg', auto_offset_reset => 'latest');
-- 4. Consumer A polls and processes
DO $$
DECLARE
r RECORD;
last_id BIGINT := 0;
BEGIN
FOR r IN
SELECT * FROM pgtrickle.poll_outbox('analytics', 'worker-1', batch_size => 50)
LOOP
-- process r.payload (JSONB with inserted/deleted row arrays)
last_id := r.outbox_id;
END LOOP;
IF last_id > 0 THEN
PERFORM pgtrickle.commit_offset('analytics', 'worker-1', last_id);
END IF;
END;
$$;
-- 5. Check consumer lag
SELECT * FROM pgtrickle.consumer_lag('analytics');
Consumer Group Tips
| Scenario | Setting |
|---|---|
| Multiple competing workers sharing one offset | Put all workers in the same group |
| Independent pipelines that each need the full stream | Create a separate group per pipeline |
| Replay from the beginning | seek_offset('my_group', 'worker-1', 0) |
| Resume after a crash without re-processing | Commit offsets frequently; use extend_lease() for long processing |
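Because delivery is at-least-once, a crashed worker can see the same outbox rows twice after a lease expires. Consumers should therefore deduplicate by `outbox_id` before applying a delta. A Python sketch of an idempotent batch processor (how `seen_ids` is persisted is up to the consumer — a local table, a key-value store, etc.):

```python
def process_batch(batch, seen_ids, apply_fn):
    """Apply each outbox row at most once; return the last outbox_id seen.

    batch:    list of dicts with outbox_id and payload, as returned by poll_outbox().
    seen_ids: set of already-applied outbox_ids, persisted by the consumer.
    apply_fn: side-effecting function that processes one payload.
    """
    last_id = 0
    for row in batch:
        if row["outbox_id"] not in seen_ids:  # skip re-delivered rows
            apply_fn(row["payload"])
            seen_ids.add(row["outbox_id"])
        last_id = row["outbox_id"]
    return last_id  # commit this via commit_offset() on success
```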
Recommended Configuration
pg_trickle.outbox_enabled = true
pg_trickle.outbox_retention_hours = 48 # keep 2 days of history
pg_trickle.outbox_skip_empty_delta = true # don't write rows for no-op refreshes
pg_trickle.outbox_force_retention = true # keep rows until all groups commit
pg_trickle.consumer_dead_threshold_hours = 24 # mark workers dead after 24h silence
pg_trickle.consumer_cleanup_enabled = true
Anti-Patterns
- Polling without committing: If `commit_offset()` is never called, the lease expires and the rows are re-delivered. Always commit after successful processing.
- One group per worker: Use one group and multiple named consumers within it for competing-consumer parallelism. Use multiple groups only when pipelines are truly independent.
- Long processing without heartbeats: Call `consumer_heartbeat()` every 10–15 seconds for long-running processing to avoid being marked dead.
When NOT to use this pattern
- You only need to expose stream table changes to a single application that reads directly from PostgreSQL — a NOTIFY/LISTEN trigger or change table is simpler than a full outbox.
- Delivery guarantees are not required (analytics, dashboards) — the overhead of consumer groups and offset tracking is not justified.
- Your stream table refreshes every few seconds and consumers can tolerate a few seconds of lag — just poll the stream table directly.
Pattern 8: Transactional Inbox (v0.28.0)
Requires: v0.28.0+
The transactional inbox pattern provides a reliable, idempotent message receiver inside PostgreSQL. External producers write events to the inbox table; pg_trickle maintains stream tables that give you live views of pending, failed, and processed messages — all updated incrementally.
Use this pattern when:
- You receive events from external systems and need to process them exactly-once
- You want automatic dead-letter handling for failed messages
- Multiple workers need to process different aggregates without stepping on each other
- You need per-aggregate ordering guarantees
Architecture
external producer (Kafka / webhook / custom application)
└─→ pgtrickle.orders_inbox (raw event table)
├─→ orders_inbox_pending (stream table: awaiting processing)
├─→ orders_inbox_dlq (stream table: failed messages)
└─→ orders_inbox_stats (stream table: event counts by type)
SQL Example
-- 1. Create the inbox
SELECT pgtrickle.create_inbox(
'orders_inbox',
schema => 'pgtrickle',
max_retries => 3,
with_dead_letter => true,
with_stats => true,
schedule_seconds => 5
);
-- 2. External system inserts a message
INSERT INTO pgtrickle.orders_inbox (event_id, event_type, aggregate_id, payload)
VALUES (
gen_random_uuid()::text,
'order.placed',
'customer-123',
'{"order_id": 42, "amount": 99.50}'::jsonb
);
-- 3. Worker polls pending messages and processes
UPDATE pgtrickle.orders_inbox
SET processed_at = now()
WHERE event_id = '<event_id>'
AND processed_at IS NULL;
-- 4. Check inbox health
SELECT pgtrickle.inbox_health('orders_inbox');
-- 5. Replay failed messages
SELECT pgtrickle.replay_inbox_messages(
'orders_inbox',
ARRAY['event-id-1', 'event-id-2']
);
Per-Aggregate Ordering
When messages for the same customer / entity must be processed in sequence:
-- Enable ordering: only surface the next unprocessed message per aggregate
SELECT pgtrickle.enable_inbox_ordering(
'orders_inbox',
aggregate_id_col => 'aggregate_id',
seq_col => 'event_sequence'
);
-- Workers now read from next_orders_inbox (one row per aggregate)
SELECT * FROM pgtrickle.next_orders_inbox;
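Conceptually, `next_orders_inbox` surfaces the lowest unprocessed sequence number per aggregate, so a worker never sees event N+1 before event N is done. A Python sketch of that selection (illustrative — the actual view is maintained incrementally by pg_trickle):

```python
def next_per_aggregate(messages):
    """Return one row per aggregate: the unprocessed message with the lowest sequence.

    messages: list of dicts with aggregate_id, event_sequence, processed_at (None = pending).
    """
    nxt = {}
    for m in messages:
        if m["processed_at"] is not None:
            continue  # already handled; not a candidate
        cur = nxt.get(m["aggregate_id"])
        if cur is None or m["event_sequence"] < cur["event_sequence"]:
            nxt[m["aggregate_id"]] = m
    return list(nxt.values())
```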
Multi-Worker Partitioning
Scale horizontally without external coordination:
-- Worker 0 of 4 handles its share of aggregates
SELECT * FROM pgtrickle.orders_inbox_pending
WHERE pgtrickle.inbox_is_my_partition(aggregate_id, 0, 4);
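A hash-mod scheme is one way to implement such a predicate: every aggregate hashes to exactly one worker, so no two workers touch the same aggregate. This Python sketch assumes CRC32 modulo worker count, which may differ from `inbox_is_my_partition()`'s internal hash:

```python
import zlib

def is_my_partition(aggregate_id, worker_index, worker_count):
    """Deterministic hash-mod partition assignment (illustrative).

    Every aggregate_id maps to exactly one worker_index in [0, worker_count),
    with no coordination between workers.
    """
    return zlib.crc32(aggregate_id.encode()) % worker_count == worker_index
```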
Recommended Configuration
pg_trickle.inbox_enabled = true
pg_trickle.inbox_processed_retention_hours = 72 # keep 3 days of processed msgs
pg_trickle.inbox_dlq_retention_hours = 0 # keep DLQ forever for forensics
pg_trickle.inbox_dlq_alert_max_per_refresh = 10 # alert on DLQ growth
Anti-Patterns
- Not marking messages as processed: The `_pending` stream table will keep growing. Always set `processed_at = now()` after successful processing.
- Ignoring the DLQ: Monitor `orders_inbox_dlq` and replay or investigate failed messages regularly. Use `inbox_health()` in your alerting pipeline.
- Skipping idempotency: The inbox uses `event_id` for deduplication. Producers must supply stable, unique `event_id` values — typically a UUID derived from the source event.
When NOT to use this pattern
- Message volume is very high (>10,000/s) — the inbox table becomes a hot bottleneck; consider a dedicated message queue (Kafka, NATS) fronting a batch INSERT into the inbox.
- Processing is purely stateless and idempotency is guaranteed by the producer — writing to an inbox and querying `_pending` adds latency without benefit over a direct INSERT + trigger.
- The event source already provides at-least-once delivery with dedup — a second layer of dedup in the inbox wastes storage.
See also: Use Cases · Performance Cookbook · SQL Reference · Tutorials: What Happens on INSERT · Outbox Pattern · Inbox Pattern