pg_tide
Transactional outbox, idempotent inbox, and relay pipelines for PostgreSQL 18+.
pg_tide gives your PostgreSQL database a built-in messaging backbone. Publish events within your existing transactions — no dual-writes, no distributed transactions, no message brokers required at the database layer.
When you're ready to fan out to Kafka, NATS, Redis Streams, or any other system, the pg-tide relay binary bridges the gap — reading from outboxes, delivering to external sinks, and writing back to inboxes with exactly-once semantics.
The Problem pg_tide Solves
Most applications eventually need to publish events to other systems. A customer places an order — and the warehouse needs to know, the analytics pipeline needs to know, the email service needs to send a confirmation. The naive approach is deceptively simple: save the order to your database, then publish an event to your message broker.
But what happens when the broker publish fails after the database commit? Or when your application crashes between the two operations? You get silent data loss — the order exists but nobody downstream knows about it. This is the dual-write problem, and it's one of the most common sources of data inconsistency in distributed systems.
pg_tide eliminates this entire class of bugs by implementing the transactional outbox pattern as a PostgreSQL extension. Your application writes both the business data and the event in a single database transaction. They succeed or fail together — atomically, consistently, and durably. A separate relay process then delivers committed events to downstream systems, retrying indefinitely until delivery succeeds.
How It Works
┌─────────────────────────────────────────────────────────────┐
│ PostgreSQL 18+ │
│ │
│ Your Application │
│ │ │
│ ├──▶ INSERT INTO orders (...) ─┐ │
│ │ ├── Same transaction │
│ └──▶ SELECT tide.outbox_publish ─┘ │
│ │ │
│ tide.tide_outbox_messages │
│ │ │
└────────────────────────┼─────────────────────────────────────┘
│ pg_notify wakes relay
▼
┌──────────────────────────────────────────────────────────────┐
│ pg-tide relay binary │
│ │
│ Polls outbox ──▶ Delivers to sink ──▶ Commits offset │
└──────────────────────────────────────────────────────────────┘
│
▼
NATS · Kafka · Redis · RabbitMQ · SQS · Webhooks
Key Features
| Feature | What it does |
|---|---|
| Transactional Outbox | Publish messages within a database transaction. No 2PC, no dual-writes, no data loss. |
| Idempotent Inbox | Exactly-once delivery with automatic deduplication via unique constraints. |
| Consumer Groups | Kafka-style offset tracking with heartbeats, visibility leases, and independent progress per consumer. |
| Relay Binary | Standalone pg-tide process that bridges outboxes/inboxes with external systems. |
| Multi-Backend | NATS, Kafka, Redis Streams, RabbitMQ, SQS, HTTP Webhooks — all supported. |
| Hot Reload | Pipeline config lives in PostgreSQL. Changes apply without relay restart. |
| HA Ready | Advisory lock coordination provides automatic failover across relay instances. |
At a Glance
-- Create an outbox (one-time setup)
SELECT tide.outbox_create('orders', p_retention_hours := 48);
-- Publish within your business transaction
BEGIN;
INSERT INTO orders (id, total) VALUES (42, 99.99);
SELECT tide.outbox_publish('orders',
'{"order_id": 42, "total": 99.99}'::jsonb,
'{"event_type": "order.created"}'::jsonb
);
COMMIT;
-- Configure a relay pipeline
SELECT tide.relay_set_outbox('orders-nats', 'orders', 'nats',
'{"url": "nats://localhost:4222", "subject": "orders.events"}'::jsonb
);
-- Start the relay — messages flow automatically
-- pg-tide --postgres-url "postgres://user:pass@localhost:5432/mydb"
Who Is This For?
pg_tide is built for:
- Backend engineers building applications on PostgreSQL who need reliable event publishing
- Platform teams providing messaging infrastructure without the operational burden of a full streaming platform
- DBAs who want messaging capabilities that integrate naturally with their existing PostgreSQL operational practices
If PostgreSQL is your source of truth and you need events to flow reliably to other systems, pg_tide is designed for you.
Glossary
Key terms used throughout this documentation:
| Term | Meaning |
|---|---|
| Outbox | A named message stream stored in PostgreSQL. Messages are published to an outbox within a transaction. |
| Inbox | A named receiving table with deduplication. External messages are written here with exactly-once semantics. |
| Relay | The pg-tide binary that bridges outboxes/inboxes with external systems. |
| Pipeline | A configured connection between an outbox and a sink (forward) or a source and an inbox (reverse). |
| Consumer Group | A named entity that tracks reading progress through an outbox independently. |
| Offset | The ID of the last message successfully processed by a consumer group. |
| Sink | The destination system in a forward pipeline (e.g., NATS, Kafka). |
| Source | The origin system in a reverse pipeline (e.g., NATS subscription, webhook endpoint). |
| DLQ (Dead-Letter Queue) | Messages that have exhausted retry attempts. Stored in the inbox for investigation and replay. |
| Advisory Lock | A PostgreSQL lock mechanism used to coordinate pipeline ownership across relay instances. |
| Relay Group ID | An identifier that namespaces advisory locks, allowing multiple relay deployments to coexist. |
| Visibility Lease | A time-limited reservation on a batch of messages, preventing double-processing. |
| Dedup Key | A unique identifier (event_id) used by the inbox to detect and discard duplicate deliveries. |
| Hot Reload | The relay's ability to pick up pipeline config changes from the database without restart. |
Documentation Guide
This documentation is organized to match your learning journey:
- Evaluate — decide if pg_tide is right for your use case
- Getting Started — install and build your first pipeline
- Concepts — understand the mechanics in depth
- SQL Reference — complete API documentation
- Relay Guide — configure and operate the relay binary
- Tutorials — guided walkthroughs of common patterns
- Operations — production deployment and maintenance
- Integrations — platform-specific guidance
License
pg_tide is released under the Apache-2.0 license.
Choosing pg_tide
Choosing the right messaging infrastructure is one of the most consequential architectural decisions you'll make. This page helps you determine whether pg_tide is the right fit for your project by examining where it excels, where alternatives serve better, and how it compares to other tools you might be considering.
When pg_tide Is a Great Fit
You need reliable event publishing from PostgreSQL
Your application writes to PostgreSQL and needs to notify other systems about those writes — sending emails, updating search indexes, feeding analytics pipelines, triggering downstream workflows. You want guarantees that every committed transaction produces exactly one event: no lost messages, no duplicates, no manual reconciliation.
pg_tide was built specifically for this scenario. The transactional outbox pattern ensures your events are published atomically with your business data. If the transaction commits, the event is guaranteed to be delivered. If it rolls back, the event never existed.
You want to eliminate dual-write bugs
The dual-write problem is pernicious because it's intermittent and silent. Your application might work perfectly 99.9% of the time, but during network hiccups, process restarts, or database failovers, events get lost or duplicated. These bugs are incredibly difficult to detect in testing and even harder to reproduce.
pg_tide eliminates this entire class of bugs by design. There is no dual write — only a single database write that includes both your data and the event. The relay handles delivery separately, retrying indefinitely until downstream systems acknowledge receipt.
You prefer SQL over SDKs
pg_tide is a PostgreSQL extension. Publishing an event is a SELECT tide.outbox_publish(...) call. There's no client library to install, no serialization framework to learn, no connection pooling for a separate broker, no SDK version compatibility to manage. Any language or framework that can talk to PostgreSQL can publish events.
This means your Go service, Python script, dbt model, PL/pgSQL function, and psql session can all publish events using exactly the same API. The outbox is a database table — you can query it, monitor it, and manage it with standard SQL tools.
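One concrete illustration — a sketch assuming only the tide.outbox_publish signature shown in these docs — is publishing from a PL/pgSQL trigger, so even direct SQL writes emit events:
-- Hypothetical trigger: every inserted order row emits an event
CREATE OR REPLACE FUNCTION orders_emit_event() RETURNS trigger AS $$
BEGIN
  PERFORM tide.outbox_publish('orders',
    jsonb_build_object('order_id', NEW.id, 'total', NEW.total),
    '{"event_type": "order.created"}'::jsonb);
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER orders_emit_event_trg
  AFTER INSERT ON orders
  FOR EACH ROW EXECUTE FUNCTION orders_emit_event();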
You're already running PostgreSQL
If PostgreSQL is your primary data store — and for many teams, it is — pg_tide adds messaging capabilities without introducing new infrastructure. No Kafka cluster to operate, no ZooKeeper to babysit, no broker to monitor. The relay binary is a single static executable that reads its configuration from the same database it's delivering messages from.
This dramatically reduces operational overhead. You already know how to back up PostgreSQL, monitor its performance, manage its connections, and failover between replicas. pg_tide inherits all of that operational maturity.
You need exactly-once delivery semantics
Many messaging systems provide at-most-once or at-least-once delivery. True exactly-once requires coordination between the sender and receiver. pg_tide achieves this through the combination of transactional publishing (no message loss), consumer offset tracking (no re-processing), and the idempotent inbox (no duplicates). The end-to-end result is effectively exactly-once — each event is processed precisely one time.
Your throughput fits within PostgreSQL's capacity
For most applications, PostgreSQL can handle 5,000–15,000 outbox publishes per second on a single connection (depending on payload size and hardware). If your event volume fits within this range — which covers the vast majority of OLTP workloads — pg_tide provides simpler operations than dedicated streaming platforms.
When to Consider Alternatives
You need sub-millisecond propagation latency
pg_tide's relay polls the outbox at a configurable interval (pg_notify wakes it immediately for new messages, but batching still introduces small delays). For use cases that demand sub-millisecond propagation — high-frequency trading signals, real-time game state — a dedicated event streaming platform with direct in-memory writes (like NATS Core or Kafka with acks=0) will deliver lower latency.
That said, pg_tide's latency is typically under 100ms end-to-end. For most applications (webhook delivery, service coordination, analytics feeds), this is more than adequate.
You have no PostgreSQL in your stack
pg_tide is a PostgreSQL extension — that's the whole point. If your data lives in MySQL, MongoDB, DynamoDB, or another database, pg_tide can't help you. Look at:
- Debezium — CDC for MySQL, PostgreSQL, MongoDB, SQL Server, and more
- Maxwell — MySQL-specific CDC tool
- DynamoDB Streams — built-in change capture for DynamoDB
You're doing pure pub/sub without durability requirements
If you need ephemeral fire-and-forget messaging — real-time typing indicators, presence updates, live dashboard refreshes — where missed messages are perfectly acceptable, a simple Redis Pub/Sub or NATS Core subscription is lighter and faster. No durability means no outbox, no offset tracking, and no relay to operate.
Your sustained throughput exceeds PostgreSQL's write capacity
pg_tide's throughput ceiling is fundamentally PostgreSQL's INSERT performance. For sustained write rates above ~100,000 messages/second on a single outbox table (very high-volume telemetry, clickstream data, IoT sensor feeds), dedicated log-structured systems (Kafka, Redpanda, Pulsar) are purpose-built for this scale.
However, before concluding that you need more throughput, consider whether you can partition your events across multiple outboxes, which allows parallel relay consumption.
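A sketch of that partitioning approach, assuming nothing beyond the outbox_create/outbox_publish API used throughout these docs (the shard count and naming are illustrative):
-- Four shard outboxes; the relay can drain them in parallel
SELECT tide.outbox_create('orders_' || n) FROM generate_series(0, 3) AS n;
-- Route each event to a shard by hashing its ordering key
-- (& 3 maps the hash onto shards 0..3)
SELECT tide.outbox_publish(
  'orders_' || (hashtext('cust-123') & 3),
  '{"order_id": 42}'::jsonb,
  '{"event_type": "order.created"}'::jsonb
);
Note that ordering is then only guaranteed per shard, so hash by the key whose ordering matters — here the customer.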
You need automatic schema-change capture
If your use case is "capture every row change in every table automatically, without modifying application code," Debezium's CDC approach is better suited. pg_tide requires you to explicitly publish events — you choose what gets published, when, and in what format. This is a strength (explicit > implicit for event contracts), but it requires more application involvement.
The Sweet Spot
pg_tide occupies the space where transactional correctness matters more than raw throughput, and where operational simplicity (no broker cluster, no JVM, no ZooKeeper) outweighs the need for a standalone streaming platform.
Typical use cases that pg_tide handles beautifully:
| Use case | Why pg_tide fits |
|---|---|
| Order processing pipelines | Events must never be lost; exactly-once is essential |
| Audit event emission | Every business action must produce a corresponding audit record |
| Cross-service synchronization | Services need consistent views of shared data |
| Webhook delivery with retry | Unreliable endpoints need persistent retry with DLQ |
| Saga / process manager coordination | Orchestrating multi-step workflows across services |
| CQRS event sourcing | Projecting command-side events to query-side read models |
| Data warehouse loading | Reliably streaming changes to analytics infrastructure |
| Multi-tenant notification delivery | Per-tenant event routing with independent tracking |
Detailed Comparison with Alternatives
pg_tide vs. Debezium
| Aspect | pg_tide | Debezium |
|---|---|---|
| Mechanism | Application explicitly writes to outbox table | CDC via PostgreSQL logical replication (captures WAL changes) |
| Message format | You control the payload — publish exactly what consumers need | Mirrors row-level changes (schema-coupled to table structure) |
| Event granularity | Publish semantic events ("order confirmed") | Captures physical changes ("row updated in orders table") |
| Infrastructure | PostgreSQL + single relay binary | PostgreSQL + Kafka Connect + Kafka + ZooKeeper/KRaft |
| Exactly-once | Built-in via inbox dedup | Requires downstream idempotency |
| Operational cost | One binary (~20 MB), no JVM | JVM-based Kafka Connect, requires Kafka cluster |
| Flexibility | Arbitrary events, decoupled from table schema | Automatic but tied to schema changes |
| Application changes | Must call outbox_publish() | None (captures changes transparently) |
| Latency | <100ms (notify-driven) | ~1-5s (replication slot lag) |
Choose pg_tide when you want explicit, semantic events that are decoupled from your table schema, and you prefer minimal infrastructure. Choose Debezium when you need automatic capture of all database changes without modifying application code, and you're willing to operate the Kafka ecosystem.
pg_tide vs. Application-Level Outbox (DIY)
| Aspect | pg_tide | Custom outbox table + homegrown relay |
|---|---|---|
| Setup time | CREATE EXTENSION pg_tide; + start relay | Design schema, build polling logic, implement retry, add dedup, build monitoring... |
| Consumer groups | Built-in with offsets, heartbeats, visibility leases | You build and maintain it |
| Relay | Multi-backend binary with metrics, backpressure, HA | You build and maintain it |
| Idempotent inbox | Built-in with DLQ and replay | You build and maintain it |
| Monitoring | Prometheus metrics + SQL views out of the box | You instrument and maintain it |
| HA / failover | Advisory lock coordination, automatic | You design and build it |
| Maintenance | Upgrade extension + relay binary | Maintain all custom code indefinitely |
| Backends | NATS, Kafka, Redis, RabbitMQ, SQS, Webhooks | Whatever you've implemented |
Choose pg_tide to avoid reinventing reliable messaging infrastructure. A DIY outbox is deceptively simple to start but grows in complexity quickly as you add retry logic, offset tracking, multiple consumers, monitoring, and failover. Choose DIY only when you have very specific requirements that don't map to pg_tide's model.
pg_tide vs. pg_notify / LISTEN
| Aspect | pg_tide | pg_notify |
|---|---|---|
| Durability | Messages persist until consumed | Fire-and-forget (lost if no listener is active) |
| Payload size | JSONB (up to 1 GB, practically limited by memory) | 8,000 bytes maximum |
| Retry | Built-in with exponential backoff and DLQ | None — if you miss it, it's gone |
| Consumer groups | Independent offset tracking per consumer | No — every listener sees every notification |
| Delivery guarantee | At-least-once (effectively exactly-once with inbox) | At-most-once (nothing is delivered if no listener is connected) |
| Cross-network | Relay bridges to any external system | Only in-process PostgreSQL clients |
| Ordering | Guaranteed within an outbox (by ID) | Guaranteed within a session |
| Backpressure | Configurable threshold | None (notifications queue in memory) |
Choose pg_tide when you need durable, reliable delivery with guarantees. Choose pg_notify for lightweight real-time signals where message loss is acceptable — like cache invalidation hints or live UI updates where the client can refresh on reconnect.
pg_tide vs. Writing Directly to Kafka/NATS
| Aspect | pg_tide | Direct broker writes from application |
|---|---|---|
| Transactional safety | Guaranteed (same database transaction) | Dual-write risk (DB commit + broker publish are independent) |
| Application complexity | One SQL call per event | Broker client library, connection management, error handling |
| Operational overhead | Extension + lightweight relay | Broker cluster management, application-side retries |
| Throughput ceiling | PostgreSQL write speed (~15K msg/s per connection) | Broker-native throughput (higher ceiling) |
| Latency | ~50-100ms (poll + delivery) | ~1-5ms (direct publish) |
| Message loss risk | Zero (transactional guarantee) | Non-zero (crash between DB commit and broker ack) |
| Duplicate risk | Handled by inbox dedup | Application must implement idempotency |
Choose pg_tide when transactional correctness is paramount and throughput fits within PostgreSQL's capacity. Choose direct broker writes when you accept the dual-write tradeoff for maximum throughput and minimum latency, or when your application already runs inside the broker ecosystem (e.g., a Kafka Streams application).
Cost Analysis: pg_tide vs. Running Kafka
For teams evaluating pg_tide against a full Kafka deployment, here's a practical comparison of operational costs:
| Resource | pg_tide | Kafka (small production cluster) |
|---|---|---|
| Processes to operate | 1-2 relay instances | 3+ brokers + ZooKeeper/KRaft + Connect + Schema Registry |
| Memory footprint | ~50 MB per relay | ~6 GB per broker (JVM heap) |
| Disk | Shared with PostgreSQL | Dedicated high-throughput storage per broker |
| Network | PostgreSQL connection + sink connections | Inter-broker replication, client connections, ZK communication |
| On-call complexity | "Is PostgreSQL healthy? Is the relay running?" | Partition rebalancing, ISR management, disk pressure, GC pauses |
| Team expertise | PostgreSQL DBA + basic ops | Kafka-specialized operations team |
pg_tide's total cost of ownership is dramatically lower for teams whose primary workload is a PostgreSQL-backed application with moderate event volumes (< 50K events/second).
Decision Flowchart
Ask yourself these questions in order:
1. Is PostgreSQL your primary data store? If not → look at Debezium, platform-specific CDC
2. Do your events need transactional guarantees? If not → consider direct broker writes or pg_notify
3. Is your throughput under ~50K events/second? If not → consider Kafka/Redpanda
4. Do you want to minimize operational overhead? If yes → pg_tide
5. Do you need automatic schema-change capture? If yes → consider Debezium (or combine both)
If you answered "yes" to questions 1, 2, 3, and 4 — pg_tide is an excellent fit.
Migration Paths
Coming from pg_notify
If you're currently using pg_notify for event delivery and hitting its limitations (payload size, durability, reliability):
- Install pg_tide and create outboxes for your event channels
- Replace PERFORM pg_notify(channel, payload) with SELECT tide.outbox_publish(outbox, payload, headers) (see the sketch after this list)
- Set up relay pipelines to your downstream consumers
- Benefit from durability, retry, offset tracking, and exactly-once semantics
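A sketch of that replacement, assuming your events currently originate inside a trigger or other PL/pgSQL function (the outbox name and headers here are illustrative):
-- Before: fire-and-forget, lost if no listener is connected
PERFORM pg_notify('order_events', row_to_json(NEW)::text);
-- After: durable, transactional, relayed with retries
PERFORM tide.outbox_publish('order_events',
  to_jsonb(NEW),
  '{"event_type": "order.created"}'::jsonb);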
Coming from a DIY outbox table
If you've built a custom outbox pattern:
- Install pg_tide alongside your existing tables
- Migrate pipeline logic to pg_tide's relay (eliminates your custom polling code)
- Use pg_tide's consumer groups instead of custom offset tracking
- Add inboxes for receiving-side deduplication
- Decommission your custom relay code
Coming from Debezium
If you're considering pg_tide as a complement or replacement for Debezium:
- Complement: Use Debezium for bulk CDC (replicating entire tables) and pg_tide for semantic business events (explicit, shaped events published by application logic)
- Replace: If you're using Debezium primarily for outbox-style event publishing (Debezium's outbox router), pg_tide provides the same capability with far less infrastructure
Architecture
pg_tide consists of two components that work together: a PostgreSQL extension that manages the outbox/inbox catalog, and a relay binary that bridges messages to external systems.
High-Level Overview
┌─────────────────────────────────────────────────────────┐
│ PostgreSQL 18+ │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌────────────┐ │
│ │ tide_outbox │ │ tide_inbox │ │ relay_ │ │
│ │ _config │ │ _config │ │ *_config │ │
│ └──────┬───────┘ └──────┬───────┘ └─────┬──────┘ │
│ │ │ │ │
│ ┌──────▼───────┐ ┌──────▼───────┐ │ │
│ │ tide_outbox │ │ {name}_inbox │ │ │
│ │ _messages │ │ (per inbox) │ │ │
│ └──────┬───────┘ └──────▲───────┘ │ │
│ │ │ │ │
└─────────┼───────────────────┼──────────────────┼─────────┘
│ │ │
│ LISTEN/NOTIFY │ │
▼ │ ▼
┌─────────────────────────────────────────────────────────┐
│ pg-tide relay binary │
│ │
│ ┌────────────┐ ┌────────────┐ │
│ │ Source │───────▶│ Sink │ │
│ │ (outbox │ │ (NATS, │ │
│ │ poller) │ │ Kafka, │ │
│ └────────────┘ │ Redis…) │ │
│ └────────────┘ │
│ ┌────────────┐ ┌────────────┐ │
│ │ Source │───────▶│ Sink │ │
│ │ (NATS, │ │ (inbox │ │
│ │ Kafka…) │ │ writer) │ │
│ └────────────┘ └────────────┘ │
└─────────────────────────────────────────────────────────┘
│ │
▼ ▼
┌──────────────┐ ┌──────────────────┐
│ External │ │ External │
│ Systems │ │ Systems │
│ (consumers) │ │ (producers) │
└──────────────┘ └──────────────────┘
The Extension Layer
The pg_tide extension installs into the tide schema and provides:
- Catalog tables — configuration for outboxes, inboxes, consumer groups, and relay pipelines
- Message storage — a shared tide_outbox_messages table for all outboxes, individual {name}_inbox tables for each inbox
- SQL API — functions for publishing, consuming, and managing the lifecycle
- NOTIFY triggers — real-time notifications when relay config changes
The extension has no background workers and no shared memory. All state is purely relational, making it compatible with connection poolers (PgBouncer, PgCat) and managed PostgreSQL services.
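Because every piece of state is an ordinary table or view, you can explore the whole catalog with standard SQL:
-- List everything the extension installed into the tide schema
SELECT table_name, table_type
FROM information_schema.tables
WHERE table_schema = 'tide'
ORDER BY table_name;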
The Relay Binary
The pg-tide binary is a standalone Rust process that:
- Connects to PostgreSQL and reads pipeline configurations from the relay catalog
- Acquires advisory locks for each pipeline (enabling multi-relay HA deployments)
- Polls outbox tables for new messages (forward mode)
- Subscribes to external sources for incoming messages (reverse mode)
- Delivers messages to configured sinks with retry and dedup
- Commits offsets after successful delivery (exactly-once semantics)
- Exposes Prometheus metrics and a health endpoint
The relay supports hot-reload via LISTEN tide_relay_config — when you update pipeline config in the database, the relay picks up changes without restart.
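For example — assuming relay_set_outbox upserts existing pipelines, which is how the quickstart uses it — repointing a pipeline is just another call, and the running relay applies it live (the NATS URL here is illustrative):
-- Repoint the sink; the relay picks this up via LISTEN, no restart
SELECT tide.relay_set_outbox('orders-to-nats', 'orders', 'nats',
  '{"url": "nats://nats-2.internal:4222", "subject": "orders.events"}'::jsonb
);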
Forward Mode (Outbox → External)
Application ──INSERT──▶ tide_outbox_messages
│
│ (relay polls)
▼
pg-tide relay ──publish──▶ NATS / Kafka / Redis / …
│
│ (on success)
▼
commit offset + mark consumed
Reverse Mode (External → Inbox)
NATS / Kafka / Redis / … ──subscribe──▶ pg-tide relay
│
│ (dedup + insert)
▼
{name}_inbox table
│
│ (application reads)
▼
Your application
Deployment Topologies
Single relay (simplest)
One relay instance handles all pipelines. Suitable for low-to-medium throughput.
Multiple relays (HA)
Multiple relay instances connect to the same database. PostgreSQL advisory locks ensure each pipeline is owned by exactly one relay — automatic failover when a relay dies.
Sidecar pattern
Deploy the relay as a sidecar container alongside your application pod in Kubernetes. Each pod handles its own subset of pipelines.
Installation
pg_tide has two components to install: the PostgreSQL extension (SQL functions and catalog tables) and the relay binary (the pg-tide process that bridges messages to external systems).
Prerequisites
- PostgreSQL 18 or later
- Superuser or CREATE EXTENSION privileges on your target database
Installing the Extension
From Source (pgrx)
# Install cargo-pgrx if you haven't already
cargo install cargo-pgrx --version "=0.18.0" --locked
cargo pgrx init --pg18 $(which pg_config)
# Build and install
cd pg-tide-ext
cargo pgrx install --release
Enable the Extension
CREATE EXTENSION pg_tide;
This creates the tide schema with all catalog tables, views, and functions.
Installing the Relay Binary
From GitHub Releases
Download the latest release for your platform from the releases page:
# Linux (amd64)
curl -LO https://github.com/trickle-labs/pg-tide/releases/latest/download/pg-tide-x86_64-unknown-linux-gnu.tar.gz
tar xzf pg-tide-x86_64-unknown-linux-gnu.tar.gz
sudo mv pg-tide /usr/local/bin/
# macOS (Apple Silicon)
curl -LO https://github.com/trickle-labs/pg-tide/releases/latest/download/pg-tide-aarch64-apple-darwin.tar.gz
tar xzf pg-tide-aarch64-apple-darwin.tar.gz
sudo mv pg-tide /usr/local/bin/
From Source (Cargo)
cargo install --git https://github.com/trickle-labs/pg-tide pg-tide-relay
Docker
docker pull ghcr.io/trickle-labs/pg-tide:latest
Verify Installation
# Check relay version
pg-tide --version
# Check extension is installed
psql -c "SELECT * FROM pg_extension WHERE extname = 'pg_tide';"
Next Steps
- Quickstart → — publish your first message in 5 minutes
- Tutorial → — set up a complete outbox-to-sink pipeline
Your First Pipeline
This guide walks you through setting up pg_tide from scratch and building a complete message pipeline. By the end, you'll have an outbox publishing order events and a relay delivering them to NATS — with monitoring, consumer tracking, and exactly-once delivery all working together.
We'll go step by step, explaining what's happening at each stage so you understand not just what to do, but why each piece matters.
Prerequisites
Before starting, make sure you have:
- PostgreSQL 18+ running and accessible (local or remote)
- NATS server running locally (we'll use this as our message sink)
- pg-tide relay binary installed (see Installation)
If you just want to kick the tires quickly, here's a Docker Compose file that sets up everything:
# docker-compose.yml — complete pg_tide development environment
services:
postgres:
image: postgres:18
environment:
POSTGRES_PASSWORD: postgres
POSTGRES_DB: app
ports:
- "5432:5432"
volumes:
- pgdata:/var/lib/postgresql/data
nats:
image: nats:latest
ports:
- "4222:4222" # Client connections
- "8222:8222" # Monitoring
pg-tide-relay:
image: ghcr.io/trickle-labs/pg-tide:latest
depends_on:
- postgres
- nats
environment:
PG_TIDE_POSTGRES_URL: "postgres://postgres:postgres@postgres:5432/app"
PG_TIDE_LOG_FORMAT: "json"
PG_TIDE_LOG_LEVEL: "info"
ports:
- "9090:9090" # Metrics + health
volumes:
pgdata:
Start it with docker compose up -d, then connect to PostgreSQL with:
psql "postgres://postgres:postgres@localhost:5432/app"
Step 1: Install the Extension
The pg_tide extension creates the tide schema with all the catalog tables, views, and functions you'll need:
CREATE EXTENSION pg_tide;
Let's verify it's installed correctly:
SELECT extname, extversion FROM pg_extension WHERE extname = 'pg_tide';
extname | extversion
---------+------------
pg_tide | 0.1.0
Behind the scenes, this created:
- The tide schema
- Configuration tables for outboxes, inboxes, consumer groups, and relay pipelines
- The shared tide.tide_outbox_messages table where all outbox messages live
- Views like tide.outbox_pending and tide.consumer_lag for monitoring
- SQL functions like tide.outbox_publish() for the API
Step 2: Create an Outbox
An outbox is a named message stream. You might have one outbox for order events, another for user events, another for inventory changes — each is logically separate but physically stored in the same table (discriminated by name).
Let's create an outbox for order events:
SELECT tide.outbox_create('orders',
p_retention_hours := 48,
p_inline_threshold := 10000
);
What do these parameters mean?
- 'orders' — the name of our outbox. This is how you'll refer to it when publishing and when configuring relay pipelines.
- p_retention_hours := 48 — consumed messages are kept for 48 hours before cleanup. This gives you time to investigate issues and replay if needed.
- p_inline_threshold := 10000 — if more than 10,000 messages are pending (unconsumed), publishing pauses to create backpressure. This prevents unbounded outbox growth if the relay is down.
Verify the outbox exists:
SELECT * FROM tide.tide_outbox_config;
outbox_name | retention_hours | inline_threshold | enabled | created_at
-------------+-----------------+------------------+---------+----------------------------
orders | 48 | 10000 | t | 2025-01-15 10:00:00.000+00
Step 3: Publish Your First Messages
Now let's simulate what your application would do — publishing events within business transactions. The key insight is that the event publish is inside the same transaction as the business logic:
-- Create a simple orders table for this tutorial
CREATE TABLE IF NOT EXISTS orders (
id SERIAL PRIMARY KEY,
customer_id TEXT NOT NULL,
total NUMERIC(10,2) NOT NULL,
status TEXT NOT NULL DEFAULT 'pending',
created_at TIMESTAMPTZ DEFAULT now()
);
-- Now publish an event atomically with the business operation
BEGIN;
INSERT INTO orders (id, customer_id, total, status)
VALUES (1, 'cust-alice', 149.99, 'confirmed');
SELECT tide.outbox_publish('orders',
'{"order_id": 1, "customer_id": "cust-alice", "total": 149.99, "status": "confirmed"}'::jsonb,
'{"event_type": "order.confirmed", "source": "tutorial"}'::jsonb
);
COMMIT;
Let's publish a few more events to make things interesting:
BEGIN;
INSERT INTO orders (id, customer_id, total, status)
VALUES (2, 'cust-bob', 42.00, 'confirmed');
SELECT tide.outbox_publish('orders',
'{"order_id": 2, "customer_id": "cust-bob", "total": 42.00, "status": "confirmed"}'::jsonb,
'{"event_type": "order.confirmed", "source": "tutorial"}'::jsonb
);
COMMIT;
BEGIN;
INSERT INTO orders (id, customer_id, total, status)
VALUES (3, 'cust-charlie', 299.95, 'confirmed');
SELECT tide.outbox_publish('orders',
'{"order_id": 3, "customer_id": "cust-charlie", "total": 299.95, "status": "confirmed"}'::jsonb,
'{"event_type": "order.confirmed", "source": "tutorial"}'::jsonb
);
COMMIT;
What just happened: Each transaction atomically wrote the order to the orders table AND the event to the outbox. If any transaction had failed (constraint violation, network error, application crash), both the order and the event would have been rolled back together. No orphaned events, no missing events.
Step 4: Check the Outbox Status
Let's verify our messages are pending (waiting for delivery):
SELECT * FROM tide.outbox_pending;
outbox_name | pending_count | oldest_at | max_id
-------------+---------------+------------------------+--------
orders | 3 | 2025-01-15 10:01:00+00 | 3
Three messages waiting for the relay to pick them up. You can also get detailed status:
SELECT tide.outbox_status('orders');
This returns a JSONB object with comprehensive information about the outbox's current state.
Step 5: Create a Consumer Group
Before the relay can start delivering messages, it needs a consumer group to track its progress:
SELECT tide.consumer_group_create('orders', 'nats-relay',
  p_auto_offset_reset := 'earliest'
);
We're using 'earliest' because we want the relay to process all existing messages, including the three we just published. If we used 'latest', it would skip those and only process future messages.
Step 6: Configure the Relay Pipeline
Now we tell pg_tide how to deliver messages from the orders outbox to NATS:
SELECT tide.relay_set_outbox(
'orders-to-nats', -- pipeline name (unique identifier)
'orders', -- source outbox
'nats', -- sink type
jsonb_build_object( -- sink-specific configuration
'url', 'nats://localhost:4222',
'subject', 'orders.{event_type}'
)
);
Notice the subject template: 'orders.{event_type}'. The relay will substitute {event_type} with the value from the message headers. Our messages have "event_type": "order.confirmed", so they'll be published to the NATS subject orders.order.confirmed.
This is powerful — different event types from the same outbox can be routed to different NATS subjects without any relay-side logic.
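To see the routing in action, publish an event with a different event_type header; only the header changes the destination subject:
-- Lands on NATS subject orders.order.cancelled via the template above
SELECT tide.outbox_publish('orders',
  '{"order_id": 2, "status": "cancelled"}'::jsonb,
  '{"event_type": "order.cancelled"}'::jsonb
);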
Step 7: Start the Relay
If you're using the Docker Compose setup, the relay is already running. Otherwise, start it manually:
pg-tide --postgres-url "postgres://postgres:postgres@localhost:5432/app"
You'll see log output like:
INFO pg_tide_relay: Starting pg-tide relay v0.1.0
INFO pg_tide_relay: Connected to PostgreSQL
INFO pg_tide_relay: Discovered pipeline: orders-to-nats (forward, nats)
INFO pg_tide_relay: Acquired advisory lock for pipeline: orders-to-nats
INFO pg_tide_relay: Pipeline orders-to-nats: processing 3 pending messages
INFO pg_tide_relay: Pipeline orders-to-nats: published batch [1..3] to nats
What's happening under the hood:
- The relay connects to PostgreSQL and reads the pipeline catalog
- It discovers orders-to-nats and attempts to acquire an advisory lock for it
- It succeeds (it's the only relay instance), so it owns this pipeline
- It polls tide.tide_outbox_messages for messages in the orders outbox where id > last_committed_offset
- It delivers each message to the configured NATS subject
- On success, it commits the offset and marks messages as consumed
Step 8: Verify Delivery
Subscribe to NATS to see the delivered messages (in another terminal):
# Using the nats CLI tool
nats sub "orders.>"
If messages were already delivered (relay started before you subscribed), you can verify from the PostgreSQL side:
-- Check that messages are now consumed
SELECT * FROM tide.outbox_pending;
outbox_name | pending_count | oldest_at | max_id
-------------+---------------+-----------+--------
(0 rows)
No pending messages — they've all been delivered! Check consumer lag:
SELECT * FROM tide.consumer_lag;
group_name | outbox_name | consumer_id | committed_offset | lag | last_heartbeat
------------+-------------+-------------+------------------+-----+--------------------
nats-relay | orders | relay-0 | 3 | 0 | 2025-01-15 10:02:00
Zero lag — the relay is fully caught up. The committed_offset of 3 means all messages through ID 3 have been delivered.
Step 9: Publish More Messages and Watch Them Flow
Now that the pipeline is running, new messages are delivered in near-real-time. In another terminal, subscribe to NATS:
nats sub "orders.>"
Then publish a new event in psql:
BEGIN;
INSERT INTO orders (id, customer_id, total, status)
VALUES (4, 'cust-diana', 75.50, 'confirmed');
SELECT tide.outbox_publish('orders',
'{"order_id": 4, "customer_id": "cust-diana", "total": 75.50, "status": "confirmed"}'::jsonb,
'{"event_type": "order.confirmed"}'::jsonb
);
COMMIT;
Within milliseconds, you'll see the message appear on the NATS subscription. The flow is:
- COMMIT triggers pg_notify('tide_outbox_new', 'orders')
- The relay receives the notification and immediately polls for new messages
- Message ID 4 is fetched, delivered to NATS, and the offset is committed
Step 10: Monitor with Prometheus
The relay exposes Prometheus metrics at its metrics endpoint:
curl http://localhost:9090/metrics
Key metrics to watch:
# Total messages delivered
pg_tide_relay_messages_published_total{pipeline="orders-to-nats",direction="forward"} 4
# Error count (should be 0)
pg_tide_relay_publish_errors_total{pipeline="orders-to-nats",direction="forward"} 0
# Pipeline health
pg_tide_relay_pipeline_healthy{pipeline="orders-to-nats"} 1
And the health endpoint:
curl http://localhost:9090/health
# Returns 200 OK when all pipelines are healthy
What You've Built
In this tutorial, you've set up a complete transactional outbox pipeline:
- The extension provides the schema, tables, and SQL API
- The outbox stores events atomically with your business transactions
- The consumer group tracks how far the relay has progressed
- The pipeline configuration tells the relay where to deliver messages
- The relay binary bridges the gap between PostgreSQL and NATS
- Monitoring via Prometheus metrics and the consumer lag view
This is the foundational pattern. From here, you can:
- Add more pipelines to fan out events to multiple systems
- Create an inbox to receive events from external services
- Add more relay instances for high availability
- Configure different backends (Kafka, Redis, webhooks, etc.)
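For example, fanning the same orders outbox out to Kafka takes one new consumer group and one new pipeline — a sketch; the Kafka config keys shown here are illustrative:
-- Independent progress tracking for the Kafka pipeline
SELECT tide.consumer_group_create('orders', 'kafka-relay');
-- Hypothetical Kafka sink config alongside the existing NATS pipeline
SELECT tide.relay_set_outbox('orders-to-kafka', 'orders', 'kafka',
  '{"brokers": "localhost:9092", "topic": "orders.events", "consumer_group": "kafka-relay"}'::jsonb
);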
Next Steps
- Concepts: Message Guarantees → — understand exactly-once delivery in depth
- Concepts: Consumption and Relay → — deep dive into consumer groups and pipeline mechanics
- Relay Guide: Backends → — configure NATS, Kafka, Redis, and more
- Tutorial: End-to-End Pipeline → — build a forward + reverse pipeline with Kafka
- Operations: Deployment → — production deployment patterns
Concept: The Transactional Outbox Pattern
The transactional outbox pattern solves one of the most common problems in distributed systems: how to reliably update a database and publish an event at the same time. Without it, you face the "dual-write problem" — a situation where your database write succeeds but the message publish fails (or vice versa), leaving your system in an inconsistent state.
The Problem
Consider an e-commerce application that needs to save an order and notify the shipping service:
BEGIN;
INSERT INTO orders (id, status) VALUES ('ORD-001', 'confirmed');
COMMIT;
-- Outside the transaction:
publish_to_kafka('order.confirmed', { order_id: 'ORD-001' });
What happens if the application crashes between the COMMIT and the publish? The order exists in the database, but the shipping service never learns about it. The event is lost.
What if you reverse the order — publish first, then commit? If the COMMIT fails (constraint violation, connection loss), the event was already published for an order that doesn't exist.
There is no safe ordering for two independent systems. This is the dual-write problem.
The Solution
The transactional outbox pattern writes the event to an "outbox" table in the same database transaction as the business data:
BEGIN;
INSERT INTO orders (id, status) VALUES ('ORD-001', 'confirmed');
SELECT tide.outbox_publish('order_events',
  '{"order_id": "ORD-001", "status": "confirmed"}'::jsonb,
  '{"event_type": "order.confirmed"}'::jsonb);
COMMIT;
Both writes are part of the same ACID transaction. Either both succeed or both fail. There is no inconsistency window.
A separate process (the relay) polls the outbox table and publishes events to external systems. If the relay crashes, it simply resumes from where it left off — the events are safely persisted in PostgreSQL.
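You can verify the atomicity yourself: roll the transaction back and the event disappears along with the business row (the pending view is the one shown in the quickstart):
BEGIN;
INSERT INTO orders (id, status) VALUES ('ORD-002', 'confirmed');
SELECT tide.outbox_publish('order_events',
  '{"order_id": "ORD-002", "status": "confirmed"}'::jsonb,
  '{"event_type": "order.confirmed"}'::jsonb);
ROLLBACK;
-- Neither the order nor the event exists; nothing is pending
SELECT * FROM tide.outbox_pending WHERE outbox_name = 'order_events';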
Guarantees
The transactional outbox provides:
- Atomicity — The business operation and event publication succeed or fail together
- Durability — Events survive crashes (they're in PostgreSQL's WAL)
- Ordering — Events from the same outbox are delivered in the order they were written
- At-least-once delivery — Every committed event will eventually be delivered to the sink
How pg_tide Implements It
Application PostgreSQL Relay Sink
│ │ │ │
│─── BEGIN ─────────────────→│ │ │
│─── INSERT INTO orders ────→│ │ │
│─── outbox_publish() ──────→│ (writes to outbox table) │ │
│─── COMMIT ────────────────→│ │ │
│ │ │ │
│ │←── poll outbox ──────────│ │
│ │─── return rows ─────────→│ │
│ │ │─── publish ──────────→│
│ │ │←── ack ───────────────│
│ │←── mark delivered ───────│ │
The relay advances through the outbox table sequentially. After successful delivery, it advances its cursor. If it crashes and restarts, it re-reads from the last acknowledged position — potentially re-delivering a few messages (at-least-once), but never losing any.
When to Use This Pattern
Use the transactional outbox when:
- You need to update a database and publish an event reliably
- Consistency between your database and event stream matters
- You can't afford lost events (order notifications, payment confirmations, audit logs)
- You want to decouple your application from the messaging infrastructure
Comparison with Alternatives
| Approach | Consistency | Complexity | Trade-offs |
|---|---|---|---|
| Transactional outbox (pg_tide) | Strong | Low | Slight delivery latency (polling interval) |
| WAL-based CDC (Debezium) | Eventual | Medium | Captures all changes, less control |
| Dual-write (publish + commit) | Weak | Low | Events can be lost or orphaned |
| Saga / 2PC | Strong | High | Complex failure handling |
Further Reading
- Architecture — How pg_tide implements the pattern
- Message Guarantees — Delivery semantics in detail
- Tutorial: Getting Started — Hands-on first pipeline
Concept: The Idempotent Inbox Pattern
The idempotent inbox is the receiving counterpart to the transactional outbox. While the outbox ensures events are reliably published, the inbox ensures events are reliably received and processed exactly once — even when the same message arrives multiple times due to network retries, relay restarts, or at-least-once delivery semantics.
The Problem
In distributed systems, at-least-once delivery is the norm. Messages can be delivered more than once due to:
- Network timeouts (sender retries after no ack)
- Consumer crashes (message re-delivered after acknowledgment timeout)
- Relay restarts (last batch re-delivered)
- Partition rebalances (offset not committed)
If your service processes the same payment event twice, you might charge the customer twice. If it processes the same order event twice, you might ship duplicate items.
The Solution
The inbox table stores every received message with a unique identifier (deduplication key). Before processing, it checks whether the message has already been seen:
-- The relay writes incoming messages to the inbox
-- Duplicate dedup keys (event_id) are silently ignored (idempotent)
INSERT INTO tide."payment_events_inbox" (event_id, payload)
VALUES ('evt-123', '{"order_id": "ORD-001"}'::jsonb)
ON CONFLICT (event_id) DO NOTHING;
If the same message arrives again (same event_id), the INSERT silently does nothing. The message is not processed a second time.
Processing Workflow
1. Message arrives from source (Kafka, NATS, webhook, etc.)
2. Relay writes to inbox (ON CONFLICT DO NOTHING)
3. Application queries inbox for pending messages
4. Application processes the message within a transaction
5. Application marks the message as processed
-- Step 3: Query pending messages
SELECT id, event_type, payload
FROM tide.inbox_pending('payment_events')
LIMIT 10;
-- Step 4-5: Process within transaction
BEGIN;
-- Your business logic here
INSERT INTO payments (order_id, amount, status)
VALUES ('ORD-001', 149.99, 'captured');
-- Mark as processed (atomically with business logic)
SELECT tide.inbox_mark_processed('payment_events', 42);
COMMIT;
Because the mark-processed call is inside the same transaction as the business logic, either both succeed or both fail. If the transaction rolls back, the message remains pending and will be retried.
Deduplication Keys
The dedup key uniquely identifies a message. Common strategies:
| Strategy | Example | Use Case |
|---|---|---|
| Message ID | "evt-abc-123" | Source provides unique IDs |
| Outbox ID | "outbox-42" | Cross-service pg_tide communication |
| Composite | "order-ORD-001-created" | Derived from payload |
| Kafka offset | "topic-0-12345" | Kafka partition + offset |
pg_tide extracts the dedup key from the message based on the wire format configuration. For native format, it uses the message key. For Debezium, it uses the record key.
Failure Handling
If processing a message fails (business logic error, constraint violation), you have two options:
Retry later:
-- Leave as pending, it will be retried on next poll
-- Optionally record the error for monitoring
SELECT tide.inbox_mark_failed('payment_events', 42, 'Insufficient funds');
Skip permanently:
-- Mark as processed to advance past it
SELECT tide.inbox_mark_processed('payment_events', 42);
Exactly-Once Processing
The inbox provides exactly-once processing (not delivery) through this mechanism:
- At-least-once delivery: The source may deliver the same message multiple times
- Deduplication on write: The inbox's unique constraint prevents duplicate storage
- Atomic processing: Business logic + mark-processed in one transaction
The combination ensures each unique message is processed exactly once, regardless of how many times it's delivered.
When to Use the Inbox
Use the inbox when:
- You receive events from external systems that may duplicate
- Processing has side effects (charging money, sending emails, updating state)
- You need a reliable buffer between message receipt and processing
- You want to decouple message consumption from message processing
Further Reading
- Transactional Outbox — The publishing counterpart
- SQL Reference: Inbox API — Complete function reference
- Message Guarantees — Delivery semantics
Concept: Consumer Groups
Consumer groups allow multiple independent consumers to process messages from the same outbox, each maintaining its own position. This enables fan-out patterns where a single stream of events is consumed by different services at different speeds, without them interfering with each other.
The Problem
Without consumer groups, a single outbox has a single "cursor" — one position tracking which messages have been delivered. If you want two services to receive the same events (say, an analytics service and a notification service), you'd need to create two separate outboxes and publish events to both.
The Solution
Consumer groups give each consumer its own independent position within the same outbox:
Outbox: order_events
┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┐
│ 1 │ 2 │ 3 │ 4 │ 5 │ 6 │ 7 │ 8 │ 9 │10 │
└───┴───┴───┴───┴───┴───┴───┴───┴───┴───┘
↑ ↑
Analytics group Notifications group
(position: 4) (position: 8)
The analytics service processes messages slowly (heavy aggregation), while the notification service processes them quickly. Each advances independently.
Creating Consumer Groups
-- Create an outbox
SELECT tide.outbox_create('order_events');
-- Create consumer groups for different services
SELECT tide.consumer_group_create('order_events', 'analytics');
SELECT tide.consumer_group_create('order_events', 'notifications');
SELECT tide.consumer_group_create('order_events', 'search-indexer');
Configuring Pipelines per Group
Each consumer group gets its own relay pipeline:
-- Analytics: sends to data warehouse (slow, large batches)
SELECT tide.relay_set_outbox(
  'orders-analytics',
  'order_events',
  'bigquery',
  '{
    "consumer_group": "analytics",
    "batch_size": 1000,
    "dataset": "raw_events",
    "table": "orders"
  }'::jsonb
);
-- Notifications: sends to Slack (fast, small batches)
SELECT tide.relay_set_outbox(
  'orders-notifications',
  'order_events',
  'slack',
  '{
    "consumer_group": "notifications",
    "batch_size": 1,
    "webhook_url": "${env:SLACK_WEBHOOK}"
  }'::jsonb
);
-- Search: sends to Elasticsearch (medium speed)
SELECT tide.relay_set_outbox(
  'orders-search',
  'order_events',
  'elasticsearch',
  '{
    "consumer_group": "search-indexer",
    "batch_size": 100,
    "url": "http://elasticsearch:9200",
    "index": "orders"
  }'::jsonb
);
Independent Processing
Each consumer group:
- Tracks its own position (last delivered outbox ID)
- Advances at its own pace
- Has its own circuit breaker state
- Can have different transforms, routing, and rate limits
- Can target different sinks
If the analytics pipeline falls behind (BigQuery is slow), notifications and search indexing continue unaffected.
Checking Group Status
-- See position and lag for each consumer group
SELECT * FROM tide.consumer_group_status('order_events');
Returns:
| group_name | last_delivered_id | pending_count |
|---|---|---|
| analytics | 4,231 | 1,769 |
| notifications | 5,998 | 2 |
| search-indexer | 5,500 | 500 |
Adding a New Consumer Group
When you add a new consumer group, you choose where it starts:
-- Start from the beginning (process all historical events)
SELECT tide.consumer_group_create('order_events', 'new-service', 0);
-- Start from the current position (only future events)
SELECT tide.consumer_group_create('order_events', 'new-service');
Use Cases
- Fan-out: Same events go to Kafka, Elasticsearch, and a data lake
- Selective processing: Each group applies different filters/transforms
- Speed isolation: Fast consumers aren't blocked by slow ones
- A/B testing: Two groups process the same events with different logic
- Migration: New consumer group processes alongside old one during transition
Further Reading
- SQL Reference: Consumer Groups API — Complete function reference
- Fan-Out Pattern — Tutorial with multiple consumers
- Notification Fan-Out — Multi-channel notification example
Message Guarantees
pg_tide provides end-to-end exactly-once delivery semantics by combining three mechanisms that work together as a unified system: the transactional outbox ensures no messages are lost at the source, the relay delivers them reliably to downstream systems, and the idempotent inbox catches any duplicates at the destination. This page explains all three mechanisms in depth, how they interact, and what guarantees you can rely on in production.
The Fundamental Problem: Dual Writes
Imagine you're building an e-commerce platform. When a customer places an order, your application needs to do two things: save the order to your PostgreSQL database, and notify the warehouse service that a new order is ready to ship. The naive approach looks straightforward:
Application ──INSERT──▶ PostgreSQL ✓ (order saved)
──publish──▶ Kafka ✗ (network timeout!)
The database has the order. Kafka does not. The warehouse never learns about the order. The customer waits indefinitely for a shipment that nobody knows to send.
This is the dual-write problem. Any time your application writes to two separate systems — a database and a message broker — there's a window where one write can succeed and the other can fail. No amount of application-level retry logic can fully close this window, because your application itself might crash between the two writes.
The consequences are severe and insidious:
- Silent data loss — downstream consumers never see the event
- Inconsistent state — the database says one thing, the event stream says another
- Difficult detection — unless you actively reconcile both systems, you won't know events were lost
- Impossible recovery — once the transaction is committed without the event, you can't retroactively publish it without complex compensating logic
Why retry logic isn't enough
You might think: "I'll just retry the Kafka publish until it succeeds." But consider what happens if the publish succeeds, then your application crashes before recording that success. On restart, it retries — and now the event is published twice. You've traded message loss for message duplication.
What about the reverse order — publish first, then commit? If the database commit fails after a successful publish, you've sent an event about something that never happened.
There is no safe ordering of two independent writes that guarantees exactly-once semantics. The only solution is to eliminate the dual write entirely.
The Solution: The Transactional Outbox Pattern
The transactional outbox pattern eliminates dual writes by reducing two writes to one. Instead of writing to your database and a message broker, you write to your database only — and the message goes into a special outbox table within the same transaction as your business data:
BEGIN;
-- Your business logic: save the order
INSERT INTO orders (id, customer_id, total, status)
VALUES (42, 'cust-123', 99.99, 'confirmed');
-- Event publishing: same transaction, same database
SELECT tide.outbox_publish('orders',
'{"order_id": 42, "customer_id": "cust-123", "total": 99.99, "status": "confirmed"}'::jsonb,
'{"event_type": "order.confirmed", "correlation_id": "req-abc-789"}'::jsonb
);
COMMIT;
Both the order insert and the message insert succeed or fail together — they're part of the same PostgreSQL transaction. There is no window where one succeeds without the other. If the transaction commits, the message is guaranteed to exist. If it rolls back (for any reason — constraint violation, application crash, network disconnect), the message disappears along with the business data.
A separate relay process then reads committed messages from the outbox table and delivers them to whatever downstream system you've configured (Kafka, NATS, webhooks, etc.). The relay runs independently of your application and can retry indefinitely — the message is safely persisted in PostgreSQL until delivery succeeds.
This separation of concerns gives you the best of both worlds:
- Transactional safety — your application only writes to one system
- Guaranteed delivery — the relay keeps trying until the downstream system acknowledges
- Decoupled systems — your application doesn't need to know about broker availability
- Simple application code — publishing an event is just a SQL function call
How pg_tide implements the outbox
When you call tide.outbox_publish(name, payload, headers), pg_tide:
- Inserts a row into tide.tide_outbox_messages with your payload and headers
- Fires pg_notify('tide_outbox_new', outbox_name) to wake the relay immediately
The outbox messages table stores all messages from all named outboxes in a single table, discriminated by outbox_name:
| Column | Type | Purpose |
|---|---|---|
id | BIGINT (auto-increment) | Monotonically increasing offset — the relay uses this to track progress |
outbox_name | TEXT | Routes messages to the correct pipeline |
payload | JSONB | Your event data — whatever you want downstream consumers to see |
headers | JSONB | Metadata: event type, correlation ID, schema version, etc. |
created_at | TIMESTAMPTZ | When the message was published |
consumed_at | TIMESTAMPTZ | When the relay successfully delivered it (NULL means pending) |
consumer_group | TEXT | Which consumer group processed this message |
The auto-incrementing id column is crucial: it provides a total ordering of messages within an outbox, which the relay uses to guarantee in-order delivery and to track its position.
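Because the outbox is a plain table, delivery state is one query away — for example, using only the columns listed above:
-- Pending (undelivered) messages for one outbox, oldest first
SELECT id, payload, headers, created_at
FROM tide.tide_outbox_messages
WHERE outbox_name = 'orders'
  AND consumed_at IS NULL
ORDER BY id
LIMIT 100;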
The relay loop
The pg-tide relay binary continuously:
- Polls for pending messages (WHERE consumed_at IS NULL AND id > last_committed_offset)
- Delivers each batch to the configured sink (NATS, Kafka, Redis, webhooks, etc.)
- Commits the offset — records how far it's read, so it can resume from this position after a restart
- Marks messages consumed — sets consumed_at so they're excluded from future polls
- Respects retention — messages older than retention_hours are eligible for cleanup
If the relay crashes at any point in this loop, it restarts from the last committed offset and re-delivers any messages that weren't confirmed. This is why the relay provides at-least-once delivery — it never skips a message, but it might deliver one twice.
Retention and cleanup
Each outbox has a configurable retention window. After messages have been consumed and their retention period has elapsed, they can be cleaned up to prevent unbounded table growth:
-- Create an outbox with 48-hour retention
SELECT tide.outbox_create('orders', p_retention_hours := 48);
The inline_threshold parameter provides backpressure: if the number of pending (unconsumed) messages exceeds this threshold, subsequent publishes will pause, preventing your outbox from growing unboundedly if the relay is down.
The Idempotent Inbox: Catching Duplicates at the Destination
The transactional outbox guarantees that every committed event will be delivered at least once. But "at least once" means duplicates are possible. Consider this scenario:
- The relay polls the outbox and gets message #42
- The relay delivers message #42 to the downstream system — success
- The relay crashes before committing offset 42
- The relay restarts, reads from its last committed offset (41), and delivers message #42 again
Without protection at the receiving end, the downstream system processes the same event twice. For an "order confirmed" event, this might trigger two shipments. For a "payment processed" event, it might charge the customer twice.
The idempotent inbox solves this. It's a PostgreSQL table with a UNIQUE constraint on an event identifier. When a message arrives:
- The relay attempts an INSERT with the event's dedup key as the event_id
- If the key already exists (duplicate delivery), the insert is silently skipped via ON CONFLICT DO NOTHING (see the sketch below)
- Your application only sees each event once, regardless of how many times it was delivered
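Conceptually, the dedup insert looks like this — a sketch with example values; the actual insert is performed by the relay:

-- Sketch of the relay's idempotent insert (example values)
INSERT INTO tide."payment-events_inbox" (event_id, source, payload, headers)
VALUES ('orders:42', 'pg_tide', '{"order_id": 42}'::jsonb, '{}'::jsonb)
ON CONFLICT (event_id) DO NOTHING;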
How the inbox works in practice
Each named inbox gets its own message table with this structure:
CREATE TABLE tide."payment-events_inbox" (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
event_id TEXT NOT NULL,
source TEXT,
payload JSONB,
headers JSONB,
received_at TIMESTAMPTZ DEFAULT now(),
processed_at TIMESTAMPTZ,
retry_count INT DEFAULT 0,
last_error TEXT,
CONSTRAINT uq_payment_events_event_id UNIQUE (event_id)
);
The UNIQUE(event_id) constraint is the deduplication mechanism. It's simple, reliable, and leverages PostgreSQL's proven concurrency guarantees.
Creating an inbox
SELECT tide.inbox_create('payment-events',
p_max_retries := 5,
p_processed_retention_hours := 72,
p_dlq_retention_hours := 168
);
| Parameter | Default | What it controls |
|---|---|---|
| p_schema | 'tide' | Schema where the inbox table lives |
| p_max_retries | 3 | How many times processing can fail before the message is considered dead |
| p_processed_retention_hours | 72 | How long successfully processed messages are kept (for auditing) |
| p_dlq_retention_hours | 0 | How long dead-letter messages are kept (0 = forever) |
Processing inbox messages
Your application reads from the inbox table and marks messages as processed after handling them:
-- Read the next batch of pending messages
SELECT id, event_id, payload, headers
FROM tide."payment-events_inbox"
WHERE processed_at IS NULL
AND retry_count < 5
ORDER BY id
LIMIT 10;
-- After successfully processing a message
SELECT tide.inbox_mark_processed('payment-events', 'evt-001');
-- If processing fails (e.g., external API timeout)
SELECT tide.inbox_mark_failed('payment-events', 'evt-003',
'Stripe API timeout after 30s');
When you call inbox_mark_failed, the retry_count is incremented and the last_error is recorded. Your application can retry later. After max_retries failures, the message is effectively in the dead-letter queue — it won't be picked up by normal processing loops.
The dead-letter queue
Messages that exhaust their retry budget aren't deleted — they remain in the inbox table for investigation. You can query them, examine the error history, and replay them once you've fixed the underlying issue:
-- Find all dead-letter messages
SELECT event_id, payload, last_error, retry_count
FROM tide."payment-events_inbox"
WHERE processed_at IS NULL
AND retry_count >= 5;
-- Replay specific messages after fixing the issue
SELECT tide.replay_inbox_messages('payment-events',
ARRAY['evt-003', 'evt-007', 'evt-012']);
Replaying resets the retry_count to zero, making the messages eligible for processing again.
Choosing dedup keys
The event_id should be deterministic and unique per logical event. The goal is that the same logical event always produces the same dedup key, regardless of how many times it's delivered:
| Source | Recommended dedup key | Why |
|---|---|---|
| pg_tide outbox | {outbox_name}:{message_id} | Automatic — the relay generates this |
| Kafka | {topic}:{partition}:{offset} | Uniquely identifies a Kafka record |
| NATS JetStream | Message sequence number | Assigned by NATS |
| HTTP webhook | X-Request-ID header | Sender-assigned idempotency key |
| Custom sources | Any stable unique identifier | Domain-specific (e.g., order-42:confirmed) |
The relay automatically generates appropriate dedup keys based on the source type when operating in reverse mode (external source → inbox).
End-to-End Exactly-Once: The Three Pillars Combined
When you combine the transactional outbox, the relay's offset tracking, and the idempotent inbox, you get effectively exactly-once delivery semantics end-to-end. Here's how the complete flow works:
1. Application: BEGIN; INSERT business_data; outbox_publish(); COMMIT;
↓
2. Relay: Polls outbox → gets messages where id > last_committed_offset
↓
3. Relay: Delivers to sink (e.g., INSERT into inbox with dedup key)
↓
4. Relay: On sink acknowledgment → commit_offset(last_delivered_id)
↓
5. Relay: Marks outbox messages as consumed
Each stage is protected:
| Stage | What could go wrong | Protection mechanism |
|---|---|---|
| Publish | Transaction rolls back | Message disappears with the business data — correct behavior |
| Relay poll | Relay crashes mid-poll | Restarts from last committed offset — no messages skipped |
| Delivery | Sink temporarily down | Relay retries with exponential backoff — message stays pending |
| Delivery | Relay crashes after delivery but before offset commit | Relay re-delivers on restart; inbox dedup key prevents duplicate processing |
| Offset commit | Database connection lost | Relay reconnects and re-commits — idempotent operation |
Edge cases handled
Relay crash after delivery, before offset commit: This is the most important edge case. The relay successfully delivered message #42 to the inbox, then crashed before recording offset 42. On restart, it re-delivers #42. The inbox's UNIQUE constraint on event_id catches the duplicate, and the insert is silently skipped. The application sees message #42 exactly once.
PostgreSQL failover: If the primary PostgreSQL instance fails over to a replica, the relay's advisory locks are automatically released (they're tied to the session). Another relay instance can acquire the locks and resume from the last committed offset. In-flight messages that weren't committed are re-delivered, and the inbox dedup catches any duplicates.
Sink temporarily unavailable: The relay retries with exponential backoff (100ms → 30s with jitter). Messages remain pending in the outbox — they're never lost. Once the sink recovers, delivery resumes automatically.
Duplicate outbox_publish calls: If your application accidentally publishes the same logical event twice (due to a retry at the application level), you can include a deterministic event_id in the headers. The inbox dedup key will catch duplicates at the receiving end. Alternatively, design your consumers to be naturally idempotent.
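For example, a publish that carries an application-generated dedup key (the event_id value here is illustrative):

SELECT tide.outbox_publish('orders',
    '{"order_id": 42, "status": "confirmed"}'::jsonb,
    jsonb_build_object(
        'event_type', 'order.confirmed',
        'event_id', 'order-42:confirmed'  -- deterministic, retry-safe dedup key
    )
);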
Guarantees summary
| Component | Guarantee | Mechanism |
|---|---|---|
| Outbox publish | Exactly-once write | Same PostgreSQL transaction as business data |
| Relay delivery | At-least-once | Retries until sink acknowledges, resumes from last offset |
| Inbox receive | Exactly-once processing | UNIQUE constraint on event_id |
| End-to-end | Effectively exactly-once | All three mechanisms combined |
Limitations and honest caveats
pg_tide's exactly-once guarantee is strong, but it's important to understand the boundaries:
- Cross-sink atomicity: If you configure a single outbox to fan out to multiple sinks (e.g., Kafka and a webhook), and one delivery succeeds while the other fails, you'll have partial delivery. Use separate pipelines per sink for independent exactly-once guarantees per destination.
- External sink semantics: Exactly-once delivery into pg_tide inboxes is guaranteed because the inbox uses PostgreSQL's UNIQUE constraint. For external sinks (Kafka, NATS, Redis), the guarantee depends on the sink's acknowledgment semantics. If a sink acknowledges delivery but then loses the message internally, pg_tide cannot detect that. Choose sinks with strong durability guarantees for critical workloads.
- Clock skew and retention: Retention cleanup uses created_at timestamps. Extreme clock skew between PostgreSQL nodes could cause premature cleanup of messages that haven't been consumed yet. Always use NTP-synchronized hosts.
- "Effectively" vs. "truly" exactly-once: In distributed systems theory, true exactly-once delivery across system boundaries is provably impossible without two-phase commit. pg_tide achieves effectively exactly-once by combining at-least-once delivery with idempotent reception — the outcome is the same (each event is processed exactly once), but the mechanism involves potential redelivery that's silently deduplicated.
Comparison with Other Approaches
To understand why the transactional outbox pattern is valuable, it helps to see how it compares with alternatives:
Two-Phase Commit (2PC)
2PC coordinates writes across multiple systems using a prepare/commit protocol. It provides true atomicity but at severe cost: high latency, reduced availability (any participant failure blocks the entire transaction), and complexity. pg_tide avoids 2PC entirely — you write to one system, and the relay handles the rest asynchronously.
Change Data Capture (CDC) via Debezium
Debezium captures row-level changes from PostgreSQL's WAL (write-ahead log) and publishes them to Kafka. It doesn't require application changes, but you lose control over the event format (events mirror table schemas, not business semantics), and it demands significant infrastructure (Kafka Connect, a Kafka cluster, a JVM). pg_tide gives you explicit control over what you publish.
Application-Level Retry with Compensation
Some systems retry failed broker publishes and compensate for duplicates on the consumer side. This "best effort" approach is fragile: it requires every consumer to implement idempotency, provides no centralized dedup mechanism, and becomes increasingly complex as the number of consumers grows. pg_tide centralizes deduplication in the inbox.
Direct Broker Writes (Accept the Risk)
For non-critical events (telemetry, analytics pings, real-time notifications), some teams accept the dual-write risk and publish directly to a broker. This is valid when message loss is acceptable. pg_tide is for when it isn't.
Practical Patterns
Publishing multiple events in one transaction
You can publish multiple events atomically:
BEGIN;
UPDATE orders SET status = 'shipped' WHERE id = 42;
UPDATE inventory SET quantity = quantity - 1 WHERE product_id = 'SKU-001';
-- Both events are published atomically
SELECT tide.outbox_publish('orders',
'{"order_id": 42, "status": "shipped"}'::jsonb,
'{"event_type": "order.shipped"}'::jsonb
);
SELECT tide.outbox_publish('inventory',
'{"product_id": "SKU-001", "quantity_change": -1}'::jsonb,
'{"event_type": "inventory.decremented"}'::jsonb
);
COMMIT;
Conditional publishing
Only publish when certain conditions are met. RETURNING ... INTO and IF are PL/pgSQL constructs, so the conditional logic lives in a DO block:
BEGIN;
DO $$
DECLARE
  affected_id BIGINT;
BEGIN
  UPDATE orders SET status = 'confirmed'
  WHERE id = 42 AND status = 'pending'
  RETURNING id INTO affected_id;
  -- Only publish if the update actually changed something
  IF affected_id IS NOT NULL THEN
    PERFORM tide.outbox_publish('orders',
      format('{"order_id": %s, "status": "confirmed"}', affected_id)::jsonb,
      '{"event_type": "order.confirmed"}'::jsonb
    );
  END IF;
END $$;
COMMIT;
Including correlation IDs for tracing
Pass request or trace IDs through the headers so downstream systems can correlate events:
SELECT tide.outbox_publish('orders',
'{"order_id": 42}'::jsonb,
jsonb_build_object(
'event_type', 'order.created',
'correlation_id', 'req-abc-123',
'trace_id', 'trace-xyz-456',
'schema_version', '1.0'
)
);
Consumption and Relay
Once messages are safely stored in the transactional outbox, they need to be delivered to downstream systems and tracked independently by each consumer. This page explains how pg_tide's consumer groups and relay pipelines work together to provide reliable, independent message consumption with automatic failover.
Consumer Groups: Independent Bookmarks in a Shared Stream
Think of a consumer group as a bookmark in a shared book. Multiple services might be interested in the same stream of outbox messages — one service sends emails, another updates a search index, a third feeds an analytics pipeline. Each of these services needs to track its own progress independently. That's exactly what a consumer group provides: an independent offset that records how far a particular consumer has read through the outbox.
If the email service crashes and restarts, it picks up right where it left off — at its own bookmark — without replaying messages that the analytics service has already processed, and without skipping messages that it hasn't yet seen.
Core concepts
A consumer group has these key properties:
- Independent progress — different groups read the same outbox at their own pace. The email sender might be at offset 500 while the analytics pipeline is at offset 2000. They don't interfere with each other.
- Offset tracking — each group records the ID of the last message it successfully processed. On restart, consumption resumes from that exact position.
- Heartbeats — consumers periodically signal that they're alive. Stale heartbeats indicate a dead consumer whose work might need to be redistributed.
- Visibility leases — when a consumer claims a batch of messages, it takes a time-limited lease. If it fails to commit the offset before the lease expires, the messages become available for another consumer to process.
Creating a consumer group
SELECT tide.create_consumer_group('email-sender', 'orders',
p_auto_offset_reset := 'earliest'
);
The p_auto_offset_reset parameter determines where consumption starts if no offset has been committed yet:
| Value | Behavior | When to use |
|---|---|---|
| earliest | Start from the very first available message in the outbox | When you want to process the complete history — common for new consumers that need to catch up |
| latest | Start from the current end of the outbox (skip historical messages) | When you only care about future events — useful for real-time notification services |
| none | Raise an error if no committed offset exists | When you want to be explicit and prevent accidentally processing from the wrong position |
Committing offsets
After your application (or the relay) successfully processes a batch of messages, it commits the offset to record its progress:
SELECT tide.commit_offset('email-sender', 'worker-1', 42);
This records that worker-1 in the email-sender group has processed all messages up to and including ID 42. On restart, this consumer will resume from message 43.
Offset commits are idempotent — committing the same offset twice is a no-op. They're also monotonic within a consumer — you should only ever commit a higher offset than the previous one.
Heartbeats and liveness
Consumers periodically send heartbeats to signal that they're alive and actively processing:
SELECT tide.consumer_heartbeat('email-sender', 'worker-1');
Heartbeats serve two purposes:
- Monitoring — you can detect stale consumers by checking last_heartbeat against a threshold (see the staleness query below)
- Lease management — future versions of pg_tide may automatically reassign work from consumers that haven't heartbeated recently
The tide.consumer_lag view shows the current state of each consumer:
SELECT * FROM tide.consumer_lag;
group_name | outbox_name | consumer_id | committed_offset | lag | last_heartbeat
-----------------+-------------+-------------+------------------+------+--------------------
email-sender | orders | worker-1 | 500 | 1500 | 2025-01-15 10:30:00
analytics | orders | relay-0 | 1800 | 200 | 2025-01-15 10:31:00
search-indexer | orders | relay-0 | 2000 | 0 | 2025-01-15 10:31:02
In this example, the email sender is 1,500 messages behind the latest — it might be slow or stuck. The search indexer is fully caught up.
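A staleness check is then a one-liner against the view; the one-minute threshold here is an arbitrary choice:

-- Consumers that haven't heartbeated in the last minute
SELECT group_name, consumer_id, last_heartbeat
FROM tide.consumer_lag
WHERE last_heartbeat < now() - interval '1 minute';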
Visibility leases
Visibility leases prevent two consumers from processing the same batch of messages simultaneously. When a consumer claims a batch:
- A lease is recorded in tide.tide_consumer_leases with a start ID, end ID, and expiry time
- Other consumers in the same group won't see those messages until the lease expires
- When the consumer commits the offset, the lease is released
- If the consumer crashes without committing, the lease eventually expires and the messages become available again
This mechanism is similar to SQS's visibility timeout or Kafka's partition assignment — within a consumer group, each batch of messages is assigned to at most one consumer at a time.
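You can inspect lease state directly via the tide.tide_consumer_leases catalog table (documented later); for example, to find expired leases whose work is eligible for reclaim:

-- Expired leases: batches whose consumer never committed
SELECT group_name, consumer_id, lease_start, lease_end, expires_at
FROM tide.tide_consumer_leases
WHERE expires_at < now();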
Multiple groups, one outbox
The most powerful aspect of consumer groups is that a single outbox can serve many independent purposes:
-- The relay delivers to NATS for real-time notifications
SELECT tide.create_consumer_group('nats-relay', 'orders');
-- An analytics service reads the same events for data warehouse loading
SELECT tide.create_consumer_group('analytics-etl', 'orders');
-- An audit logger persists every event to compliance storage
SELECT tide.create_consumer_group('audit-log', 'orders');
-- A search indexer updates Elasticsearch
SELECT tide.create_consumer_group('search-index', 'orders');
Each group progresses independently. The NATS relay might be at offset 5000 while the analytics ETL is catching up at offset 3000. They don't interfere with each other, and the outbox doesn't need to know about any of them.
Consumer group lifecycle
-- Create a group
SELECT tide.create_consumer_group('my-group', 'events');
-- Drop a group (cascades: removes all offsets and leases)
SELECT tide.drop_consumer_group('my-group');
-- Idempotent creation (no error if already exists)
SELECT tide.create_consumer_group('my-group', 'events',
p_if_not_exists := true);
Relay Pipelines: Bridging PostgreSQL to the Outside World
A relay pipeline defines how messages flow between pg_tide and external systems. While consumer groups track position, pipelines define destination — where should messages actually go?
Pipelines are configured directly in the database (not in config files) and discovered by the relay binary at runtime. This means you can create, modify, and delete pipelines entirely through SQL, and the relay picks up changes automatically via PostgreSQL's LISTEN/NOTIFY mechanism.
Two directions of flow
pg_tide supports two pipeline directions:
Forward pipelines (Outbox → External Sink): Messages flow from a pg_tide outbox to an external system. This is the most common pattern — your application publishes events to the outbox, and the relay delivers them to NATS, Kafka, Redis, webhooks, or any other configured sink.
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ PostgreSQL │ │ pg-tide │ │ External │
│ outbox │────────▶│ relay │────────▶│ system │
│ │ poll │ │ publish │ (NATS, etc) │
└──────────────┘ └──────────────┘ └──────────────┘
Reverse pipelines (External Source → Inbox): Messages flow from an external system into a pg_tide inbox. This is for receiving events from other services — the relay subscribes to an external source and writes incoming messages to an inbox table with deduplication.
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ External │ │ pg-tide │ │ PostgreSQL │
│ system │────────▶│ relay │────────▶│ inbox │
│ (NATS, etc) │ subscribe│ │ insert │ │
└──────────────┘ └──────────────┘ └──────────────┘
Configuring a forward pipeline
Forward pipelines connect an outbox to an external sink:
SELECT tide.relay_set_outbox(
'orders-to-kafka', -- pipeline name (must be unique)
'orders', -- source outbox name
'kafka', -- sink type
jsonb_build_object( -- sink-specific configuration
'brokers', 'broker1:9092,broker2:9092',
'topic', 'order-events',
'acks', 'all',
'compression', 'snappy'
),
p_batch_size := 200, -- deliver messages in batches of 200
p_enabled := true -- start processing immediately
);
The config parameter is a JSONB object whose keys depend on the sink type. Each backend (NATS, Kafka, Redis, RabbitMQ, SQS, Webhook) has its own set of configuration options — see the Backends page for complete details.
Configuring a reverse pipeline
Reverse pipelines connect an external source to an inbox:
SELECT tide.relay_set_inbox(
'stripe-webhooks', -- pipeline name
'payment-events', -- target inbox name
jsonb_build_object( -- source-specific configuration
'port', 8080,
'path', '/webhooks/stripe',
'auth_header', 'Bearer whsec_abc123'
),
p_source := 'webhook', -- source type
p_batch_size := 50,
p_idempotent := true -- enable dedup key extraction
);
Pipeline lifecycle management
Pipelines support a full lifecycle through SQL:
-- Pause processing (messages accumulate in the outbox)
SELECT tide.relay_disable('orders-to-kafka');
-- Resume processing
SELECT tide.relay_enable('orders-to-kafka');
-- Delete permanently (removes config and stops processing)
SELECT tide.relay_delete('orders-to-kafka');
-- View a pipeline's current configuration
SELECT tide.relay_get_config('orders-to-kafka');
-- List all configured pipelines
SELECT tide.relay_list_configs();
Hot reload: no restart required
When you create, update, or delete a pipeline configuration, pg_tide fires a PostgreSQL notification:
pg_notify('tide_relay_config', '{"direction": "relay_outbox_config", "op": "INSERT", "name": "orders-to-kafka"}')
The relay binary listens for these notifications via LISTEN tide_relay_config. When a notification arrives, the relay:
- Re-reads the pipeline catalog from the database
- Starts any new pipelines
- Stops any deleted pipelines
- Reconfigures any modified pipelines
This means you can manage your entire pipeline lifecycle from SQL — no relay restarts, no config file deployments, no downtime. Add a new pipeline, and it starts processing within seconds.
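You can observe these notifications yourself from psql (an illustrative two-session sketch):

-- Session 1: subscribe to config-change notifications
LISTEN tide_relay_config;
-- Session 2: change a pipeline
SELECT tide.relay_disable('orders-to-kafka');
-- Session 1, on its next command, prints something like:
-- Asynchronous notification "tide_relay_config" with payload "..." received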
Advisory lock coordination: automatic failover
In production, you typically run multiple relay instances for high availability. But if two relays tried to process the same pipeline simultaneously, you'd get duplicate deliveries. pg_tide prevents this using PostgreSQL advisory locks.
Each pipeline is protected by a unique advisory lock. When a relay instance starts up:
- It reads the pipeline catalog
- For each pipeline, it attempts to acquire an advisory lock (non-blocking)
- If it gets the lock, it owns that pipeline and begins processing
- If another instance already holds the lock, it skips that pipeline and moves on
This gives you:
- Automatic failover — if a relay dies, its PostgreSQL session ends, the advisory locks are released, and another instance acquires them within seconds
- No duplicate processing — only one relay processes each pipeline at any given time
- Horizontal distribution — with many pipelines and many relay instances, pipelines are naturally distributed across instances
# Instance A — might own pipelines 1, 3, 5
pg-tide --relay-group-id production --postgres-url ...
# Instance B — might own pipelines 2, 4, 6
pg-tide --relay-group-id production --postgres-url ...
# If A crashes, B acquires pipelines 1, 3, 5 within seconds
The relay group ID
The relay_group_id namespaces advisory locks. Relay instances with the same group ID compete for pipeline ownership — this is how HA failover works. Instances with different group IDs operate independently and can theoretically own the same pipeline simultaneously (though this would cause duplicate delivery and is rarely desirable).
# Production HA pair — same group, automatic failover between them
pg-tide --relay-group-id production --postgres-url ...
pg-tide --relay-group-id production --postgres-url ...
# Separate staging environment — different group, isolated
pg-tide --relay-group-id staging --postgres-url ...
Supported backends
| Backend | Forward (Sink) | Reverse (Source) | Feature gate |
|---|---|---|---|
| NATS | ✓ | ✓ | nats (default) |
| Kafka | ✓ | ✓ | kafka |
| Redis Streams | ✓ | ✓ | redis |
| RabbitMQ | ✓ | ✓ | rabbitmq |
| SQS | ✓ | ✓ | sqs |
| HTTP Webhook | ✓ | ✓ | webhook (default) |
| pg_tide Inbox | ✓ | — | pg-inbox |
| stdout | ✓ | — | stdout (default) |
| stdin | — | ✓ | always available |
Putting it all together
Here's a typical production setup with multiple pipelines serving different purposes:
-- Forward: order events go to NATS for real-time microservice communication
SELECT tide.relay_set_outbox('orders-realtime', 'orders', 'nats',
jsonb_build_object(
'url', 'nats://nats-cluster:4222',
'subject', 'orders.{event_type}'
)
);
-- Forward: order events also go to Kafka for long-term analytics
SELECT tide.relay_set_outbox('orders-analytics', 'orders', 'kafka',
jsonb_build_object(
'brokers', 'kafka:9092',
'topic', 'orders-analytics',
'compression', 'zstd'
),
p_batch_size := 500
);
-- Reverse: incoming payment confirmations from a third-party webhook
SELECT tide.relay_set_inbox('payment-webhooks', 'payments',
jsonb_build_object(
'port', 8080,
'path', '/webhooks/payments'
),
p_source := 'webhook'
);
-- Forward: payment confirmations forwarded to an internal NATS subject
SELECT tide.relay_set_outbox('payments-internal', 'payment-notifications', 'nats',
jsonb_build_object(
'url', 'nats://nats-cluster:4222',
'subject', 'payments.confirmed'
)
);
Each pipeline operates independently with its own offset tracking, retry logic, and advisory lock. The relay binary handles all of them concurrently.
Outbox API
All outbox functions live in the tide schema.
tide.outbox_create
Create a new named outbox.
SELECT tide.outbox_create(
p_name TEXT,
p_retention_hours INT DEFAULT 24,
p_inline_threshold INT DEFAULT 10000
);
| Parameter | Type | Default | Description |
|---|---|---|---|
| p_name | TEXT | (required) | Unique outbox name |
| p_retention_hours | INT | 24 | Hours to retain consumed messages before cleanup |
| p_inline_threshold | INT | 10000 | Maximum pending messages before backpressure signals |
Errors:
- Raises an error if an outbox with the same name already exists.
Example:
SELECT tide.outbox_create('order-events', 48, 50000);
tide.outbox_publish
Publish a message to a named outbox. Runs within the caller's transaction.
SELECT tide.outbox_publish(
p_name TEXT,
p_payload JSONB,
p_headers JSONB
);
| Parameter | Type | Description |
|---|---|---|
| p_name | TEXT | Target outbox name |
| p_payload | JSONB | Message body |
| p_headers | JSONB | Metadata (event_type, correlation_id, etc.) |
Behavior:
- Inserts into tide.tide_outbox_messages
- Fires pg_notify('tide_outbox_new', p_name) to wake the relay
- Errors if the outbox does not exist or is disabled
Example:
BEGIN;
INSERT INTO orders (id, total) VALUES (42, 99.99);
SELECT tide.outbox_publish('order-events',
'{"order_id": 42, "total": 99.99}'::jsonb,
'{"event_type": "order.created"}'::jsonb
);
COMMIT;
tide.outbox_drop
Drop a named outbox and all its messages.
SELECT tide.outbox_drop(
p_name TEXT,
p_if_exists BOOLEAN DEFAULT false
);
| Parameter | Type | Default | Description |
|---|---|---|---|
| p_name | TEXT | (required) | Outbox to drop |
| p_if_exists | BOOLEAN | false | Suppress error if outbox doesn't exist |
Cascades: Removes all messages and consumer groups for this outbox.
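Example (idempotent drop):
SELECT tide.outbox_drop('order-events', p_if_exists := true);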
tide.outbox_status
Get a status summary for a named outbox.
SELECT tide.outbox_status(p_name TEXT) → JSONB
Returns:
{
"outbox_name": "orders",
"pending_messages": 42,
"total_messages": 1500,
"oldest_pending_age_seconds": 3.7,
"retention_hours": 24
}
tide.outbox_disable
Pause an outbox. Calls to outbox_publish will error while the outbox is disabled.
SELECT tide.outbox_disable(p_name TEXT);
tide.outbox_enable
Resume a previously disabled outbox.
SELECT tide.outbox_enable(p_name TEXT);
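A typical maintenance sequence (sketch):
-- Pause publishing during a migration, then resume
SELECT tide.outbox_disable('orders');
-- ... run the migration ...
SELECT tide.outbox_enable('orders');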
Views
tide.outbox_pending
Pending (unconsumed) messages per outbox:
SELECT * FROM tide.outbox_pending;
| Column | Type | Description |
|---|---|---|
| outbox_name | TEXT | Outbox name |
| pending_count | BIGINT | Number of unconsumed messages |
| oldest_at | TIMESTAMPTZ | Timestamp of the oldest pending message |
| max_id | BIGINT | Highest message ID in this outbox |
Inbox API
All inbox functions live in the tide schema.
tide.inbox_create
Create a named inbox with its message table.
SELECT tide.inbox_create(
p_name TEXT,
p_schema TEXT DEFAULT 'tide',
p_max_retries INT DEFAULT 3,
p_processed_retention_hours INT DEFAULT 72,
p_dlq_retention_hours INT DEFAULT 0
);
| Parameter | Type | Default | Description |
|---|---|---|---|
| p_name | TEXT | (required) | Unique inbox name |
| p_schema | TEXT | 'tide' | Schema where the inbox table is created |
| p_max_retries | INT | 3 | Max processing attempts before DLQ |
| p_processed_retention_hours | INT | 72 | Hours to keep processed messages |
| p_dlq_retention_hours | INT | 0 | Hours to keep DLQ messages (0 = forever) |
Creates: A table {schema}."{name}_inbox" with columns for dedup, retry tracking, and payload storage.
Example:
SELECT tide.inbox_create('payment-webhooks',
p_max_retries := 5,
p_processed_retention_hours := 168
);
tide.inbox_drop
Drop a named inbox and its message table.
SELECT tide.inbox_drop(
p_name TEXT,
p_if_exists BOOLEAN DEFAULT false
);
Cascades: Drops the inbox table and removes the config entry.
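Example (idempotent drop):
SELECT tide.inbox_drop('payment-webhooks', p_if_exists := true);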
tide.inbox_mark_processed
Mark an inbox message as successfully processed.
SELECT tide.inbox_mark_processed(
p_name TEXT,
p_event_id TEXT
);
| Parameter | Type | Description |
|---|---|---|
| p_name | TEXT | Inbox name |
| p_event_id | TEXT | The event_id to mark as processed |
Sets processed_at = now() on the matching row. Idempotent — calling it on an already-processed message is a no-op.
tide.inbox_mark_failed
Mark an inbox message as failed. Increments retry_count and stores the error.
SELECT tide.inbox_mark_failed(
p_name TEXT,
p_event_id TEXT,
p_error TEXT
);
| Parameter | Type | Description |
|---|---|---|
| p_name | TEXT | Inbox name |
| p_event_id | TEXT | The event_id that failed |
| p_error | TEXT | Error message to store |
tide.inbox_status
Get status summary for an inbox (or all inboxes).
-- Single inbox
SELECT tide.inbox_status('payment-webhooks') → JSONB
-- All inboxes
SELECT tide.inbox_status() → JSONB
Returns (single inbox):
{
"inbox_name": "payment-webhooks",
"pending": 3,
"dlq_count": 1
}
tide.replay_inbox_messages
Re-queue failed messages for reprocessing. Resets retry_count to 0 and clears last_error.
SELECT tide.replay_inbox_messages(
p_name TEXT,
p_event_ids TEXT[]
) → BIGINT
| Parameter | Type | Description |
|---|---|---|
| p_name | TEXT | Inbox name |
| p_event_ids | TEXT[] | Array of event_ids to replay |
Returns: Number of messages successfully re-queued.
Example:
SELECT tide.replay_inbox_messages('payment-webhooks',
ARRAY['evt-003', 'evt-007', 'evt-015']
);
Inbox Table Schema
Each inbox gets a table {schema}."{name}_inbox" with this structure:
| Column | Type | Description |
|---|---|---|
| id | BIGINT | Auto-generated primary key |
| event_id | TEXT | Dedup key (UNIQUE) |
| source | TEXT | Where the message came from |
| payload | JSONB | Message body |
| headers | JSONB | Metadata |
| received_at | TIMESTAMPTZ | When the message arrived |
| processed_at | TIMESTAMPTZ | When processing completed (NULL = pending) |
| retry_count | INT | Number of failed processing attempts |
| last_error | TEXT | Most recent error message |
Relay API
Functions for managing relay pipeline configurations. All live in the tide schema.
tide.relay_set_outbox
Configure a forward relay pipeline (outbox → external sink).
SELECT tide.relay_set_outbox(
p_name TEXT,
p_outbox TEXT,
p_sink TEXT,
p_config JSONB DEFAULT '{}'::jsonb,
p_batch_size INT DEFAULT 100,
p_enabled BOOLEAN DEFAULT true
);
| Parameter | Type | Default | Description |
|---|---|---|---|
| p_name | TEXT | (required) | Unique pipeline name |
| p_outbox | TEXT | (required) | Source outbox name |
| p_sink | TEXT | (required) | Sink type: nats, kafka, redis, rabbitmq, sqs, webhook, stdout |
| p_config | JSONB | {} | Sink-specific configuration |
| p_batch_size | INT | 100 | Messages per relay batch |
| p_enabled | BOOLEAN | true | Whether the pipeline is active |
Upsert behavior: If a pipeline with the same name exists, its configuration is updated.
Example:
SELECT tide.relay_set_outbox('orders-to-nats', 'orders', 'nats',
jsonb_build_object(
'url', 'nats://localhost:4222',
'subject', 'orders.{event_type}'
),
p_batch_size := 200
);
tide.relay_set_inbox
Configure a reverse relay pipeline (external source → inbox).
SELECT tide.relay_set_inbox(
p_name TEXT,
p_inbox TEXT,
p_config JSONB DEFAULT '{}'::jsonb,
p_batch_size INT DEFAULT 100,
p_source TEXT DEFAULT 'stdout',
p_enabled BOOLEAN DEFAULT true,
p_max_retries INT DEFAULT 3,
p_idempotent BOOLEAN DEFAULT true
);
| Parameter | Type | Default | Description |
|---|---|---|---|
| p_name | TEXT | (required) | Unique pipeline name |
| p_inbox | TEXT | (required) | Target inbox name |
| p_config | JSONB | {} | Source-specific configuration |
| p_batch_size | INT | 100 | Messages per batch |
| p_source | TEXT | 'stdout' | Source type: nats, kafka, redis, rabbitmq, sqs, webhook, stdin |
| p_enabled | BOOLEAN | true | Whether the pipeline is active |
| p_max_retries | INT | 3 | Max delivery retries |
| p_idempotent | BOOLEAN | true | Use inbox dedup (recommended) |
tide.relay_enable
Enable a previously disabled pipeline.
SELECT tide.relay_enable(p_name TEXT);
Fires pg_notify('tide_relay_config', name) to trigger hot-reload in the relay.
tide.relay_disable
Disable a pipeline (stops processing without deleting config).
SELECT tide.relay_disable(p_name TEXT);
tide.relay_delete
Permanently delete a pipeline configuration.
SELECT tide.relay_delete(p_name TEXT);
tide.relay_get_config
Retrieve the full configuration for a pipeline.
SELECT tide.relay_get_config(p_name TEXT) → JSONB
Returns the stored config JSONB for the named pipeline.
tide.relay_list_configs
List all configured relay pipelines.
SELECT tide.relay_list_configs() → JSONB
Returns a JSON array of all pipelines with their direction and enabled status:
[
{"name": "orders-to-nats", "direction": "outbox", "enabled": true},
{"name": "webhooks-in", "direction": "inbox", "enabled": true}
]
Consumer Groups API
Functions for managing consumer groups. All live in the tide schema.
tide.create_consumer_group
Create a named consumer group for an outbox.
SELECT tide.create_consumer_group(
p_name TEXT,
p_outbox TEXT,
p_auto_offset_reset TEXT DEFAULT 'earliest',
p_if_not_exists BOOLEAN DEFAULT false
);
| Parameter | Type | Default | Description |
|---|---|---|---|
| p_name | TEXT | (required) | Unique group name |
| p_outbox | TEXT | (required) | Outbox this group consumes from |
| p_auto_offset_reset | TEXT | 'earliest' | earliest, latest, or none |
| p_if_not_exists | BOOLEAN | false | Suppress error if already exists |
Errors:
- Outbox must exist
- Group name must be unique (unless p_if_not_exists = true)
- p_auto_offset_reset must be one of: earliest, latest, none
tide.drop_consumer_group
Drop a consumer group and all its offset/lease records.
SELECT tide.drop_consumer_group(
p_name TEXT,
p_if_exists BOOLEAN DEFAULT false
);
tide.commit_offset
Commit a consumer's processing position.
SELECT tide.commit_offset(
p_group TEXT,
p_consumer TEXT,
p_last_offset BIGINT
);
| Parameter | Type | Description |
|---|---|---|
| p_group | TEXT | Consumer group name |
| p_consumer | TEXT | Consumer identifier (e.g., relay instance ID) |
| p_last_offset | BIGINT | Last successfully processed message ID |
Behavior:
- Upserts into tide.tide_consumer_offsets
- Updates last_heartbeat to now()
- Releases any visibility lease held by this consumer
tide.consumer_heartbeat
Update the heartbeat timestamp for a consumer.
SELECT tide.consumer_heartbeat(
p_group TEXT,
p_consumer TEXT
);
Call this periodically (e.g., every 10 seconds) while processing to signal liveness.
Views
tide.consumer_lag
Per-consumer lag relative to the latest outbox message:
SELECT * FROM tide.consumer_lag;
| Column | Type | Description |
|---|---|---|
| group_name | TEXT | Consumer group |
| outbox_name | TEXT | Source outbox |
| consumer_id | TEXT | Consumer identifier |
| committed_offset | BIGINT | Last committed position |
| lag | BIGINT | Messages behind (max_id - committed_offset) |
| last_heartbeat | TIMESTAMPTZ | Last heartbeat time |
Catalog Tables
pg_tide stores all state in relational tables within the tide schema. This page documents the underlying tables that power the SQL API.
tide.tide_outbox_config
One row per named outbox.
| Column | Type | Description |
|---|---|---|
| outbox_name | TEXT (PK) | Unique outbox identifier |
| retention_hours | INT | Hours to retain consumed messages |
| inline_threshold | INT | Backpressure threshold |
| enabled | BOOLEAN | Whether publishing is allowed |
| created_at | TIMESTAMPTZ | Creation timestamp |
tide.tide_outbox_messages
Shared message store for all outboxes.
| Column | Type | Description |
|---|---|---|
| id | BIGINT (PK) | Auto-incrementing message ID |
| outbox_name | TEXT (FK) | Which outbox this belongs to |
| payload | JSONB | Message body |
| headers | JSONB | Message metadata |
| created_at | TIMESTAMPTZ | Publication time |
| consumed_at | TIMESTAMPTZ | When relay delivered it (NULL = pending) |
| consumer_group | TEXT | Which group consumed it |
Indexes:
- idx_tide_outbox_messages_pending — partial index on (outbox_name, id) WHERE consumed_at IS NULL
tide.tide_consumer_groups
Named consumer groups with offset reset policy.
| Column | Type | Description |
|---|---|---|
| group_name | TEXT (PK) | Unique group name |
| outbox_name | TEXT (FK) | Outbox being consumed |
| auto_offset_reset | TEXT | earliest, latest, or none |
| created_at | TIMESTAMPTZ | Creation timestamp |
tide.tide_consumer_offsets
Per-consumer committed offsets and heartbeats.
| Column | Type | Description |
|---|---|---|
| group_name | TEXT (PK, FK) | Consumer group |
| consumer_id | TEXT (PK) | Consumer instance identifier |
| committed_offset | BIGINT | Last processed message ID |
| last_heartbeat | TIMESTAMPTZ | Last liveness signal |
tide.tide_consumer_leases
Visibility leases for in-flight message batches.
| Column | Type | Description |
|---|---|---|
| group_name | TEXT (PK, FK) | Consumer group |
| consumer_id | TEXT (PK, FK) | Consumer instance |
| lease_start | BIGINT | First message ID in the leased batch |
| lease_end | BIGINT | Last message ID in the leased batch |
| expires_at | TIMESTAMPTZ | When the lease expires |
tide.tide_inbox_config
Named inbox configurations.
| Column | Type | Description |
|---|---|---|
| inbox_name | TEXT (PK) | Unique inbox identifier |
| inbox_schema | TEXT | Schema containing the inbox table |
| max_retries | INT | Attempts before DLQ |
| processed_retention_hours | INT | Hours to keep processed messages |
| dlq_retention_hours | INT | Hours to keep DLQ messages |
| created_at | TIMESTAMPTZ | Creation timestamp |
tide.relay_outbox_config
Forward relay pipeline definitions.
| Column | Type | Description |
|---|---|---|
| name | TEXT (PK) | Unique pipeline name |
| enabled | BOOLEAN | Whether the pipeline is active |
| config | JSONB | Full pipeline config (outbox, sink, params) |
Triggers: relay_outbox_config_notify — fires pg_notify('tide_relay_config', ...) on changes.
tide.relay_inbox_config
Reverse relay pipeline definitions.
| Column | Type | Description |
|---|---|---|
| name | TEXT (PK) | Unique pipeline name |
| enabled | BOOLEAN | Whether the pipeline is active |
| config | JSONB | Full pipeline config (inbox, source, params) |
Triggers: relay_inbox_config_notify — fires pg_notify('tide_relay_config', ...) on changes.
tide.relay_consumer_offsets
Durable per-pipeline offset tracking for the relay binary.
| Column | Type | Description |
|---|---|---|
| relay_group_id | TEXT (PK) | Relay deployment group |
| pipeline_id | TEXT (PK) | Pipeline name |
| last_offset | TEXT | Last processed offset |
| updated_at | TIMESTAMPTZ | Last update timestamp |
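For example, to inspect relay progress for one deployment group (illustrative):

SELECT pipeline_id, last_offset, updated_at
FROM tide.relay_consumer_offsets
WHERE relay_group_id = 'production';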
Relay Configuration
The pg-tide relay binary is configured through three layers, applied in order of increasing priority:
- Default values — sensible defaults built into the binary
- TOML config file — specified via --config or PG_TIDE_CONFIG
- CLI flags / environment variables — highest precedence
Pipeline configuration (outbox sources, sink destinations, batch sizes) lives in PostgreSQL — not in the TOML file. The relay loads pipeline config from the tide.relay_outbox_config and tide.relay_inbox_config catalog tables at startup and reloads dynamically via LISTEN/NOTIFY.
Quick Start
The only required parameter is the PostgreSQL connection URL:
pg-tide --postgres-url "postgres://user:pass@localhost:5432/mydb"
For production, use environment variables:
export PG_TIDE_POSTGRES_URL="postgres://relay:${DB_PASSWORD}@pghost:5432/app"
export PG_TIDE_METRICS_ADDR="0.0.0.0:9090"
export PG_TIDE_LOG_FORMAT="json"
export PG_TIDE_GROUP_ID="prod-relay"
pg-tide
TOML Configuration File
For complex setups, use a TOML file:
# relay.toml
postgres_url = "${ENV:DATABASE_URL}"
metrics_addr = "0.0.0.0:9090"
log_format = "json"
log_level = "info"
discovery_interval_secs = 30
default_batch_size = 100
relay_group_id = "production"
sink_max_inflight = 1000
pg-tide --config relay.toml
Configuration Reference
| Parameter | CLI Flag | Environment Variable | Default | Description |
|---|---|---|---|---|
| postgres_url | --postgres-url | PG_TIDE_POSTGRES_URL | (required) | PostgreSQL connection URL |
| metrics_addr | --metrics-addr | PG_TIDE_METRICS_ADDR | 0.0.0.0:9090 | Prometheus metrics + health endpoint bind address |
| log_format | --log-format | PG_TIDE_LOG_FORMAT | text | Log output format: text or json |
| log_level | --log-level | PG_TIDE_LOG_LEVEL | info | Log verbosity: error, warn, info, debug, trace |
| relay_group_id | --relay-group-id | PG_TIDE_GROUP_ID | default | Relay group identifier for advisory lock namespacing |
| discovery_interval_secs | — | — | 30 | Seconds between pipeline discovery polls |
| default_batch_size | — | — | 100 | Default messages per batch when not specified per-pipeline |
| sink_max_inflight | — | — | 1000 | Maximum in-flight messages before upstream polling pauses. 0 = unlimited |
| drain_timeout_secs | --drain-timeout | PG_TIDE_DRAIN_TIMEOUT | 30 | Seconds to wait for in-flight messages to drain on SIGTERM |
| — | --config | PG_TIDE_CONFIG | — | Path to TOML config file |
Environment Variable Substitution
Connection strings in TOML files support ${ENV:VAR_NAME} substitution:
postgres_url = "postgres://${ENV:DB_USER}:${ENV:DB_PASSWORD}@${ENV:DB_HOST}:5432/${ENV:DB_NAME}"
This resolves at relay startup time using the process environment. Unknown variables are left as-is (the relay will report a connection error rather than silently using a broken URL).
Relay Group ID
The relay_group_id parameter is critical for multi-deployment setups. It controls:
- Advisory lock namespacing — each relay instance acquires a PostgreSQL advisory lock scoped to its group ID + pipeline name. Only one relay per group can own a given pipeline.
- Consumer group offset tracking — progress is tracked per relay group, allowing multiple independent relay deployments to process the same outbox.
Single deployment (default):
pg-tide --relay-group-id "default"
Multi-region deployment:
# US region — processes orders outbox → US NATS
pg-tide --relay-group-id "us-east" --postgres-url "..."
# EU region — processes same orders outbox → EU NATS
pg-tide --relay-group-id "eu-west" --postgres-url "..."
Each group tracks its own offsets independently — the EU relay won't skip messages just because the US relay already processed them.
Pipeline Configuration (in PostgreSQL)
Pipelines are configured via SQL, not via the relay's TOML/CLI config. The relay discovers pipelines from two catalog tables:
Forward Pipelines (Outbox → Sink)
SELECT tide.relay_set_outbox(
p_name := 'orders-nats', -- Pipeline name (unique)
p_outbox := 'orders', -- Source outbox name
p_sink := 'nats', -- Sink type
p_config := '{
"url": "nats://localhost:4222",
"subject": "orders.{event_type}"
}'::jsonb
);
Reverse Pipelines (Source → Inbox)
SELECT tide.relay_set_inbox(
p_name := 'nats-orders-inbox', -- Pipeline name (unique)
p_inbox := 'order_events', -- Target inbox name
p_source := 'nats', -- Source type
p_config := '{
"url": "nats://localhost:4222",
"subject": "orders.>",
"consumer_name": "pg-tide-inbox"
}'::jsonb
);
Enabling / Disabling Pipelines
-- Disable a pipeline (relay will stop it on next discovery cycle)
SELECT tide.relay_disable('orders-nats');
-- Re-enable
SELECT tide.relay_enable('orders-nats');
Pipeline changes are picked up via:
- LISTEN/NOTIFY — immediate reaction to config changes
- Periodic polling — every discovery_interval_secs seconds, as a fallback
Hot Reload
The relay watches for NOTIFY signals on the tide_relay_config channel. When you modify a pipeline via tide.relay_set_outbox() or tide.relay_set_inbox(), the trigger fires a notification and the relay reloads within seconds — no restart required.
If LISTEN is interrupted (connection blip), the periodic discovery poll acts as a safety net.
High Availability
Run multiple relay instances with the same relay_group_id:
# Instance 1
pg-tide --relay-group-id "prod" --postgres-url "..."
# Instance 2 (standby — takes over if instance 1 dies)
pg-tide --relay-group-id "prod" --postgres-url "..."
Advisory locks ensure only one instance owns each pipeline at a time. If the owning instance dies, its locks are released and another instance acquires them on the next discovery cycle.
Backpressure
The sink_max_inflight parameter controls backpressure behavior:
- When the number of in-flight (unacknowledged) messages reaches this limit, the relay pauses polling from the outbox
- Once the sink acknowledges enough messages to drop below the threshold, polling resumes
- Set to 0 to disable backpressure (not recommended for production)
This prevents the relay from overwhelming a slow sink while still allowing high throughput for fast sinks.
Graceful Shutdown
On SIGTERM:
- The relay stops accepting new pipeline ownership
- Active pipelines finish their current batch
- If batches don't complete within --drain-timeout seconds (default: 30), the relay exits
- Unfinished messages will be redelivered when the relay restarts (at-least-once guarantee)
Example: Production Configuration
# /etc/pg-tide/relay.toml
postgres_url = "${ENV:DATABASE_URL}"
metrics_addr = "0.0.0.0:9090"
log_format = "json"
log_level = "info"
discovery_interval_secs = 10
default_batch_size = 500
relay_group_id = "production"
sink_max_inflight = 5000
# systemd unit or container entrypoint
PG_TIDE_DRAIN_TIMEOUT=60 pg-tide --config /etc/pg-tide/relay.toml
Catalog vs. TOML: Configuration Hierarchy
pg-tide has two places where configuration lives. This page explains which is the single source of truth and how to use each correctly.
The Rule: Catalog Is Primary, TOML Is Process Config
| What | Where | Source of truth |
|---|---|---|
| Which pipelines exist | tide.relay_outbox_config / tide.relay_inbox_config | Catalog (SQL) |
| Pipeline direction (forward / reverse) | Catalog | Catalog (SQL) |
| Sink type, sink config, batch size | Catalog config JSONB column | Catalog (SQL) |
| Wire format, DLQ settings | Catalog config JSONB column | Catalog (SQL) |
| PostgreSQL credentials | --postgres-url-file / PG_TIDE_POSTGRES_URL | TOML / env var |
| Relay group identity | relay_group_id TOML key | TOML / env var |
| Resource limits (max_owned_pipelines, max_connections) | TOML / CLI flags | TOML / CLI |
| Logging format and level | TOML / CLI flags | TOML / CLI |
| Metrics address | TOML / CLI flags | TOML / CLI |
Managing Pipelines via SQL (Recommended)
All pipeline configuration should be managed through the tide schema
SQL functions:
-- Create or update a forward pipeline (outbox → NATS):
SELECT tide.relay_set_outbox_v2('{
"name": "orders-to-nats",
"outbox": "orders",
"sink_type": "nats",
"config": {
"url": "nats://nats.svc.cluster.local:4222",
"subject": "orders.{event_type}"
},
"batch_size": 50,
"enabled": true
}'::jsonb);
-- Disable a pipeline without deleting it:
SELECT tide.relay_disable('orders-to-nats');
-- Re-enable:
SELECT tide.relay_enable('orders-to-nats');
-- List all configured pipelines:
SELECT name, 'forward' AS direction, enabled, config->>'sink_type' AS sink
FROM tide.relay_outbox_config
UNION ALL
SELECT name, 'reverse', enabled, config->>'source_type'
FROM tide.relay_inbox_config;
Changes take effect within one second thanks to the LISTEN/NOTIFY hot-reload
introduced in v0.18.0. There is no need to restart the relay.
TOML File — Process Configuration Only
The TOML file (default: /etc/pg-tide/pg-tide.toml) configures the relay
process, not the pipelines. It should contain only:
postgres_url = "..." # or use --postgres-url-file
relay_group_id = "prod"
max_owned_pipelines = 50
max_connections = 100
discovery_interval_secs = 30
default_batch_size = 100
metrics_addr = "0.0.0.0:9090"
log_level = "info"
log_format = "json"
drain_timeout_secs = 30
A fully commented example is baked into the Docker image at
/etc/pg-tide/pg-tide.example.toml. Copy it as a starting point:
docker cp pg-tide:/etc/pg-tide/pg-tide.example.toml ./pg-tide.toml
Startup Warning: TOML-Only Pipelines
If the TOML file configures a pipeline that is not present in the
catalog (e.g. via a legacy [pipelines.*] section), the relay emits a
WARN-level log entry at startup:
WARN pipeline "orders-to-nats" defined in TOML but not found in catalog — ignoring
The expected resolution is to create the pipeline via SQL using
tide.relay_set_outbox_v2() or tide.relay_set_inbox_v2(), then remove
the TOML definition.
Secret Interpolation
Sensitive values (passwords, API keys) should not be stored in the
catalog JSONB config in plain text. Instead, use the ${ENV:VAR_NAME} or
${FILE:/path/to/secret} interpolation syntax in the config JSON:
SELECT tide.relay_set_outbox_v2('{
"name": "orders-to-kafka",
"outbox": "orders",
"sink_type": "kafka",
"config": {
"brokers": "kafka.svc:9092",
"topic": "orders",
"sasl_password": "${ENV:KAFKA_SASL_PASSWORD}"
}
}'::jsonb);
The relay resolves ${ENV:...} tokens at runtime, keeping the secret out of
the database entirely.
CLI Reference
The pg-tide binary is both the relay daemon and an operational toolkit. Run it
without a subcommand to start the relay; use a subcommand for diagnostics,
maintenance, and introspection.
Usage
pg-tide [OPTIONS] [COMMAND]
When COMMAND is omitted, the relay daemon starts. All subcommands are
short-lived and exit after completing their task.
Global Options
These flags apply to both daemon mode and all subcommands.
| Flag | Env | Default | Description |
|---|---|---|---|
| --postgres-url <URL> | PG_TIDE_POSTGRES_URL | — | PostgreSQL connection URL |
| --metrics-addr <ADDR> | PG_TIDE_METRICS_ADDR | 0.0.0.0:9090 | Prometheus metrics + health endpoint |
| --log-format <FORMAT> | PG_TIDE_LOG_FORMAT | text | text or json |
| --log-level <LEVEL> | PG_TIDE_LOG_LEVEL | info | error, warn, info, debug, trace |
| --relay-group-id <ID> | PG_TIDE_GROUP_ID | default | Advisory lock namespace; use one value per deployment group |
| --config <PATH> | PG_TIDE_CONFIG | — | Path to TOML config file; CLI flags override file values |
| --drain-timeout <SECS> | PG_TIDE_DRAIN_TIMEOUT | 30 | Seconds to wait for in-flight messages to drain on SIGTERM |
| --max-pipelines <N> | PG_TIDE_MAX_PIPELINES | 50 | Maximum concurrent pipeline workers (each holds one PG connection) |
| --max-connections <N> | PG_TIDE_MAX_CONNECTIONS | 52 | Coordinator connection pool size |
Daemon Mode
pg-tide --postgres-url "postgres://relay:secret@db.internal:5432/app"
Starts the relay daemon. All pipeline configuration is loaded from PostgreSQL
and hot-reloads on SIGHUP without restart. See Configuration
for the TOML file format and pipeline schema.
HTTP Endpoints
| Endpoint | Description |
|---|---|
| GET /metrics | Prometheus metrics in text exposition format |
| GET /health | 200 OK when healthy, 503 when unhealthy |
Signals
| Signal | Behavior |
|---|---|
| SIGTERM / SIGINT | Graceful shutdown: drain in-flight messages, release advisory locks, exit |
| SIGHUP | Hot-reload pipeline configuration from PostgreSQL without downtime |
Subcommands
doctor
Validates PostgreSQL connectivity, schema version, and catalog health.
pg-tide doctor [--postgres-url <URL>]
Checks performed:
- TCP connectivity and TLS handshake to PostgreSQL
- Existence of the tide schema
- Presence of all required catalog tables
- Presence of the relay_consumer_offsets.last_change_id column (v0.12.0+ migration marker)
- Presence of the tide.outbox_truncate_delivered() function (v0.15.0+)
- Count of configured forward and reverse pipelines
Exit codes: 0 = all checks passed, 1 = one or more failures.
Example output:
pg-tide doctor v0.16.0
Connecting to PostgreSQL...
[OK] Connected to PostgreSQL
[OK] Schema 'tide' exists
[OK] Table tide.tide_outbox_config
[OK] Table tide.relay_outbox_config
[OK] relay_consumer_offsets.last_change_id column present
[OK] tide.outbox_truncate_delivered() present (v0.15.0+)
[INFO] 3 forward pipeline(s), 1 reverse pipeline(s) configured
pg-tide doctor: all checks passed.
Typical use: health checks in CI, post-deploy validation, or a one-shot Kubernetes Job before rollout.
validate-config
Dry-runs source and sink factory construction for a named pipeline without processing any messages.
pg-tide validate-config --pipeline <NAME> [--postgres-url <URL>]
What it does:
- Loads the pipeline config from tide.relay_outbox_config or tide.relay_inbox_config
- Resolves all ${ENV:VAR} secret placeholders
- Constructs the source implementation (e.g., outbox poller, Kafka consumer)
- Constructs the sink implementation (e.g., Kafka producer, HTTP webhook)
- Reports success or the first construction failure
No messages are read or published. Exit 0 = config is valid, 1 = failure.
Example:
pg-tide validate-config \
--pipeline orders-kafka \
--postgres-url "$DATABASE_URL"
pg-tide validate-config — pipeline: orders-kafka
[OK] Secrets resolved
[OK] Source 'outbox:orders' instantiated
[OK] Sink 'kafka:orders.events' instantiated
validate-config: pipeline 'orders-kafka' configuration is valid.
Typical use: pre-flight check before enabling a new pipeline; CI step after updating sink credentials.
status
Prints a summary table of all configured relay pipelines.
pg-tide status [--postgres-url <URL>]
Columns:
| Column | Description |
|---|---|
| PIPELINE | Pipeline name |
| DIRECTION | forward (outbox → sink) or reverse (source → inbox) |
| ENABLED | Whether the pipeline is enabled in the catalog |
| LAST_OFFSET | Last committed change ID (0 if never consumed) |
| CONSUMER_LAG | Unconsumed outbox messages (forward pipelines only) |
Example:
PIPELINE DIRECTION ENABLED LAST_OFFSET CONSUMER_LAG
--------------------------------------------------------------------------------
orders-kafka forward yes 1842731 0
payments-kafka forward yes 998201 3
webhooks-incoming reverse yes 0 0
audit-log forward no 0 0
4 pipeline(s) configured.
Consumer lag is queried from the outbox table at snapshot time; it does not reflect in-flight messages being processed by a running relay.
sweep
Deletes consumed outbox messages that are past their retention window.
pg-tide sweep [--outbox <NAME>] [--postgres-url <URL>]
Calls tide.outbox_truncate_delivered() for each outbox. When --outbox is
omitted, all outboxes are swept. Run this on a schedule to prevent unbounded
growth of the tide_outbox_messages table.
Example:
# Sweep all outboxes
pg-tide sweep --postgres-url "$DATABASE_URL"
# Sweep a single outbox
pg-tide sweep --outbox orders --postgres-url "$DATABASE_URL"
pg-tide sweep v0.16.0
[OK] Swept outbox 'orders': 12408 rows deleted
[OK] Swept outbox 'payments': 4891 rows deleted
pg-tide sweep: 17299 total row(s) deleted from 2 outbox(es).
Typical use: cron job or Kubernetes CronJob running every hour.
# Kubernetes CronJob
schedule: "0 * * * *"
command: ["pg-tide", "sweep", "--postgres-url", "$(DATABASE_URL)"]
replay
Replay workbench for inspecting, debugging, and recovering from delivery failures. All replay subcommands are read-only or operate only on DLQ metadata — they never advance consumer offsets.
replay preview
Print outbox messages in an ID range as JSONL without consuming them.
pg-tide replay preview \
--outbox <NAME> \
[--from-id <ID>] \
[--to-id <ID>] \
[--limit <N>] \
[--postgres-url <URL>]
| Flag | Default | Description |
|---|---|---|
| --outbox | (required) | Outbox name to preview |
| --from-id | 0 | Start of ID range (inclusive) |
| --to-id | i64::MAX | End of ID range (inclusive) |
| --limit | 100 | Maximum rows to return |
Output is JSONL on stdout; progress is printed to stderr.
pg-tide replay preview --outbox orders --from-id 1840000 --limit 5
{"id":1840001,"outbox_name":"orders","payload":{"order_id":42},"headers":{},"created_at":"2026-05-07T10:00:01Z","consumed":true}
{"id":1840002,"outbox_name":"orders","payload":{"order_id":43},"headers":{},"created_at":"2026-05-07T10:00:02Z","consumed":false}
replay dry-run
Evaluate a pipeline's transforms against a sample of outbox messages and print the resulting envelopes to stdout — without publishing anything.
pg-tide replay dry-run \
--pipeline <NAME> \
[--from-id <ID>] \
[--to-id <ID>] \
[--limit <N>] \
[--postgres-url <URL>]
Useful for verifying JMESPath transform expressions and wire format output before enabling a pipeline.
pg-tide replay dry-run --pipeline orders-kafka --limit 3
Dry-run transform evaluation for pipeline 'orders-kafka' (3 message(s)):
{"outbox_id":1840001,"event_id":"uuid-...","op":"c","payload":{...}}
{"outbox_id":1840002,"event_id":"uuid-...","op":"u","payload":{...}}
[SKIP] id=1840003 (tombstone or filtered)
replay dlq-resolve
Mark a DLQ entry as resolved (closed without requeue).
pg-tide replay dlq-resolve \
--pipeline <NAME> \
--dedup-key <KEY> \
[--postgres-url <URL>]
Sets resolved = true on the DLQ row. The message will not be retried.
Use when the failure is expected or the downstream system has been manually
updated.
replay dlq-requeue
Requeue a DLQ entry for another relay attempt.
pg-tide replay dlq-requeue \
--pipeline <NAME> \
--dedup-key <KEY> \
[--postgres-url <URL>]
Marks the current DLQ entry resolved and inserts a fresh pending entry with
attempt_count = 0. The running relay will pick it up on the next cycle.
asyncapi export
Generate an AsyncAPI 3.0 document from relay catalog metadata.
pg-tide asyncapi export \
[--format yaml|json] \
[--output <PATH>] \
[--postgres-url <URL>]
| Flag | Default | Description |
|---|---|---|
| `--format` | yaml | Output format: yaml or json |
| `--output` | stdout | File path to write the document; omit to print to stdout |
Reads all configured outbox and inbox pipelines from PostgreSQL and emits an AsyncAPI 3.0 document describing each pipeline as a named channel, operation, and message schema. Useful for API documentation, consumer contract testing with Microcks, and downstream code generation.
pg-tide asyncapi export \
--format yaml \
--output relay-asyncapi.yaml \
--postgres-url "$DATABASE_URL"
See Microcks Integration for a complete guide on using the exported spec for consumer contract testing.
Daemon Startup Examples
Minimal
pg-tide --postgres-url "postgres://user:pass@localhost:5432/mydb"
Production
pg-tide \
--postgres-url "postgres://relay:secret@db.internal:5432/app" \
--log-format json \
--log-level info \
--relay-group-id production \
--metrics-addr 0.0.0.0:9090 \
--drain-timeout 60
From config file
pg-tide --config /etc/pg-tide/relay.toml
Docker
docker run \
-e PG_TIDE_POSTGRES_URL="postgres://..." \
-p 9090:9090 \
ghcr.io/trickle-labs/pg-tide:latest
Backends
The pg-tide relay supports multiple messaging backends as both sinks (forward mode: outbox → external system) and sources (reverse mode: external system → inbox). This page covers all available backends with their configuration, use cases, and operational guidance.
Choosing a Backend
Different backends suit different architectural needs. Use this decision matrix to pick the right one:
| Backend | Best for | Latency | Durability | Ordering | Throughput |
|---|---|---|---|---|---|
| NATS | Low-latency microservice communication, pub/sub | ~1ms | With JetStream | Per-subject | Very high |
| Kafka | High-throughput event streaming, analytics pipelines | ~5ms | Strong | Per-partition | Extremely high |
| Redis Streams | Lightweight streaming, existing Redis infrastructure | ~1ms | Configurable (AOF/RDB) | Per-stream | High |
| RabbitMQ | Complex routing, work queues, existing AMQP infrastructure | ~2ms | Per-message (persistent) | Per-queue | Moderate |
| SQS | AWS-native, serverless consumers, managed infrastructure | ~20ms | Extremely high | Best-effort (FIFO available) | Moderate |
| Webhook | Push notifications, third-party integrations, serverless endpoints | ~50-500ms | Depends on receiver | Per-delivery | Low-moderate |
Quick recommendations
- Starting out / prototyping: NATS (default, zero configuration, fast)
- Enterprise data pipelines: Kafka (strongest durability and ordering guarantees)
- AWS-native infrastructure: SQS (fully managed, no servers to operate)
- Existing Redis stack: Redis Streams (reuse your Redis deployment)
- Third-party integrations: Webhook (push to any HTTP endpoint)
- Legacy AMQP systems: RabbitMQ (rich routing, mature ecosystem)
Feature Flags
Backends are feature-gated at compile time. Only enabled backends are compiled into the relay binary:
| Backend | Cargo Feature | Enabled by default |
|---|---|---|
| NATS | nats | ✓ |
| Kafka | kafka | ✗ |
| Redis | redis | ✗ |
| RabbitMQ | rabbitmq | ✗ |
| SQS | sqs | ✗ |
| Webhook | webhook | ✓ |
| stdout | stdout | ✓ |
To build with specific backends:
# Only NATS and Kafka
cargo build --package pg-tide-relay --features "nats,kafka"
# All backends
cargo build --package pg-tide-relay --all-features
The official Docker image and GitHub release binaries include all backends.
NATS
NATS is the default and recommended backend for pg_tide. It provides extremely low-latency publish/subscribe messaging with optional JetStream durability. NATS is lightweight (single binary, no JVM), supports wildcards, and handles millions of messages per second.
When to use NATS: Real-time microservice communication, event fan-out, lightweight pub/sub where you want simplicity and speed. NATS is the "batteries included" choice for most pg_tide deployments.
Forward (Outbox → NATS)
SELECT tide.relay_set_outbox('orders-nats', 'orders', 'nats',
jsonb_build_object(
'url', 'nats://localhost:4222',
'subject', 'orders.events'
)
);
Configuration
| Key | Required | Default | Description |
|---|---|---|---|
| `url` | Yes | — | NATS server URL (e.g., `nats://localhost:4222`) |
| `subject` | Yes | — | Subject to publish to. Supports template variables. |
| `credentials` | No | — | Path to NATS credentials file (`.creds`) for authentication |
Subject templates
The subject supports variable substitution from the message headers, allowing dynamic routing without relay-side logic:
- `{outbox_name}` — source outbox name
- `{event_type}` — value of the `event_type` key in the message headers
- `{outbox_id}` — the message ID (numeric)
Example: "orders.{event_type}" with a message that has "event_type": "order.created" in its headers will publish to subject "orders.order.created".
This is powerful for fan-out patterns: a single outbox can route events to many different NATS subjects based on their type, and downstream services can subscribe to only the events they care about using NATS wildcards.
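For example, a minimal sketch of a fan-out pipeline using a templated subject (pipeline and subject names are illustrative):
-- Route each event to a subject derived from its event_type header
SELECT tide.relay_set_outbox('orders-fanout', 'orders', 'nats',
  jsonb_build_object(
    'url', 'nats://localhost:4222',
    'subject', 'orders.{event_type}'
  )
);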
Reverse (NATS → Inbox)
SELECT tide.relay_set_inbox('nats-to-inbox', 'incoming-events',
jsonb_build_object(
'url', 'nats://localhost:4222',
'subject', 'external.events.>'
),
p_source := 'nats'
);
Configuration
| Key | Required | Default | Description |
|---|---|---|---|
| `url` | Yes | — | NATS server URL |
| `subject` | Yes | — | Subject to subscribe to (NATS wildcards `*` and `>` supported) |
| `queue_group` | No | — | Queue group for load balancing across multiple relay instances |
| `credentials` | No | — | Path to NATS credentials file |
The queue_group option enables NATS queue subscriptions: if multiple relay instances subscribe to the same subject with the same queue group, NATS distributes messages among them (each message goes to one subscriber). This is useful for horizontal scaling of reverse pipelines.
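A minimal sketch of a load-balanced reverse pipeline; the queue group name is illustrative. Run two relay instances with this config and NATS splits the subject's traffic between them:
SELECT tide.relay_set_inbox('nats-to-inbox', 'incoming-events',
  jsonb_build_object(
    'url', 'nats://localhost:4222',
    'subject', 'external.events.>',
    'queue_group', 'tide-relays'
  ),
  p_source := 'nats'
);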
NATS operational notes
- NATS Core (without JetStream) provides at-most-once delivery — if no subscriber is active, messages are lost. For durable delivery, enable JetStream on your NATS server.
- The relay handles NATS reconnection automatically with exponential backoff.
- For multi-server NATS clusters, provide any single server URL — the NATS client discovers other servers automatically.
Kafka
Apache Kafka is the industry standard for high-throughput event streaming. If you're building data pipelines, event-driven architectures at scale, or need strong ordering guarantees with long-term retention, Kafka is the natural choice.
When to use Kafka: High-volume event streaming (>10K events/sec), data pipelines feeding analytics/ML systems, scenarios requiring strong ordering guarantees, or when Kafka is already part of your infrastructure.
Forward (Outbox → Kafka)
SELECT tide.relay_set_outbox('events-kafka', 'events', 'kafka',
jsonb_build_object(
'brokers', 'broker1:9092,broker2:9092,broker3:9092',
'topic', 'app-events',
'acks', 'all',
'compression', 'snappy',
'key', '{event_type}'
)
);
Configuration
| Key | Required | Default | Description |
|---|---|---|---|
| `brokers` | Yes | — | Comma-separated list of Kafka broker addresses |
| `topic` | Yes | — | Target Kafka topic |
| `key` | No | — | Message key template (determines partition assignment). Supports `{outbox_name}`, `{event_type}`, etc. |
| `acks` | No | all | Acknowledgment level: `0` (fire-and-forget), `1` (leader only), `all` (all in-sync replicas) |
| `compression` | No | none | Compression: `none`, `gzip`, `snappy`, `lz4`, `zstd` |
Key (partition) strategy: The `key` field determines which Kafka partition each message goes to. Messages with the same key are guaranteed to land in the same partition, preserving ordering. Common patterns:
- `{event_type}` — all events of the same type go to the same partition
- `{outbox_name}` — partition by source outbox
- No key — round-robin distribution across partitions (best throughput, no ordering)
Reverse (Kafka → Inbox)
SELECT tide.relay_set_inbox('kafka-to-inbox', 'kafka-events',
jsonb_build_object(
'brokers', 'broker1:9092,broker2:9092',
'topic', 'external-events',
'group_id', 'pg-tide-consumer'
),
p_source := 'kafka'
);
Configuration
| Key | Required | Default | Description |
|---|---|---|---|
| `brokers` | Yes | — | Comma-separated broker list |
| `topic` | Yes | — | Kafka topic to consume from |
| `group_id` | Yes | — | Kafka consumer group ID (for offset tracking within Kafka) |
| `auto_offset_reset` | No | earliest | Where to start if no Kafka offset exists: `earliest` or `latest` |
Kafka operational notes
- Building with Kafka support requires `librdkafka` (or the bundled cmake build via the `rdkafka/cmake-build` feature).
- Use `acks=all` for durability in production — it ensures the message is replicated before acknowledgment.
- For high throughput, use `snappy` or `lz4` compression and increase `p_batch_size` to 200-500.
- The relay commits Kafka consumer offsets back to Kafka (for reverse pipelines) in addition to tracking them in pg_tide's own offset table.
Redis Streams
Redis Streams provide lightweight event streaming with consumer groups built on top of Redis. If you already run Redis and need simple, fast event delivery without the operational overhead of Kafka, Redis Streams is an excellent choice.
When to use Redis: You already have Redis in your stack, you need low-latency delivery, your event volume is moderate (<50K/sec), or you want minimal additional infrastructure.
Forward (Outbox → Redis Stream)
SELECT tide.relay_set_outbox('events-redis', 'events', 'redis',
jsonb_build_object(
'url', 'redis://localhost:6379',
'stream', 'app:events',
'maxlen', 100000
)
);
Configuration
| Key | Required | Default | Description |
|---|---|---|---|
| `url` | Yes | — | Redis connection URL (e.g., `redis://localhost:6379`) |
| `stream` | Yes | — | Redis stream key name |
| `maxlen` | No | — | Maximum stream length. Older entries are trimmed automatically (Redis `MAXLEN`). |
Reverse (Redis Stream → Inbox)
SELECT tide.relay_set_inbox('redis-to-inbox', 'redis-events',
jsonb_build_object(
'url', 'redis://localhost:6379',
'stream', 'external:events',
'group', 'pg-tide',
'consumer', 'relay-0'
),
p_source := 'redis'
);
Configuration
| Key | Required | Default | Description |
|---|---|---|---|
| `url` | Yes | — | Redis connection URL |
| `stream` | Yes | — | Stream key to read from |
| `group` | Yes | — | Redis consumer group name |
| `consumer` | Yes | — | Consumer name within the group (should be unique per relay instance) |
Redis operational notes
- Redis Streams durability depends on your Redis persistence configuration (AOF, RDB, or none). For production event delivery, enable AOF with `appendfsync everysec`.
- Use `maxlen` to prevent unbounded stream growth. Redis will trim the oldest entries when the limit is reached.
- For Redis Cluster, the relay connects to the cluster and routes to the correct shard based on the stream key.
RabbitMQ
RabbitMQ provides rich message routing through exchanges, bindings, and queues. It's ideal when you need complex routing patterns (topic-based, header-based, fanout) or when you're integrating with existing AMQP infrastructure.
When to use RabbitMQ: Complex routing requirements, existing AMQP infrastructure, work queue patterns where messages should be processed by exactly one consumer, or when you need message priority and TTL features.
Forward (Outbox → RabbitMQ)
SELECT tide.relay_set_outbox('events-rabbit', 'events', 'rabbitmq',
jsonb_build_object(
'url', 'amqp://user:pass@localhost:5672/%2f',
'exchange', 'app.events',
'routing_key', 'orders.created',
'exchange_type', 'topic',
'durable', true
)
);
Configuration
| Key | Required | Default | Description |
|---|---|---|---|
| `url` | Yes | — | AMQP connection URL |
| `exchange` | Yes | — | Exchange to publish to |
| `routing_key` | No | "" | Routing key for the exchange. Supports template variables like `{event_type}`. |
| `exchange_type` | No | topic | Exchange type: `direct`, `topic`, `fanout`, `headers` |
| `durable` | No | true | Whether messages should be persisted to disk |
Reverse (RabbitMQ → Inbox)
SELECT tide.relay_set_inbox('rabbit-to-inbox', 'amqp-events',
jsonb_build_object(
'url', 'amqp://user:pass@localhost:5672/%2f',
'queue', 'incoming-events',
'prefetch', 20
),
p_source := 'rabbitmq'
);
Configuration
| Key | Required | Default | Description |
|---|---|---|---|
| `url` | Yes | — | AMQP connection URL |
| `queue` | Yes | — | Queue to consume from |
| `prefetch` | No | 10 | How many messages to prefetch (controls parallelism and memory) |
RabbitMQ operational notes
- For the URL, use `%2f` for the default vhost (`/`): `amqp://user:pass@host:5672/%2f`
- The relay declares the exchange if it doesn't exist (for forward mode). For reverse mode, the queue must already exist.
- Use `durable=true` in production to survive broker restarts.
- RabbitMQ's topic exchange routing keys support wildcards: `orders.*` matches `orders.created`, `orders.shipped`, etc.
SQS
Amazon SQS is a fully managed message queue service. It requires zero infrastructure management and integrates seamlessly with the AWS ecosystem (Lambda, ECS, Step Functions).
When to use SQS: AWS-native infrastructure, serverless consumers (Lambda), when you want zero queue management overhead, or when your team already uses AWS services extensively.
Forward (Outbox → SQS)
SELECT tide.relay_set_outbox('events-sqs', 'events', 'sqs',
jsonb_build_object(
'queue_url', 'https://sqs.us-east-1.amazonaws.com/123456789012/my-queue',
'region', 'us-east-1'
)
);
Configuration
| Key | Required | Default | Description |
|---|---|---|---|
| `queue_url` | Yes | — | Full SQS queue URL |
| `region` | Yes | — | AWS region |
| `message_group_id` | No | — | Required for FIFO queues. Messages with the same group ID are delivered in order. |
| `delay_seconds` | No | 0 | Delivery delay (0-900 seconds) |
Reverse (SQS → Inbox)
SELECT tide.relay_set_inbox('sqs-to-inbox', 'sqs-events',
jsonb_build_object(
'queue_url', 'https://sqs.us-east-1.amazonaws.com/123456789012/incoming',
'region', 'us-east-1',
'wait_time_seconds', 20,
'max_messages', 10
),
p_source := 'sqs'
);
Configuration
| Key | Required | Default | Description |
|---|---|---|---|
| `queue_url` | Yes | — | Full SQS queue URL |
| `region` | Yes | — | AWS region |
| `wait_time_seconds` | No | 20 | Long-poll wait time (reduces API calls, max 20s) |
| `max_messages` | No | 10 | Max messages per receive call (1-10) |
SQS authentication
The relay uses the standard AWS credential chain (in priority order):
- Environment variables: `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`
- AWS config/credentials files: `~/.aws/credentials`
- IAM instance role: Automatic on EC2/ECS/Lambda
- EKS IRSA: Via web identity token (recommended for Kubernetes)
For EKS deployments, use IAM Roles for Service Accounts (IRSA) to avoid managing credentials directly.
SQS operational notes
- Use FIFO queues when ordering matters. Set `message_group_id` to group related messages (e.g., by customer ID or order ID).
- Standard queues provide higher throughput but best-effort ordering.
- Long polling (`wait_time_seconds: 20`) reduces costs by minimizing empty receives.
- SQS has a 256 KB message size limit. For larger payloads, consider the claim-check pattern: store the payload in S3 and put a reference in the SQS message (see the sketch below).
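A minimal sketch of the claim-check publish side. The bucket, key, and header values are illustrative; your application uploads the large payload to S3 first and publishes only a reference:
-- Publish a small reference instead of the oversized payload
SELECT tide.outbox_publish('events',
  jsonb_build_object(
    'payload_location', 's3://my-bucket/payloads/evt-123.json',
    'size_bytes', 1048576
  ),
  '{"event_type": "report.generated"}'::jsonb
);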
Webhook
HTTP webhooks are the universal integration mechanism — any system with an HTTP endpoint can receive events from pg_tide, and any system that can send HTTP requests can push events into a pg_tide inbox.
When to use Webhooks: Third-party integrations (Stripe, Twilio, GitHub), push notifications to serverless functions, integrating with systems that don't support native messaging protocols, or receiving events from external services.
Forward (Outbox → HTTP Webhook)
Delivers outbox messages as HTTP POST requests to a configured URL:
SELECT tide.relay_set_outbox('events-webhook', 'events', 'webhook',
jsonb_build_object(
'url', 'https://api.example.com/webhooks/events',
'timeout_ms', 5000,
'headers', '{"Authorization": "Bearer token123", "X-Source": "pg-tide"}'
)
);
Configuration
| Key | Required | Default | Description |
|---|---|---|---|
| `url` | Yes | — | Webhook endpoint URL (HTTPS recommended) |
| `timeout_ms` | No | 30000 | Request timeout in milliseconds |
| `headers` | No | {} | Additional HTTP headers as a JSON object |
| `method` | No | POST | HTTP method |
| `retry_codes` | No | [429, 500, 502, 503, 504] | HTTP status codes that trigger retry |
Request format
The relay sends each message as a JSON POST request:
POST /webhooks/events HTTP/1.1
Content-Type: application/json
X-PgTide-Dedup-Key: orders:42:0
X-PgTide-Event-Type: order.created
Authorization: Bearer token123
{"order_id": 42, "total": 99.99}
The `X-PgTide-Dedup-Key` header allows the receiver to implement idempotency. The `X-PgTide-Event-Type` header carries the event type from the outbox message headers.
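On the receiving side, idempotency can be as simple as recording each dedup key once. A minimal sketch in plain SQL; the `webhook_deliveries` table is hypothetical receiver-side code, not part of pg_tide:
-- Record each dedup key once; a conflict means this delivery is a retry
CREATE TABLE IF NOT EXISTS webhook_deliveries (
  dedup_key   text PRIMARY KEY,
  received_at timestamptz NOT NULL DEFAULT now()
);

INSERT INTO webhook_deliveries (dedup_key)
VALUES ('orders:42:0')  -- value of the X-PgTide-Dedup-Key header
ON CONFLICT (dedup_key) DO NOTHING;
-- If no row was inserted, the delivery was already handled: skip processing.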
Reverse (HTTP Webhook → Inbox)
Exposes an HTTP endpoint that accepts incoming webhook deliveries and writes them to an inbox:
SELECT tide.relay_set_inbox('webhook-receiver', 'incoming-hooks',
jsonb_build_object(
'port', 8080,
'path', '/webhooks/incoming',
'auth_header', 'Bearer whsec_your_secret'
),
p_source := 'webhook'
);
Configuration
| Key | Required | Default | Description |
|---|---|---|---|
| `port` | No | 8080 | Port for the HTTP listener |
| `path` | No | / | URL path to accept requests on |
| `auth_header` | No | — | Expected Authorization header value. Requests without this are rejected with 401. |
Dedup key extraction
For incoming webhooks, the relay extracts a dedup key from these headers (in priority order):
- `X-Request-ID`
- `X-Idempotency-Key`
- `X-Webhook-ID`
- Auto-generated UUID (fallback — use only if the sender doesn't provide idempotency keys)
The extracted key becomes the `event_id` in the inbox table, enabling deduplication of retried webhook deliveries.
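A minimal sketch for checking whether a given delivery already landed, assuming the `tide."<name>_inbox"` table layout shown in the Dead-Letter Queue section:
SELECT event_id, received_at, processed_at
FROM tide."incoming-hooks_inbox"
WHERE event_id = 'req-abc-123';  -- value taken from X-Request-ID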
Webhook operational notes
- Always use HTTPS for outbound webhooks in production (sensitive data in payloads, authentication headers).
- Set `retry_codes` to match the receiver's error semantics. 429 (rate limited) should always trigger retry.
- For inbound webhooks, validate the `auth_header` to prevent unauthorized writes to your inbox.
- Consider using a short `timeout_ms` (5000ms) for forward webhooks to avoid holding relay resources on slow receivers.
stdout and stdin (Development)
For development, testing, and debugging, the relay includes stdout (forward) and stdin (reverse) backends.
stdout prints delivered messages to the relay's standard output — useful for verifying that your pipeline configuration works without setting up an external system:
SELECT tide.relay_set_outbox('debug-pipeline', 'events', 'stdout');
stdin reads messages from standard input — useful for manual testing of inbox processing:
echo '{"event_id": "test-1", "payload": {"hello": "world"}}' | pg-tide --stdin-pipeline my-inbox
Common Configuration Patterns
TLS/SSL for all backends
Most backends support TLS via their URL scheme:
-- NATS with TLS
'url', 'nats://nats.example.com:4443' -- with credentials file for mutual TLS
-- Kafka with SSL
'brokers', 'broker1:9093' -- SSL port, configured via librdkafka
-- Redis with TLS
'url', 'rediss://redis.example.com:6380' -- note: rediss:// (double-s)
-- RabbitMQ with TLS
'url', 'amqps://user:pass@rabbit.example.com:5671/%2f' -- amqps:// scheme
-- Webhook with TLS
'url', 'https://api.example.com/webhooks' -- always HTTPS in production
Naming conventions for pipelines
Choose pipeline names that describe the data flow clearly:
-- Good: describes source and destination
'orders-to-kafka'
'payment-webhooks-to-inbox'
'inventory-nats-fanout'
-- Bad: vague or too generic
'pipeline-1'
'my-pipeline'
'test'
Batching strategy
All forward pipelines support p_batch_size. The right batch size depends on your backend:
| Backend | Recommended batch size | Rationale |
|---|---|---|
| NATS | 50-100 | NATS is fast per-message; small batches keep latency low |
| Kafka | 200-500 | Kafka benefits from batching (compression, fewer round-trips) |
| Redis | 100-200 | XADD is fast but benefits from pipelining |
| RabbitMQ | 50-100 | Per-message confirms; moderate batches balance throughput and latency |
| SQS | 10 | SQS SendMessageBatch supports max 10 messages |
| Webhook | 1-10 | HTTP round-trips are expensive; batch if the receiver supports it |
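For example, a sketch of raising the batch size on a Kafka pipeline. This assumes `p_batch_size` is accepted as a named parameter of `relay_set_outbox`, per the Kafka notes above:
SELECT tide.relay_set_outbox('events-kafka', 'events', 'kafka',
  '{"brokers": "broker1:9092", "topic": "app-events"}'::jsonb,
  p_batch_size := 500
);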
Error Handling
The pg-tide relay is designed to be resilient in the face of failures. This page covers how errors are categorized and handled at each stage of the pipeline, the retry strategy, graceful shutdown behavior, dead-letter queue management, and a complete reference of all error codes.
Error Philosophy
pg_tide's error handling follows a simple principle: transient errors are retried, permanent errors are logged and skipped. The relay never silently drops messages — every error is logged, counted in Prometheus metrics, and (for inbox-side failures) tracked in the dead-letter queue.
The relay distinguishes between:
- Transient errors — network timeouts, temporary unavailability, connection resets. These will succeed if retried.
- Permanent errors — malformed payloads, deserialization failures, invalid configuration. These will never succeed regardless of retries.
Retry Strategy
All transient errors trigger exponential backoff retry with jitter:
| Parameter | Value | Purpose |
|---|---|---|
| Initial delay | 100ms | Start retrying quickly for brief hiccups |
| Maximum delay | 30 seconds | Cap the backoff to avoid minute-long waits |
| Jitter | ±20% | Prevent thundering herd when multiple relays reconnect simultaneously |
| Maximum retries | Unlimited | The relay retries forever for transient errors — messages are never lost |
| Backoff multiplier | 2× | Each retry doubles the delay (100ms → 200ms → 400ms → ...) |
The backoff sequence looks like: 100ms, 200ms, 400ms, 800ms, 1.6s, 3.2s, 6.4s, 12.8s, 25.6s, 30s, 30s, 30s...
Jitter randomizes each delay by ±20%, so the actual sequence might be: 85ms, 220ms, 350ms, 900ms, etc. This prevents synchronized retry storms.
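The pre-jitter schedule is easy to reproduce. An illustrative query (not relay code) that computes min(100ms * 2^attempt, 30s):
SELECT attempt,
       least(100 * 2 ^ attempt, 30000) AS delay_ms
FROM generate_series(0, 11) AS attempt;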
Error Categories
Connection errors (PostgreSQL)
Symptoms: Relay logs "PostgreSQL connection failed, retrying" or "postgres error"
What happens:
- The relay logs a warning with connection details
- Enters reconnection mode with exponential backoff (100ms → 30s)
- All pipelines are paused (they can't function without the database)
- On reconnect, advisory locks are re-acquired
- Pipeline processing resumes from the last committed offset
- No messages are lost — they remain pending in the outbox
Common causes:
- PostgreSQL is restarting or failing over
- Network partition between relay and database
- Connection pool exhaustion
- Authentication failure (password rotation)
Resolution: Usually self-healing. The relay reconnects automatically when PostgreSQL is available again. If the issue is persistent (auth failure), fix the credentials and the relay will reconnect on its next attempt.
Sink errors (delivery failures)
Symptoms: Relay logs "sink publish error" or "sink unhealthy", Prometheus counter pg_tide_relay_publish_errors_total increases.
What happens:
- Messages remain pending in the outbox (they are never lost)
- The relay retries delivery with exponential backoff until the sink recovers
- Prometheus metrics track `pg_tide_relay_publish_errors_total{pipeline="..."}`
- The health endpoint reports unhealthy (`503`) for affected pipelines
- Once the sink recovers, delivery resumes automatically
Common causes:
- Downstream system (Kafka, NATS, webhook endpoint) is temporarily unavailable
- Network issues between relay and sink
- Sink is overloaded and rejecting new messages (backpressure)
- TLS certificate issues
Resolution: Usually self-healing. Monitor the error rate and investigate if it persists beyond expected maintenance windows.
Source errors (reverse mode)
Symptoms: Relay logs "source poll error" for reverse pipelines.
What happens:
- The relay retries subscription/polling with exponential backoff
- Once reconnected, consumption resumes from the last acknowledged position
- No messages are skipped (the source tracks its own offset)
Common causes:
- External source (NATS, Kafka, SQS) is temporarily unavailable
- Subscription expired or was revoked
- Consumer group rebalancing (Kafka)
Payload errors (permanent)
Symptoms: Relay logs "payload decode error" or "unsupported outbox payload version"
What happens:
- The error is logged with full context (outbox name, message ID, raw payload excerpt)
- The message is skipped — it will never succeed regardless of retries
- The offset advances past the bad message
- Prometheus tracks the error count
Common causes:
- Application published malformed JSONB that the relay cannot interpret
- Message format version mismatch (relay expects v2, message is v1)
- Corruption (extremely rare)
Resolution: Investigate the specific message. Fix the publishing code if it's generating invalid payloads. For format mismatches, upgrade the relay or add backward-compatible handling.
Configuration errors
Symptoms: Relay logs "config error" or "invalid config for pipeline" at startup or after hot-reload.
What happens:
- If the error is in the TOML file, the relay refuses to start
- If the error is in a pipeline config (in PostgreSQL), that specific pipeline is skipped
- Other pipelines continue to operate normally
Common causes:
- Missing required config key (e.g., no `brokers` for a Kafka sink)
- Invalid value (e.g., a non-numeric `batch_size`)
- Unsupported backend name
Resolution: Fix the configuration. For pipeline configs, update the JSONB in the database and the relay will pick up the correction via hot-reload.
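For example, a sketch of repairing a Kafka pipeline that was skipped for a missing `brokers` key (names are illustrative); re-running `relay_set_outbox` replaces the stored config, and hot-reload applies it:
SELECT tide.relay_set_outbox('events-kafka', 'events', 'kafka',
  jsonb_build_object(
    'brokers', 'broker1:9092',  -- the key that was missing
    'topic', 'app-events'
  )
);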
Graceful Shutdown
When the relay receives SIGTERM or SIGINT:
- Stop accepting new work — no new batches are fetched from the outbox
- Drain in-flight messages — wait for currently-delivering batches to complete (up to a drain timeout)
- Commit final offsets — record the last successfully delivered position
- Release advisory locks — allow other relay instances to take over immediately
- Close connections — cleanly disconnect from PostgreSQL and sinks
- Exit with code 0 — signal success to the process manager
The drain timeout prevents the relay from hanging indefinitely if a sink is unresponsive during shutdown. Messages that weren't committed will be re-delivered by the next relay instance (and deduplicated by the inbox if applicable).
Dead-Letter Queue (Inbox Side)
For reverse pipelines that write to inboxes, messages that fail processing are managed through the inbox's built-in DLQ mechanism.
How messages enter the DLQ
- Your application reads a message from the inbox and attempts to process it
- Processing fails (external API timeout, validation error, business rule violation)
- You call `tide.inbox_mark_failed(inbox_name, event_id, error_message)` (a minimal sketch follows this list)
- The message's `retry_count` is incremented and `last_error` is recorded
- After `max_retries` failures, the message is effectively dead-lettered
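A minimal consumer-side sketch of the `inbox_mark_failed` call mentioned above; the inbox and event names are illustrative:
DO $$
BEGIN
  -- ... attempt to process the message here ...
  RAISE EXCEPTION 'downstream API timeout';
EXCEPTION
  WHEN OTHERS THEN
    -- Record the failure; pg_tide increments retry_count and stores the error
    PERFORM tide.inbox_mark_failed('my-inbox', 'evt-001', SQLERRM);
END $$;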
Querying the DLQ
-- Find all dead-lettered messages in an inbox
SELECT event_id, payload, retry_count, last_error, received_at
FROM tide."my-inbox_inbox"
WHERE processed_at IS NULL
AND retry_count >= 5 -- assuming max_retries = 5
ORDER BY received_at;
Investigating failures
-- Group DLQ messages by error pattern
SELECT
left(last_error, 50) AS error_pattern,
count(*) AS message_count,
min(received_at) AS earliest,
max(received_at) AS latest
FROM tide."my-inbox_inbox"
WHERE processed_at IS NULL AND retry_count >= 5
GROUP BY left(last_error, 50)
ORDER BY message_count DESC;
Replaying messages
After fixing the underlying issue, replay specific messages or all DLQ messages:
-- Replay specific messages
SELECT tide.replay_inbox_messages('my-inbox',
ARRAY['evt-001', 'evt-002', 'evt-003']);
-- Replay all DLQ messages for an inbox
SELECT tide.replay_inbox_messages('my-inbox',
(SELECT array_agg(event_id)
FROM tide."my-inbox_inbox"
WHERE processed_at IS NULL AND retry_count >= 5)
);
Replaying resets retry_count to 0, making messages eligible for normal processing again.
Extension Error Reference
Errors raised by pg_tide SQL functions:
| Error message | Raised by | What it means |
|---|---|---|
| `outbox already exists: {name}` | `outbox_create` | An outbox with this name already exists. Use `p_if_not_exists := true` to suppress. |
| `outbox not found: {name}` | `outbox_publish`, `outbox_drop`, `outbox_status`, `outbox_enable/disable` | No outbox with this name exists. Create it first with `outbox_create`. |
| `inbox already exists: {name}` | `inbox_create` | An inbox with this name already exists. |
| `inbox not found: {name}` | `inbox_drop`, `inbox_mark_processed/failed`, `inbox_status` | No inbox with this name exists. |
| `relay pipeline not found: {name}` | `relay_enable/disable/delete/get_config` | No pipeline with this name in the catalog. |
| `invalid argument: {details}` | Various | A parameter value is invalid (e.g., negative `retention_hours`). |
| `SPI error: {details}` | Various | Internal database error during SPI execution. |
Handling extension errors in PL/pgSQL
DO $$
BEGIN
PERFORM tide.outbox_publish('maybe-missing', '{}'::jsonb, '{}'::jsonb);
EXCEPTION
WHEN OTHERS THEN
RAISE NOTICE 'Publish failed: %', SQLERRM;
-- Handle gracefully: log, retry, or use a fallback
END $$;
Relay Error Reference
Errors logged by the pg-tide relay binary:
| Error | Category | What it means | Self-healing? |
|---|---|---|---|
| `postgres error` | Connection | Database communication failure | ✓ (reconnects) |
| `postgres connection failed` | Connection | Cannot reach PostgreSQL | ✓ (retries) |
| `config error` | Configuration | Invalid TOML or missing field | ✗ (fix config) |
| `invalid config for pipeline` | Configuration | Pipeline JSONB validation failure | ✗ (fix SQL config) |
| `pipeline not found` | Configuration | Referenced pipeline doesn't exist | ✗ (create pipeline) |
| `missing required config key` | Configuration | A required backend config key is missing | ✗ (fix SQL config) |
| `unsupported outbox payload version` | Payload | Message format version mismatch | ✗ (upgrade relay or fix publisher) |
| `payload decode error` | Payload | Cannot deserialize message | ✗ (fix publisher) |
| `sink publish error` | Delivery | Sink rejected or timed out | ✓ (retries) |
| `sink unhealthy` | Delivery | Sink not accepting connections | ✓ (retries) |
| `source poll error` | Ingestion | Source read failure | ✓ (retries) |
| `channel closed` | Internal | Internal communication channel dropped | ✓ (relay recovers) |
Monitoring Errors
Prometheus metrics for error tracking
# Total delivery errors by pipeline (should be 0 in steady state)
rate(pg_tide_relay_publish_errors_total[5m])
# Unhealthy pipelines (immediate alert)
pg_tide_relay_pipeline_healthy == 0
# Error rate as a percentage of total deliveries
rate(pg_tide_relay_publish_errors_total[5m])
/ rate(pg_tide_relay_messages_published_total[5m])
Alerting rules
- alert: PgTideDeliveryErrors
expr: rate(pg_tide_relay_publish_errors_total[5m]) > 0
for: 2m
labels:
severity: warning
annotations:
summary: "Delivery errors on pipeline {{ $labels.pipeline }}"
description: "The relay is experiencing delivery failures. Check sink availability."
- alert: PgTidePipelineDown
expr: pg_tide_relay_pipeline_healthy == 0
for: 1m
labels:
severity: critical
annotations:
summary: "Pipeline {{ $labels.pipeline }} is unhealthy"
description: "Immediate investigation required. Messages are accumulating."
Log-based monitoring
With structured JSON logging (--log-format json), you can filter and alert on error logs:
{"level":"error","pipeline":"orders-to-kafka","error":"sink publish error: BrokerNotAvailable","msg":"delivery failed, will retry","timestamp":"2025-01-15T10:30:00Z"}
Key fields to monitor:
- `level=error` — any error-level log indicates a problem
- `pipeline` — identifies which pipeline is affected
- `error` — the specific error message for diagnosis
Monitoring
The pg-tide relay exposes Prometheus metrics and a health endpoint for observability.
Endpoints
| Endpoint | Port | Description |
|---|---|---|
| `GET /metrics` | 9090 (default) | Prometheus metrics |
| `GET /health` | 9090 (default) | Liveness + readiness check |
Configure the port with --metrics-addr:
pg-tide --metrics-addr 0.0.0.0:9090
Prometheus Metrics
Counters
| Metric | Labels | Description |
|---|---|---|
| `pg_tide_relay_messages_published_total` | pipeline, direction | Messages successfully delivered to sink |
| `pg_tide_relay_messages_consumed_total` | pipeline, direction | Messages read from source |
| `pg_tide_relay_publish_errors_total` | pipeline, direction | Failed delivery attempts |
| `pg_tide_relay_dedup_skipped_total` | pipeline | Messages skipped (duplicate dedup key) |
Gauges
| Metric | Labels | Description |
|---|---|---|
| `pg_tide_relay_pipeline_healthy` | pipeline | 1 if pipeline is operational, 0 otherwise |
Health Endpoint
curl http://localhost:9090/health
- 200 OK — all pipelines healthy
- 503 Service Unavailable — one or more pipelines unhealthy
Response body includes unhealthy pipeline names when degraded.
SQL-Level Monitoring
In addition to relay metrics, monitor from PostgreSQL:
-- Pending messages per outbox
SELECT * FROM tide.outbox_pending;
-- Consumer lag
SELECT * FROM tide.consumer_lag;
-- Pipeline status
SELECT tide.relay_list_configs();
Grafana Dashboard
Example PromQL queries for a Grafana dashboard:
# Message throughput (published/sec)
rate(pg_tide_relay_messages_published_total[5m])
# Error rate
rate(pg_tide_relay_publish_errors_total[5m])
# Consumer lag (from PostgreSQL — use a Postgres exporter)
pg_tide_consumer_lag{group_name="my-relay"}
# Pipeline health
pg_tide_relay_pipeline_healthy
Alerting Recommendations
| Condition | Severity | Alert |
|---|---|---|
| `pg_tide_relay_pipeline_healthy == 0` | Critical | Pipeline down |
| `rate(publish_errors_total[5m]) > 0` | Warning | Delivery errors |
| Consumer lag > threshold | Warning | Relay falling behind |
| No heartbeat for > 60s | Critical | Relay process dead |
Sinks Overview
When you publish a message to a pg_tide outbox, that message sits safely in your PostgreSQL database, waiting to be delivered somewhere useful. A sink is the destination where the pg_tide relay delivers those messages. Think of it like a postal service: your application drops a letter (message) into a mailbox (outbox), and the postal carrier (relay) delivers it to the recipient's address (sink).
pg_tide supports 30 different sinks, covering everything from traditional message queues like Apache Kafka and RabbitMQ, to cloud services like Amazon SQS and Google Cloud Pub/Sub, to modern data lakes like Apache Iceberg and Delta Lake, to notification services like Slack and PagerDuty. No matter where your messages need to go, there is likely a sink that fits your needs.
Choosing the Right Sink
Selecting a sink depends on what you are trying to accomplish. The table below groups sinks by category and highlights the primary use case for each. If you are building a new system and have flexibility in choosing your messaging infrastructure, start with the category that matches your architectural goals, then read the detailed page for each sink to understand the trade-offs.
Message Queues & Streaming
These sinks deliver messages to traditional message brokers and streaming platforms. They are ideal when you need durable, ordered message delivery to multiple consumers, or when you are integrating with an existing event-driven architecture.
| Sink | Best For | Ordering | Delivery Guarantee |
|---|---|---|---|
| Apache Kafka | High-throughput streaming, event sourcing, CDC pipelines | Per-partition | At-least-once (exactly-once with idempotent producer) |
| NATS JetStream | Low-latency pub/sub, microservice communication | Per-subject | At-least-once (exactly-once with dedup) |
| RabbitMQ | Complex routing, work queues, legacy integration | Per-queue | At-least-once (with publisher confirms) |
| Redis Streams | Lightweight streaming, real-time dashboards | Per-stream | At-least-once |
| Amazon SQS | AWS-native queuing, serverless triggers | FIFO optional | At-least-once (exactly-once with FIFO) |
| Amazon Kinesis | Real-time analytics on AWS, high-volume ingestion | Per-shard | At-least-once |
| Google Cloud Pub/Sub | GCP-native messaging, global distribution | Per-ordering-key | At-least-once |
| Azure Service Bus | Enterprise messaging on Azure, sessions, transactions | Per-session | At-least-once |
| Azure Event Hubs | High-throughput event ingestion on Azure | Per-partition | At-least-once |
| MQTT v5 | IoT device communication, edge computing | Per-topic (QoS dependent) | Configurable (QoS 0/1/2) |
Analytics & Data Lakes
These sinks write messages directly into analytical databases and data lake storage. They are ideal when your PostgreSQL events need to feed dashboards, machine learning pipelines, or long-term analytical storage without going through an intermediate message broker.
| Sink | Best For | Format | Batch Support |
|---|---|---|---|
| ClickHouse | Real-time analytics, time-series, log storage | Native protocol | Yes (batch inserts) |
| Snowflake | Cloud data warehouse, BI reporting | Stage + COPY | Yes (micro-batches) |
| BigQuery | Google Cloud analytics, large-scale queries | Streaming/Load | Yes (streaming inserts) |
| Apache Iceberg | Open table format, lakehouse architecture | Parquet | Yes (append commits) |
| Delta Lake | Databricks ecosystem, ACID on object storage | Parquet | Yes (append commits) |
| DuckLake | Lightweight lakehouse, PostgreSQL-cataloged Parquet | Parquet | Yes (batch writes) |
| MongoDB | Document storage, flexible schemas | BSON documents | Yes (bulk writes) |
| Elasticsearch | Full-text search, log analytics, APM | JSON documents | Yes (bulk API) |
| Object Storage | S3/GCS/Azure Blob archival, data lake landing | JSONL or Parquet | Yes (file-per-batch) |
| Apache Arrow Flight | High-performance columnar transfer, ML pipelines | Arrow IPC | Yes (record batches) |
Notifications & Webhooks
These sinks deliver messages to notification services and HTTP endpoints. They are ideal for alerting, triggering external workflows, and integrating with third-party APIs that expect HTTP callbacks.
| Sink | Best For | Format |
|---|---|---|
| HTTP Webhook | Third-party API integration, custom endpoints | JSON POST |
| Slack | Team notifications, operational alerts | Block Kit messages |
| Discord | Community notifications, bot integrations | Embed messages |
| PagerDuty | Incident management, on-call alerting | Events API v2 |
Connector Ecosystems
These sinks integrate with established connector frameworks, giving you access to hundreds of additional destinations through a single configuration. Instead of pg_tide implementing a direct connection to every possible system, these adapters let you leverage existing connector ecosystems.
| Sink | Best For | Ecosystem Size |
|---|---|---|
| Singer / Meltano | Open-source ETL, Meltano Hub targets | ~500 targets |
| Airbyte | Managed data integration, destination connectors | ~400 connectors |
| Fivetran HVR | Enterprise data integration, HVR endpoint | Fivetran ecosystem |
Infrastructure
These sinks deliver messages to other PostgreSQL instances or output streams. They are useful for database-to-database messaging, testing, and debugging.
| Sink | Best For |
|---|---|
| PostgreSQL Inbox | Cross-service messaging within PostgreSQL |
| Remote PostgreSQL Outbox | Multi-cluster federation |
| stdout / File | Debugging, log capture, piping to external tools |
Common Configuration Patterns
Every sink is configured through the relay pipeline's JSONB configuration stored in PostgreSQL. The basic pattern looks like this:
SELECT tide.relay_set_outbox(
'my-pipeline',
'orders',
'relay-group',
'{
"sink_type": "kafka",
"brokers": "localhost:9092",
"topic": "order-events"
}'::jsonb
);
The sink_type field determines which sink implementation the relay uses. All other fields in the JSON object are sink-specific configuration. Every sink page documents its complete configuration reference.
Secret Management
Sensitive values like passwords, API keys, and connection strings should never be stored directly in the pipeline configuration. Instead, use secret interpolation:
{
"sink_type": "kafka",
"brokers": "${env:KAFKA_BROKERS}",
"sasl_username": "${env:KAFKA_USERNAME}",
"sasl_password": "${env:KAFKA_PASSWORD}"
}
The relay resolves ${env:VAR_NAME} tokens from environment variables and ${file:/path/to/secret} from files at startup. Resolved values are never written to logs or metric labels.
Delivery Guarantees
All sinks provide at-least-once delivery by default. The relay acknowledges messages in the outbox only after the sink confirms receipt. If the relay crashes between delivering a message and acknowledging it, the message will be delivered again on restart.
For sinks that support it, you can achieve exactly-once semantics by combining pg_tide's outbox deduplication key with the sink's native deduplication mechanism. Each sink page documents whether exactly-once is possible and how to configure it.
Error Handling
When a sink cannot accept a message (network failure, authentication error, malformed payload), the relay retries with exponential backoff. If retries are exhausted, the message is routed to the dead-letter queue (DLQ) for manual inspection and replay. The circuit breaker protects against cascading failures by temporarily halting delivery when a sink is consistently unavailable.
Next Steps
Browse the sink pages that match your use case, or start with the most common choices:
- New to event streaming? Start with NATS JetStream for simplicity
- Enterprise Kafka shop? Go to Apache Kafka
- Building a data lake? See Apache Iceberg or Object Storage
- Need webhooks? See HTTP Webhook
- Want maximum connector coverage? See Singer / Meltano
Apache Kafka
Apache Kafka is a distributed event streaming platform that serves as the backbone of real-time data architectures at thousands of organizations worldwide. Originally developed at LinkedIn and now maintained by the Apache Software Foundation, Kafka excels at handling high-throughput, fault-tolerant, ordered streams of events. When you connect pg_tide to Kafka, every message published to your PostgreSQL outbox is automatically delivered to Kafka topics, making your database changes available to any downstream system that speaks the Kafka protocol — from stream processors like Apache Flink to data warehouses like Snowflake.
If your organization already uses Kafka, connecting pg_tide is the simplest way to get your PostgreSQL events into the broader event-driven ecosystem without writing custom producer code or managing CDC infrastructure like Debezium. If you are evaluating message brokers, Kafka is an excellent choice when you need durable, ordered, replayable event streams at scale.
When to Use This Sink
Choose the Kafka sink when you need one or more of the following:
- High throughput — Kafka handles millions of messages per second across partitioned topics. If your outbox produces thousands of events per second, Kafka will keep up without breaking a sweat.
- Durable replay — Kafka retains messages for a configurable period (days, weeks, or forever with log compaction). Downstream consumers can replay the entire history or start from any point in time.
- Multiple consumers — Many different services need to independently consume the same stream of events. Kafka's consumer group model makes this natural.
- Existing Kafka ecosystem — Your organization already runs Kafka and your downstream consumers (Flink, ksqlDB, Materialize, Connect) expect Kafka topics.
- CDC compatibility — You want to produce Debezium-formatted change events that existing CDC-aware tools can consume natively.
Consider a different sink if you need sub-millisecond latency (NATS is faster for point-to-point), if you want zero operational overhead (SQS/Pub/Sub are fully managed), or if your total message volume is very low (Kafka's operational cost may not be justified).
How It Works
When the relay processes a batch of outbox messages destined for Kafka, it performs the following steps:
- Fetch — The relay polls the outbox table for unacknowledged messages belonging to this pipeline's consumer group.
- Transform — If JMESPath transforms are configured, the relay applies filter and projection expressions to each message payload.
- Encode — Messages are serialized according to the configured wire format (native JSON, Debezium, or Avro via Schema Registry).
- Route — The relay determines the target Kafka topic for each message using the configured topic template (static or dynamic based on message content).
- Produce — Messages are sent to Kafka using the configured producer settings (compression, batching, acknowledgment level).
- Acknowledge — Once Kafka confirms receipt (based on the `acks` setting), the relay commits the consumer group offset in PostgreSQL, marking those messages as delivered.
If any step fails, the relay retries with exponential backoff. If retries are exhausted, failed messages are routed to the dead-letter queue.
sequenceDiagram
participant App as Application
participant PG as PostgreSQL
participant Relay as pg-tide relay
participant Kafka as Kafka Cluster
App->>PG: INSERT order + outbox_publish()
Note over PG: Single transaction
Relay->>PG: Poll outbox (batch)
PG-->>Relay: Messages batch
Relay->>Kafka: Produce (compressed, batched)
Kafka-->>Relay: Ack (all replicas)
Relay->>PG: Commit offset
Configuration
Minimal Configuration
The simplest possible Kafka sink configuration requires only the broker addresses and a topic name:
SELECT tide.relay_set_outbox(
'orders-to-kafka', -- pipeline name
'orders', -- outbox name
'kafka-relay', -- consumer group
'{
"sink_type": "kafka",
"brokers": "localhost:9092",
"topic": "order-events"
}'::jsonb
);
This connects to a local Kafka cluster without authentication, sends all messages to the order-events topic, and uses default producer settings. This is appropriate for development but not for production.
Production Configuration
A production-ready configuration includes authentication, compression, and tuned producer settings:
SELECT tide.relay_set_outbox(
'orders-to-kafka',
'orders',
'kafka-relay',
'{
"sink_type": "kafka",
"brokers": "${env:KAFKA_BROKERS}",
"topic": "orders.events.{op}",
"sasl_mechanism": "SCRAM-SHA-256",
"sasl_username": "${env:KAFKA_USERNAME}",
"sasl_password": "${env:KAFKA_PASSWORD}",
"tls_enabled": true,
"compression": "zstd",
"acks": "all",
"batch_size": 500,
"linger_ms": 50,
"idempotent": true,
"request_timeout_ms": 30000
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| `sink_type` | string | — | Must be `"kafka"` |
| `brokers` | string | — | Comma-separated list of broker addresses (host:port) |
| `topic` | string | — | Target topic name. Supports template variables: `{stream_table}`, `{op}`, `{outbox_id}` |
| `sasl_mechanism` | string | null | Authentication mechanism: `"PLAIN"`, `"SCRAM-SHA-256"`, `"SCRAM-SHA-512"` |
| `sasl_username` | string | null | SASL username |
| `sasl_password` | string | null | SASL password |
| `tls_enabled` | bool | false | Enable TLS for broker connections |
| `tls_ca_cert` | string | null | Path to CA certificate file for TLS verification |
| `tls_client_cert` | string | null | Path to client certificate for mTLS authentication |
| `tls_client_key` | string | null | Path to client private key for mTLS authentication |
| `compression` | string | `"none"` | Compression codec: `"none"`, `"gzip"`, `"snappy"`, `"lz4"`, `"zstd"` |
| `acks` | string | `"all"` | Acknowledgment level: `"0"` (fire-and-forget), `"1"` (leader only), `"all"` (all ISR replicas) |
| `batch_size` | int | 100 | Maximum messages per produce request |
| `linger_ms` | int | 10 | Time to wait for batch to fill before sending |
| `idempotent` | bool | false | Enable idempotent producer (prevents duplicates on retry) |
| `request_timeout_ms` | int | 30000 | Timeout for produce requests |
| `message_key` | string | null | Template for Kafka message key. Determines partition assignment. Supports `{dedup_key}`, `{stream_table}` |
| `headers` | object | null | Static headers to include on every message |
Authentication
No Authentication (Development Only)
For local development with an unsecured Kafka cluster, no authentication configuration is needed. Simply provide the broker addresses:
{
"sink_type": "kafka",
"brokers": "localhost:9092",
"topic": "dev-events"
}
This is not recommended for any environment accessible over a network.
SASL/PLAIN (Confluent Cloud)
Confluent Cloud and many managed Kafka services use SASL/PLAIN over TLS. This requires a username (API key) and password (API secret):
{
"sink_type": "kafka",
"brokers": "${env:CONFLUENT_BOOTSTRAP_SERVERS}",
"topic": "my-topic",
"sasl_mechanism": "PLAIN",
"sasl_username": "${env:CONFLUENT_API_KEY}",
"sasl_password": "${env:CONFLUENT_API_SECRET}",
"tls_enabled": true
}
SASL/SCRAM-SHA-256
Self-hosted Kafka clusters often use SCRAM-SHA-256 for username/password authentication with stronger security than PLAIN:
{
"sink_type": "kafka",
"brokers": "kafka-1:9093,kafka-2:9093,kafka-3:9093",
"topic": "events",
"sasl_mechanism": "SCRAM-SHA-256",
"sasl_username": "${env:KAFKA_USER}",
"sasl_password": "${env:KAFKA_PASS}",
"tls_enabled": true
}
mTLS (Certificate-Based)
For environments requiring mutual TLS authentication (common in financial services and regulated industries), provide client certificates:
{
"sink_type": "kafka",
"brokers": "kafka-1:9093,kafka-2:9093",
"topic": "secure-events",
"tls_enabled": true,
"tls_ca_cert": "/etc/certs/ca.pem",
"tls_client_cert": "/etc/certs/client.pem",
"tls_client_key": "/etc/certs/client-key.pem"
}
Message Format
Each outbox message becomes a Kafka record with the following mapping:
| Kafka Record Field | Source | Example |
|---|---|---|
| Key | message_key template or dedup_key | "order-12345" |
| Value | Serialized message payload (JSON by default) | {"order_id": 12345, ...} |
| Topic | topic template | "orders.events.insert" |
| Headers | pg_tide metadata + configured static headers | pg_tide_outbox: "orders", pg_tide_op: "insert" |
Topic Routing
The topic field supports template variables that are resolved per-message:
- `{stream_table}` — The outbox name (e.g., `orders`)
- `{op}` — The operation type (`insert`, `update`, `delete`)
- `{outbox_id}` — The unique outbox message ID
For example, "events.{stream_table}.{op}" routes INSERT messages from the orders outbox to the topic events.orders.insert.
Wire Format Integration
When using the Debezium wire format, messages are produced in Debezium envelope format, making them compatible with tools like Apache Iceberg's Debezium sink connector, Flink CDC, ksqlDB, and Materialize:
{
"sink_type": "kafka",
"brokers": "localhost:9092",
"topic": "dbserver1.public.orders",
"wire_format": "debezium"
}
See the Debezium Wire Format page for details on the message structure.
Delivery Guarantees
The Kafka sink provides at-least-once delivery by default. With the idempotent producer enabled, it provides exactly-once semantics for the produce operation — Kafka's broker-side deduplication ensures that retried produce requests do not create duplicate records.
Combined with pg_tide's consumer group offset tracking, this means:
- A message is published to Kafka at least once (at-least-once from outbox to Kafka)
- With `idempotent: true`, Kafka deduplicates retried produces (effectively exactly-once on the produce side)
- If the downstream consumer also uses an idempotent inbox, end-to-end exactly-once is achieved
Acknowledgment Levels
The acks setting controls when Kafka considers a produce successful:
"0"— The relay does not wait for any acknowledgment. Fastest, but messages can be lost if the leader fails before replicating."1"— The leader broker acknowledges after writing to its local log. Messages can be lost if the leader fails before followers replicate."all"— All in-sync replicas (ISR) must acknowledge. No data loss as long as at least one replica survives. Recommended for production.
Performance Tuning
Batch Size and Linger
The relay collects messages into batches before sending to Kafka. Larger batches improve throughput but increase latency:
- `batch_size: 100` (default) — Good balance for most workloads
- `batch_size: 500-1000` — Better throughput for high-volume pipelines
- `linger_ms: 50-100` — Wait longer to fill batches; reduces request count at the cost of latency
Compression
Compression reduces network bandwidth and Kafka storage at the cost of CPU:
"zstd"— Best compression ratio, good speed. Recommended for most workloads."lz4"— Fastest compression/decompression, moderate ratio. Best when CPU is constrained."snappy"— Good balance, widely supported. Safe default."gzip"— Highest ratio but slowest. Use only when bandwidth is extremely limited.
Expected Throughput
Under typical conditions with acks: "all" and compression: "zstd":
| Batch Size | Messages/sec | Latency (p99) |
|---|---|---|
| 100 | ~5,000 | ~50ms |
| 500 | ~20,000 | ~100ms |
| 1000 | ~40,000 | ~200ms |
Actual numbers depend on message size, network latency, Kafka cluster capacity, and replication factor.
Complete Example: Order Events to Kafka
This example demonstrates a complete pipeline from order creation in PostgreSQL to delivery on a Kafka topic.
1. Set Up the Outbox
-- Create the orders outbox with 48-hour retention
SELECT tide.outbox_create('orders', p_retention_hours := 48);
2. Publish Messages from Your Application
-- In your order processing transaction:
BEGIN;
INSERT INTO orders (id, customer_id, total, status)
VALUES (gen_random_uuid(), 'cust-789', 99.99, 'confirmed');
SELECT tide.outbox_publish(
'orders',
jsonb_build_object(
'event_type', 'order.confirmed',
'order_id', 'ord-12345',
'customer_id', 'cust-789',
'total', 99.99,
'items', jsonb_build_array(
jsonb_build_object('sku', 'WIDGET-A', 'qty', 2)
)
),
'ord-12345' -- dedup_key
);
COMMIT;
3. Configure the Pipeline
SELECT tide.relay_set_outbox(
'orders-to-kafka',
'orders',
'kafka-relay',
'{
"sink_type": "kafka",
"brokers": "kafka:9092",
"topic": "order-events",
"compression": "zstd",
"acks": "all",
"idempotent": true,
"batch_size": 100
}'::jsonb
);
SELECT tide.relay_enable('orders-to-kafka');
4. Start the Relay
pg-tide --postgres-url "postgresql://relay_user:password@localhost:5432/mydb"
5. Verify Messages Arrive
Using the Kafka console consumer:
kafka-console-consumer \
--bootstrap-server kafka:9092 \
--topic order-events \
--from-beginning
You should see your order event messages arriving as JSON payloads.
Compatibility
The pg_tide Kafka sink is compatible with:
- Apache Kafka 2.8+ (including KRaft mode)
- Confluent Cloud (fully managed)
- Confluent Platform (self-managed)
- Amazon MSK (with IAM or SASL auth)
- Redpanda (Kafka-compatible API)
- Aiven for Kafka
- Upstash Kafka (serverless)
Troubleshooting
"Connection refused" or "Broker not available"
The relay cannot reach the Kafka brokers. Check:
- Broker addresses are correct and include ports
- Network connectivity exists (firewall rules, security groups, VPC peering)
- DNS resolution works for broker hostnames
- TLS is enabled if the cluster requires it
"SASL authentication failed"
Authentication credentials are incorrect or misconfigured:
- Verify `sasl_mechanism` matches what the cluster expects
- Check that environment variables containing credentials are set
- For Confluent Cloud, ensure you're using the API key (not the cluster ID) as the username
"Topic does not exist"
The target topic has not been created and auto-creation is disabled on the cluster:
- Create the topic manually: `kafka-topics --create --topic order-events --partitions 6 --replication-factor 3`
- Or enable `auto.create.topics.enable=true` on the cluster (not recommended for production)
"Message too large"
The message payload exceeds `max.message.bytes` on the broker:
- Check your message payload sizes
- Increase `max.message.bytes` on the broker/topic configuration
- Consider using JMESPath projections to reduce payload size before delivery
Messages delivered but not in expected order
Kafka guarantees ordering only within a single partition. If you need message ordering:
- Set `message_key` to a template that determines ordering (e.g., `{dedup_key}` for per-entity ordering), as in the sketch below
- Messages with the same key always go to the same partition
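A sketch of the keyed configuration, reusing the pipeline from the example above; `message_key` is documented in the configuration reference:
SELECT tide.relay_set_outbox(
  'orders-to-kafka',
  'orders',
  'kafka-relay',
  '{
    "sink_type": "kafka",
    "brokers": "kafka:9092",
    "topic": "order-events",
    "message_key": "{dedup_key}"
  }'::jsonb
);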
Further Reading
- Wire Formats — Produce Debezium, Avro, or custom formats to Kafka
- Content-Based Routing — Route different event types to different topics
- Schema Registry — Enforce Avro/Protobuf schemas on produced messages
- Dead-Letter Queue — Handle delivery failures gracefully
- Circuit Breaker — Protect against Kafka cluster outages
NATS JetStream
NATS is a lightweight, high-performance messaging system designed for cloud-native applications. JetStream is NATS's built-in persistence layer that adds durable message storage, replay capabilities, and exactly-once delivery semantics to the core NATS protocol. When you connect pg_tide to NATS JetStream, your PostgreSQL outbox messages are delivered with sub-millisecond latency to any service subscribed to the relevant subjects, while JetStream ensures messages are persisted and can be replayed if a consumer was offline.
NATS is particularly well-suited for microservice architectures where you need fast, reliable communication between services without the operational complexity of running a Kafka cluster. Its subject-based addressing model makes routing intuitive, and its lightweight footprint means you can run it anywhere — from a single container in development to a globally distributed supercluster in production.
When to Use This Sink
Choose the NATS JetStream sink when your architecture values simplicity and speed:
- Low-latency messaging — NATS delivers messages in microseconds. If your downstream services need near-real-time notification of database changes, NATS is one of the fastest options available.
- Simple operations — NATS is a single binary with minimal configuration. Unlike Kafka, there is no ZooKeeper, no partition management, and no broker coordination to think about.
- Subject-based routing — NATS's hierarchical subject naming (e.g., orders.created, orders.shipped) provides natural topic routing without separate topic creation steps.
- Microservice communication — When your services communicate through events and you want a lightweight broker that scales horizontally with minimal fuss.
- Cloud-native deployments — NATS has first-class support for Kubernetes, runs efficiently in containers, and supports leaf nodes for edge computing scenarios.
Consider Kafka instead if you need very long retention periods (weeks/months), strict partition-level ordering guarantees, or compatibility with the Kafka ecosystem (Connect, Streams, ksqlDB).
How It Works
The relay connects to a NATS server (or cluster) and publishes messages to JetStream subjects. JetStream provides durable storage, so messages are persisted even if no consumer is currently subscribed. The flow is:
- The relay fetches a batch of undelivered messages from the outbox.
- Each message is published to the configured NATS subject (which can be templated per-message).
- JetStream acknowledges persistence of each message.
- The relay commits the consumer group offset in PostgreSQL.
NATS JetStream supports message deduplication based on a Nats-Msg-Id header. pg_tide automatically sets this header to the outbox message's dedup_key, which means that even if the relay retries a publish (after a network interruption, for example), NATS will not create duplicate messages in the stream.
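A quick way to see this behavior is with the NATS CLI — publishing twice with the same Nats-Msg-Id yields a single stored message (a sketch; the subject and stream name are illustrative):

# Publish the same logical message twice with an explicit Nats-Msg-Id
nats pub orders.events '{"order_id": 42}' -H "Nats-Msg-Id:ord-42-created"
nats pub orders.events '{"order_id": 42}' -H "Nats-Msg-Id:ord-42-created"

# Within the dedup window, the stream reports only one stored message
nats stream info ORDERS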
Configuration
Minimal Configuration
SELECT tide.relay_set_outbox(
'orders-to-nats',
'orders',
'nats-relay',
'{
"sink_type": "nats",
"url": "nats://localhost:4222",
"subject": "orders.events"
}'::jsonb
);
Production Configuration
SELECT tide.relay_set_outbox(
'orders-to-nats',
'orders',
'nats-relay',
'{
"sink_type": "nats",
"url": "${env:NATS_URL}",
"subject": "events.{stream_table}.{op}",
"credentials_file": "${env:NATS_CREDS_FILE}",
"tls_enabled": true,
"tls_ca_cert": "/etc/certs/nats-ca.pem",
"stream": "EVENTS",
"batch_size": 200
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "nats" |
| url | string | — | NATS server URL(s). Comma-separated for clusters: "nats://host1:4222,nats://host2:4222" |
| subject | string | — | Target subject. Supports templates: {stream_table}, {op}, {outbox_id} |
| stream | string | null | JetStream stream name (auto-detected from subject if not specified) |
| credentials_file | string | null | Path to NATS credentials file (.creds) |
| nkey_seed | string | null | NKey seed for authentication |
| token | string | null | Authentication token |
| username | string | null | Username for user/password auth |
| password | string | null | Password for user/password auth |
| tls_enabled | bool | false | Enable TLS |
| tls_ca_cert | string | null | CA certificate path |
| batch_size | int | 100 | Messages per batch |
Authentication
No Authentication (Development)
For local development:
{
"sink_type": "nats",
"url": "nats://localhost:4222",
"subject": "dev.events"
}
Credentials File (NATS.io Cloud / Production)
NATS credentials files contain both the JWT and the NKey seed. This is the recommended authentication method for NATS.io's managed service (Synadia Cloud):
{
"sink_type": "nats",
"url": "tls://connect.ngs.global",
"subject": "myapp.events",
"credentials_file": "/etc/nats/user.creds"
}
NKey Authentication
NKeys provide public-key authentication without passwords:
{
"sink_type": "nats",
"url": "nats://nats-server:4222",
"subject": "events",
"nkey_seed": "${env:NATS_NKEY_SEED}",
"tls_enabled": true
}
Token Authentication
Simple token-based auth for smaller deployments:
{
"sink_type": "nats",
"url": "nats://nats-server:4222",
"subject": "events",
"token": "${env:NATS_TOKEN}"
}
Delivery Guarantees
The NATS JetStream sink provides exactly-once delivery when properly configured. This is achieved through the combination of:
- JetStream message deduplication — pg_tide sets the Nats-Msg-Id header to the message's dedup_key. JetStream tracks published message IDs within its deduplication window and silently rejects duplicates.
- Outbox offset tracking — The relay only commits offsets after JetStream acknowledges persistence.
This means that even if the relay crashes and restarts, re-published messages will be deduplicated by JetStream, preventing downstream consumers from seeing duplicates.
Subject Routing
NATS subjects use a dot-separated hierarchical namespace that makes routing intuitive. pg_tide's template variables map naturally to this model:
events.orders.insert → new orders
events.orders.update → order status changes
events.payments.insert → new payments
events.*.delete → all deletes (wildcard subscription)
Configure dynamic subject routing with:
{
"subject": "events.{stream_table}.{op}"
}
Downstream services can subscribe to exactly the events they care about using NATS wildcards (* for single token, > for multiple tokens).
Complete Example
1. Create the Outbox
SELECT tide.outbox_create('notifications', retention_hours => 24);
2. Configure the Pipeline
SELECT tide.relay_set_outbox(
'notify-pipeline',
'notifications',
'nats-group',
'{
"sink_type": "nats",
"url": "nats://localhost:4222",
"subject": "notifications.{op}",
"stream": "NOTIFICATIONS"
}'::jsonb
);
SELECT tide.relay_enable('notify-pipeline');
3. Publish an Event
SELECT tide.outbox_publish(
'notifications',
'{"type": "order.shipped", "order_id": "ord-555", "customer": "alice@example.com"}'::jsonb,
'ord-555-shipped'
);
4. Verify with NATS CLI
nats sub "notifications.>"
# Output: [notifications.insert] {"type": "order.shipped", ...}
Troubleshooting
"Connection refused"
NATS server is not reachable:
- Check the URL includes the correct port (default 4222)
- Verify network connectivity and firewall rules
- For NATS clusters, ensure at least one seed server is accessible
"Authorization violation"
Authentication or authorization failed:
- Verify credentials file path exists and is readable
- Check that the user/account has publish permission on the target subject
- For NKey auth, ensure the seed matches the configured user
"No responders" or "Stream not found"
JetStream is not configured for the target subject:
- Create the JetStream stream: nats stream add EVENTS --subjects "events.>"
- Or set the stream parameter to match an existing stream
- Verify JetStream is enabled on the NATS server (jetstream: enabled in the config)
Further Reading
- Sources: NATS — Consuming from NATS into a pg_tide inbox
- Content-Based Routing — Advanced subject routing patterns
- Bidirectional Sync Tutorial — Using NATS for two-way communication
RabbitMQ
RabbitMQ is one of the most widely deployed open-source message brokers, trusted by tens of thousands of organizations for reliable message delivery. Built on the AMQP 0-9-1 protocol, RabbitMQ provides sophisticated routing capabilities through its exchange-and-queue model, making it particularly well-suited for scenarios where messages need to be routed to different consumers based on content, headers, or routing patterns. When you connect pg_tide to RabbitMQ, your outbox messages are published to exchanges where RabbitMQ's routing rules determine which queues (and ultimately which consumers) receive each message.
When to Use This Sink
Choose RabbitMQ when you need complex message routing patterns (topic exchanges, header-based routing, priority queues), when you are integrating with existing RabbitMQ infrastructure, or when you need per-message acknowledgment with sophisticated dead-letter handling built into the broker itself. RabbitMQ's management UI and mature tooling ecosystem also make it an excellent choice for teams that value operational visibility.
Configuration
Minimal Configuration
SELECT tide.relay_set_outbox(
'orders-to-rabbit',
'orders',
'rabbit-relay',
'{
"sink_type": "rabbitmq",
"url": "amqp://localhost:5672",
"exchange": "events",
"routing_key": "orders.created"
}'::jsonb
);
Production Configuration
SELECT tide.relay_set_outbox(
'orders-to-rabbit',
'orders',
'rabbit-relay',
'{
"sink_type": "rabbitmq",
"url": "amqps://${env:RABBITMQ_USER}:${env:RABBITMQ_PASS}@${env:RABBITMQ_HOST}:5671/%2f",
"exchange": "events",
"exchange_type": "topic",
"routing_key": "{stream_table}.{op}",
"publisher_confirms": true,
"mandatory": true,
"tls_enabled": true,
"batch_size": 100
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "rabbitmq" |
| url | string | — | AMQP connection URL |
| exchange | string | — | Target exchange name |
| exchange_type | string | "topic" | Exchange type: "direct", "topic", "fanout", "headers" |
| routing_key | string | "" | Routing key template. Supports {stream_table}, {op} |
| publisher_confirms | bool | true | Wait for broker acknowledgment |
| mandatory | bool | false | Return unroutable messages as errors |
| tls_enabled | bool | false | Enable TLS |
| batch_size | int | 100 | Messages per batch |
| headers | object | null | Additional AMQP message headers |
Delivery Guarantees
With publisher_confirms: true (the default), RabbitMQ acknowledges each message after it has been written to disk and (if mirrored) replicated to mirror nodes. This provides at-least-once delivery. Combined with RabbitMQ's built-in deduplication plugin or consumer-side idempotency, you can achieve effectively exactly-once processing.
Routing Patterns
RabbitMQ's routing model is more flexible than simple topic-based systems. The routing_key template combined with exchange types enables powerful patterns:
- Direct exchange: Messages with routing key orders.insert go only to queues bound with that exact key.
- Topic exchange: Messages with routing key orders.insert match queues bound to orders.*, orders.#, or *.insert (see the binding sketch after this list).
- Fanout exchange: All messages go to all bound queues regardless of routing key (broadcast).
- Headers exchange: Route based on message headers rather than routing key.
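As an illustration of topic bindings, the following sketch (using rabbitmqadmin; the queue name is illustrative) binds a queue to the events exchange with a wildcard pattern:

# Declare a durable queue and bind it to the topic exchange
rabbitmqadmin declare queue name=order-processing durable=true
rabbitmqadmin declare binding source=events destination=order-processing \
  routing_key="orders.*"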
Complete Example
-- Publish an order event
SELECT tide.outbox_publish(
'orders',
'{"order_id": "ord-99", "status": "paid", "amount": 149.99}'::jsonb,
'ord-99-paid'
);
Verify with rabbitmqadmin:
rabbitmqadmin get queue=order-processing count=1
Troubleshooting
- "Connection refused" — Check RabbitMQ is running and the port is correct (5672 for AMQP, 5671 for AMQPS)
- "Access refused" — Verify username/password and that the user has publish permission on the exchange
- "Exchange not found" — Create the exchange first or set
exchange_declare: trueif supported - Messages not arriving in queue — Check queue bindings match the routing key pattern
Further Reading
- Sources: RabbitMQ — Consuming from RabbitMQ into pg_tide inbox
- Content-Based Routing — Dynamic routing key templates
Redis Streams
Redis Streams is a log-like data structure built into Redis that combines the simplicity of Redis with the durability of an append-only log. Unlike Redis Pub/Sub (which is fire-and-forget), Streams persist messages and support consumer groups with acknowledgment semantics, making them suitable for reliable event delivery. When you connect pg_tide to Redis Streams, your outbox messages are appended to a Redis stream where multiple consumer groups can independently read and process them at their own pace.
Redis Streams is an excellent choice when you already run Redis in your infrastructure and want lightweight event streaming without deploying a separate message broker. It provides ordering guarantees within a single stream, consumer group support for load balancing, and the sub-millisecond latency that Redis is known for.
When to Use This Sink
Choose Redis Streams when you need lightweight, fast message delivery and already have Redis in your stack. It is particularly effective for real-time dashboards, caching invalidation, rate-limited work queues, and microservice communication where message volumes are moderate (thousands per second rather than millions). Consider Kafka or NATS for higher throughput requirements or when you need longer retention periods.
Configuration
Minimal Configuration
SELECT tide.relay_set_outbox(
'events-to-redis',
'events',
'redis-relay',
'{
"sink_type": "redis",
"url": "redis://localhost:6379",
"stream_key": "events:outbox"
}'::jsonb
);
Production Configuration
SELECT tide.relay_set_outbox(
'events-to-redis',
'events',
'redis-relay',
'{
"sink_type": "redis",
"url": "rediss://${env:REDIS_HOST}:6380",
"password": "${env:REDIS_PASSWORD}",
"stream_key": "events:{stream_table}",
"maxlen": 100000,
"batch_size": 200,
"tls_enabled": true
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "redis" |
| url | string | — | Redis connection URL |
| password | string | null | Redis password (or use URL auth) |
| stream_key | string | — | Redis stream key. Supports {stream_table}, {op} |
| maxlen | int | null | Maximum stream length (MAXLEN trimming). Older entries are evicted |
| batch_size | int | 100 | Messages per pipeline batch |
| tls_enabled | bool | false | Enable TLS |
| database | int | 0 | Redis database number |
Delivery Guarantees
Redis Streams provides at-least-once delivery when combined with pg_tide's offset tracking. Messages are appended atomically to the stream using XADD, and the relay commits its offset only after Redis confirms the write. Redis Streams' consumer groups provide independent progress tracking for multiple downstream consumers.
Stream Management
Redis Streams grow indefinitely unless you configure trimming. The maxlen parameter caps the stream size:
{"maxlen": 100000}
This keeps at most 100,000 entries. Redis uses approximate trimming by default for performance, so the actual size may briefly exceed this limit. For time-based retention, use Redis's built-in XTRIM with MINID in a periodic cleanup job.
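For example, a periodic cleanup job could run something like the following sketch (the stream key is illustrative; Redis stream IDs begin with a millisecond timestamp, so MINID doubles as a time cutoff):

# Approximately trim entries whose IDs fall below the cutoff
redis-cli XTRIM events:outbox MINID '~' 1715000000000-0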
Complete Example
SELECT tide.outbox_publish(
'cache_invalidation',
'{"entity": "product", "id": "prod-42", "action": "updated"}'::jsonb,
'prod-42-v7'
);
Verify with redis-cli:
redis-cli XRANGE events:cache_invalidation - + COUNT 5
Troubleshooting
- "Connection refused" — Verify Redis is running and accessible on the specified port
- "NOAUTH Authentication required" — Set the
passwordparameter - "OOM command not allowed" — Redis is out of memory; configure
maxlenor increase Redis memory - High memory usage — Streams without
maxlengrow indefinitely; configure trimming
Further Reading
- Sources: Redis — Consuming from Redis Streams into pg_tide inbox
- Rate Limiting — Controlling message flow to Redis
Amazon SQS
Amazon Simple Queue Service (SQS) is a fully managed message queuing service provided by AWS. It requires zero operational overhead — there are no brokers to provision, no clusters to manage, and no capacity to plan. SQS automatically scales from one message per second to thousands, and you pay only for what you use. When you connect pg_tide to SQS, your outbox messages are delivered to SQS queues where they can trigger Lambda functions, feed ECS services, or be consumed by any AWS service or application that polls SQS.
SQS offers two queue types: Standard queues provide nearly unlimited throughput with at-least-once delivery, while FIFO queues guarantee exactly-once processing with strict message ordering. Both work seamlessly with pg_tide.
When to Use This Sink
Choose SQS when your infrastructure runs on AWS and you want a zero-maintenance message queue. SQS is particularly valuable for triggering Lambda functions (event-driven serverless), decoupling microservices within AWS, and building reliable work queues where messages must not be lost. The FIFO variant is excellent when you need both ordering and exactly-once delivery without managing broker infrastructure.
Configuration
Minimal Configuration
SELECT tide.relay_set_outbox(
'orders-to-sqs',
'orders',
'sqs-relay',
'{
"sink_type": "sqs",
"queue_url": "https://sqs.us-east-1.amazonaws.com/123456789/order-events",
"region": "us-east-1"
}'::jsonb
);
Production Configuration (FIFO Queue)
SELECT tide.relay_set_outbox(
'orders-to-sqs',
'orders',
'sqs-relay',
'{
"sink_type": "sqs",
"queue_url": "${env:SQS_QUEUE_URL}",
"region": "${env:AWS_REGION}",
"message_group_id": "{stream_table}",
"deduplication_id": "{dedup_key}",
"batch_size": 10
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "sqs" |
| queue_url | string | — | Full SQS queue URL |
| region | string | — | AWS region |
| access_key_id | string | null | AWS access key (falls back to default credential chain) |
| secret_access_key | string | null | AWS secret key |
| message_group_id | string | null | Message group ID for FIFO queues. Supports templates |
| deduplication_id | string | null | Deduplication ID for FIFO queues. Supports {dedup_key} |
| batch_size | int | 10 | Messages per SendMessageBatch (max 10 for SQS) |
| message_attributes | object | null | Custom SQS message attributes |
Authentication
The relay uses the standard AWS credential chain: environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY), instance profile (EC2/ECS), or explicit credentials in the pipeline config. For production on AWS, use IAM roles attached to your ECS task or EC2 instance rather than explicit keys.
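A minimal IAM policy for the relay's identity might look like this sketch (the account ID and queue name are placeholders):

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["sqs:SendMessage", "sqs:SendMessageBatch"],
    "Resource": "arn:aws:sqs:us-east-1:123456789:order-events"
  }]
}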
Delivery Guarantees
- Standard queues: At-least-once delivery with best-effort ordering. Messages may occasionally be delivered more than once.
- FIFO queues: Exactly-once processing with strict ordering within a message group. Set deduplication_id to {dedup_key} for automatic deduplication (a queue creation sketch follows below).
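Creating a FIFO queue for such a pipeline is a one-time step; a sketch with the AWS CLI (the queue name is illustrative and must end in .fifo):

aws sqs create-queue \
  --queue-name order-events.fifo \
  --attributes FifoQueue=true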
Complete Example
SELECT tide.outbox_publish(
'orders',
'{"event": "order.created", "order_id": "ord-100", "total": 299.00}'::jsonb,
'ord-100-created'
);
The message appears in the SQS queue and can trigger a Lambda function:
aws sqs receive-message --queue-url $SQS_QUEUE_URL --max-number-of-messages 1
Troubleshooting
- "Access Denied" — The IAM role/user needs
sqs:SendMessageandsqs:SendMessageBatchpermissions on the queue - "Queue does not exist" — Verify the queue URL is correct and the queue exists in the specified region
- "InvalidParameterValue" for FIFO — FIFO queues require
message_group_id; ensure it's configured - Duplicate messages in Standard queue — Expected behavior; implement idempotent consumers
Further Reading
- Sources: SQS — Consuming from SQS into pg_tide inbox
- Amazon Kinesis — For higher throughput streaming on AWS
Amazon Kinesis Data Streams
Amazon Kinesis Data Streams is a real-time data streaming service designed for high-volume, continuous data ingestion on AWS. Unlike SQS (which is a queue), Kinesis is a stream — data is retained for a configurable period and multiple consumers can independently read from the same stream at their own pace. When pg_tide delivers messages to Kinesis, they become available to real-time analytics applications, machine learning pipelines, and data lake ingestion processes running on AWS.
Kinesis is designed for scenarios where you need to process hundreds of thousands of records per second in real time. Each stream is composed of shards, and each shard provides 1 MB/s of write capacity and 2 MB/s of read capacity, allowing you to scale by adding more shards.
When to Use This Sink
Choose Kinesis when you need high-throughput real-time streaming on AWS, when you want multiple consumers reading the same data independently (Kinesis Analytics, Lambda, custom applications), or when you need data retention for replay purposes (up to 365 days). Kinesis integrates deeply with AWS services like Firehose, Analytics, and Lambda.
Configuration
Minimal Configuration
SELECT tide.relay_set_outbox(
'events-to-kinesis',
'events',
'kinesis-relay',
'{
"sink_type": "kinesis",
"stream_name": "pg-tide-events",
"region": "us-east-1",
"partition_key": "{dedup_key}"
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "kinesis" |
| stream_name | string | — | Kinesis stream name |
| region | string | — | AWS region |
| partition_key | string | — | Partition key template. Determines shard assignment. Supports {dedup_key}, {stream_table} |
| access_key_id | string | null | AWS access key (falls back to default credential chain) |
| secret_access_key | string | null | AWS secret key |
| batch_size | int | 100 | Records per PutRecords call (max 500) |
Delivery Guarantees
Kinesis provides at-least-once delivery. The relay uses the PutRecords API for batch ingestion and confirms delivery before committing offsets. Kinesis guarantees ordering within a partition key, so messages with the same partition_key value are always delivered in order.
Partition Strategy
The partition key determines which shard receives each record. Use {dedup_key} to keep all events for the same entity on the same shard (preserving per-entity ordering), or {stream_table} to group by outbox name. For maximum throughput distribution, use a high-cardinality key.
Troubleshooting
- "Stream not found" — Verify stream name and region are correct
- "ProvisionedThroughputExceededException" — Shard capacity exceeded; add more shards or reduce batch rate with the rate limiter
- "Access Denied" — IAM role needs
kinesis:PutRecordandkinesis:PutRecordspermissions
Further Reading
- Sources: Kinesis — Consuming from Kinesis into pg_tide inbox
- Amazon SQS — For simpler queuing without stream semantics
Google Cloud Pub/Sub
Google Cloud Pub/Sub is a fully managed, globally distributed messaging service built for reliability at scale. It decouples services by allowing publishers to send messages to topics without knowing who will receive them, and subscribers to receive messages without knowing who sent them. When pg_tide publishes to Pub/Sub, your outbox messages become available to any GCP service or application subscribed to the topic — from Cloud Functions and Cloud Run to Dataflow and BigQuery subscriptions.
Pub/Sub handles automatic scaling, message retention (up to 31 days), and global message routing without any infrastructure management. It is the natural choice for GCP-native architectures.
When to Use This Sink
Choose Pub/Sub when your infrastructure runs on Google Cloud Platform, when you need global message distribution across regions, or when you want deep integration with GCP services like Cloud Functions (event triggers), Dataflow (stream processing), and BigQuery (direct subscriptions for analytics). Pub/Sub supports ordering within ordering keys and scales to millions of messages per second without provisioning.
Configuration
Minimal Configuration
SELECT tide.relay_set_outbox(
'events-to-pubsub',
'events',
'pubsub-relay',
'{
"sink_type": "pubsub",
"project_id": "my-gcp-project",
"topic": "outbox-events"
}'::jsonb
);
Production Configuration
SELECT tide.relay_set_outbox(
'events-to-pubsub',
'events',
'pubsub-relay',
'{
"sink_type": "pubsub",
"project_id": "${env:GCP_PROJECT_ID}",
"topic": "events-{stream_table}",
"credentials_json": "${file:/etc/gcp/service-account.json}",
"ordering_key": "{dedup_key}",
"batch_size": 100
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "pubsub" |
| project_id | string | — | GCP project ID |
| topic | string | — | Pub/Sub topic name. Supports templates |
| credentials_json | string | null | Service account JSON (falls back to Application Default Credentials) |
| ordering_key | string | null | Ordering key template for ordered delivery |
| batch_size | int | 100 | Messages per publish request |
| attributes | object | null | Custom message attributes |
Authentication
On GCP (GKE, Cloud Run, Compute Engine), use Workload Identity or the default service account — no explicit credentials needed. For external deployments, provide a service account JSON key file via credentials_json.
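Granting publish rights to the relay's service account is a one-time setup step; a sketch (the topic, project, and account names are placeholders):

gcloud pubsub topics add-iam-policy-binding outbox-events \
  --member="serviceAccount:pg-tide-relay@my-gcp-project.iam.gserviceaccount.com" \
  --role="roles/pubsub.publisher"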
Delivery Guarantees
Pub/Sub provides at-least-once delivery by default. With ordering keys configured, messages sharing the same ordering key are delivered in publish order to subscribers. Combined with subscriber-side deduplication (using the message ID or the dedup_key attribute), you can achieve effectively exactly-once processing.
Complete Example
SELECT tide.outbox_publish(
'analytics_events',
'{"event": "page_view", "user_id": "u-456", "page": "/checkout"}'::jsonb,
'pv-u456-checkout-1715000000'
);
Messages appear in the Pub/Sub topic and can be consumed by any subscriber:
gcloud pubsub subscriptions pull my-subscription --auto-ack --limit=5
Troubleshooting
- "Permission denied" — Service account needs
roles/pubsub.publisheron the topic - "Topic not found" — Create the topic first:
gcloud pubsub topics create outbox-events - "Ordering key too long" — Ordering keys must be ≤ 1024 bytes
Further Reading
- Sources: Pub/Sub — Subscribing to Pub/Sub messages into pg_tide inbox
- BigQuery — For direct analytics sink on GCP
Azure Service Bus
Azure Service Bus is Microsoft's enterprise message broker, providing reliable message delivery with advanced features like sessions, transactions, dead-lettering, and scheduled delivery. It serves as the backbone for many enterprise integration patterns on Azure. When pg_tide publishes to Service Bus, your outbox messages are delivered to queues or topics where Azure Functions, Logic Apps, or custom applications can process them with enterprise-grade reliability.
When to Use This Sink
Choose Azure Service Bus when your infrastructure runs on Azure, when you need enterprise messaging features (sessions for ordered processing, transactions, scheduled messages), or when you are integrating with Azure-native services like Azure Functions and Logic Apps. Service Bus is particularly strong for ordered message processing and complex routing scenarios within Azure.
Configuration
SELECT tide.relay_set_outbox(
'events-to-servicebus',
'events',
'servicebus-relay',
'{
"sink_type": "servicebus",
"connection_string": "${env:SERVICEBUS_CONNECTION_STRING}",
"queue_or_topic": "outbox-events",
"session_id": "{stream_table}",
"batch_size": 50
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "servicebus" |
| connection_string | string | — | Service Bus connection string with send permissions |
| queue_or_topic | string | — | Target queue or topic name |
| session_id | string | null | Session ID template for session-enabled entities |
| batch_size | int | 50 | Messages per batch send |
| content_type | string | "application/json" | Message content type |
| properties | object | null | Custom application properties |
Delivery Guarantees
Service Bus provides at-least-once delivery with server-side duplicate detection available. When sending to session-enabled queues/topics, messages within the same session are delivered in FIFO order. The relay commits offsets only after Service Bus acknowledges receipt.
Troubleshooting
- "Unauthorized access" — Verify connection string has
Sendpermission on the entity - "Entity not found" — Queue/topic does not exist; create it in the Azure portal
- "Session ID required" — The target entity requires sessions; set
session_id
Further Reading
- Sources: Service Bus — Receiving from Service Bus
- Azure Event Hubs — For higher throughput scenarios on Azure
Azure Event Hubs
Azure Event Hubs is a high-throughput event ingestion service on Azure, capable of receiving and processing millions of events per second. It is architecturally similar to Apache Kafka — events flow into partitions, consumer groups track progress independently, and data is retained for a configurable period. In fact, Event Hubs exposes a Kafka-compatible endpoint, making it a managed alternative to self-hosted Kafka on Azure. When pg_tide publishes to Event Hubs, your outbox messages become available to Azure Stream Analytics, Azure Functions, and any Kafka-compatible consumer.
When to Use This Sink
Choose Event Hubs when you need high-throughput event ingestion on Azure, when you want a managed Kafka-compatible service without cluster operations, or when you are building real-time analytics pipelines with Azure Stream Analytics. Event Hubs is designed for millions of events per second and integrates natively with the Azure data ecosystem.
Configuration
SELECT tide.relay_set_outbox(
'events-to-eventhubs',
'events',
'eventhubs-relay',
'{
"sink_type": "eventhubs",
"connection_string": "${env:EVENTHUBS_CONNECTION_STRING}",
"event_hub_name": "outbox-events",
"partition_key": "{stream_table}",
"batch_size": 100
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "eventhubs" |
| connection_string | string | — | Event Hubs connection string |
| event_hub_name | string | — | Target Event Hub name |
| partition_key | string | null | Partition key template for ordering |
| batch_size | int | 100 | Events per batch send |
| properties | object | null | Custom event properties |
Delivery Guarantees
Event Hubs provides at-least-once delivery. Events are durably stored across multiple replicas before the send is acknowledged. Ordering is guaranteed within a partition (determined by partition key).
Kafka Compatibility
Event Hubs exposes a Kafka-compatible endpoint. If you prefer, you can use the Kafka sink with Event Hubs' Kafka endpoint instead of the native Event Hubs protocol. The native sink is slightly more efficient as it uses the AMQP protocol directly.
Troubleshooting
- "Unauthorized" — Connection string needs
Sendclaim on the Event Hub - "Event Hub not found" — Verify the event hub name matches the entity in your namespace
- "Quota exceeded" — Throughput units exhausted; scale up the Event Hubs namespace
Further Reading
- Sources: Event Hubs — Consuming from Event Hubs
- Azure Service Bus — For enterprise queuing patterns
MQTT v5
MQTT (Message Queuing Telemetry Transport) is a lightweight publish/subscribe messaging protocol designed for constrained devices and low-bandwidth, high-latency networks. It has become the de facto standard for IoT (Internet of Things) communication, connecting everything from industrial sensors and smart home devices to connected vehicles and healthcare equipment. When pg_tide publishes to an MQTT broker, your outbox messages are delivered to MQTT topics where IoT devices, edge gateways, and backend services can subscribe.
MQTT v5 (the version pg_tide supports) adds features like message expiry, topic aliases, shared subscriptions, and user properties that make it suitable for more sophisticated use cases beyond basic IoT telemetry.
When to Use This Sink
Choose the MQTT sink when you need to deliver events to IoT devices or edge computing infrastructure, when you are integrating with industrial IoT platforms (HiveMQ, EMQX, AWS IoT Core, Azure IoT Hub), or when you need the lightweight protocol overhead that MQTT provides for bandwidth-constrained environments. MQTT's Quality of Service (QoS) levels let you choose between maximum performance (QoS 0), guaranteed delivery (QoS 1), and exactly-once delivery (QoS 2).
Configuration
SELECT tide.relay_set_outbox(
'telemetry-to-mqtt',
'device_commands',
'mqtt-relay',
'{
"sink_type": "mqtt",
"url": "mqtts://${env:MQTT_BROKER}:8883",
"topic": "devices/{stream_table}/commands",
"qos": 1,
"username": "${env:MQTT_USER}",
"password": "${env:MQTT_PASS}",
"client_id": "pg-tide-relay-01",
"tls_enabled": true
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "mqtt" |
| url | string | — | MQTT broker URL (mqtt:// or mqtts://) |
| topic | string | — | MQTT topic template. Supports {stream_table}, {op} |
| qos | int | 1 | Quality of Service: 0 (at-most-once), 1 (at-least-once), 2 (exactly-once) |
| username | string | null | Authentication username |
| password | string | null | Authentication password |
| client_id | string | auto-generated | MQTT client identifier |
| tls_enabled | bool | false | Enable TLS |
| tls_ca_cert | string | null | CA certificate path |
| retain | bool | false | Set retain flag on published messages |
| clean_start | bool | true | Start with a clean session |
| batch_size | int | 1 | Messages per batch (most MQTT use cases are individual) |
Delivery Guarantees
Delivery guarantees depend on the QoS level:
- QoS 0 (At-most-once): Fire-and-forget. Fastest but messages can be lost.
- QoS 1 (At-least-once): Broker acknowledges receipt. Messages may be delivered more than once. Recommended for most use cases.
- QoS 2 (Exactly-once): Four-phase handshake ensures each message is delivered exactly once. Highest overhead but strongest guarantee.
Topic Hierarchy
MQTT topics use / as a separator (unlike NATS which uses .). pg_tide's template variables work naturally with MQTT's topic model:
devices/sensors/temperature → sensor readings
commands/device-01/reboot → device commands
events/orders/created → business events
Troubleshooting
- "Connection refused" — Check broker URL, port (1883 for TCP, 8883 for TLS), and firewall rules
- "Not authorized" — Verify username/password or client certificate
- "Client ID already in use" — Each relay instance needs a unique
client_id - Messages not received by subscribers — Check topic name matches subscriber's subscription pattern (MQTT uses
+and#wildcards)
Further Reading
- Sources: MQTT — Subscribing to MQTT topics into pg_tide inbox
- Rate Limiting — Protecting IoT brokers from overload
HTTP Webhook
An HTTP webhook is one of the most versatile ways to connect pg_tide to the outside world. Rather than requiring a specific message broker or protocol, webhooks deliver your outbox messages as HTTP POST requests to any URL endpoint. This makes webhooks ideal for integrating with third-party APIs, triggering serverless functions, notifying external services, and building custom integrations where the destination simply needs to accept an HTTP request.
When you configure a webhook sink, the pg_tide relay acts as an HTTP client that sends your outbox messages as JSON payloads to the configured URL. The relay handles retries, timeouts, request signing, and error tracking automatically — all you need to provide is a URL and, optionally, some authentication headers.
When to Use This Sink
Choose the webhook sink when:
- Integrating with third-party APIs — Most SaaS platforms accept webhook-style HTTP callbacks. If the service you want to notify has a REST API or webhook endpoint, this sink works immediately.
- Triggering serverless functions — AWS Lambda (via function URLs or API Gateway), Google Cloud Functions, Azure Functions, and Cloudflare Workers all accept HTTP requests as triggers.
- Custom microservices — Your internal services expose HTTP endpoints for receiving events. Webhooks provide a simple, protocol-agnostic delivery mechanism.
- No broker infrastructure — You don't want to run Kafka or NATS just to deliver events to a single service. An HTTP webhook requires no message broker infrastructure at all.
- Prototyping and development — Webhooks are the fastest way to verify your outbox pipeline works end-to-end, since you can use services like webhook.site or ngrok to inspect deliveries.
Consider a dedicated message queue sink (Kafka, NATS, SQS) if you need fan-out to multiple consumers, message replay, or if the destination processes events asynchronously with its own consumer group model.
Configuration
Minimal Configuration
SELECT tide.relay_set_outbox(
'orders-webhook',
'orders',
'webhook-relay',
'{
"sink_type": "webhook",
"url": "https://api.example.com/webhooks/orders"
}'::jsonb
);
Production Configuration
SELECT tide.relay_set_outbox(
'orders-webhook',
'orders',
'webhook-relay',
'{
"sink_type": "webhook",
"url": "${env:WEBHOOK_URL}",
"headers": {
"Authorization": "Bearer ${env:WEBHOOK_TOKEN}",
"X-Source": "pg-tide"
},
"timeout_ms": 10000,
"signature_secret": "${env:WEBHOOK_SIGNING_SECRET}",
"signature_scheme": "hmac-sha256",
"signature_header": "X-Signature-256"
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "webhook" |
| url | string | — | Target URL. Supports template variables: {stream_table}, {op} |
| method | string | "POST" | HTTP method |
| headers | object | {} | Static headers added to every request |
| timeout_ms | int | 30000 | Request timeout in milliseconds |
| signature_secret | string | null | Secret key for HMAC signature generation |
| signature_scheme | string | null | Signature scheme: "hmac-sha256", "github", "stripe", "svix" |
| signature_header | string | "X-Webhook-Signature" | Header name for the computed signature |
| batch_size | int | 1 | Messages per request (1 = individual delivery) |
| content_type | string | "application/json" | Content-Type header value |
Authentication
Bearer Token
The most common webhook authentication pattern:
{
"sink_type": "webhook",
"url": "https://api.example.com/events",
"headers": {
"Authorization": "Bearer ${env:API_TOKEN}"
}
}
API Key Header
Some services use a custom header for API key authentication:
{
"sink_type": "webhook",
"url": "https://api.example.com/ingest",
"headers": {
"X-API-Key": "${env:API_KEY}"
}
}
HMAC Signature (Webhook Signing)
For secure webhook delivery, pg_tide can compute an HMAC-SHA256 signature of the request body and include it in a header. This allows the receiver to verify that the request genuinely came from pg_tide:
{
"sink_type": "webhook",
"url": "https://myservice.example.com/hooks",
"signature_secret": "${env:SIGNING_SECRET}",
"signature_scheme": "hmac-sha256",
"signature_header": "X-Signature-256"
}
The receiver computes HMAC-SHA256(body, secret) and compares it to the header value using a timing-safe comparison.
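For debugging, the expected signature can be reproduced with openssl (a sketch; body.json stands in for the raw request body exactly as received):

# Compute HMAC-SHA256 over the raw body; compare to the X-Signature-256 header
openssl dgst -sha256 -hmac "$SIGNING_SECRET" < body.json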
Delivery Guarantees
The webhook sink provides at-least-once delivery. A delivery is considered successful when the target endpoint responds with an HTTP status code in the 2xx range. Any other response (4xx, 5xx, timeout, connection error) triggers a retry.
The retry sequence uses exponential backoff: 100ms → 200ms → 400ms → 800ms → ... up to a maximum of 30 seconds between attempts. After exhausting all retries, the message is routed to the dead-letter queue.
Important: Your webhook receiver should be idempotent. Because delivery is at-least-once, the same message may be delivered more than once in edge cases (e.g., the relay crashes after the endpoint processed the request but before acknowledging the outbox offset). Use the dedup_key included in the payload or headers to detect and skip duplicates.
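One simple receiver-side pattern is to record each dedup_key behind a unique constraint and skip payloads already seen — a sketch in PostgreSQL (the table and key names are illustrative):

CREATE TABLE webhook_seen (
    dedup_key   text PRIMARY KEY,
    received_at timestamptz NOT NULL DEFAULT now()
);

-- Returns zero rows for a duplicate, so the handler can skip processing
INSERT INTO webhook_seen (dedup_key)
VALUES ('ord-123-ready')
ON CONFLICT (dedup_key) DO NOTHING
RETURNING dedup_key;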
URL Routing
The url field supports template variables for dynamic routing:
{
"sink_type": "webhook",
"url": "https://api.example.com/hooks/{stream_table}/{op}"
}
This routes INSERT events from the orders outbox to https://api.example.com/hooks/orders/insert.
Complete Example: Notify a Fulfillment Service
1. Create the Outbox
SELECT tide.outbox_create('fulfillment_events', retention_hours => 72);
2. Publish an Event
BEGIN;
UPDATE orders SET status = 'ready_to_ship' WHERE id = 'ord-123';
SELECT tide.outbox_publish(
'fulfillment_events',
jsonb_build_object(
'event', 'order.ready_to_ship',
'order_id', 'ord-123',
'warehouse', 'us-east-1',
'items', jsonb_build_array('SKU-001', 'SKU-002')
),
'ord-123-ready'
);
COMMIT;
3. Configure the Pipeline
SELECT tide.relay_set_outbox(
'fulfillment-webhook',
'fulfillment_events',
'webhook-group',
'{
"sink_type": "webhook",
"url": "https://fulfillment.internal/api/v1/events",
"headers": {
"Authorization": "Bearer ${env:FULFILLMENT_API_KEY}",
"Content-Type": "application/json"
},
"timeout_ms": 5000,
"signature_secret": "${env:WEBHOOK_SECRET}",
"signature_scheme": "hmac-sha256"
}'::jsonb
);
SELECT tide.relay_enable('fulfillment-webhook');
4. Start the Relay
export FULFILLMENT_API_KEY="sk_live_abc123"
export WEBHOOK_SECRET="whsec_xyz789"
pg-tide --postgres-url "postgresql://relay@localhost:5432/mydb"
The fulfillment service will receive an HTTP POST with the event payload and a signature header it can verify.
Troubleshooting
"Connection refused" or timeout
The target URL is not reachable:
- Verify the URL is correct and the service is running
- Check DNS resolution for the hostname
- Verify network connectivity (firewalls, security groups, VPN)
- Check that timeout_ms is sufficient for the endpoint's response time
HTTP 401 / 403 responses
Authentication failed:
- Verify the Authorization header value is correct
- Check that API keys/tokens have not expired
- Ensure the token has permission to access the webhook endpoint
HTTP 429 (Too Many Requests)
The endpoint is rate-limiting your requests:
- Add a rate limiter to your pipeline configuration
- Consider increasing batch_size if the endpoint supports batch payloads
- Check the endpoint's rate limit documentation
Signature verification failures on the receiver
The receiver rejects the webhook signature:
- Ensure the signing secret matches on both sides
- Verify the signature_scheme matches what the receiver expects
- Check that the receiver is comparing against the raw request body (not a parsed/re-serialized version)
Further Reading
- Webhook Signature Verification — Detailed signature scheme documentation
- Rate Limiting — Protecting endpoints from overload
- Dead-Letter Queue — Handling persistent delivery failures
- Content-Based Routing — Dynamic URL routing based on message content
ClickHouse
ClickHouse is an open-source columnar database management system designed for real-time analytical queries on large datasets. It can process billions of rows per second, making it one of the fastest analytical databases available. When pg_tide delivers messages to ClickHouse, your PostgreSQL events become immediately queryable for real-time dashboards, log analytics, time-series analysis, and business intelligence workloads.
Unlike traditional message queues where data is consumed and deleted, ClickHouse stores your events permanently (or until you define a TTL), letting you run ad-hoc analytical queries across your entire event history. This makes it an excellent complement to pg_tide — your outbox provides reliable event delivery, and ClickHouse provides the analytical query engine.
When to Use This Sink
Choose ClickHouse when you need real-time analytics on your PostgreSQL events, when you want sub-second query performance on billions of rows, or when you are building observability platforms (log storage, metrics, traces). ClickHouse excels at time-series data, aggregation queries, and full-text search across structured event data. It is particularly cost-effective for high-volume workloads because its columnar compression achieves 10-50x data reduction.
Consider Snowflake or BigQuery if you prefer fully managed cloud services with zero operations, or Elasticsearch if your primary need is full-text search with fuzzy matching.
Configuration
Minimal Configuration
SELECT tide.relay_set_outbox(
'events-to-clickhouse',
'events',
'clickhouse-relay',
'{
"sink_type": "clickhouse",
"url": "http://localhost:8123",
"database": "analytics",
"table": "events"
}'::jsonb
);
Production Configuration
SELECT tide.relay_set_outbox(
'events-to-clickhouse',
'events',
'clickhouse-relay',
'{
"sink_type": "clickhouse",
"url": "https://${env:CLICKHOUSE_HOST}:8443",
"database": "analytics",
"table": "events",
"username": "${env:CLICKHOUSE_USER}",
"password": "${env:CLICKHOUSE_PASSWORD}",
"batch_size": 1000,
"tls_enabled": true
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "clickhouse" |
| url | string | — | ClickHouse HTTP interface URL |
| database | string | — | Target database |
| table | string | — | Target table |
| username | string | "default" | Authentication username |
| password | string | "" | Authentication password |
| batch_size | int | 1000 | Rows per INSERT batch |
| tls_enabled | bool | false | Enable TLS |
Table Schema
The relay inserts messages as structured rows. Create a ClickHouse table that matches your event schema:
CREATE TABLE analytics.events (
event_id String,
outbox_name String,
event_type String,
payload String, -- JSON string
dedup_key String,
created_at DateTime64(3),
op String
) ENGINE = ReplacingMergeTree(created_at)
ORDER BY (event_type, created_at)
TTL created_at + INTERVAL 90 DAY;
The ReplacingMergeTree engine automatically deduplicates rows with the same sort key during background merges, providing eventual deduplication even if the relay delivers a message twice.
Delivery Guarantees
ClickHouse provides at-least-once delivery. The relay uses batch INSERT operations and commits offsets only after ClickHouse confirms the insert succeeded. Using ReplacingMergeTree or dedup_key checks in your queries provides idempotent behavior.
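For queries that must see deduplicated results before background merges complete, ClickHouse's FINAL modifier can be applied (a sketch against the schema above):

-- FINAL collapses rows sharing the same sort key at query time
SELECT event_type, count() AS events
FROM analytics.events FINAL
GROUP BY event_type;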
Performance Tuning
ClickHouse performs best with large batch inserts (1,000+ rows). Small, frequent inserts create many small parts that require background merging. Configure:
- batch_size: 1000-5000 — Larger batches are more efficient for ClickHouse
- Adjust the relay's polling interval to accumulate larger batches during high throughput
Troubleshooting
- "Table not found" — Create the target table in ClickHouse before starting the pipeline
- "Column count mismatch" — Ensure the ClickHouse table schema matches the fields the relay produces
- "Too many parts" — Batch size is too small; increase
batch_sizeto reduce insert frequency - "Authentication failed" — Check username/password and that the user has INSERT permission
Further Reading
- Snowflake — Cloud data warehouse alternative
- BigQuery — Google Cloud analytics alternative
- Wire Formats — Customize how events are structured for ClickHouse
Snowflake
Snowflake is a cloud-native data warehouse that separates compute from storage, allowing you to scale query processing independently of data volume. It runs on AWS, Azure, and GCP, providing a single platform for data warehousing, data lakes, and data sharing across clouds. When pg_tide delivers messages to Snowflake, your PostgreSQL events flow directly into your data warehouse for analytics, reporting, and machine learning without requiring intermediate ETL pipelines.
When to Use This Sink
Choose Snowflake when your organization uses it as the primary data warehouse for analytics and BI, when you need to combine PostgreSQL event data with other data sources already in Snowflake, or when you want zero-maintenance analytical storage that scales automatically. Snowflake's semi-structured data support (VARIANT type) handles JSON event payloads natively, making it easy to query nested event data without predefined schemas.
Configuration
Production Configuration
SELECT tide.relay_set_outbox(
'events-to-snowflake',
'events',
'snowflake-relay',
'{
"sink_type": "snowflake",
"account": "${env:SNOWFLAKE_ACCOUNT}",
"database": "ANALYTICS",
"schema": "EVENTS",
"table": "RAW_EVENTS",
"warehouse": "INGEST_WH",
"username": "${env:SNOWFLAKE_USER}",
"private_key_path": "${env:SNOWFLAKE_KEY_PATH}",
"batch_size": 500,
"stage": "pg_tide_stage"
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "snowflake" |
| account | string | — | Snowflake account identifier |
| database | string | — | Target database |
| schema | string | — | Target schema |
| table | string | — | Target table |
| warehouse | string | — | Compute warehouse for COPY operations |
| username | string | — | Authentication username |
| password | string | null | Password (use key-pair auth instead for production) |
| private_key_path | string | null | Path to RSA private key for key-pair authentication |
| stage | string | null | Internal stage name for COPY operations |
| batch_size | int | 500 | Rows per micro-batch |
| role | string | null | Snowflake role to assume |
Authentication
For production, use key-pair authentication rather than passwords:
- Generate a key pair: openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out snowflake_key.p8 -nocrypt
- Assign the public key to your Snowflake user: ALTER USER relay_user SET RSA_PUBLIC_KEY='...'
- Reference the private key in your config: "private_key_path": "/etc/snowflake/key.p8"
How It Works
The relay uses a stage-based approach for efficient loading:
- Messages are accumulated into micro-batches
- Each batch is written as a compressed file to an internal Snowflake stage
- A COPY INTO command loads the staged file into the target table
- The stage file is removed after successful loading
This approach is more cost-effective than streaming inserts because Snowflake charges per compute-second, and batch loading uses minimal warehouse time.
Table Schema
CREATE TABLE ANALYTICS.EVENTS.RAW_EVENTS (
event_id VARCHAR,
outbox_name VARCHAR,
payload VARIANT, -- Stores JSON natively
dedup_key VARCHAR,
operation VARCHAR,
ingested_at TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP()
);
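Once loaded, the VARIANT payload can be queried with Snowflake's path syntax, so nested event fields are accessible without a predefined schema (a sketch; the field names follow the order event examples used earlier in this document):

-- Extract nested JSON fields directly from the VARIANT column
SELECT payload:order_id::VARCHAR    AS order_id,
       payload:total::NUMBER(10, 2) AS total
FROM ANALYTICS.EVENTS.RAW_EVENTS
WHERE payload:event_type::VARCHAR = 'order.confirmed';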
Cost Optimization
- Use an X-Small warehouse for ingestion (sufficient for most event volumes)
- Set warehouse auto-suspend to 60 seconds to minimize idle costs
- Batch sizes of 500-1000 reduce the number of COPY operations
- Consider using Snowpipe for continuous micro-batch loading in very high volume scenarios
Troubleshooting
- "Warehouse not running" — Ensure the warehouse is set to auto-resume, or start it manually
- "Insufficient privileges" — Grant USAGE on warehouse, INSERT on table, WRITE on stage
- "Key-pair authentication failed" — Verify the public key is assigned to the Snowflake user and the private key path is correct
Further Reading
- BigQuery — Google Cloud alternative
- ClickHouse — Self-hosted real-time analytics alternative
- Object Storage — Raw file-based data lake landing
BigQuery
Google BigQuery is a serverless, highly scalable data warehouse that can analyze petabytes of data with standard SQL. Unlike traditional warehouses that require provisioning, BigQuery separates storage from compute and charges only for queries run and data stored. When pg_tide delivers messages to BigQuery, your PostgreSQL events become immediately available for analytics, machine learning (via BigQuery ML), and visualization in Looker or Data Studio.
When to Use This Sink
Choose BigQuery when your analytics stack runs on Google Cloud, when you want truly serverless analytics with no capacity planning, or when you need to combine PostgreSQL event data with other GCP datasets. BigQuery's streaming insert API makes events queryable within seconds of delivery, and its columnar storage format provides excellent compression and query performance for analytical workloads.
Configuration
SELECT tide.relay_set_outbox(
'events-to-bigquery',
'events',
'bq-relay',
'{
"sink_type": "bigquery",
"project_id": "${env:GCP_PROJECT_ID}",
"dataset": "event_analytics",
"table": "raw_events",
"credentials_json": "${file:/etc/gcp/service-account.json}",
"batch_size": 500
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "bigquery" |
| project_id | string | — | GCP project ID |
| dataset | string | — | BigQuery dataset name |
| table | string | — | BigQuery table name |
| credentials_json | string | null | Service account JSON (falls back to ADC) |
| batch_size | int | 500 | Rows per streaming insert batch (max 10,000) |
| insert_method | string | "streaming" | Insert method: "streaming" or "load" |
Insert Methods
- Streaming inserts (default): Events are queryable within seconds. Best for real-time analytics. Incurs streaming insert pricing.
- Load jobs: Events are batched into files and loaded periodically. Lower cost but higher latency (minutes). Best for cost-sensitive batch analytics.
Table Schema
CREATE TABLE event_analytics.raw_events (
event_id STRING,
outbox_name STRING,
payload JSON,
dedup_key STRING,
operation STRING,
published_at TIMESTAMP
)
PARTITION BY DATE(published_at)
CLUSTER BY outbox_name, operation;
Partitioning by date and clustering by common filter columns optimizes both cost and query performance.
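Event fields can then be pulled out of the JSON column with BigQuery's JSON functions (a sketch against the schema above):

-- Count today's events per type, scanning only the current partition
SELECT JSON_VALUE(payload, '$.event') AS event_type,
       COUNT(*) AS events
FROM event_analytics.raw_events
WHERE DATE(published_at) = CURRENT_DATE()
GROUP BY event_type;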
Troubleshooting
- "Access Denied" — Service account needs
roles/bigquery.dataEditoron the dataset - "Table not found" — Create the table and dataset before starting the pipeline
- "Streaming insert quota exceeded" — BigQuery has per-table streaming limits; use load jobs for very high volumes
- "Invalid rows" — Schema mismatch between event structure and table schema
Further Reading
- Snowflake — Multi-cloud data warehouse alternative
- Google Cloud Pub/Sub — GCP messaging for intermediate processing
Apache Iceberg
Apache Iceberg is an open table format for large-scale analytical datasets, designed to bring reliability and simplicity to data lakes. Unlike raw files on object storage, Iceberg provides ACID transactions, schema evolution, time travel, and partition evolution — features traditionally associated with data warehouses, but available on open storage like S3, GCS, and ADLS. When pg_tide delivers messages to Iceberg, your PostgreSQL events become part of a queryable lakehouse that can be accessed by Spark, Trino, Flink, Snowflake, BigQuery, and dozens of other engines.
When to Use This Sink
Choose Apache Iceberg when you want the cost efficiency of object storage with the reliability of a data warehouse, when you need multi-engine access to the same data (Spark for ETL, Trino for ad-hoc queries, Flink for streaming), or when vendor lock-in is a concern and you prefer open formats. Iceberg is the foundation of the modern lakehouse architecture and is supported by all major cloud providers and query engines.
Configuration
SELECT tide.relay_set_outbox(
'events-to-iceberg',
'events',
'iceberg-relay',
'{
"sink_type": "iceberg",
"catalog_type": "rest",
"catalog_uri": "${env:ICEBERG_CATALOG_URI}",
"warehouse": "s3://my-lake/warehouse",
"namespace": "analytics",
"table": "events",
"batch_size": 1000
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "iceberg" |
| catalog_type | string | — | Catalog type: "rest", "glue", "hive" |
| catalog_uri | string | — | Catalog service URI |
| warehouse | string | — | Storage location (S3/GCS/ADLS path) |
| namespace | string | — | Iceberg namespace (database) |
| table | string | — | Iceberg table name |
| batch_size | int | 1000 | Records per data file |
| s3_access_key_id | string | null | S3 credentials (falls back to default chain) |
| s3_secret_access_key | string | null | S3 secret key |
| s3_region | string | null | S3 region |
Catalog Types
- REST Catalog — The most portable option. Works with Tabular, Polaris, and any REST-compatible catalog.
- AWS Glue — Native integration with AWS analytics services (Athena, EMR, Redshift Spectrum).
- Hive Metastore — For Hadoop-based environments with existing Hive infrastructure.
How It Works
The relay accumulates messages into batches and writes them as Parquet data files to object storage. Each batch becomes an Iceberg append commit, maintaining full ACID transactional semantics. This means:
- Partial writes never become visible (atomic commits)
- Concurrent readers always see a consistent snapshot
- Failed writes are automatically cleaned up
- Time travel lets you query the state at any point in history
Delivery Guarantees
At-least-once delivery. If the relay restarts mid-batch, the uncommitted data files are orphaned and cleaned up by Iceberg's periodic orphan file removal. The re-delivered messages create a new commit. For exact deduplication, include the dedup_key as a column and deduplicate at query time.
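A query-time deduplication sketch in standard SQL, runnable from engines such as Trino or Spark (the table and timestamp column names are illustrative):

-- Keep one row per dedup_key, preferring the most recent delivery
SELECT *
FROM (
    SELECT t.*,
           row_number() OVER (PARTITION BY dedup_key
                              ORDER BY created_at DESC) AS rn
    FROM analytics.events t
) deduped
WHERE rn = 1;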
Debezium Compatibility
When combined with the Debezium wire format, the Iceberg sink produces CDC-compatible records that standard Iceberg CDC consumers (like the Iceberg Flink connector) can process for upsert/delete semantics:
{
"sink_type": "iceberg",
"wire_format": "debezium",
"catalog_type": "rest",
"catalog_uri": "http://catalog:8181",
"namespace": "cdc",
"table": "orders"
}
Troubleshooting
- "Catalog not found" — Verify
catalog_uriis reachable and the catalog service is running - "Namespace/Table not found" — Create the table first using Spark, Trino, or the catalog API
- "Access denied to storage" — Check S3/GCS/ADLS credentials and bucket policies
- "Commit conflict" — Another writer committed concurrently; the relay will retry automatically
Further Reading
- Delta Lake — Alternative open table format (Databricks ecosystem)
- DuckLake — Lightweight lakehouse with PostgreSQL catalog
- Object Storage — Raw file storage without table format
Delta Lake
Delta Lake is an open-source storage framework that brings ACID transactions and scalable metadata handling to data lakes. Originally created by Databricks, Delta Lake has become a foundational technology in the Databricks ecosystem and is widely adopted beyond it. When pg_tide delivers messages to Delta Lake, your PostgreSQL events are written as Parquet files with a transaction log that enables time travel, schema enforcement, and reliable upserts.
When to Use This Sink
Choose Delta Lake when your analytics platform is built on Databricks or Spark, when you need ACID transactions on object storage, or when you want to support both streaming and batch queries on the same dataset. Delta Lake's integration with Databricks Unity Catalog provides governance, lineage tracking, and fine-grained access control.
Configuration
SELECT tide.relay_set_outbox(
'events-to-delta',
'events',
'delta-relay',
'{
"sink_type": "delta",
"table_uri": "s3://my-lake/delta/events",
"storage_options": {
"AWS_ACCESS_KEY_ID": "${env:AWS_ACCESS_KEY_ID}",
"AWS_SECRET_ACCESS_KEY": "${env:AWS_SECRET_ACCESS_KEY}",
"AWS_REGION": "us-east-1"
},
"batch_size": 1000
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "delta" |
| table_uri | string | — | Delta table location (S3/GCS/ADLS/local path) |
| storage_options | object | {} | Cloud storage credentials and options |
| batch_size | int | 1000 | Records per commit |
| mode | string | "append" | Write mode: "append" or "overwrite" |
How It Works
Each batch of messages is written as a Parquet data file to the Delta table location. The relay then atomically commits the file to the Delta transaction log (_delta_log/). This ensures that readers always see complete, consistent batches. Failed or partial writes do not affect the table state.
Delivery Guarantees
At-least-once delivery. Delta Lake's transaction log is append-only, and each commit is atomic. If the relay crashes before committing, the orphaned Parquet file is ignored. Duplicates can be handled using Delta Lake's MERGE operations or by deduplication during downstream queries.
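As a hedged sketch of the MERGE approach in Spark/Databricks SQL (the staging view name and the dedup_key column are assumptions about your layout):
-- Insert only rows whose dedup_key is not already in the Delta table
MERGE INTO delta.`s3://my-lake/delta/events` AS t
USING staged_events AS s
  ON t.dedup_key = s.dedup_key
WHEN NOT MATCHED THEN INSERT *;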
Troubleshooting
- "Access denied" — Check storage credentials in
storage_options - "Table does not exist" — Create the table first using Spark or delta-rs, or enable auto-creation
- "Conflict during commit" — Concurrent writers detected; relay retries automatically
Further Reading
- Apache Iceberg — Alternative open table format (broader engine support)
- DuckLake — Lightweight lakehouse alternative
- Object Storage — Raw file storage without table format
DuckLake
DuckLake is a novel lakehouse architecture that combines Parquet data files with a PostgreSQL metadata catalog. Unlike Iceberg or Delta Lake (which store metadata as JSON files alongside data), DuckLake uses a relational database (PostgreSQL) as the source of truth for table metadata, schema, and transaction history. This design makes metadata operations (listing tables, schema evolution, time travel) dramatically faster while keeping data in efficient Parquet format on any object storage.
When pg_tide delivers messages to DuckLake, your events are written as Parquet files while the catalog metadata is maintained in PostgreSQL — potentially even in the same PostgreSQL instance that hosts your outbox. This creates an elegantly simple architecture where your database manages both the events and their analytical storage metadata.
When to Use This Sink
Choose DuckLake when you want a lightweight lakehouse that integrates naturally with PostgreSQL, when you want to query event data with DuckDB (including from the command line or embedded in applications), or when you prefer the simplicity of a single PostgreSQL database managing both operational and analytical metadata. DuckLake is particularly compelling for smaller teams that want lakehouse capabilities without the operational complexity of running a separate catalog service.
Configuration
SELECT tide.relay_set_outbox(
'events-to-ducklake',
'events',
'ducklake-relay',
'{
"sink_type": "ducklake",
"catalog_url": "postgresql://localhost:5432/analytics",
"data_path": "s3://my-lake/ducklake/events",
"table": "raw_events",
"batch_size": 1000,
"storage_options": {
"AWS_REGION": "us-east-1"
}
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "ducklake" |
| catalog_url | string | — | PostgreSQL connection URL for the DuckLake catalog |
| data_path | string | — | Storage path for Parquet data files (S3/GCS/local) |
| table | string | — | DuckLake table name |
| batch_size | int | 1000 | Records per Parquet file |
| storage_options | object | {} | Cloud storage credentials |
Querying DuckLake Data
Once events are written, you can query them with DuckDB:
-- Attach the DuckLake catalog
ATTACH 'ducklake:postgresql://localhost:5432/analytics' AS lake;
-- Query your events
SELECT * FROM lake.raw_events
WHERE event_type = 'order.created'
AND published_at > '2024-01-01';
Troubleshooting
- "Catalog connection failed" — Verify the PostgreSQL URL is reachable from the relay
- "Storage access denied" — Check cloud storage credentials in
storage_options - "Table not found" — Create the DuckLake table first using DuckDB
Further Reading
- Apache Iceberg — For broader engine ecosystem support
- Delta Lake — For Databricks/Spark integration
MongoDB
MongoDB is a document-oriented database that stores data as flexible JSON-like documents. It is widely used for applications that need schema flexibility, horizontal scaling, and the ability to store nested, hierarchical data naturally. When pg_tide delivers messages to MongoDB, your PostgreSQL events are written as documents to a collection, where they can be queried with MongoDB's rich query language, aggregated for analytics, or used to maintain a read-optimized view of your data.
When to Use This Sink
Choose MongoDB when your downstream consumers use MongoDB as their primary data store, when you want to maintain a denormalized read model of your PostgreSQL data in a document database, or when you need flexible schema storage for event data that evolves rapidly. MongoDB's document model maps naturally to JSON event payloads without requiring a predefined schema.
Configuration
SELECT tide.relay_set_outbox(
'events-to-mongo',
'events',
'mongo-relay',
'{
"sink_type": "mongodb",
"connection_string": "${env:MONGODB_URI}",
"database": "events",
"collection": "outbox_events",
"batch_size": 500,
"write_concern": "majority"
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "mongodb" |
| connection_string | string | — | MongoDB connection URI |
| database | string | — | Target database |
| collection | string | — | Target collection |
| batch_size | int | 500 | Documents per bulk write |
| write_concern | string | "majority" | Write concern level |
| upsert | bool | false | Use upsert mode (update if exists, insert if not) |
| upsert_key | string | "dedup_key" | Field used as document ID for upserts |
Upsert Mode
When upsert: true, the relay uses the dedup_key as the document _id. This provides natural deduplication — re-delivered messages update the existing document rather than creating duplicates. This is particularly useful for maintaining a current-state view of entities:
{
"sink_type": "mongodb",
"connection_string": "mongodb+srv://...",
"database": "orders",
"collection": "current_state",
"upsert": true,
"upsert_key": "dedup_key"
}
Delivery Guarantees
With write_concern: "majority", MongoDB acknowledges writes only after they are replicated to a majority of replica set members, providing durable at-least-once delivery. With upsert mode, re-delivery is idempotent.
Troubleshooting
- "Authentication failed" — Check credentials in the connection string and verify the user has write access
- "Connection timeout" — Verify network connectivity; for Atlas, ensure your IP is in the access list
- "Write concern timeout" — Replica set members are unavailable; check cluster health
Further Reading
- Elasticsearch — For full-text search use cases
- ClickHouse — For columnar analytics
Elasticsearch / OpenSearch
Elasticsearch is a distributed search and analytics engine built on Apache Lucene. It excels at full-text search, log analytics, application performance monitoring (APM), and real-time data exploration. OpenSearch is an AWS-maintained fork with identical functionality. When pg_tide delivers messages to Elasticsearch, your PostgreSQL events become searchable — enabling full-text queries, aggregations, dashboards (via Kibana/OpenSearch Dashboards), and real-time alerting on your event data.
When to Use This Sink
Choose Elasticsearch when you need full-text search across your events (e.g., searching order descriptions, user messages, log entries), when you are building observability dashboards with Kibana, or when you need real-time aggregations and alerting on high-volume event streams. Elasticsearch's inverted index makes text search blazing fast, while its aggregation framework supports complex analytics.
Configuration
SELECT tide.relay_set_outbox(
'events-to-elastic',
'events',
'elastic-relay',
'{
"sink_type": "elasticsearch",
"url": "https://${env:ELASTIC_HOST}:9200",
"index": "events-{stream_table}",
"username": "${env:ELASTIC_USER}",
"password": "${env:ELASTIC_PASS}",
"batch_size": 500,
"document_id": "{dedup_key}"
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "elasticsearch" |
| url | string | — | Elasticsearch/OpenSearch URL |
| index | string | — | Target index name. Supports templates |
| username | string | null | Basic auth username |
| password | string | null | Basic auth password |
| api_key | string | null | API key authentication (alternative to basic auth) |
| batch_size | int | 500 | Documents per bulk request |
| document_id | string | null | Document ID template. Enables idempotent upserts |
| tls_enabled | bool | true | Enable TLS |
| tls_ca_cert | string | null | Custom CA certificate |
Document IDs and Idempotency
Setting document_id to {dedup_key} makes writes idempotent — if the same message is delivered twice, it overwrites the same document rather than creating a duplicate. This is strongly recommended for production.
Index Lifecycle Management (ILM)
For high-volume event streams, use time-based indices with ILM policies:
{"index": "events-{stream_table}-2024.01"}
Configure Elasticsearch ILM to roll over indices by size or age, and delete old indices after your retention period.
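A sketch of such a policy using Elasticsearch's standard ILM API (the rollover thresholds and retention are examples to adjust):
PUT _ilm/policy/pg-tide-events
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_size": "50gb", "max_age": "7d" }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": { "delete": {} }
      }
    }
  }
}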
Troubleshooting
- "Connection refused" — Verify Elasticsearch is running and the URL includes the correct port
- "Authentication failed" — Check username/password or API key
- "Index not found" — Elasticsearch auto-creates indices by default; if disabled, create the index first
- "Bulk request failed" — Check individual error messages; common causes are mapping conflicts or disk space
Further Reading
- ClickHouse — For columnar analytics (faster aggregations, no full-text search)
- MongoDB — For document storage without search focus
Object Storage (S3 / GCS / Azure Blob)
Object storage services like Amazon S3, Google Cloud Storage (GCS), and Azure Blob Storage provide virtually unlimited, highly durable data storage at very low cost. When pg_tide delivers messages to object storage, your events are written as files (JSONL or Parquet format) organized in a path structure you define. This creates a data lake landing zone that can be queried by tools like Athena, BigQuery, Trino, DuckDB, or Spark without requiring a dedicated streaming infrastructure.
When to Use This Sink
Choose object storage when you need cost-effective long-term archival of events, when you want to build a data lake without committing to a specific table format (Iceberg/Delta), when compliance requires immutable event retention, or when your analytical tools can query files directly from object storage. Object storage is the most cost-effective option for high-volume event archival.
Configuration
SELECT tide.relay_set_outbox(
'events-to-s3',
'events',
's3-relay',
'{
"sink_type": "object_storage",
"provider": "s3",
"bucket": "my-data-lake",
"prefix": "events/{stream_table}/year={year}/month={month}/day={day}/",
"format": "parquet",
"region": "us-east-1",
"batch_size": 1000,
"file_rotation_seconds": 300
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "object_storage" |
| provider | string | — | Storage provider: "s3", "gcs", "azure" |
| bucket | string | — | Bucket/container name |
| prefix | string | "" | Path prefix template. Supports {stream_table}, {year}, {month}, {day}, {hour} |
| format | string | "jsonl" | File format: "jsonl" or "parquet" |
| region | string | null | Cloud region |
| access_key_id | string | null | Access key (falls back to default credential chain) |
| secret_access_key | string | null | Secret key |
| batch_size | int | 1000 | Records per file |
| file_rotation_seconds | int | 300 | Maximum seconds before a file is finalized and uploaded |
| compression | string | null | Compression for JSONL: "gzip", "zstd" |
File Formats
JSONL (JSON Lines)
One JSON object per line. Human-readable, easy to process with standard tools:
{"event_id":"abc","payload":{"order_id":"ord-1"},"op":"insert","ts":"2024-01-15T10:30:00Z"}
{"event_id":"def","payload":{"order_id":"ord-2"},"op":"insert","ts":"2024-01-15T10:30:01Z"}
Parquet
Columnar binary format optimized for analytics. 10-50x compression vs. raw JSON, and dramatically faster query performance for analytical workloads. Use Parquet when the data will be queried by analytics engines.
Path Partitioning
The prefix template creates a Hive-style partitioned layout that analytical engines recognize automatically:
s3://my-lake/events/orders/year=2024/month=01/day=15/batch-001.parquet
s3://my-lake/events/orders/year=2024/month=01/day=15/batch-002.parquet
This enables partition pruning — queries that filter by date only read the relevant files, dramatically reducing scan costs.
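As an example, DuckDB can exploit this layout through its standard read_parquet function with Hive partitioning (paths follow the example above; partition values are compared as strings here):
-- Only day=15 files are scanned, thanks to partition pruning
SELECT count(*)
FROM read_parquet('s3://my-lake/events/orders/*/*/*/*.parquet',
                  hive_partitioning = true)
WHERE year = '2024' AND month = '01' AND day = '15';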
Troubleshooting
- "Access Denied" — Check IAM permissions:
s3:PutObjecton the bucket/prefix - "Bucket not found" — Verify bucket name and region
- Large files / memory pressure — Reduce
batch_sizeorfile_rotation_seconds - Query engine can't read files — Ensure format matches what the engine expects; check compression codec
Further Reading
- Apache Iceberg — Add ACID transactions and time travel on top of object storage
- Delta Lake — Alternative table format for object storage
- Snowflake — Query object storage files from Snowflake external tables
Apache Arrow Flight
Apache Arrow Flight is a high-performance RPC framework for transferring large datasets between systems using the Apache Arrow columnar memory format. Unlike JSON-based protocols that require serialization/deserialization, Arrow Flight transfers data in-memory columnar format over gRPC, achieving throughput measured in gigabytes per second. When pg_tide delivers messages via Arrow Flight, your events are batched into Arrow record batches and streamed to any Arrow Flight-compatible endpoint.
When to Use This Sink
Choose Arrow Flight when you need maximum throughput for analytical workloads (machine learning pipelines, real-time feature stores, analytics engines), when the receiving system supports Arrow natively (DuckDB, DataFusion, Polars, pandas, many ML frameworks), or when you want to minimize serialization overhead for high-volume data transfer. Arrow Flight is particularly effective for scenarios where events are consumed in batches for computation rather than processed individually.
Configuration
SELECT tide.relay_set_outbox(
'events-to-flight',
'events',
'flight-relay',
'{
"sink_type": "arrow_flight",
"endpoint": "grpc://${env:FLIGHT_HOST}:8815",
"batch_size": 5000,
"tls_enabled": false
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "arrow_flight" |
| endpoint | string | — | gRPC endpoint URL |
| batch_size | int | 1000 | Records per Arrow record batch |
| tls_enabled | bool | false | Enable TLS for gRPC |
| auth_token | string | null | Bearer token for authentication |
How It Works
Messages are accumulated into batches and converted to Arrow columnar format (record batches). The relay then streams these record batches to the Flight endpoint using Arrow Flight's DoPut RPC over gRPC. This approach is dramatically more efficient than JSON-over-HTTP for large batch transfers because:
- Arrow's columnar format enables zero-copy reads on the receiver side
- gRPC streaming amortizes connection overhead across many records
- Arrow's type system preserves data types without string conversion
Troubleshooting
- "Connection failed" — Verify the gRPC endpoint is reachable and the port is correct
- "Unauthenticated" — Set
auth_tokenif the Flight server requires authentication - Low throughput — Increase
batch_size; Arrow Flight is most efficient with large batches
Further Reading
- ClickHouse — For persistent analytical storage
- Object Storage — For file-based data delivery
Slack
Slack is the leading workplace communication platform used by millions of teams. The Slack sink delivers your outbox messages as formatted notifications to Slack channels using incoming webhooks. This enables real-time operational alerts, business event notifications, and workflow triggers delivered directly to the channels where your team collaborates.
When to Use This Sink
Choose the Slack sink when you want your team to be notified immediately when important business events occur — new high-value orders, system errors, deployment completions, or compliance-relevant actions. The Slack sink formats messages using Slack's Block Kit for rich, readable notifications.
Configuration
SELECT tide.relay_set_outbox(
'alerts-to-slack',
'alerts',
'slack-relay',
'{
"sink_type": "slack",
"webhook_url": "${env:SLACK_WEBHOOK_URL}",
"channel": "#ops-alerts",
"username": "pg_tide",
"icon_emoji": ":database:"
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "slack" |
| webhook_url | string | — | Slack incoming webhook URL |
| channel | string | null | Override channel (must be allowed by webhook config) |
| username | string | "pg_tide" | Display name for the bot |
| icon_emoji | string | null | Emoji icon for the bot |
| template | string | null | Custom Block Kit template for message formatting |
Rate Limits
Slack imposes rate limits on incoming webhooks (approximately 1 message per second per webhook). For high-volume outboxes, use the rate limiter to stay within limits:
{
"sink_type": "slack",
"webhook_url": "${env:SLACK_WEBHOOK_URL}",
"rate_limit": {"messages_per_second": 1}
}
Troubleshooting
- "Invalid webhook URL" — Webhook URLs expire if the app is uninstalled; regenerate in Slack app settings
- "Channel not found" — The webhook's default channel was deleted; set
channelexplicitly - HTTP 429 — Rate limited; add rate limiting to the pipeline configuration
Further Reading
- Discord — Similar notification sink for Discord
- PagerDuty — For incident management alerting
- HTTP Webhook — For custom HTTP endpoints
Discord
Discord is a popular communication platform originally designed for gaming communities but now widely used for developer communities, open-source projects, and team collaboration. The Discord sink delivers your outbox messages as webhook notifications to Discord channels, using Discord's embed format for rich, formatted messages.
When to Use This Sink
Choose the Discord sink for community notifications (open-source project updates, release announcements), developer team alerts, or any scenario where your audience lives in Discord rather than Slack.
Configuration
SELECT tide.relay_set_outbox(
'events-to-discord',
'community_events',
'discord-relay',
'{
"sink_type": "discord",
"webhook_url": "${env:DISCORD_WEBHOOK_URL}",
"username": "pg_tide Bot",
"avatar_url": "https://example.com/bot-avatar.png"
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "discord" |
| webhook_url | string | — | Discord webhook URL |
| username | string | null | Override bot display name |
| avatar_url | string | null | Override bot avatar image URL |
| template | string | null | Custom embed template |
Rate Limits
Discord webhooks are limited to approximately 30 requests per minute per channel. Use the rate limiter to avoid HTTP 429 responses.
Troubleshooting
- "Unknown Webhook" — The webhook was deleted from Discord; create a new one in channel settings
- HTTP 429 — Rate limited; reduce send rate with rate limiting
Further Reading
- Slack — Team communication alternative
- HTTP Webhook — Generic HTTP delivery
PagerDuty
PagerDuty is an incident management platform that helps teams detect, respond to, and resolve operational issues. The PagerDuty sink delivers your outbox messages as Events API v2 events, which can trigger incidents, route to on-call responders, and integrate with PagerDuty's full incident lifecycle management. This enables your PostgreSQL events to directly drive incident response — when a critical business condition is detected, pg_tide can automatically page the right team.
When to Use This Sink
Choose PagerDuty when critical business events in your PostgreSQL database should trigger immediate human response. Examples include: payment processing failures that need urgent attention, security-relevant events (unauthorized access attempts), SLA violations, or infrastructure conditions detected through database monitoring.
Configuration
SELECT tide.relay_set_outbox(
'critical-alerts',
'incidents',
'pagerduty-relay',
'{
"sink_type": "pagerduty",
"routing_key": "${env:PAGERDUTY_ROUTING_KEY}",
"severity": "critical",
"source": "pg-tide",
"component": "payment-service",
"dedup_key_field": "dedup_key"
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "pagerduty" |
| routing_key | string | — | PagerDuty Events API v2 routing (integration) key |
| severity | string | "error" | Event severity: "critical", "error", "warning", "info" |
| source | string | "pg-tide" | Source of the event |
| component | string | null | Component or service name |
| dedup_key_field | string | "dedup_key" | Field to use as PagerDuty dedup key (prevents duplicate incidents) |
| event_action | string | "trigger" | Action: "trigger", "acknowledge", "resolve" |
Deduplication
PagerDuty uses deduplication keys to group related events into a single incident. By default, pg_tide uses the outbox message's dedup_key as PagerDuty's dedup key. This means publishing multiple events with the same dedup_key updates the existing incident rather than creating duplicate pages.
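For orientation, an Events API v2 event has roughly this shape (a simplified sketch of what the sink emits; the values are examples):
{
  "routing_key": "<integration key>",
  "event_action": "trigger",
  "dedup_key": "payment-failure-order-42",
  "payload": {
    "summary": "Payment processing failed for order 42",
    "source": "pg-tide",
    "severity": "critical",
    "component": "payment-service"
  }
}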
Auto-Resolve
You can configure separate pipelines for triggering and resolving incidents:
-- Trigger incidents from error events
SELECT tide.relay_set_outbox('trigger-alerts', 'errors', 'pd-group',
'{"sink_type": "pagerduty", "routing_key": "${env:PD_KEY}", "event_action": "trigger"}'::jsonb);
-- Resolve incidents from recovery events
SELECT tide.relay_set_outbox('resolve-alerts', 'recoveries', 'pd-group',
'{"sink_type": "pagerduty", "routing_key": "${env:PD_KEY}", "event_action": "resolve"}'::jsonb);
Troubleshooting
- "Invalid routing key" — Verify the routing key from your PagerDuty integration settings
- Duplicate incidents — Check that dedup_key_field maps to a consistent identifier
- No incidents triggered — Verify the service has an active escalation policy
Further Reading
- Slack — For less urgent team notifications
- HTTP Webhook — For custom alerting endpoints
- Circuit Breaker — Preventing alert storms during outages
Singer / Meltano
The Singer protocol is an open standard for moving data between systems. It defines a simple JSON-based interface that allows "taps" (data extractors) and "targets" (data loaders) to communicate through stdin/stdout pipes. The Meltano Hub catalogs approximately 500 taps and targets maintained by the community, covering everything from SaaS APIs (Salesforce, HubSpot, Stripe) to databases (MySQL, Oracle) to file formats (CSV, Parquet). When pg_tide uses the Singer sink, it streams your outbox messages through any Singer target, giving you instant access to hundreds of destinations without pg_tide needing to implement each one individually.
This is one of pg_tide's most powerful integrations because it multiplies the number of available destinations dramatically. Instead of waiting for pg_tide to add native support for a niche system, you can connect today using an existing Singer target from Meltano Hub.
When to Use This Sink
Choose the Singer sink when your destination is not directly supported by pg_tide's native sinks, when you want to leverage existing Singer targets maintained by the Meltano community, or when you need the Singer protocol's built-in STATE persistence for resumable syncs and SCHEMA handling for data type management. Common use cases include loading events into SaaS analytics tools (Amplitude, Mixpanel), sending to CRM systems (Salesforce, HubSpot), or writing to specialized databases.
How It Works
The relay launches a Singer target process and streams messages to it via stdin in Singer's RECORD format. The relay also manages STATE messages (for resumable syncs) and SCHEMA messages (for data type declarations):
- The relay sends a SCHEMA message declaring the event structure
- For each outbox message, the relay sends a RECORD message via stdin
- The target writes STATE messages back to stdout, which the relay persists in the tide.singer_state table
- If the relay restarts, it resumes from the last persisted STATE (the exchange is sketched below)
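The exchange uses the standard Singer message types; a sketch follows (the record fields and the STATE bookmark shape are illustrative, since STATE values are target-specific):
{"type": "SCHEMA", "stream": "analytics_events", "schema": {"type": "object", "properties": {"event_id": {"type": "string"}}}, "key_properties": ["event_id"]}
{"type": "RECORD", "stream": "analytics_events", "record": {"event_id": "abc", "order_id": 42}}
{"type": "STATE", "value": {"bookmarks": {"analytics_events": {"last_event_id": "abc"}}}}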
Configuration
SELECT tide.relay_set_outbox(
'events-to-amplitude',
'analytics_events',
'singer-relay',
'{
"sink_type": "singer",
"target_command": "target-amplitude",
"target_config": {
"api_key": "${env:AMPLITUDE_API_KEY}",
"project_id": "${env:AMPLITUDE_PROJECT_ID}"
},
"on_schema_change": "log",
"batch_size": 100
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "singer" |
| target_command | string | — | Singer target executable name or path |
| target_config | object | {} | Configuration object passed to the target |
| on_schema_change | string | "log" | Schema drift policy: "ignore", "log", "fail", "evolve" |
| batch_size | int | 100 | Records per STATE checkpoint |
| stream_name | string | auto | Singer stream name (defaults to outbox name) |
STATE Persistence
Singer targets emit STATE messages to communicate their progress. pg_tide persists these in the tide.singer_state catalog table, enabling resumable syncs:
-- Inspect Singer state
SELECT * FROM tide.singer_state_list();
If the relay restarts, it sends the last persisted STATE to the target on startup, so the target can resume from where it left off rather than reprocessing all data.
Schema Handling
The on_schema_change policy controls what happens when the structure of outbox messages changes:
"ignore"— Silently accept new fields"log"— Log a warning but continue processing"fail"— Stop the pipeline and alert (safest for production)"evolve"— Automatically send an updated SCHEMA message to the target
Monitor schema drift with:
SELECT * FROM tide.singer_schema_drift();
Finding Targets on Meltano Hub
Browse available targets at hub.meltano.com. Popular targets include:
- target-bigquery — Load into BigQuery
- target-snowflake — Load into Snowflake
- target-postgres — Load into another PostgreSQL
- target-s3-csv — Write CSV files to S3
- target-hubspot — Push to HubSpot CRM
- target-salesforce — Push to Salesforce
Install targets with pip: pip install target-amplitude
Troubleshooting
- "Target command not found" — Ensure the target is installed and available in the relay's PATH
- "Invalid STATE message" — The target emitted malformed STATE; check target version compatibility
- Schema drift detected — Event structure changed; review with singer_schema_drift() and adjust the on_schema_change policy
- "Target process exited unexpectedly" — Check target logs for configuration errors
Further Reading
- Airbyte — Alternative connector ecosystem
- Fivetran — Enterprise data integration
- Singer Protocol Feature Guide — Detailed protocol documentation
Airbyte
Airbyte is an open-source data integration platform with approximately 400 connectors for moving data between systems. When pg_tide uses the Airbyte sink, it streams your outbox messages through Airbyte destination connectors, providing access to a vast ecosystem of data loading targets. Airbyte connectors run as Docker containers with a standardized protocol, making them portable and well-tested.
When to Use This Sink
Choose the Airbyte sink when your target destination has an Airbyte connector but not a native pg_tide sink, when you are already invested in the Airbyte ecosystem, or when you need the protocol's built-in state management and catalog discovery features. Airbyte connectors are particularly strong for loading data into data warehouses and SaaS tools.
Configuration
SELECT tide.relay_set_outbox(
'events-to-destination',
'events',
'airbyte-relay',
'{
"sink_type": "airbyte",
"destination_image": "airbyte/destination-bigquery:latest",
"destination_config": {
"project_id": "${env:GCP_PROJECT}",
"dataset_id": "events",
"credentials_json": "${env:GCP_CREDS_JSON}"
},
"batch_size": 500
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "airbyte" |
| destination_image | string | — | Docker image for the Airbyte destination connector |
| destination_config | object | {} | Connector-specific configuration |
| batch_size | int | 500 | Records per state checkpoint |
| catalog | object | null | Airbyte catalog (auto-generated if not provided) |
How It Works
The relay launches the Airbyte destination connector as a Docker container and communicates via the Airbyte protocol over stdin/stdout. Messages are sent as Airbyte RECORD messages, with periodic STATE messages for checkpointing. The relay manages the container lifecycle, handles restarts, and persists state for resumable syncs.
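On the wire this uses the Airbyte protocol's message shapes; a rough sketch (the data fields and state payload are illustrative):
{"type": "RECORD", "record": {"stream": "events", "data": {"event_id": "abc", "order_id": 42}, "emitted_at": 1705312200000}}
{"type": "STATE", "state": {"data": {"events": {"cursor": "abc"}}}}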
Troubleshooting
- "Docker image not found" — Pull the image first:
docker pull airbyte/destination-bigquery:latest - "Container exited with error" — Check connector logs for configuration issues
- "Docker not available" — The relay host needs Docker installed and the relay user needs Docker socket access
Further Reading
- Singer / Meltano — Alternative connector ecosystem (no Docker required)
- Fivetran — Enterprise managed alternative
Fivetran HVR
Fivetran is an enterprise data integration platform that automates data pipelines from sources to destinations. The Fivetran HVR sink exposes a webhook-compatible endpoint that speaks Fivetran's HVR (High Volume Replication) format, allowing pg_tide to deliver events in a format that Fivetran's infrastructure can consume and route to any Fivetran-supported destination.
When to Use This Sink
Choose the Fivetran sink when your organization uses Fivetran as its primary data integration platform and you want pg_tide events to flow through Fivetran's managed pipeline infrastructure for delivery to final destinations.
Configuration
SELECT tide.relay_set_outbox(
'events-to-fivetran',
'events',
'fivetran-relay',
'{
"sink_type": "fivetran",
"endpoint_url": "${env:FIVETRAN_WEBHOOK_URL}",
"api_key": "${env:FIVETRAN_API_KEY}",
"api_secret": "${env:FIVETRAN_API_SECRET}",
"batch_size": 100
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "fivetran" |
| endpoint_url | string | — | Fivetran HVR endpoint URL |
| api_key | string | — | Fivetran API key |
| api_secret | string | — | Fivetran API secret |
| batch_size | int | 100 | Records per webhook batch |
Further Reading
- Singer / Meltano — Open-source alternative with ~500 connectors
- Airbyte — Open-source alternative with ~400 connectors
PostgreSQL Inbox
The PostgreSQL Inbox sink delivers outbox messages from one pg_tide instance to an inbox on another PostgreSQL database (or the same database). This enables cross-service messaging entirely within PostgreSQL — no external message broker required. When your architecture consists of multiple services that each have their own PostgreSQL database with pg_tide installed, the inbox sink provides reliable, deduplicated message delivery between them.
When to Use This Sink
Choose the PostgreSQL Inbox sink when you need direct database-to-database messaging, when you want to avoid the operational overhead of an external message broker for internal service communication, or when both the sender and receiver are PostgreSQL-based services with pg_tide. This is the simplest possible reverse pipeline — messages flow from outbox in Database A to inbox in Database B with exactly-once deduplication.
Configuration
SELECT tide.relay_set_outbox(
'orders-to-warehouse',
'orders',
'inbox-relay',
'{
"sink_type": "inbox",
"target_url": "${env:WAREHOUSE_DB_URL}",
"inbox_name": "incoming_orders",
"batch_size": 100
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "inbox" |
| target_url | string | — | PostgreSQL connection URL for the target database |
| inbox_name | string | — | Target inbox name (must exist on the target database) |
| batch_size | int | 100 | Messages per batch insert |
Delivery Guarantees
The inbox sink provides exactly-once delivery through the inbox's built-in deduplication. Each message's dedup_key is used as the inbox message identifier — if the same message is delivered twice (due to relay restart), the inbox's UNIQUE constraint on the dedup_key silently rejects the duplicate.
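Conceptually, each delivery reduces to an idempotent insert like the following (illustrative only; the actual inbox table layout is internal to pg_tide):
-- Re-delivered messages hit the UNIQUE constraint and are skipped
INSERT INTO tide_inbox_incoming_orders (dedup_key, payload, headers)
VALUES ('order-42-created', '{"order_id": 42}'::jsonb, '{}'::jsonb)
ON CONFLICT (dedup_key) DO NOTHING;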
Troubleshooting
- "Inbox not found" — Create the inbox on the target database:
SELECT tide.inbox_create('incoming_orders') - "Connection refused" — Verify the target database URL is reachable from the relay
- "Duplicate key violation" — This is expected behavior (deduplication working correctly); the relay handles this gracefully
Further Reading
- Remote PostgreSQL Outbox — For outbox-to-outbox federation
- Bidirectional Sync Tutorial — Using inbox sinks for two-way communication
Remote PostgreSQL Outbox
The Remote PostgreSQL Outbox sink delivers messages from one outbox to another outbox on a different PostgreSQL instance. This enables multi-cluster federation — messages published in one data center or region can be forwarded to outboxes in other regions, where local relays deliver them to local consumers. This creates a hierarchical event distribution topology.
When to Use This Sink
Choose this sink for multi-region deployments where events published in one region need to be available to consumers in other regions, or for organizational boundaries where different teams manage separate PostgreSQL clusters but need shared event streams.
Configuration
SELECT tide.relay_set_outbox(
'replicate-to-eu',
'orders',
'federation-relay',
'{
"sink_type": "pg_outbox",
"target_url": "${env:EU_DB_URL}",
"target_outbox": "orders_eu_replica",
"batch_size": 200
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "pg_outbox" |
| target_url | string | — | PostgreSQL URL for the target instance |
| target_outbox | string | — | Target outbox name |
| batch_size | int | 100 | Messages per batch |
Further Reading
- PostgreSQL Inbox — For direct inbox delivery
- Cross-Region Tutorial — Multi-region event relay patterns
stdout / File
The stdout sink writes messages to standard output or a file. This is primarily useful for debugging, testing, and piping pg_tide output to external tools. In development, it lets you verify your pipeline configuration and transforms without needing an external message broker running.
When to Use This Sink
Choose stdout when debugging pipeline configurations, testing JMESPath transforms, validating wire format encoding, or piping events to external command-line tools. It is also useful for log-based delivery patterns where messages are written to a file that is then consumed by a log shipper (Fluentd, Vector, Filebeat).
Configuration
SELECT tide.relay_set_outbox(
'debug-pipeline',
'orders',
'debug-relay',
'{
"sink_type": "stdout",
"format": "json",
"output_file": null
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| sink_type | string | — | Must be "stdout" |
| format | string | "json" | Output format: "json" (one JSON per line) or "pretty" (indented JSON) |
| output_file | string | null | Write to file instead of stdout (optional) |
Use Cases
Debugging transforms
# See what your JMESPath transform produces
pg-tide --postgres-url "..." 2>/dev/null | jq .
Piping to external tools
# Pipe to a custom processor
pg-tide --postgres-url "..." | my-custom-processor
Further Reading
- Dry-Run Mode — Test pipelines without delivering
- JMESPath Transforms — Transform messages before delivery
Sources Overview
While sinks deliver messages from your PostgreSQL outbox to external systems, sources work in the opposite direction — they consume messages from external systems and deliver them into your PostgreSQL inbox. This is what pg_tide calls a "reverse pipeline." Think of it as an intake funnel: events from the outside world flow through the source, through the relay, and into your inbox table where your application can process them with full transactional guarantees and idempotent deduplication.
pg_tide supports 16 different sources, covering all the major message brokers, cloud services, and connector ecosystems. Any system that can produce messages can be connected to your PostgreSQL inbox.
Why Use Sources?
The inbox pattern solves the same reliability problem as the outbox, but in reverse. When your application receives events from an external system, it needs to process them reliably — without losing messages, without processing duplicates, and without inconsistency between the event processing and your database state. By routing external events through a pg_tide inbox, your application processes them within a database transaction, gaining exactly-once semantics for free.
How Sources Work
- The relay connects to the external system as a consumer/subscriber
- Messages are pulled (or pushed) from the external system in batches
- Each message is written to the configured pg_tide inbox table
- The inbox's deduplication mechanism (UNIQUE constraint on event_id) prevents duplicates
- The relay acknowledges the messages to the external system
- Your application processes inbox messages at its own pace within database transactions
sequenceDiagram
participant External as External System
participant Relay as pg-tide relay
participant PG as PostgreSQL
participant App as Application
External-->>Relay: Pull messages (batch)
Relay->>PG: INSERT INTO inbox (deduplicated)
PG-->>Relay: Confirm insert
Relay->>External: Acknowledge messages
App->>PG: SELECT from inbox + process + mark_processed
Note over App,PG: Single transaction
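In application code, that last step might look like the following sketch. Note that tide.inbox_fetch and tide.inbox_mark_processed are hypothetical helper names used purely for illustration, not confirmed API; the point is that fetching, business logic, and acknowledgment share one transaction:
BEGIN;
-- Fetch a batch of unprocessed inbox messages (hypothetical helper)
SELECT * FROM tide.inbox_fetch('payment_events', p_limit := 100);
-- ... apply business logic: update your own tables from each payload ...
-- Acknowledge the batch within the same transaction (hypothetical helper)
SELECT tide.inbox_mark_processed('payment_events', ARRAY['evt-1', 'evt-2']);
COMMIT;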
Available Sources
| Source | System | Direction |
|---|---|---|
| PostgreSQL Outbox | PostgreSQL outbox polling | Forward (outbox → sink) |
| Apache Kafka | Kafka consumer | Reverse |
| NATS JetStream | NATS subscriber | Reverse |
| RabbitMQ | RabbitMQ consumer | Reverse |
| Redis Streams | Redis XREADGROUP | Reverse |
| Amazon SQS | SQS receiver | Reverse |
| Amazon Kinesis | Kinesis reader | Reverse |
| Google Cloud Pub/Sub | Pub/Sub subscriber | Reverse |
| Azure Service Bus | Service Bus receiver | Reverse |
| Azure Event Hubs | Event Hubs reader | Reverse |
| MQTT v5 | MQTT subscriber | Reverse |
| HTTP Webhook Receiver | HTTP server | Reverse |
| Singer / Meltano | Singer tap consumer | Reverse |
| Airbyte | Airbyte source connector | Reverse |
| stdin / File | Standard input | Reverse |
Configuring a Reverse Pipeline
Reverse pipelines are configured using tide.relay_set_inbox():
SELECT tide.relay_set_inbox(
'payments-from-stripe', -- pipeline name
'payment_events', -- inbox name
'{
"source_type": "webhook",
"listen_addr": "0.0.0.0:8080",
"path": "/webhooks/stripe",
"signature_scheme": "stripe",
"signature_secret": "${env:STRIPE_WEBHOOK_SECRET}"
}'::jsonb
);
Deduplication
Every source implementation extracts or generates a unique event identifier for each message. This ID is used as the inbox's dedup_key, ensuring that even if the same message is delivered multiple times (network retry, consumer rebalance, relay restart), it appears in your inbox exactly once. The deduplication mechanism varies by source:
- Kafka — Partition + offset combination
- NATS — JetStream sequence number
- SQS — SQS message ID
- Webhook — Request-provided idempotency key or generated UUID
Next Steps
- Receiving events from a message broker? Start with Kafka or NATS
- Building a webhook endpoint? See HTTP Webhook Receiver
- Consuming from cloud services? See SQS, Pub/Sub, or Event Hubs
- Using connector ecosystems? See Singer or Airbyte
Source: PostgreSQL Outbox
The PostgreSQL Outbox source is the heart of pg_tide's forward pipeline. It polls the outbox table for new messages and feeds them to whichever sink is configured for the pipeline. Unlike the reverse sources (which consume from external systems), the outbox source reads from your own PostgreSQL database — it is the starting point for every forward pipeline.
How It Works
The outbox source uses a combination of polling and PostgreSQL NOTIFY to detect new messages efficiently:
- Notification-driven wake-up — When your application publishes a message with outbox_publish(), PostgreSQL sends a NOTIFY on the tide_outbox_notify channel. The relay, which is LISTENing on this channel, wakes up immediately.
- Batch polling — The relay queries the outbox table for unacknowledged messages belonging to this pipeline's consumer group, fetching up to batch_size messages at a time.
- Offset tracking — After the sink confirms delivery, the relay commits the consumer group offset, marking those messages as delivered.
- Retention cleanup — Messages older than the configured retention_hours are periodically deleted by background cleanup.
This hybrid approach means messages are typically delivered within milliseconds of being published (notification-driven), while the polling fallback ensures no messages are missed even if a notification is lost.
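You can observe the wake-up channel yourself from psql; the channel name comes from the description above, and outbox_publish follows the signature used throughout these docs:
-- Session 1: subscribe to the relay's notification channel
LISTEN tide_outbox_notify;

-- Session 2: publish a message; session 1 receives an asynchronous NOTIFY
SELECT tide.outbox_publish('orders',
    '{"order_id": 7}'::jsonb,
    '{"event_type": "order.created"}'::jsonb
);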
Configuration
The outbox source is implicitly configured when you create a forward pipeline with relay_set_outbox():
SELECT tide.relay_set_outbox(
'orders-pipeline', -- pipeline name
'orders', -- outbox name (this IS the source)
'relay-consumer', -- consumer group name
'{
"sink_type": "kafka",
"brokers": "localhost:9092",
"topic": "order-events",
"batch_size": 100,
"poll_interval_ms": 1000
}'::jsonb
);
Source-Relevant Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| batch_size | int | 100 | Maximum messages fetched per poll cycle |
| poll_interval_ms | int | 1000 | Fallback polling interval when no NOTIFY is received |
| visibility_timeout_ms | int | 30000 | How long a batch is "leased" before it can be re-claimed by another relay instance |
Consumer Groups
Each forward pipeline uses a consumer group to track its position in the outbox. Multiple pipelines can read from the same outbox independently:
-- Two pipelines reading the same outbox independently
SELECT tide.relay_set_outbox('to-kafka', 'orders', 'kafka-group', '{"sink_type": "kafka", ...}'::jsonb);
SELECT tide.relay_set_outbox('to-s3', 'orders', 's3-group', '{"sink_type": "object_storage", ...}'::jsonb);
Each consumer group maintains its own offset, so Kafka delivery and S3 archival proceed independently without affecting each other.
Troubleshooting
- Messages not flowing — Check that the pipeline is enabled: SELECT tide.relay_get_config('orders-pipeline')
- High consumer lag — Increase batch_size or add more relay instances (with different relay_group_id)
- Messages redelivered after relay restart — Normal behavior for uncommitted batches; downstream sinks should be idempotent
Further Reading
- Consumer Groups Concept — How consumer groups and offsets work
- HA Coordination — Multiple relay instances sharing pipelines
- Scaling — Increasing outbox throughput
Source: Apache Kafka
The Kafka source consumes messages from Kafka topics and delivers them into a pg_tide inbox. This enables reverse pipelines where events produced by other systems (via Kafka) flow reliably into your PostgreSQL database for processing. The relay acts as a Kafka consumer, managing offsets, partition assignment, and rebalancing automatically.
When to Use This Source
Use the Kafka source when other services produce events to Kafka that your PostgreSQL-based application needs to process, when you want to consume CDC events from Debezium (which publishes to Kafka), or when you want to build a reliable event consumer that processes Kafka messages within database transactions.
Configuration
SELECT tide.relay_set_inbox(
'payments-from-kafka',
'payment_events',
'{
"source_type": "kafka",
"brokers": "${env:KAFKA_BROKERS}",
"topic": "payment-events",
"group_id": "pg-tide-payments",
"auto_offset_reset": "earliest",
"sasl_mechanism": "SCRAM-SHA-256",
"sasl_username": "${env:KAFKA_USER}",
"sasl_password": "${env:KAFKA_PASS}",
"tls_enabled": true
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| source_type | string | — | Must be "kafka" |
| brokers | string | — | Kafka broker addresses |
| topic | string | — | Topic to consume from |
| group_id | string | — | Kafka consumer group ID |
| auto_offset_reset | string | "earliest" | Where to start: "earliest" or "latest" |
| sasl_mechanism | string | null | Auth mechanism |
| sasl_username | string | null | SASL username |
| sasl_password | string | null | SASL password |
| tls_enabled | bool | false | Enable TLS |
| batch_size | int | 100 | Messages per inbox insert batch |
Offset Management
The relay commits Kafka consumer offsets only after messages are successfully written to the inbox. This ensures no messages are lost even if the relay crashes. On restart, Kafka redelivers any uncommitted messages, and the inbox's deduplication prevents duplicates.
Wire Format Integration
When consuming Debezium-formatted messages from Kafka, specify the wire format:
{
"source_type": "kafka",
"brokers": "localhost:9092",
"topic": "dbserver1.public.orders",
"group_id": "pg-tide-cdc",
"wire_format": "debezium"
}
This decodes Debezium envelope messages and maps them to inbox rows with proper operation type (insert/update/delete), old and new payload, and commit timestamp.
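For orientation, a simplified Debezium envelope looks like this (real messages also carry a schema section and richer source metadata):
{
  "payload": {
    "op": "c",
    "before": null,
    "after": { "id": 42, "total": 99.99 },
    "source": { "schema": "public", "table": "orders" },
    "ts_ms": 1705312200000
  }
}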
Troubleshooting
- "Group coordinator not available" — Brokers are unreachable or the cluster is starting up
- "Topic not found" — Create the topic or check the name spelling
- Consumer lag growing — Increase batch_size or check if inbox inserts are slow
Further Reading
- Sinks: Kafka — Publishing to Kafka (forward direction)
- Debezium Wire Format — Consuming CDC events from Debezium
Source: NATS JetStream
The NATS source subscribes to NATS JetStream subjects and delivers messages into a pg_tide inbox. This enables reverse pipelines where events from NATS-connected services flow reliably into your PostgreSQL database.
Configuration
SELECT tide.relay_set_inbox(
'events-from-nats',
'incoming_events',
'{
"source_type": "nats",
"url": "nats://localhost:4222",
"subject": "events.>",
"durable_name": "pg-tide-consumer",
"stream": "EVENTS",
"ack_wait_secs": 30
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| source_type | string | — | Must be "nats" |
| url | string | — | NATS server URL |
| subject | string | — | Subject filter (supports wildcards: *, >) |
| durable_name | string | — | Durable consumer name for offset tracking |
| stream | string | null | JetStream stream name |
| ack_wait_secs | int | 30 | Seconds before unacknowledged messages are redelivered |
| credentials_file | string | null | NATS credentials file |
| batch_size | int | 100 | Messages per inbox insert batch |
Delivery Guarantees
The NATS source acknowledges messages only after they are written to the inbox. JetStream's durable consumer tracks progress, so messages are never lost. The inbox's deduplication mechanism (using JetStream's sequence number as the event ID) prevents duplicates even on redelivery.
Troubleshooting
- "Consumer not found" — Create the consumer or ensure the stream and subject match
- "No messages" — Verify the subject filter matches what publishers are sending to
- High redelivery count — The relay may be slow to acknowledge; increase ack_wait_secs
Further Reading
- Sinks: NATS — Publishing to NATS (forward direction)
- Bidirectional Sync — Two-way NATS communication
Source: RabbitMQ
The RabbitMQ source consumes messages from RabbitMQ queues and delivers them into a pg_tide inbox.
Configuration
SELECT tide.relay_set_inbox(
'events-from-rabbit',
'incoming_events',
'{
"source_type": "rabbitmq",
"url": "amqp://user:pass@rabbitmq:5672/%2f",
"queue": "pg-tide-events",
"prefetch_count": 100,
"auto_ack": false
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| source_type | string | — | Must be "rabbitmq" |
| url | string | — | AMQP connection URL |
| queue | string | — | Queue to consume from |
| prefetch_count | int | 100 | Messages pre-fetched from broker |
| auto_ack | bool | false | Auto-acknowledge (set false for at-least-once) |
| tls_enabled | bool | false | Enable TLS |
Delivery Guarantees
Messages are acknowledged only after successful inbox insertion. The inbox's deduplication uses the RabbitMQ message ID as the event identifier, preventing duplicate processing even on redelivery.
Further Reading
- Sinks: RabbitMQ — Publishing to RabbitMQ
Source: Redis Streams
The Redis source consumes messages from Redis Streams using consumer groups (XREADGROUP) and delivers them into a pg_tide inbox.
Configuration
SELECT tide.relay_set_inbox(
'events-from-redis',
'incoming_events',
'{
"source_type": "redis",
"url": "redis://localhost:6379",
"stream_key": "events:orders",
"group_name": "pg-tide",
"consumer_name": "relay-01",
"batch_size": 100
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| source_type | string | — | Must be "redis" |
| url | string | — | Redis connection URL |
| stream_key | string | — | Redis stream key to consume from |
| group_name | string | — | Consumer group name |
| consumer_name | string | — | Consumer name within the group |
| batch_size | int | 100 | Messages per XREADGROUP call |
| password | string | null | Redis password |
Delivery Guarantees
Messages are acknowledged (XACK) only after inbox insertion. Unacknowledged messages remain in the pending entries list (PEL) and are reclaimed on relay restart, ensuring no message loss.
Further Reading
- Sinks: Redis — Publishing to Redis Streams
Source: Amazon SQS
The SQS source receives messages from Amazon SQS queues and delivers them into a pg_tide inbox. It uses long polling for efficient message retrieval and deletes messages from SQS only after successful inbox insertion.
Configuration
SELECT tide.relay_set_inbox(
'events-from-sqs',
'incoming_events',
'{
"source_type": "sqs",
"queue_url": "${env:SQS_QUEUE_URL}",
"region": "us-east-1",
"wait_time_seconds": 20,
"visibility_timeout": 60,
"batch_size": 10
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| source_type | string | — | Must be "sqs" |
| queue_url | string | — | SQS queue URL |
| region | string | — | AWS region |
| wait_time_seconds | int | 20 | Long poll wait time (max 20) |
| visibility_timeout | int | 60 | Seconds before unprocessed messages become visible again |
| batch_size | int | 10 | Messages per ReceiveMessage call (max 10) |
| access_key_id | string | null | AWS credentials (optional, falls back to default chain) |
| secret_access_key | string | null | AWS secret key |
Delivery Guarantees
Messages are deleted from SQS only after inbox insertion succeeds. The inbox's deduplication uses the SQS message ID, preventing duplicates from Standard queues (which may deliver messages more than once). FIFO queues provide exactly-once semantics.
Further Reading
- Sinks: SQS — Publishing to SQS
Source: Amazon Kinesis
The Kinesis source reads records from Amazon Kinesis Data Streams and delivers them into a pg_tide inbox.
Configuration
SELECT tide.relay_set_inbox(
'events-from-kinesis',
'incoming_events',
'{
"source_type": "kinesis",
"stream_name": "external-events",
"region": "us-east-1",
"iterator_type": "LATEST",
"batch_size": 100
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| source_type | string | — | Must be "kinesis" |
| stream_name | string | — | Kinesis stream name |
| region | string | — | AWS region |
| iterator_type | string | "LATEST" | Start position: "LATEST", "TRIM_HORIZON", "AT_TIMESTAMP" |
| batch_size | int | 100 | Records per GetRecords call |
| access_key_id | string | null | AWS credentials |
| secret_access_key | string | null | AWS secret key |
Further Reading
- Sinks: Kinesis — Publishing to Kinesis
Source: Google Cloud Pub/Sub
The Pub/Sub source subscribes to a Google Cloud Pub/Sub subscription and delivers messages into a pg_tide inbox.
Configuration
SELECT tide.relay_set_inbox(
'events-from-pubsub',
'incoming_events',
'{
"source_type": "pubsub",
"project_id": "${env:GCP_PROJECT_ID}",
"subscription": "pg-tide-sub",
"credentials_json": "${file:/etc/gcp/sa.json}",
"batch_size": 100
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| source_type | string | — | Must be "pubsub" |
| project_id | string | — | GCP project ID |
| subscription | string | — | Pub/Sub subscription name |
| credentials_json | string | null | Service account JSON |
| batch_size | int | 100 | Messages per pull |
| ack_deadline_secs | int | 30 | Acknowledgment deadline |
Delivery Guarantees
Messages are acknowledged only after inbox insertion. The inbox deduplicates using the Pub/Sub message ID. Unacknowledged messages are redelivered after the ack deadline expires.
Further Reading
- Sinks: Pub/Sub — Publishing to Pub/Sub
Source: Azure Service Bus
The Service Bus source receives messages from Azure Service Bus queues or topic subscriptions and delivers them into a pg_tide inbox.
Configuration
SELECT tide.relay_set_inbox(
'events-from-servicebus',
'incoming_events',
'{
"source_type": "servicebus",
"connection_string": "${env:SERVICEBUS_CONNECTION_STRING}",
"queue_or_subscription": "incoming-events",
"batch_size": 50
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| source_type | string | — | Must be "servicebus" |
| connection_string | string | — | Service Bus connection string (Listen permission) |
| queue_or_subscription | string | — | Queue name or topic/subscription path |
| batch_size | int | 50 | Messages per receive batch |
| max_lock_duration_secs | int | 60 | Message lock duration |
Delivery Guarantees
Messages are completed (acknowledged) only after inbox insertion. The peek-lock mechanism ensures unprocessed messages are redelivered after the lock expires.
Further Reading
- Sinks: Service Bus — Publishing to Service Bus
Source: Azure Event Hubs
The Event Hubs source reads events from Azure Event Hubs and delivers them into a pg_tide inbox.
Configuration
SELECT tide.relay_set_inbox(
'events-from-eventhubs',
'incoming_events',
'{
"source_type": "eventhubs",
"connection_string": "${env:EVENTHUBS_CONNECTION_STRING}",
"event_hub_name": "external-events",
"consumer_group": "$Default",
"batch_size": 100
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| source_type | string | — | Must be "eventhubs" |
| connection_string | string | — | Event Hubs connection string (Listen permission) |
| event_hub_name | string | — | Event Hub name |
| consumer_group | string | "$Default" | Consumer group |
| batch_size | int | 100 | Events per read batch |
| starting_position | string | "latest" | Start position: "latest", "earliest" |
Further Reading
- Sinks: Event Hubs — Publishing to Event Hubs
Source: MQTT v5
The MQTT source subscribes to MQTT topics and delivers messages into a pg_tide inbox. This is ideal for ingesting IoT telemetry, sensor data, and device events into PostgreSQL for processing and analytics.
Configuration
SELECT tide.relay_set_inbox(
'telemetry-from-devices',
'device_telemetry',
'{
"source_type": "mqtt",
"url": "mqtts://${env:MQTT_BROKER}:8883",
"topic": "devices/+/telemetry",
"qos": 1,
"client_id": "pg-tide-ingest-01",
"username": "${env:MQTT_USER}",
"password": "${env:MQTT_PASS}",
"tls_enabled": true
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
source_type | string | — | Must be "mqtt" |
url | string | — | MQTT broker URL |
topic | string | — | MQTT topic filter (supports + and # wildcards) |
qos | int | 1 | Quality of Service level |
client_id | string | auto | MQTT client ID |
username | string | null | Auth username |
password | string | null | Auth password |
tls_enabled | bool | false | Enable TLS |
clean_start | bool | false | Clean session on connect |
Delivery Guarantees
With QoS 1, the MQTT broker guarantees at-least-once delivery to the relay. The inbox's deduplication handles any duplicates that arrive. With clean_start: false, the broker retains subscriptions and queues messages while the relay is offline.
Further Reading
- Sinks: MQTT — Publishing to MQTT
Source: HTTP Webhook Receiver
The Webhook Receiver source starts an HTTP server within the relay that accepts incoming webhook POST requests and delivers them into a pg_tide inbox. This turns pg_tide into a webhook endpoint — external services (Stripe, GitHub, Shopify, or any custom service) can push events directly to your relay, which stores them reliably in your PostgreSQL inbox for processing.
When to Use This Source
Use the webhook receiver when external services push events to you via HTTP (payment notifications from Stripe, repository events from GitHub, order updates from Shopify), when you want to decouple webhook reception from processing (accept the webhook immediately, process later within a transaction), or when you need webhook signature verification and idempotent deduplication built in.
Configuration
SELECT tide.relay_set_inbox(
'stripe-webhooks',
'payment_events',
'{
"source_type": "webhook",
"listen_addr": "0.0.0.0:8080",
"path": "/webhooks/stripe",
"signature_scheme": "stripe",
"signature_secret": "${env:STRIPE_WEBHOOK_SECRET}",
"idempotency_header": "Stripe-Idempotency-Key"
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
source_type | string | — | Must be "webhook" |
listen_addr | string | "0.0.0.0:8080" | Address and port for the HTTP server |
path | string | "/" | URL path to accept webhooks on |
signature_scheme | string | null | Verification scheme: "hmac-sha256", "github", "stripe", "svix" |
signature_secret | string | null | Secret key for signature verification |
idempotency_header | string | null | Header containing the dedup key |
max_body_size | int | 1048576 | Maximum request body size (bytes) |
Signature Verification
The webhook receiver can verify the authenticity of incoming requests using various signature schemes:
- hmac-sha256 — Standard HMAC-SHA256 signature in a configurable header
- github — GitHub's X-Hub-Signature-256 header format
- stripe — Stripe's Stripe-Signature header with timestamp verification
- svix — Svix webhook signature format
Requests with invalid signatures are rejected with HTTP 401, protecting your inbox from spoofed events.
Response Codes
The webhook receiver responds to senders with:
- 200 OK — Message accepted and written to inbox
- 401 Unauthorized — Signature verification failed
- 409 Conflict — Duplicate message (already in inbox, idempotent success)
- 413 Payload Too Large — Body exceeds max_body_size
- 500 Internal Server Error — Database write failed
Troubleshooting
- "Connection refused" from sender — Check
listen_addrport and firewall rules - HTTP 401 from all requests — Verify
signature_secretmatches the sender's configuration - Missing events — Check that
pathmatches what the sender is configured to POST to
Further Reading
- Webhook Signatures Feature — Detailed signature scheme documentation
- Sinks: HTTP Webhook — Sending webhooks (forward direction)
Source: Singer / Meltano
The Singer source runs a Singer "tap" (data extractor) and delivers its output records into a pg_tide inbox. This gives you access to approximately 500 data sources from the Meltano Hub ecosystem — SaaS APIs, databases, file formats, and more — all flowing reliably into your PostgreSQL inbox with incremental sync state management.
Configuration
SELECT tide.relay_set_inbox(
'hubspot-contacts',
'crm_events',
'{
"source_type": "singer",
"tap_command": "tap-hubspot",
"tap_config": {
"api_key": "${env:HUBSPOT_API_KEY}",
"start_date": "2024-01-01T00:00:00Z"
},
"on_schema_change": "log"
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
source_type | string | — | Must be "singer" |
tap_command | string | — | Singer tap executable |
tap_config | object | {} | Tap configuration |
on_schema_change | string | "log" | Schema drift policy |
state_persistence | bool | true | Persist STATE for incremental syncs |
stream_filter | array | null | Specific streams to extract (null = all) |
STATE Persistence for Incremental Syncs
Singer taps emit STATE messages containing bookmarks (last sync position). pg_tide persists these in tide.singer_state, enabling incremental syncs — on each run, only new or changed data is extracted.
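You can inspect the persisted bookmarks directly. A minimal sketch, assuming tide.singer_state keys its rows by pipeline name with a state JSONB column (the exact column names may differ):
-- Inspect the persisted Singer STATE bookmark for one pipeline.
-- Column names here (pipeline_name, state, updated_at) are illustrative.
SELECT pipeline_name, state, updated_at
FROM tide.singer_state
WHERE pipeline_name = 'hubspot-contacts';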
Further Reading
- Sinks: Singer — Running Singer targets (forward direction)
- Singer Protocol Feature — Detailed protocol documentation
Source: Airbyte
The Airbyte source runs Airbyte source connectors (as Docker containers) and delivers extracted records into a pg_tide inbox. This provides access to approximately 400 data sources from the Airbyte connector catalog.
Configuration
SELECT tide.relay_set_inbox(
'salesforce-data',
'crm_inbox',
'{
"source_type": "airbyte",
"source_image": "airbyte/source-salesforce:latest",
"source_config": {
"client_id": "${env:SF_CLIENT_ID}",
"client_secret": "${env:SF_CLIENT_SECRET}",
"refresh_token": "${env:SF_REFRESH_TOKEN}"
},
"streams": ["contacts", "opportunities"]
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
source_type | string | — | Must be "airbyte" |
source_image | string | — | Docker image for the Airbyte source connector |
source_config | object | {} | Connector configuration |
streams | array | null | Streams to sync (null = all discovered) |
sync_mode | string | "incremental" | Sync mode: "incremental" or "full_refresh" |
Further Reading
- Sinks: Airbyte — Airbyte destination connectors
- Singer Source — Alternative connector ecosystem (no Docker needed)
Source: stdin / File
The stdin source reads line-delimited JSON from standard input or a file and delivers each line as a message into a pg_tide inbox. This is useful for testing, replay from log files, data migration, and piping output from external commands into your inbox.
Configuration
SELECT tide.relay_set_inbox(
'file-import',
'imported_events',
'{
"source_type": "stdin",
"input_file": "/data/events-export.jsonl",
"batch_size": 1000
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
source_type | string | — | Must be "stdin" |
input_file | string | null | File path (null = read from stdin) |
batch_size | int | 100 | Lines per inbox insert batch |
format | string | "jsonl" | Input format: "jsonl" (one JSON per line) |
Use Cases
Replaying from a log file
pg-tide --postgres-url "..." < events-backup.jsonl
Piping from another tool
kafka-console-consumer --topic events --from-beginning | pg-tide --postgres-url "..."
Further Reading
- Sinks: stdout — Writing to stdout (forward direction)
- Dry-Run & Replay — Replay mode for reprocessing
Wire Formats
A wire format defines how messages are serialized on the transport layer — what shape the bytes take when they travel between systems. When pg_tide relays messages from an outbox to a sink, it encodes them into a wire format. When it receives messages from a source into an inbox, it decodes them from a wire format. The wire format is the contract between pg_tide and the external system.
Why Wire Formats Matter
Different systems expect different message shapes. A Debezium consumer expects a specific JSON envelope with before, after, and op fields. A Maxwell consumer expects database, table, type, and data. Your own services might have a custom CDC format. Wire formats let pg_tide speak all of these languages without changing your application code or database schema.
Supported Formats
| Format | Encode (outbox → sink) | Decode (source → inbox) | Use Case |
|---|---|---|---|
| Native | ✓ | ✓ | Default pg_tide format — simple and complete |
| Debezium | ✓ | ✓ | Kafka Connect ecosystem, schema registries |
| Maxwell | — | ✓ | Ingest from Maxwell CDC tool |
| Canal | — | ✓ | Ingest from Alibaba Canal CDC tool |
| CDC JSON | — | ✓ | Map any JSON CDC format via JSONPath |
Choosing a Wire Format
Use native when both sides are pg_tide, when you control the consumer, or when you want the simplest possible format. Native passes through the outbox row with minimal transformation.
Use debezium when you're feeding Kafka consumers that already understand Debezium, when you need schema registry integration, when you're replacing Debezium CDC with pg_tide, or when you want compatibility with the broader Kafka Connect ecosystem.
Use maxwell or canal when you're migrating from those MySQL CDC tools and want to ingest their streams into PostgreSQL via a pg_tide inbox.
Use cdc_json when you have a custom CDC format from any system — define JSONPath expressions that map fields to pg_tide's internal model, and you can ingest any JSON-based CDC stream without writing code.
Configuration
Wire format is specified per-pipeline in the relay configuration:
[[pipelines]]
name = "orders-to-kafka"
wire_format = "debezium"
[pipelines.wire_config]
server_name = "production"
emit_tombstones = true
Or via SQL:
SELECT tide.relay_set_outbox(
'orders-pipeline',
'order_events',
'{
"sink_type": "kafka",
"wire_format": "debezium",
"wire_config": {
"server_name": "production",
"emit_tombstones": true
}
}'::jsonb
);
Message Flow
┌─────────────┐     encode     ┌───────────────┐
│ Outbox Row  │ ─────────────→ │ Encoded Bytes │ → Sink
└─────────────┘                └───────────────┘

┌───────────────┐    decode     ┌─────────────┐
│  Raw Message  │ ────────────→ │  Inbox Row  │ → PostgreSQL
└───────────────┘               └─────────────┘
During encoding, the wire format takes an outbox row (with fields like op, new_row, old_row, stream_table) and produces bytes suitable for the transport. During decoding, it takes raw bytes from a source and extracts the semantic fields (operation type, payload, event ID, timestamps) into an inbox row.
Further Reading
- Native Format — The default, transparent format
- Debezium Format — Full Kafka Connect compatibility
- CDC JSON Format — Map any custom format via JSONPath
Wire Format: Native
The native wire format is pg_tide's default. It passes outbox rows through with minimal transformation, producing clean JSON that contains all the information needed to reconstruct the event on the receiving side. If you control both the producer and consumer, native is the simplest and most transparent choice.
Encoded Format (Outbox → Sink)
When encoding an outbox row for delivery, the native format produces:
{
"outbox_id": 42,
"op": "insert",
"stream_table": "order_events",
"payload": {
"order_id": "ORD-001",
"status": "confirmed",
"total": 99.95
}
}
The message key is set to the outbox row's routing key (if configured) or the outbox ID.
Decoded Format (Source → Inbox)
When decoding incoming messages, the native format expects JSON payloads. It extracts:
| Field | Source |
|---|---|
event_id | From message key, or generated UUID |
event_type | From stream_table field or message topic |
op | From op field (insert, update, delete) |
payload | From payload field (or entire message) |
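As an illustrative example (field values invented), a message shaped like this decodes without any configuration: op maps to insert, event_type to order_events, and payload to the embedded object.
{
  "op": "insert",
  "stream_table": "order_events",
  "payload": { "order_id": "ORD-002", "status": "pending" }
}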
Configuration
No configuration is needed — native is the default when no wire_format is specified:
[[pipelines]]
name = "orders"
# wire_format defaults to "native"
Or explicitly:
[[pipelines]]
name = "orders"
wire_format = "native"
When to Use Native
- Both producer and consumer are pg_tide (e.g., outbox → inbox replication)
- You control the consumer and can parse the simple JSON envelope
- You want maximum transparency — what goes in is what comes out
- You're debugging or developing and want to see raw messages clearly
When to Use Something Else
- Your consumer expects Debezium format → use debezium
- You're ingesting from Maxwell or Canal → use maxwell or canal
- You have a custom format with non-standard field names → use cdc_json
Further Reading
- Wire Formats Overview — Comparison of all formats
- Sinks: stdout — See native output directly
Wire Format: Debezium
The Debezium wire format produces and consumes messages in the same shape as Debezium, the popular open-source CDC platform. This means pg_tide can be a drop-in replacement for Debezium in existing architectures — your Kafka consumers, ksqlDB queries, Flink jobs, and stream processors continue working without modification.
Encoded Format (Outbox → Sink)
For an INSERT operation:
{
"schema": { ... },
"payload": {
"before": null,
"after": {
"order_id": "ORD-001",
"status": "confirmed",
"total": 99.95
},
"op": "c",
"ts_ms": 1714029482000,
"source": {
"version": "pg-tide",
"connector": "postgresql",
"name": "production",
"ts_ms": 1714029482000,
"db": "mydb",
"schema": "public",
"table": "orders",
"lsn": 12345678
}
}
}
For a DELETE operation with tombstones enabled, two messages are produced:
- The delete event (with before populated and after as null)
- A tombstone message (null value with the same key) for log compaction
Operation Mapping
| pg_tide op | Debezium op | Notes |
|---|---|---|
insert | c (create) | before = null, after = new row |
update | u (update) | before = old row, after = new row |
delete | d (delete) | before = old row, after = null |
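To make the mapping concrete, here is a sketch of the envelope an UPDATE produces (schema block omitted for brevity; field values invented):
{
  "payload": {
    "before": { "order_id": "ORD-001", "status": "confirmed" },
    "after": { "order_id": "ORD-001", "status": "shipped" },
    "op": "u",
    "ts_ms": 1714029482000,
    "source": { "name": "production", "schema": "public", "table": "orders" }
  }
}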
Decoded Format (Source → Inbox)
When consuming Debezium messages (e.g., from an actual Debezium deployment feeding Kafka), pg_tide extracts:
| Field | Source |
|---|---|
event_id | From message key or generated UUID |
event_type | {source.db}.{source.table} |
op | Mapped from payload.op: c→insert, u→update, d→delete, r→insert/upsert |
payload | From payload.after (or payload.before for deletes) |
old_payload | From payload.before (for updates) |
commit_ts | From payload.source.ts_ms |
source_position | From payload.source.lsn |
Snapshot reads (op: "r") are treated as inserts by default, configurable via snapshot_op_treatment.
Configuration
[[pipelines]]
name = "orders-to-kafka"
wire_format = "debezium"
[pipelines.wire_config]
server_name = "production"
emit_tombstones = true
envelope = "json"
tombstone_handling = "delete"
key_strategy = "primary_key"
snapshot_op_treatment = "insert"
heartbeat_interval_ms = 10000
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
server_name | string | "pg-tide" | Logical name in source.name field |
envelope | string | "json" | Envelope type: "json" or "avro" |
emit_tombstones | bool | true | Emit null-value tombstone after DELETE |
tombstone_handling | string | "delete" | How to handle incoming tombstones: "delete" or "drop" |
key_strategy | string | "primary_key" | Message key: "primary_key" or "message_key" |
snapshot_op_treatment | string | "insert" | Treat op=r as: "insert" or "upsert" |
heartbeat_interval_ms | int | 10000 | Heartbeat emission interval (0 = disabled) |
schema_registry_url | string | null | Confluent Schema Registry URL (for Avro) |
Avro Support
When envelope is set to "avro", messages are serialized using Apache Avro with schemas registered in a Confluent-compatible Schema Registry. This provides:
- Compact binary encoding (smaller messages)
- Schema evolution with compatibility checks
- Integration with the Confluent ecosystem
[pipelines.wire_config]
envelope = "avro"
schema_registry_url = "http://schema-registry:8081"
Schema Evolution
The Debezium format tracks schema changes. When a new column appears in the outbox, it's added to the Avro schema or JSON structure automatically. Incompatible changes (column removal) are detected and reported.
Tombstones and Log Compaction
Kafka topic compaction uses null-value messages (tombstones) to signal that a key should be removed. When emit_tombstones is true, a DELETE operation produces two messages:
- Delete event — Contains the old row state in before
- Tombstone — Null value with the same key, enabling compaction to remove the key
This is essential for Kafka topics configured with cleanup.policy=compact.
Migrating from Debezium to pg_tide
If you're currently using Debezium and want to switch to pg_tide:
- Configure pg_tide with wire_format = "debezium" and a matching server_name
- Consumers see identical message shapes — no changes needed
- The source.version field changes to "pg-tide", but this rarely matters
- You lose Debezium-specific features (the schema history topic) but gain transactional outbox guarantees
Further Reading
- Wire Formats Overview — Comparison of all formats
- Schema Registry Feature — Avro schema management
- Sinks: Kafka — Common pairing with Debezium format
Wire Format: Maxwell
The Maxwell wire format decodes messages produced by Maxwell's Daemon, a MySQL CDC tool. This allows pg_tide to ingest MySQL change streams into a PostgreSQL inbox — useful for migrating data from MySQL to PostgreSQL, building cross-database event pipelines, or consolidating changes from multiple MySQL instances.
Note: Maxwell is decode-only. pg_tide can consume Maxwell messages but does not produce them.
Message Shape
Maxwell produces JSON messages with this structure:
{
"database": "myapp",
"table": "users",
"type": "insert",
"ts": 1714029482,
"xid": 12345,
"data": {
"id": 7,
"name": "alice",
"email": "alice@example.com"
}
}
For UPDATE operations, an old field contains the previous values of changed columns:
{
"database": "myapp",
"table": "users",
"type": "update",
"ts": 1714029482,
"xid": 12346,
"data": {
"id": 7,
"name": "alice_new",
"email": "alice@example.com"
},
"old": {
"name": "alice"
}
}
Decoded Fields
| Inbox Field | Maxwell Source |
|---|---|
event_id | Message key or generated UUID |
event_type | {database}.{table} |
op | From type: insert, update, delete |
payload | From data field |
old_payload | From old field (updates only) |
commit_ts | From ts (Unix seconds) |
source_position | From xid or position |
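Applying these rules to the UPDATE message above yields an inbox row along these lines (shown as JSON for illustration; the physical column layout may differ):
{
  "event_type": "myapp.users",
  "op": "update",
  "payload": { "id": 7, "name": "alice_new", "email": "alice@example.com" },
  "old_payload": { "name": "alice" },
  "commit_ts": "2024-04-25T07:18:02Z"
}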
Configuration
[[pipelines]]
name = "mysql-to-postgres"
wire_format = "maxwell"
[pipelines.wire_config]
treat_bootstrap_as_insert = true
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
treat_bootstrap_as_insert | bool | true | Map bootstrap-insert events as INSERT operations |
Bootstrap Events
Maxwell supports "bootstrapping" — bulk-loading existing table data. These events have type: "bootstrap-insert". With treat_bootstrap_as_insert: true (the default), they're treated as normal inserts, allowing you to perform initial data loads through the same pipeline.
Further Reading
- Wire Formats Overview — Comparison of all formats
- Canal Format — Another MySQL CDC format
- CDC JSON Format — Generic format for custom CDC shapes
Wire Format: Canal
The Canal wire format decodes messages produced by Alibaba Canal, a MySQL/MariaDB CDC tool popular in the Chinese technology ecosystem. pg_tide can ingest Canal change streams into a PostgreSQL inbox for cross-database replication, data consolidation, or event-driven processing.
Note: Canal is decode-only. pg_tide can consume Canal messages but does not produce them.
Message Shape
Canal produces JSON messages with this structure:
{
"id": 1,
"database": "myapp",
"table": "orders",
"pkNames": ["id"],
"isDdl": false,
"type": "INSERT",
"es": 1714029482000,
"ts": 1714029483000,
"data": [
{
"id": "42",
"status": "confirmed",
"total": "99.95"
}
],
"old": null
}
For UPDATE operations:
{
"id": 2,
"database": "myapp",
"table": "orders",
"pkNames": ["id"],
"isDdl": false,
"type": "UPDATE",
"es": 1714029484000,
"ts": 1714029485000,
"data": [
{
"id": "42",
"status": "shipped",
"total": "99.95"
}
],
"old": [
{
"status": "confirmed"
}
]
}
Important: Canal serializes all column values as strings, regardless of their original MySQL type.
Decoded Fields
| Inbox Field | Canal Source |
|---|---|
event_id | Message key or generated UUID |
event_type | {database}.{table} |
op | From type: INSERT→insert, UPDATE→update, DELETE→delete |
payload | First element of data array |
old_payload | First element of old array (updates only) |
commit_ts | From es (event timestamp, milliseconds) |
source_position | From id field |
Configuration
[[pipelines]]
name = "canal-ingest"
wire_format = "canal"
[pipelines.wire_config]
skip_ddl = true
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
skip_ddl | bool | true | Skip DDL events (CREATE TABLE, ALTER TABLE, etc.) |
DDL Events
Canal captures DDL (Data Definition Language) statements like ALTER TABLE and CREATE INDEX. These events have isDdl: true. By default, pg_tide skips them since they don't represent data changes. Set skip_ddl: false if you want to capture schema changes as events in your inbox.
String-Typed Values
Canal represents all MySQL column values as JSON strings. A numeric column total DECIMAL(10,2) arrives as "99.95" rather than 99.95. Your downstream processing should account for this — you may want to use a JMESPath transform to cast values back to their intended types.
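For example, a payload projection can restore numeric types in-flight. A sketch against the order shape above; to_number() is a standard JMESPath built-in, and where the transform block attaches depends on your pipeline configuration:
{
  "transform": {
    "payload": "{ id: to_number(payload.id), status: payload.status, total: to_number(payload.total) }"
  }
}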
Further Reading
- Wire Formats Overview — Comparison of all formats
- Maxwell Format — Another MySQL CDC format
- CDC JSON Format — Generic format for custom CDC shapes
Wire Format: CDC JSON
The CDC JSON wire format is a universal decoder that maps any JSON-based change data capture format into pg_tide's internal model using JSONPath expressions. Instead of being tied to a specific CDC tool's output shape, you define paths that tell pg_tide where to find the operation type, payload, timestamps, and other fields within your custom message format.
This is the format to use when your source produces CDC-like messages but doesn't match Debezium, Maxwell, or Canal exactly.
Note: CDC JSON is decode-only. pg_tide can consume custom CDC messages but does not produce them.
How It Works
You provide JSONPath expressions that map fields from your message format to pg_tide's internal representation:
Your custom message            JSONPath config              pg_tide inbox row
─────────────────────          ─────────────────            ─────────────────
{                                                           event_id: "evt-42"
  "event_type": "created",     op_path: "$.event_type"      op: insert
  "occurred_at": "2024...",    commit_ts_path: ...          commit_ts: 2024-...
  "data": {                    payload_path: "$.data"       payload: {id: 7}
    "id": 7,
    "name": "alice"
  }
}
Configuration
[[pipelines]]
name = "custom-cdc-ingest"
wire_format = "cdc_json"
[pipelines.wire_config]
op_path = "$.event_type"
payload_path = "$.data"
commit_ts_path = "$.occurred_at"
commit_ts_format = "rfc3339"
[pipelines.wire_config.op_map]
created = "insert"
modified = "update"
removed = "delete"
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
op_path | string | "$.op" | JSONPath to the operation field |
op_map | object | {} | Map source values to pg_tide ops |
payload_path | string | "$" | JSONPath to new row / event data |
old_payload_path | string | null | JSONPath to before-state (optional) |
event_id_path | string | null | JSONPath to deduplication key |
event_type_path | string | null | JSONPath to event type (defaults to topic) |
commit_ts_path | string | null | JSONPath to commit timestamp |
commit_ts_format | string | "rfc3339" | Timestamp format: "rfc3339", "unix_seconds", "unix_millis" |
source_position_path | string | null | JSONPath to source position / offset |
JSONPath Syntax
pg_tide supports simple dot-notation JSONPath expressions:
- $ — The entire message (root)
- $.field — Top-level field
- $.field.nested — Nested field access
Array indexing is not supported. If your messages use arrays, consider preprocessing or using a JMESPath transform.
Operation Mapping
The op_map configuration translates your source's operation names into pg_tide's standard operations (insert, update, delete):
[pipelines.wire_config.op_map]
# Map custom names to pg_tide standard ops
"CREATED" = "insert"
"UPDATED" = "update"
"DELETED" = "delete"
"SNAPSHOT" = "insert"
If op_map is empty and op_path points to a field that already contains insert/update/delete, no mapping is needed.
Examples
Stripe-style Events
{
"id": "evt_1234",
"type": "invoice.paid",
"created": 1714029482,
"data": {
"object": { "id": "in_5678", "amount": 5000 }
}
}
[pipelines.wire_config]
event_id_path = "$.id"
event_type_path = "$.type"
payload_path = "$.data.object"
commit_ts_path = "$.created"
commit_ts_format = "unix_seconds"
# No op_path needed — all events are treated as inserts by default
Custom Microservice Events
{
"action": "user.updated",
"timestamp": "2024-04-25T14:30:00Z",
"correlation_id": "corr-abc-123",
"before": { "name": "Alice", "role": "user" },
"after": { "name": "Alice", "role": "admin" }
}
[pipelines.wire_config]
op_path = "$.action"
op_map = { "user.created" = "insert", "user.updated" = "update", "user.deleted" = "delete" }
event_id_path = "$.correlation_id"
event_type_path = "$.action"
payload_path = "$.after"
old_payload_path = "$.before"
commit_ts_path = "$.timestamp"
commit_ts_format = "rfc3339"
Further Reading
- Wire Formats Overview — Comparison of all formats
- Transforms — Post-decode JMESPath transformations
- Debezium Format — Standard CDC format (if your source matches)
Feature: Dead Letter Queue
When a message fails to deliver after all retry attempts — because the sink rejected it, the payload couldn't be decoded, or a permanent error occurred — it needs somewhere to go. The dead letter queue (DLQ) captures these failed messages in a PostgreSQL table where you can inspect them, understand why they failed, fix the underlying issue, and replay them back through the pipeline.
Without a DLQ, a single poison message can block an entire pipeline. The DLQ isolates failures so healthy messages continue flowing while problematic ones wait for human attention.
How It Works
Message fails → Retry (up to max_retries)
Still failing → Classify error kind
Route to DLQ → INSERT INTO tide.relay_dlq
Continue pipeline → Next messages flow normally
Failed messages are inserted into tide.relay_dlq with full context: the pipeline name, source and sink names, the original payload, the error message, and a classification of why it failed.
Error Classifications
| Error Kind | Meaning | Example |
|---|---|---|
decode | Payload couldn't be decoded from wire format | Malformed JSON, Avro schema mismatch |
sink_permanent | Sink rejected permanently (no retry will help) | Invalid credentials, schema validation failure |
inbox_permanent | Inbox insertion failed | Constraint violation, duplicate key |
max_retries_exceeded | Transient error persisted beyond retry limit | Network timeout after 5 attempts |
Configuration
DLQ is configured per-pipeline:
SELECT tide.relay_set_outbox(
'orders-pipeline',
'order_events',
'{
"sink_type": "kafka",
"brokers": "kafka:9092",
"topic": "orders",
"dlq": {
"enabled": true,
"max_retries": 5,
"retry_delay_seconds": 10,
"retention_days": 30
}
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
dlq.enabled | bool | true | Enable dead letter queue |
dlq.max_retries | int | 5 | Delivery attempts before DLQ routing |
dlq.retry_delay_seconds | int | 10 | Delay between retry attempts |
dlq.retention_days | int | 30 | Days to keep resolved DLQ entries |
Inspecting the DLQ
Query the DLQ table directly:
SELECT id, pipeline_name, error_kind, error_message, created_at
FROM tide.relay_dlq
WHERE resolved_at IS NULL
ORDER BY created_at DESC;
Or use the SQL API:
-- List unresolved DLQ entries for a pipeline
SELECT * FROM tide.relay_dlq_list('orders-pipeline');
-- View full payload of a specific entry
SELECT payload FROM tide.relay_dlq WHERE id = 42;
Replaying Failed Messages
Once you've fixed the underlying issue (corrected credentials, updated schema, fixed payload format), replay messages back through the pipeline:
-- Retry a single message
SELECT tide.relay_dlq_retry(42);
-- Retry all messages for a pipeline
SELECT tide.relay_dlq_retry_all('orders-pipeline');
Replayed messages go through the normal pipeline path. If they fail again, they return to the DLQ with an updated retry count.
Integration with Circuit Breaker
When the circuit breaker opens (sink is unhealthy), messages are routed directly to the DLQ rather than waiting indefinitely. This prevents message buildup in memory while the sink recovers. Once the circuit closes, new messages flow normally — and you can replay DLQ entries to recover the ones that were sidelined.
Monitoring
Track DLQ activity via Prometheus metrics:
- pg_tide_dlq_entries_total — Total messages routed to DLQ (by pipeline, error_kind)
- Check the DLQ table row count as part of your alerting
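For table-based alerting, a simple aggregate over unresolved entries works; this uses the columns shown in the queries above:
-- Count unresolved DLQ entries per pipeline and error kind.
SELECT pipeline_name, error_kind, count(*) AS stuck
FROM tide.relay_dlq
WHERE resolved_at IS NULL
GROUP BY pipeline_name, error_kind
ORDER BY stuck DESC;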
Further Reading
- Circuit Breaker — Automatic failure detection
- Troubleshooting — Diagnosing delivery failures
Feature: Circuit Breaker
The circuit breaker protects your relay pipelines from cascading failures. When a sink becomes unavailable — a Kafka broker goes down, an HTTP endpoint returns 503s, a database connection drops — the circuit breaker detects the pattern of consecutive failures and stops attempting delivery. This prevents wasted resources, avoids flooding error logs, and gives the downstream system time to recover.
State Machine
The circuit breaker has three states:
                  failure_threshold
  ┌────────┐        consecutive        ┌─────────┐
  │ CLOSED │ ────────────────────────→ │  OPEN   │
  │(normal)│                           │(failing)│
  └────────┘                           └─────────┘
       ↑                                    │
       │ success_threshold                  │ half_open_timeout
       │ consecutive                        ↓
  ┌────────────┐                      ┌───────────┐
  │   CLOSED   │ ←─────────────────── │ HALF-OPEN │
  └────────────┘       success        │  (probe)  │
                                      └───────────┘
                                            │
                                            │ failure
                                            ↓
                                       ┌────────┐
                                       │  OPEN  │
                                       └────────┘
Closed (normal operation): Messages flow to the sink. Each failure increments a counter; each success resets it. When consecutive failures reach failure_threshold, the circuit opens.
Open (failing): All publish attempts fail immediately without contacting the sink. After half_open_timeout elapses, the circuit transitions to half-open.
Half-open (recovery probe): A single message is allowed through as a probe. If it succeeds, success_threshold consecutive successes close the circuit. If it fails, the circuit re-opens immediately.
Configuration
SELECT tide.relay_set_outbox(
'orders-pipeline',
'order_events',
'{
"sink_type": "kafka",
"brokers": "kafka:9092",
"topic": "orders",
"circuit_breaker": {
"enabled": true,
"failure_threshold": 5,
"success_threshold": 3,
"half_open_timeout_seconds": 30
}
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
circuit_breaker.enabled | bool | true | Enable circuit breaker |
circuit_breaker.failure_threshold | int | 5 | Consecutive failures to open circuit |
circuit_breaker.success_threshold | int | 3 | Consecutive successes to close from half-open |
circuit_breaker.half_open_timeout_seconds | int | 30 | Seconds before open → half-open transition |
Behavior When Open
When the circuit is open:
- Messages are not sent to the sink (no wasted network calls)
- If a DLQ is configured, messages are routed there for later replay
- If no DLQ, the worker sleeps until the half-open timeout, then probes
- Prometheus metrics reflect the unhealthy state (pg_tide_pipeline_healthy = 0)
- The /health endpoint reports the pipeline as unhealthy
Tuning Guidelines
Low failure_threshold (2-3): Opens quickly, aggressive protection. Use for sinks that rarely have transient errors — if they fail twice, something is seriously wrong.
High failure_threshold (10-20): Tolerates intermittent failures. Use for sinks with occasional transient errors (network blips, DNS resolution hiccups).
Short half_open_timeout (5-15s): Recovers quickly after brief outages. Use for sinks that recover fast (load-balanced services, managed cloud endpoints).
Long half_open_timeout (60-300s): Gives downstream systems more time to recover. Use for sinks that take time to restart (database failovers, broker rebalances).
Monitoring
The circuit breaker state is reflected in:
- Prometheus gauge: pg_tide_pipeline_healthy (1 = closed, 0 = open)
- Health endpoint: /health returns 503 when any pipeline's circuit is open
- Logs: State transitions logged at warn level (open) and info level (close)
Further Reading
- Dead Letter Queue — Where messages go when circuit is open
- Rate Limiting — Complementary back-pressure mechanism
- Monitoring — Prometheus metrics reference
Feature: Rate Limiting
Rate limiting controls how fast messages flow from the outbox to the sink. It uses a token-bucket algorithm that allows short bursts above the steady-state rate while enforcing a long-term maximum throughput. When the bucket is empty, the relay pauses — this back-pressure propagates upstream, causing outbox rows to accumulate in PostgreSQL until the rate allows them through.
Why Rate Limit?
Rate limiting protects downstream systems from being overwhelmed. Common scenarios:
- API rate limits: Webhook endpoints, Slack, PagerDuty, and cloud APIs enforce request-per-second limits. Exceeding them causes 429 errors and potential account throttling.
- Cost control: Cloud services (BigQuery, Kinesis, Pub/Sub) charge per operation. Rate limiting caps your bill predictably.
- Graceful degradation: During bulk backfill operations, rate limiting prevents your relay from saturating network links or database connections.
- Fair sharing: Multiple pipelines competing for the same sink — rate limiting ensures each gets predictable throughput.
Configuration
SELECT tide.relay_set_outbox(
'notifications',
'notification_events',
'{
"sink_type": "slack",
"webhook_url": "${env:SLACK_WEBHOOK_URL}",
"rate_limit": {
"enabled": true,
"max_messages_per_second": 1,
"burst_size": 5
}
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
rate_limit.enabled | bool | false | Enable rate limiting |
rate_limit.max_messages_per_second | int | 0 | Steady-state rate (0 = unlimited) |
rate_limit.burst_size | int | same as rate | Burst capacity above steady rate |
How the Token Bucket Works
The token bucket starts full with burst_size tokens. Each message consumes one token. Tokens refill at max_messages_per_second rate. When the bucket is empty, the relay blocks until tokens are available.
Bucket capacity: burst_size = 10
Refill rate: max_messages_per_second = 5
Time 0s: [■■■■■■■■■■] 10 tokens (full)
→ Send 10 messages instantly (burst)
Time 0s: [ ] 0 tokens (empty)
→ Block until tokens refill
Time 1s: [■■■■■ ] 5 tokens (refilled at 5/s)
→ Send 5 messages
Time 2s: [■■■■■ ] 5 tokens
...
This means:
- The first batch can send up to burst_size messages immediately
- After the burst, throughput stabilizes at max_messages_per_second
- Brief pauses allow tokens to accumulate for the next burst
Back-Pressure Propagation
When the rate limiter blocks, it creates a chain of back-pressure:
- Rate limiter blocks → Worker pauses before publishing
- Worker pauses → Source poll interval stretches
- Source poll stretches → Outbox rows stay in PostgreSQL longer
- Rows stay in DB → No data loss, messages wait safely in the transactional outbox
This is safe because the outbox is durable — messages persist in PostgreSQL until acknowledged. The rate limiter doesn't drop messages; it just slows the relay down.
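You can watch the backlog build and drain directly. A minimal sketch, assuming undelivered messages remain visible as rows in tide.tide_outbox_messages:
-- Rough backlog gauge: the row count grows while the limiter blocks,
-- then drains as tokens refill.
SELECT count(*) AS backlog FROM tide.tide_outbox_messages;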
Tuning Guidelines
For webhook/API sinks: Set max_messages_per_second at 80% of the API's documented rate limit. This leaves headroom for retries and other clients.
For streaming sinks (Kafka, NATS): Usually unnecessary — these systems handle high throughput natively. Only rate-limit if you're paying per-message or want to control network bandwidth.
For database sinks (ClickHouse, BigQuery): Set rate to keep bulk insert batches at a comfortable size. 100-1000 msg/s is typical for analytical databases.
Burst size: Set to your expected batch size. If default_batch_size is 100, a burst_size of 100-200 allows full batches to flow without blocking.
Further Reading
- Circuit Breaker — Complementary failure protection
- Scaling — Throughput optimization strategies
Feature: Schema Registry
The schema registry integration enables Avro serialization with Confluent Schema Registry compatibility. Instead of sending verbose JSON over the wire, messages are serialized as compact Avro binary with schema IDs — reducing message size by 50-80% while providing schema evolution guarantees.
Why Use a Schema Registry?
- Compact messages: Avro binary is significantly smaller than JSON, reducing network bandwidth and storage costs
- Schema evolution: Add fields, remove optional fields, or change defaults without breaking consumers
- Contract enforcement: Producers can't send data that violates the registered schema
- Discovery: Consumers can look up the schema by ID rather than out-of-band documentation
Configuration
SELECT tide.relay_set_outbox(
'orders-pipeline',
'order_events',
'{
"sink_type": "kafka",
"brokers": "kafka:9092",
"topic": "orders",
"wire_format": "debezium",
"wire_config": {
"envelope": "avro",
"schema_registry_url": "http://schema-registry:8081"
},
"schema_registry": {
"url": "http://schema-registry:8081",
"username": "${env:SR_USER}",
"password": "${env:SR_PASS}",
"auto_register": true
},
"serialization": {
"format": "avro",
"subject_name_strategy": "TopicName"
}
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
schema_registry.url | string | null | Schema Registry URL |
schema_registry.username | string | null | HTTP Basic Auth username |
schema_registry.password | string | null | HTTP Basic Auth password |
schema_registry.auto_register | bool | true | Auto-register new schemas |
serialization.format | string | "json" | Wire format: "json" or "avro" |
serialization.subject_name_strategy | string | "TopicName" | Subject naming strategy |
Subject Name Strategies
The subject name strategy determines how schemas are organized in the registry:
| Strategy | Subject Format | Use Case |
|---|---|---|
TopicName | {topic}-value | One schema per topic (most common) |
RecordName | {record_name}-value | Schema shared across topics |
TopicRecordName | {topic}-{record_name}-value | Multiple record types per topic |
Confluent Wire Format
Messages serialized with the schema registry follow the Confluent wire format:
┌───────┬──────────────┬─────────────────┐
│ Magic │ Schema ID │ Avro Payload │
│ 0x00 │ (4 bytes) │ (N bytes) │
└───────┴──────────────┴─────────────────┘
- Magic byte (0x00): Identifies Confluent serialization
- Schema ID (4 bytes, big-endian): Registry identifier for this schema
- Payload: Avro-encoded binary data
Schema Evolution
When your outbox table gains new columns, the schema evolves:
- pg_tide detects the new field in the outbox row
- A new Avro schema is generated with the additional field
- If auto_register is true, the schema is registered (compatibility checked)
- Messages are serialized with the new schema ID
- Consumers using the registry can decode both old and new messages
The registry enforces backward compatibility by default — new schemas must be readable by consumers using the previous schema version.
Further Reading
- Wire Format: Debezium — Avro support in Debezium format
- Sinks: Kafka — Common pairing with Schema Registry
Feature: Transforms
Transforms let you filter and reshape messages in-flight — between polling from the source and publishing to the sink. Using JMESPath expressions, you can drop messages that don't match a condition, extract specific fields from a payload, reshape the data structure, or compute derived values. All without touching your application code or database schema.
Two Operations
Transforms provide two complementary operations:
Filter — A JMESPath expression evaluated as a predicate. If the result is "truthy" (not null, not false, not empty), the message passes through. If "falsy", the message is silently dropped and acknowledged.
Payload projection — A JMESPath expression that replaces the entire message payload with its result. The original payload goes in; the expression's output comes out.
You can use filter alone, projection alone, or both together (filter is applied first).
Configuration
SELECT tide.relay_set_outbox(
'high-value-orders',
'order_events',
'{
"sink_type": "kafka",
"brokers": "kafka:9092",
"topic": "high-value-orders",
"transform": {
"filter": "payload.total > `1000`",
"payload": "{ order_id: payload.id, amount: payload.total, customer: payload.customer_email }"
}
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
transform.filter | string | null | JMESPath filter expression (truthy = keep) |
transform.payload | string | null | JMESPath projection expression (replaces payload) |
JMESPath Quick Reference
JMESPath is a query language for JSON. Here are the patterns most useful for transforms:
Field Access
payload.order_id → "ORD-001"
payload.customer.email → "alice@example.com"
Comparisons (for filters)
payload.status == 'confirmed' → true/false
payload.amount > `100` → true/false (backtick for literals)
payload.priority != 'low' → true/false
Boolean Logic (for filters)
payload.status == 'confirmed' && payload.amount > `100`
payload.region == 'us' || payload.region == 'eu'
Object Projection (for payloads)
{
id: payload.order_id,
total: payload.amount,
email: payload.customer.email
}
Field Existence (for filters)
payload.premium_tier → truthy if field exists and is not null
Examples
Filter: Only forward error events
{
"transform": {
"filter": "payload.level == 'error'"
}
}
Filter: Drop internal events
{
"transform": {
"filter": "payload.source != 'internal'"
}
}
Projection: Slim down payload
{
"transform": {
"payload": "{ id: payload.id, type: payload.event_type, data: payload.data }"
}
}
Combined: Filter and reshape
{
"transform": {
"filter": "payload.country == 'US' && payload.amount > `50`",
"payload": "{ order: payload.id, amount: payload.amount, state: payload.shipping.state }"
}
}
Truthiness Rules
JMESPath truthiness determines whether a filter passes:
| Value | Truthy? | Example |
|---|---|---|
null | No | Missing field |
false | No | Failed comparison |
"" (empty string) | No | Empty text field |
[] (empty array) | No | No items |
{} (empty object) | No | No fields |
| Everything else | Yes | Numbers, non-empty strings, arrays with items |
Performance
Transforms are applied in-memory before publishing. JMESPath expressions are compiled once at pipeline startup and evaluated per-message. The overhead is negligible for typical expressions — a few microseconds per message.
For filters that drop many messages, transforms reduce load on the sink (fewer messages to publish) while the source continues polling at full speed.
Further Reading
- Routing — Content-based topic routing (complementary feature)
- JMESPath specification — Full language reference
Feature: Content-Based Routing
Content-based routing dynamically determines the destination subject (topic, queue, channel) for each message based on its payload content. Instead of sending all messages from an outbox to a single topic, you can route them to different destinations based on event type, priority, region, or any other field in the message.
How It Works
Routing rules are evaluated in order. Each rule matches a field in the message payload against an expected value. The first rule that matches determines the output subject. If no rule matches, the default template is used.
Message payload:                    Routing rules:
{                                   1. event_type == "order.created" → orders.created
  "event_type": "order.shipped",    2. event_type == "order.shipped" → orders.shipped
  "region": "eu",                   3. region == "eu"                → eu.events
  "priority": "high"                default                          → tide.{stream_table}
}

Result: "orders.shipped" (rule 2 matches first)
Configuration
SELECT tide.relay_set_outbox(
'multi-topic-orders',
'order_events',
'{
"sink_type": "kafka",
"brokers": "kafka:9092",
"routing": {
"default_template": "orders.general",
"rules": [
{
"match_field": "event_type",
"match_value": "order.created",
"subject": "orders.created"
},
{
"match_field": "event_type",
"match_value": "order.shipped",
"subject": "orders.shipped"
},
{
"match_field": "priority",
"match_value": "high",
"subject": "high-priority.orders"
}
]
}
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
routing.default_template | string | "{stream_table}" | Fallback subject when no rule matches |
routing.rules | array | [] | Ordered list of routing rules |
routing.rules[].match_field | string | — | Dot-separated path into payload |
routing.rules[].match_value | string | — | Expected value (string equality) |
routing.rules[].subject | string | — | Subject template when rule matches |
Subject Templates
Subject strings support template variables that are expanded at runtime:
| Variable | Expansion |
|---|---|
{stream_table} | The outbox's stream table name |
{op} | Operation type: insert, update, delete |
{outbox_id} | Numeric outbox row ID |
Examples:
"orders.{op}"→"orders.insert","orders.update","orders.delete""{stream_table}.{op}"→"order_events.insert""priority.{stream_table}"→"priority.order_events"
Nested Field Access
The match_field parameter supports dot-separated paths for nested objects:
{
"routing": {
"rules": [
{
"match_field": "customer.tier",
"match_value": "enterprise",
"subject": "enterprise-events"
},
{
"match_field": "shipping.country",
"match_value": "US",
"subject": "domestic-shipping"
}
]
}
}
Rule Evaluation
- Rules are evaluated in order — first match wins
- If no rule matches, the default_template is used
- Field matching is string equality (case-sensitive)
- Missing fields never match (treated as null, which ≠ any string)
Use Cases
Fan-out by event type
Route different event types to dedicated topics for independent consumers:
{
"rules": [
{ "match_field": "type", "match_value": "user.signup", "subject": "user-signups" },
{ "match_field": "type", "match_value": "user.churn", "subject": "user-churn" },
{ "match_field": "type", "match_value": "order.placed", "subject": "new-orders" }
]
}
Priority routing
Send high-priority messages to a fast-lane topic with dedicated consumers:
{
"rules": [
{ "match_field": "priority", "match_value": "critical", "subject": "alerts.critical" },
{ "match_field": "priority", "match_value": "high", "subject": "alerts.high" }
],
"default_template": "alerts.normal"
}
Geographic routing
Route events to region-specific topics:
{
"rules": [
{ "match_field": "region", "match_value": "eu", "subject": "events.eu" },
{ "match_field": "region", "match_value": "us", "subject": "events.us" },
{ "match_field": "region", "match_value": "apac", "subject": "events.apac" }
]
}
Further Reading
- Transforms — Filter and reshape messages (applied before routing)
- Fan-Out Pattern — Tutorial combining routing with multiple consumers
Feature: Webhook Signatures
When pg_tide sends outgoing webhooks (via the HTTP webhook sink) or receives incoming webhooks (via the webhook receiver source), it can sign and verify messages using HMAC-based signatures. This ensures the recipient can verify the webhook came from pg_tide (outgoing) and that pg_tide can verify webhooks come from a trusted sender (incoming).
Outgoing Webhook Signatures
When publishing to an HTTP webhook endpoint, pg_tide can sign the request body and include the signature in a header. The receiving service verifies the signature to ensure the request is authentic and hasn't been tampered with.
Configuration (Outgoing)
SELECT tide.relay_set_outbox(
'order-notifications',
'order_events',
'{
"sink_type": "webhook",
"url": "https://partner.example.com/webhooks/orders",
"signature": {
"scheme": "hmac-sha256",
"secret": "${env:WEBHOOK_SIGNING_SECRET}",
"header": "X-Signature-256"
}
}'::jsonb
);
The signature is computed as HMAC-SHA256(secret, request_body) and sent as a hex-encoded string in the configured header.
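On the receiving side, verification is recompute-and-compare. A hedged sketch using openssl (the file name and variable are illustrative; production code should compare digests in constant time):
# Recompute the HMAC over the raw request body; the hex digest should
# match the value carried in the configured signature header.
openssl dgst -sha256 -hmac "$WEBHOOK_SIGNING_SECRET" < body.json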
Incoming Webhook Verification
When receiving webhooks from external services via the webhook receiver source, pg_tide verifies the signature before accepting the message. Requests with invalid signatures are rejected with HTTP 401.
Configuration (Incoming)
SELECT tide.relay_set_inbox(
'stripe-events',
'payment_inbox',
'{
"source_type": "webhook",
"listen_addr": "0.0.0.0:8080",
"path": "/webhooks/stripe",
"signature_scheme": "stripe",
"signature_secret": "${env:STRIPE_WEBHOOK_SECRET}"
}'::jsonb
);
Supported Signature Schemes
hmac-sha256 — Standard HMAC
The most common webhook signing scheme. Computes HMAC-SHA256 of the request body using a shared secret.
- Header: Configurable (default: X-Signature-256)
- Format: sha256=<hex-encoded-hmac>
- Verification: Recompute HMAC and compare (constant-time)
github — GitHub Webhooks
GitHub's webhook signature format using X-Hub-Signature-256.
- Header: X-Hub-Signature-256
- Format: sha256=<hex-encoded-hmac>
- Secret: Your GitHub webhook secret
stripe — Stripe Webhooks
Stripe's signature includes a timestamp to prevent replay attacks.
- Header: Stripe-Signature
- Format: t=<timestamp>,v1=<hmac>
- Verification: HMAC computed over timestamp.body
- Replay protection: Rejects signatures with timestamps too far in the past
svix — Svix Webhook Platform
Svix is a webhook delivery platform. pg_tide can verify Svix-signed webhooks.
- Header: svix-signature
- Format: Svix-specific signature scheme
- Includes: Message ID, timestamp, and signature
Security Considerations
- Always use HTTPS for webhook endpoints. Signatures prove authenticity but don't encrypt the payload.
- Rotate secrets periodically. When rotating, you can temporarily accept both old and new secrets.
- Use environment variables for secrets (never hardcode): ${env:SECRET_NAME}
- Reject stale timestamps (the Stripe scheme does this automatically) to prevent replay attacks.
Further Reading
- Sinks: HTTP Webhook — Outgoing webhook configuration
- Sources: Webhook Receiver — Incoming webhook server
- Security Guide — Broader security practices
Feature: Dry-Run and Replay
Dry-run mode lets you test a pipeline configuration without actually publishing messages to the sink. Replay mode lets you reprocess a range of messages from your outbox — useful for backfilling a new consumer, recovering from a sink failure, or reprocessing after fixing a bug.
Dry-Run Mode
In dry-run mode, the relay performs every step of the pipeline (polling, transforms, routing) but skips the actual publish to the sink. Instead, it logs what would have been sent. This is invaluable for:
- Validating configuration before going live
- Testing transforms to see which messages pass the filter and what the output looks like
- Verifying routing to confirm messages end up on the expected topics
- Capacity planning to understand message volume and size before connecting a real sink
Configuration
SELECT tide.relay_set_outbox(
'orders-test',
'order_events',
'{
"sink_type": "kafka",
"brokers": "kafka:9092",
"topic": "orders",
"dry_run": true
}'::jsonb
);
Or via TOML:
[[pipelines]]
name = "orders-test"
dry_run = true
What Gets Logged
In dry-run mode, each batch produces log output like:
INFO [orders-test] dry-run: would publish 5 messages to "orders"
INFO [orders-test] dry-run: msg[0] key="ORD-001" size=342 bytes
INFO [orders-test] dry-run: msg[1] key="ORD-002" size=287 bytes
...
Messages are still acknowledged from the source after logging — so the pipeline advances through the outbox even in dry-run mode. This means you can run dry-run temporarily to see what's flowing, then disable it to start real delivery from the current position.
Replay Mode
Replay mode reprocesses a range of outbox messages, typically to backfill a new consumer or recover from a failure. You specify a starting offset and optionally an ending offset, and the relay processes only messages within that range.
Configuration
SELECT tide.relay_set_outbox(
'orders-backfill',
'order_events',
'{
"sink_type": "kafka",
"brokers": "kafka:9092",
"topic": "orders-v2",
"replay": {
"from_offset": 1000,
"to_offset": 5000
}
}'::jsonb
);
Replay Behavior
- Messages outside the offset range are skipped (not published, not acknowledged)
- When the range is exhausted, the pipeline exits cleanly
- Replay pipelines can run alongside live pipelines — they don't interfere
- Combine with transforms to replay with different filtering or reshaping
Use Cases
Backfilling a new consumer: When you add a new Kafka consumer that needs historical data, create a replay pipeline targeting the new topic. Once the replay completes, switch to a live pipeline for ongoing messages.
Recovering from sink failure: If your sink was down and messages went to the DLQ, you can replay the affected range instead of retrying individual DLQ entries.
Reprocessing after a bug fix: If a transform had a bug that produced incorrect output, fix the transform and replay the affected range to the same (or a new) destination.
Combining Dry-Run and Replay
You can use both together to preview what a replay would produce without actually sending anything:
{
"dry_run": true,
"replay": {
"from_offset": 1000,
"to_offset": 2000
}
}
This is useful for estimating replay volume and validating transform behavior on historical messages.
Further Reading
- Dead Letter Queue — Alternative recovery path for individual messages
- Transforms — Reshape messages during replay
Feature: Configuration Hot-Reload
pg_tide supports hot-reloading pipeline configurations without restarting the relay process. When you add, modify, or disable a pipeline in the PostgreSQL catalog, the relay detects the change and reconciles — starting new pipelines, stopping removed ones, and updating modified configurations in place.
How It Works
The relay discovers configuration changes through two mechanisms:
- LISTEN/NOTIFY — Immediate notification when catalog tables change
- Periodic polling — Rediscovers pipelines every discovery_interval_secs (fallback)
When a change is detected, the coordinator compares the new pipeline set against the currently running pipelines:
- New pipeline → Acquire advisory lock, spawn worker task
- Removed pipeline → Signal worker to stop, release advisory lock
- Modified pipeline → Stop old worker, start new one with updated config
- Disabled pipeline → Same as removed (worker stopped, lock released)
Triggering a Reload
Automatic (via LISTEN/NOTIFY)
The relay listens on the tide_relay_config PostgreSQL notification channel. When you call any tide.relay_set_* function, a notification is emitted automatically:
-- This triggers immediate reload
SELECT tide.relay_set_outbox(
'orders-pipeline',
'order_events',
'{"sink_type": "kafka", "brokers": "kafka:9092", "topic": "orders"}'::jsonb
);
Periodic Discovery
Even without NOTIFY (e.g., if the relay reconnects after a network partition), the coordinator polls for changes every discovery_interval_secs:
# Default: 30 seconds
discovery_interval_secs = 30
What Can Be Changed Without Restart
| Change | Hot-Reload? | Notes |
|---|---|---|
| Add new pipeline | ✓ | Started within seconds |
| Remove pipeline | ✓ | Gracefully drained and stopped |
| Change sink type | ✓ | Worker restarted with new sink |
| Change sink config (URL, topic) | ✓ | Worker restarted |
| Change transforms/routing | ✓ | Worker restarted |
| Enable/disable pipeline | ✓ | Started or stopped |
| Change relay process config | ✗ | Requires restart |
| Change metrics_addr | ✗ | Requires restart |
| Change postgres_url | ✗ | Requires restart |
Graceful Pipeline Transitions
When a pipeline configuration changes, the existing worker is drained before the new one starts:
- Worker receives stop signal
- Current batch completes (in-flight messages finish)
- Source acknowledgment completes
- Worker task exits
- New worker spawns with updated config
- New worker begins polling
This ensures no messages are lost or double-processed during reconfiguration.
Configuration
The discovery mechanism itself is configured at the process level:
# How often to poll for pipeline changes (fallback)
discovery_interval_secs = 30
Or via CLI:
pg-tide --discovery-interval 30
Further Reading
- Relay Configuration — Full process-level configuration
- HA Coordination — How multiple relays handle config changes
Feature: OpenTelemetry
pg_tide integrates with OpenTelemetry for distributed tracing, giving you end-to-end visibility into how messages flow from your application through the outbox, relay, and into the sink. Traces show exactly where time is spent — polling the source, applying transforms, publishing to the sink, and acknowledging delivery.
What You Get
With OpenTelemetry enabled, every relay poll cycle produces a distributed trace with spans for:
- poll_cycle — Root span covering one full iteration
- source_poll — Time spent fetching messages from the source
- sink_publish — Time spent delivering the batch to the sink
- source_acknowledge — Time spent acknowledging processed messages
These spans include attributes like pipeline name, batch size, message count, and any errors encountered.
Configuration
Enable OpenTelemetry by setting the OTLP endpoint:
pg-tide \
--postgres-url "postgres://..." \
--otel-endpoint "http://localhost:4317"
Or via environment variable:
export PG_TIDE_OTEL_ENDPOINT="http://localhost:4317"
pg-tide --postgres-url "postgres://..."
Configuration Reference
| Parameter | Source | Default | Description |
|---|---|---|---|
| --otel-endpoint | CLI | null | OTLP gRPC endpoint |
| PG_TIDE_OTEL_ENDPOINT | Environment | null | OTLP gRPC endpoint |
When no endpoint is configured, OpenTelemetry is completely disabled (zero overhead).
Compatible Backends
pg_tide exports traces via OTLP (OpenTelemetry Protocol) gRPC, compatible with:
- Jaeger — Open-source distributed tracing
- Grafana Tempo — Scalable trace backend
- Honeycomb — Observability platform
- Datadog — APM and tracing
- AWS X-Ray (via OTEL Collector)
- Google Cloud Trace (via OTEL Collector)
- Any OTEL Collector — Route to multiple backends
Example: Grafana Tempo
# docker-compose.yml
services:
tempo:
image: grafana/tempo:latest
ports:
- "4317:4317" # OTLP gRPC
- "3200:3200" # Tempo query
pg-tide:
environment:
PG_TIDE_OTEL_ENDPOINT: "http://tempo:4317"
Trace Attributes
Spans include these attributes for filtering and analysis:
| Attribute | Description |
|---|---|
| service.name | "pg-tide-relay" |
| pipeline.name | Pipeline identifier |
| pipeline.direction | "forward" or "reverse" |
| batch.size | Number of messages in batch |
| error | Error message (if span errored) |
Feature Gate
OpenTelemetry support is compiled behind a feature gate. The pre-built binaries include it. If building from source:
cargo build --release --features otel
Without the otel feature, all tracing functions are no-ops with zero runtime cost.
Further Reading
- Metrics — Prometheus metrics (complementary to tracing)
- Monitoring — Complete observability setup
- OpenTelemetry Collector Integration — Advanced collector configuration
Feature: High Availability Coordination
pg_tide achieves high availability through PostgreSQL advisory locks. Multiple relay instances can run simultaneously — each discovers the same set of pipelines, but only one instance owns each pipeline at any time. If an instance crashes or loses its database connection, its locks are automatically released and another instance takes over.
How It Works
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Relay #1 │ │ Relay #2 │ │ Relay #3 │
│ owns: A, B │ │ owns: C, D │ │ owns: E │
└─────────────┘ └─────────────┘ └─────────────┘
│ │ │
└───────────────────┼───────────────────┘
│
┌────────────────┐
│   PostgreSQL   │
│ advisory locks │
└────────────────┘
Each relay instance:
- Discovers all enabled pipelines from the catalog
- Attempts to acquire a PostgreSQL advisory lock for each pipeline
- Only starts worker tasks for pipelines where it holds the lock
- Periodically re-checks lock ownership during discovery
If Relay #1 crashes:
- PostgreSQL automatically releases its advisory locks (session locks die with the connection)
- Relay #2 or #3 acquires locks for pipelines A and B on the next discovery cycle
- Messages continue flowing within discovery_interval_secs
Advisory Lock Mechanics
pg_tide uses pg_try_advisory_lock(key1, key2) where:
- key1 = hashtext(relay_group_id) — Groups relays into a coordination cluster
- key2 = hashtext(pipeline_name) — Identifies the specific pipeline
pg_try_advisory_lock is non-blocking — if another instance holds the lock, it returns false immediately rather than waiting. This means relay instances never deadlock or block each other.
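You can reproduce the lock behavior in psql. A quick sketch, using the example group and pipeline names from this page:
-- Session 1: acquires the lock, returns true
SELECT pg_try_advisory_lock(hashtext('production'), hashtext('orders-pipeline'));
-- Session 2: lock is held elsewhere, returns false immediately
SELECT pg_try_advisory_lock(hashtext('production'), hashtext('orders-pipeline'));
-- Session 1: release explicitly (or just disconnect — session-level
-- advisory locks are released when the connection closes)
SELECT pg_advisory_unlock(hashtext('production'), hashtext('orders-pipeline'));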
Configuration
Relay Group ID
All relay instances that should coordinate must share the same relay_group_id:
# Instance 1
pg-tide --postgres-url "..." --relay-group-id "production"
# Instance 2
pg-tide --postgres-url "..." --relay-group-id "production"
Instances with different group IDs operate independently — they can both own the same pipeline (useful for blue/green deployments or multi-region setups with separate databases).
Discovery Interval
Controls how quickly failover happens:
pg-tide --discovery-interval 10 # Check every 10 seconds
Lower values = faster failover, but more frequent PostgreSQL queries. The default of 30 seconds is a good balance for most deployments.
Failover Timeline
t=0s Relay #1 crashes (holds locks for pipeline A, B)
t=0s PostgreSQL closes connection, releases advisory locks
t=10s Relay #2 runs discovery cycle
t=10s Relay #2 acquires lock for pipeline A (success)
t=10s Relay #2 spawns worker for pipeline A
t=10s Pipeline A resumes processing
With discovery_interval = 10, worst-case failover is 10 seconds. Messages are never lost — they wait safely in the outbox until a relay instance picks them up.
Scaling Patterns
Active-Active (recommended)
Run N relay instances. Pipelines are distributed across all instances automatically. As you add instances, pipelines rebalance on the next discovery cycle.
Active-Standby
Run 2 instances. The primary acquires all locks. If it fails, the standby takes over. Simpler but less efficient than active-active.
Per-Pipeline Scaling
For high-throughput pipelines, use consumer groups to parallelize within a single pipeline rather than running multiple relay instances.
Graceful Shutdown
When a relay instance receives SIGTERM:
- Coordinator sends stop signal to all owned workers
- Workers complete their current batch (in-flight messages finish)
- Workers acknowledge processed messages
- Coordinator releases all advisory locks
- Process exits
Other instances detect released locks on next discovery and take ownership.
Monitoring HA
- Prometheus gauge: pg_tide_pipeline_healthy per instance shows which pipelines each instance owns
- Advisory locks query: SELECT * FROM pg_locks WHERE locktype = 'advisory' shows current ownership (expanded below)
- Health endpoint: /health reports healthy only if the instance owns at least one pipeline
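To map lock holders to relay connections, join pg_locks against pg_stat_activity. A sketch: for two-key advisory locks, classid carries key1 (the relay group hash) and objid carries key2 (the pipeline hash):
SELECT l.classid AS group_key,
       l.objid   AS pipeline_key,
       a.pid,
       a.application_name,
       a.client_addr
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE l.locktype = 'advisory'
  AND l.granted;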
Further Reading
- Graceful Shutdown — Clean shutdown behavior
- Deployment Architectures — HA deployment patterns
- Scaling — Scaling strategies
Feature: Graceful Shutdown
When the relay receives a shutdown signal (SIGTERM or SIGINT), it doesn't abruptly terminate. Instead, it performs a graceful drain: in-flight batches complete, messages are acknowledged, advisory locks are released, and connections are closed cleanly. This ensures no messages are lost or double-processed during deployments, restarts, or scaling events.
Shutdown Sequence
1. SIGTERM received
2. Coordinator signals all worker tasks to stop
3. Each worker:
a. Finishes current batch publish (if in progress)
b. Acknowledges the batch with the source
c. Exits its processing loop
4. Coordinator waits for all workers to exit
5. Coordinator releases all advisory locks
6. Metrics server stops accepting new requests
7. OpenTelemetry flushes pending traces
8. Process exits with code 0
Why This Matters
Without graceful shutdown:
- In-flight messages could be published to the sink but not acknowledged in the source, causing re-delivery (duplicates)
- Advisory locks would be held until PostgreSQL's connection timeout (potentially minutes), delaying failover
- Metrics might not be scraped for the final interval
- Traces might be lost
With graceful shutdown:
- Every message is either fully processed (published + acknowledged) or not processed at all
- Advisory locks are released immediately, enabling instant failover
- Final metrics are available for scraping
- All traces are exported
Shutdown Timeout
The relay enforces a maximum shutdown duration. If workers don't exit within the timeout, the process terminates forcefully:
# Default: 30 seconds
pg-tide --shutdown-timeout 30
If a sink is extremely slow (e.g., a webhook endpoint that takes 60 seconds to respond), increase this timeout. In Kubernetes, ensure terminationGracePeriodSeconds exceeds your shutdown timeout.
Kubernetes Integration
In Kubernetes deployments, the pod receives SIGTERM when it's being evicted, scaled down, or updated. Configure your deployment to give pg_tide enough time:
spec:
terminationGracePeriodSeconds: 60 # Must exceed shutdown-timeout
containers:
- name: pg-tide
command: ["pg-tide", "--shutdown-timeout", "45"]
PreStop Hook (optional)
If you need extra time for load balancers to drain connections to the metrics endpoint:
lifecycle:
preStop:
exec:
command: ["sleep", "5"]
Signal Handling
| Signal | Behavior |
|---|---|
| SIGTERM | Graceful shutdown (standard Kubernetes signal) |
| SIGINT | Graceful shutdown (Ctrl+C in terminal) |
| SIGKILL | Immediate termination (cannot be caught) |
Further Reading
- HA Coordination — How shutdown interacts with advisory locks
- Deployment Guide — Production deployment practices
Feature: Prometheus Metrics
pg_tide exposes Prometheus-format metrics via an HTTP endpoint, giving you real-time visibility into pipeline throughput, error rates, latency, and health. These metrics integrate with Grafana, Datadog, or any Prometheus-compatible monitoring stack.
Metrics Endpoint
The relay starts an HTTP server on port 9090 by default:
GET http://localhost:9090/metrics → Prometheus text format
GET http://localhost:9090/health → Health check (200 or 503)
Configure the listen address:
pg-tide --metrics-addr "0.0.0.0:9090"
Available Metrics
Counters
| Metric | Labels | Description |
|---|---|---|
| pg_tide_messages_published_total | pipeline, direction | Total messages successfully published to sink |
| pg_tide_messages_consumed_total | pipeline, direction | Total messages consumed from source |
| pg_tide_publish_errors_total | pipeline, direction | Total publish failures |
| pg_tide_dedup_skipped_total | pipeline | Messages skipped due to deduplication |
Gauges
| Metric | Labels | Description |
|---|---|---|
| pg_tide_pipeline_healthy | pipeline | 1 = healthy, 0 = circuit breaker open |
| pg_tide_consumer_lag | pipeline | Pending messages in outbox (estimated) |
Histograms
| Metric | Labels | Buckets (seconds) | Description |
|---|---|---|---|
| pg_tide_delivery_latency_seconds | pipeline | 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1.0, 5.0, 30.0 | Time from outbox insert to sink acknowledgment |
Labels
All metrics are labeled by:
- pipeline — Pipeline name (e.g., "orders-to-kafka")
- direction — "forward" (outbox → sink) or "reverse" (source → inbox)
Health Endpoint
The /health endpoint returns:
- 200 OK with body "healthy" — All pipelines have closed circuit breakers
- 503 Service Unavailable with body "unhealthy: [pipeline-a, pipeline-b]" — One or more pipelines have open circuit breakers
Use this for Kubernetes liveness/readiness probes:
livenessProbe:
httpGet:
path: /health
port: 9090
initialDelaySeconds: 5
periodSeconds: 10
Prometheus Scrape Configuration
# prometheus.yml
scrape_configs:
- job_name: 'pg-tide'
static_configs:
- targets: ['pg-tide:9090']
scrape_interval: 15s
For Kubernetes with pod annotations:
metadata:
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "9090"
prometheus.io/path: "/metrics"
Key Queries
Throughput (messages/second)
rate(pg_tide_messages_published_total[5m])
Error rate
rate(pg_tide_publish_errors_total[5m])
Delivery latency (p99)
histogram_quantile(0.99, rate(pg_tide_delivery_latency_seconds_bucket[5m]))
Consumer lag
pg_tide_consumer_lag
Unhealthy pipelines
pg_tide_pipeline_healthy == 0
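The same signals are available directly in SQL via views the extension provides, which is handy for cross-checking the exporter or for environments without Prometheus:
-- Pending messages per outbox (backlog waiting for the relay)
SELECT * FROM tide.outbox_pending;
-- Per-consumer lag, filtered to meaningful backlogs
SELECT * FROM tide.consumer_lag WHERE lag > 1000;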
Further Reading
- Dashboards — Pre-built Grafana dashboards
- OpenTelemetry — Distributed tracing (complementary)
- Monitoring Guide — Complete observability setup
Feature: Grafana Dashboards
pg_tide ships with a pre-built Grafana dashboard that visualizes relay health, throughput, latency, and error rates. Import it into your Grafana instance for instant observability without manual panel creation.
Importing the Dashboard
The dashboard JSON is located at pg-tide/dashboards/relay-health.json in the repository. Import it into Grafana:
- Open Grafana → Dashboards → Import
- Upload or paste the JSON from pg-tide/dashboards/relay-health.json
- Select your Prometheus data source
- Click Import
Or use the Grafana API:
curl -X POST http://admin:admin@grafana:3000/api/dashboards/db \
-H 'Content-Type: application/json' \
-d @pg-tide/dashboards/relay-health.json
Dashboard Panels
The relay health dashboard includes:
Overview Row
- Pipeline Status — Table showing each pipeline's health status, last error, and uptime
- Total Throughput — Graph of messages/second across all pipelines
- Active Pipelines — Count of currently running pipelines
Throughput Row
- Messages Published (per pipeline) — Rate of successful publishes
- Messages Consumed (per pipeline) — Rate of messages polled from source
- Publish Errors (per pipeline) — Rate of delivery failures
Latency Row
- Delivery Latency (p50/p95/p99) — Histogram showing message transit time
- Latency Heatmap — Distribution of delivery times over time
Health Row
- Circuit Breaker State — Timeline showing open/closed state per pipeline
- Consumer Lag — Current backlog per pipeline
- DLQ Entries — Count of unresolved dead letter queue entries
Alerting Rules
Suggested Grafana alert rules to pair with the dashboard:
High Error Rate
alert: PgTideHighErrorRate
expr: rate(pg_tide_publish_errors_total[5m]) > 0.1
for: 5m
labels:
severity: warning
annotations:
summary: "pg_tide pipeline {{ $labels.pipeline }} has elevated errors"
Circuit Breaker Open
alert: PgTideCircuitOpen
expr: pg_tide_pipeline_healthy == 0
for: 1m
labels:
severity: critical
annotations:
summary: "pg_tide pipeline {{ $labels.pipeline }} circuit breaker is open"
High Consumer Lag
alert: PgTideHighLag
expr: pg_tide_consumer_lag > 10000
for: 10m
labels:
severity: warning
annotations:
summary: "pg_tide pipeline {{ $labels.pipeline }} has {{ $value }} pending messages"
Customization
The dashboard uses standard Prometheus queries. Customize it by:
- Adding panels for specific pipelines
- Adjusting time ranges and refresh intervals
- Adding annotations for deployment events
- Linking to your tracing backend (Tempo, Jaeger) for drill-down
Further Reading
- Metrics — Available Prometheus metrics
- Prometheus + Grafana Integration — Full stack setup
- Monitoring Guide — Observability best practices
Feature: Singer Protocol Support
pg_tide implements the Singer specification for both extraction (taps) and loading (targets). This gives you access to approximately 500 data connectors from the Meltano Hub ecosystem without writing custom integration code.
What is Singer?
Singer is a specification for moving data between systems. It defines three message types that flow between "taps" (data extractors) and "targets" (data loaders) over standard I/O:
- RECORD — A single data row with stream name and record data
- SCHEMA — JSON Schema for a stream (column names, types)
- STATE — Bookmark for incremental sync (last sync position)
pg_tide as a Singer Target
When used as a sink, pg_tide acts as a Singer target — it receives RECORD, SCHEMA, and STATE messages from any Singer tap and writes them into the configured destination:
SELECT tide.relay_set_outbox(
'hubspot-to-warehouse',
'etl_events',
'{
"sink_type": "singer",
"target_command": "target-postgres",
"target_config": {
"host": "warehouse.example.com",
"database": "analytics"
}
}'::jsonb
);
See Sinks: Singer for full configuration.
pg_tide as a Singer Tap Consumer
When used as a source, pg_tide runs a Singer tap subprocess and ingests its output into a pg_tide inbox:
SELECT tide.relay_set_inbox(
'salesforce-sync',
'crm_inbox',
'{
"source_type": "singer",
"tap_command": "tap-salesforce",
"tap_config": {
"client_id": "${env:SF_CLIENT_ID}",
"start_date": "2024-01-01"
}
}'::jsonb
);
See Sources: Singer for full configuration.
STATE Persistence
Singer STATE messages contain bookmarks — the last sync position for each stream. pg_tide persists these in the catalog so incremental syncs resume where they left off:
- Tap emits STATE message after processing a page of records
- pg_tide writes STATE to the tide.singer_state table
- On next run, pg_tide passes the saved STATE back to the tap via the --state argument
- Tap resumes from the bookmark (only fetches new/changed records)
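Bookmarks can be inspected (or cleared to force a full re-sync) with plain SQL. A sketch — the table name comes from the list above, but its columns aren't documented here, so the pipeline_name filter is a hypothetical column; check the actual schema first:
-- Inspect persisted bookmarks
SELECT * FROM tide.singer_state;
-- Force a full re-sync by clearing a bookmark
-- (pipeline_name is a hypothetical column name)
DELETE FROM tide.singer_state
WHERE pipeline_name = 'salesforce-sync';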
Schema Handling
When a tap emits a SCHEMA message, pg_tide uses it for:
- Validation: Reject records that don't conform (optional)
- Evolution: Detect new fields and update downstream schemas
- Documentation: Store discovered schemas for inspection
On Schema Change
{
"on_schema_change": "log"
}
| Policy | Behavior |
|---|---|
"log" | Log the change, continue processing |
"stop" | Stop the pipeline (manual intervention needed) |
"evolve" | Automatically adapt (add new columns, etc.) |
Stream Selection
By default, all streams discovered by the tap are synced. To select specific streams:
{
"stream_filter": ["contacts", "deals", "companies"]
}
Compatible Taps and Targets
Any Singer-compatible tap or target works with pg_tide. Popular examples:
Taps (data sources): tap-salesforce, tap-hubspot, tap-stripe, tap-github, tap-postgres, tap-mysql, tap-google-analytics, tap-shopify, tap-zendesk
Targets (data loaders): target-postgres, target-snowflake, target-bigquery, target-redshift, target-s3-csv, target-jsonl
Browse the full catalog at hub.meltano.com.
Further Reading
- Sources: Singer — Running Singer taps
- Sinks: Singer — Running Singer targets
- Airbyte Protocol — Alternative connector ecosystem
Feature: Airbyte Protocol Support
pg_tide implements the Airbyte protocol for running Airbyte source and destination connectors. This gives you access to approximately 400 data connectors from the Airbyte catalog, each packaged as a Docker container with a standardized interface.
What is the Airbyte Protocol?
The Airbyte protocol defines how source connectors (extractors) and destination connectors (loaders) communicate. Connectors are Docker containers that read configuration from a JSON file and exchange messages via stdout/stdin:
- AirbyteRecordMessage — A single data row
- AirbyteStateMessage — Sync checkpoint for incremental mode
- AirbyteCatalogMessage — Available streams and their schemas
- AirbyteLogMessage — Connector log output
pg_tide as an Airbyte Source Host
pg_tide can run Airbyte source connectors and ingest their records into an inbox:
SELECT tide.relay_set_inbox(
'salesforce-data',
'crm_inbox',
'{
"source_type": "airbyte",
"source_image": "airbyte/source-salesforce:latest",
"source_config": {
"client_id": "${env:SF_CLIENT_ID}",
"client_secret": "${env:SF_CLIENT_SECRET}",
"refresh_token": "${env:SF_REFRESH_TOKEN}"
},
"streams": ["contacts", "opportunities"],
"sync_mode": "incremental"
}'::jsonb
);
See Sources: Airbyte for full configuration.
pg_tide as an Airbyte Destination Host
pg_tide can run Airbyte destination connectors and feed them outbox messages:
SELECT tide.relay_set_outbox(
'warehouse-sync',
'analytics_events',
'{
"sink_type": "airbyte",
"destination_image": "airbyte/destination-bigquery:latest",
"destination_config": {
"project_id": "my-project",
"dataset_id": "raw_events",
"credentials_json": "${env:GCP_CREDENTIALS}"
}
}'::jsonb
);
See Sinks: Airbyte for full configuration.
Sync Modes
| Mode | Behavior |
|---|---|
| incremental | Only new/changed records since last sync |
| full_refresh | Re-extract all records on every run |
Incremental mode persists state between runs (same as Singer STATE), so each sync only transfers the delta.
Docker Requirement
Airbyte connectors run as Docker containers. The relay host must have Docker available:
# Verify Docker is accessible
docker ps
The relay pulls connector images automatically on first use. For air-gapped environments, pre-pull images into a local registry.
Differences from Singer
| Aspect | Singer | Airbyte |
|---|---|---|
| Packaging | Python packages (pip) | Docker containers |
| Discovery | --discover flag | Separate discover command |
| State | JSON file on stdin | State messages in protocol |
| Schema | SCHEMA messages | Catalog with supported sync modes |
| Ecosystem | ~500 connectors | ~400 connectors |
| Overhead | Low (native process) | Higher (Docker container per sync) |
Choose Singer when you want lightweight connectors without Docker. Choose Airbyte when you need connectors only available in the Airbyte catalog or prefer container isolation.
Further Reading
- Sources: Airbyte — Running Airbyte source connectors
- Sinks: Airbyte — Running Airbyte destination connectors
- Singer Protocol — Alternative connector ecosystem
Feature: Fivetran Destination Support
pg_tide can function as a Fivetran destination connector, receiving data from Fivetran's managed extraction pipelines and writing it into a pg_tide inbox. This lets you use Fivetran's 300+ managed connectors while routing the data through your transactional inbox for further processing.
How It Works
Fivetran manages the extraction side (connecting to sources like Salesforce, Stripe, databases) and pushes data to a destination. pg_tide acts as that destination — receiving Fivetran's standardized output and writing records into your PostgreSQL inbox table.
Fivetran Cloud → HTTP Push → pg_tide (Fivetran destination) → Inbox table
Configuration
SELECT tide.relay_set_inbox(
'fivetran-crm',
'crm_inbox',
'{
"source_type": "fivetran",
"listen_addr": "0.0.0.0:8080",
"api_key": "${env:FIVETRAN_API_KEY}",
"api_secret": "${env:FIVETRAN_API_SECRET}"
}'::jsonb
);
Configuration Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| source_type | string | — | Must be "fivetran" |
| listen_addr | string | "0.0.0.0:8080" | HTTP server address |
| api_key | string | — | Fivetran API key for authentication |
| api_secret | string | — | Fivetran API secret |
When to Use Fivetran vs Singer/Airbyte
| Aspect | Fivetran | Singer/Airbyte |
|---|---|---|
| Management | Fully managed (Fivetran Cloud) | Self-hosted |
| Scheduling | Fivetran handles sync schedule | You manage cron/orchestration |
| Monitoring | Fivetran dashboard | Your own metrics |
| Cost | Per-row pricing | Free (compute cost only) |
| Connector quality | Enterprise-grade, maintained by Fivetran | Community-maintained |
Choose Fivetran when you want zero-maintenance extraction with enterprise SLAs. Choose Singer/Airbyte when you want full control and cost predictability.
Further Reading
- Sinks: Fivetran — Acting as a Fivetran destination
- Singer Protocol — Self-hosted alternative
- Airbyte Protocol — Self-hosted alternative with Docker
Deployment
This page covers everything you need to deploy pg_tide in production: from a single-machine setup to highly-available Kubernetes deployments. pg_tide has two components to deploy — the PostgreSQL extension and the relay binary — and both are designed to be operationally simple.
Components Overview
| Component | What it is | Where it runs | State |
|---|---|---|---|
| pg_tide extension | SQL functions + catalog tables | Inside your PostgreSQL database | All state in PostgreSQL tables |
| pg-tide relay | Standalone binary that bridges messages | Anywhere with network access to PostgreSQL + sinks | Stateless — all state in PostgreSQL |
The relay binary is completely stateless. You can kill it, restart it, replace it, scale it up or down — it always recovers from the last committed offset stored in PostgreSQL. This makes deployment and upgrades straightforward.
Extension Deployment
Install the extension on your PostgreSQL 18+ server:
CREATE EXTENSION pg_tide;
The extension creates the tide schema with all required tables, views, triggers, and functions. It requires no background workers, no shared memory, and no file system access — making it compatible with:
- All managed PostgreSQL services (RDS, Cloud SQL, Azure Database, Supabase, Neon)
- Connection poolers (PgBouncer, PgCat, Pgpool-II)
- CloudNativePG and other Kubernetes operators
- Standard replication setups (streaming, logical)
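To confirm the installation succeeded, check the catalog:
-- Extension is present, with its version
SELECT extname, extversion FROM pg_extension WHERE extname = 'pg_tide';
-- The tide schema was created
SELECT nspname FROM pg_namespace WHERE nspname = 'tide';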
Permissions
The extension can be installed by any user with CREATE privilege on the database. No superuser required. After installation, grant appropriate permissions:
-- Application users can publish to outboxes
GRANT USAGE ON SCHEMA tide TO app_user;
GRANT EXECUTE ON FUNCTION tide.outbox_publish(text, jsonb, jsonb) TO app_user;
-- Relay user needs read/write access to message tables
CREATE ROLE pg_tide_relay LOGIN PASSWORD 'strong-password';
GRANT USAGE ON SCHEMA tide TO pg_tide_relay;
GRANT SELECT, UPDATE ON tide.tide_outbox_messages TO pg_tide_relay;
GRANT SELECT ON tide.tide_outbox_config TO pg_tide_relay;
GRANT SELECT ON tide.relay_outbox_config TO pg_tide_relay;
GRANT SELECT ON tide.relay_inbox_config TO pg_tide_relay;
GRANT SELECT, INSERT, UPDATE ON tide.tide_consumer_offsets TO pg_tide_relay;
GRANT SELECT, INSERT, UPDATE, DELETE ON tide.tide_consumer_leases TO pg_tide_relay;
GRANT SELECT, INSERT, UPDATE ON tide.relay_consumer_offsets TO pg_tide_relay;
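You can verify the relay role's privileges without logging in as it — a sketch using the standard privilege-inquiry functions:
-- Each of these should return true for the relay role
SELECT has_schema_privilege('pg_tide_relay', 'tide', 'USAGE')                       AS schema_usage,
       has_table_privilege('pg_tide_relay', 'tide.tide_outbox_messages', 'SELECT') AS outbox_select,
       has_table_privilege('pg_tide_relay', 'tide.tide_outbox_messages', 'UPDATE') AS outbox_update,
       has_table_privilege('pg_tide_relay', 'tide.relay_outbox_config', 'SELECT')  AS config_select;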
Standalone Binary Deployment
The simplest deployment: download the relay binary and run it directly.
Download and install
# Linux (amd64)
curl -LO https://github.com/trickle-labs/pg-tide/releases/latest/download/pg-tide-x86_64-unknown-linux-gnu.tar.gz
tar xzf pg-tide-x86_64-unknown-linux-gnu.tar.gz
sudo mv pg-tide /usr/local/bin/
# macOS (Apple Silicon)
curl -LO https://github.com/trickle-labs/pg-tide/releases/latest/download/pg-tide-aarch64-apple-darwin.tar.gz
tar xzf pg-tide-aarch64-apple-darwin.tar.gz
sudo mv pg-tide /usr/local/bin/
Run the relay
pg-tide \
--postgres-url "postgres://pg_tide_relay:pass@db.internal:5432/app" \
--relay-group-id production \
--log-format json \
--metrics-addr 0.0.0.0:9090
Systemd service (Linux)
For production Linux deployments, run the relay as a systemd service:
# /etc/systemd/system/pg-tide-relay.service
[Unit]
Description=pg-tide relay
After=network-online.target postgresql.service
Wants=network-online.target
[Service]
Type=simple
User=pgtide
Group=pgtide
ExecStart=/usr/local/bin/pg-tide \
--config /etc/pg-tide/relay.toml
Restart=always
RestartSec=5
# Security hardening
NoNewPrivileges=yes
ProtectSystem=strict
ProtectHome=yes
PrivateTmp=yes
[Install]
WantedBy=multi-user.target
sudo systemctl enable pg-tide-relay
sudo systemctl start pg-tide-relay
Docker Deployment
The official Docker image is lightweight (~20 MB, Alpine-based) and runs as a non-root user.
Quick start
docker run -d \
--name pg-tide-relay \
-e PG_TIDE_POSTGRES_URL="postgres://user:pass@host.docker.internal:5432/mydb" \
-e PG_TIDE_LOG_FORMAT="json" \
-e PG_TIDE_GROUP_ID="production" \
-p 9090:9090 \
ghcr.io/trickle-labs/pg-tide:latest
Image details
| Property | Value |
|---|---|
| Base | Alpine 3.21 |
| Size | ~20 MB |
| User | pgtide (UID 1000) |
| Entrypoint | pg-tide |
| Exposed Port | 9090 (metrics + health) |
Environment variables
All relay configuration can be passed via environment variables:
| Variable | Description | Required |
|---|---|---|
| PG_TIDE_POSTGRES_URL | PostgreSQL connection string | Yes |
| PG_TIDE_METRICS_ADDR | Metrics endpoint (default: 0.0.0.0:9090) | No |
| PG_TIDE_LOG_FORMAT | text or json | No |
| PG_TIDE_LOG_LEVEL | error, warn, info, debug, trace | No |
| PG_TIDE_GROUP_ID | Relay group ID for HA coordination | No |
Docker Compose (complete development environment)
This sets up PostgreSQL with pg_tide, a NATS server, and the relay — everything you need for local development:
# docker-compose.yml
services:
postgres:
image: postgres:18
environment:
POSTGRES_PASSWORD: postgres
POSTGRES_DB: app
ports:
- "5432:5432"
volumes:
- pgdata:/var/lib/postgresql/data
healthcheck:
test: ["CMD-SHELL", "pg_isready -U postgres"]
interval: 5s
timeout: 3s
retries: 5
nats:
image: nats:latest
ports:
- "4222:4222"
- "8222:8222"
pg-tide-relay:
image: ghcr.io/trickle-labs/pg-tide:latest
depends_on:
postgres:
condition: service_healthy
environment:
PG_TIDE_POSTGRES_URL: "postgres://postgres:postgres@postgres:5432/app"
PG_TIDE_LOG_FORMAT: "json"
PG_TIDE_LOG_LEVEL: "info"
ports:
- "9090:9090"
healthcheck:
test: ["CMD", "wget", "--spider", "-q", "http://localhost:9090/health"]
interval: 10s
timeout: 5s
retries: 3
volumes:
pgdata:
Building a custom image
If you need to bundle the extension with PostgreSQL:
FROM postgres:18
# Copy compiled extension files
COPY pg_tide.so /usr/lib/postgresql/18/lib/
COPY pg_tide.control /usr/share/postgresql/18/extension/
COPY sql/pg_tide--0.1.0.sql /usr/share/postgresql/18/extension/
Kubernetes Deployment
For Kubernetes deployments, the relay runs as a standard Deployment with health checks, Prometheus metrics scraping, and optional horizontal scaling for HA.
Basic deployment
apiVersion: apps/v1
kind: Deployment
metadata:
name: pg-tide-relay
labels:
app: pg-tide-relay
spec:
replicas: 2 # HA: advisory locks prevent duplicate processing
selector:
matchLabels:
app: pg-tide-relay
template:
metadata:
labels:
app: pg-tide-relay
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "9090"
prometheus.io/path: "/metrics"
spec:
containers:
- name: relay
image: ghcr.io/trickle-labs/pg-tide:0.1.0
env:
- name: PG_TIDE_POSTGRES_URL
valueFrom:
secretKeyRef:
name: pg-tide-secrets
key: postgres-url
- name: PG_TIDE_LOG_FORMAT
value: "json"
- name: PG_TIDE_GROUP_ID
value: "production"
ports:
- containerPort: 9090
name: metrics
livenessProbe:
httpGet:
path: /health
port: 9090
initialDelaySeconds: 5
periodSeconds: 10
readinessProbe:
httpGet:
path: /health
port: 9090
initialDelaySeconds: 3
periodSeconds: 5
resources:
requests:
cpu: 50m
memory: 32Mi
limits:
cpu: 500m
memory: 128Mi
securityContext:
runAsNonRoot: true
runAsUser: 1000
readOnlyRootFilesystem: true
allowPrivilegeEscalation: false
Secret
apiVersion: v1
kind: Secret
metadata:
name: pg-tide-secrets
type: Opaque
stringData:
postgres-url: "postgres://pg_tide_relay:secret@pg-cluster-rw:5432/app?sslmode=require"
Service (for metrics scraping)
apiVersion: v1
kind: Service
metadata:
name: pg-tide-relay
labels:
app: pg-tide-relay
spec:
selector:
app: pg-tide-relay
ports:
- port: 9090
targetPort: 9090
name: metrics
CloudNativePG integration
If you use CloudNativePG, deploy the relay as a sidecar alongside your PostgreSQL pods. The relay connects to localhost:5432 via the CNPG-generated app secret:
spec:
sidecars:
- name: pg-tide-relay
image: ghcr.io/trickle-labs/pg-tide:0.1.0
env:
- name: PG_TIDE_POSTGRES_URL
valueFrom:
secretKeyRef:
name: my-cluster-app
key: uri
See CloudNativePG Integration for the complete setup.
Helm chart
The project includes a Helm chart at examples/helm/pg-tide/:
helm install pg-tide-relay ./examples/helm/pg-tide \
--set relay.postgresUrl="postgres://..." \
--set relay.groupId="production" \
--set relay.replicas=2
High Availability
Running multiple relay instances with the same relay_group_id provides automatic failover:
# Instance A — acquires locks for pipelines 1, 3, 5
pg-tide --relay-group-id production --postgres-url ...
# Instance B — acquires locks for pipelines 2, 4, 6
pg-tide --relay-group-id production --postgres-url ...
# If Instance A crashes, Instance B acquires pipelines 1, 3, 5 within seconds
How it works:
- Each relay instance attempts to acquire a PostgreSQL advisory lock for each pipeline
- Only one instance can hold each lock — the lock owner processes that pipeline
- If the owner crashes, its PostgreSQL session ends, locks are released
- Other instances detect the released locks and acquire them on their next discovery cycle (every 30s by default, or immediately via LISTEN/NOTIFY)
Important: More replicas means faster failover, not more parallelism per pipeline. Each pipeline is always processed by exactly one relay instance.
Resource Requirements
The relay is lightweight and predictable in its resource usage:
| Resource | Typical usage | Notes |
|---|---|---|
| CPU | ~50m per active pipeline | Scales with message volume and sink latency |
| Memory | 20-50 MB base + message buffer | Buffer grows with batch_size × average message size |
| Network | PostgreSQL connection + sink connections | 1 persistent PG connection + 1 LISTEN channel per instance |
| Disk | None | Completely stateless — all state in PostgreSQL |
For capacity planning: a relay instance handling 10 active pipelines at 1,000 messages/second total typically uses ~100m CPU and ~64 MB memory.
Pre-Deployment Checklist
Before going live, verify each item:
- PostgreSQL 18+ with the pg_tide extension installed and verified
- Relay binary or Docker image available and version-pinned
- Pipeline configurations created in the database
- Consumer groups created for each forward pipeline
- Relay database user created with minimal privileges (see permissions above)
- TLS enabled for PostgreSQL connection (sslmode=require)
- Monitoring configured (Prometheus scrape target + alerting rules)
- Health check configured in load balancer / orchestrator
- At least 2 relay instances for HA (same relay_group_id)
- Log aggregation configured (structured JSON logs recommended)
- Backup strategy verified (standard PostgreSQL backups cover all pg_tide state)
Zero-Downtime Upgrades
Upgrading the relay binary requires no downtime:
- Deploy new relay instances alongside old ones (same relay_group_id)
- New instances start and wait for advisory locks
- Gracefully stop old instances (SIGTERM)
- New instances acquire the released locks and resume processing
- No messages are lost or duplicated
For Kubernetes rolling updates, this happens automatically:
kubectl set image deployment/pg-tide-relay relay=ghcr.io/trickle-labs/pg-tide:0.2.0
For the extension itself:
-- Check current version
SELECT extversion FROM pg_extension WHERE extname = 'pg_tide';
-- Upgrade (runs migration SQL automatically)
ALTER EXTENSION pg_tide UPDATE TO '0.2.0';
Always upgrade the extension before upgrading the relay binary.
Deployment Architectures
This guide covers proven deployment patterns for pg_tide in production, from simple single-instance setups to globally distributed architectures.
Single Instance
The simplest deployment: one relay process connected to one PostgreSQL database.
┌─────────────┐ ┌────────────┐ ┌──────────┐
│ Application │──────→│ PostgreSQL │──────→│ pg-tide │──→ Sinks
│ (INSERT) │ │ (outbox) │ │ (relay) │
└─────────────┘ └────────────┘ └──────────┘
When to use: Development, staging, low-throughput production (< 10,000 msg/s), applications where simplicity matters more than redundancy.
Pros: Simple to operate, no coordination overhead, easy to debug.
Cons: Single point of failure. If the relay crashes, messages queue in the outbox until it restarts.
Active-Active High Availability
Multiple relay instances share the workload using PostgreSQL advisory locks for coordination.
┌──────────┐ ┌────────────┐ ┌───────────┐
│ pg-tide │────→│ │ │ │
│ relay #1 │ │ │────→│ Sinks │
└──────────┘ │ PostgreSQL │ │ │
│ │ └───────────┘
┌──────────┐ │ │
│ pg-tide │────→│ │
│ relay #2 │ └────────────┘
└──────────┘
Each instance acquires advisory locks for pipelines. If instance #1 crashes, instance #2 picks up its pipelines within one discovery interval (default: 30 seconds).
When to use: Production workloads requiring fault tolerance.
Configuration:
# Both instances use the same relay-group-id
pg-tide --relay-group-id "production" --discovery-interval 10
Pros: Automatic failover, no manual intervention, pipelines distributed across instances.
Cons: Slightly more complex to deploy and monitor.
Kubernetes Deployment
The recommended production deployment for most teams. pg_tide runs as a Kubernetes Deployment with multiple replicas.
apiVersion: apps/v1
kind: Deployment
metadata:
name: pg-tide-relay
spec:
replicas: 3
selector:
matchLabels:
app: pg-tide-relay
template:
metadata:
labels:
app: pg-tide-relay
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "9090"
spec:
terminationGracePeriodSeconds: 60
containers:
- name: pg-tide
image: ghcr.io/your-org/pg-tide:latest
args:
- --postgres-url
- $(DATABASE_URL)
- --relay-group-id
- production
- --shutdown-timeout
- "45"
ports:
- containerPort: 9090
name: metrics
livenessProbe:
httpGet:
path: /health
port: 9090
initialDelaySeconds: 10
periodSeconds: 15
readinessProbe:
httpGet:
path: /health
port: 9090
initialDelaySeconds: 5
periodSeconds: 5
env:
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: pg-tide-secrets
key: database-url
Key considerations:
- Set terminationGracePeriodSeconds > shutdown-timeout
- Use readinessProbe on /health to remove unhealthy pods from service discovery
- Store secrets in Kubernetes Secrets or external secret managers
- Use PodDisruptionBudget to prevent all replicas from being evicted simultaneously
Sidecar Pattern
Run pg_tide as a sidecar container alongside your application in the same pod. The relay connects to the same database and handles message delivery.
spec:
containers:
- name: app
image: your-app:latest
- name: pg-tide
image: ghcr.io/your-org/pg-tide:latest
args: ["--postgres-url", "$(DATABASE_URL)"]
When to use: When each service has its own dedicated outbox and you want relay lifecycle tied to the application.
Pros: Simple lifecycle management, dedicated resources per service.
Cons: More total relay instances, each handling fewer pipelines.
Multi-Region
For globally distributed applications, run relay instances in each region connecting to local read replicas or regional databases.
Region US: Region EU:
┌──────────┐ ┌──────────┐
│ pg-tide │──→ US Sinks │ pg-tide │──→ EU Sinks
│ relay │ │ relay │
└──────────┘ └──────────┘
│ │
↓ ↓
┌──────────┐ ┌──────────┐
│ PG (US) │ ←── replication ──→ │ PG (EU) │
└──────────┘ └──────────┘
Key considerations:
- Use a different relay-group-id per region to prevent cross-region lock contention
- Or use the same group ID with a shared database for automatic geographic failover
- Consider latency to sinks when choosing relay placement
Further Reading
- HA Coordination — Advisory lock mechanics
- Graceful Shutdown — Kubernetes shutdown integration
- Scaling — Throughput optimization
- CloudNativePG Integration — PostgreSQL operator setup
Scaling
Strategies for scaling pg_tide as your message volume grows.
Relay Scaling
Horizontal (more instances)
Run multiple relay instances with the same relay_group_id. PostgreSQL advisory locks distribute pipeline ownership automatically.
- Each pipeline is handled by exactly one relay at a time
- Adding more relays improves failover speed, not per-pipeline throughput
- Useful when you have many pipelines
Vertical (more resources)
For a single high-throughput pipeline:
- Increase batch_size to reduce per-message overhead
- Tune sink_max_inflight for higher concurrency
- Ensure the relay has sufficient CPU and network bandwidth
Outbox Scaling
Batch Size
Larger batches reduce round-trips but increase latency for individual messages:
SELECT tide.relay_set_outbox('high-volume', 'events', 'nats',
'{"url": "nats://localhost:4222", "subject": "events"}'::jsonb,
p_batch_size := 500
);
Multiple Outboxes
Split high-volume event streams across multiple outboxes for parallel relay consumption:
SELECT tide.outbox_create('orders-us', 24);
SELECT tide.outbox_create('orders-eu', 24);
Each outbox gets its own pipeline and relay ownership lock.
Index Performance
The default partial index on tide_outbox_messages is optimized for pending-message polling:
-- Already created by the extension:
CREATE INDEX idx_tide_outbox_messages_pending
ON tide.tide_outbox_messages (outbox_name, id)
WHERE consumed_at IS NULL;
At very high volumes (>10M rows), consider time-based partitioning.
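Before reaching for partitioning, measure the actual backlog and on-disk footprint. A sketch using standard catalog functions and the columns from the index definition above:
-- Pending (unconsumed) rows per outbox
SELECT outbox_name, count(*) AS pending
FROM tide.tide_outbox_messages
WHERE consumed_at IS NULL
GROUP BY outbox_name;
-- Total on-disk size of the outbox table (heap + indexes)
SELECT pg_size_pretty(pg_total_relation_size('tide.tide_outbox_messages'));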
PostgreSQL Scaling
Connection Pooling
The relay uses a single PostgreSQL connection. For applications with many connections, use PgBouncer or PgCat — pg_tide is fully compatible with connection poolers in transaction mode.
Read Replicas
The relay must connect to the primary (it writes offsets and marks messages consumed). However, monitoring queries (outbox_pending, consumer_lag) can safely run against read replicas.
Throughput Benchmarks
Typical performance on a standard cloud instance (4 vCPU, 16 GB RAM):
| Scenario | Throughput |
|---|---|
| Outbox publish (single connection) | ~15,000 msg/s |
| Relay forward (NATS sink) | ~10,000 msg/s |
| Relay forward (Kafka sink) | ~8,000 msg/s |
| Relay reverse (NATS source → inbox) | ~5,000 msg/s |
These are conservative estimates. Actual performance depends on message size, PostgreSQL configuration, and network latency.
Capacity Planning
This guide helps you estimate the resources needed for pg_tide based on your workload characteristics.
Key Dimensions
Capacity planning for pg_tide involves three systems: PostgreSQL (where outbox/inbox tables live), the relay process (CPU and memory), and the network (bandwidth to sinks).
PostgreSQL
The outbox table is the primary bottleneck for most deployments. Key factors:
| Factor | Impact | Mitigation |
|---|---|---|
| Write throughput | INSERT rate into outbox tables | Connection pooling, partitioning |
| Table size | Unrelayed rows waiting for delivery | Tune batch size and poll interval |
| Index maintenance | Outbox has sequential ID index | Minimal — append-only workload |
| Disk I/O | WAL writes for each INSERT | Fast storage, WAL tuning |
Rule of thumb: A single PostgreSQL instance handles 10,000-50,000 outbox inserts/second depending on row size and hardware. The relay processes rows faster than most applications can generate them.
Relay Process
The relay is CPU-light and memory-light for most workloads:
| Workload | CPU | Memory | Bottleneck |
|---|---|---|---|
| 1,000 msg/s, JSON, Kafka | 0.1 core | 50 MB | Network to Kafka |
| 10,000 msg/s, JSON, Kafka | 0.5 core | 100 MB | Kafka ack latency |
| 10,000 msg/s, Avro, Schema Registry | 1 core | 200 MB | Avro serialization |
| 50,000 msg/s, JSON, NATS | 0.3 core | 80 MB | Network |
| 1,000 msg/s, HTTP webhook | 0.1 core | 50 MB | Webhook response time |
Rule of thumb: Start with 0.5 CPU and 256 MB memory. Monitor actual usage and adjust.
Network
Bandwidth depends on message size and throughput:
Bandwidth = messages_per_second × average_message_size_bytes
Example: 10,000 msg/s × 1 KB/msg = 10 MB/s = 80 Mbps
Sizing Formulas
Outbox Table Growth
If the relay is down (or slower than production), the outbox grows:
Rows pending = (insert_rate - relay_rate) × downtime_seconds
Disk usage = rows_pending × average_row_size
Example: 5,000 inserts/s with relay down for 5 minutes:
- Rows: 5,000 × 300 = 1,500,000
- Disk: 1,500,000 × 500 bytes = 750 MB
Consumer Lag Recovery Time
After an outage, how long to drain the backlog:
Recovery time = pending_rows / (relay_rate - insert_rate)
Example: 1.5M pending rows, relay at 20,000/s, inserts at 5,000/s:
- Recovery: 1,500,000 / 15,000 = 100 seconds
Relay Instance Count
For active-active HA with balanced load:
Instances = ceil(total_pipelines / pipelines_per_instance)
Most pipelines consume negligible resources. Start with 2 instances (for HA) and scale based on actual throughput needs.
Batch Size Tuning
Batch size affects both throughput and latency:
| Batch Size | Throughput | Latency | Use Case |
|---|---|---|---|
| 1 | Lowest | Lowest | Real-time notifications |
| 10-50 | Medium | Low | General event streaming |
| 100-500 | High | Medium | Analytics, data lake loading |
| 1000+ | Highest | Higher | Bulk ETL, backfill |
Configure per-pipeline:
{ "batch_size": 100 }
Or set a process-wide default:
pg-tide --default-batch-size 100
PostgreSQL Configuration
Key PostgreSQL settings for outbox-heavy workloads:
# Connection handling
max_connections = 200                  # Enough for app + relay + monitoring
shared_buffers = '4GB'                 # 25% of RAM
# WAL configuration (important for high-insert workloads)
wal_buffers = '64MB'
max_wal_size = '4GB'
checkpoint_completion_target = 0.9
# Vacuuming (outbox rows are deleted after relay)
autovacuum_vacuum_scale_factor = 0.01  # Vacuum more aggressively
autovacuum_naptime = '10s'
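On managed services, or when you prefer not to edit postgresql.conf by hand, the same settings can be applied with ALTER SYSTEM — a sketch; this requires superuser (or equivalent grants), and settings such as shared_buffers still need a server restart to take effect:
ALTER SYSTEM SET autovacuum_vacuum_scale_factor = 0.01;
ALTER SYSTEM SET autovacuum_naptime = '10s';
ALTER SYSTEM SET checkpoint_completion_target = 0.9;
-- Reload settings that don't require a restart
SELECT pg_reload_conf();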
Monitoring for Capacity
Set alerts on these metrics to detect capacity issues early:
| Metric | Warning Threshold | Action |
|---|---|---|
| pg_tide_consumer_lag | > 10,000 | Increase batch size or add relay instances |
| CPU usage (relay) | > 70% sustained | Add CPU or split pipelines |
| PostgreSQL connections | > 80% of max | Increase max_connections or use pgBouncer |
| Disk usage growth | > 1 GB/hour unrelayed | Investigate relay health |
Further Reading
- Scaling — Strategies for increasing throughput
- Deployment Architectures — Choosing your topology
- Monitoring — Setting up observability
Maintenance
This page covers the ongoing maintenance tasks for a pg_tide deployment: backups, upgrades, retention management, and capacity planning. Because pg_tide stores all state in PostgreSQL, maintenance is straightforward — your existing PostgreSQL operational practices already cover most of what you need.
Backup and Restore
What needs to be backed up
The good news: everything pg_tide needs is in PostgreSQL. There's no external state, no local files, no configuration that lives outside the database. Your existing backup strategy already covers pg_tide.
| Object | Table/Schema | Purpose |
|---|---|---|
| Outbox configurations | tide.tide_outbox_config | Outbox definitions (names, retention, thresholds) |
| Outbox messages | tide.tide_outbox_messages | Pending and recently-consumed messages |
| Consumer groups | tide.tide_consumer_groups | Group definitions |
| Consumer offsets | tide.tide_consumer_offsets | Processing progress (critical for resume) |
| Consumer leases | tide.tide_consumer_leases | In-flight batch reservations |
| Inbox configurations | tide.tide_inbox_config | Inbox definitions |
| Inbox message tables | tide."{name}_inbox" | Per-inbox message tables |
| Relay pipeline configs | tide.relay_outbox_config, tide.relay_inbox_config | Pipeline definitions |
| Relay offsets | tide.relay_consumer_offsets | Relay progress tracking |
Logical backup with pg_dump
For targeted backups of just the pg_tide state (useful for migration or cloning):
# Back up only the tide schema
pg_dump \
--schema=tide \
--no-owner \
--no-privileges \
--format=custom \
--file=pg_tide_backup.dump \
"$DATABASE_URL"
# Restore
pg_restore \
--schema=tide \
--no-owner \
--clean \
--if-exists \
--dbname="$DATABASE_URL" \
pg_tide_backup.dump
Physical backup (recommended for production)
Physical backups via pg_basebackup or a CloudNativePG Backup resource capture the entire cluster. This is the preferred approach because:
- Point-in-time recovery (PITR) is available — restore to any moment
- Consistency — outbox messages and consumer offsets are consistent with the application tables they reference
- No extra configuration — pg_tide is just tables, indexes, and functions
Point-in-time recovery
Restoring to a previous point in time is safe with pg_tide:
- Stop the relay before beginning the restore
- Restore the database to the target point in time
- Check consumer lag: SELECT * FROM tide.consumer_lag — any messages whose offset is now ahead of the restored outbox will be re-delivered on relay startup
- The inbox dedup prevents duplicates — re-delivered messages are caught by the UNIQUE constraint
- Restart the relay — it resumes from the restored committed offset
What you do NOT need to back up
The relay binary holds no persistent state. All configuration, offsets, and messages live in PostgreSQL. A new relay instance pointing at a restored database picks up exactly where the previous one left off.
Upgrades
Extension upgrades
pg_tide uses PostgreSQL's built-in extension versioning. Each version transition has a migration script:
-- Check current version
SELECT extversion FROM pg_extension WHERE extname = 'pg_tide';
-- Upgrade (PostgreSQL runs the migration script automatically)
ALTER EXTENSION pg_tide UPDATE TO '0.2.0';
The migration script (sql/pg_tide--0.1.0--0.2.0.sql) handles all schema changes. Your data is preserved.
Rollback: PostgreSQL does not support extension downgrades via ALTER EXTENSION. To roll back, restore from a backup taken before the upgrade.
Best practice: Always take a backup immediately before upgrading the extension.
Relay upgrades
The relay binary is stateless, making upgrades trivial:
Standalone binary:
# Stop the current relay
systemctl stop pg-tide-relay
# Replace the binary
curl -LO https://github.com/trickle-labs/pg-tide/releases/latest/download/pg-tide-x86_64-unknown-linux-gnu.tar.gz
tar xzf pg-tide-*.tar.gz
sudo mv pg-tide /usr/local/bin/
# Restart
systemctl start pg-tide-relay
Docker / Kubernetes:
kubectl set image deployment/pg-tide-relay relay=ghcr.io/trickle-labs/pg-tide:0.2.0
Rolling updates work seamlessly: new instances wait for advisory locks, old instances release them during graceful shutdown.
Zero-downtime upgrade procedure
For deployments that cannot tolerate any message delivery gap:
- Deploy new relay instances alongside old ones (same relay_group_id)
- New instances start and attempt to acquire advisory locks (blocked by old instances)
- Gracefully stop old instances (SIGTERM or pod termination)
- New instances acquire the freed locks within seconds
- Processing resumes from the last committed offset — no gap, no duplicates
Compatibility matrix
| pg_tide Extension | Relay Binary | PostgreSQL |
|---|---|---|
| 0.1.x | 0.1.x | 18+ |
Rule: Always upgrade the extension first, then the relay binary. The relay is forward-compatible with same-minor extension versions.
Retention and Cleanup
Outbox retention
Each outbox has a configurable retention_hours. After messages are consumed and the retention period elapses, they're eligible for cleanup:
-- Create with custom retention
SELECT tide.outbox_create('high-volume', p_retention_hours := 12);
-- Change retention for an existing outbox
UPDATE tide.tide_outbox_config
SET retention_hours = 24
WHERE outbox_name = 'high-volume';
Trigger cleanup manually:
SELECT tide.outbox_truncate_delivered();
Or automate with pg_cron:
-- Clean all outboxes every hour
SELECT cron.schedule(
'cleanup-outbox',
'0 * * * *',
'SELECT tide.outbox_truncate_delivered()'
);
Inbox retention
Inbox tables accumulate processed messages for auditing. The processed_retention_hours parameter controls when they're cleaned:
-- Create inbox with aggressive cleanup (24h retention)
SELECT tide.inbox_create('high-volume-inbox',
p_processed_retention_hours := 24,
p_dlq_retention_hours := 168
);
-- Manual cleanup
SELECT tide.inbox_truncate_processed('high-volume-inbox');
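Inbox cleanup can be scheduled the same way as the outbox cleanup above — a sketch with pg_cron, running nightly:
-- Clean processed inbox messages every night at 03:00
SELECT cron.schedule(
    'cleanup-inbox',
    '0 3 * * *',
    $$SELECT tide.inbox_truncate_processed('high-volume-inbox')$$
);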
Storage sizing
For capacity planning, estimate storage needs:
| Factor | Formula |
|---|---|
| Outbox storage | message_rate × avg_message_size × retention_hours × 3600 |
| Index overhead | ~30% of table size |
| Inbox storage | inbound_rate × avg_message_size × processed_retention_hours × 3600 |
Example: 1,000 messages/second × 1 KB average × 24 hours retention = ~82 GB of outbox data before cleanup. In practice, with regular cleanup, steady-state usage is much lower because consumed messages are cleaned before retention expires.
Monitoring and Health
Essential monitoring queries
Run these periodically (or expose via a PostgreSQL exporter):
-- Pending messages (should be low if relay is healthy)
SELECT * FROM tide.outbox_pending;
-- Consumer lag (alert if growing)
SELECT * FROM tide.consumer_lag WHERE lag > 1000;
-- Active relay pipelines
SELECT name, enabled, config->>'outbox' as outbox
FROM tide.relay_outbox_config WHERE enabled;
-- Relay offset freshness (stale = relay might be down)
SELECT pipeline_id, updated_at, now() - updated_at AS age
FROM tide.relay_consumer_offsets
WHERE now() - updated_at > interval '5 minutes';
Prometheus alerting rules
groups:
- name: pg-tide
rules:
- alert: PgTideRelayDown
expr: pg_tide_relay_pipeline_healthy == 0
for: 2m
labels:
severity: critical
annotations:
summary: "pg-tide relay pipeline {{ $labels.pipeline }} is unhealthy"
- alert: PgTideHighConsumerLag
expr: pg_tide_consumer_lag > 10000
for: 5m
labels:
severity: warning
annotations:
summary: "Consumer {{ $labels.group_name }} has lag of {{ $value }}"
- alert: PgTideDeliveryErrors
expr: rate(pg_tide_relay_publish_errors_total[5m]) > 0
for: 1m
labels:
severity: warning
annotations:
summary: "pg-tide relay delivery errors on {{ $labels.pipeline }}"
Routine Maintenance Tasks
| Task | Frequency | Method |
|---|---|---|
| Check consumer lag | Continuous (Prometheus) | SELECT * FROM tide.consumer_lag |
| Outbox cleanup | Hourly (pg_cron) | SELECT tide.outbox_truncate_delivered() |
| Inbox cleanup | Daily (pg_cron) | SELECT tide.inbox_truncate_processed('name') |
| Relay log review | Daily | Check for recurring errors or warnings |
| Extension upgrades | As released | ALTER EXTENSION pg_tide UPDATE TO 'x.y.z' |
| Relay upgrades | As released | Rolling binary replacement |
| Backup verification | Weekly | Restore to a test environment and verify |
| Index bloat check | Monthly | REINDEX INDEX CONCURRENTLY if needed |
Monitoring Cookbook
Practical recipes for monitoring pg_tide in production. Each recipe addresses a specific operational concern with ready-to-use PromQL queries, alert rules, and dashboard configurations.
Recipe: Basic Health Monitoring
Goal: Know immediately when something is wrong.
Alerts
groups:
- name: pg-tide-health
rules:
- alert: PgTidePipelineUnhealthy
expr: pg_tide_pipeline_healthy == 0
for: 2m
labels:
severity: critical
annotations:
summary: "Pipeline {{ $labels.pipeline }} is unhealthy (circuit breaker open)"
- alert: PgTideNoActivity
expr: rate(pg_tide_messages_published_total[10m]) == 0
for: 15m
labels:
severity: warning
annotations:
summary: "Pipeline {{ $labels.pipeline }} has published zero messages for 15 minutes"
Dashboard Panel
# Traffic light: 1 = green, 0 = red
pg_tide_pipeline_healthy
Recipe: Throughput Monitoring
Goal: Understand message flow rates and detect anomalies.
Key Queries
# Messages published per second (per pipeline)
rate(pg_tide_messages_published_total[5m])
# Total throughput across all pipelines
sum(rate(pg_tide_messages_published_total[5m]))
# Publish success ratio
1 - (rate(pg_tide_publish_errors_total[5m]) / rate(pg_tide_messages_consumed_total[5m]))
Alert: Throughput Drop
- alert: PgTideThroughputDrop
expr: |
rate(pg_tide_messages_published_total[5m])
< 0.5 * rate(pg_tide_messages_published_total[1h] offset 1d)
for: 10m
labels:
severity: warning
annotations:
summary: "Pipeline {{ $labels.pipeline }} throughput dropped >50% vs yesterday"
Recipe: Latency Monitoring
Goal: Ensure messages are delivered within acceptable time bounds.
Key Queries
# P50 delivery latency
histogram_quantile(0.5, rate(pg_tide_delivery_latency_seconds_bucket[5m]))
# P99 delivery latency
histogram_quantile(0.99, rate(pg_tide_delivery_latency_seconds_bucket[5m]))
# Percentage of messages delivered within 1 second
sum(rate(pg_tide_delivery_latency_seconds_bucket{le="1.0"}[5m]))
/ sum(rate(pg_tide_delivery_latency_seconds_count[5m]))
Alert: High Latency
- alert: PgTideHighLatency
expr: histogram_quantile(0.99, rate(pg_tide_delivery_latency_seconds_bucket[5m])) > 5
for: 5m
labels:
severity: warning
annotations:
summary: "Pipeline {{ $labels.pipeline }} P99 latency exceeds 5 seconds"
Recipe: Consumer Lag Monitoring
Goal: Detect growing backlogs before they become critical.
Key Queries
# Current lag (pending messages)
pg_tide_consumer_lag
# Lag growth rate (positive = growing, negative = draining)
deriv(pg_tide_consumer_lag[5m])
# Estimated time to drain at current rate
pg_tide_consumer_lag / rate(pg_tide_messages_published_total[5m])
Alert: Growing Lag
- alert: PgTideGrowingLag
expr: pg_tide_consumer_lag > 10000
for: 10m
labels:
severity: warning
annotations:
summary: "Pipeline {{ $labels.pipeline }} has {{ $value }} pending messages"
- alert: PgTideCriticalLag
expr: pg_tide_consumer_lag > 100000
for: 5m
labels:
severity: critical
annotations:
summary: "Pipeline {{ $labels.pipeline }} has critical backlog: {{ $value }} messages"
Recipe: Error Rate Monitoring
Goal: Detect delivery problems early.
Key Queries
# Errors per second
rate(pg_tide_publish_errors_total[5m])
# Error ratio (errors / total consumed)
rate(pg_tide_publish_errors_total[5m]) / rate(pg_tide_messages_consumed_total[5m])
Alert: Error Spike
- alert: PgTideErrorSpike
expr: rate(pg_tide_publish_errors_total[5m]) > 1
for: 5m
labels:
severity: warning
annotations:
summary: "Pipeline {{ $labels.pipeline }} has sustained errors: {{ $value }}/s"
Recipe: Dead Letter Queue Monitoring
Goal: Track messages that failed permanently and need attention.
SQL Query (for custom exporter or pg_stat_monitor)
-- Unresolved DLQ entries by pipeline
SELECT pipeline_name, count(*) as unresolved
FROM tide.relay_dlq
WHERE resolved_at IS NULL
GROUP BY pipeline_name;
Alert (via SQL-based exporter)
- alert: PgTideDLQGrowing
expr: pg_tide_dlq_unresolved > 0
for: 30m
labels:
severity: warning
annotations:
summary: "Pipeline {{ $labels.pipeline }} has {{ $value }} unresolved DLQ entries"
Recipe: Resource Monitoring
Goal: Ensure relay processes have adequate resources.
Key Queries (standard node/container metrics)
# CPU usage per relay pod
rate(container_cpu_usage_seconds_total{container="pg-tide"}[5m])
# Memory usage per relay pod
container_memory_working_set_bytes{container="pg-tide"}
# PostgreSQL active connections from relay
pg_stat_activity_count{application_name="pg-tide"}
Runbook Reference
| Alert | First Response | Escalation |
|---|---|---|
| PipelineUnhealthy | Check sink availability, review error logs | Restart relay if stuck |
| ThroughputDrop | Check source (outbox empty?), check sink (slow?) | Scale relay instances |
| HighLatency | Check batch size, check sink response time | Increase batch size or add instances |
| GrowingLag | Check relay health, check for slow transforms | Increase batch size, add instances |
| ErrorSpike | Check DLQ for error details, check sink logs | Fix root cause, replay DLQ |
| DLQGrowing | Inspect DLQ entries, identify error pattern | Fix issue, replay messages |
Further Reading
- Metrics — Full metrics reference
- Dashboards — Pre-built Grafana dashboard
- Troubleshooting — Diagnosing common issues
Troubleshooting
Common issues and their solutions.
Relay Won't Start
"error: --postgres-url is required"
The relay needs a PostgreSQL connection. Provide it via CLI flag or environment variable:
pg-tide --postgres-url "postgres://user:pass@localhost:5432/mydb"
# or
export PG_TIDE_POSTGRES_URL="postgres://..."
pg-tide
"PostgreSQL connection failed, retrying"
The relay cannot reach PostgreSQL. Check:
- Is PostgreSQL running and accepting connections?
- Is the connection string correct?
- Are firewall rules allowing the connection?
- Is the database user granted the CONNECT privilege?
The relay retries with exponential backoff indefinitely.
No Messages Being Delivered
Check pending messages exist
SELECT * FROM tide.outbox_pending;
If empty, no messages have been published yet.
Check pipeline is enabled
SELECT tide.relay_list_configs();
Ensure the pipeline shows "enabled": true.
Check consumer group exists
Forward pipelines need a consumer group:
SELECT * FROM tide.tide_consumer_groups;
Check relay owns the pipeline
If another relay instance holds the advisory lock, this instance won't process the pipeline. Check logs for "acquired lock" messages.
Duplicate Messages
In the outbox
This is normal — your application may publish the same logical event multiple times. Consider adding a deterministic dedup key in the headers.
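For example, derive the key from the event's business identity so every republish of the same logical event carries the same value. The dedup_key header name here is an illustrative convention, not a built-in field:
-- Publish with a deterministic dedup key in the headers:
SELECT tide.outbox_publish('orders',
    '{"order_id": 42, "total": 99.99}'::jsonb,
    '{"event_type": "order.created", "dedup_key": "order-42-created"}'::jsonb
);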
In the inbox
If the inbox receives duplicates, the UNIQUE(event_id) constraint should prevent it. If duplicates appear, check:
- Is the dedup key truly unique per logical event?
- Was the inbox table created with the UNIQUE constraint? (See the check below.)
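To verify the constraint, inspect the catalog directly — a sketch assuming an inbox named payments:
-- Lists unique constraints on the inbox table; expect one covering event_id.
SELECT conname, pg_get_constraintdef(oid) AS definition
FROM pg_constraint
WHERE conrelid = 'tide."payments_inbox"'::regclass
  AND contype = 'u';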
High Consumer Lag
SELECT * FROM tide.consumer_lag WHERE lag > 1000;
Possible causes:
- Sink is slow — check sink latency and error rate
- Relay is overwhelmed — increase batch_size or deploy more relay instances
- Relay is down — check the health endpoint and process status
- Advisory lock not acquired — another relay or stale process holds the lock
Extension Errors
"outbox already exists"
Use p_if_not_exists := true on the creation call, or check for existence before creating. For example, for consumer groups:
SELECT tide.create_consumer_group('my-group', 'events', p_if_not_exists := true);
"outbox not found"
The named outbox doesn't exist. Create it first:
SELECT tide.outbox_create('my-outbox');
Useful Diagnostic Queries
-- All outbox config
SELECT * FROM tide.tide_outbox_config;
-- All inbox config
SELECT * FROM tide.tide_inbox_config;
-- Active relay pipelines
SELECT * FROM tide.relay_outbox_config WHERE enabled;
SELECT * FROM tide.relay_inbox_config WHERE enabled;
-- Advisory locks held (relay pipelines)
SELECT * FROM pg_locks WHERE locktype = 'advisory';
-- Relay offset tracking
SELECT * FROM tide.relay_consumer_offsets;
Troubleshooting Guide
A comprehensive guide to diagnosing and resolving common pg_tide issues in production.
Quick Diagnostic Checklist
When something isn't working:
- Check relay logs — Look for ERROR or WARN messages
- Check the /health endpoint — returns 503 if a circuit breaker is open
- Check consumer lag — the pg_tide_consumer_lag metric, or query the outbox directly
- Check the DLQ — SELECT count(*) FROM tide.relay_dlq WHERE resolved_at IS NULL
- Check advisory locks — SELECT * FROM pg_locks WHERE locktype = 'advisory'
Messages Not Being Delivered
Symptom: Outbox rows accumulate, nothing reaches the sink
Check 1: Is the relay running?
# Check if the process is alive
ps aux | grep pg-tide
# Check Kubernetes
kubectl get pods -l app=pg-tide-relay
Check 2: Is the pipeline enabled?
SELECT name, enabled FROM tide.relay_outbox_config;
Check 3: Does the relay hold the advisory lock?
SELECT pid, objid
FROM pg_locks
WHERE locktype = 'advisory' AND granted = true;
If no locks are held, the relay may have lost its database connection.
Check 4: Is the circuit breaker open?
curl http://localhost:9090/health
# If unhealthy, circuit breaker is open for one or more pipelines
Check 5: Is the sink reachable? Test connectivity from the relay host to the sink (Kafka broker, HTTP endpoint, etc.).
Symptom: Messages delivered but arriving slowly
Check 1: Batch size too small? Small batch sizes (1-10) increase per-message overhead. Try increasing to 100+.
Check 2: Rate limiter configured?
SELECT config->'rate_limit' FROM tide.relay_outbox_config WHERE name = 'your-pipeline';
Check 3: Sink response time?
Check pg_tide_delivery_latency_seconds — if high, the sink is slow to acknowledge.
Circuit Breaker Stuck Open
Symptom: Pipeline unhealthy, /health returns 503
The circuit breaker opens after failure_threshold consecutive failures. It won't close until probe requests succeed.
Resolution:
- Check sink availability (is the Kafka broker up? Is the webhook endpoint responding?)
- Check relay logs for the specific error message
- Fix the underlying issue
- Wait for half_open_timeout — the circuit will probe automatically
- If the probe succeeds, the circuit closes and normal flow resumes
Force-close (nuclear option): Restart the relay. Circuit breaker state is in-memory and resets on restart.
Duplicate Messages at Sink
Symptom: Consumers see the same message multiple times
pg_tide guarantees at-least-once delivery. Duplicates can occur when:
- The relay publishes to the sink but crashes before acknowledging
- Network partition causes timeout after successful publish
- Replay mode reprocesses an already-delivered range
Resolution:
- Implement idempotent consumers (dedup on outbox_id or dedup_key) — see the sketch below
- Use the inbox on the receiving side for built-in deduplication
- For Kafka, enable.idempotence=true on the producer side guards against duplicates introduced by producer retries
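If your sink consumers write to a database of their own, a dedup table makes the first resolution concrete. A minimal sketch (not part of pg_tide — the processed_events table is illustrative):
-- Record each processed event_id; ON CONFLICT turns redelivery into a no-op.
CREATE TABLE IF NOT EXISTS processed_events (
    event_id     TEXT PRIMARY KEY,
    processed_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Returns one row on first delivery, zero rows on a duplicate:
INSERT INTO processed_events (event_id)
VALUES ('evt-abc-123')
ON CONFLICT (event_id) DO NOTHING
RETURNING event_id;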
DLQ Entries Accumulating
Symptom: tide.relay_dlq table growing
Diagnose the error pattern:
SELECT error_kind, error_message, count(*)
FROM tide.relay_dlq
WHERE resolved_at IS NULL
GROUP BY error_kind, error_message
ORDER BY count(*) DESC
LIMIT 10;
Common causes:
| Error Kind | Typical Cause | Fix |
|---|---|---|
| decode | Malformed message in outbox | Fix the producing application |
| sink_permanent | Auth failure, schema mismatch | Update credentials or schema |
| inbox_permanent | Constraint violation | Check inbox table constraints |
| max_retries_exceeded | Transient issue that lasted too long | Fix sink, then replay |
After fixing: Replay the DLQ entries:
SELECT tide.relay_dlq_retry_all('your-pipeline');
Connection Issues
Symptom: "connection refused" or "timeout" in logs
PostgreSQL connection:
# Test from relay host
psql "${DATABASE_URL}" -c "SELECT 1"
Common issues:
- Connection string wrong (check --postgres-url)
- PostgreSQL max_connections reached
- Firewall rules blocking relay → database
- SSL/TLS certificate issues
Sink connection:
- Check DNS resolution from relay host
- Check firewall/security group rules
- Verify credentials haven't expired
- Check TLS certificate validity
Symptom: "too many connections" from PostgreSQL
Each pipeline worker uses one connection. With 50 pipelines, you need 50+ connections.
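To compare the relay's current connection count against the server limit (using the same application_name the monitoring recipes query):
SELECT count(*) AS relay_connections,
       current_setting('max_connections') AS max_connections
FROM pg_stat_activity
WHERE application_name = 'pg-tide';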
Resolution:
- Increase max_connections in PostgreSQL
- Use PgBouncer in front of PostgreSQL
- Reduce number of pipelines per relay instance
Transform/Filter Issues
Symptom: Messages being silently dropped
If messages are consumed but never published, a transform filter might be dropping them.
Diagnose:
- Check if messages_consumed > messages_published consistently
- Check the transform configuration:
SELECT config->'transform' FROM tide.relay_outbox_config WHERE name = 'your-pipeline';
- Test the filter expression against a sample payload
- Temporarily remove the filter to confirm
Symptom: Transform producing unexpected output
Set dry_run: true on the pipeline to see what transforms produce without publishing:
UPDATE tide.relay_outbox_config
SET config = config || '{"dry_run": true}'::jsonb
WHERE name = 'your-pipeline';
Check relay logs for the dry-run output, then disable dry-run when satisfied.
Advisory Lock Conflicts
Symptom: Pipeline not being picked up by any relay instance
Check lock ownership:
SELECT l.pid, a.application_name, l.objid
FROM pg_locks l
JOIN pg_stat_activity a ON l.pid = a.pid
WHERE l.locktype = 'advisory';
If a stale connection holds the lock (zombie process), terminate it:
SELECT pg_terminate_backend(<pid>);
Performance Degradation
Symptom: Gradual slowdown over time
Check 1: Table bloat
SELECT relname, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
WHERE relname LIKE 'outbox%' OR relname LIKE 'inbox%';
If n_dead_tup is high, autovacuum may be falling behind. See Maintenance.
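If a hot outbox table consistently outruns the default autovacuum settings, per-table tuning can help. A sketch, assuming the default message table name tide.tide_outbox_messages:
-- Vacuum after ~1% of rows are dead instead of the 20% default,
-- and remove the cost-based throttling delay for this table.
ALTER TABLE tide.tide_outbox_messages
    SET (autovacuum_vacuum_scale_factor = 0.01,
         autovacuum_vacuum_cost_delay = 0);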
Check 2: Index bloat
SELECT indexrelname, pg_size_pretty(pg_relation_size(indexrelid))
FROM pg_stat_user_indexes
WHERE relname LIKE 'outbox%';
Check 3: Disk pressure High I/O wait indicates the database can't keep up with the workload.
Further Reading
- Monitoring Cookbook — Alert rules and queries
- Maintenance — Preventive maintenance tasks
- Dead Letter Queue — DLQ management
Runbook: Crash Recovery
Applies to: pg-tide relay binary (pg-tide run)
Scope: What happens when the relay crashes mid-batch and how to recover.
At-Least-Once Guarantee
pg-tide implements an at-least-once delivery guarantee. When the relay crashes between polling messages from the outbox and acknowledging delivery to the sink, the un-acknowledged messages will be re-delivered after restart because the consumer offset has not advanced.
No manual intervention is required in the normal case. Simply restart the relay and it will resume from the last committed offset.
How the Relay Commits Offsets
The relay tracks its position in the outbox via tide.relay_consumer_offsets.
After each successful batch delivery, it advances the last_change_id for the
pipeline. If the relay crashes before this write, the batch is re-read and
re-delivered.
-- Inspect the current offset for each pipeline:
SELECT name, last_change_id, updated_at
FROM tide.relay_consumer_offsets
ORDER BY name;
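For a rough per-pipeline lag figure, compare each committed offset against the newest message — a sketch assuming messages live in tide.tide_outbox_messages with a monotonically increasing id column:
SELECT o.name,
       (SELECT max(id) FROM tide.tide_outbox_messages) - o.last_change_id AS approx_lag
FROM tide.relay_consumer_offsets o
ORDER BY approx_lag DESC;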
Identifying a Stuck Pipeline
A stuck pipeline is one where the consumer lag is not decreasing despite the relay running. Signs include:
- pg_tide_relay_consumer_lag{pipeline="..."} is high and flat in Grafana.
- pg-tide status shows the pipeline as owned but not progressing.
- Repeated error log entries for the same pipeline.
Common Causes
| Cause | Resolution |
|---|---|
| Sink is unreachable | Fix the sink endpoint; pipeline auto-resumes |
| Circuit breaker is open | Wait for half-open probe, or restart the worker |
| DLQ is full / INSERT denied | See DLQ Replay runbook |
| Advisory lock held by crashed pod | See below |
Clearing a Stale Advisory Lock
PostgreSQL advisory locks are session-scoped — they are automatically released when the backend connection closes (e.g. on relay crash or pod eviction). In practice, stale locks resolve themselves within seconds.
If you believe a lock is genuinely stuck (e.g. the PostgreSQL backend is still connected after a relay pod was forcibly killed), identify and terminate it:
-- Find the backend holding the lock for a pipeline named "my-pipeline":
SELECT pid, application_name, state, query_start
FROM pg_stat_activity
WHERE pid IN (
SELECT pid
FROM pg_locks
WHERE locktype = 'advisory'
AND classid = hashtext('default') -- relay_group_id
AND objid = hashtext('my-pipeline') -- pipeline name
);
-- Terminate the backend; the lock is released immediately and the relay
-- reacquires it on the next reconcile:
SELECT pg_terminate_backend(<pid>);
Restart Procedure
- Verify the relay process has fully stopped (no ghost connections).
- Restart the relay: docker restart pg-tide or kubectl rollout restart deployment/pg-tide.
- Watch logs for "acquired lock — spawning worker" messages confirming pipelines resume.
- Check pg-tide status --postgres-url $PG_TIDE_POSTGRES_URL for lag convergence.
After a Partial Batch
Duplicate delivery to the sink is possible after a crash. Ensure your sink consumers are idempotent:
- Use the event_id (UUID) field present in all pg-tide messages as an idempotency key.
- For inbox targets, pg-tide automatically deduplicates via the event_id primary key with ON CONFLICT DO NOTHING.
Runbook: DLQ Replay
Applies to: pg-tide relay v0.13.0+
Scope: How to drain a flooded dead-letter queue (DLQ), requeue messages
for retry, and monitor progress.
What the DLQ Is
When a message cannot be delivered after exhausting retries (permanent error
or circuit breaker open), pg-tide writes it to tide.relay_dlq. This
preserves the message for operator review rather than silently dropping it.
-- Count DLQ entries per pipeline:
SELECT pipeline_name, error_kind, COUNT(*) AS entries
FROM tide.relay_dlq
GROUP BY pipeline_name, error_kind
ORDER BY entries DESC;
Step 1 — Identify the Root Cause
Before requeuing, understand why messages landed in the DLQ:
-- Inspect recent DLQ entries for a pipeline:
SELECT id, event_id, error_kind, error_message, failed_at
FROM tide.relay_dlq
WHERE pipeline_name = 'my-pipeline'
ORDER BY failed_at DESC
LIMIT 20;
Common error_kind values:
| Kind | Meaning | Resolution |
|---|---|---|
| permanent | Sink rejected the message (invalid schema, auth, etc.) | Fix the root cause before requeuing |
| transient | Sink was unavailable; max retries exhausted | Confirm sink is healthy, then requeue |
| dlq_write_failure | Secondary DLQ write failure | Indicates the DLQ table is misconfigured; check tide.relay_dlq permissions |
Step 2 — Fix the Underlying Problem
Requeuing DLQ entries while the root cause is still present will return them to the DLQ immediately. Fix the sink, schema, or configuration first:
# Validate pipeline config against the live catalog and sink:
pg-tide validate-config --pipeline my-pipeline --postgres-url "$PG_TIDE_POSTGRES_URL"
# Check that the relay can connect to all required services:
pg-tide doctor --postgres-url "$PG_TIDE_POSTGRES_URL"
Step 3 — Requeue Messages for Retry
Via SQL
-- Requeue all DLQ entries for a pipeline (marks them as pending retry):
SELECT tide.dlq_requeue('my-pipeline');
-- Requeue a single entry by ID:
SELECT tide.dlq_requeue_entry(42);
Via CLI
# Requeue all DLQ entries for a pipeline:
pg-tide replay dlq-requeue --pipeline my-pipeline --postgres-url "$PG_TIDE_POSTGRES_URL"
# Preview without actually requeuing (dry run):
pg-tide replay dlq-requeue --pipeline my-pipeline --dry-run --postgres-url "$PG_TIDE_POSTGRES_URL"
Step 4 — Monitor Progress
Watch the DLQ depth decrease and message throughput increase:
# Tail the relay logs:
docker logs -f pg-tide 2>&1 | grep -E "(dlq|pipeline=my-pipeline)"
# Or watch Prometheus metrics:
curl -s http://localhost:9090/metrics | grep pg_tide_relay_dlq
Key metrics:
| Metric | Description |
|---|---|
| pg_tide_relay_dlq_entries_written_total | Cumulative DLQ writes — should stop growing after the root cause is fixed |
| pg_tide_relay_messages_published_total | Should increase as requeued messages are delivered |
| pg_tide_relay_consumer_lag | Should decrease as the pipeline catches up |
Step 5 — Purge Resolved DLQ Entries
After successful redelivery, clean up resolved entries:
-- Delete all successfully redelivered DLQ entries for a pipeline:
DELETE FROM tide.relay_dlq
WHERE pipeline_name = 'my-pipeline'
AND requeued_at IS NOT NULL;
-- Or delete all entries older than 30 days:
DELETE FROM tide.relay_dlq
WHERE failed_at < NOW() - INTERVAL '30 days';
Flood Control
If the DLQ is growing very fast (more than ~100 entries/min):
- Disable the pipeline to stop new DLQ writes: SELECT tide.relay_disable('my-pipeline');
- Fix the root cause.
- Clear the DLQ entries that will never be retryable.
- Re-enable the pipeline and requeue surviving entries: SELECT tide.relay_enable('my-pipeline'); SELECT tide.dlq_requeue('my-pipeline');
Runbook: Schema Migration
Applies to: pg_tide PostgreSQL extension
Scope: How to upgrade the pg_tide extension schema without relay downtime.
Overview
pg_tide uses the standard PostgreSQL extension upgrade mechanism:
ALTER EXTENSION pg_tide UPDATE;
This command applies the appropriate pg_tide--<from>--<to>.sql upgrade
script atomically within a transaction. The relay can continue running
during the upgrade with at most a brief window of elevated latency.
Pre-Migration Checklist
Before upgrading:
- Back up the database (or ensure your point-in-time recovery is current).
- Check the current version: SELECT extversion FROM pg_extension WHERE extname = 'pg_tide';
- Check the available target version: SELECT * FROM pg_available_extension_versions WHERE name = 'pg_tide';
- Run pg-tide doctor to confirm the relay is healthy before the upgrade: pg-tide doctor --postgres-url "$PG_TIDE_POSTGRES_URL"
- Review the CHANGELOG for any breaking changes or required manual steps in the target version.
Upgrade Procedure
1. Deploy the New Extension Files
Copy the new .so library, control file, and SQL migration files to the
PostgreSQL $libdir and share directory. For package-based installs:
# Debian/Ubuntu:
apt-get install pg-tide=0.19.0
# CNPG (CloudNativePG) — update the cluster manifest image tag:
kubectl patch cluster my-pg --type=merge \
-p '{"spec":{"imageName":"ghcr.io/my-org/pg-tide-cnpg:0.19.0"}}'
2. Apply the Migration
-- Connect as a superuser or the extension owner:
ALTER EXTENSION pg_tide UPDATE;
-- Verify:
SELECT extversion FROM pg_extension WHERE extname = 'pg_tide';
The relay does not need to be stopped. The upgrade script is
transactional and takes only a brief AccessShareLock on affected tables.
3. Verify Catalog Integrity
-- Confirm all expected functions are present:
SELECT routine_name, routine_type
FROM information_schema.routines
WHERE routine_schema = 'tide'
ORDER BY routine_name;
-- Confirm relay config tables are intact:
SELECT COUNT(*) FROM tide.relay_outbox_config;
SELECT COUNT(*) FROM tide.relay_inbox_config;
4. Run pg-tide doctor Again
pg-tide doctor --postgres-url "$PG_TIDE_POSTGRES_URL"
All checks should pass. If any check fails, see the troubleshooting section below.
Rolling Back
If the migration must be rolled back:
-- Extensions cannot be downgraded via ALTER EXTENSION.
-- Restore from backup or use PITR to the pre-upgrade snapshot.
PostgreSQL does not support extension downgrade scripts. Always take a database snapshot before applying an extension upgrade in production.
Relay Behaviour During Migration
- The relay continues to poll and deliver messages during the upgrade.
- The ALTER EXTENSION command takes a brief metadata lock. In-flight batches will complete normally; new polls may be delayed by a few milliseconds.
- If the relay encounters a schema error mid-migration (extremely unlikely with the standard upgrade path), it will classify it as a permanent error and pause the affected pipeline. Resume with SELECT tide.relay_enable('...').
Multi-Step Upgrade
If you are upgrading across multiple versions (e.g. 0.15.0 → 0.19.0), PostgreSQL applies each intermediate script automatically:
ALTER EXTENSION pg_tide UPDATE TO '0.19.0';
pg_tide ships upgrade scripts for every consecutive version pair, so this always works without manual intermediate steps.
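To confirm a path exists before running the command, query the standard catalog function pg_extension_update_paths:
-- A NULL path means the two versions are not connected by any
-- chain of upgrade scripts.
SELECT source, target, path
FROM pg_extension_update_paths('pg_tide')
WHERE source = '0.15.0' AND target = '0.19.0';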
CNPG (CloudNativePG) Notes
When using CloudNativePG, the extension upgrade happens automatically when
you update the cluster image to a version that includes the new .so and
SQL files. The bootstrap initdb / postInitSQL section runs
ALTER EXTENSION pg_tide UPDATE after the image update. See
examples/cnpg/cluster.yaml for a
reference manifest.
Runbook: Relay Upgrade
Applies to: pg-tide relay binary (pg-tide run)
Scope: Rolling upgrade procedure for high-availability deployments with
multiple relay instances.
Overview
The pg-tide relay is stateless between reconcile cycles. Pipeline ownership is coordinated via PostgreSQL advisory locks, so multiple relay instances can run simultaneously without split-brain. This makes rolling upgrades possible without any downtime to message delivery.
Pre-Upgrade Checklist
- Read the CHANGELOG for the target version — note any new required configuration keys or deprecated flags.
- Back up the database (or confirm PITR is current).
- Confirm current relay health:
  pg-tide status --postgres-url "$PG_TIDE_POSTGRES_URL"
  pg-tide doctor --postgres-url "$PG_TIDE_POSTGRES_URL"
- Upgrade the PostgreSQL extension first (if the target relay version requires a newer extension schema):
  ALTER EXTENSION pg_tide UPDATE;
  The old relay is forward-compatible with the new schema, and the new relay is backward-compatible with the old schema, so upgrading the extension first is always safe.
Rolling Upgrade Procedure (Kubernetes)
1. Update the Deployment Image Tag
kubectl set image deployment/pg-tide \
pg-tide=ghcr.io/trickle-labs/pg-tide:0.19.0
Or update image.tag in values.yaml and run helm upgrade:
helm upgrade pg-tide oci://ghcr.io/trickle-labs/helm/pg-tide \
--set image.tag=0.19.0 \
--reuse-values
2. Watch the Rollout
kubectl rollout status deployment/pg-tide
Kubernetes replaces pods one at a time (controlled by maxUnavailable and
maxSurge). As each old pod is terminated:
- PostgreSQL advisory locks held by the old pod are released automatically when the connection closes (within ~1 s of pod termination).
- The new pod's coordinator reconciles and reacquires the released pipelines.
- Messages may be re-delivered (at-least-once) for any batch that was in-flight at the time of the old pod's termination.
3. Verify Post-Upgrade
# All pods should be running the new version:
kubectl get pods -l app.kubernetes.io/name=pg-tide -o json \
| jq '.items[].spec.containers[].image'
# Pipeline ownership should be fully restored:
pg-tide status --postgres-url "$PG_TIDE_POSTGRES_URL"
Rolling Upgrade Procedure (Docker / Systemd)
For single-instance or manual deployments:
- Start the new relay alongside the old one (a different container name or port is fine; both will contend for advisory locks and share pipeline ownership gracefully):
  docker run -d --name pg-tide-new \
    -e PG_TIDE_POSTGRES_URL="$PG_TIDE_POSTGRES_URL" \
    ghcr.io/trickle-labs/pg-tide:0.19.0
- Verify the new relay is healthy and has acquired pipelines:
  docker logs pg-tide-new | grep "acquired lock"
  pg-tide status --postgres-url "$PG_TIDE_POSTGRES_URL"
- Stop the old relay (graceful drain):
  docker stop --time 60 pg-tide-old
  The --time 60 gives the old relay up to 60 seconds to finish in-flight batches before hard-stopping.
- Clean up:
  docker rm pg-tide-old
  docker rename pg-tide-new pg-tide
Configuration Changes Between Versions
New Required Configuration Keys
Check the CHANGELOG for any new required keys. The relay will fail to start with a clear error message if a required key is missing.
Deprecated Flags
Deprecated flags continue to work until the next major version. A
WARN-level log entry is emitted at startup if a deprecated flag is present.
Environment Variable Changes
| Old (pre-0.17.0) | New | Notes |
|---|---|---|
| PG_TIDE_RELAY_POSTGRES_URL | PG_TIDE_POSTGRES_URL | Old name no longer recognised; see CHANGELOG v0.17.0 for details |
Rollback
If the new relay is unhealthy after the upgrade:
- Start the old relay binary (advisory locks auto-transfer back).
- Stop the new relay.
- Investigate logs from the new relay for the root cause.
Because pipeline state lives in PostgreSQL, no data is lost during rollback.
HA Considerations
- A minimum of two relay instances is recommended for production to ensure zero-downtime upgrades.
- A PodDisruptionBudget with minAvailable: 1 ensures at least one relay keeps running during node drains.
- The Helm chart defaults (helm/pg-tide/values.yaml) set replicaCount: 2 and include a PodDisruptionBudget.
Tutorial: Getting Started with pg_tide
This tutorial takes you from zero to a working pg_tide pipeline in 10 minutes. You'll create an outbox, publish events, configure a relay pipeline, and see messages delivered to a sink.
Prerequisites
- PostgreSQL 18 or later
- The pg_tide extension installed (CREATE EXTENSION pg_tide)
- The pg-tide relay binary (see Installation)
Step 1: Install the Extension
CREATE EXTENSION pg_tide;
This creates the tide schema with all catalog tables and SQL functions.
Step 2: Create an Outbox
SELECT tide.outbox_create('my_events');
This creates an outbox table that will store your events until they're relayed.
Step 3: Publish an Event
SELECT tide.outbox_publish('my_events', 'user-signups', '{
"user_id": "USR-001",
"email": "alice@example.com",
"plan": "pro"
}'::jsonb);
The event is now stored in the outbox. It's part of your current transaction — if you ROLLBACK, the event disappears too. That's the transactional outbox guarantee.
Step 4: Check the Outbox
SELECT * FROM tide.outbox_status('my_events');
You'll see one pending event waiting to be relayed.
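To see the transactional guarantee from Step 3 in action, publish inside a transaction and roll it back — the pending count does not move:
BEGIN;
SELECT tide.outbox_publish('my_events', 'user-signups',
    '{"user_id": "USR-TEST", "email": "rollback@example.com"}'::jsonb);
ROLLBACK;

-- Still one pending event — the rolled-back publish never happened:
SELECT * FROM tide.outbox_status('my_events');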
Step 5: Configure a Relay Pipeline
For this tutorial, we'll use the stdout sink (prints messages to the relay's terminal):
SELECT tide.relay_set_outbox(
'my-first-pipeline',
'my_events',
'{
"sink_type": "stdout",
"format": "json_pretty"
}'::jsonb
);
Step 6: Start the Relay
In a terminal:
pg-tide --postgres-url "postgres://user:pass@localhost/mydb"
You should see your event printed to the terminal:
{
"outbox_id": 1,
"op": "insert",
"stream_table": "user-signups",
"payload": {
"user_id": "USR-001",
"email": "alice@example.com",
"plan": "pro"
}
}
Step 7: Publish More Events
With the relay running, publish additional events and watch them appear in real-time:
SELECT tide.outbox_publish('my_events', 'user-signups', '{
"user_id": "USR-002",
"email": "bob@example.com",
"plan": "free"
}'::jsonb);
Next Steps
Now that you have a working pipeline, try:
- Switch to a real sink: replace stdout with Kafka, NATS, or any other supported sink
- Add transforms: filter and reshape messages before delivery
- Create an inbox: receive events from external systems
- Set up monitoring: Prometheus metrics for production visibility
Further Reading
- First Pipeline (detailed) — Extended getting-started guide
- Concepts: Transactional Outbox — Why this pattern works
- SQL Reference: Outbox API — Complete function reference
Tutorial: End-to-End Pipeline
This tutorial builds a complete bidirectional messaging system: order events flow out from PostgreSQL to Kafka (forward pipeline), and payment confirmations flow in from NATS back into PostgreSQL (reverse pipeline). By the end, you'll have both directions working with exactly-once delivery guarantees.
This demonstrates a realistic pattern: your order service publishes events when orders are created, and a payment service (running elsewhere) confirms payments by publishing to NATS, which pg_tide relays back into your database for processing.
Prerequisites
- PostgreSQL 18+ with pg_tide installed
- Kafka cluster running (or Redpanda, which is Kafka-compatible)
- NATS server running
- pg-tide relay built with the kafka and nats features
- The kafka-console-consumer CLI (ships with Kafka) and the nats CLI
Part 1: Forward Pipeline (Orders → Kafka)
Step 1: Set up the database schema
-- Install the extension
CREATE EXTENSION pg_tide;
-- Create a table for our orders
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer_id TEXT NOT NULL,
total NUMERIC(10,2) NOT NULL,
status TEXT NOT NULL DEFAULT 'pending',
created_at TIMESTAMPTZ DEFAULT now()
);
-- Create an outbox for order events with 72-hour retention
-- (longer retention gives us time to investigate if something goes wrong)
SELECT tide.outbox_create('orders', p_retention_hours := 72);
-- Create a consumer group for the Kafka relay
-- Using 'earliest' so it processes all existing messages on startup
SELECT tide.create_consumer_group('kafka-relay', 'orders',
p_auto_offset_reset := 'earliest'
);
Step 2: Configure the Kafka pipeline
SELECT tide.relay_set_outbox('orders-to-kafka', 'orders', 'kafka',
jsonb_build_object(
'brokers', 'localhost:9092',
'topic', 'order-events',
'acks', 'all', -- wait for all replicas to acknowledge
'compression', 'snappy', -- good balance of speed and compression ratio
'key', '{event_type}' -- partition by event type for ordering
),
p_batch_size := 200 -- Kafka benefits from larger batches
);
Why these choices?
- acks=all — ensures messages are replicated before the relay considers them delivered. This is the safest option for production.
- compression=snappy — reduces network bandwidth with minimal CPU overhead. Kafka consumers decompress transparently.
- key={event_type} — messages with the same event type go to the same Kafka partition, preserving ordering within a type.
- batch_size=200 — Kafka's efficiency improves with larger batches (amortizes protocol overhead and compression).
Step 3: Start the relay
pg-tide --postgres-url "postgres://user:pass@localhost:5432/mydb"
The relay discovers the orders-to-kafka pipeline, acquires an advisory lock, and begins polling.
Step 4: Publish order events
Simulate a new order being placed:
BEGIN;
INSERT INTO orders (id, customer_id, total, status)
VALUES (1, 'alice', 149.99, 'confirmed');
-- Publish the event atomically with the business data
SELECT tide.outbox_publish('orders',
jsonb_build_object(
'order_id', 1,
'customer_id', 'alice',
'total', 149.99,
'status', 'confirmed',
'items', jsonb_build_array(
jsonb_build_object('sku', 'WIDGET-01', 'qty', 2, 'price', 49.99),
jsonb_build_object('sku', 'GADGET-03', 'qty', 1, 'price', 50.01)
)
),
jsonb_build_object(
'event_type', 'order.confirmed',
'schema_version', '1.0',
'correlation_id', 'req-abc-123'
)
);
COMMIT;
Publish a few more to make the pipeline active:
BEGIN;
INSERT INTO orders (id, customer_id, total, status) VALUES (2, 'bob', 42.00, 'confirmed');
SELECT tide.outbox_publish('orders',
'{"order_id": 2, "customer_id": "bob", "total": 42.00, "status": "confirmed"}'::jsonb,
'{"event_type": "order.confirmed", "schema_version": "1.0"}'::jsonb);
COMMIT;
BEGIN;
INSERT INTO orders (id, customer_id, total, status) VALUES (3, 'charlie', 299.95, 'confirmed');
SELECT tide.outbox_publish('orders',
'{"order_id": 3, "customer_id": "charlie", "total": 299.95, "status": "confirmed"}'::jsonb,
'{"event_type": "order.confirmed", "schema_version": "1.0"}'::jsonb);
COMMIT;
Step 5: Verify delivery to Kafka
kafka-console-consumer --bootstrap-server localhost:9092 \
--topic order-events --from-beginning --max-messages 3
You should see all three order events. Verify from the PostgreSQL side:
-- All messages delivered
SELECT * FROM tide.outbox_pending;
-- Should show 0 pending
-- Consumer lag at zero
SELECT * FROM tide.consumer_lag;
-- Should show lag = 0 for kafka-relay group
Part 2: Reverse Pipeline (Payment Confirmations from NATS → Inbox)
Now let's set up the reverse direction: payment confirmations arrive on a NATS subject and are written to a pg_tide inbox for processing.
Step 6: Create the inbox
-- Create an inbox for payment events
-- max_retries: how many times we'll retry processing before DLQ
-- processed_retention: keep successfully processed messages for 3 days (auditing)
-- dlq_retention: keep failed messages forever (manual investigation)
SELECT tide.inbox_create('payments',
p_max_retries := 5,
p_processed_retention_hours := 72,
p_dlq_retention_hours := 0
);
This creates a table tide."payments_inbox" with a UNIQUE constraint on event_id for deduplication.
Step 7: Configure the reverse pipeline
SELECT tide.relay_set_inbox('nats-to-payments', 'payments',
jsonb_build_object(
'url', 'nats://localhost:4222',
'subject', 'payments.confirmed',
'queue_group', 'pg-tide-payments'
),
p_source := 'nats',
p_batch_size := 50,
p_idempotent := true
);
The relay subscribes to payments.confirmed on NATS and writes incoming messages to the payments inbox. The queue_group ensures that if you run multiple relay instances, each message is processed by only one instance.
Step 8: Simulate incoming payment confirmations
Using the NATS CLI, publish some payment events (as if a payment service were sending them):
nats pub payments.confirmed '{"payment_id": "pay-001", "order_id": 1, "amount": 149.99, "status": "completed", "processor": "stripe"}'
nats pub payments.confirmed '{"payment_id": "pay-002", "order_id": 2, "amount": 42.00, "status": "completed", "processor": "stripe"}'
nats pub payments.confirmed '{"payment_id": "pay-003", "order_id": 3, "amount": 299.95, "status": "completed", "processor": "stripe"}'
Step 9: Verify inbox delivery
-- Check that messages arrived in the inbox
SELECT event_id, payload->>'payment_id' as payment_id,
payload->>'order_id' as order_id,
received_at, processed_at
FROM tide."payments_inbox"
ORDER BY id;
event_id | payment_id | order_id | received_at | processed_at
-----------------------+------------+----------+--------------------------+--------------
payments:nats:seq-1 | pay-001 | 1 | 2025-01-15 10:05:00+00 |
payments:nats:seq-2 | pay-002 | 2 | 2025-01-15 10:05:01+00 |
payments:nats:seq-3 | pay-003 | 3 | 2025-01-15 10:05:02+00 |
Messages are in the inbox, waiting to be processed. Notice the event_id — this is the dedup key that prevents duplicate processing if the same message is delivered twice.
Step 10: Process inbox messages
In your application, you'd read and process these messages:
-- Read pending payments
SELECT id, event_id, payload
FROM tide."payments_inbox"
WHERE processed_at IS NULL
AND retry_count < 5
ORDER BY id
LIMIT 10;
-- After successfully updating the order status
UPDATE orders SET status = 'paid' WHERE id = 1;
SELECT tide.inbox_mark_processed('payments', 'payments:nats:seq-1');
-- Process the rest
UPDATE orders SET status = 'paid' WHERE id = 2;
SELECT tide.inbox_mark_processed('payments', 'payments:nats:seq-2');
UPDATE orders SET status = 'paid' WHERE id = 3;
SELECT tide.inbox_mark_processed('payments', 'payments:nats:seq-3');
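If several workers poll the same inbox, claim rows with FOR UPDATE SKIP LOCKED so no two workers process the same payment. A minimal sketch using the table and function names from the steps above:
BEGIN;
WITH claimed AS (
    -- Claim up to 10 unprocessed payments; rows locked by other
    -- workers are skipped instead of waited on.
    SELECT id, event_id, payload
    FROM tide."payments_inbox"
    WHERE processed_at IS NULL AND retry_count < 5
    ORDER BY id
    LIMIT 10
    FOR UPDATE SKIP LOCKED
),
applied AS (
    -- Apply the business effect for each claimed payment.
    UPDATE orders o
    SET status = 'paid'
    FROM claimed c
    WHERE o.id = (c.payload->>'order_id')::int
    RETURNING c.event_id
)
-- Mark each applied event processed in the same transaction.
SELECT tide.inbox_mark_processed('payments', event_id) FROM applied;
COMMIT;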
Step 11: Test deduplication
What happens if the same payment confirmation is delivered twice (e.g., the payment service retried)?
# Publish the same payment again
nats pub payments.confirmed '{"payment_id": "pay-001", "order_id": 1, "amount": 149.99, "status": "completed", "processor": "stripe"}'
The inbox's UNIQUE constraint catches the duplicate:
-- Still only one row for pay-001
SELECT count(*) FROM tide."payments_inbox"
WHERE payload->>'payment_id' = 'pay-001';
-- Returns: 1
No double-processing, no duplicate orders marked as paid.
Part 3: Monitoring the Complete System
Check both directions
-- Forward pipeline status: outbox → Kafka
SELECT * FROM tide.outbox_pending;
SELECT * FROM tide.consumer_lag;
-- Reverse pipeline status: NATS → inbox
SELECT
count(*) FILTER (WHERE processed_at IS NULL AND retry_count < 5) as pending,
count(*) FILTER (WHERE processed_at IS NOT NULL) as processed,
count(*) FILTER (WHERE processed_at IS NULL AND retry_count >= 5) as dead_letter
FROM tide."payments_inbox";
Prometheus metrics
curl -s http://localhost:9090/metrics | grep pg_tide
Key metrics to watch:
pg_tide_relay_messages_published_total{pipeline="orders-to-kafka",direction="forward"} 3
pg_tide_relay_messages_consumed_total{pipeline="nats-to-payments",direction="reverse"} 3
pg_tide_relay_pipeline_healthy{pipeline="orders-to-kafka"} 1
pg_tide_relay_pipeline_healthy{pipeline="nats-to-payments"} 1
Part 4: Failure Scenarios
Let's test what happens when things go wrong.
Scenario: Kafka is temporarily down
Stop Kafka, then publish a new order:
BEGIN;
INSERT INTO orders (id, customer_id, total, status) VALUES (4, 'diana', 75.50, 'confirmed');
SELECT tide.outbox_publish('orders',
'{"order_id": 4, "customer_id": "diana", "total": 75.50}'::jsonb,
'{"event_type": "order.confirmed"}'::jsonb);
COMMIT;
The message is safely in the outbox. The relay will retry delivery with exponential backoff until Kafka recovers. Check the outbox:
SELECT * FROM tide.outbox_pending;
-- Shows 1 pending message
Start Kafka again — the relay delivers the message automatically:
SELECT * FROM tide.outbox_pending;
-- Back to 0 pending
Scenario: Processing an inbox message fails
-- Simulate a processing failure (assume a fourth payment event,
-- payments:nats:seq-4, has since arrived in the inbox)
SELECT tide.inbox_mark_failed('payments', 'payments:nats:seq-4',
'External API timeout: Stripe returned 504');
The message stays in the inbox with an incremented retry_count. Your application can retry it later. After 5 failures (our max_retries), it's in the dead-letter queue:
-- Check DLQ
SELECT event_id, retry_count, last_error
FROM tide."payments_inbox"
WHERE processed_at IS NULL AND retry_count >= 5;
After fixing the issue, replay the message:
SELECT tide.replay_inbox_messages('payments', ARRAY['payments:nats:seq-4']);
Production Considerations
When adapting this tutorial for production:
- Use TLS for all connections (PostgreSQL, Kafka, NATS)
- Use acks=all for Kafka to ensure message durability
- Set appropriate batch sizes — 200-500 for Kafka, 50-100 for NATS
- Monitor consumer lag and alert when it exceeds thresholds
- Run multiple relay instances (same relay_group_id) for HA
- Use structured JSON logging (--log-format json) for log aggregation
- Set retention appropriately — longer retention means more disk usage but easier debugging
Next Steps
- Bidirectional Sync → — synchronize state between two services
- Fan-out Pattern → — deliver one outbox to multiple sinks
- Dead-Letter Queue → — manage failed messages systematically
- Real-World Scenarios → — complete business use cases
Real-World Scenarios
This page demonstrates pg_tide in realistic business contexts. Each scenario shows a complete architecture with SQL setup, pipeline configuration, and operational guidance — giving you a blueprint for solving similar problems in your own systems.
Scenario 1: E-Commerce Order Pipeline
The situation: You're building an e-commerce platform. When a customer places an order, multiple downstream systems need to know: the warehouse service ships the items, the analytics pipeline tracks revenue, the email service sends a confirmation, and the search index updates product availability.
Architecture
┌─────────────────────────────────────────────────────────────────┐
│ PostgreSQL (pg_tide) │
│ │
│ orders table ─── outbox_publish() ──▶ "orders" outbox │
│ │ │
└──────────────────────────────────────────────┼────────────────────┘
│
┌─────────────────┼─────────────────┐
│ │ │
▼ ▼ ▼
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ NATS │ │ Kafka │ │ Webhook │
│ (warehouse) │ │ (analytics) │ │ (email svc) │
└─────────────┘ └─────────────┘ └─────────────┘
Setup
-- Create the outbox
SELECT tide.outbox_create('orders', p_retention_hours := 72);
-- Three independent consumer groups — each tracks its own progress
SELECT tide.create_consumer_group('warehouse-relay', 'orders');
SELECT tide.create_consumer_group('analytics-relay', 'orders');
SELECT tide.create_consumer_group('email-relay', 'orders');
-- Three pipelines: same outbox, different destinations
SELECT tide.relay_set_outbox('orders-to-warehouse', 'orders', 'nats',
jsonb_build_object(
'url', 'nats://nats:4222',
'subject', 'warehouse.orders.{event_type}'
),
p_batch_size := 50
);
SELECT tide.relay_set_outbox('orders-to-analytics', 'orders', 'kafka',
jsonb_build_object(
'brokers', 'kafka:9092',
'topic', 'order-events',
'compression', 'zstd'
),
p_batch_size := 500
);
SELECT tide.relay_set_outbox('orders-to-email', 'orders', 'webhook',
jsonb_build_object(
'url', 'https://email-service.internal/hooks/orders',
'timeout_ms', 5000,
'headers', '{"Authorization": "Bearer ${ENV:EMAIL_SVC_TOKEN}"}'
),
p_batch_size := 10
);
Publishing events throughout the order lifecycle
-- When an order is placed
BEGIN;
INSERT INTO orders (id, customer_id, total, status)
VALUES (1001, 'cust-42', 249.99, 'confirmed');
SELECT tide.outbox_publish('orders',
jsonb_build_object(
'order_id', 1001,
'customer_id', 'cust-42',
'total', 249.99,
'items', '[{"sku": "LAPTOP-15", "qty": 1}]'::jsonb,
'shipping_address', '{"city": "Oslo", "country": "NO"}'::jsonb
),
'{"event_type": "order.confirmed", "schema_version": "2.0"}'::jsonb
);
COMMIT;
-- When the order ships
BEGIN;
UPDATE orders SET status = 'shipped' WHERE id = 1001;
SELECT tide.outbox_publish('orders',
'{"order_id": 1001, "tracking_number": "NO-123-456", "carrier": "Posten"}'::jsonb,
'{"event_type": "order.shipped"}'::jsonb
);
COMMIT;
Why this works well
- Each downstream system progresses independently — the email service can be slow without blocking the warehouse
- If the email service goes down, its messages accumulate in the outbox; the warehouse and analytics pipelines are unaffected
- The analytics pipeline uses large batches for Kafka efficiency
- The webhook pipeline uses small batches and a short timeout to detect email service issues quickly
Scenario 2: Multi-Tenant SaaS Webhook Delivery
The situation: You're building a B2B SaaS platform where tenants configure webhook endpoints to receive events. Each tenant has different endpoint URLs, different reliability requirements, and different event volumes. You need retry logic, per-tenant isolation, and visibility into delivery status.
Architecture
Each tenant gets their own outbox. This provides:
- Independent backpressure (one slow tenant doesn't block others)
- Per-tenant monitoring (consumer lag is per-outbox)
- Tenant-specific retention policies
Setup
-- Tenant onboarding: create a dedicated outbox
CREATE OR REPLACE FUNCTION provision_tenant_outbox(tenant_id TEXT, webhook_url TEXT)
RETURNS void LANGUAGE plpgsql AS $$
BEGIN
-- Each tenant gets their own outbox
PERFORM tide.outbox_create(
'webhooks-' || tenant_id,
p_retention_hours := 168 -- 7 days for webhook retry window
);
PERFORM tide.create_consumer_group(
'webhook-delivery-' || tenant_id,
'webhooks-' || tenant_id
);
-- Configure webhook delivery pipeline
PERFORM tide.relay_set_outbox(
'deliver-' || tenant_id,
'webhooks-' || tenant_id,
'webhook',
jsonb_build_object(
'url', webhook_url,
'timeout_ms', 10000,
'retry_codes', '[429, 500, 502, 503, 504]',
'headers', jsonb_build_object(
'X-Tenant-ID', tenant_id,
'X-Webhook-Version', '2024-01-01'
)
),
p_batch_size := 1 -- Deliver webhooks one at a time for ordering
);
END;
$$;
-- Provision a few tenants
SELECT provision_tenant_outbox('acme-corp', 'https://hooks.acme.com/events');
SELECT provision_tenant_outbox('globex-inc', 'https://api.globex.io/webhooks');
Publishing tenant events
-- Publish an event for a specific tenant
SELECT tide.outbox_publish(
'webhooks-acme-corp',
jsonb_build_object(
'event', 'invoice.paid',
'data', jsonb_build_object(
'invoice_id', 'inv-2025-001',
'amount_cents', 49900,
'currency', 'USD'
),
'timestamp', now()
),
'{"event_type": "invoice.paid"}'::jsonb
);
Monitoring per-tenant delivery
-- Which tenants have delivery lag?
SELECT
outbox_name,
pending_count,
oldest_at,
now() - oldest_at AS oldest_age
FROM tide.outbox_pending
WHERE outbox_name LIKE 'webhooks-%'
AND pending_count > 0
ORDER BY pending_count DESC;
Scenario 3: Event-Driven Data Warehouse Loading
The situation: Your operational database (PostgreSQL) processes transactions throughout the day, and your analytics team needs those changes loaded into a data warehouse (Snowflake, BigQuery, or a PostgreSQL analytics replica). Instead of batch ETL jobs that run once an hour, you want near-real-time streaming of changes.
Architecture
┌─────────────────┐ ┌──────────┐ ┌─────────────────┐
│ App Database │ │ pg-tide │ │ Data Warehouse │
│ (PostgreSQL) │────────▶│ relay │────────▶│ (staging area) │
│ │ outbox │ │ Kafka │ │
└─────────────────┘ └──────────┘ └────────┬────────┘
│
┌──────▼──────┐
│ dbt / ETL │
│ transforms │
└─────────────┘
Setup
-- Outbox for dimension table changes
SELECT tide.outbox_create('dim-changes', p_retention_hours := 168);
SELECT tide.create_consumer_group('warehouse-loader', 'dim-changes');
-- Pipeline to Kafka (where Kafka Connect or a custom consumer loads into the warehouse)
SELECT tide.relay_set_outbox('dims-to-kafka', 'dim-changes', 'kafka',
jsonb_build_object(
'brokers', 'kafka:9092',
'topic', 'warehouse.dim-changes',
'compression', 'zstd',
'key', '{event_type}' -- partition by entity type for parallel loading
),
p_batch_size := 500
);
Trigger-based publishing
For tables where every change should be captured:
-- Automatically publish changes to the customers dimension
CREATE OR REPLACE FUNCTION publish_customer_change()
RETURNS TRIGGER LANGUAGE plpgsql AS $$
BEGIN
PERFORM tide.outbox_publish('dim-changes',
jsonb_build_object(
'entity', 'customer',
'operation', TG_OP,
'data', row_to_json(NEW)::jsonb,
'changed_at', now()
),
jsonb_build_object(
'event_type', 'customer.' || lower(TG_OP),
'table', TG_TABLE_NAME
)
);
RETURN NEW;
END;
$$;
CREATE TRIGGER capture_customer_changes
AFTER INSERT OR UPDATE ON customers
FOR EACH ROW EXECUTE FUNCTION publish_customer_change();
Why this works
- Changes stream in near-real-time (seconds, not hours)
- The outbox provides a reliable buffer if the warehouse is temporarily unavailable
- Kafka provides durable, replayable storage for the warehouse loader
- The trigger captures changes automatically without modifying application code
- You control exactly what gets published (unlike CDC which captures raw row changes)
Scenario 4: Microservice Choreography (Saga Pattern)
The situation: You're processing an order that requires coordination across multiple services: reserve inventory, charge payment, schedule shipping. Rather than a central orchestrator, you want each service to react to events and publish its own events — choreography style.
Architecture
Each service has its own database with pg_tide. Events flow through NATS:
Order Service Inventory Service Payment Service
┌───────────┐ ┌───────────┐ ┌───────────┐
│ outbox: │──▶ NATS ──▶ │ inbox: │ │ inbox: │
│ "orders" │ order.confirmed │ "inv-req" │ │ "pay-req" │
└───────────┘ └─────┬─────┘ └─────┬─────┘
│ │
┌───────────┐ ┌─────▼─────┐ ┌─────▼─────┐
│ inbox: │◀── NATS ◀── │ outbox: │ │ outbox: │
│ "results" │ *.completed │ "inv-out" │ │ "pay-out" │
└───────────┘ └───────────┘ └───────────┘
Order service setup
-- Outbox: publishes order events
SELECT tide.outbox_create('orders', p_retention_hours := 168);
SELECT tide.create_consumer_group('nats-fanout', 'orders');
SELECT tide.relay_set_outbox('orders-fanout', 'orders', 'nats',
jsonb_build_object(
'url', 'nats://nats:4222',
'subject', 'orders.{event_type}'
)
);
-- Inbox: receives completion/failure events from other services
SELECT tide.inbox_create('order-results',
p_max_retries := 10,
p_processed_retention_hours := 168
);
SELECT tide.relay_set_inbox('results-from-services', 'order-results',
jsonb_build_object(
'url', 'nats://nats:4222',
'subject', 'orders.*.completed',
'queue_group', 'order-svc'
),
p_source := 'nats'
);
Inventory service setup
-- Inbox: receives inventory reservation requests
SELECT tide.inbox_create('inventory-requests', p_max_retries := 3);
SELECT tide.relay_set_inbox('inv-requests', 'inventory-requests',
jsonb_build_object(
'url', 'nats://nats:4222',
'subject', 'orders.order.confirmed',
'queue_group', 'inventory-svc'
),
p_source := 'nats'
);
-- Outbox: publishes reservation results
SELECT tide.outbox_create('inventory-results', p_retention_hours := 72);
SELECT tide.create_consumer_group('inv-nats', 'inventory-results');
SELECT tide.relay_set_outbox('inv-results-out', 'inventory-results', 'nats',
jsonb_build_object(
'url', 'nats://nats:4222',
'subject', 'orders.inventory.completed'
)
);
The flow
- Order service publishes order.confirmed → NATS
- Inventory service receives it in its inbox, reserves stock, publishes inventory.completed
- Payment service receives the original event, charges the card, publishes payment.completed
- Order service receives both completion events in its inbox and updates order status
Each step is independently reliable:
- If the inventory service is down, messages accumulate in its inbox
- If a payment fails, it's retried up to max_retries times
- The order service's inbox deduplicates if any message is delivered twice
- Every service can be independently deployed, scaled, and debugged
Scenario 5: Audit Trail and Compliance Logging
The situation: You need an immutable audit trail of every significant business action for regulatory compliance. The audit log must be tamper-resistant, queryable, and forwarded to long-term archival storage (S3, GCS, or a compliance platform).
Architecture
-- Dedicated outbox for audit events (long retention, never disabled)
SELECT tide.outbox_create('audit-log',
p_retention_hours := 8760, -- 365 days local retention
p_inline_threshold := 1000000 -- very high threshold (never pause auditing)
);
-- Forward to cloud storage via webhook (to an internal archival service)
SELECT tide.create_consumer_group('archive-relay', 'audit-log');
SELECT tide.relay_set_outbox('audit-to-archive', 'audit-log', 'webhook',
jsonb_build_object(
'url', 'https://compliance-archiver.internal/ingest',
'timeout_ms', 30000,
'headers', '{"Authorization": "Bearer ${ENV:ARCHIVE_TOKEN}"}'
),
p_batch_size := 100
);
-- Also forward to Kafka for real-time compliance monitoring
SELECT tide.create_consumer_group('compliance-kafka', 'audit-log');
SELECT tide.relay_set_outbox('audit-to-kafka', 'audit-log', 'kafka',
jsonb_build_object(
'brokers', 'kafka:9092',
'topic', 'compliance.audit-events',
'acks', 'all'
),
p_batch_size := 200
);
Publishing audit events
Create a helper function that standardizes the audit event format:
CREATE OR REPLACE FUNCTION audit_log(
p_action TEXT,
p_actor TEXT,
p_resource_type TEXT,
p_resource_id TEXT,
p_details JSONB DEFAULT '{}'
) RETURNS void LANGUAGE plpgsql AS $$
BEGIN
PERFORM tide.outbox_publish('audit-log',
jsonb_build_object(
'action', p_action,
'actor', p_actor,
'resource_type', p_resource_type,
'resource_id', p_resource_id,
'details', p_details,
'timestamp', now(),
'ip_address', inet_client_addr()::text
),
jsonb_build_object(
'event_type', 'audit.' || p_action,
'compliance_class', 'SOC2'
)
);
END;
$$;
Use it throughout your application:
BEGIN;
UPDATE users SET email = 'new@example.com' WHERE id = 42;
SELECT audit_log('user.email_changed', 'admin-jane',
'user', '42',
'{"old_email": "old@example.com", "new_email": "new@example.com"}'::jsonb
);
COMMIT;
BEGIN;
DELETE FROM api_keys WHERE id = 7;
SELECT audit_log('api_key.revoked', 'user-42',
'api_key', '7',
'{"reason": "compromised"}'::jsonb
);
COMMIT;
Why this works for compliance
- Atomicity: Audit events are committed with the action they describe. It's impossible to perform an action without creating an audit record.
- Immutability: The outbox table is append-only from the application's perspective. Once committed, an audit event cannot be altered.
- Durability: Messages are replicated to multiple destinations (archive + Kafka). Even if one destination fails, the other captures the event.
- Queryability: The audit log is a PostgreSQL table — you can run SQL queries for investigation.
- Tamper detection: Compare the local outbox with the archived copy to detect any discrepancies.
Common Patterns Across Scenarios
Pattern: Use headers for routing
All scenarios use the headers JSONB for metadata that controls routing, filtering, and versioning:
'{"event_type": "order.confirmed", "schema_version": "2.0", "tenant_id": "acme"}'
This keeps the payload clean (business data only) while providing rich metadata for infrastructure decisions.
Pattern: One outbox per bounded context
Rather than one giant outbox for everything, create outboxes aligned with your domain boundaries:
- orders — order lifecycle events
- inventory — stock level changes
- payments — payment processing events
- audit-log — compliance events
This provides independent backpressure, monitoring, and retention per domain.
Pattern: Multiple consumer groups for fan-out
A single outbox can serve many purposes. Each consumer group tracks its own position independently, so a slow consumer doesn't block fast ones.
Pattern: Structured event schemas
Version your event payloads with a schema_version header. This allows consumers to handle schema evolution gracefully:
SELECT tide.outbox_publish('orders',
'{"order_id": 1, "total": 99.99, "currency": "USD"}'::jsonb,
'{"event_type": "order.confirmed", "schema_version": "2.0"}'::jsonb
);
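On the consuming side, branching on the header keeps old and new payloads working during a migration window. Illustrative only — the v1 field name amount is hypothetical:
SELECT CASE headers->>'schema_version'
         WHEN '1.0' THEN (payload->>'amount')::numeric
         ELSE (payload->>'total')::numeric
       END AS order_total
FROM (VALUES
  ('{"total": 99.99}'::jsonb, '{"schema_version": "2.0"}'::jsonb)
) AS evt(payload, headers);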
Bidirectional Sync
Some systems need two-way data flow: events flow out of PostgreSQL to a message broker (forward relay), and events from that same broker flow back into PostgreSQL (reverse relay). pg_tide handles this without any external coordinator — you simply configure both directions as separate pipelines.
When to use bidirectional sync
- Microservice choreography: Service A writes orders; Service B processes them and writes fulfilments back. Both share the same broker topic but different outboxes and inboxes.
- Read model synchronisation: Keep an Elasticsearch index or Redis cache updated by streaming writes out of PostgreSQL and projecting them back into a read database via the inbox.
- Event sourcing with CQRS: The write-side emits domain events to the outbox; the read-side rebuilds its projection from the inbox.
Architecture
┌───────────────────────────────┐
│ PostgreSQL │
│ │
│ tide.outbox_messages │ ──(forward)──► NATS / Kafka
│ │
│ tide.<name>_inbox │ ◄──(reverse)── NATS / Kafka
└───────────────────────────────┘
▲ │
│ pg-tide relay
└──────────┘
(single process, two pipelines)
The relay reads both pipeline configurations from the tide schema on startup.
A single pg-tide process can run dozens of forward and reverse pipelines
simultaneously — no separate instances are required.
Step-by-step example
1. Set up the outbox and inbox
-- Forward: orders flow out.
SELECT tide.outbox_create('orders');
-- Reverse: fulfilments flow in.
SELECT tide.inbox_create('fulfilments');
2. Configure the relay pipelines
-- Forward pipeline: outbox → NATS subject "orders.events"
SELECT tide.relay_set_outbox(
'forward-orders', -- pipeline name
'orders', -- source outbox
'nats', -- sink type
'{"url":"nats://broker:4222","subject":"orders.events"}'::jsonb
);
-- Reverse pipeline: NATS subject "fulfilments.events" → inbox
SELECT tide.relay_set_inbox(
  'reverse-fulfilments',   -- pipeline name
  'fulfilments',           -- target inbox
  '{"url":"nats://broker:4222","subject":"fulfilments.events"}'::jsonb,
  p_source := 'nats'       -- source type
);
3. Start the relay
pg-tide --postgres-url "$DATABASE_URL"
Both pipelines start automatically. The relay logs each pipeline direction on startup:
INFO pipeline name=forward-orders direction=Forward
INFO pipeline name=reverse-fulfilments direction=Reverse
4. Publish an order and receive the fulfilment
-- Application publishes an order:
SELECT tide.outbox_publish(
'orders',
'{"order_id": 1001, "item": "widget", "qty": 3}'::jsonb,
'{}'::jsonb
);
The relay picks this up and publishes it to orders.events. The fulfilment
service consumes it, processes it, and publishes to fulfilments.events. The
relay's reverse pipeline writes the result into the inbox:
SELECT event_id, source, payload, received_at
FROM tide.fulfilments_inbox
ORDER BY received_at DESC
LIMIT 5;
Preventing loops
Bidirectional sync carries the risk of infinite feedback loops if both sides subscribe to the same topic. Prevent this with:
- Separate subjects/topics for each direction (recommended).
- Event filtering: check a custom header (e.g. x-source: service-a) in the relay's transform config and drop events originating from self — see the tagging sketch below.
- Inbox idempotency: the inbox's UNIQUE(event_id) constraint silently ignores messages it has already processed.
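A sketch of the tagging half of the filtering approach: stamp every outgoing event with its origin so the other side's filter can drop self-originated events. The x-source header is a convention, not a built-in field:
SELECT tide.outbox_publish(
    'orders',
    '{"order_id": 1001, "item": "widget", "qty": 3}'::jsonb,
    '{"x-source": "service-a"}'::jsonb
);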
Monitoring
Both pipelines emit independent metrics:
| Metric | Labels |
|---|---|
| pg_tide_relay_messages_consumed_total | pipeline=forward-orders, direction=forward |
| pg_tide_relay_messages_published_total | pipeline=forward-orders, direction=forward |
| pg_tide_relay_messages_consumed_total | pipeline=reverse-fulfilments, direction=reverse |
| pg_tide_relay_pipeline_healthy | pipeline=forward-orders / pipeline=reverse-fulfilments |
Check consumer lag to verify neither side is falling behind:
SELECT group_name, consumer_id, lag, last_heartbeat
FROM tide.consumer_lag
ORDER BY lag DESC;
Tutorial: Fan-out Pattern
Deliver the same outbox events to multiple downstream systems simultaneously. Each system maintains its own consumer group with independent offset tracking.
The Pattern
┌──▶ NATS (real-time notifications)
orders outbox ──▶──┼──▶ Kafka (analytics pipeline)
└──▶ Webhook (partner integration)
Each destination gets its own relay pipeline and consumer group, progressing independently.
Step 1: Create the Shared Outbox
SELECT tide.outbox_create('orders', p_retention_hours := 72);
Step 2: Create Consumer Groups
SELECT tide.create_consumer_group('nats-relay', 'orders');
SELECT tide.create_consumer_group('kafka-relay', 'orders');
SELECT tide.create_consumer_group('webhook-relay', 'orders');
Step 3: Configure Pipelines
-- To NATS
SELECT tide.relay_set_outbox('orders-nats', 'orders', 'nats',
'{"url": "nats://nats:4222", "subject": "orders.events"}'::jsonb
);
-- To Kafka
SELECT tide.relay_set_outbox('orders-kafka', 'orders', 'kafka',
'{"brokers": "kafka:9092", "topic": "orders"}'::jsonb
);
-- To partner webhook
SELECT tide.relay_set_outbox('orders-webhook', 'orders', 'webhook',
'{"url": "https://partner.example.com/hooks/orders"}'::jsonb
);
Step 4: Monitor Independent Progress
SELECT * FROM tide.consumer_lag;
group_name | outbox_name | consumer_id | committed_offset | lag
---------------+-------------+-------------+------------------+-----
nats-relay | orders | relay-0 | 1000 | 0
kafka-relay | orders | relay-0 | 950 | 50
webhook-relay | orders | relay-0 | 800 | 200
The webhook relay is behind (maybe the partner endpoint is slow) — but it doesn't affect NATS or Kafka delivery.
Benefits
- Independent progress — slow consumers don't block fast ones
- Independent retry — if one sink fails, others continue
- Single source of truth — all events come from the same outbox
- No message duplication at the source — publish once, deliver many
Dead-Letter Queue
Even the most reliable systems occasionally encounter messages they cannot process: a malformed payload, a downstream service that is permanently down, or a bug in the consumer. pg_tide's idempotent inbox gives you a first-class dead-letter queue (DLQ) built directly into PostgreSQL — no extra broker configuration required.
How the DLQ works
When a message fails more than max_retries times, tide.inbox_mark_failed()
moves it to a separate retention bucket tracked by the dlq_retention_hours
column. The message stays visible in the inbox table with processed_at = NULL
and a non-empty last_error, making it trivial to query, investigate, and
replay.
Normal flow:
received → pending → mark_processed ──► deleted after processed_retention_hours
Failure flow:
received → pending → mark_failed (retry_count++)
│
retry_count > max_retries?
│
YES
▼
last_error set, stays in table
until dlq_retention_hours expires
Setting up an inbox with a DLQ
SELECT tide.inbox_create(
'payments',
'tide', -- schema
3, -- max_retries before DLQ
72, -- processed_retention_hours (3 days)
168 -- dlq_retention_hours (7 days)
);
With max_retries = 3, the relay will attempt to process each message up to
three times. After the third failure, last_error is populated and the message
is left in the DLQ section of the inbox.
Viewing DLQ messages
SELECT
event_id,
source,
payload,
retry_count,
last_error,
received_at
FROM tide.payments_inbox
WHERE processed_at IS NULL
AND retry_count >= 3
ORDER BY received_at;
Replaying failed messages
Once you have fixed the root cause, replay individual events or entire batches:
-- Replay a single event:
SELECT tide.replay_inbox_messages('payments', ARRAY['evt-abc-123']);
-- Replay all DLQ messages at once:
SELECT tide.replay_inbox_messages(
'payments',
ARRAY(
SELECT event_id
FROM tide.payments_inbox
WHERE processed_at IS NULL AND retry_count >= 3
)
);
replay_inbox_messages resets retry_count to 0 and clears last_error,
making the events eligible for re-processing on the next relay poll.
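To confirm a replay took effect, check the reset columns on the affected rows:
-- After replay, retry_count is 0 and last_error is cleared.
SELECT event_id, retry_count, last_error, processed_at
FROM tide.payments_inbox
WHERE event_id = 'evt-abc-123';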
Alerting on DLQ depth
Wire the DLQ depth to your alerting stack using a simple SQL query:
-- Prometheus-style metric via a custom exporter:
SELECT
    cfg.inbox_name,
    sub.cnt AS dlq_depth
FROM tide.tide_inbox_config cfg
CROSS JOIN LATERAL (
    SELECT COUNT(*) AS cnt
    FROM tide.tide_inbox_messages msg
    -- correlate messages with their inbox so each count is per-inbox
    WHERE msg.inbox_name = cfg.inbox_name
      AND msg.processed_at IS NULL
      AND msg.retry_count >= cfg.max_retries
) sub
WHERE sub.cnt > 0;
Or, if you use Alertmanager with a Postgres exporter, add a rule:
# prometheus-rules.yaml
groups:
- name: pg_tide
rules:
- alert: InboxDLQNotEmpty
expr: pg_tide_dlq_depth > 0
for: 5m
labels:
severity: warning
annotations:
summary: "pg_tide inbox {{ $labels.inbox }} has {{ $value }} dead-lettered messages"
Forwarding DLQ messages to an external queue
For long-term storage or cross-team visibility, you can forward DLQ messages to an external system (e.g. an S3 bucket or a dedicated Slack alert):
-- Notify via LISTEN/NOTIFY when a message exceeds max_retries:
CREATE OR REPLACE FUNCTION tide.notify_dlq() RETURNS TRIGGER LANGUAGE plpgsql AS $$
BEGIN
IF NEW.retry_count >= (
        SELECT max_retries FROM tide.tide_inbox_config
        -- inbox tables are named <name>_inbox; strip the suffix to match the catalog
        WHERE inbox_name = regexp_replace(TG_TABLE_NAME::text, '_inbox$', '')
) THEN
PERFORM pg_notify(
'tide_dlq',
json_build_object(
'inbox', TG_TABLE_NAME,
'event_id', NEW.event_id,
'error', NEW.last_error
)::text
);
END IF;
RETURN NEW;
END;
$$;
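The function still has to be attached to each inbox table you want to watch. A sketch for the payments inbox from earlier (the trigger name is illustrative):
-- Fire after every retry-count update on the payments inbox.
CREATE TRIGGER payments_dlq_notify
AFTER UPDATE ON tide.payments_inbox
FOR EACH ROW EXECUTE FUNCTION tide.notify_dlq();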
The relay's reverse pipeline can listen on tide_dlq and route messages to
a dedicated dead-letter topic in NATS or Kafka for out-of-band handling.
Automatic DLQ cleanup
DLQ rows are removed automatically when dlq_retention_hours expires via
tide.inbox_truncate_processed(). You can also clean them manually:
-- Remove DLQ rows older than 7 days from the payments inbox:
SELECT tide.inbox_truncate_processed('payments');
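If you prefer an explicit schedule and the pg_cron extension is available (an assumption, not a pg_tide requirement), a sketch:
-- Run the manual cleanup nightly at 03:00 via pg_cron.
SELECT cron.schedule(
    'tide-dlq-cleanup',
    '0 3 * * *',
    $$SELECT tide.inbox_truncate_processed('payments')$$
);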
Tutorial: Kafka + Flink Stream Processing
This tutorial shows how to build a real-time analytics pipeline using pg_tide and Apache Flink. You'll publish order events from PostgreSQL to Kafka via pg_tide, then process them with Flink SQL to compute running totals and write results back.
What You'll Build
PostgreSQL (orders) → pg_tide → Kafka → Flink SQL → Kafka (results) → pg_tide → PostgreSQL (analytics)
Prerequisites
- PostgreSQL 18+ with the pg_tide extension installed
- pg-tide relay binary
- Apache Kafka (or Redpanda)
- Apache Flink 1.17+ with SQL Gateway
Step 1: Create the Outbox
-- Create an outbox for order events
SELECT tide.outbox_create('order_events');
-- Publish a test event
SELECT tide.outbox_publish('order_events', 'orders', jsonb_build_object(
'order_id', 'ORD-001',
'customer_id', 'CUST-42',
'total', 149.99,
'region', 'us-east',
'created_at', now()
));
Step 2: Configure the Pipeline
SELECT tide.relay_set_outbox(
'orders-to-kafka',
'order_events',
'{
"sink_type": "kafka",
"brokers": "kafka:9092",
"topic": "order-events",
"wire_format": "debezium",
"wire_config": {
"server_name": "production"
}
}'::jsonb
);
Step 3: Start the Relay
pg-tide --postgres-url "postgres://user:pass@localhost/mydb"
Step 4: Create Flink SQL Job
Connect to Flink SQL Gateway and define a source table reading from Kafka:
CREATE TABLE order_events (
order_id STRING,
customer_id STRING,
total DECIMAL(10, 2),
region STRING,
created_at TIMESTAMP(3),
WATERMARK FOR created_at AS created_at - INTERVAL '5' SECOND
) WITH (
'connector' = 'kafka',
'topic' = 'order-events',
'properties.bootstrap.servers' = 'kafka:9092',
'properties.group.id' = 'flink-analytics',
'format' = 'debezium-json',
'scan.startup.mode' = 'earliest-offset'
);
Define the output table:
CREATE TABLE order_analytics (
window_start TIMESTAMP(3),
window_end TIMESTAMP(3),
region STRING,
order_count BIGINT,
total_revenue DECIMAL(12, 2)
) WITH (
'connector' = 'kafka',
'topic' = 'order-analytics',
'properties.bootstrap.servers' = 'kafka:9092',
'format' = 'json'
);
Run the aggregation:
INSERT INTO order_analytics
SELECT
window_start,
window_end,
region,
COUNT(*) as order_count,
SUM(total) as total_revenue
FROM TABLE(
TUMBLE(TABLE order_events, DESCRIPTOR(created_at), INTERVAL '1' MINUTE)
)
GROUP BY window_start, window_end, region;
Step 5: Ingest Results Back to PostgreSQL
Create an inbox and configure a reverse pipeline to consume Flink's output:
-- Create inbox for analytics results
SELECT tide.inbox_create('analytics_results');
-- Configure pipeline to consume from the results topic
SELECT tide.relay_set_inbox(
'analytics-from-flink',
'analytics_results',
'{
"source_type": "kafka",
"brokers": "kafka:9092",
"topic": "order-analytics",
"consumer_group": "pg-tide-analytics",
"auto_offset_reset": "earliest"
}'::jsonb
);
Step 6: Query Results
SELECT payload->>'region' as region,
(payload->>'total_revenue')::decimal as revenue,
(payload->>'order_count')::int as orders,
payload->>'window_end' as period
FROM tide.inbox_pending('analytics_results')
ORDER BY period DESC;
Key Takeaways
- pg_tide's Debezium wire format integrates seamlessly with Flink's debezium-json format
- The bidirectional flow (outbox → Kafka → Flink → Kafka → inbox) keeps PostgreSQL as the source of truth
- Flink provides windowed aggregations that would be expensive to compute in PostgreSQL
Further Reading
- Sinks: Kafka — Kafka sink configuration
- Sources: Kafka — Kafka source configuration
- Wire Format: Debezium — Debezium compatibility
Tutorial: Loading Data into a Data Lake
This tutorial demonstrates how to stream events from PostgreSQL into a data lake (S3/GCS with Apache Iceberg or Delta Lake format) for analytics. You'll set up a pipeline that continuously loads transactional data into a lakehouse architecture.
What You'll Build
PostgreSQL (transactions) → pg_tide → Object Storage (S3/GCS)
└── Iceberg/Delta tables
└── Query with Spark/Trino/DuckDB
Prerequisites
- PostgreSQL with pg_tide installed
- Object storage (AWS S3, GCS, or MinIO for local testing)
- A query engine (Trino, Spark, or DuckDB) for reading the lake
Step 1: Create the Outbox
SELECT tide.outbox_create('analytics_events');
Step 2: Configure Iceberg Sink
SELECT tide.relay_set_outbox(
'events-to-lake',
'analytics_events',
'{
"sink_type": "iceberg",
"catalog_type": "rest",
"catalog_uri": "http://iceberg-rest:8181",
"warehouse": "s3://my-lake/warehouse",
"namespace": "raw",
"table": "events",
"s3_endpoint": "https://s3.amazonaws.com",
"s3_region": "us-east-1",
"aws_access_key_id": "${env:AWS_ACCESS_KEY_ID}",
"aws_secret_access_key": "${env:AWS_SECRET_ACCESS_KEY}",
"partition_by": ["year", "month", "event_type"],
"commit_interval_seconds": 60
}'::jsonb
);
Alternative: Delta Lake
SELECT tide.relay_set_outbox(
'events-to-delta',
'analytics_events',
'{
"sink_type": "delta",
"table_uri": "s3://my-lake/delta/events",
"partition_columns": ["year", "month"],
"aws_access_key_id": "${env:AWS_ACCESS_KEY_ID}",
"aws_secret_access_key": "${env:AWS_SECRET_ACCESS_KEY}",
"target_file_size_mb": 128
}'::jsonb
);
Alternative: Raw Object Storage (Parquet)
SELECT tide.relay_set_outbox(
'events-to-parquet',
'analytics_events',
'{
"sink_type": "object_storage",
"provider": "s3",
"bucket": "my-lake",
"prefix": "raw/events/",
"format": "parquet",
"partition_template": "year={year}/month={month}/day={day}/",
"file_rotation_seconds": 300,
"file_rotation_rows": 100000,
"aws_access_key_id": "${env:AWS_ACCESS_KEY_ID}",
"aws_secret_access_key": "${env:AWS_SECRET_ACCESS_KEY}"
}'::jsonb
);
Step 3: Publish Events
-- Your application publishes events as part of normal transactions
BEGIN;
INSERT INTO orders (id, customer_id, total) VALUES ('ORD-001', 'CUST-42', 299.99);
SELECT tide.outbox_publish('analytics_events', 'orders', jsonb_build_object(
'event_type', 'order_created',
'order_id', 'ORD-001',
'customer_id', 'CUST-42',
'total', 299.99,
'year', extract(year from now())::int,
'month', extract(month from now())::int
));
COMMIT;
Step 4: Start the Relay
pg-tide --postgres-url "postgres://user:pass@localhost/mydb"
Step 5: Query the Lake
With Trino
SELECT event_type, count(*), sum(cast(json_extract_scalar(payload, '$.total') as decimal))
FROM iceberg.raw.events
WHERE year = 2024 AND month = 6
GROUP BY event_type;
With DuckDB
SELECT event_type, count(*), sum(CAST(payload->>'total' AS DECIMAL))
FROM read_parquet('s3://my-lake/raw/events/**/*.parquet')
GROUP BY event_type;
Choosing a Lake Format
| Format | Best For | Ecosystem |
|---|---|---|
| Iceberg | Large-scale analytics, schema evolution | Spark, Trino, Flink, Snowflake |
| Delta Lake | Databricks ecosystem, ACID transactions | Spark, Databricks, DuckDB |
| Parquet files | Simple analytics, maximum compatibility | Everything |
Key Considerations
- Commit interval: Controls how frequently data becomes queryable. 60s gives near-real-time; 300s reduces small file overhead.
- Partitioning: Partition by time (year/month/day) and optionally by event type for efficient pruning.
- File size: Target 128-256 MB files for optimal query performance.
Further Reading
- Sinks: Iceberg — Apache Iceberg configuration
- Sinks: Delta — Delta Lake configuration
- Sinks: Object Storage — Raw file storage
Tutorial: Microservice Event Bus
This tutorial shows how to use pg_tide as a reliable event bus between microservices. Instead of direct service-to-service HTTP calls (which create tight coupling and cascade failures), services publish events to their local outbox and subscribe to events from other services via inboxes.
Architecture
┌─────────────────┐ ┌──────────────────┐ ┌───────────────────┐
│ Order Service │ │ Payment Service │ │ Shipping Service │
│ │ │ │ │ │
│ outbox: orders │──┐ ┌──→│ inbox: payments │ ┌──→│ inbox: shipping │
└─────────────────┘ │ │ └──────────────────┘ │ └───────────────────┘
│ │ │
↓ │ │
┌─────────────┐ │
│ NATS │────────────────────────┘
│ (or Kafka) │
└─────────────┘
↑
│
┌─────────────────┐ │
│ Inventory Svc │ │
│ │ │
│ outbox: stock │──┘
└─────────────────┘
What You'll Build
- Order Service publishes order.created and order.cancelled events
- Payment Service subscribes to order events and processes payments
- Shipping Service subscribes to order events and initiates fulfillment
- Each service has its own PostgreSQL database with pg_tide
Step 1: Order Service Setup
-- In the order service's database
CREATE EXTENSION pg_tide;
SELECT tide.outbox_create('order_events');
Application code publishes events within the order transaction:
BEGIN;
INSERT INTO orders (id, customer_id, total, status)
VALUES ('ORD-001', 'CUST-42', 149.99, 'created');
SELECT tide.outbox_publish('order_events', 'orders', jsonb_build_object(
'event_type', 'order.created',
'order_id', 'ORD-001',
'customer_id', 'CUST-42',
'total', 149.99,
'items', jsonb_build_array(
jsonb_build_object('sku', 'WIDGET-A', 'qty', 2),
jsonb_build_object('sku', 'GADGET-B', 'qty', 1)
)
));
COMMIT;
Configure the relay to publish to NATS:
SELECT tide.relay_set_outbox(
'orders-to-nats',
'order_events',
'{
"sink_type": "nats",
"url": "nats://nats:4222",
"subject_template": "events.orders.{op}"
}'::jsonb
);
Step 2: Payment Service Setup
-- In the payment service's database
CREATE EXTENSION pg_tide;
SELECT tide.inbox_create('payment_triggers');
Configure an inbox pipeline that subscribes to order events:
SELECT tide.relay_set_inbox(
'orders-for-payments',
'payment_triggers',
'{
"source_type": "nats",
"url": "nats://nats:4222",
"subject": "events.orders.>",
"consumer_group": "payment-service",
"durable_name": "payment-service"
}'::jsonb
);
Process incoming events:
-- Payment service worker queries pending inbox messages
SELECT id, payload
FROM tide.inbox_pending('payment_triggers')
LIMIT 10;
-- After processing, mark as done
SELECT tide.inbox_mark_processed('payment_triggers', 42);
Step 3: Shipping Service Setup
-- In the shipping service's database
CREATE EXTENSION pg_tide;
SELECT tide.inbox_create('shipping_triggers');
SELECT tide.relay_set_inbox(
'orders-for-shipping',
'shipping_triggers',
'{
"source_type": "nats",
"url": "nats://nats:4222",
"subject": "events.orders.>",
"consumer_group": "shipping-service",
"durable_name": "shipping-service",
"transform": {
"filter": "payload.event_type == '"'"'order.created'"'"'"
}
}'::jsonb
);
The shipping service only processes order.created events (not cancellations) thanks to the transform filter.
Step 4: Start Relays
Each service runs its own relay instance:
# Order service relay
pg-tide --postgres-url "postgres://user:pass@orders-db/orders"
# Payment service relay
pg-tide --postgres-url "postgres://user:pass@payments-db/payments"
# Shipping service relay
pg-tide --postgres-url "postgres://user:pass@shipping-db/shipping"
Benefits of This Architecture
Loose coupling: Services don't know about each other. The order service publishes events without knowing who consumes them. New consumers can subscribe without changing the producer.
Reliability: The transactional outbox guarantees events are published if and only if the business transaction commits. No dual-write problems.
Independent scaling: Each service scales independently. The payment service can process events at its own pace without affecting the order service.
Resilience: If the payment service is down, events queue in NATS (with durable consumers) and are delivered when it recovers. No lost events, no cascading failures.
Further Reading
- Sinks: NATS — NATS JetStream configuration
- Sources: NATS — NATS subscription configuration
- Concepts: Message Guarantees — Delivery semantics
- Fan-Out Pattern — Multiple consumers for the same event stream
Tutorial: Debezium-Compatible CDC Replication
This tutorial shows how to use pg_tide as a Debezium-compatible CDC (Change Data Capture) source. If you're currently using Debezium to capture PostgreSQL changes and publish them to Kafka, pg_tide can replace the Debezium connector while producing identical message formats — giving you transactional outbox guarantees instead of WAL-based CDC.
Why Replace Debezium with pg_tide?
| Aspect | Debezium | pg_tide |
|---|---|---|
| Mechanism | Reads WAL (logical replication) | Transactional outbox (application-level) |
| Consistency | Eventually consistent (WAL delay) | Transactionally consistent |
| Schema changes | Can miss or break on DDL | Application controls the schema |
| Selectivity | Captures all row changes | Application chooses what to publish |
| Infrastructure | Kafka Connect cluster | Single relay binary |
| Message format | Debezium JSON/Avro | Same (via wire_format = "debezium") |
What You'll Build
A pipeline that publishes order changes in Debezium format to Kafka, compatible with existing Debezium consumers.
Step 1: Create the Outbox
CREATE EXTENSION pg_tide;
SELECT tide.outbox_create('cdc_events');
Step 2: Create a Trigger (Optional)
If you want automatic CDC (capture all changes without modifying application code), add a trigger:
CREATE OR REPLACE FUNCTION capture_order_changes() RETURNS trigger AS $$
BEGIN
IF TG_OP = 'INSERT' THEN
PERFORM tide.outbox_publish('cdc_events', 'orders',
jsonb_build_object(
'op', 'insert',
'new_row', row_to_json(NEW)::jsonb,
'old_row', null
)
);
ELSIF TG_OP = 'UPDATE' THEN
PERFORM tide.outbox_publish('cdc_events', 'orders',
jsonb_build_object(
'op', 'update',
'new_row', row_to_json(NEW)::jsonb,
'old_row', row_to_json(OLD)::jsonb
)
);
ELSIF TG_OP = 'DELETE' THEN
PERFORM tide.outbox_publish('cdc_events', 'orders',
jsonb_build_object(
'op', 'delete',
'new_row', null,
'old_row', row_to_json(OLD)::jsonb
)
);
END IF;
RETURN COALESCE(NEW, OLD);
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER orders_cdc
AFTER INSERT OR UPDATE OR DELETE ON orders
FOR EACH ROW EXECUTE FUNCTION capture_order_changes();
Step 3: Configure Debezium Wire Format
SELECT tide.relay_set_outbox(
'cdc-orders',
'cdc_events',
'{
"sink_type": "kafka",
"brokers": "kafka:9092",
"topic": "dbserver1.public.orders",
"wire_format": "debezium",
"wire_config": {
"server_name": "dbserver1",
"emit_tombstones": true,
"key_strategy": "primary_key"
}
}'::jsonb
);
The topic name dbserver1.public.orders follows Debezium's naming convention: {server_name}.{schema}.{table}.
Step 4: Start the Relay
pg-tide --postgres-url "postgres://user:pass@localhost/mydb"
Step 5: Verify Consumer Compatibility
Your existing Debezium consumers should work without changes. The messages have the same shape:
{
"schema": { ... },
"payload": {
"before": null,
"after": {
"id": 1,
"customer_id": "CUST-42",
"total": 149.99,
"status": "created"
},
"op": "c",
"ts_ms": 1714029482000,
"source": {
"version": "pg-tide",
"connector": "postgresql",
"name": "dbserver1",
"ts_ms": 1714029482000,
"db": "mydb",
"schema": "public",
"table": "orders"
}
}
}
The only visible difference: source.version says "pg-tide" instead of a Debezium version number.
Migration Strategy
- Run in parallel: Deploy pg_tide alongside Debezium, publishing to a test topic
- Compare output: Verify messages are compatible with your consumers
- Switch consumers: Point consumers to the pg_tide topic
- Decommission Debezium: Remove the Debezium connector and Kafka Connect cluster
With Schema Registry (Avro)
For Avro-encoded Debezium messages:
SELECT tide.relay_set_outbox(
'cdc-orders-avro',
'cdc_events',
'{
"sink_type": "kafka",
"brokers": "kafka:9092",
"topic": "dbserver1.public.orders",
"wire_format": "debezium",
"wire_config": {
"server_name": "dbserver1",
"envelope": "avro"
},
"schema_registry": {
"url": "http://schema-registry:8081"
}
}'::jsonb
);
Further Reading
- Wire Format: Debezium — Complete Debezium format reference
- Schema Registry — Avro + Schema Registry integration
- Sinks: Kafka — Kafka sink configuration
Tutorial: Singer/Meltano ETL Pipelines
This tutorial shows how to use pg_tide with Singer taps and targets to build ETL pipelines. You'll extract data from a SaaS API (HubSpot), load it into PostgreSQL, transform it, and export results to a data warehouse.
What You'll Build
HubSpot API → tap-hubspot → pg_tide inbox → Transform → pg_tide outbox → target-snowflake
Prerequisites
- PostgreSQL with pg_tide installed
- Python 3.8+ (for Singer taps/targets)
- A HubSpot account with API access (or substitute any Singer tap)
Step 1: Install Singer Tap
pip install tap-hubspot
Step 2: Configure Extraction into pg_tide
CREATE EXTENSION pg_tide;
SELECT tide.inbox_create('hubspot_contacts');
SELECT tide.relay_set_inbox(
'hubspot-extraction',
'hubspot_contacts',
'{
"source_type": "singer",
"tap_command": "tap-hubspot",
"tap_config": {
"api_key": "${env:HUBSPOT_API_KEY}",
"start_date": "2024-01-01T00:00:00Z"
},
"stream_filter": ["contacts", "companies", "deals"],
"state_persistence": true
}'::jsonb
);
Step 3: Start the Relay
export HUBSPOT_API_KEY="your-api-key"
pg-tide --postgres-url "postgres://user:pass@localhost/mydb"
The relay runs the tap, captures its output, and writes records into the inbox. STATE messages are persisted for incremental syncs.
Step 4: Process Inbox Data
Query the inbox to see extracted records:
SELECT id, payload->>'email' as email, payload->>'company' as company
FROM tide.inbox_pending('hubspot_contacts')
WHERE payload->>'stream' = 'contacts'
LIMIT 10;
Process and transform:
-- Create a materialized view for analytics
CREATE MATERIALIZED VIEW contact_summary AS
SELECT
payload->>'company' as company,
count(*) as contact_count,
max((payload->>'last_activity_date')::date) as last_active
FROM tide.inbox_all('hubspot_contacts')
WHERE payload->>'stream' = 'contacts'
GROUP BY payload->>'company';
-- Mark processed
SELECT tide.inbox_mark_processed('hubspot_contacts', id)
FROM tide.inbox_pending('hubspot_contacts');
Step 5: Export to Data Warehouse
Create an outbox for warehouse loading:
SELECT tide.outbox_create('warehouse_events');
-- Publish transformed data
SELECT tide.outbox_publish('warehouse_events', 'contacts', jsonb_build_object(
'company', company,
'contact_count', contact_count,
'last_active', last_active
))
FROM contact_summary;
Configure Singer target export:
SELECT tide.relay_set_outbox(
'to-snowflake',
'warehouse_events',
'{
"sink_type": "singer",
"target_command": "target-snowflake",
"target_config": {
"account": "${env:SNOWFLAKE_ACCOUNT}",
"user": "${env:SNOWFLAKE_USER}",
"password": "${env:SNOWFLAKE_PASSWORD}",
"database": "ANALYTICS",
"schema": "RAW"
}
}'::jsonb
);
Step 6: Schedule Incremental Syncs
Since pg_tide persists Singer STATE, each run only extracts new/changed records. Schedule periodic syncs with cron or a workflow orchestrator:
# Run every hour - only extracts changes since last run
0 * * * * pg-tide --postgres-url "..." --run-once
Or keep the relay running continuously — it will re-run the tap at configurable intervals.
Available Taps (Examples)
| Tap | Data Source | Install |
|---|---|---|
| tap-hubspot | HubSpot CRM | pip install tap-hubspot |
| tap-salesforce | Salesforce | pip install tap-salesforce |
| tap-stripe | Stripe payments | pip install tap-stripe |
| tap-github | GitHub repos | pip install tap-github |
| tap-postgres | PostgreSQL DB | pip install tap-postgres |
| tap-mysql | MySQL DB | pip install tap-mysql |
| tap-google-analytics | GA4 | pip install tap-google-analytics |
| tap-shopify | Shopify | pip install tap-shopify |
Browse 500+ taps at hub.meltano.com.
Further Reading
- Sources: Singer — Singer source configuration
- Sinks: Singer — Singer target configuration
- Singer Protocol — Protocol details
Tutorial: Notification Fan-Out
This tutorial demonstrates how to fan out events from a single outbox to multiple notification channels simultaneously — Slack for team alerts, PagerDuty for on-call escalation, email via webhook, and a message archive.
What You'll Build
┌──→ Slack (#alerts channel)
│
PostgreSQL → pg_tide ──┼──→ PagerDuty (critical only)
(incidents) │
├──→ Email webhook (all incidents)
│
└──→ Kafka (archive + analytics)
Step 1: Create the Outbox
CREATE EXTENSION pg_tide;
SELECT tide.outbox_create('incident_events');
Step 2: Application Publishes Events
BEGIN;
INSERT INTO incidents (id, title, severity, team, description)
VALUES ('INC-042', 'Database latency spike', 'critical', 'platform', 'P99 > 5s');
SELECT tide.outbox_publish('incident_events', 'incidents', jsonb_build_object(
'event_type', 'incident.created',
'incident_id', 'INC-042',
'title', 'Database latency spike',
'severity', 'critical',
'team', 'platform',
'description', 'P99 latency exceeded 5s threshold'
));
COMMIT;
Step 3: Configure Multiple Pipelines
Pipeline 1: Slack (all incidents)
SELECT tide.relay_set_outbox(
'incidents-to-slack',
'incident_events',
'{
"sink_type": "slack",
"webhook_url": "${env:SLACK_ALERTS_WEBHOOK}",
"channel": "#incidents",
"transform": {
"payload": "{ text: join('"'"''"'"', ['"'"'🚨 *'"'"', payload.title, '"'"'* ('"'"', payload.severity, '"'"')\\nTeam: '"'"', payload.team]) }"
}
}'::jsonb
);
Pipeline 2: PagerDuty (critical only)
SELECT tide.relay_set_outbox(
'incidents-to-pagerduty',
'incident_events',
'{
"sink_type": "pagerduty",
"routing_key": "${env:PAGERDUTY_ROUTING_KEY}",
"transform": {
"filter": "payload.severity == '"'"'critical'"'"'"
}
}'::jsonb
);
Pipeline 3: Email webhook (all incidents)
SELECT tide.relay_set_outbox(
'incidents-to-email',
'incident_events',
'{
"sink_type": "webhook",
"url": "https://api.sendgrid.com/v3/mail/send",
"method": "POST",
"headers": {
"Authorization": "Bearer ${env:SENDGRID_API_KEY}"
},
"transform": {
"payload": "{ personalizations: [{ to: [{ email: '"'"'oncall@company.com'"'"' }] }], from: { email: '"'"'alerts@company.com'"'"' }, subject: join('"'"''"'"', ['"'"'['"'"', payload.severity, '"'"'] '"'"', payload.title]), content: [{ type: '"'"'text/plain'"'"', value: payload.description }] }"
}
}'::jsonb
);
Pipeline 4: Kafka archive (all incidents)
SELECT tide.relay_set_outbox(
'incidents-to-archive',
'incident_events',
'{
"sink_type": "kafka",
"brokers": "kafka:9092",
"topic": "incidents-archive",
"routing": {
"rules": [
{ "match_field": "severity", "match_value": "critical", "subject": "incidents.critical" },
{ "match_field": "severity", "match_value": "warning", "subject": "incidents.warning" }
],
"default_template": "incidents.info"
}
}'::jsonb
);
Step 4: Start the Relay
pg-tide --postgres-url "postgres://user:pass@localhost/mydb"
All four pipelines run independently. A single outbox event fans out to four destinations.
How Fan-Out Works
Each pipeline is independent:
- They each have their own consumer position (last relayed outbox ID), as the query below shows
- They process at their own pace
- If one sink is slow or down, others continue unaffected
- Each can have different transforms, filters, and routing rules
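The independent positions are visible in the same consumer-lag view used elsewhere in this guide:
-- One row per pipeline consumer; lag shows how far each destination trails.
SELECT group_name, committed_offset, lag
FROM tide.consumer_lag
WHERE outbox_name = 'incident_events'
ORDER BY lag DESC;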
Rate Limiting Notifications
To avoid notification storms, add rate limiting to chatty channels:
-- Limit Slack to 1 message per second
SELECT tide.relay_set_outbox(
'incidents-to-slack',
'incident_events',
'{
"sink_type": "slack",
"webhook_url": "${env:SLACK_ALERTS_WEBHOOK}",
"rate_limit": {
"enabled": true,
"max_messages_per_second": 1,
"burst_size": 5
}
}'::jsonb
);
Further Reading
- Sinks: Slack — Slack configuration
- Sinks: PagerDuty — PagerDuty configuration
- Feature: Routing — Content-based routing
- Feature: Transforms — Message filtering and reshaping
- Feature: Rate Limiting — Controlling notification rate
Tutorial: Cross-Region Event Replication
This tutorial shows how to replicate events between PostgreSQL instances in different geographic regions using pg_tide. Each region maintains its own outbox and inbox, with NATS (or Kafka) as the cross-region transport.
Architecture
Region US-East Region EU-West
┌───────────────────┐ ┌───────────────────┐
│ PostgreSQL (US) │ │ PostgreSQL (EU) │
│ │ │ │
│ outbox: orders │──→ pg_tide ──┐ │ inbox: orders │
│ inbox: orders │ │ │ outbox: orders │──→ pg_tide ──┐
└───────────────────┘ │ └───────────────────┘ │
│ │
↓ ↓
┌─────────────┐ ┌─────────────┐
│ NATS (US) │←── NATS Gateway ──────→│ NATS (EU) │
└─────────────┘ └─────────────┘
↑ ↑
│ │
pg_tide ──→ inbox (US) pg_tide ──→ inbox (EU)
What You'll Build
- Orders created in US-East are replicated to EU-West within seconds
- Orders created in EU-West are replicated to US-East
- Each region can operate independently during network partitions
- The inbox provides deduplication for messages that arrive during recovery
Step 1: Setup US-East Region
-- US-East PostgreSQL
CREATE EXTENSION pg_tide;
SELECT tide.outbox_create('order_events');
SELECT tide.inbox_create('replicated_orders');
Publish pipeline (US → NATS):
SELECT tide.relay_set_outbox(
'us-orders-publish',
'order_events',
'{
"sink_type": "nats",
"url": "nats://nats-us:4222",
"subject_template": "orders.{op}.us-east",
"stream": "ORDERS"
}'::jsonb
);
Subscribe pipeline (NATS → US inbox, for EU-originated events):
SELECT tide.relay_set_inbox(
'eu-orders-subscribe',
'replicated_orders',
'{
"source_type": "nats",
"url": "nats://nats-us:4222",
"subject": "orders.*.eu-west",
"stream": "ORDERS",
"consumer_group": "us-east-replica",
"durable_name": "us-east-replica"
}'::jsonb
);
Step 2: Setup EU-West Region
-- EU-West PostgreSQL
CREATE EXTENSION pg_tide;
SELECT tide.outbox_create('order_events');
SELECT tide.inbox_create('replicated_orders');
Publish pipeline (EU → NATS):
SELECT tide.relay_set_outbox(
'eu-orders-publish',
'order_events',
'{
"sink_type": "nats",
"url": "nats://nats-eu:4222",
"subject_template": "orders.{op}.eu-west",
"stream": "ORDERS"
}'::jsonb
);
Subscribe pipeline (NATS → EU inbox, for US-originated events):
SELECT tide.relay_set_inbox(
'us-orders-subscribe',
'replicated_orders',
'{
"source_type": "nats",
"url": "nats://nats-eu:4222",
"subject": "orders.*.us-east",
"stream": "ORDERS",
"consumer_group": "eu-west-replica",
"durable_name": "eu-west-replica"
}'::jsonb
);
Step 3: Configure NATS Gateway
NATS supports multi-region replication via Gateway connections:
# nats-us.conf
gateway {
name: "us-east"
listen: "0.0.0.0:7222"
gateways: [
{ name: "eu-west", urls: ["nats://nats-eu:7222"] }
]
}
# nats-eu.conf
gateway {
name: "eu-west"
listen: "0.0.0.0:7222"
gateways: [
{ name: "us-east", urls: ["nats://nats-us:7222"] }
]
}
Step 4: Application Usage
In US-East:
BEGIN;
INSERT INTO orders (id, region, customer_id, total)
VALUES ('ORD-US-001', 'us-east', 'CUST-42', 149.99);
SELECT tide.outbox_publish('order_events', 'orders', jsonb_build_object(
'order_id', 'ORD-US-001',
'region', 'us-east',
'customer_id', 'CUST-42',
'total', 149.99
));
COMMIT;
This event automatically replicates to EU-West via: outbox → NATS (US) → NATS Gateway → NATS (EU) → inbox (EU).
Handling Network Partitions
During a network partition between regions:
- Each region continues operating independently (local outbox/inbox works fine)
- Cross-region messages queue in the local NATS JetStream
- When connectivity restores, NATS gateways resync automatically
- The inbox's deduplication prevents double-processing of any message (see the sketch below)
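A minimal sketch of why redelivery after a partition is harmless, assuming the inbox table follows the tide.<name>_inbox layout shown earlier and the relay inserts with conflict handling:
-- A duplicate delivery of the same event_id is silently skipped.
INSERT INTO tide.replicated_orders_inbox (event_id, source, payload)
VALUES ('ORD-US-001-created', 'us-east', '{"order_id": "ORD-US-001"}'::jsonb)
ON CONFLICT (event_id) DO NOTHING;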
Conflict Resolution
For bidirectional replication, you need a conflict resolution strategy. Options:
- Last-writer-wins: Include timestamps, take the latest (sketched after this list)
- Region priority: US-East wins ties
- Merge: Application-specific merge logic in the inbox processor
- CRDT-style: Use commutative operations (add-only sets, counters)
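A last-writer-wins sketch, assuming orders.id is the primary key and that both the table and the payload carry an updated_at timestamp (an assumption beyond the earlier example):
-- Upsert replicated orders, keeping whichever write is newest.
-- Assumes at most one pending row per order id in the batch.
INSERT INTO orders (id, region, customer_id, total, updated_at)
SELECT payload->>'order_id',
       payload->>'region',
       payload->>'customer_id',
       (payload->>'total')::numeric,
       (payload->>'updated_at')::timestamptz
FROM tide.replicated_orders_inbox
WHERE processed_at IS NULL
ON CONFLICT (id) DO UPDATE
SET total      = EXCLUDED.total,
    updated_at = EXCLUDED.updated_at
WHERE orders.updated_at < EXCLUDED.updated_at;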
Further Reading
- Sinks: NATS — NATS JetStream configuration
- Sources: NATS — NATS subscription configuration
- Bidirectional Sync — Two-way sync patterns
- HA Coordination — Per-region relay HA
Migrating to pg_tide
This guide covers migrating from common messaging and CDC solutions to pg_tide. Whether you're moving from Debezium, a custom outbox implementation, application-level message publishing, or another event streaming platform, this guide provides a path.
From Debezium
What Changes
| Aspect | Debezium | pg_tide |
|---|---|---|
| CDC mechanism | WAL logical replication | Transactional outbox |
| Infrastructure | Kafka Connect cluster | Single relay binary |
| Configuration | Connector JSON via REST API | SQL catalog + relay process |
| Message format | Debezium JSON/Avro | Same (use wire_format = "debezium") |
| Schema changes | Automatic (WAL captures all) | Application explicitly publishes |
| Filtering | SMTs (Single Message Transforms) | JMESPath transforms |
Migration Steps
1. Install the pg_tide extension:
CREATE EXTENSION pg_tide;
2. Create an outbox for each captured table:
SELECT tide.outbox_create('orders_cdc');
3. Add triggers or modify the application to publish events:
-- Option A: Trigger (captures all changes automatically)
CREATE TRIGGER orders_cdc
AFTER INSERT OR UPDATE OR DELETE ON orders
FOR EACH ROW EXECUTE FUNCTION your_cdc_trigger();
-- Option B: Application publishes explicitly (recommended)
SELECT tide.outbox_publish('orders_cdc', 'orders', your_payload);
4. Configure the pipeline with the Debezium wire format:
SELECT tide.relay_set_outbox('orders-cdc', 'orders_cdc', '{
  "sink_type": "kafka",
  "brokers": "kafka:9092",
  "topic": "dbserver1.public.orders",
  "wire_format": "debezium",
  "wire_config": { "server_name": "dbserver1" }
}'::jsonb);
5. Run in parallel: Deploy pg_tide alongside Debezium, publishing to a separate topic. Compare output.
6. Switch consumers: Once validated, point consumers to the pg_tide topic.
7. Decommission Debezium: Remove Kafka Connect connectors.
Key Differences to Communicate to Your Team
- Consumers see identical Debezium-format messages (no consumer changes needed)
- Events are now guaranteed to be published if and only if the transaction commits
- You choose what to publish (vs. Debezium capturing everything)
- No more "snapshot" mode — data is published at application time
From Custom Outbox Implementations
Many teams have hand-built outbox tables with cron jobs or application-level polling. Migrating to pg_tide replaces the custom infrastructure with a maintained, optimized solution.
Typical Custom Outbox
-- Your existing custom outbox table
CREATE TABLE outbox (
id BIGSERIAL PRIMARY KEY,
event_type TEXT,
payload JSONB,
published BOOLEAN DEFAULT FALSE,
created_at TIMESTAMPTZ DEFAULT now()
);
Migration Steps
1. Create a pg_tide outbox:
SELECT tide.outbox_create('events');
2. Backfill unpublished messages (if needed):
INSERT INTO tide.tide_outbox_messages (outbox_name, stream_table, payload)
SELECT 'events', event_type, payload
FROM outbox
WHERE published = FALSE;
3. Update application code:
-- Before:
INSERT INTO outbox (event_type, payload) VALUES (...);
-- After:
SELECT tide.outbox_publish('events', 'orders', '{"order_id": "..."}'::jsonb);
4. Configure the relay pipeline:
SELECT tide.relay_set_outbox('events-pipeline', 'events', '{ "sink_type": "your-sink", ... }'::jsonb);
5. Remove old polling infrastructure: Delete cron jobs, background workers, custom retry logic.
From Direct Message Publishing
If your application publishes directly to Kafka/NATS/RabbitMQ (without an outbox), you have a dual-write problem — the database write and message publish can become inconsistent.
The Dual-Write Problem
BEGIN;
INSERT INTO orders (...); -- ✓ succeeds
COMMIT;
publish_to_kafka(...); -- ✗ fails (network error)
-- Order exists but event was never published!
Migration to Transactional Outbox
BEGIN;
INSERT INTO orders (...);
SELECT tide.outbox_publish('order_events', 'orders', ...);
COMMIT;
-- Both succeed or both fail. pg_tide relay handles delivery.
Steps
- Install pg_tide and create an outbox
- Replace direct publish calls with tide.outbox_publish() inside the transaction
- Configure the relay to deliver to the same broker
- Remove direct publishing code and client libraries
From RabbitMQ / ActiveMQ
If you're using a traditional message broker and want to move to pg_tide:
- Keep the broker as the transport — pg_tide publishes to RabbitMQ/AMQP
- Replace the producer — use transactional outbox instead of direct publish
- Consumers stay the same — they still read from the same queues
SELECT tide.relay_set_outbox('orders-to-rabbit', 'order_events', '{
"sink_type": "rabbitmq",
"url": "amqp://guest:guest@rabbitmq:5672",
"exchange": "orders",
"routing_key": "order.created"
}'::jsonb);
Rollback Plan
If you need to roll back the migration:
- pg_tide outbox tables remain in your database (no data loss)
- Re-enable your previous publishing mechanism
- Disable the pg_tide relay pipeline: UPDATE tide.relay_outbox_config SET enabled = false WHERE name = 'your-pipeline';
- The extension can be dropped if no longer needed: DROP EXTENSION pg_tide CASCADE;
Further Reading
- Tutorial: Debezium-Compatible Replication — Step-by-step Debezium replacement
- Concepts: Transactional Outbox — Why this pattern works
- Architecture — How pg_tide works internally
Security Guide
This guide covers security best practices for deploying pg_tide in production, including secret management, network security, authentication, and access control.
Secret Management
Environment Variable Substitution
pg_tide supports ${env:VARIABLE_NAME} syntax in pipeline configurations. Secrets are resolved at runtime from the relay process's environment — they never appear in the PostgreSQL catalog:
SELECT tide.relay_set_outbox('my-pipeline', 'events', '{
"sink_type": "kafka",
"brokers": "${env:KAFKA_BROKERS}",
"sasl_username": "${env:KAFKA_USER}",
"sasl_password": "${env:KAFKA_PASS}"
}'::jsonb);
The catalog stores the ${env:...} tokens, not the resolved values. The relay resolves them at startup.
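You can verify this from the catalog itself. The config column name is an assumption here; adjust it to your catalog's actual layout:
-- The stored value is the literal token, not the secret.
SELECT name, config->>'sasl_password' AS stored_value
FROM tide.relay_outbox_config
WHERE name = 'my-pipeline';
-- stored_value: ${env:KAFKA_PASS}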
File-Based Secrets
For secrets stored on disk (Kubernetes mounted secrets, vault agent files):
{
"password": "${file:/run/secrets/db-password}"
}
The relay reads the file content, trims whitespace, and substitutes the value.
Best Practices
- Never hardcode secrets in pipeline configurations
- Use Kubernetes Secrets mounted as environment variables or files
- Rotate secrets regularly — update the environment/file and the relay picks up new values on restart
- Use separate credentials per pipeline when possible (principle of least privilege)
- Restrict access to the tide schema — only the application and relay need access
Database Access Control
Principle of Least Privilege
Create dedicated roles for different access patterns:
-- Application role: can publish to outbox and read inbox
CREATE ROLE app_writer;
GRANT USAGE ON SCHEMA tide TO app_writer;
GRANT EXECUTE ON FUNCTION tide.outbox_publish TO app_writer;
GRANT EXECUTE ON FUNCTION tide.inbox_mark_processed TO app_writer;
GRANT SELECT ON tide.inbox_pending TO app_writer;
-- Relay role: needs full access to catalog and outbox/inbox tables
CREATE ROLE relay_worker;
GRANT USAGE ON SCHEMA tide TO relay_worker;
GRANT ALL ON ALL TABLES IN SCHEMA tide TO relay_worker;
GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA tide TO relay_worker;
-- Read-only monitoring role
CREATE ROLE monitor;
GRANT USAGE ON SCHEMA tide TO monitor;
GRANT SELECT ON tide.outbox_status TO monitor;
GRANT SELECT ON tide.relay_dlq TO monitor;
Connection Security
Always use TLS for PostgreSQL connections:
pg-tide --postgres-url "postgres://relay:pass@db:5432/mydb?sslmode=require"
For strict certificate verification:
pg-tide --postgres-url "postgres://relay:pass@db:5432/mydb?sslmode=verify-full&sslrootcert=/certs/ca.pem"
Network Security
Relay Process
The relay exposes two network endpoints:
- Metrics endpoint (default :9090) — Prometheus metrics and health check
- Webhook receiver (if configured) — Incoming webhooks
Secure them:
- Bind metrics to the internal network only: --metrics-addr "10.0.0.0:9090"
- Use network policies (Kubernetes) to restrict access
- Never expose metrics to the public internet
Sink Connections
- Use TLS for all sink connections (Kafka, NATS, HTTP, cloud services)
- Use SASL/mTLS for Kafka when available
- Verify certificates — don't disable TLS verification in production
- Use private endpoints for cloud services (AWS PrivateLink, GCP Private Service Connect)
Kubernetes Network Policies
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: pg-tide-relay
spec:
podSelector:
matchLabels:
app: pg-tide-relay
policyTypes:
- Ingress
- Egress
ingress:
- from:
- podSelector:
matchLabels:
app: prometheus
ports:
- port: 9090
egress:
- to:
- podSelector:
matchLabels:
app: postgres
ports:
- port: 5432
- to: # Allow outbound to sinks
- namespaceSelector: {}
Webhook Security
Outgoing Webhooks
Sign outgoing webhooks so recipients can verify authenticity:
{
"sink_type": "webhook",
"url": "https://partner.example.com/events",
"signature": {
"scheme": "hmac-sha256",
"secret": "${env:WEBHOOK_SECRET}",
"header": "X-Signature-256"
}
}
Incoming Webhooks
Always verify incoming webhook signatures:
{
"source_type": "webhook",
"signature_scheme": "stripe",
"signature_secret": "${env:STRIPE_WEBHOOK_SECRET}"
}
Reject unsigned requests. See Webhook Signatures.
Audit Trail
pg_tide maintains a natural audit trail:
- Every published event has a sequential ID, timestamp, and stream table
- The DLQ records all delivery failures with error details
- Relay logs show all pipeline activity
For compliance, ensure:
- Relay logs are shipped to a centralized logging system
- DLQ entries are reviewed within your SLA
- Outbox tables have appropriate retention policies (example below)
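Retention is set per outbox at creation time, as elsewhere in this guide (the outbox name here is illustrative):
-- Keep published events for 7 days (168 hours) to cover a weekly review SLA.
SELECT tide.outbox_create('audited_events', p_retention_hours := 168);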
Common Vulnerabilities to Avoid
| Risk | Mitigation |
|---|---|
| Secrets in catalog | Use ${env:...} substitution |
| Unencrypted connections | Enforce sslmode=require or verify-full |
| Open metrics endpoint | Bind to internal network, use network policies |
| Excessive permissions | Use dedicated roles with minimal grants |
| Unsigned webhooks | Always configure signature verification |
| Stale credentials | Implement secret rotation procedures |
Further Reading
- Webhook Signatures — Signature schemes
- Deployment Guide — Production deployment
- Reference: Security — Extension security model
pg-trickle Integration
pg-trickle is the PostgreSQL extension that pg_tide was originally extracted from. If you already use pg-trickle, you can adopt pg_tide incrementally — or run both side-by-side during a migration.
When to use pg_tide instead of pg-trickle
| Situation | Recommendation |
|---|---|
| Starting a new project | Use pg_tide — it is the focused, standalone successor |
| Existing pg-trickle installation | Migrate when ready; pg_tide's schema is compatible |
| Need stream tables (pg_trickle_streams) | Stay on pg-trickle for now |
| Need transactional outbox + inbox only | pg_tide covers this fully |
Schema compatibility
pg_tide uses a tide.* schema prefix. pg-trickle uses pg_trickle_* table
names. The two schemas can coexist in the same database without conflict.
Migrating from pg-trickle
1. Install pg_tide alongside pg-trickle
CREATE EXTENSION pg_tide;
2. Create matching outboxes and inboxes
For each outbox registered in your existing pg-trickle configuration:
SELECT tide.outbox_create(outbox_name, retention_hours)
FROM pg_trickle_outbox_config;
3. Migrate pending messages
INSERT INTO tide.tide_outbox_messages (outbox_name, payload, headers, created_at)
SELECT stream_name, payload, headers, created_at
FROM pg_trickle_outbox_messages
WHERE consumed_at IS NULL;
4. Point the relay at pg_tide
Update your pg-tide relay configuration:
SELECT tide.relay_set_outbox(
'my-pipeline',
'orders',
'nats',
'{"url":"nats://broker:4222"}'::jsonb
);
5. Verify and cut over
Run both relays in parallel during the transition. Once the pg_tide relay is processing all new messages, decommission the pg-trickle relay.
Using both together
Both extensions can write to the same NATS subject or Kafka topic — consumers
should use the x-source header to distinguish messages originating from
pg_tide versus pg-trickle.
pg_tide stamps every envelope with:
{
"id": 42,
"outbox_name": "orders",
"payload": { ... },
"headers": { "x-source": "pg-tide", "x-outbox": "orders" }
}
dbt Integration
dbt (data build tool) transforms data inside
PostgreSQL using SQL SELECT statements. pg_tide can bridge the gap between
dbt-managed tables and external consumers: any dbt model that writes to a table
can trigger an outbox publish, and inbound events can feed into a staging table
that dbt reads.
Pattern 1 — Publish dbt model output to the outbox
After a dbt model run completes, use a PostgreSQL trigger or an explicit
CALL/SELECT to publish new rows to the outbox:
Trigger-based approach
-- Fired automatically whenever dbt inserts into the target table.
CREATE OR REPLACE FUNCTION public.on_customer_upsert()
RETURNS TRIGGER LANGUAGE plpgsql AS $$
BEGIN
PERFORM tide.outbox_publish(
'customers',
row_to_json(NEW)::jsonb,
'{"x-op": "upsert", "x-model": "dim_customers"}'::jsonb
);
RETURN NEW;
END;
$$;
CREATE TRIGGER publish_customer_events
AFTER INSERT OR UPDATE ON public.dim_customers
FOR EACH ROW EXECUTE FUNCTION public.on_customer_upsert();
Post-hook approach (dbt project.yml)
# dbt_project.yml
models:
my_project:
dim_customers:
+post-hook: >
INSERT INTO tide.tide_outbox_messages (outbox_name, payload, headers)
SELECT
'customers',
row_to_json(t)::jsonb,
'{}'::jsonb
FROM {{ this }} t
WHERE updated_at > '{{ run_started_at }}'
Pattern 2 — Read inbox events in a dbt source
Configure the inbox table as a dbt source so that transformation models can join against inbound events:
# sources.yaml
sources:
- name: tide
schema: tide
tables:
- name: payments_inbox
description: "Inbound payment events from Stripe via the pg_tide relay"
columns:
- name: event_id
description: "Globally unique event identifier (dedup key)"
- name: payload
description: "Event payload as JSONB"
- name: received_at
description: "Timestamp when the event arrived in the inbox"
Then reference it in a dbt model:
-- models/stg_payments.sql
SELECT
event_id,
payload ->> 'payment_id' AS payment_id,
(payload ->> 'amount_cents')::int AS amount_cents,
payload ->> 'currency' AS currency,
received_at
FROM {{ source('tide', 'payments_inbox') }}
WHERE processed_at IS NOT NULL
Pattern 3 — Event-driven dbt runs
Use pg_notify to trigger a dbt run when new outbox messages arrive:
-- Notify an external listener whenever the outbox receives a new message.
CREATE OR REPLACE FUNCTION tide.notify_dbt_trigger() RETURNS TRIGGER LANGUAGE plpgsql AS $$
BEGIN
PERFORM pg_notify('dbt_run_trigger', NEW.outbox_name);
RETURN NEW;
END;
$$;
CREATE TRIGGER dbt_trigger
AFTER INSERT ON tide.tide_outbox_messages
FOR EACH ROW EXECUTE FUNCTION tide.notify_dbt_trigger();
A lightweight listener process (e.g. a Python script using psycopg2) can
then invoke dbt run --select <model> on demand.
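From the database side, any long-lived session can subscribe to the channel to watch the notifications flow:
-- Subscribe in psql to observe the trigger payloads.
LISTEN dbt_run_trigger;
-- Each new outbox message now delivers an async notification carrying the
-- outbox name; the external listener reacts by invoking dbt.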
Best practices
- Idempotency: dbt runs must be idempotent. Use ON CONFLICT DO NOTHING or surrogate keys when writing to tables that publish to the outbox.
- Dedup keys: Set event_id in the payload to a deterministic key (e.g. {{ dbt_utils.generate_surrogate_key(['order_id', 'updated_at']) }}) so that the inbox deduplicates re-runs correctly.
- Schema versioning: Add a _schema_version field to every outbox payload so downstream consumers can handle schema evolution gracefully.
CloudNativePG Integration
CloudNativePG (CNPG) is the most popular PostgreSQL operator for Kubernetes. pg_tide integrates naturally with CNPG: the relay runs as a sidecar container alongside each PostgreSQL pod, sharing the same lifecycle and connection string.
Quick start
See the ready-to-use manifest in examples/cnpg/cluster.yaml.
Custom PostgreSQL image
CNPG uses container images for PostgreSQL. To bundle pg_tide, extend the official CNPG image:
FROM ghcr.io/cloudnative-pg/postgresql:18
# Copy the compiled extension files.
COPY pg_tide.so /usr/lib/postgresql/18/lib/
COPY pg_tide.control /usr/share/postgresql/18/extension/
COPY sql/pg_tide--0.1.0.sql /usr/share/postgresql/18/extension/
Build and push to your container registry, then reference it in the CNPG
Cluster spec:
spec:
imageName: ghcr.io/your-org/postgres-pg-tide:18
Sidecar pattern
The relay runs as a sidecar in the same pod as PostgreSQL. It connects to
localhost:5432 via the injected *-app secret that CNPG creates automatically:
spec:
sidecars:
- name: pg-tide-relay
image: ghcr.io/trickle-labs/pg-tide:0.1.0
env:
- name: PG_TIDE_RELAY_POSTGRES_URL
valueFrom:
secretKeyRef:
name: my-cluster-app # CNPG-generated secret
key: uri
Because the relay runs in the same pod, it connects over the loopback interface with zero network latency — ideal for high-throughput workloads.
High availability
CNPG manages primary/replica failover automatically. The relay on the primary pod holds PostgreSQL advisory locks on each pipeline. When a failover occurs:
- The primary pod is terminated → advisory locks are released.
- CNPG promotes a replica to primary.
- The relay sidecar on the new primary pod starts and acquires the locks.
- Message delivery resumes from the last committed consumer offset.
No messages are lost. In-flight messages from the old primary are re-delivered because the consumer offset was not yet advanced.
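You can watch the lock handoff during a failover from any session:
-- Advisory locks held by the active relay; after failover the pid changes.
SELECT locktype, classid, objid, pid, granted
FROM pg_locks
WHERE locktype = 'advisory';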
Initialisation SQL
Bootstrap pg_tide when CNPG creates the cluster:
spec:
bootstrap:
initdb:
database: app
owner: app
postInitSQL:
- CREATE EXTENSION IF NOT EXISTS pg_tide;
- CREATE ROLE relay_user LOGIN PASSWORD 'strong-password';
- GRANT USAGE ON SCHEMA tide TO relay_user;
- GRANT SELECT, INSERT ON tide.tide_outbox_messages TO relay_user;
- GRANT SELECT ON ALL TABLES IN SCHEMA tide TO relay_user;
Prometheus monitoring
CNPG integrates with the Prometheus Operator. Add a ServiceMonitor to scrape
relay metrics alongside the built-in CNPG metrics:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: pg-tide-relay
spec:
selector:
matchLabels:
cnpg.io/cluster: my-cluster
endpoints:
- port: relay-metrics
path: /metrics
interval: 15s
PgBouncer Integration
PgBouncer is the standard PostgreSQL connection pooler. pg_tide's relay can run comfortably behind PgBouncer with a couple of small configuration choices.
Which pool mode to use
| Pool mode | Advisory locks | LISTEN/NOTIFY | pg_tide compatibility |
|---|---|---|---|
| Session | ✅ Work | ✅ Work | ✅ Recommended |
| Transaction | ❌ Released between statements | ❌ Do not work | ⚠️ Not recommended for the relay |
| Statement | ❌ | ❌ | ❌ Do not use |
Use session mode for the relay connection. The relay holds PostgreSQL
advisory locks for pipeline ownership and uses LISTEN tide_relay_config for
hot-reload — both require a persistent session.
Recommended setup
1. Dedicate a small pool for the relay
Give the relay its own PgBouncer database entry so it always gets a session-mode connection, even if the main application pool uses transaction mode:
# pgbouncer.ini
[databases]
myapp = host=127.0.0.1 port=5432 dbname=myapp pool_mode=transaction
myapp_relay = host=127.0.0.1 port=5432 dbname=myapp pool_mode=session
[pgbouncer]
listen_port = 6432
listen_addr = 127.0.0.1
auth_type = scram-sha-256
2. Point the relay at the session pool
pg-tide \
--postgres-url "postgres://relay_user:pass@127.0.0.1:6432/myapp_relay"
3. Application connections use transaction mode
Your application can still use the standard myapp pool with transaction mode —
only the relay needs session mode.
Connection count
The relay opens one persistent PostgreSQL connection per pg-tide process
(plus one short-lived connection for the LISTEN/NOTIFY channel). For a typical
deployment:
- 1 relay process → 2 PgBouncer connections → 2 PostgreSQL server connections.
- 3 relay replicas (HA) → 6 PgBouncer connections → at most 3 active PostgreSQL connections (only the primary lock-holder does real work).
Set max_client_conn and default_pool_size in PgBouncer to accommodate this.
Health check
PgBouncer's server_check_query pings idle connections. pg_tide's relay sends
its own heartbeat updates (UPDATE tide.tide_consumer_offsets SET last_heartbeat = now()),
so no additional check query is needed.
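Those heartbeats are visible through the consumer-lag view, which doubles as a check that the relay's pooled session is alive:
-- A stale last_heartbeat suggests the relay's session was dropped by the pooler.
SELECT group_name, consumer_id, last_heartbeat
FROM tide.consumer_lag
ORDER BY last_heartbeat ASC;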
TLS
If PgBouncer terminates TLS from the relay, ensure the relay's --postgres-url
includes sslmode=require:
postgres://relay_user:pass@pgbouncer:6432/myapp_relay?sslmode=require
The relay will negotiate TLS with PgBouncer; PgBouncer can then connect to PostgreSQL using its own TLS configuration (including certificate pinning).
Pgpool-II
Pgpool-II is an alternative pooler with load-balancing capabilities. The same session-mode requirement applies. Route relay connections to the primary node only — the relay must never connect to a read replica because it writes consumer offsets.
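A quick way to confirm a pooled connection landed on the primary:
-- Returns false on the primary, true on a read replica.
SELECT pg_is_in_recovery();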
Integration: Terraform
This guide shows how to manage pg_tide infrastructure as code using Terraform. You can provision PostgreSQL extensions, create outboxes/inboxes, and configure relay pipelines declaratively.
Provider Setup
Use the cyrilgdn/postgresql provider for managing PostgreSQL objects:
terraform {
required_providers {
postgresql = {
source = "cyrilgdn/postgresql"
version = "~> 1.22"
}
}
}
provider "postgresql" {
host = var.postgres_host
port = var.postgres_port
database = var.postgres_database
username = var.postgres_username
password = var.postgres_password
sslmode = "require"
}
Install the Extension
resource "postgresql_extension" "pg_tide" {
name = "pg_tide"
schema = "tide"
version = "0.11.0"
}
Create Outboxes and Inboxes
resource "postgresql_function" "create_outbox" {
depends_on = [postgresql_extension.pg_tide]
name = "create_order_outbox"
language = "sql"
body = "SELECT tide.outbox_create('order_events');"
lifecycle {
ignore_changes = [body]
}
}
Note that defining a function does not execute it; a more reliable and maintainable approach is a local-exec provisioner that runs the statement directly:
resource "null_resource" "outboxes" {
depends_on = [postgresql_extension.pg_tide]
for_each = toset(var.outbox_names)
provisioner "local-exec" {
command = <<-EOT
psql "${var.postgres_url}" -c "SELECT tide.outbox_create('${each.value}');"
EOT
}
}
variable "outbox_names" {
type = list(string)
default = ["order_events", "user_events", "notification_events"]
}
Configure Relay Pipelines
resource "null_resource" "pipeline_orders_to_kafka" {
depends_on = [null_resource.outboxes]
provisioner "local-exec" {
command = <<-EOT
psql "${var.postgres_url}" -c "
SELECT tide.relay_set_outbox(
'orders-to-kafka',
'order_events',
'${jsonencode({
sink_type = "kafka"
brokers = var.kafka_brokers
topic = "orders"
wire_format = "debezium"
})}'::jsonb
);
"
EOT
}
triggers = {
config_hash = sha256(jsonencode({
sink_type = "kafka"
brokers = var.kafka_brokers
topic = "orders"
}))
}
}
Deploy the Relay (Kubernetes)
resource "kubernetes_deployment" "pg_tide_relay" {
metadata {
name = "pg-tide-relay"
namespace = var.namespace
}
spec {
replicas = var.relay_replicas
selector {
match_labels = {
app = "pg-tide-relay"
}
}
template {
metadata {
labels = {
app = "pg-tide-relay"
}
annotations = {
"prometheus.io/scrape" = "true"
"prometheus.io/port" = "9090"
}
}
spec {
termination_grace_period_seconds = 60
container {
name = "pg-tide"
image = "${var.relay_image}:${var.relay_version}"
args = [
"--postgres-url", "$(DATABASE_URL)",
"--relay-group-id", var.relay_group_id,
"--shutdown-timeout", "45",
]
port {
container_port = 9090
name = "metrics"
}
env {
name = "DATABASE_URL"
value_from {
secret_key_ref {
name = kubernetes_secret.pg_tide.metadata[0].name
key = "database-url"
}
}
}
liveness_probe {
http_get {
path = "/health"
port = 9090
}
initial_delay_seconds = 10
period_seconds = 15
}
resources {
requests = {
cpu = "100m"
memory = "128Mi"
}
limits = {
cpu = "500m"
memory = "512Mi"
}
}
}
}
}
}
}
resource "kubernetes_secret" "pg_tide" {
metadata {
name = "pg-tide-secrets"
namespace = var.namespace
}
data = {
"database-url" = var.postgres_url
}
}
Variables
variable "postgres_url" {
type = string
sensitive = true
}
variable "kafka_brokers" {
type = string
default = "kafka:9092"
}
variable "relay_replicas" {
type = number
default = 2
}
variable "relay_group_id" {
type = string
default = "production"
}
variable "relay_image" {
type = string
default = "ghcr.io/your-org/pg-tide"
}
variable "relay_version" {
type = string
default = "0.11.0"
}
Further Reading
- Deployment Architectures — Choosing your topology
- GitHub Actions Integration — CI/CD automation
Integration: GitHub Actions
This guide shows how to integrate pg_tide into your CI/CD pipeline with GitHub Actions — running tests against the extension, deploying relay instances, and managing database migrations.
Testing pg_tide in CI
Run Integration Tests
name: Test pg_tide pipelines
on: [push, pull_request]
jobs:
test:
runs-on: ubuntu-latest
services:
postgres:
image: postgres:16
env:
POSTGRES_PASSWORD: test
POSTGRES_DB: testdb
ports:
- 5432:5432
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
steps:
- uses: actions/checkout@v4
- name: Install pg_tide extension
run: |
# Install the extension into the test database
PGPASSWORD=test psql -h localhost -U postgres -d testdb -f sql/pg_tide--0.11.0.sql
- name: Create test outbox and inbox
run: |
PGPASSWORD=test psql -h localhost -U postgres -d testdb -c "
SELECT tide.outbox_create('test_events');
SELECT tide.inbox_create('test_inbox');
"
      - name: Run application tests
        run: cargo test
        env:
          DATABASE_URL: postgres://postgres:test@localhost/testdb
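The same round trip is easy to smoke-test by hand before wiring it into CI. A sketch that uses only functions shown elsewhere in this documentation:

# Publish one message and confirm it landed in the outbox table
PGPASSWORD=test psql -h localhost -U postgres -d testdb <<'SQL'
BEGIN;
SELECT tide.outbox_publish('test_events',
  '{"ping": 1}'::jsonb,
  '{"event_type": "ci.smoke"}'::jsonb);
COMMIT;
SELECT count(*) FROM tide.tide_outbox_messages;
SQL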
Validate Pipeline Configurations
- name: Validate pipeline configs
run: |
# Dry-run the relay to validate all pipeline configs parse correctly
pg-tide \
--postgres-url "postgres://postgres:test@localhost/testdb" \
--validate-only
Deploy Relay to Kubernetes
name: Deploy pg-tide relay
on:
push:
branches: [main]
jobs:
deploy:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Build and push relay image
uses: docker/build-push-action@v5
with:
push: true
tags: ghcr.io/${{ github.repository }}/pg-tide-relay:${{ github.sha }}
- name: Deploy to Kubernetes
uses: azure/k8s-deploy@v4
with:
manifests: k8s/relay-deployment.yaml
images: ghcr.io/${{ github.repository }}/pg-tide-relay:${{ github.sha }}
Database Migrations
Apply Extension Upgrades
name: Migrate pg_tide
on:
workflow_dispatch:
inputs:
target_version:
description: 'Target pg_tide version'
required: true
default: '0.11.0'
jobs:
migrate:
runs-on: ubuntu-latest
environment: production
steps:
- uses: actions/checkout@v4
- name: Apply migration
run: |
psql "${DATABASE_URL}" -c "ALTER EXTENSION pg_tide UPDATE TO '${{ inputs.target_version }}';"
env:
DATABASE_URL: ${{ secrets.DATABASE_URL }}
Configure Pipelines on Deploy
- name: Apply pipeline configurations
run: |
for config_file in pipelines/*.sql; do
echo "Applying $config_file"
psql "${DATABASE_URL}" -f "$config_file"
done
env:
DATABASE_URL: ${{ secrets.DATABASE_URL }}
Health Check After Deploy
- name: Wait for relay to be healthy
run: |
for i in $(seq 1 30); do
STATUS=$(curl -s -o /dev/null -w "%{http_code}" http://$RELAY_HOST:9090/health)
if [ "$STATUS" = "200" ]; then
echo "Relay is healthy"
exit 0
fi
echo "Waiting... (attempt $i)"
sleep 5
done
echo "Relay did not become healthy"
exit 1
Further Reading
- Deployment Architectures — Production topologies
- Terraform Integration — Infrastructure as code
Integration: Prometheus + Grafana
This guide covers setting up complete observability for pg_tide using Prometheus for metrics collection and Grafana for visualization and alerting.
Architecture
pg-tide relay (:9090/metrics) → Prometheus → Grafana
                                    ↓
                             Alertmanager → PagerDuty/Slack
Prometheus Configuration
Static Target
# prometheus.yml
global:
scrape_interval: 15s
scrape_configs:
- job_name: 'pg-tide'
static_configs:
- targets: ['pg-tide-relay:9090']
labels:
environment: 'production'
Kubernetes Service Discovery
scrape_configs:
- job_name: 'pg-tide'
kubernetes_sd_configs:
- role: pod
relabel_configs:
- source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
action: keep
regex: true
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
- source_labels: [__meta_kubernetes_pod_label_app]
action: keep
regex: pg-tide-relay
Prometheus Operator (ServiceMonitor)
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: pg-tide-relay
labels:
release: prometheus
spec:
selector:
matchLabels:
app: pg-tide-relay
endpoints:
- port: metrics
interval: 15s
path: /metrics
Grafana Dashboard
Import the pre-built dashboard from pg-tide/dashboards/relay-health.json:
- Grafana → Dashboards → Import
- Upload relay-health.json
- Select your Prometheus data source
Or provision automatically:
# grafana/provisioning/dashboards/pg-tide.yaml
apiVersion: 1
providers:
- name: 'pg-tide'
folder: 'Infrastructure'
type: file
options:
path: /var/lib/grafana/dashboards/pg-tide
Alert Rules
Prometheus Alert Rules
# prometheus/rules/pg-tide.yaml
groups:
- name: pg-tide
rules:
- alert: PgTidePipelineDown
expr: pg_tide_pipeline_healthy == 0
for: 2m
labels:
severity: critical
annotations:
summary: "Pipeline {{ $labels.pipeline }} circuit breaker is open"
runbook_url: "https://wiki.example.com/pg-tide/circuit-breaker"
- alert: PgTideHighErrorRate
expr: rate(pg_tide_publish_errors_total[5m]) > 0.5
for: 5m
labels:
severity: warning
annotations:
summary: "Pipeline {{ $labels.pipeline }} error rate: {{ $value }}/s"
- alert: PgTideHighLag
expr: pg_tide_consumer_lag > 50000
for: 10m
labels:
severity: warning
annotations:
summary: "Pipeline {{ $labels.pipeline }} backlog: {{ $value }} messages"
- alert: PgTideLatencyHigh
expr: histogram_quantile(0.99, rate(pg_tide_delivery_latency_seconds_bucket[5m])) > 10
for: 5m
labels:
severity: warning
annotations:
summary: "Pipeline {{ $labels.pipeline }} P99 latency: {{ $value }}s"
- alert: PgTideRelayDown
expr: up{job="pg-tide"} == 0
for: 1m
labels:
severity: critical
annotations:
summary: "pg-tide relay is not responding to scrapes"
Alertmanager Routing
# alertmanager.yml
route:
receiver: 'default'
routes:
- match:
severity: critical
receiver: 'pagerduty'
- match:
severity: warning
receiver: 'slack'
receivers:
- name: 'slack'
slack_configs:
- channel: '#alerts'
send_resolved: true
- name: 'pagerduty'
pagerduty_configs:
- routing_key: '${PAGERDUTY_KEY}'
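Alertmanager's companion CLI, amtool, can validate this file and show which receiver a given alert would reach:

# Validate the routing configuration
amtool check-config alertmanager.yml
# Where does a critical pg-tide alert land? (expected: pagerduty)
amtool config routes test --config.file=alertmanager.yml severity=critical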
Docker Compose (Local Development)
version: '3.8'
services:
prometheus:
image: prom/prometheus:latest
volumes:
- ./prometheus.yml:/etc/prometheus/prometheus.yml
- ./rules:/etc/prometheus/rules
ports:
- "9090:9090"
grafana:
image: grafana/grafana:latest
volumes:
- ./dashboards:/var/lib/grafana/dashboards/pg-tide
- ./provisioning:/etc/grafana/provisioning
ports:
- "3000:3000"
environment:
- GF_SECURITY_ADMIN_PASSWORD=admin
pg-tide:
image: pg-tide:latest
environment:
- DATABASE_URL=postgres://user:pass@postgres:5432/mydb
ports:
- "9091:9090" # Metrics
Key PromQL Queries
# Overall health
min(pg_tide_pipeline_healthy)
# Total throughput
sum(rate(pg_tide_messages_published_total[5m]))
# Per-pipeline error ratio
rate(pg_tide_publish_errors_total[5m]) / rate(pg_tide_messages_consumed_total[5m])
# Delivery latency percentiles
histogram_quantile(0.5, rate(pg_tide_delivery_latency_seconds_bucket[5m]))
histogram_quantile(0.95, rate(pg_tide_delivery_latency_seconds_bucket[5m]))
histogram_quantile(0.99, rate(pg_tide_delivery_latency_seconds_bucket[5m]))
# Lag trend (positive = growing)
deriv(pg_tide_consumer_lag[5m])
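Any of these expressions can also be evaluated headlessly through Prometheus' instant-query API, which is handy in scripts and CI checks:

# Total throughput as a one-shot query
curl -s http://localhost:9090/api/v1/query \
  --data-urlencode 'query=sum(rate(pg_tide_messages_published_total[5m]))' \
  | jq '.data.result'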
Further Reading
- Metrics — Complete metrics reference
- Dashboards — Dashboard details
- Monitoring Cookbook — Alert recipes
- Datadog Integration — Alternative monitoring platform
Integration: Datadog
This guide covers monitoring pg_tide with Datadog, including metrics collection, log forwarding, and APM trace correlation.
Metrics Collection
Option 1: Prometheus Integration
Datadog's OpenMetrics check can scrape pg_tide's Prometheus endpoint directly:
# datadog-agent/conf.d/openmetrics.d/conf.yaml
instances:
- openmetrics_endpoint: http://pg-tide-relay:9090/metrics
namespace: pg_tide
metrics:
- pg_tide_messages_published_total
- pg_tide_messages_consumed_total
- pg_tide_publish_errors_total
- pg_tide_dedup_skipped_total
- pg_tide_pipeline_healthy
- pg_tide_consumer_lag
- pg_tide_delivery_latency_seconds
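After dropping in the config, run the check once in the foreground to confirm the agent can reach the endpoint (the exact invocation varies by agent install):

# Run the OpenMetrics check once and print what it collected
sudo -u dd-agent datadog-agent check openmetrics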
Option 2: Kubernetes Annotations
With the Datadog Agent running as a DaemonSet:
apiVersion: apps/v1
kind: Deployment
metadata:
name: pg-tide-relay
spec:
template:
metadata:
annotations:
ad.datadoghq.com/pg-tide.checks: |
{
"openmetrics": {
"instances": [{
"openmetrics_endpoint": "http://%%host%%:9090/metrics",
"namespace": "pg_tide",
"metrics": ["pg_tide_*"]
}]
}
}
Log Collection
Structured JSON Logs
Configure the relay to emit JSON logs:
pg-tide --log-format json --postgres-url "..."
Datadog Agent Log Collection
# Kubernetes pod annotation
annotations:
ad.datadoghq.com/pg-tide.logs: |
[{
"source": "pg-tide",
"service": "pg-tide-relay",
"log_processing_rules": [{
"type": "multi_line",
"name": "rust_panics",
"pattern": "^thread '"
}]
}]
Log Facets
Create facets for common fields:
- pipeline — Pipeline name
- direction — forward/reverse
- batch_size — Messages in batch
- error — Error message
APM / Traces
Option 1: OpenTelemetry → Datadog
pg_tide exports OTLP traces. Route them through the OTEL Collector to Datadog:
# otel-collector-config.yaml
receivers:
otlp:
protocols:
grpc:
endpoint: "0.0.0.0:4317"
exporters:
datadog:
api:
key: ${DD_API_KEY}
service:
pipelines:
traces:
receivers: [otlp]
exporters: [datadog]
Configure pg_tide to send traces:
pg-tide --otel-endpoint "http://otel-collector:4317" --postgres-url "..."
Option 2: Datadog Agent OTLP Ingestion
The Datadog Agent can receive OTLP directly (Agent 7.35+):
# datadog.yaml
otlp_config:
receiver:
protocols:
grpc:
endpoint: "0.0.0.0:4317"
pg-tide --otel-endpoint "http://datadog-agent:4317" --postgres-url "..."
Dashboards
Create a Datadog dashboard with these widgets:
Throughput (Timeseries)
sum:pg_tide.pg_tide_messages_published_total.count{*} by {pipeline}.as_rate()
Error Rate (Timeseries)
sum:pg_tide.pg_tide_publish_errors_total.count{*} by {pipeline}.as_rate()
Pipeline Health (Query Value)
min:pg_tide.pg_tide_pipeline_healthy{*} by {pipeline}
Consumer Lag (Timeseries)
avg:pg_tide.pg_tide_consumer_lag{*} by {pipeline}
Monitors (Alerts)
Pipeline Down
Monitor Type: Metric
Query: min(last_5m):min:pg_tide.pg_tide_pipeline_healthy{*} by {pipeline} < 1
Alert: Pipeline {{pipeline.name}} is unhealthy
High Error Rate
Monitor Type: Metric
Query: sum(last_5m):sum:pg_tide.pg_tide_publish_errors_total.count{*} by {pipeline}.as_rate() > 1
Warning: Pipeline {{pipeline.name}} error rate above threshold
Growing Lag
Monitor Type: Metric
Query: avg(last_10m):avg:pg_tide.pg_tide_consumer_lag{*} by {pipeline} > 10000
Warning: Pipeline {{pipeline.name}} has {{value}} pending messages
Further Reading
- Metrics — Available metrics
- OpenTelemetry — Trace configuration
- Prometheus + Grafana — Alternative monitoring stack
Integration: OpenTelemetry Collector
This guide covers advanced OpenTelemetry Collector configurations for pg_tide traces, including routing to multiple backends, sampling strategies, and enrichment.
Basic Setup
pg_tide exports traces via OTLP gRPC. The OpenTelemetry Collector acts as a proxy/router between pg_tide and your observability backend(s).
pg-tide relay → OTEL Collector → Backend (Jaeger, Tempo, Datadog, etc.)
Minimal Configuration
# otel-collector-config.yaml
receivers:
otlp:
protocols:
grpc:
endpoint: "0.0.0.0:4317"
exporters:
otlp:
endpoint: "tempo:4317"
tls:
insecure: true
service:
pipelines:
traces:
receivers: [otlp]
exporters: [otlp]
Start the collector:
otelcol --config otel-collector-config.yaml
Configure pg_tide:
pg-tide --otel-endpoint "http://otel-collector:4317" --postgres-url "..."
Multi-Backend Routing
Send traces to multiple backends simultaneously:
receivers:
otlp:
protocols:
grpc:
endpoint: "0.0.0.0:4317"
exporters:
# Local Grafana Tempo for development
otlp/tempo:
endpoint: "tempo:4317"
tls:
insecure: true
# Datadog for production alerting
datadog:
api:
key: ${DD_API_KEY}
# Jaeger for detailed trace debugging
jaeger:
endpoint: "jaeger-collector:14250"
tls:
insecure: true
service:
pipelines:
traces:
receivers: [otlp]
exporters: [otlp/tempo, datadog, jaeger]
Sampling Strategies
For high-throughput deployments, sampling reduces storage costs while maintaining visibility:
Tail-Based Sampling (Recommended)
Keep all error traces and sample normal traces:
processors:
tail_sampling:
decision_wait: 10s
policies:
# Always keep traces with errors
- name: errors
type: status_code
status_code:
status_codes: [ERROR]
# Always keep slow traces (>5s)
- name: slow
type: latency
latency:
threshold_ms: 5000
# Sample 10% of normal traces
- name: normal
type: probabilistic
probabilistic:
sampling_percentage: 10
service:
pipelines:
traces:
receivers: [otlp]
processors: [tail_sampling]
exporters: [otlp/tempo]
Head-Based Sampling (Simpler)
processors:
probabilistic_sampler:
sampling_percentage: 25 # Keep 25% of traces
service:
pipelines:
traces:
receivers: [otlp]
processors: [probabilistic_sampler]
exporters: [otlp/tempo]
Resource Enrichment
Add deployment metadata to all traces:
processors:
resource:
attributes:
- key: deployment.environment
value: production
action: upsert
- key: service.namespace
value: messaging
action: upsert
- key: k8s.cluster.name
value: production-east
action: upsert
# Detect Kubernetes metadata automatically
k8sattributes:
extract:
metadata:
- k8s.pod.name
- k8s.namespace.name
- k8s.deployment.name
service:
pipelines:
traces:
receivers: [otlp]
processors: [resource, k8sattributes]
exporters: [otlp/tempo]
Span Processing
Filter or modify spans before export:
processors:
# Drop health check spans (noisy)
filter:
traces:
span:
- 'attributes["http.route"] == "/health"'
- 'attributes["http.route"] == "/metrics"'
# Batch spans for efficient export
batch:
timeout: 5s
send_batch_size: 1000
service:
pipelines:
traces:
receivers: [otlp]
processors: [filter, batch]
exporters: [otlp/tempo]
Kubernetes Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
name: otel-collector
spec:
replicas: 2
selector:
matchLabels:
app: otel-collector
template:
metadata:
labels:
app: otel-collector
spec:
containers:
- name: collector
image: otel/opentelemetry-collector-contrib:latest
args: ["--config=/etc/otel/config.yaml"]
ports:
- containerPort: 4317 # OTLP gRPC
- containerPort: 8888 # Collector metrics
volumeMounts:
- name: config
mountPath: /etc/otel
volumes:
- name: config
configMap:
name: otel-collector-config
---
apiVersion: v1
kind: Service
metadata:
name: otel-collector
spec:
selector:
app: otel-collector
ports:
- port: 4317
targetPort: 4317
name: otlp-grpc
Correlating Traces with Metrics
Use the span metrics connector to derive RED metrics from traces:
connectors:
spanmetrics:
histogram:
explicit:
buckets: [1ms, 5ms, 10ms, 50ms, 100ms, 500ms, 1s, 5s]
dimensions:
- name: pipeline.name
- name: pipeline.direction
exporters:
prometheus:
endpoint: "0.0.0.0:8889"
service:
pipelines:
traces:
receivers: [otlp]
exporters: [otlp/tempo, spanmetrics]
metrics:
receivers: [spanmetrics]
exporters: [prometheus]
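The derived RED metrics are served on the exporter port declared above (8889). The Service shown earlier only exposes 4317, so port-forward first; metric names also vary by collector version (calls_total vs. traces_span_metrics_calls_total), so grep loosely:

# Expose the exporter port locally, then spot-check the derived metrics
kubectl port-forward deploy/otel-collector 8889:8889 &
curl -s http://localhost:8889/metrics | grep -iE 'calls|duration' | head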
Further Reading
- OpenTelemetry Feature — pg_tide trace configuration
- Prometheus + Grafana — Metrics visualization
- Datadog Integration — Datadog as trace backend
Integration: Microcks
Microcks is a CNCF incubating project for API mocking and
contract conformance testing. It ingests AsyncAPI, OpenAPI, gRPC, and GraphQL
specifications — including the document emitted by pg-tide asyncapi export — and
uses the examples within them to both mock event channels and verify that real
consumers and producers stay conformant to the contract.
This guide shows how to use pg-tide asyncapi export as a first-class integration
touchpoint: export your relay spec, import it into Microcks, develop against the
mocks, and gate your CI pipeline on contract conformance.
Why This Matters
pg-tide publishes events from the database to any number of downstream consumers.
Those consumers need a stable, documented contract: "what topics exist, in what
format, with what shape of payload?" Today that contract lives implicitly inside
pg-tide's relay configuration. The asyncapi export command makes it explicit —
and Microcks makes it enforceable.
PostgreSQL → pg-tide relay → [Kafka / NATS / SQS / …] → consumers
                  ↓
          asyncapi export
                  ↓
              Microcks ──► mock topics (development)
                       ──► conformance test (CI)
Prerequisites
| Requirement | Version |
|---|---|
| pg-tide CLI (pg-tide) | ≥ 0.14.0 |
| Docker / Docker Compose | any recent |
| Microcks (dev-mode image) | ≥ 1.11.0 |
Step 1 — Export Your AsyncAPI Spec
Connect to a database that has relay pipelines configured and run:
pg-tide asyncapi export \
--postgres-url "postgres://user:pass@localhost/mydb" \
--format yaml \
> relay-asyncapi.yaml
This produces an AsyncAPI 3.0 YAML document that enumerates every outbox forward pipeline and every inbox reverse pipeline as named channels, operations, and message schemas.
Example output excerpt:
asyncapi: 3.0.0
info:
title: pg-tide Relay AsyncAPI
version: 0.16.0
description: Auto-generated AsyncAPI 3.0 document from pg-tide relay catalog metadata.
channels:
forward/orders:
address: kafka/orders
description: "Forward relay: outbox 'orders' → kafka"
messages:
ordersMessage:
$ref: '#/components/messages/ordersMessage'
operations:
sendOrders:
action: send
channel:
$ref: '#/channels/forward~1orders'
description: "Publish messages from outbox 'orders' to kafka"
components:
messages:
ordersMessage:
name: ordersMessage
contentType: application/json
payload:
type: object
description: "pg_tide outbox message (wire_format: cloudevents)"
Enriching the spec: The auto-generated payload schemas use type: object as a baseline. For full contract value, add JSON Schema definitions for your actual message payloads, either by hand or by running asyncapi export against a database where schema evolution guardrails have already captured column types.
Step 2 — Start Microcks Locally
The quickest path is Microcks' dev-mode Docker Compose, which bundles a Kafka broker (Redpanda) alongside the Microcks server:
git clone https://github.com/microcks/microcks
cd microcks/install/docker-compose
docker compose -f docker-compose-devmode.yml up -d
Wait until all five containers are healthy, then open http://localhost:8080.
Step 3 — Import the Spec
Via the Microcks UI
- Go to API | Services → Import.
- Upload relay-asyncapi.yaml as a primary artifact.
- Microcks parses the channels and begins publishing mock messages to the embedded Kafka broker on the addresses defined in the spec.
Via the Microcks REST API (CI-friendly)
curl -s -X POST http://localhost:8080/api/v1/artifact/upload \
-H "Content-Type: multipart/form-data" \
-F "file=@relay-asyncapi.yaml"
Step 4 — Develop Against Mock Topics
Once imported, Microcks publishes mock events at regular intervals to Kafka topics named after the channel addresses in your spec. Downstream consumer teams can target the Microcks Kafka endpoint instead of a real pg-tide deployment:
# Confirm mock messages are flowing
kcat -b localhost:9092 -t kafka/orders -C -e
# Output:
# {"id":1,"type":"order.created","source":"/orders","data":{...}}
# {"id":2,"type":"order.updated","source":"/orders","data":{...}}
This means consumer teams can write and test their Kafka consumers in isolation — no running PostgreSQL, no relay process, no seeded test data required.
Step 5 — Run Conformance Tests
Once a real pg-tide relay is running (e.g., in a staging environment), use Microcks to verify that the actual published events conform to the spec:
Via the Microcks UI
- Go to the imported pg-tide Relay AsyncAPI service.
- Click New Test, set the endpoint to your staging Kafka broker (kafka://staging-kafka:9092), and choose AsyncAPI conformance as the runner.
- Microcks subscribes to the topic, collects messages, validates them against the schema and examples, and returns a conformance score.
Via the REST API (CI step)
# Retrieve the service ID
SERVICE_ID=$(curl -s http://localhost:8080/api/v1/services \
| jq -r '.[] | select(.name=="pg-tide Relay AsyncAPI") | .id')
# Launch the conformance test
TEST_ID=$(curl -s -X POST http://localhost:8080/api/v1/tests \
-H "Content-Type: application/json" \
-d '{
"serviceId": "'"$SERVICE_ID"'",
"testEndpoint": "kafka://staging-kafka:9092",
"runnerType": "ASYNC_API_SCHEMA",
"timeout": 15000
}' | jq -r '.id')
echo "Test launched: $TEST_ID"
# Poll for result
sleep 20
curl -s http://localhost:8080/api/v1/tests/$TEST_ID \
| jq '{success: .success, conformanceScore: .conformanceScore}'
A failing test means the relay is publishing events that violate the contract — caught before production.
Step 6 — Gate CI on the Contract
GitHub Actions example
name: Contract conformance
on:
push:
branches: [main]
jobs:
contract-test:
runs-on: ubuntu-latest
services:
postgres:
image: postgres:16
env: { POSTGRES_PASSWORD: test, POSTGRES_DB: ci }
ports: ["5432:5432"]
steps:
- uses: actions/checkout@v4
- name: Install pg_tide + seed pipelines
run: |
PGPASSWORD=test psql -h localhost -U postgres ci \
-f sql/pg_tide--0.16.0.sql
# ...register relay pipelines via tide.relay_*_upsert()...
- name: Export AsyncAPI spec
run: |
pg-tide asyncapi export \
--postgres-url "postgres://postgres:test@localhost/ci" \
--format yaml > relay-asyncapi.yaml
- name: Start Microcks
run: |
git clone --depth 1 https://github.com/microcks/microcks /tmp/microcks
docker compose \
-f /tmp/microcks/install/docker-compose/docker-compose-devmode.yml \
up -d
# Wait for readiness
until curl -sf http://localhost:8080/api/v1/health; do sleep 2; done
- name: Import spec into Microcks
run: |
curl -s -X POST http://localhost:8080/api/v1/artifact/upload \
-F "file=@relay-asyncapi.yaml"
- name: Verify mock topics are live
run: |
# At least one message must arrive within 10 s
          timeout 10 kcat -b localhost:9092 -t kafka/orders -C -c 1 -e
- name: Run conformance test against relay
run: |
# Start relay against test DB pointing at Microcks Kafka
pg-tide --postgres-url "postgres://postgres:test@localhost/ci" &
sleep 5
SERVICE_ID=$(curl -s http://localhost:8080/api/v1/services \
| jq -r '.[] | select(.name=="pg-tide Relay AsyncAPI") | .id')
RESULT=$(curl -s -X POST http://localhost:8080/api/v1/tests \
-H "Content-Type: application/json" \
-d "{\"serviceId\":\"$SERVICE_ID\",
\"testEndpoint\":\"kafka://localhost:9092\",
\"runnerType\":\"ASYNC_API_SCHEMA\",
\"timeout\":15000}")
TEST_ID=$(echo $RESULT | jq -r '.id')
sleep 20
SUCCESS=$(curl -s http://localhost:8080/api/v1/tests/$TEST_ID \
| jq -r '.success')
echo "Contract test success: $SUCCESS"
[ "$SUCCESS" = "true" ]
Enriching the Generated Spec
The exported spec is intentionally minimal — type: object payload schemas are
safe placeholders. You can enrich it in two ways:
Option A — Secondary artifact with example payloads
Create a relay-examples.yaml in the Microcks API Examples Format
and import it as a secondary artifact alongside relay-asyncapi.yaml. Microcks
merges the two: the spec provides structure, the examples provide realistic mock
data and dynamic templates.
# relay-examples.yaml
apiVersion: mocks.microcks.io/v1alpha1
kind: APIExamples
metadata:
name: pg-tide Relay AsyncAPI
version: 0.16.0
operations:
sendOrders:
examples:
order-created:
value:
id: "{{ uuid() }}"
specversion: "1.0"
type: "order.created"
source: "/orders"
time: "{{ now() }}"
data:
orderId: "{{ randomInt(1000,9999) }}"
customerId: "{{ randomInt(100,999) }}"
total: "{{ randomInt(10,500) }}.00"
Option B — Emit richer schemas from the relay config
If your relay configuration stores JSON Schema definitions alongside the wire
format config, pg-tide can incorporate them into the exported spec. Open an issue
or contribute a PR to pg-tide-relay/src/main.rs::run_asyncapi_export to expose
this.
Summary
| Step | What you get |
|---|---|
asyncapi export | A machine-readable contract for every relay pipeline |
| Import into Microcks | Live mock Kafka/NATS/SQS topics for consumer development |
| Conformance test | Automated verification that the relay honours the contract |
| CI gate | Catch wire-format regressions before they reach production |
The pg-tide asyncapi export command is the bridge between pg-tide's relay
catalog and the broader API-contract ecosystem. Microcks is the natural home for
running and enforcing that contract.
See Also
- CLI Reference — asyncapi export
- Wire Formats Overview
- GitHub Actions Integration
- Microcks AsyncAPI tutorial
- Microcks conformance testing
Security
Security considerations for pg_tide deployments.
Extension Security
Schema Isolation
All pg_tide objects live in the tide schema. The extension is marked trusted = true and superuser = false — it can be installed by any user with CREATE privilege on the database.
No Elevated Privileges
The extension uses no:
- Background workers
- Shared memory
- File system access
- Network connections
- SECURITY DEFINER functions
All operations run with the privileges of the calling user.
Catalog Table Access
By default, any user with access to the tide schema can read and write all catalog tables. For multi-tenant deployments, consider:
-- Restrict outbox creation to admin role
REVOKE INSERT ON tide.tide_outbox_config FROM PUBLIC;
GRANT INSERT ON tide.tide_outbox_config TO tide_admin;
-- Allow publishing to all application users
GRANT EXECUTE ON FUNCTION tide.outbox_publish(text, jsonb, jsonb) TO app_user;
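You can verify the resulting privilege matrix from SQL before rolling it out; has_table_privilege() is the built-in check:

# Confirm app_user can no longer create outboxes (expected: f)
psql "$POSTGRES_URL" -tAc \
  "SELECT has_table_privilege('app_user', 'tide.tide_outbox_config', 'INSERT');"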
Relay Security
Connection String Protection
Never embed passwords in config files committed to version control. Use environment variables:
postgres_url = "postgres://${ENV:PG_USER}:${ENV:PG_PASSWORD}@${ENV:PG_HOST}:5432/mydb"
Or use Kubernetes secrets, HashiCorp Vault, or your platform's secret management.
Least-Privilege Database User
The relay needs minimal privileges:
CREATE ROLE pg_tide_relay LOGIN PASSWORD 'secret';
GRANT USAGE ON SCHEMA tide TO pg_tide_relay;
GRANT SELECT, UPDATE ON tide.tide_outbox_messages TO pg_tide_relay;
GRANT SELECT ON tide.tide_outbox_config TO pg_tide_relay;
GRANT SELECT ON tide.relay_outbox_config TO pg_tide_relay;
GRANT SELECT ON tide.relay_inbox_config TO pg_tide_relay;
GRANT SELECT, INSERT, UPDATE ON tide.tide_consumer_offsets TO pg_tide_relay;
GRANT SELECT, INSERT, UPDATE, DELETE ON tide.tide_consumer_leases TO pg_tide_relay;
GRANT SELECT, INSERT, UPDATE ON tide.relay_consumer_offsets TO pg_tide_relay;
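A quick negative test, connecting as the relay role and attempting a write it was never granted, confirms the role is as narrow as intended (a sketch using the role from above):

# Should fail with "permission denied for table tide_outbox_config"
psql "postgres://pg_tide_relay:secret@localhost/mydb" \
  -c "INSERT INTO tide.tide_outbox_config DEFAULT VALUES;"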
Network Security
- The relay's metrics endpoint (default :9090) should not be exposed publicly
- Use TLS for PostgreSQL connections in production (sslmode=require)
- Use TLS for sink connections (NATS TLS, Kafka SSL, HTTPS webhooks)
Docker Security
The official Docker image runs as non-root user pgtide (UID 1000):
USER pgtide
No capabilities are required. Use securityContext in Kubernetes:
securityContext:
runAsNonRoot: true
runAsUser: 1000
readOnlyRootFilesystem: true
allowPrivilegeEscalation: false
Payload Security
Sensitive Data
Avoid publishing sensitive data (PII, credentials) in outbox payloads. If you must, encrypt at the application layer before calling outbox_publish().
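For example, the stock pgcrypto extension can symmetric-encrypt a field before it enters the payload. This is a sketch of application-layer encryption, not a pg_tide feature:

# Encrypt a sensitive field with pgcrypto before publishing
psql "$POSTGRES_URL" <<'SQL'
CREATE EXTENSION IF NOT EXISTS pgcrypto;
SELECT tide.outbox_publish('orders',
  jsonb_build_object(
    'order_id', 42,
    'card_last4', encode(pgp_sym_encrypt('4242', 'replace-with-a-real-key'), 'base64')),
  '{"event_type": "payment.captured"}'::jsonb);
SQL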
Input Validation
pg_tide accepts any valid JSONB as payload. Validate payloads at the application level before publishing. The extension does not perform content validation.
Reporting Vulnerabilities
Report security issues to: security@trickle-labs.com
Do not open public GitHub issues for security vulnerabilities.
Architecture Decision Records
This page documents the key architectural decisions made in pg_tide's design, their rationale, and the alternatives that were considered.
ADR-1: Transactional Outbox over WAL-Based CDC
Decision: pg_tide uses the transactional outbox pattern rather than WAL (logical replication) for change capture.
Context: Most CDC tools (Debezium, pgoutput) read the PostgreSQL WAL to capture changes. This is transparent to the application but has limitations: it captures all changes (including internal ones), can't easily enrich events with application context, and requires careful management of replication slots.
Rationale:
- Applications control exactly what events are published and when
- Events are guaranteed to be published if and only if the business transaction commits
- No replication slot management or WAL retention concerns
- Events can include derived data not present in any single table
- Schema is explicit and application-controlled
Trade-offs:
- Applications must explicitly call tide.outbox_publish() (not transparent)
- Slightly more application code vs. transparent CDC
- Cannot capture changes from direct SQL or other tools (unless triggers are used)
ADR-2: PostgreSQL Advisory Locks for HA Coordination
Decision: Use pg_try_advisory_lock() for pipeline ownership coordination rather than external consensus (etcd, ZooKeeper) or leader election.
Context: Multiple relay instances need to coordinate which instance processes which pipeline without double-processing.
Rationale:
- Zero additional infrastructure — PostgreSQL is already required
- Advisory locks are automatically released on connection close (crash safety)
- Non-blocking pg_try_advisory_lock prevents deadlocks
- Well-understood PostgreSQL primitive with decades of production use
Trade-offs:
- Requires all relay instances to connect to the same PostgreSQL instance
- Discovery interval determines failover speed (not instant)
- Lock granularity is per-pipeline (cannot split a single pipeline across instances)
ADR-3: Pipeline Configuration in PostgreSQL Catalog
Decision: Store pipeline configurations in PostgreSQL tables (tide.relay_outbox_config, tide.relay_inbox_config) rather than in TOML/YAML files or environment variables.
Context: The relay needs to know which pipelines to run and how to configure each one.
Rationale:
- Configuration changes via SQL (hot-reload without restart)
- LISTEN/NOTIFY enables instant propagation
- Configuration is transactional (rollback on error)
- All relay instances see the same configuration (single source of truth)
- Easy to manage programmatically (Terraform, application code)
Trade-offs:
- Requires database access to view/change configuration
- Secrets must use ${env:...} substitution (not stored in catalog)
- Slightly less familiar than file-based config for ops teams
ADR-4: Single Binary Relay
Decision: The relay is a single statically-linked binary rather than a framework, library, or JVM application.
Context: The relay needs to be deployed easily across diverse environments.
Rationale:
- Single binary deployment (no runtime dependencies)
- Small container images (~20 MB)
- Fast startup time (< 1 second)
- Low resource consumption (Rust, no GC pauses)
- Cross-compilation for Linux/macOS/ARM
Trade-offs:
- Feature-gated compilation (some sinks/sources require build flags)
- Rust ecosystem less familiar than Java/Python for some teams
- Plugin system not possible (all sinks compiled in)
ADR-5: JMESPath for Transforms
Decision: Use JMESPath (not JSONPath, jq, or a custom DSL) for message transforms and filters.
Context: Messages need lightweight filtering and reshaping without external tools.
Rationale:
- Well-specified language with formal grammar
- Deterministic evaluation (no side effects)
- Good balance of power and simplicity
- Fast compiled evaluation
- Familiar from AWS CLI and other tools
Trade-offs:
- Less powerful than jq (no recursion, limited array manipulation)
- No support for array indexing in field paths
- Users must learn JMESPath syntax
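For a feel of the language, the standalone jp CLI (from jmespath.org, not part of pg-tide) evaluates the same kind of expressions the relay compiles for filters:

# A filter of the kind used in transforms: keep messages whose total exceeds 100
echo '{"total": 120}' | jp 'total > `100`'
# prints: true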
ADR-6: At-Least-Once Delivery
Decision: pg_tide provides at-least-once delivery semantics by default, not exactly-once.
Context: Distributed systems cannot provide exactly-once delivery without end-to-end coordination.
Rationale:
- At-least-once is achievable without two-phase commit
- Simpler implementation, more reliable operation
- The inbox provides application-level deduplication for exactly-once processing
- Most sinks (Kafka, NATS) inherently provide at-least-once anyway
- Idempotent consumers are a well-understood pattern
Trade-offs:
- Consumers may receive duplicate messages (rare, only on failure/recovery)
- Applications that need exactly-once must implement deduplication
- The inbox pattern adds complexity for cross-system exactly-once
Further Reading
- Architecture — System design overview
- Version Compatibility — Version support matrix
Version Compatibility
This page documents compatibility between pg_tide versions, PostgreSQL versions, and the relay binary.
PostgreSQL Compatibility
| pg_tide Extension | PostgreSQL Versions | Notes |
|---|---|---|
| 0.11.0 | 14, 15, 16, 17, 18 | Current release |
| 0.10.0 | 14, 15, 16, 17, 18 | |
| 0.9.0 | 14, 15, 16, 17 | |
| 0.1.0–0.8.0 | 14, 15, 16 | Initial releases |
Extension / Relay Compatibility
The pg_tide extension and relay binary are versioned together. Use matching major.minor versions:
| Extension | Relay Binary | Compatible? |
|---|---|---|
| 0.11.x | 0.11.x | ✓ |
| 0.11.x | 0.10.x | ✓ (backward compatible) |
| 0.10.x | 0.11.x | ⚠️ May work, not tested |
| 0.10.x | 0.10.x | ✓ |
Rule of thumb: The relay binary should be ≥ the extension version. Newer relays are backward compatible with older extensions (they ignore unknown catalog features). Older relays may not understand newer catalog schema additions.
Upgrade Path
pg_tide supports sequential upgrades. Each version provides a migration script:
-- Upgrade from 0.10.0 to 0.11.0
ALTER EXTENSION pg_tide UPDATE TO '0.11.0';
Available upgrade scripts:
- pg_tide--0.1.0--0.2.0.sql
- pg_tide--0.2.0--0.3.0.sql
- pg_tide--0.3.0--0.4.0.sql
- pg_tide--0.4.0--0.5.0.sql
- pg_tide--0.5.0--0.6.0.sql
- pg_tide--0.6.0--0.7.0.sql
- pg_tide--0.7.0--0.8.0.sql
- pg_tide--0.8.0--0.9.0.sql
- pg_tide--0.9.0--0.10.0.sql
- pg_tide--0.10.0--0.11.0.sql
Important: Upgrades apply sequentially; there are no skip-level scripts. PostgreSQL chains the per-version scripts automatically, so ALTER EXTENSION pg_tide UPDATE TO '0.11.0' from 0.8.0 runs the 0.8.0→0.9.0, 0.9.0→0.10.0, and 0.10.0→0.11.0 migrations in order.
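To see where you are before upgrading, the standard PostgreSQL catalogs report both the installed version and the versions your server can reach:

# Currently installed pg_tide version
psql "$POSTGRES_URL" -c "SELECT extversion FROM pg_extension WHERE extname = 'pg_tide';"
# Versions available on this server
psql "$POSTGRES_URL" -c "SELECT version FROM pg_available_extension_versions WHERE name = 'pg_tide';"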
Feature Availability by Version
| Feature | Since Version |
|---|---|
| Outbox/Inbox core | 0.1.0 |
| Relay catalog | 0.1.0 |
| Consumer groups | 0.3.0 |
| Dead letter queue | 0.5.0 |
| Circuit breaker | 0.5.0 |
| Rate limiting | 0.6.0 |
| JMESPath transforms | 0.7.0 |
| Content-based routing | 0.7.0 |
| Wire formats (Debezium, Maxwell, Canal) | 0.8.0 |
| Schema Registry | 0.8.0 |
| OpenTelemetry | 0.9.0 |
| CDC JSON wire format | 0.10.0 |
| Webhook signatures | 0.10.0 |
| DuckLake sink | 0.11.0 |
| Arrow Flight sink | 0.11.0 |
Sink/Source Availability
All sinks and sources listed in the documentation are available in the current release (0.11.0). Some are feature-gated at compile time:
| Feature Gate | Includes |
|---|---|
| Default (no flag) | Kafka, NATS, HTTP, stdout, PostgreSQL sinks/sources |
cloud | S3, GCS, Azure Blob, BigQuery, Pub/Sub, SQS, Kinesis, Event Hubs, Service Bus |
analytics | ClickHouse, Snowflake, Iceberg, Delta, DuckLake, Arrow Flight |
schema-registry | Confluent Schema Registry + Avro |
otel | OpenTelemetry tracing |
singer | Singer/Meltano tap and target support |
airbyte | Airbyte connector support |
Pre-built release binaries include all feature gates enabled.
Breaking Changes
| Version | Breaking Change |
|---|---|
| 0.8.0 | Wire format config moved from top-level to wire_config sub-object |
| 0.5.0 | DLQ table schema changed (added error_kind column) |
| 0.3.0 | Consumer group API changed (renamed functions) |
Further Reading
- Architecture Decisions — Design rationale
- CHANGELOG — Detailed release notes
Glossary
Advisory lock — A PostgreSQL locking mechanism used by pg_tide for high-availability coordination. Each relay instance acquires advisory locks for the pipelines it owns, preventing multiple instances from processing the same pipeline.
At-least-once delivery — A delivery guarantee where every message is delivered one or more times. pg_tide provides at-least-once by default — messages may be re-delivered on failure recovery, but are never lost.
Batch — A group of messages processed together in a single poll-publish-acknowledge cycle. Larger batches improve throughput at the cost of latency.
Circuit breaker — A fault-tolerance pattern that stops attempting delivery when a sink is consistently failing. After a timeout, it probes with a single message to test recovery.
Consumer group — An independent cursor into an outbox, allowing multiple services to consume the same event stream at their own pace without interfering with each other.
Dead letter queue (DLQ) — A PostgreSQL table (tide.relay_dlq) that stores messages which failed delivery after all retry attempts. Messages can be inspected and replayed from the DLQ.
Deduplication key (dedup_key) — A unique identifier for inbox messages that prevents duplicate processing. If the same dedup_key arrives twice, the second write is silently ignored.
Discovery — The process by which the relay coordinator finds and reconciles pipeline configurations from the PostgreSQL catalog.
Dry-run mode — A pipeline mode where the relay performs all processing (poll, transform, route) but logs output instead of publishing to the sink.
Envelope — The wire format wrapper around a message payload. Determines how metadata (operation type, timestamps, source info) is encoded alongside the data.
Exactly-once processing — Achieved through the combination of at-least-once delivery and inbox deduplication. Each unique message is processed exactly once.
Fan-out — A pattern where a single event stream is delivered to multiple independent consumers (via consumer groups or multiple pipelines).
Forward pipeline — A pipeline that moves messages from an outbox to an external sink (outbox → sink direction).
Graceful shutdown — The relay's shutdown sequence: drain in-flight batches, acknowledge processed messages, release advisory locks, then exit.
Half-open — The circuit breaker state between open and closed, where a single probe message tests whether the sink has recovered.
Hot-reload — Updating pipeline configurations without restarting the relay process. Triggered by LISTEN/NOTIFY or periodic discovery.
Inbox — A PostgreSQL table that receives messages from external systems, providing deduplication and transactional processing guarantees.
JMESPath — A query language for JSON used by pg_tide for message transforms and filters.
NATS JetStream — NATS's persistent messaging layer. pg_tide uses JetStream for durable subscriptions with consumer groups.
Outbox — A PostgreSQL table that stores events published by the application, to be relayed to external systems by the relay process.
Pipeline — A configured relay path: source → transforms → routing → sink. Each pipeline has a name, direction, and configuration.
Relay — The pg-tide binary process that moves messages between PostgreSQL and external systems.
Relay group — A set of relay instances coordinating via the same relay_group_id. Instances within a group distribute pipelines among themselves.
Replay — Reprocessing a range of outbox messages, typically to backfill a new consumer or recover from a failure.
Reverse pipeline — A pipeline that moves messages from an external source into a pg_tide inbox (source → inbox direction).
Routing — Content-based routing that dynamically determines the destination subject/topic for each message based on its payload.
Schema Registry — A service (typically Confluent Schema Registry) that stores and manages Avro schemas for serialization/deserialization.
Sink — The destination for a forward pipeline: Kafka, NATS, HTTP, S3, BigQuery, etc.
Source — The origin for a reverse pipeline: Kafka, NATS, webhook receiver, Singer tap, etc.
Stream table — The logical name/category of an outbox event (e.g., "orders", "user-signups"). Used for routing and filtering.
Subject template — A string with {variable} placeholders that resolves to the final topic/subject name at runtime.
Token bucket — The rate limiting algorithm used by pg_tide. Allows bursts up to a configured capacity, then enforces a steady-state rate.
Tombstone — A null-value message (Kafka concept) that signals deletion of a key during log compaction.
Transform — A JMESPath expression that filters messages (drops non-matching) or reshapes payloads before publishing.
Wire format — The serialization format for messages on the transport layer (native, Debezium, Maxwell, Canal, CDC JSON).