Feature: Airbyte Protocol Support

pg_tide implements the Airbyte protocol for running Airbyte source and destination connectors. This gives you access to approximately 400 data connectors from the Airbyte catalog, each packaged as a Docker container with a standardized interface.

What is the Airbyte Protocol?

The Airbyte protocol defines how source connectors (extractors) and destination connectors (loaders) communicate. Connectors are Docker containers that read their configuration from a JSON file and exchange JSON messages, one per line, over stdin and stdout. The main message types are:

  • AirbyteRecordMessage: a single data row from one stream
  • AirbyteStateMessage: a sync checkpoint used by incremental mode
  • AirbyteCatalog: the available streams and their schemas
  • AirbyteLogMessage: connector log output
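As a rough illustration of how a host might consume these messages, the sketch below groups a connector's output lines by their `type` field. The function name `route_messages` and the sample lines are hypothetical and are not part of pg_tide; the sketch only assumes that each message is a JSON object whose payload sits under the lower-cased type name.

```python
import json
from collections import defaultdict


def route_messages(lines):
    """Group Airbyte-style messages (one JSON object per line) by message type."""
    routed = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            msg = json.loads(line)
        except json.JSONDecodeError:
            msg = None
        if not isinstance(msg, dict):
            # Anything that is not a JSON object is kept as raw connector output.
            routed["RAW"].append(line)
            continue
        msg_type = str(msg.get("type", "UNKNOWN"))
        # The payload lives under the lower-cased type name, e.g. "record" for RECORD.
        routed[msg_type].append(msg.get(msg_type.lower(), msg))
    return dict(routed)


# Hypothetical connector output, for illustration only.
sample_output = [
    '{"type": "LOG", "log": {"level": "INFO", "message": "starting sync"}}',
    '{"type": "RECORD", "record": {"stream": "contacts", "data": {"id": 1}, "emitted_at": 1700000000000}}',
    '{"type": "STATE", "state": {"data": {"contacts": {"cursor": "2024-01-01"}}}}',
]
routed = route_messages(sample_output)
```

Here `routed["RECORD"]` holds the record payloads and `routed["STATE"]` the checkpoints, which is the split a host needs before it can store rows and persist state separately.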

pg_tide as an Airbyte Source Host

pg_tide can run Airbyte source connectors and ingest their records into an inbox:

SELECT tide.relay_set_inbox(
    'salesforce-data',
    'crm_inbox',
    '{
        "source_type": "airbyte",
        "source_image": "airbyte/source-salesforce:latest",
        "source_config": {
            "client_id": "${env:SF_CLIENT_ID}",
            "client_secret": "${env:SF_CLIENT_SECRET}",
            "refresh_token": "${env:SF_REFRESH_TOKEN}"
        },
        "streams": ["contacts", "opportunities"],
        "sync_mode": "incremental"
    }'::jsonb
);

See Sources: Airbyte for full configuration.
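The `${env:NAME}` placeholders in the configuration above keep secrets out of the stored JSON. How pg_tide resolves them is not shown here; the following is only a minimal sketch of one way such placeholders could be expanded, and `expand_env` is a hypothetical helper, not a pg_tide function.

```python
import os
import re

# Matches placeholders of the form ${env:VARIABLE_NAME}.
_PLACEHOLDER = re.compile(r"\$\{env:([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value, env=None):
    """Recursively replace ${env:NAME} placeholders in a JSON-like structure."""
    env = os.environ if env is None else env
    if isinstance(value, dict):
        return {key: expand_env(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item, env) for item in value]
    if isinstance(value, str):
        def replace(match):
            name = match.group(1)
            if name not in env:
                # Fail loudly rather than passing an unresolved secret to a connector.
                raise KeyError(f"environment variable {name} is not set")
            return env[name]
        return _PLACEHOLDER.sub(replace, value)
    return value


# Example with an explicit mapping instead of the real process environment.
config = {"client_id": "${env:SF_CLIENT_ID}", "streams": ["contacts"]}
resolved = expand_env(config, {"SF_CLIENT_ID": "abc123"})
```

Raising on a missing variable is a design choice in this sketch: a sync that starts with an empty credential tends to fail later with a less helpful error.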

pg_tide as an Airbyte Destination Host

pg_tide can run Airbyte destination connectors and feed them outbox messages:

SELECT tide.relay_set_outbox(
    'warehouse-sync',
    'analytics_events',
    '{
        "sink_type": "airbyte",
        "destination_image": "airbyte/destination-bigquery:latest",
        "destination_config": {
            "project_id": "my-project",
            "dataset_id": "raw_events",
            "credentials_json": "${env:GCP_CREDENTIALS}"
        }
    }'::jsonb
);

See Sinks: Airbyte for full configuration.
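A destination connector reads record messages from stdin, so feeding it outbox messages amounts to wrapping each row in a record envelope. The sketch below shows that wrapping under the assumption that each outbox row is already a JSON-serializable dict; `to_record_line` is a hypothetical name, and the stream name is taken from the example above purely for illustration.

```python
import json
import time


def to_record_line(stream, row, emitted_at_ms=None):
    """Wrap one row in an Airbyte-style RECORD message, serialized as a single line."""
    if emitted_at_ms is None:
        # The protocol expresses emitted_at in milliseconds since the Unix epoch.
        emitted_at_ms = int(time.time() * 1000)
    message = {
        "type": "RECORD",
        "record": {"stream": stream, "data": row, "emitted_at": emitted_at_ms},
    }
    return json.dumps(message)


# One outbox row becomes one line on the destination's stdin.
line = to_record_line("analytics_events", {"event": "signup", "user_id": 42}, 1700000000000)
```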

Sync Modes

Mode | Behavior
incremental | Only new or changed records since the last sync
full_refresh | Re-extracts all records on every run

Incremental mode persists the connector's latest state between runs (the same idea as Singer's STATE), so each sync transfers only the delta.
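The difference between the two modes can be sketched with a toy source. Everything here is hypothetical: `fake_source` stands in for a real connector, rows are assumed to arrive ordered by an `updated_at` cursor, and the saved state is simply the last cursor value seen.

```python
def fake_source(rows, state):
    """Hypothetical connector: emit rows newer than the saved cursor, then a checkpoint."""
    cursor = (state or {}).get("cursor", 0)
    for row in rows:
        if row["updated_at"] > cursor:
            yield {"type": "RECORD", "record": {"stream": "contacts", "data": row}}
            cursor = row["updated_at"]
            yield {"type": "STATE", "state": {"data": {"cursor": cursor}}}


def run_sync(rows, saved_state):
    """Collect records and return the newest checkpoint to persist for the next run."""
    records, new_state = [], saved_state
    for msg in fake_source(rows, saved_state):
        if msg["type"] == "RECORD":
            records.append(msg["record"]["data"])
        elif msg["type"] == "STATE":
            new_state = msg["state"]["data"]
    return records, new_state


rows = [{"id": 1, "updated_at": 10}, {"id": 2, "updated_at": 20}]
first, state = run_sync(rows, None)      # first run: no saved state, all rows
rows.append({"id": 3, "updated_at": 30})
second, state = run_sync(rows, state)    # incremental: only the new row
full, _ = run_sync(rows, None)           # full refresh: state ignored, all rows again
```

Passing `None` as the saved state reproduces a full refresh, while passing the previous checkpoint yields only the rows added since.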

Docker Requirement

Airbyte connectors run as Docker containers. The relay host must have Docker available:

# Verify Docker is accessible
docker ps

The relay pulls connector images automatically on first use. For air-gapped environments, mirror the required images into a local registry ahead of time.
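To make the container-per-sync model concrete, the sketch below checks whether a `docker` binary is on the PATH and builds the argument list for invoking a connector. The connector commands (`spec`, `check`, `discover`, `read`, `write`) and the `--config`, `--catalog`, and `--state` flags come from the Airbyte connector interface; the helper names, the `/data` mount point, and the file names are assumptions for illustration, and this is not how pg_tide is known to launch containers.

```python
import shutil

# Commands defined by the Airbyte connector interface.
CONNECTOR_COMMANDS = {"spec", "check", "discover", "read", "write"}


def docker_available():
    """Return True if a docker executable can be found on the PATH."""
    return shutil.which("docker") is not None


def connector_command(image, command, mount_dir, config="config.json",
                      catalog=None, state=None):
    """Build a docker argv for one connector invocation (nothing is executed here)."""
    if command not in CONNECTOR_COMMANDS:
        raise ValueError(f"unknown connector command: {command}")
    # Mount the host directory holding the JSON files at /data (an assumed path).
    argv = ["docker", "run", "--rm", "-i", "-v", f"{mount_dir}:/data", image, command]
    if command != "spec":
        argv += ["--config", f"/data/{config}"]
    if catalog:
        argv += ["--catalog", f"/data/{catalog}"]
    if state:
        argv += ["--state", f"/data/{state}"]
    return argv


cmd = connector_command(
    "airbyte/source-salesforce:latest", "read", "/tmp/sync",
    catalog="catalog.json", state="state.json",
)
```

The resulting list can be handed to `subprocess.run` or `subprocess.Popen`; keeping it as a list rather than a shell string avoids quoting problems with paths and image names.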

Differences from Singer

Aspect | Singer | Airbyte
Packaging | Python packages (pip) | Docker containers
Discovery | --discover flag | Separate discover command
State | JSON file passed with --state | State messages in protocol
Schema | SCHEMA messages | Catalog with supported sync modes
Ecosystem | ~500 connectors | ~400 connectors
Overhead | Low (native process) | Higher (Docker container per sync)

Choose Singer when you want lightweight connectors without Docker. Choose Airbyte when you need connectors only available in the Airbyte catalog or prefer container isolation.

Further Reading