Scaling

How to scale riverbank horizontally and what bottlenecks to watch for.

Horizontal scaling

Increase `replicaCount` in the Helm chart:

replicaCount: 5

Throughput scales linearly with replicas until you hit:

- LLM provider rate limits: the circuit breaker opens if you exceed the provider's concurrency limit.
- PostgreSQL connection limits: each replica holds its own connection pool, so total connections grow with the replica count.
- Advisory lock contention: negligible in practice, because locks are taken per fragment.
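
As a rough way to reason about the first two ceilings, the sketch below estimates how many replicas fit under each one. Every name and number in it (`provider_concurrency_limit`, `per_replica_concurrency`, `pg_max_connections`, `pool_size`, `reserved_connections`) is an illustrative assumption rather than a riverbank default; substitute your own provider quota and PostgreSQL settings.

```python
def max_replicas_before_limits(
    provider_concurrency_limit: int,
    per_replica_concurrency: int,
    pg_max_connections: int,
    pool_size: int,
    reserved_connections: int = 0,
) -> dict:
    """Estimate how many replicas fit under each ceiling (hypothetical inputs)."""
    # Replicas share one provider quota, so their concurrency adds up.
    by_provider = provider_concurrency_limit // per_replica_concurrency
    # Each replica holds its own pool; keep some connections for admin tools.
    by_postgres = (pg_max_connections - reserved_connections) // pool_size
    return {
        "llm_provider": by_provider,
        "postgresql": by_postgres,
        "effective": min(by_provider, by_postgres),
    }


# Assumed example values, not measured or documented ones.
limits = max_replicas_before_limits(
    provider_concurrency_limit=64,
    per_replica_concurrency=8,
    pg_max_connections=100,
    pool_size=10,
    reserved_connections=10,
)
print(limits)
```

With these assumed inputs the provider quota is the tighter limit (8 replicas versus 9 for PostgreSQL), so adding a ninth replica would not increase throughput.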

For memory-intensive corpora (large PDFs via Docling), set resource requests and limits along these lines:

resources:
  limits:
    cpu: 4000m
    memory: 8Gi
  requests:
    cpu: 1000m
    memory: 2Gi
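
Because these values apply per pod, the cluster-wide footprint is the per-pod figure multiplied by the replica count. The sketch below works through that arithmetic for the values above. The helper names are hypothetical, and the parser only handles the `m`, `Gi`, and `Mi` suffixes rather than the full Kubernetes quantity syntax.

```python
def parse_cpu(quantity: str) -> float:
    """Convert a Kubernetes CPU quantity ("4000m" or "4") to cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def parse_memory_gib(quantity: str) -> float:
    """Convert a memory quantity with a Gi or Mi suffix to GiB."""
    if quantity.endswith("Gi"):
        return float(quantity[:-2])
    if quantity.endswith("Mi"):
        return float(quantity[:-2]) / 1024
    raise ValueError(f"unsupported memory suffix in {quantity!r}")


def cluster_footprint(replicas: int, cpu: str, memory: str) -> tuple:
    """Return (total cores, total GiB) across all replicas."""
    return replicas * parse_cpu(cpu), replicas * parse_memory_gib(memory)


replicas = 5  # same replica count as the replicaCount example
requests = cluster_footprint(replicas, cpu="1000m", memory="2Gi")
limits = cluster_footprint(replicas, cpu="4000m", memory="8Gi")
print(f"requests: {requests[0]} cores, {requests[1]} GiB")
print(f"limits:   {limits[0]} cores, {limits[1]} GiB")
```

With five replicas, the pods request 5 cores and 10 GiB in total and can burst up to 20 cores and 40 GiB, which is the headroom the node pool needs if every worker processes a large PDF at once.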

| Symptom | Likely bottleneck | Fix |
|---|---|---|
| High `run_duration_seconds` | LLM latency | Switch provider or model |
| Runs completing but `triples_written` = 0 | Editorial policy too strict | Lower `min_fragment_length` |
| Circuit breaker opening frequently | Provider rate limit | Reduce `maxConcurrency`; adding replicas spreads load across workers but does not raise the provider's limit |
| PostgreSQL CPU saturated | SHACL validation overhead | Reduce SHACL shape complexity |
| Memory OOM on workers | Large fragments in memory | Reduce `max_fragment_tokens` |

With more than 5 replicas, consider placing PgBouncer between the workers and PostgreSQL so that the replicas share a smaller set of server connections:

dbDsn: "postgresql://riverbank:pass@pgbouncer:6432/riverbank"
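
A quick way to confirm that a DSN targets the pooler rather than PostgreSQL directly is to parse it and inspect the host and port. This is a minimal sketch using only the Python standard library; `describe_dsn` is a hypothetical helper, not part of riverbank. Port 6432 is PgBouncer's default listen port, while PostgreSQL itself defaults to 5432.

```python
from urllib.parse import urlsplit


def describe_dsn(dsn: str) -> dict:
    """Split a PostgreSQL URL-style DSN into the parts worth checking."""
    parts = urlsplit(dsn)
    return {
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


target = describe_dsn("postgresql://riverbank:pass@pgbouncer:6432/riverbank")
print(target)

# Heuristic only: a non-default pooler port would need a different check.
if target["port"] == 5432:
    print("DSN points at the default PostgreSQL port, not at PgBouncer")
```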

| Corpus size | Recommended replicas | Notes |
|---|---|---|
| < 100 documents | 1 | Single replica sufficient |
| 100–1000 documents | 3 | Default Helm configuration |
| 1000–10000 documents | 5–10 | Consider provider-level rate limits |
| > 10000 documents | 10+ | Use PgBouncer, monitor lock contention |
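
The sizing table can be expressed as a small lookup, for example in a deployment script. This is a sketch under one stated assumption: the table's tiers share their boundary values, so the code places exactly 1000 documents in the 3-replica tier and exactly 10000 in the 5–10 tier. `recommended_replicas` is a hypothetical helper name.

```python
from typing import Optional


def recommended_replicas(document_count: int) -> tuple:
    """Return (minimum, maximum) replicas; maximum is None when open-ended."""
    if document_count < 0:
        raise ValueError("document_count must be non-negative")
    if document_count < 100:
        return (1, 1)
    if document_count <= 1000:  # boundary placement is an assumption
        return (3, 3)
    if document_count <= 10000:  # boundary placement is an assumption
        return (5, 10)
    upper: Optional[int] = None  # "10+" in the table has no upper bound
    return (10, upper)


for size in (50, 500, 5000, 50000):
    print(size, recommended_replicas(size))
```

Treat the result as a starting point: provider rate limits and PostgreSQL connection limits can cap useful replicas well below the upper end of a tier.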