Replication Modes
replication_type in the pipeline definition controls how nanosync reads from the source. There are three modes.
| Mode | What it does | Use case |
|---|---|---|
| cdc_backfill | Snapshot existing rows, then stream live changes | Default (production pipelines) |
| cdc_only | Skip snapshot, stream changes from now | Historical data already in destination |
| snapshot_only | Full table read, then stop | One-time migration, no ongoing CDC |
cdc_backfill — snapshot then live CDC
The default. Two sequential phases:
- Snapshot phase: nanosync reads every row across all configured tables using parallel partition workers. The snapshot runs inside a consistent read (Postgres: REPEATABLE READ transaction; SQL Server: snapshot isolation) and is fully resumable if interrupted.
- CDC phase: before the snapshot starts, nanosync records the current WAL LSN (Postgres) or CDC LSN (SQL Server). When the snapshot completes, replication picks up from that exact position. There is no gap between the last snapshotted row and the first CDC event.
Use cdc_backfill when you’re starting from scratch. It guarantees the destination has a complete, consistent copy of the source before live streaming begins.
pipelines:
  - name: orders-pipeline
    replication_type: cdc_backfill
    source:
      connection: prod-postgres
      tables: [public.orders]
    sink:
      connection: prod-bigquery
The snapshot phase is visible in nanosync monitor as snapshotting. Once it completes, the pipeline transitions to live CDC automatically.
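The no-gap handoff between the two phases can be illustrated with a small in-memory simulation. This is only a sketch of the general technique (record the log position first, copy a consistent view, then replay everything after that position), not nanosync's implementation. `ToySource`, `backfill_then_stream`, and the integer LSNs are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ToySource:
    """Hypothetical in-memory stand-in for a source database with a change log."""
    rows: dict = field(default_factory=dict)
    log: list = field(default_factory=list)  # ordered (lsn, key, value) tuples

    def write(self, key, value):
        # Every write updates the table and appends to the change log.
        lsn = len(self.log) + 1
        self.rows[key] = value
        self.log.append((lsn, key, value))

    def current_lsn(self):
        return len(self.log)

    def consistent_view(self):
        # Stands in for a REPEATABLE READ or snapshot-isolation read.
        return dict(self.rows)

    def changes_after(self, lsn):
        return [event for event in self.log if event[0] > lsn]


def backfill_then_stream(source, concurrent_writes):
    start_lsn = source.current_lsn()       # 1. record the log position first
    view = source.consistent_view()        # 2. open the consistent read
    for key, value in concurrent_writes:   # writes that land mid-snapshot
        source.write(key, value)
    destination = dict(view)               # 3. copy the snapshotted rows
    for _, key, value in source.changes_after(start_lsn):
        destination[key] = value           # 4. replay changes as upserts
    return destination


source = ToySource()
source.write("order-1", "pending")
source.write("order-2", "pending")
result = backfill_then_stream(
    source, [("order-1", "shipped"), ("order-3", "pending")]
)
assert result == source.rows  # destination matches the source, nothing missed
```

Because the position is recorded before the snapshot begins, any write that lands during the snapshot is replayed afterwards, and replaying as an upsert keeps the result correct even if a row was also captured by the snapshot.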
cdc_only — live changes only
Skips the snapshot entirely. Nanosync creates the replication slot (or CDC subscription) and starts streaming from the current position. No historical rows are written to the destination.
Use this when you’ve already loaded historical data by another method — for example, pg_dump + bq load for Postgres, or SSIS for SQL Server — and only need ongoing changes from this point forward.
pipelines:
  - name: orders-pipeline
    replication_type: cdc_only
    source:
      connection: prod-postgres
      tables: [public.orders]
    sink:
      connection: prod-bigquery
The pipeline goes directly to live CDC status on start — no snapshotting phase.
snapshot_only — one-time full copy
Reads all rows from every configured table and exits. Nanosync does not create a replication slot and does not hold an ongoing connection to the source after the snapshot completes. The pipeline status becomes idle when done.
Use this for:
- Bulk migrations where you want to seed a new destination from an existing database.
- One-off data exports to BigQuery or a file sink.
- Testing destination schema before setting up a live pipeline.
pipelines:
  - name: orders-migration
    replication_type: snapshot_only
    source:
      connection: prod-postgres
      tables: [public.orders, public.order_items]
    sink:
      connection: prod-bigquery
Because no replication slot is created, there is no WAL retention pressure on the source during or after the snapshot.
CDC capture mode (SQL Server only)
SQL Server pipelines have a second dimension: properties.cdc_mode, which controls the underlying mechanism used to read changes. This is independent of replication_type.
| Mode | Mechanism |
|---|---|
| cdc (default) | Reads SQL Server CDC change tables. Requires CDC enabled on the database and target tables. |
| tlog | Reads the transaction log directly via sys.fn_dblog. No CDC setup required. |
cdc is the default and is recommended for most cases. Use tlog when you cannot enable CDC on the source — for example, in managed cloud instances where CDC is restricted, or when you need to minimize source-side configuration.
Postgres has only one capture mechanism (WAL logical replication via pgoutput). cdc_mode is SQL Server-specific and has no effect on Postgres pipelines.
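The rules above (two allowed values, `cdc` as the default, no effect on Postgres) can be expressed as a small check. This is a hedged sketch rather than nanosync's own validation logic: `resolve_cdc_mode` and the `source_kind` label are hypothetical and are not nanosync settings.

```python
VALID_CDC_MODES = {"cdc", "tlog"}


def resolve_cdc_mode(source_kind, properties=None):
    """Return the effective capture mode, or None when cdc_mode does not apply.

    source_kind is a hypothetical label ("sqlserver" or "postgres") used only
    for this illustration.
    """
    properties = properties or {}
    if source_kind != "sqlserver":
        # Postgres has a single capture mechanism, so cdc_mode is ignored.
        return None
    mode = properties.get("cdc_mode", "cdc")  # "cdc" is the documented default
    if mode not in VALID_CDC_MODES:
        raise ValueError(
            f"cdc_mode must be one of {sorted(VALID_CDC_MODES)}, got {mode!r}"
        )
    return mode


default_mode = resolve_cdc_mode("sqlserver")
tlog_mode = resolve_cdc_mode("sqlserver", {"cdc_mode": "tlog"})
postgres_mode = resolve_cdc_mode("postgres", {"cdc_mode": "tlog"})
```

Under these assumptions, `default_mode` is `"cdc"`, `tlog_mode` is `"tlog"`, and `postgres_mode` is `None` because the setting is ignored for Postgres sources.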
Full example
pipelines:
  - name: orders-pipeline
    replication_type: cdc_backfill   # snapshot then live
    source:
      connection: prod-sqlserver
      tables: [dbo.orders]
      properties:
        cdc_mode: cdc                # SQL Server only: "cdc" or "tlog"
        poll_interval: "5s"
    sink:
      connection: prod-bigquery
If you’re unsure which mode to use, start with cdc_backfill. It’s the safe default — it guarantees historical data is present in the destination before live streaming begins, with no gap between the snapshot and the CDC stream.
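The choice between the three modes comes down to two questions, which can be written out as a small decision helper. This is an illustrative sketch only; `choose_replication_type` and its parameter names are hypothetical and not part of nanosync.

```python
def choose_replication_type(needs_ongoing_changes, history_already_loaded):
    """Map the two questions from this page onto a replication_type value."""
    if not needs_ongoing_changes:
        # One-time copy: read every row, then stop.
        return "snapshot_only"
    if history_already_loaded:
        # Historical rows are already in the destination: stream from now.
        return "cdc_only"
    # Starting from scratch: snapshot first, then stream live changes.
    return "cdc_backfill"


fresh_pipeline = choose_replication_type(
    needs_ongoing_changes=True, history_already_loaded=False
)
preloaded_pipeline = choose_replication_type(
    needs_ongoing_changes=True, history_already_loaded=True
)
one_off_copy = choose_replication_type(
    needs_ongoing_changes=False, history_already_loaded=False
)
```

Here `fresh_pipeline` resolves to `"cdc_backfill"`, `preloaded_pipeline` to `"cdc_only"`, and `one_off_copy` to `"snapshot_only"`.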