# Handling Schema Drift
Schema drift happens when the source table’s structure changes while the pipeline is running — a column is added, dropped, renamed, or its type changes. Nanosync detects all of these and responds according to how you’ve configured `schema_drift_mode`.
## Default behavior by change type
Nanosync checks schema on every batch. When it detects a difference between the source schema and what it last wrote to the sink, it classifies the change and acts:
| Change | Default behavior |
|---|---|
| Column added | Automatically adds the column to the sink on the next batch. No intervention needed. |
| Column dropped | Pipeline pauses with a SCHEMA_DRIFT alert. Manual resolution required. |
| Column renamed | Treated as a drop + add. Pipeline pauses. |
| Type changed | Pipeline pauses with a SCHEMA_DRIFT alert. |
| Table added (new table in publication) | Ignored unless the table appears in the pipeline’s tables list. |
Column additions are non-breaking and handled automatically. Everything else stops the pipeline so you can make a deliberate decision about what the sink should look like.
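The classification above amounts to a diff between the source's column map and what was last written to the sink. The following is a minimal illustrative sketch of that idea — not nanosync's actual detection logic — using plain dicts of column name to type:

```python
def classify_drift(source_cols: dict, sink_cols: dict) -> list:
    """Toy classifier for the change types in the table above.

    source_cols / sink_cols map column name -> type. Illustrative
    only; nanosync's real implementation is not shown here.
    """
    changes = []
    for name, col_type in source_cols.items():
        if name not in sink_cols:
            changes.append(("column_added", name))    # non-breaking: auto-add
        elif sink_cols[name] != col_type:
            changes.append(("type_changed", name))    # breaking: pause
    for name in sink_cols:
        if name not in source_cols:
            changes.append(("column_dropped", name))  # breaking: pause
    return changes
```

Note that a rename surfaces as one `column_dropped` plus one `column_added`, which is why nanosync treats renames as a drop + add and pauses.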
## The `schema_drift_mode` option
Control the pipeline’s response to breaking schema changes with `schema_drift_mode` in the source `properties` block:
```yaml
pipelines:
  - name: orders-pipeline
    source:
      connection: prod-postgres
      tables: [public.orders]
      properties:
        schema_drift_mode: pause
    sink:
      connection: prod-bigquery
```
| Value | Behavior |
|---|---|
| `pause` | (default) Pipeline pauses on any breaking change. Fires a `schema_drift` alert if configured. |
| `ignore` | Pipeline continues. Events for columns that don’t exist in the sink are silently dropped. |
| `fail` | Pipeline exits immediately with a non-zero status. Use in CI or environments where schema changes should never happen unattended. |
`pause` is the right default for production — it stops replication rather than silently losing data, and gives you time to decide what to do with the sink schema.
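In CI, the same pipeline might instead pin `fail` so any unexpected DDL aborts the run with a non-zero exit status. A sketch, with hypothetical `ci-postgres` / `ci-bigquery` connection names:

```yaml
pipelines:
  - name: orders-pipeline-ci
    source:
      connection: ci-postgres
      tables: [public.orders]
      properties:
        schema_drift_mode: fail   # abort the run on any breaking change
    sink:
      connection: ci-bigquery
```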
## What a paused pipeline looks like
When a breaking change is detected, `nanosync monitor` shows:
```
NAME             SOURCE         TARGET    STATUS    LAG
orders-pipeline  prod-postgres  bigquery  ⚠ paused  —

SCHEMA_DRIFT: column dropped — public.orders.internal_flag
```
The pipeline stops consuming new WAL events. No events are lost — nanosync holds its LSN at the point of the drift and will resume from exactly there once you clear the pause.
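The hold-and-resume contract can be pictured with a toy consumer — a conceptual sketch, not nanosync internals: on drift it stops and keeps its last committed LSN, and on resume it continues from exactly that point with no duplicates.

```python
class ToyPipeline:
    """Sketch of the pause/resume contract (not nanosync's code)."""

    def __init__(self, events):
        self.events = events          # list of (lsn, payload) pairs
        self.committed_lsn = 0
        self.paused = False
        self.drift_resolved = False
        self.applied = []

    def run(self):
        for lsn, payload in self.events:
            if lsn <= self.committed_lsn:
                continue                      # already applied: no duplicates
            if payload == "DRIFT" and not self.drift_resolved:
                self.paused = True            # hold LSN at the point of drift
                return
            if payload != "DRIFT":
                self.applied.append(payload)
            self.committed_lsn = lsn

    def resume(self):
        # Models the operator fixing the sink schema, then resuming.
        self.drift_resolved = True
        self.paused = False
        self.run()
```

Running it against `[(1, "a"), (2, "DRIFT"), (3, "b")]` pauses after applying `"a"`, then `resume()` applies `"b"` without replaying `"a"`.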
## Resuming after a pause
1. **Identify the drift** — read the alert message or run `nanosync pipeline status orders-pipeline` to see the specific column and change type.
2. **Update the sink schema** — make the corresponding change in the sink. For BigQuery, you might drop the column, add a nullable replacement, or do nothing if the column is no longer needed. This step is manual and intentional.
3. **Resume the pipeline:**

   ```
   nanosync pipeline resume orders-pipeline
   ```

   The pipeline picks up from the LSN where it paused. No re-snapshot, no duplicates.
If you want to skip the resume and instead re-snapshot the table from scratch (for example, after significant schema restructuring):
```
nanosync pipeline reset orders-pipeline --table public.orders
nanosync pipeline resume orders-pipeline
```
`reset` clears the snapshot checkpoint for the specified table, triggering a new full-table read on resume.
## Configuring drift alerts
Send a notification when schema drift occurs by adding a `schema_drift` alert rule:
```yaml
pipelines:
  - name: orders-pipeline
    source:
      connection: prod-postgres
      tables: [public.orders]
    sink:
      connection: prod-bigquery
    alerts:
      - event: schema_drift
        channels: [slack, pagerduty]
```
Notification channels are defined in the server config’s `notifications` block. See Configuration reference for the full alert config.
## Preventing drift in the first place
For tables where schema changes are frequent, consider using `exclude_columns` to drop columns you don’t need in the sink. Excluded columns are never written, so changes to them don’t trigger drift detection:
```yaml
source:
  connection: prod-postgres
  tables: [public.orders]
  exclude_columns:
    public.orders: [debug_flag, internal_notes, legacy_field]
```
`schema_drift_mode: ignore` silently drops events for columns that don’t exist in the sink. If a new required column is added to the source, those values are discarded without any error or alert. Only use `ignore` if you have independent schema validation in place downstream.
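If you do run with `ignore`, a downstream check like the following can catch silent column loss. This is a hypothetical helper — nanosync does not provide it — assuming you fetch the column lists yourself (e.g. from Postgres `information_schema` and the BigQuery table metadata):

```python
def find_dropped_columns(source_cols, sink_cols, excluded=()):
    """Return source columns whose values are being silently discarded.

    Hypothetical validation helper for pipelines running with
    schema_drift_mode: ignore. Columns deliberately excluded via
    exclude_columns are not flagged.
    """
    return sorted(set(source_cols) - set(sink_cols) - set(excluded))
```

Alert whenever the result is non-empty; an empty list means every source column is either replicated or deliberately excluded.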