Handling Schema Drift

Schema drift happens when the source table’s structure changes while the pipeline is running — a column is added, dropped, renamed, or its type changes. Nanosync detects all of these and responds according to how you’ve configured schema_drift_mode.


Default behavior by change type

Nanosync checks the schema on every batch. When it detects a difference between the source schema and what it last wrote to the sink, it classifies the change and acts:

Change                                   Default behavior
Column added                             Automatically adds the column to the sink on the next batch. No intervention needed.
Column dropped                           Pipeline pauses with a SCHEMA_DRIFT alert. Manual resolution required.
Column renamed                           Treated as a drop + add. Pipeline pauses.
Type changed                             Pipeline pauses with a SCHEMA_DRIFT alert.
Table added (new table in publication)   Ignored unless the table appears in the pipeline’s tables list.

Column additions are non-breaking and handled automatically. Everything else stops the pipeline so you can make a deliberate decision about what the sink should look like.
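The classification above can be pictured as a diff between two column maps. The following is a minimal sketch, not Nanosync’s actual implementation: the function name classify_drift, the dict-based schema representation, and the change labels are all assumptions made for illustration. It also shows why a rename cannot be told apart from a drop plus an add when only column names and types are compared.

```python
from typing import Dict, List, Tuple

# Hypothetical schema representation: column name -> type name.
Schema = Dict[str, str]


def classify_drift(last_written: Schema, current: Schema) -> List[Tuple[str, str]]:
    """Label every difference between the last written schema and the current one."""
    changes: List[Tuple[str, str]] = []
    # Columns present now but absent before are additions.
    for name in current:
        if name not in last_written:
            changes.append(("column_added", name))
    # Columns that vanished are drops; columns whose type differs are type changes.
    for name, old_type in last_written.items():
        if name not in current:
            changes.append(("column_dropped", name))
        elif current[name] != old_type:
            changes.append(("type_changed", name))
    return changes


before = {"id": "bigint", "total": "numeric", "internal_flag": "boolean"}
# "total" changes type and "internal_flag" is renamed to "flag".
after = {"id": "bigint", "total": "text", "flag": "boolean"}

drift = classify_drift(before, after)
print(drift)
```

In this sketch the rename of internal_flag to flag surfaces as one column_added entry and one column_dropped entry, which matches the drop + add treatment described in the table.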


The schema_drift_mode option

Control the pipeline’s response to breaking schema changes with schema_drift_mode in the source properties block:

pipelines:
  - name: orders-pipeline
    source:
      connection: prod-postgres
      tables: [public.orders]
      properties:
        schema_drift_mode: pause
    sink:
      connection: prod-bigquery

Value    Behavior
pause    (default) Pipeline pauses on any breaking change. Fires a schema_drift alert if configured.
ignore   Pipeline continues. Events for columns that don’t exist in the sink are silently dropped.
fail     Pipeline exits immediately with a non-zero status. Use in CI or environments where schema changes should never happen unattended.

pause is the right default for production — it stops replication rather than silently losing data, and gives you time to decide what to do with the sink schema.
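To make the interaction between change type and mode concrete, here is a small decision-table sketch. It is illustrative only: decide_action and the returned action labels are hypothetical names, and the sketch assumes that column additions are applied automatically regardless of mode, which the tables above suggest but do not state outright for ignore.

```python
# Change kinds the documentation treats as breaking.
BREAKING = {"column_dropped", "column_renamed", "type_changed"}

# Hypothetical action labels, one per documented schema_drift_mode value.
MODE_ACTIONS = {
    "pause": "pause_and_alert",
    "ignore": "continue_and_drop_unknown_columns",
    "fail": "exit_nonzero",
}


def decide_action(change: str, mode: str = "pause") -> str:
    """Return the action for a detected change under the given mode."""
    if change == "column_added":
        # Assumption: additions are non-breaking and bypass schema_drift_mode.
        return "add_column_to_sink"
    if change not in BREAKING:
        raise ValueError(f"unknown change kind: {change}")
    if mode not in MODE_ACTIONS:
        raise ValueError(f"unknown schema_drift_mode: {mode}")
    return MODE_ACTIONS[mode]


for change in ("column_added", "column_dropped", "type_changed"):
    for mode in ("pause", "ignore", "fail"):
        print(f"{change:15} {mode:7} {decide_action(change, mode)}")
```

Laying the modes out as a lookup table makes it easy to see that only breaking changes consult the mode at all.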


What a paused pipeline looks like

When a breaking change is detected, nanosync monitor shows:

NAME              SOURCE         TARGET     STATUS        LAG
orders-pipeline   prod-postgres   bigquery   ⚠ paused     —
                  SCHEMA_DRIFT: column dropped — public.orders.internal_flag

The pipeline stops consuming new WAL events. No events are lost: Nanosync holds its LSN at the point of the drift and resumes from exactly there once you clear the pause.
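The hold-and-resume behavior can be modeled with a checkpoint that only advances when an event is written. This is a toy model under stated assumptions, not Nanosync’s internals: the ToyPipeline class, the integer LSNs, and the idea of drift appearing as a marker in the event list are all hypothetical simplifications.

```python
from typing import List, Optional, Tuple

# Hypothetical event shape: (lsn, kind), where kind is "row" or "drift".
Event = Tuple[int, str]


class ToyPipeline:
    def __init__(self) -> None:
        self.checkpoint = 0                  # highest LSN already handled
        self.paused_at: Optional[int] = None
        self.written: List[int] = []         # LSNs written to the sink

    def run(self, wal: List[Event]) -> str:
        for lsn, kind in wal:
            if lsn <= self.checkpoint:
                continue                     # already handled, never rewritten
            if kind == "drift":
                self.paused_at = lsn         # hold position, write nothing further
                return "paused"
            self.written.append(lsn)
            self.checkpoint = lsn
        return "running"

    def resume(self, wal: List[Event]) -> str:
        if self.paused_at is None:
            raise RuntimeError("pipeline is not paused")
        # The operator has fixed the sink, so step past the drift marker.
        self.checkpoint = self.paused_at
        self.paused_at = None
        return self.run(wal)


wal = [(1, "row"), (2, "row"), (3, "drift"), (4, "row"), (5, "row")]
p = ToyPipeline()
first = p.run(wal)
written_while_paused = list(p.written)
second = p.resume(wal)
print(first, written_while_paused, second, p.written)
```

Because the checkpoint never moves past an unwritten event, replaying the same stream after the pause writes each row exactly once in this model.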


Resuming after a pause

  1. Identify the drift — read the alert message or run nanosync pipeline status orders-pipeline to see the specific column and change type.

  2. Update the sink schema — make the corresponding change in the sink. For BigQuery, you might drop the column, add a nullable replacement, or do nothing if the column is no longer needed. This step is manual and intentional.

  3. Resume the pipeline:

    nanosync pipeline resume orders-pipeline

    The pipeline picks up from the LSN where it paused. No re-snapshot, no duplicates.

If you would rather re-snapshot the table from scratch than continue from the paused LSN (for example, after significant schema restructuring), reset the table before resuming:

nanosync pipeline reset orders-pipeline --table public.orders
nanosync pipeline resume orders-pipeline

reset clears the snapshot checkpoint for the specified table, triggering a new full-table read on resume.


Configuring drift alerts

Send a notification when schema drift occurs by adding a schema_drift alert rule:

pipelines:
  - name: orders-pipeline
    source:
      connection: prod-postgres
      tables: [public.orders]
    sink:
      connection: prod-bigquery
    alerts:
      - event: schema_drift
        channels: [slack, pagerduty]

Notification channels are defined in the server config’s notifications block. See Configuration reference for the full alert config.


Preventing drift in the first place

For tables where schema changes are frequent, consider listing the columns you don’t need in the sink under exclude_columns. Excluded columns are never written, so changes to them don’t trigger drift detection:

source:
  connection: prod-postgres
  tables: [public.orders]
  exclude_columns:
    public.orders: [debug_flag, internal_notes, legacy_field]
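One way to see why excluded columns cannot trigger drift is to filter them out before the schemas are compared. The sketch below is an assumption about how such filtering could work, not a description of Nanosync’s code; visible_columns and has_drift are hypothetical helper names.

```python
from typing import Dict, Set

# Hypothetical schema representation: column name -> type name.
Schema = Dict[str, str]


def visible_columns(schema: Schema, excluded: Set[str]) -> Schema:
    """Drop excluded columns so they never take part in comparisons."""
    return {name: typ for name, typ in schema.items() if name not in excluded}


def has_drift(last_written: Schema, current: Schema, excluded: Set[str]) -> bool:
    """Compare only the columns that are actually replicated."""
    return visible_columns(last_written, excluded) != visible_columns(current, excluded)


excluded = {"debug_flag", "internal_notes", "legacy_field"}
before = {"id": "bigint", "total": "numeric", "debug_flag": "boolean"}

# Dropping an excluded column is invisible to the comparison.
dropped_excluded = {"id": "bigint", "total": "numeric"}
# Changing the type of a replicated column is still detected.
changed_replicated = {"id": "bigint", "total": "text", "debug_flag": "boolean"}

print(has_drift(before, dropped_excluded, excluded))
print(has_drift(before, changed_replicated, excluded))
```

The trade-off is that an excluded column is gone from the sink entirely, so this only suits columns you are sure nothing downstream needs.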

Warning: schema_drift_mode: ignore silently drops events for columns that don’t exist in the sink. If a new required column is added to the source, its values are discarded without any error or alert. Only use ignore if you have independent schema validation in place downstream.