ETL Overview

ETL (extract, transform, load) integrations in Matia move data from a source (database, SaaS app, or file store) to a destination (typically a data warehouse). Each integration is a pipeline with its own source, destination, schema, schedule, and run history.

What an ETL Integration Is

An integration is one ETL pipeline: a configured connection from a source to a destination that runs on a schedule or on demand. The source and destination are assets, meaning connections you create in Matia (e.g. a Postgres connection or a Snowflake warehouse). You choose which tables (streams) to sync and which sync mode each one uses (full refresh, incremental, or append-only). Matia extracts the data, applies normalization and transformations as needed, and loads it into the destination.

How Data Flows

Extract: Matia reads data from the source according to the schema (enabled tables and sync mode). For incremental syncs, only new or changed rows are read using a cursor.
Transform: Data is normalized and transformed as required by the connector (e.g. flattening nested structures, type coercion). As a result, the number of emitted records (read from the source) can differ from the number of committed records (written to the destination).
Load: Data is written to the destination in the configured schema/database. Sync history is recorded per run.
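
The three stages above can be sketched in a few lines of Python. This is an illustrative model only, not Matia's implementation: the names `run_incremental_sync`, `flatten`, and `SyncResult`, the `updated_at` cursor field, and the in-memory `warehouse` dict are all assumptions made for the example. It also assumes one possible reason for emitted and committed counts to differ, namely that two versions of the same row collapse into one when loaded by primary key.

```python
from dataclasses import dataclass


@dataclass
class SyncResult:
    emitted: int    # records read from the source
    committed: int  # records written to the destination
    cursor: int     # highest cursor value seen, saved for the next run


def flatten(record: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into one level, joining keys with '_'."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}_"))
        else:
            flat[name] = value
    return flat


def run_incremental_sync(source_rows, destination, last_cursor,
                         cursor_field="updated_at"):
    # Extract: read only rows whose cursor value is past the saved cursor.
    emitted = [row for row in source_rows if row[cursor_field] > last_cursor]

    # Transform: flatten nested structures and coerce the id to an integer.
    transformed = []
    for row in emitted:
        flat = flatten(row)
        flat["id"] = int(flat["id"])
        transformed.append(flat)

    # Load: write by primary key, oldest first, so the newest version wins.
    # (Assumption: several versions of one row count as one committed record.)
    for row in sorted(transformed, key=lambda r: r[cursor_field]):
        destination[row["id"]] = row

    committed = len({row["id"] for row in transformed})
    new_cursor = max((row[cursor_field] for row in emitted), default=last_cursor)
    return SyncResult(len(emitted), committed, new_cursor)


source = [
    {"id": "1", "updated_at": 5, "address": {"city": "Oslo"}},
    {"id": "2", "updated_at": 11, "address": {"city": "Lima"}},
    {"id": "2", "updated_at": 12, "address": {"city": "Quito"}},
]
warehouse = {}
result = run_incremental_sync(source, warehouse, last_cursor=10)
print(result)  # SyncResult(emitted=2, committed=1, cursor=12)
```

In this run the row with `updated_at` of 5 is skipped because it sits behind the saved cursor, two versions of row 2 are emitted, and a single flattened row is committed.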

Where to Manage ETL Integrations

Integrations (sidebar): View all available integrations in a single list. Select an integration row to open its details, or use Add Integration to start the creation flow.
Integration details: A tabbed view for managing a single integration.
- Status: View sync history, emitted/committed data, and per-table insights
- Schema: Enable or disable tables, set the sync mode, and configure cursor settings
- Settings: Manage triggers, schema changes, post-run actions, notifications, data resyncs, and integration deletion
- Changelog: View the audit log of changes
- Schema changes (shown only if the connector supports it): View detected schema updates

Key Concepts

Sync: A single run of the integration. Each sync has a status (successful, failed, or completed with errors), data volume, emitted/committed record counts, and duration.
Sync mode: Full refresh (all data each time), incremental (only new/changed rows), or append-only. Configured per table in the Schema tab where supported.
Schema changes: New schemas, tables, or columns in the source. You control whether the integration adopts them automatically; see How ingestion works and Sync modes.
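
To make the difference between sync modes concrete, here is a minimal Python sketch of how each mode might change a destination table. It is a mental model under stated assumptions, not Matia's behavior: `SyncMode` and `apply_sync` are hypothetical names, incremental is assumed to merge rows by an `id` key, and append-only is assumed to add incoming rows without touching existing ones. See Sync modes for the actual semantics.

```python
from enum import Enum


class SyncMode(Enum):
    FULL_REFRESH = "full_refresh"
    INCREMENTAL = "incremental"
    APPEND_ONLY = "append_only"


def apply_sync(mode, destination, batch, key="id"):
    """Return the destination rows after applying one sync's batch."""
    if mode is SyncMode.FULL_REFRESH:
        # The batch is the whole source table; it replaces what was there.
        return list(batch)
    if mode is SyncMode.INCREMENTAL:
        # The batch holds only new or changed rows; merge them by key.
        merged = {row[key]: row for row in destination}
        merged.update({row[key]: row for row in batch})
        return list(merged.values())
    if mode is SyncMode.APPEND_ONLY:
        # Assumption: incoming rows are added and existing rows are kept.
        return list(destination) + list(batch)
    raise ValueError(f"unknown sync mode: {mode!r}")


existing = [{"id": 1, "plan": "free"}]
incoming = [{"id": 1, "plan": "pro"}, {"id": 2, "plan": "free"}]

refreshed = apply_sync(SyncMode.FULL_REFRESH, existing, incoming)
merged = apply_sync(SyncMode.INCREMENTAL, existing, incoming)
appended = apply_sync(SyncMode.APPEND_ONLY, existing, incoming)

print(len(refreshed), len(merged), len(appended))  # 2 2 3
```

With the same incoming rows, the incremental result holds one row per `id` with the latest values, while the append-only result keeps both versions of row 1.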

To build your first pipeline, see Create Your First Integration. For reference material, see Sync modes, Sync logs and run history, and Assets and connections.
