Ingest large amounts of data into PostgreSQL

Use this Python package to ingest data into SQLAlchemy-defined PostgreSQL tables, leveraging high-watermarking to keep them up to date without re-ingesting the same data.
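A minimal sketch of an ingest, assuming the package's `ingest` function and an illustrative table, column names, and connection string (all placeholders - adapt them to your schema and database):

```python
import sqlalchemy as sa

# SQLAlchemy-defined target table; pg-bulk-ingest migrates it as needed
metadata = sa.MetaData()
my_table = sa.Table(
    "my_table",
    metadata,
    sa.Column("id", sa.INTEGER, primary_key=True),
    sa.Column("value", sa.VARCHAR),
    schema="my_schema",
)

def batches(high_watermark):
    # Each yielded batch is (high_watermark, batch_metadata, rows), where
    # each row is a (table, tuple-of-values) pair. The rows iterable can be
    # a generator, so the full dataset never has to fit in memory.
    yield None, None, (
        (my_table, (1, "a")),
        (my_table, (2, "b")),
    )

def run_ingest():
    # Requires a running PostgreSQL instance, so kept out of module import;
    # the DSN below is a placeholder
    from pg_bulk_ingest import ingest
    engine = sa.create_engine("postgresql+psycopg://user:pass@localhost:5432/db")
    with engine.connect() as conn:
        ingest(conn, metadata, batches)

if __name__ == "__main__":
    run_ingest()
```

On later runs, the `high_watermark` argument passed to `batches` carries the value saved from the previous ingest, so the function can skip data that was already loaded.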


Auto migrations

Existing tables are automatically migrated as needed - no need for separate migrations.

Memory efficient

The API supports streaming large amounts of data into PostgreSQL without loading it all into memory.
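As an illustration of the streaming pattern (the `stream_rows` helper and CSV source are hypothetical, not part of pg-bulk-ingest's API), rows can be produced lazily by a generator so only one row is held in memory at a time:

```python
import csv
import io

def stream_rows(csv_file):
    # Yield rows one at a time: the whole file is never loaded into
    # memory, which is the pattern a batch-yielding ingest function
    # can rely on for arbitrarily large sources
    for row in csv.reader(csv_file):
        yield (int(row[0]), row[1])

# An in-memory file stands in for a large on-disk CSV
source = io.StringIO("1,a\n2,b\n3,c\n")
rows = stream_rows(source)
print(next(rows))  # the first row is produced without reading the rest
```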

Performance

The PostgreSQL COPY statement is used to make ingests as performant as possible.

Transactional

Data is ingested in batches: each batch is either completely visible to other clients, or not visible at all.

Avoids long locks

Operations are structured to minimise the time for which ACCESS EXCLUSIVE locks are held.

Upserts

Optionally performs an upsert based on each table's primary key.


Contributions

The code for pg-bulk-ingest is public, and contributions are welcome through the pg-bulk-ingest repository on GitHub.