13 changes: 7 additions & 6 deletions docs/features/sharding/resharding/databases.md
icon: material/database-plus-outline

# New databases

PgDog's strategy for resharding Postgres databases is to create a new, independent cluster of machines and move data over to it in real-time. Creating new databases is environment-specific, and PgDog doesn't currently automate this step[^1].

## Requirements

New databases should be **empty**: don't migrate your [table definitions](schema.md) or [data](hash.md). These will be taken care of automatically by PgDog.

### Database users

Since PgDog was built to work in cloud-managed environments, like AWS RDS, it doesn't usually have access to the `pg_shadow` view, which contains password hashes. Therefore, tools like [`pg_dumpall`](https://www.postgresql.org/docs/current/app-pg-dumpall.html) aren't able to operate correctly, and PgDog can't automatically migrate users to the new database.

For this reason, migrating users to the new database cluster is currently **not supported** and is the responsibility of the operator. Make sure to create all the necessary Postgres users and roles before proceeding to [schema synchronization](schema.md).
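As a sketch, creating an application role on the new cluster might look like the following — the role name, password, and database name are placeholders, not something PgDog prescribes:

```postgresql
-- Hypothetical role setup; adapt names and privileges to your environment.
CREATE ROLE app_user LOGIN PASSWORD 'change-me';
GRANT CONNECT ON DATABASE prod_sharded TO app_user;
GRANT USAGE ON SCHEMA public TO app_user;
```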

## Multiple Postgres databases

If you are operating multiple Postgres databases on the same database server, th
## Next steps

{{ next_steps_links([
("Schema sync", "schema.md", "Synchronize schema entities like table and index definitions to the new shards before moving data. This will make sure logical replication works as expected."),
]) }}


[^1]: We are building a control plane for the [Enterprise Edition](../../../enterprise_edition/index.md) which will take care of creating databases, with support for various cloud vendors (e.g., AWS RDS, Azure SQL, etc.).
45 changes: 32 additions & 13 deletions docs/features/sharding/resharding/index.md
icon: material/set-split

# Resharding Postgres

!!! note "Work in progress"
This feature is in active development. Support for resharding with logical replication was started in [#279](https://github.com/pgdogdev/pgdog/pull/279) and
received major improvements in [#784](https://github.com/pgdogdev/pgdog/pull/784).

Resharding changes the number of shards in an existing database cluster in order to add or remove capacity. To make this less impactful on production operations, PgDog's strategy for resharding is to create a new database cluster and reshard data while moving it to the new databases.

To make this an online process, with zero downtime or data loss, PgDog hooks into the logical replication protocol used by PostgreSQL and reroutes messages between nodes to create and update rows in real-time.

<center style="margin-top: 40px;">
<img src="/images/resharding-arch-1.png" width="80%" height="auto" alt="Mirroring">
</center>

## Resharding process

The resharding process is composed of four independent operations. The first one is currently the responsibility of the user, while the remaining three are automated by PgDog:

| Operation | Description |
|-|-|
| [Create new cluster](databases.md) | Create a new set of empty databases that will be used for storing data in the new, sharded cluster. |
| [Schema synchronization](schema.md) | Replicate table and index definitions to the new shards, making sure the new cluster has the same schema as the old one. |
| [Move & reshard data](hash.md) | Copy data using logical replication, while redistributing rows in-flight between new shards. |
| [Cutover traffic](cutover.md) | Make the new cluster serve both reads and writes from the application, without taking downtime. |

While each step can be executed separately by the operator, PgDog provides an [admin database](../../../administration/index.md) command to perform online resharding and traffic cutover steps in a completely automated fashion:

```
RESHARD <source> <destination> <publication>;
```

The `<source>` and `<destination>` parameters accept the names of the source and destination databases, respectively. The `<publication>` parameter expects the name of the Postgres [publication](schema.md#publication) for the tables that need to be resharded.
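For example, assuming a source database named `prod`, a destination named `prod_sharded`, and a publication called `all_tables` (illustrative names), the command would be:

```
RESHARD prod prod_sharded all_tables;
```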

!!! note "Traffic cutover"
Traffic cutover requires careful synchronization to avoid data loss and a split-brain situation. The `RESHARD` command supports this for **single node** PgDog deployments only. The [Enterprise Edition](../../../enterprise_edition/index.md) provides a control plane, which supports traffic cutover with multiple PgDog containers.

## Terminology

| Term | Description |
|-|-|
| Source database | The database cluster that's being resharded and contains all data and table definitions. |
| Destination database | The database cluster with the new sharding configuration, where data from the source database will be copied. |
| Logical replication | Replication protocol available to PostgreSQL databases since version 10. |

## Next steps

{{ next_steps_links([
("Schema sync", "schema.md", "Synchronize table, index and other schema entities between the source and destination databases."),
("Move data", "hash.md", "Redistribute data between shards using the configured sharding function. This happens without downtime and keeps the shards up-to-date with the source database until traffic cutover."),
]) }}
179 changes: 135 additions & 44 deletions docs/features/sharding/resharding/schema.md
icon: material/database-edit-outline
---
# Schema sync

PostgreSQL logical replication requires that tables on both the source and destination databases contain the same columns, with compatible data types. PgDog takes care of this by using `pg_dump` under the hood and re-creating table and index definitions on the new shards in an optimal order.

### Synchronization phases

The schema synchronization process is composed of four distinct phases, all of which are executed automatically by PgDog during resharding:

| Phase | Description |
|-|-|
| [Pre-data](#pre-data-phase) | Create identical tables on all shards along with the primary key constraint (and index). Secondary indexes are _not_ created yet. |
| [Post-data](#post-data-phase) | Create secondary indexes on all tables and shards. This is done after [moving data](hash.md), as a separate step, because it's considerably faster to create indexes on whole tables than while inserting individual rows. |
| [Cutover](#cutover) | This step is executed during traffic cutover, while application queries are blocked from executing on the database. |
| Post-cutover | This step makes sure the rollback database cluster can handle reverse logical replication. |

## Performing the sync

Schema synchronization can be performed using one of two methods:

1. Using an [admin database](../../../administration/index.md) command
2. Using a CLI command

### Admin database command

The admin database provides an easy way to execute commands, without having to spawn a new PgDog process. The schema synchronization command has the following syntax:

```
SCHEMA_SYNC <phase> <source database> <destination database> <publication>;
```

The `<phase>` argument accepts the following values:

| Phase | Description |
|-|-|
| `PRE` | Perform the pre-data schema synchronization phase. |
| `POST` | Perform the post-data schema synchronization phase. |

##### Example

To perform the pre-data schema synchronization step from database `"prod"` to database `"prod_sharded"` using the `"all_tables"` publication, execute the following command:

```
SCHEMA_SYNC PRE prod prod_sharded all_tables;
```

### CLI

PgDog has a command line interface you can call by running the `pgdog` executable directly. Schema sync has its own CLI command with the following arguments:

```
pgdog schema-sync \
--from-database <source> \
--to-database <destination> \
--publication <publication>
```

Required (*) and optional parameters for this command are as follows:

| Parameter | Description |
|-|-|
| `--from-database` * | The name of the source database in [`pgdog.toml`](../../../configuration/pgdog.toml/databases.md). |
| `--to-database` * | The name of the destination database in [`pgdog.toml`](../../../configuration/pgdog.toml/databases.md). |
| `--publication` * | The name of the PostgreSQL [publication](#publication) with the tables you want to synchronize. |
| `--dry-run` | Only print the SQL statements that will be executed on the destination database and exit. |
| `--ignore-errors` | Ignore any errors caused by executing any of the schema synchronization SQL statements. |
| `--data-sync-complete` | Run the post-data step to create secondary indexes and sequences. |
| `--cutover` | Run the cutover step during traffic cutover. |

## Pre-data phase

The pre-data phase takes care of replicating the following Postgres schema entities:

1. Schemas (e.g., `CREATE SCHEMA`)
2. Table definitions, with identical columns and data types (e.g., `CREATE TABLE`)
3. Custom types and domains (e.g., `CREATE TYPE`, `CREATE DOMAIN`)
4. Extensions (e.g., `CREATE EXTENSION vector`)
5. Primary key constraints and corresponding unique indexes (e.g., `PRIMARY KEY (id)`)
6. Table publications (e.g., `CREATE PUBLICATION`)

!!! note "Primary key requirement"
PgDog requires that _all_ tables that are being resharded contain a **primary key** constraint. This is important for logical replication
and guarantees that `UPDATE` and `DELETE` statements are replicated correctly between the source database and the new shards.
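One minimal way to add a missing primary key before resharding — an illustrative sketch; the table and column names are hypothetical:

```postgresql
-- Illustrative: give a keyless table a surrogate primary key before resharding.
ALTER TABLE events ADD COLUMN id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY;
```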

Since the pre-data phase creates only empty tables, it can be executed very quickly, even for databases with a large number of tables, extensions, and custom data types.

### Publication

Since PgDog is using logical replication to move and reshard data, a [publication](https://www.postgresql.org/docs/current/sql-createpublication.html) for the relevant tables needs to be created on the source database beforehand. The simplest way to do this is to run the following command:

```postgresql
CREATE PUBLICATION pgdog FOR ALL TABLES;
```

This will make sure all tables and schemas in your database are copied and resharded into the destination database cluster.
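If you only need to reshard a subset of tables, Postgres also lets you create a publication for specific tables — the table names below are illustrative:

```postgresql
CREATE PUBLICATION pgdog FOR TABLE users, orders;
```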

##### Example

=== "Admin database"
```
SCHEMA_SYNC PRE prod prod_sharded all_tables;
```
=== "CLI"
```
pgdog schema-sync \
--from-database prod \
--to-database prod_sharded \
--publication all_tables
```

## Post-data phase

The post-data phase is performed after the [data copy](hash.md) is complete and tables have been synchronized with logical replication. Its job is to create all secondary indexes (e.g., `CREATE INDEX`).

This step is performed after copying data because it makes the copy process considerably faster: Postgres doesn't need to update several indexes while writing rows into the tables.

##### Example

=== "Admin database"
```
SCHEMA_SYNC POST prod prod_sharded all_tables;
```
=== "CLI"
```
pgdog schema-sync \
--from-database prod \
--to-database prod_sharded \
--publication all_tables \
--data-sync-complete
```

### Tracking progress

Since creating indexes on large tables can take some time, PgDog provides an admin database command to monitor the progress:

=== "Admin command"
```
SHOW SCHEMA_SYNC;
```
=== "Output"
```
-[ RECORD 1 ]+-----------------------------------------------------------------------------------------------
database | pgdog
user | pgdog
shard | 1
kind | index
sync_state | running
started_at | 2026-01-15 10:32:01.042 UTC
elapsed | 3s
elapsed_ms | 3012
table_schema | public
table_name | users
sql | CREATE INDEX CONCURRENTLY IF NOT EXISTS "users_email_idx" ON "public"."users" USING btree ("email")
```

### Schema admin

Schema sync creates tables, indexes, and other entities on the destination database. To make sure this is done by a user with sufficient privileges (e.g., `CREATE` and `REPLICATION` privileges on the database), add such a user to [`users.toml`](../../../configuration/users.toml/users.md) and mark it as the schema administrator:

```toml
[[users]]
password = "hunter2"
schema_admin = true
```

PgDog will use this user to connect to the source and destination databases, so make sure to specify one for both databases in the configuration.

## Cutover

During the cutover, PgDog executes last-minute schema synchronization commands to make sure the destination sharded cluster works as expected. This involves putting back constraints that were removed for logical replication to work, and moving sequence values.
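Conceptually, moving a sequence value is equivalent to the following statement, which PgDog runs for you — the table and column names here are hypothetical:

```postgresql
-- Set the sequence so the next generated value is MAX(id) + 1 (illustrative names).
SELECT setval(pg_get_serial_sequence('users', 'id'),
              (SELECT COALESCE(MAX(id), 0) + 1 FROM users), false);
```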

## Post-cutover

The post-cutover phase makes sure the source database can accept logical replication streams from the new shards. This keeps the old database in sync, in case the operator decides to roll back the resharding process.

## Dependencies

PgDog uses `pg_dump` under the hood to export table, schema, and index definitions. `pg_dump` can't dump from a server running a newer PostgreSQL major version than its own, so the versions should match. Our [Docker image](../../../installation.md) comes with `pg_dump` for PostgreSQL 16 by default, but your database server may run a different version.
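To check which version your source server is running before installing a matching `pg_dump`, you can ask Postgres directly:

```postgresql
SHOW server_version;
```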

Before proceeding, make sure to install the correct version of `pg_dump` for your source database. If you have multiple versions of `pg_dump` installed on the same host, you can specify the path to the right one in `pgdog.toml`:

```toml
pg_dump_path = "/path/to/pg_dump"
```

## Next steps

{{ next_steps_links([
("Move data", "hash.md", "Redistribute data between shards using the configured sharding function. This happens without downtime and keeps the shards up-to-date with the source database until traffic cutover."),
]) }}