This repository was archived by the owner on Sep 9, 2025. It is now read-only.
README.md: 9 additions & 11 deletions
@@ -5,22 +5,19 @@ This script is used to sync data from ClickHouse to Postgres based on a YAML con
 
 **Key features:**
 
-- Replicates data from ClickHouse to Postgres.
-- Create and ensure indexes are kept in sync.
-- Batch processing to improve performance and memory usage.
-- Cursor-based processing for time-based data replication.
+- Replicates data from ClickHouse to PostgreSQL.
+- Manage primary keys, indexes and destination columns types.
+- Time-series data can be synced via cursor to avoid full table scans.
+- Batch processing coupled with temporary tables in separate thread and connection.
 
 **About performance:**
 
 Measured from table creation to last upsert, with batch size of 50k rows:
 
-- 800k rows with 5 columns: around 12s, 67k rows/s
-- 170k rows with 18 columns: around 6s, 28k rows/s
+- 800k rows with 5 columns: around 10s, 80k rows/s
+- 170k rows with 18 columns: around 5s, 34k rows/s
 
-> Note:
->
-> - There is no concurrency yet, everything is single-threaded.
-> - This tool was not designed for high volume of data, this solution might not be the best fit for 10M+ rows. However, this might change in the future!
+> Note: This tool might not be the best fit for high volume of data. We tested it only under 10 million rows.
 
 ## Configuration
 
@@ -29,10 +26,11 @@ Configuration is done via a YAML file. See `config.example.yml` for reference.
 ## Running
 
 ```bash
-go run . [-only=<table_name>] [-config=<path>]
+go run . [-only=<table_name>] [-drop=<table_name>] [-config=<path>]
 ```
 
 - `-only=<table_name>`: Avoid running all tables and only process the one specified.
+- `-drop=<table_name>`: Drop the table after processing and reset cursor, if any.
 - `-config=<path>`: Path to the configuration file. Defaults to `config.yml`.
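
The three flags documented in the Running section could be parsed with Go's standard `flag` package roughly as follows. This is a minimal sketch, not the repository's actual code: the `options` struct, the `parseOptions` function, and the flag set name `sync` are hypothetical names chosen for illustration. Only the flag names and the `config.yml` default come from the README.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// options mirrors the three flags documented in the README.
// The type and field names are hypothetical.
type options struct {
	Only   string // -only=<table_name>
	Drop   string // -drop=<table_name>
	Config string // -config=<path>
}

// parseOptions parses args (without the program name) into options.
func parseOptions(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("sync", flag.ContinueOnError)
	fs.StringVar(&o.Only, "only", "", "process only this table")
	fs.StringVar(&o.Drop, "drop", "", "drop this table after processing and reset its cursor")
	fs.StringVar(&o.Config, "config", "config.yml", "path to the configuration file")
	err := fs.Parse(args)
	return o, err
}

func main() {
	o, err := parseOptions(os.Args[1:])
	if err != nil {
		// The flag package has already printed the error and usage text.
		os.Exit(2)
	}
	fmt.Printf("only=%q drop=%q config=%q\n", o.Only, o.Drop, o.Config)
}
```

Using a dedicated `flag.FlagSet` instead of the package-level flags keeps the parser callable from tests with an arbitrary argument slice.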