Big memory usage when inserting large strings to Clickhouse #1085

@ilevd

It seems that the client consumes a lot of memory when inserting large strings into Clickhouse. Here are the steps to reproduce it.

  1. Create 2000 strings of length 100000, about 200 MB in total.
  2. With the insert call commented out, build and run /usr/bin/time -v ./test to measure memory usage. On my PC:
    Maximum resident set size (kbytes): 253960
    That seems OK.
  3. Uncomment the insert call, then build and run /usr/bin/time -v ./test again. Now on my PC:
    Maximum resident set size (kbytes): 1140576
    That is more than 1.1 GB.
  4. Check the uncompressed size of the table in Clickhouse: test.table is 190.74 MiB.
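
The figures in the steps above can be cross-checked with a few lines of arithmetic. The sketch below only reuses the numbers reported there (string count, string length, and the two peak RSS values); the constant and function names are made up for illustration. By this calculation the raw payload is about 190.73 MiB, close to what the table reports, and the insert adds roughly 866 MiB of peak RSS, about 4.5 times the payload.

```go
package main

import "fmt"

// Figures copied from the reproduction steps.
const (
	count       = 2000    // number of strings
	strLen      = 100000  // bytes per string (ASCII letters only)
	rssBaseKB   = 253960  // peak RSS without the insert, in kbytes
	rssInsertKB = 1140576 // peak RSS with the insert, in kbytes
)

// payloadBytes returns the raw size of the generated strings.
func payloadBytes() int { return count * strLen }

// toMiB converts a byte count to mebibytes.
func toMiB(b int) float64 { return float64(b) / (1 << 20) }

// extraRSSKB returns how much the peak RSS grew once the insert was enabled.
func extraRSSKB() int { return rssInsertKB - rssBaseKB }

func main() {
	p := payloadBytes()
	fmt.Printf("payload: %d bytes (%.2f MiB)\n", p, toMiB(p))
	fmt.Printf("extra RSS during insert: %d kbytes (%.2f MiB)\n",
		extraRSSKB(), float64(extraRSSKB())/1024)
	fmt.Printf("extra RSS / payload: %.2f\n",
		float64(extraRSSKB())*1024/float64(p))
}
```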

Clickhouse table:

CREATE TABLE IF NOT EXISTS test.table
(
    a String
)
ENGINE = MergeTree()
ORDER BY a

Checking table size:

SELECT
    database,
    table,
    formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed_size
FROM system.parts
WHERE active = 1 AND database = 'test' AND table = 'table'
GROUP BY
    database,
    table
ORDER BY sum(data_uncompressed_bytes) DESC;

test.go

package main

import (
	"context"
	"fmt"
	"log"
	"math/rand"
	"time"

	"github.com/ClickHouse/ch-go"
	"github.com/ClickHouse/ch-go/proto"
)

var r = rand.New(rand.NewSource(time.Now().UnixNano()))

func getChar() int {
	return r.Intn(26) + int('a')
}

func makeString(size int) string {
	b := make([]byte, size)
	for i := range size {
		b[i] = byte(getChar())
	}
	return string(b)
}

func makeStrings(count int, strLen int) []string {
	strs := make([]string, count)
	for i := range count {
		strs[i] = makeString(strLen)
	}
	return strs
}

func insertRows(rows []string) {
	ctx := context.Background()
	conn, err := ch.Dial(ctx, ch.Options{
		Address: "localhost:9000"})
	if err != nil {
		log.Fatal("Cannot connect to Clickhouse: ", err)
	}
	var a proto.ColStr
	for _, row := range rows {
		a.Append(row)
	}
	input := proto.Input{
		{Name: "a", Data: a},
	}
	err = conn.Do(context.Background(), ch.Query{
		Body:  "INSERT INTO test.table VALUES",
		Input: input,
	})
	if err != nil {
		log.Println("Cannot write to Clickhouse", err)
	}
	conn.Close()
}

func main() {
	count := 2000
	strLen := 100000
	strs := makeStrings(count, strLen) // Create strings (~200 MB)
	fmt.Println(len(strs), len(strs[0]))
	fmt.Println("Insert to Clickhouse...")
	//insertRows(strs) // Uncomment to test with inserting (step 3)
}
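
To narrow down how much of the extra memory is allocated on the Go heap by the insert path, the call could be wrapped in a small measuring helper. This is only a sketch: allocatedDuring is a hypothetical helper built on the standard runtime package, not part of ch-go, and the 10 MiB allocation is a stand-in workload so the example runs without a server.

```go
package main

import (
	"fmt"
	"runtime"
)

// sink keeps the demo allocation reachable so the compiler cannot elide it.
var sink []byte

// allocatedDuring is a hypothetical helper (not part of ch-go). It reports how
// many heap bytes were allocated while fn ran, using the cumulative
// TotalAlloc counter from runtime.MemStats.
func allocatedDuring(fn func()) uint64 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	fn()
	runtime.ReadMemStats(&after)
	return after.TotalAlloc - before.TotalAlloc
}

func main() {
	// Stand-in workload; in test.go this would be
	// func() { insertRows(strs) } instead.
	n := allocatedDuring(func() { sink = make([]byte, 10<<20) })
	fmt.Printf("allocated during call: %.1f MiB\n", float64(n)/(1<<20))
}
```

TotalAlloc counts every heap allocation, including memory that is freed again later, so it is not directly comparable to the peak RSS reported by /usr/bin/time. It should still show whether the insert allocates several times the payload size.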
