Write Roaring Bitmap data from Spark(ck-jdbc) to CH #532

@zhou-binbin

More details about the question follow.

The ClickHouse (CK) table:

CREATE TABLE app_userid_bit_ch
(
appname String,
uidbit AggregateFunction(groupBitmap, UInt64)
)
ENGINE = AggregatingMergeTree()
ORDER BY appname
SETTINGS index_granularity = 128

The main Spark program:

` val prop = new Properties()
val ckDriver = "ru.yandex.clickhouse.ClickHouseDriver"
prop.put("driver", ckDriver)
prop.put("user", user)
prop.put("password", password)
import spark.implicits._
val ckdata = sc.textFile(args(0)).map { x =>
  val y = x.split(",")
  val appname = y(0)
  val uidbitmap = new RoaringBitmap()
  y(1).split(":").foreach(k => uidbitmap.add(k.toInt))
  (appname, uidbitmap)
}.toDF()

ckdata.write.mode(saveMode = "append")
  .option("batchsize", "200")
  .option("isolationLevel", "NONE")
  .option("numPartitions", "1")
  .jdbc(url, table, prop)`

The ClickHouse version is 20.8.3.18.

I want to write Roaring Bitmap data into ClickHouse from Spark, using the Java class org.roaringbitmap.RoaringBitmap, but the write fails. Does ClickHouse support this? If not, is support planned? At present, we can only import the raw ID data into ClickHouse and then build the RBM (Roaring Bitmap) structure by creating a materialized view. However, the volume of raw ID data is often very large.
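
For reference, the materialized-view workaround described above could look roughly like the sketch below. The staging table `app_userid_raw` and the view name `app_userid_bit_mv` are hypothetical names chosen for illustration. Only `app_userid_bit_ch` comes from the table definition above.

```sql
-- Hypothetical staging table that receives the raw IDs written by Spark.
CREATE TABLE app_userid_raw
(
    appname String,
    uid UInt64
)
ENGINE = MergeTree()
ORDER BY appname;

-- Materialized view (hypothetical name) that folds each inserted block of
-- raw IDs into a groupBitmap aggregate state, stored in app_userid_bit_ch.
CREATE MATERIALIZED VIEW app_userid_bit_mv TO app_userid_bit_ch AS
SELECT
    appname,
    groupBitmapState(uid) AS uidbit
FROM app_userid_raw
GROUP BY appname;

-- Reading back: merge the stored states to get the distinct user count per app.
SELECT
    appname,
    groupBitmapMerge(uidbit) AS users
FROM app_userid_bit_ch
GROUP BY appname;
```

With this setup Spark only writes plain `(appname, uid)` rows, which is exactly the large raw-ID import that we would like to avoid.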
