Add HNSW support for Clickhouse client #500
@MansorY23 There are some conflicts; please rebase the main branch first. |
sure! |
|
@alwayslove2013 Hi! I squashed the commits into one, but one test is failing and I can't see why, because my code works. |
|
@MansorY23 Upgrading black and ruff may help:
pip install black --upgrade
pip install ruff --upgrade
make lint
make format |
|
@alwayslove2013 thank you! your advice helped me |
|
@alwayslove2013 can you please approve this pr? |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: alwayslove2013, MansorY23 |
* fix: Unable to run vebbench and cli fix: remove comma of logging str fix cli unable to run zilliztech#444 Signed-off-by: yangxuan <xuan.yang@zilliz.com>
* enhance: Unify optimize and remove ready_to_load PyMilvus used to be the only client that uses ready_to_load. Not it'll load the collection when creating it, so this PR removes `ready_to_load` from the client.API Also this PR enhance optimize and remove the optimize_with_size Signed-off-by: yangxuan <xuan.yang@zilliz.com>
* add mongodb client Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
* add mongodb client in readme Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
* add some risk warnings for custom dataset - limit the number of test query vectors. Signed-off-by: min.tian <min.tian.cn@gmail.com>
* Bump grpcio from 1.53.0 to 1.53.2 in /install Bumps [grpcio](https://github.com/grpc/grpc) from 1.53.0 to 1.53.2. - [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](grpc/grpc@v1.53.0...v1.53.2) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>
* add mongodb config Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
* Opensearch interal configuration parameters (zilliztech#463)
* Added the configuration parameters to create Opensearch dynamically with right replicas, shards and other opensearch related configurations. Added the feature to create OS index with 0 replica and once the data is loaded update the replicas according to the parameter.
* Updated the readme for config parameters --------- Co-authored-by: xavrathi <xavrathi@amazon.com>
* ui control num of concurrencies Signed-off-by: siqi.an <ansiqi_7777@163.com>
* Update README.md
* environs version should <14.1.0 Signed-off-by: min.tian <min.tian.cn@gmail.com>
* Support GPU_BRUTE_FORCE index for Milvus (zilliztech#476) Signed-off-by: Rachit Chaudhary <rachit.chaudhary@outlook.com> Co-authored-by: Signed-off-by: Rachit Chaudhary - r0c0axe <Rachit.Chaudhary@walmart.com>
* Add table quantization type
* Support MariaDB database (zilliztech#375) MariaDB introduced vector support in version 11.7, enabling MariaDB Server to function as a relational vector database. https://mariadb.com/kb/en/vectors/ Now add support for MariaDB server, verified against MariaDB server of version 11.7.1: - Support MariaDB vector search with HNSW algorithm, support filter search. - Support index and search parameters: - storage_engine: InnoDB or MyISAM - M: M parameter in MHNSW vector indexing - ef_search: minimal number of result candidates to look for in the vector index for ORDER BY ... LIMIT N queries. - max_cache_size: Upper limit for one MHNSW vector index cache - Support CLI of `vectordbbench mariadbhnsw`.
* Add TiDB backend (zilliztech#484)
* Add TiDB backend Signed-off-by: Wish <breezewish@outlook.com>
* Fix Signed-off-by: Wish <breezewish@outlook.com>
* Fix Signed-off-by: Wish <breezewish@outlook.com>
* Improve error handling Signed-off-by: Wish <breezewish@outlook.com> --------- Signed-off-by: Wish <breezewish@outlook.com>
* CLI fix for GPU index (zilliztech#485)
* Support GPU_BRUTE_FORCE index for Milvus Signed-off-by: Rachit Chaudhary <rachit.chaudhary@outlook.com>
* MilvusGPUBruteForceTypedDict addition Signed-off-by: Rachit Chaudhary <rachit.chaudhary@outlook.com> --------- Signed-off-by: Rachit Chaudhary <rachit.chaudhary@outlook.com> Co-authored-by: Signed-off-by: Rachit Chaudhary - r0c0axe <Rachit.Chaudhary@walmart.com>
* remove duplicated code
* feat: initial commit
* Add vespa integration
* remove redundant empty_field config check for qdrant and tidb Signed-off-by: min.tian <min.tian.cn@gmail.com>
* reformat all Signed-off-by: min.tian <min.tian.cn@gmail.com>
* fix cli crush Signed-off-by: min.tian <min.tian.cn@gmail.com>
* downgrade streamlit version
* add more milvus index types: hnsw sq/pq/prq; ivf rabitq Signed-off-by: min.tian <min.tian.cn@gmail.com>
* add more milvus index types: ivf_pq Signed-off-by: min.tian <min.tian.cn@gmail.com>
* Add HNSW support for Clickhouse client (zilliztech#500)
* feat: add hnsw support
* refactor: minor fixes
* feat: reformat code
* fix: remove sql injections, reformat code
* fix bugs when use custom_dataset without groundtruth file Signed-off-by: min.tian <min.tian.cn@gmail.com>
* fix: prevent the frontend from crashing on invalid indexes in results
* fix ruff warnings
* Fix formatting
* Add lancedb
* Add --task-label option for cli (zilliztech#517)
* Add --task-label option for cli
* Fix lint issues
* Add qdrant cli
* Update README.md
* Fixing Bugs in Benchmarking ClickHouse with vectordbbench (zilliztech#523)
* Update cli.py
* Update clickhouse.py
* Update clickhouse.py
* Update cli.py
* Update config.py
* remove space
* Add --concurrency-timeout option to avoid long time waiting (zilliztech#521)
* Add --concurrency-timeout option to avoid long time waiting, by default, it's 3600s.
* Fix lint error
* Update README.md, add --concurrency-timeout option
---------
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
Signed-off-by: min.tian <min.tian.cn@gmail.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: siqi.an <ansiqi_7777@163.com>
Signed-off-by: Rachit Chaudhary <rachit.chaudhary@outlook.com>
Signed-off-by: Wish <breezewish@outlook.com>
Co-authored-by: yangxuan <xuan.yang@zilliz.com>
Co-authored-by: zhuwenxing <wenxing.zhu@zilliz.com>
Co-authored-by: min.tian <min.tian.cn@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Xavierantony1982 <xavierantony@gmail.com>
Co-authored-by: xavrathi <xavrathi@amazon.com>
Co-authored-by: siqi.an <ansiqi_7777@163.com>
Co-authored-by: Xiaofan <83447078+xiaofan-luan@users.noreply.github.com>
Co-authored-by: Rachit Chaudhary <65501028+Rachit-Chaudhary11@users.noreply.github.com>
Co-authored-by: Signed-off-by: Rachit Chaudhary - r0c0axe <Rachit.Chaudhary@walmart.com>
Co-authored-by: Luca Giacchino <luca.giacchino@intel.com>
Co-authored-by: Hugo Wen <46255328+HugoWenTD@users.noreply.github.com>
Co-authored-by: Wenxuan <breezewish@outlook.com>
Co-authored-by: yuyuankang <yuyuankang@hotmail.com>
Co-authored-by: Arseniy Ahtaryanov <mansoryaye@gmail.com>
Co-authored-by: nuvotex-tk <161840620+nuvotex-tk@users.noreply.github.com>
Co-authored-by: Polo Vezia <pauvezia@publicisgroupe.net>
Co-authored-by: MansorY <119126888+MansorY23@users.noreply.github.com>
Co-authored-by: Andreas Opferkuch <andreas.opferkuch@gmail.com>
Co-authored-by: LoveYou3000 <760583490@qq.com>
Co-authored-by: Yuyuan Kang <36235611+yuyuankang@users.noreply.github.com>
* feat: add hnsw support
* refactor: minor fixes
* feat: reformat code
* fix: remove sql injections, reformat code
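The diff behind "fix: remove sql injections" is not shown in this thread, so the following is only a rough sketch of one common approach, not necessarily what the commit does. It assumes table and column names are checked against a strict pattern (identifiers cannot be bound as values) and that values are passed through ClickHouse's `{name:Type}` query parameters. The helper names `safe_identifier` and `build_search_query` are hypothetical, and the code only builds the query text without contacting a server.

```python
import re

# Conservative pattern for table/column names. Identifiers cannot be bound
# as ordinary query parameters, so they are validated before interpolation.
_IDENT_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def safe_identifier(name: str) -> str:
    """Return name unchanged if it is a plain identifier, else raise."""
    if not _IDENT_RE.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def build_search_query(table: str, vector_col: str, id_col: str = "id") -> str:
    """Build a k-NN query whose values are left as {name:Type} placeholders."""
    table, vector_col, id_col = map(safe_identifier, (table, vector_col, id_col))
    # Doubled braces in the f-string emit literal braces for the placeholders.
    return (
        f"SELECT {id_col} FROM {table} "
        f"ORDER BY L2Distance({vector_col}, {{query:Array(Float32)}}) "
        f"LIMIT {{k:UInt32}}"
    )
```

The query vector and `k` then travel separately from the SQL text, for example as a dict such as `{"query": [0.1, 0.2], "k": 10}` handed to the client's parameter-binding argument (clickhouse-connect exposes this as `parameters=` on `query`).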
Resolves #496
Add support for the HNSW index in ClickHouse; improve and refactor some existing code.
Reference: https://clickhouse.com/docs/engines/table-engines/mergetree-family/annindexes
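For readers unfamiliar with the referenced index type, here is a minimal sketch of how an HNSW index definition might be assembled from benchmark settings. It assumes the `vector_similarity('hnsw', distance, dimensions, quantization, max_connections, ef_construction)` form described in the linked ClickHouse docs; the argument list has changed between ClickHouse versions, so check it against the server in use. `HNSWParams`, `hnsw_index_ddl`, the default values, and the whitelists are hypothetical and are not taken from this PR.

```python
from dataclasses import dataclass

# Assumed whitelists; the sets accepted by a given ClickHouse version may differ.
_ALLOWED_DISTANCES = {"L2Distance", "cosineDistance"}
_ALLOWED_QUANTIZATIONS = {"f64", "f32", "f16", "bf16", "i8"}


@dataclass(frozen=True)
class HNSWParams:
    """Hypothetical container for HNSW build settings (values are illustrative)."""

    dim: int
    distance: str = "L2Distance"
    quantization: str = "bf16"
    m: int = 32                 # max connections per layer
    ef_construction: int = 128  # candidate list size during construction


def hnsw_index_ddl(table: str, column: str, p: HNSWParams,
                   index_name: str = "vec_idx") -> str:
    """Return an ALTER TABLE statement that adds an HNSW vector index."""
    # Rough identifier check so names cannot smuggle in extra SQL.
    for name in (table, column, index_name):
        if not (name.isascii() and name.isidentifier()):
            raise ValueError(f"unsafe identifier: {name!r}")
    if p.distance not in _ALLOWED_DISTANCES:
        raise ValueError(f"unsupported distance: {p.distance!r}")
    if p.quantization not in _ALLOWED_QUANTIZATIONS:
        raise ValueError(f"unsupported quantization: {p.quantization!r}")
    return (
        f"ALTER TABLE {table} ADD INDEX {index_name} {column} "
        f"TYPE vector_similarity('hnsw', '{p.distance}', {int(p.dim)}, "
        f"'{p.quantization}', {int(p.m)}, {int(p.ef_construction)})"
    )
```

Whitelisting the distance function and quantization keeps every interpolated string drawn from a fixed set, while the numeric settings are coerced with `int()` before they reach the statement.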