Add HNSW support for Clickhouse client #500

Merged
alwayslove2013 merged 4 commits into zilliztech:main from MansorY23:main
Apr 24, 2025
@MansorY23
Contributor

Resolves #496
Adds support for the HNSW index in ClickHouse, and improves and refactors some code.
Reference: https://clickhouse.com/docs/engines/table-engines/mergetree-family/annindexes
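
For readers unfamiliar with the referenced feature, here is a minimal sketch of the kind of DDL an HNSW-backed ClickHouse table might use. It is not the code from this PR: the function name, table layout, and default values are hypothetical, and the `vector_similarity` argument order follows the ClickHouse documentation as best understood here. Older server versions may take fewer arguments or require an experimental setting to be enabled first, so check the linked reference for your version.

```python
def build_hnsw_table_ddl(
    table: str,
    dim: int,
    distance: str = "L2Distance",
    m: int = 32,
    ef_construction: int = 128,
    quantization: str = "bf16",
) -> str:
    """Return a CREATE TABLE statement with an HNSW vector_similarity index.

    Hypothetical helper for illustration only. The table name is assumed to
    come from trusted configuration, not from user input.
    """
    if distance not in ("L2Distance", "cosineDistance"):
        raise ValueError(f"unsupported distance function: {distance}")
    # Coerce numeric arguments so only integers end up in the statement.
    dim, m, ef_construction = int(dim), int(m), int(ef_construction)
    return (
        f"CREATE TABLE {table} (\n"
        f"    id UInt32,\n"
        f"    embedding Array(Float32),\n"
        f"    INDEX idx_embedding embedding TYPE vector_similarity(\n"
        f"        'hnsw', '{distance}', {dim}, '{quantization}', {m}, {ef_construction})\n"
        f") ENGINE = MergeTree ORDER BY id"
    )


if __name__ == "__main__":
    print(build_hnsw_table_ddl("items", 768))
```

The two integer arguments after the quantization type correspond to the usual HNSW build parameters (maximum connections per layer and construction candidate list size); the defaults shown are placeholders, not tuned values.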

@alwayslove2013
Contributor

@MansorY23 There are some conflicts; please rebase the main branch first.

@MansorY23
Contributor Author

> @MansorY23 There are some conflicts; please rebase the main branch first.

sure!

MansorY23 reopened this Apr 16, 2025
@MansorY23
Contributor Author

@alwayslove2013 Hi! I squashed the commits into one, but there is a test that fails, and I can't see why, because my code works.

@alwayslove2013
Contributor

@MansorY23 Upgrading black and ruff may help:

pip install black --upgrade
pip install ruff --upgrade
make lint
make format

@MansorY23
Contributor Author

@alwayslove2013 thank you! your advice helped me

alwayslove2013 self-requested a review April 18, 2025 09:21
@MansorY23
Contributor Author

@alwayslove2013 Can you please approve this PR?

@sre-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: alwayslove2013, MansorY23
To complete the pull request process, please assign xuanyang-cn after the PR has been reviewed.
You can assign the PR to them by writing /assign @xuanyang-cn in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

alwayslove2013 merged commit 7f83936 into zilliztech:main Apr 24, 2025
4 checks passed
shaharuk-yb added a commit to yugabyte/VectorDBBench that referenced this pull request May 15, 2025
* fix: Unable to run vebbench and cli

fix: remove comma of logging str
fix cli unable to run zilliztech#444

Signed-off-by: yangxuan <xuan.yang@zilliz.com>

* enhance: Unify optimize and remove ready_to_load

PyMilvus used to be the only client that uses ready_to_load.
Now it'll load the collection when creating it, so
this PR removes `ready_to_load` from the client API.

This PR also enhances optimize and removes optimize_with_size.

Signed-off-by: yangxuan <xuan.yang@zilliz.com>

* add mongodb client

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>

* add mongodb client in readme

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>

* add some risk warnings for custom dataset
- limit the number of test query vectors.

Signed-off-by: min.tian <min.tian.cn@gmail.com>

* Bump grpcio from 1.53.0 to 1.53.2 in /install

Bumps [grpcio](https://github.com/grpc/grpc) from 1.53.0 to 1.53.2.
- [Release notes](https://github.com/grpc/grpc/releases)
- [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md)
- [Commits](grpc/grpc@v1.53.0...v1.53.2)

---
updated-dependencies:
- dependency-name: grpcio
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* add mongodb config

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>

* Opensearch internal configuration parameters (zilliztech#463)

* Added the configuration parameters to create Opensearch dynamically with right replicas, shards and other opensearch related configurations.

Added the feature to create OS index with 0 replica and once the data is loaded update the replicas according to the parameter.

* Updated the readme for config parameters

---------

Co-authored-by: xavrathi <xavrathi@amazon.com>

* ui control num of concurrencies

Signed-off-by: siqi.an <ansiqi_7777@163.com>

* Update README.md

* environs version should <14.1.0

Signed-off-by: min.tian <min.tian.cn@gmail.com>

* Support GPU_BRUTE_FORCE index for Milvus (zilliztech#476)

Signed-off-by: Rachit Chaudhary <rachit.chaudhary@outlook.com>
Co-authored-by: Signed-off-by: Rachit Chaudhary - r0c0axe <Rachit.Chaudhary@walmart.com>

* Add table quantization type

* Support MariaDB database (zilliztech#375)

MariaDB introduced vector support in version 11.7, enabling MariaDB
Server to function as a relational vector database.
https://mariadb.com/kb/en/vectors/

Now add support for MariaDB server, verified against MariaDB server
of version 11.7.1:

- Support MariaDB vector search with HNSW algorithm, support filter
  search.
- Support index and search parameters:
   - storage_engine: InnoDB or MyISAM
   - M: M parameter in MHNSW vector indexing
   - ef_search: minimal number of result candidates to look for in the
                vector index for ORDER BY ... LIMIT N queries.
   - max_cache_size: Upper limit for one MHNSW vector index cache
- Support CLI of `vectordbbench mariadbhnsw`.

* Add TiDB backend (zilliztech#484)

* Add TiDB backend

Signed-off-by: Wish <breezewish@outlook.com>

* Fix

Signed-off-by: Wish <breezewish@outlook.com>

* Fix

Signed-off-by: Wish <breezewish@outlook.com>

* Improve error handling

Signed-off-by: Wish <breezewish@outlook.com>

---------

Signed-off-by: Wish <breezewish@outlook.com>

* CLI fix for GPU index (zilliztech#485)

* Support GPU_BRUTE_FORCE index for Milvus

Signed-off-by: Rachit Chaudhary <rachit.chaudhary@outlook.com>

* MilvusGPUBruteForceTypedDict addition

Signed-off-by: Rachit Chaudhary <rachit.chaudhary@outlook.com>

---------

Signed-off-by: Rachit Chaudhary <rachit.chaudhary@outlook.com>
Co-authored-by: Signed-off-by: Rachit Chaudhary - r0c0axe <Rachit.Chaudhary@walmart.com>

* remove duplicated code

* feat: initial commit

* Add vespa integration

* remove redundant empty_field config check for qdrant and tidb

Signed-off-by: min.tian <min.tian.cn@gmail.com>

* reformat all

Signed-off-by: min.tian <min.tian.cn@gmail.com>

* fix cli crash

Signed-off-by: min.tian <min.tian.cn@gmail.com>

* downgrade streamlit version

* add more milvus index types: hnsw sq/pq/prq; ivf rabitq

Signed-off-by: min.tian <min.tian.cn@gmail.com>

* add more milvus index types: ivf_pq

Signed-off-by: min.tian <min.tian.cn@gmail.com>

* Add HNSW support for Clickhouse client  (zilliztech#500)

* feat: add hnsw support

* refactor: minor fixes

* feat: reformat code

* fix: remove sql injections, reformat code

* fix bugs when using custom_dataset without groundtruth file

Signed-off-by: min.tian <min.tian.cn@gmail.com>

* fix: prevent the frontend from crashing on invalid indexes in results

* fix ruff warnings

* Fix formatting

* Add lancedb

* Add --task-label option for cli (zilliztech#517)

* Add --task-label option for cli

* Fix lint issues

* Add qdrant cli

* Update README.md

* Fixing Bugs in Benchmarking ClickHouse with vectordbbench (zilliztech#523)

* Update cli.py

* Update clickhouse.py

* Update clickhouse.py

* Update cli.py

* Update config.py

* remove space

* Add --concurrency-timeout option to avoid long time waiting (zilliztech#521)

* Add --concurrency-timeout option to avoid long time waiting, by default, it's 3600s.

* Fix lint error

* Update README.md, add --concurrency-timeout option

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
Signed-off-by: min.tian <min.tian.cn@gmail.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: siqi.an <ansiqi_7777@163.com>
Signed-off-by: Rachit Chaudhary <rachit.chaudhary@outlook.com>
Signed-off-by: Wish <breezewish@outlook.com>
Co-authored-by: yangxuan <xuan.yang@zilliz.com>
Co-authored-by: zhuwenxing <wenxing.zhu@zilliz.com>
Co-authored-by: min.tian <min.tian.cn@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Xavierantony1982 <xavierantony@gmail.com>
Co-authored-by: xavrathi <xavrathi@amazon.com>
Co-authored-by: siqi.an <ansiqi_7777@163.com>
Co-authored-by: Xiaofan <83447078+xiaofan-luan@users.noreply.github.com>
Co-authored-by: Rachit Chaudhary <65501028+Rachit-Chaudhary11@users.noreply.github.com>
Co-authored-by: Signed-off-by: Rachit Chaudhary - r0c0axe <Rachit.Chaudhary@walmart.com>
Co-authored-by: Luca Giacchino <luca.giacchino@intel.com>
Co-authored-by: Hugo Wen <46255328+HugoWenTD@users.noreply.github.com>
Co-authored-by: Wenxuan <breezewish@outlook.com>
Co-authored-by: yuyuankang <yuyuankang@hotmail.com>
Co-authored-by: Arseniy Ahtaryanov <mansoryaye@gmail.com>
Co-authored-by: nuvotex-tk <161840620+nuvotex-tk@users.noreply.github.com>
Co-authored-by: Polo Vezia <pauvezia@publicisgroupe.net>
Co-authored-by: MansorY <119126888+MansorY23@users.noreply.github.com>
Co-authored-by: Andreas Opferkuch <andreas.opferkuch@gmail.com>
Co-authored-by: LoveYou3000 <760583490@qq.com>
Co-authored-by: Yuyuan Kang <36235611+yuyuankang@users.noreply.github.com>
euphoria0-0 pushed a commit to CryptoLabInc/VectorDBBench that referenced this pull request Nov 21, 2025
* feat: add hnsw support

* refactor: minor fixes

* feat: reformat code

* fix: remove sql injections, reformat code
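
The last commit above mentions removing SQL injections. As a rough illustration of the general technique, and not a reproduction of what the PR changed, the sketch below validates identifiers (table and column names usually cannot be bound as query parameters) and leaves values as named placeholders for the driver to bind. The helper names are hypothetical, and the `%(name)s` placeholder style is an assumption about the driver in use; adapt it to whatever your client library expects.

```python
import re

# Conservative whitelist: letters, digits and underscores, not starting with a digit.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def safe_identifier(name: str) -> str:
    """Return a backtick-quoted identifier, or raise if the name looks unsafe."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"invalid SQL identifier: {name!r}")
    return f"`{name}`"


def build_search_query(table: str, vector_column: str) -> str:
    """Build a nearest-neighbour query with placeholders instead of inlined values.

    The query vector and the limit are left as named placeholders so the
    driver binds them; they are never formatted into the SQL text here.
    """
    return (
        f"SELECT id FROM {safe_identifier(table)} "
        f"ORDER BY L2Distance({safe_identifier(vector_column)}, %(query)s) "
        f"LIMIT %(k)s"
    )


if __name__ == "__main__":
    print(build_search_query("items", "embedding"))
```

A caller would pass the returned string together with a parameter mapping such as `{"query": [...], "k": 10}` to the driver's query method, so user-supplied values never become part of the SQL text.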