-
Notifications
You must be signed in to change notification settings - Fork 801
Support Elasticsearch 8 #1664
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Elasticsearch 8 #1664
Conversation
pquentin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a few comment to explain my thought processes when working on this pull request, including places where different decisions could have been made.
| mkdir /tmp/elasticsearch | ||
| wget -O - https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-${{ matrix.es-version }}-linux-x86_64.tar.gz | tar xz --directory=/tmp/elasticsearch --strip-components=1 | ||
| /tmp/elasticsearch/bin/elasticsearch -d | ||
| /tmp/elasticsearch/bin/elasticsearch -E xpack.security.enabled=false -E discovery.type=single-node -d |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think disabling security is OK here as HTTPS is not a concern of this library: this is handled at the elasticsearch-py/elastic-transport-python level.
| """ | ||
|
|
||
| def __init__(self): | ||
| def __init__(self, *, elasticsearch_class=Elasticsearch): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change allows test_connections.py to use a dummy Elasticsearch class for most of the tests.
| d = self.to_dict(count=True) | ||
| # TODO: failed shards detection | ||
| return es.count(index=self._index, body=d, **self._params)["count"] | ||
| resp = es.count(index=self._index, query=d.get("query", None), **self._params) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not keep body because the tests used mocker.spy and since body is dynamically added in elasticsearch-py the mock did not see it. But I could also use a different testing strategy if you prefer.
| if r.get("error", False): | ||
| if raise_on_error: | ||
| raise TransportError("N/A", r["error"]["type"], r["error"]) | ||
| raise ApiError("N/A", meta=responses.meta, body=r) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
elasticsearch-py 8.x has two types of errors now, ApiError being used for errors reported by Elasticsearch itself and TransportError being used for errors at the transport layer (such as connection issues).
| for index_name in client.indices.get(index="test-*", expand_wildcards="all"): | ||
| client.indices.delete(index=index_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deleting indices with a wildcard require a cluster setting in Elasticsearch 8. I had to specify expand_wildcards as some tests are closing indices.
| "name": {"first": "Shay", "last": "Bannon"}, | ||
| "lang": "java", | ||
| "twitter": "kimchy", | ||
| return ObjectApiResponse( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In elasticsearch-py 8.x, the response pretends it is a dictionary to be backwards-compatible, but it does not support __setitem__ that is needed for faceted search to work. This is why we explicitly request .body in Search.execute: this is an actual dictionary that supports __setitem__.
The alternative is to add __setitem__ support in elastic-transport-python.
| def test_get_connection_created_connection_if_needed(): | ||
| c = connections.Connections() | ||
| c.configure(default={"hosts": ["es.com"]}, local={"hosts": ["localhost"]}) | ||
| c = connections.Connections(elasticsearch_class=DummyElasticsearch) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those tests are about elasticsearch_dsl.connections, not about how we store hosts in Elasticsearch. In elasticsearch-py 8.x, the hosts are technically available, but they're inside a deep hierarchy of classes, so using a dummy class avoids depending on those internals.
| def test_create_connection_adds_our_serializer(): | ||
| c = connections.Connections() | ||
| c.create_connection("testing", hosts=["es.com"]) | ||
| c = connections.Connections(elasticsearch_class=Elasticsearch) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test however uses the real client, for what it's worth.
| return 42 | ||
|
|
||
| assert serializer.serializer.dumps(MyClass()) == "42" | ||
| assert serializer.serializer.dumps(MyClass()) == b"42" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In elasticsearch-py 8.x, serializers now explicitly encode the results to UTF-8. Which is more correct, after all serializing is supposed to output bytes.
JoshMock
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good!
When I say "support", I mean that all tests are passing with >90% coverage.