Description
Hi everyone,
I'm encountering an issue with Apache AGE where a complex MATCH query always uses a sequential scan, even though indexes exist on the queried properties. Disabling sequential scans with SET enable_seqscan = off has no effect: the EXPLAIN ANALYZE output still shows a sequential scan.
Query Example:
Here's a simplified version of the query we are using:
SELECT * FROM cypher('graph_name', $$
MATCH (n1:NodeType1)
WHERE n1.attribute1 = '<value1>'
AND n1.attribute2 IN ('<value2>')
WITH n1
OPTIONAL MATCH (n1)-[:RelType1_NodeType1]-(n2:NodeType2)
WITH n1, n2
OPTIONAL MATCH (n1)-[:RelType2_NodeType1]-(n3:NodeType3)
WITH n1, n2, n3
OPTIONAL MATCH (n1)-[:RelType3_NodeType1]-(n4:NodeType4)
RETURN DISTINCT n1 AS Node1, n2 AS Node2, n3 AS Node3, n4 AS Node4
$$) AS (Node1 agtype, Node2 agtype, Node3 agtype, Node4 agtype);
Data Setup:
We have populated the graph with data using queries like the following:
SELECT * FROM cypher('graph_name', $$
MERGE (n:NodeType1 {key1: "value1"})
SET n.property1 = "value1",
n.property2 = "value2",
n.property3 = "value3",
....
$$) AS (result_column agtype);
Problem:
The query plan indicates that a sequential scan is being used on NodeType1 and other nodes, despite indexes being present on attribute1 and attribute2. For performance, we expect the query to utilize the indexes for an index scan.
Observed Behavior:
The query consistently uses sequential scans.
Setting enable_seqscan = off doesn't change the behavior.
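For reference, this is roughly how the plan can be captured. It is a minimal sketch, assuming a reduced version of the query above (only the filtered MATCH on NodeType1) and a single illustrative output column; graph_name and '<value1>' are the same placeholders used elsewhere in this post.

```sql
-- Load AGE and put its catalog on the search path for this session.
LOAD 'age';
SET search_path = ag_catalog, "$user", public;

-- Discourage sequential scans for this session only.
SET enable_seqscan = off;

-- Reduced form of the query: just the filtered MATCH on NodeType1.
-- VERBOSE prints the full filter expression the planner evaluates.
EXPLAIN (ANALYZE, VERBOSE)
SELECT * FROM cypher('graph_name', $$
    MATCH (n1:NodeType1)
    WHERE n1.attribute1 = '<value1>'
    RETURN n1
$$) AS (n1 agtype);

-- Restore the default planner setting.
RESET enable_seqscan;
```

The Filter line in the VERBOSE output is the part worth comparing against the index definitions, since PostgreSQL only considers an expression index when the query expression matches the indexed expression.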
Expected Behavior:
The query should leverage the indexes on NodeType1.attribute1 and NodeType1.attribute2 to perform an index scan.
Environment Details:
We are running a containerized Apache AGE Docker image on Kubernetes.
Apache AGE version: release_PG16_1.5.0
PostgreSQL version: 16
K8S Version: v1.29.6
What We've Tried:
- Verified that the relevant indexes exist.
- Created indexes on individual properties, e.g., attribute1 and attribute2:
  CREATE INDEX idx_attribute1 ON graph_table USING btree ((properties->>'attribute1'));
  CREATE INDEX idx_attribute2 ON graph_table USING btree ((properties->>'attribute2'));
- Created an index on the entire properties column for broader coverage:
  CREATE INDEX idx_properties ON graph_table USING gin (properties);
- Set enable_seqscan = off.
- Rebuilt the indexes and re-analyzed the table using ANALYZE.
- Simplified the query to test individual segments, but observed the same issue.
- Confirmed that the MERGE queries work as expected with the indexes: we see a significant difference in write latencies with and without them.
- Ingested a reasonable amount of data (around 25k rows), which should be enough to warrant an index scan.
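The index check and statistics refresh from the list above can be sketched as follows. This assumes AGE's usual storage layout, where each vertex label has its own table, named after the label, in a schema named after the graph; graph_name and NodeType1 are the placeholders from the query above, so the actual names will differ.

```sql
-- Assumption: the label table for NodeType1 lives at graph_name."NodeType1".
-- List every index defined on that table, with its full definition.
SELECT indexname, indexdef
FROM pg_indexes
WHERE schemaname = 'graph_name'
  AND tablename  = 'NodeType1';

-- Refresh planner statistics for the same label table.
ANALYZE graph_name."NodeType1";

-- Confirm the row count the planner is working with.
SELECT count(*) FROM graph_name."NodeType1";
```

The indexdef column shows the exact indexed expression, which makes it easier to compare against the filter expression that appears in the query plan.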
Questions:
Why does the MATCH query ignore available indexes and use sequential scans?
Are there any specific configurations or query optimizations required to enable index scans in Apache AGE for graph queries?
Could this be a limitation or a bug in Apache AGE?
Any help or insights from the community would be greatly appreciated!
If additional information, logs, or examples are needed, please let me know.
Thank you!