MATCH Query Always Using Sequential Scan, Ignoring Index Scan #2137

@pritish-moharir

Hi everyone,

I'm encountering an issue with Apache AGE where a complex MATCH query always uses a sequential scan, even though indexes exist on the queried columns. Disabling sequential scans with SET enable_seqscan = off has no effect: the query plan (the EXPLAIN ANALYZE output) still shows a sequential scan.

Query Example:

Here's a simplified version of the query we are using:

SELECT * FROM cypher('graph_name', $$  
MATCH (n1:NodeType1)  
WHERE n1.attribute1 = '<value1>'  
  AND n1.attribute2 IN ('<value2>')  
WITH n1  
OPTIONAL MATCH (n1)-[:RelType1_NodeType1]-(n2:NodeType2)  
WITH n1, n2  
OPTIONAL MATCH (n1)-[:RelType2_NodeType1]-(n3:NodeType3)  
WITH n1, n2, n3  
OPTIONAL MATCH (n1)-[:RelType3_NodeType1]-(n4:NodeType4)  
RETURN DISTINCT n1 AS Node1, n2 AS Node2, n3 AS Node3, n4 AS Node4  
$$) AS (Node1 agtype, Node2 agtype, Node3 agtype, Node4 agtype);

Data Setup:

We populated the graph with data using queries like the following:

SELECT * FROM cypher('graph_name', $$  
MERGE (n:NodeType1 {key1: "value1"})  
SET n.property1 = "value1",  
    n.property2 = "value2",  
    n.property3 = "value3",   
    ....
$$) AS (result_column agtype); 

Problem:

The query plan indicates that a sequential scan is being used on NodeType1 and other nodes, despite indexes being present on attribute1 and attribute2. For performance, we expect the query to utilize the indexes for an index scan.

Observed Behavior:

The query consistently uses sequential scans.
Setting enable_seqscan = off doesn't change the behavior.
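
For anyone trying to reproduce the observation, the plan can be inspected roughly as follows. This is a minimal sketch, not the exact session used here: it assumes that AGE accepts EXPLAIN ANALYZE as a prefix inside the cypher() call and returns the plan as a single text column (named plan here for illustration), and it uses only the first part of the query.

```sql
-- Assumption: AGE is installed in this database and ag_catalog is on the search path.
LOAD 'age';
SET search_path = ag_catalog, "$user", public;

-- Discourage sequential scans for this session only.
SET enable_seqscan = off;

-- Assumption: the EXPLAIN ANALYZE prefix inside cypher() yields the plan as text rows.
SELECT * FROM cypher('graph_name', $$
    EXPLAIN ANALYZE
    MATCH (n1:NodeType1)
    WHERE n1.attribute1 = '<value1>'
    RETURN n1
$$) AS (plan text);
```

The Filter line in the resulting plan shows the exact expression PostgreSQL evaluates on the properties column, which is the expression an index would have to match.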

Expected Behavior:

The query should leverage the indexes on NodeType1.attribute1 and NodeType1.attribute2 to perform an index scan.

Environment Details:

We are running a containerized Apache AGE Docker image on Kubernetes (K8s).
Apache AGE version: release_PG16_1.5.0
PostgreSQL version: 16
K8S Version: v1.29.6

What We've Tried:

  • Verified that the relevant indexes exist.

  • Created indexes on individual properties, e.g., attribute1 and attribute2.
    CREATE INDEX idx_attribute1 ON graph_table USING btree ((properties->>'attribute1'));
    CREATE INDEX idx_attribute2 ON graph_table USING btree ((properties->>'attribute2'));

  • Created indexes on the entire properties column for broader coverage.
    CREATE INDEX idx_properties ON graph_table USING gin (properties);

  • Set enable_seqscan = off.

  • Rebuilt the indexes and reanalyzed the table using ANALYZE.

  • Simplified the query to test individual segments but observed the same issue.

  • The MERGE queries work as expected with indexes: we see a significant difference in write latencies with and without indexes.

  • We have ingested a good amount of data (around 25k rows) so as to warrant an index scan.
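
For comparison, here is a sketch of index definitions that may line up more closely with the expressions AGE generates. It rests on assumptions that are not confirmed in this report: that each label is stored as a table named after the label inside a schema named after the graph (so graph_name."NodeType1" rather than graph_table), that a WHERE comparison on n1.attribute1 is planned as a call to agtype_access_operator on the properties column, and that a property map inside the MATCH pattern is planned as a containment (@>) test that a GIN index can serve. The index names are hypothetical.

```sql
-- Assumption: AGE is installed in this database and ag_catalog is on the search path.
LOAD 'age';
SET search_path = ag_catalog, "$user", public;

-- List the indexes that actually exist on the label table.
SELECT indexname, indexdef
FROM pg_indexes
WHERE schemaname = 'graph_name'
  AND tablename = 'NodeType1';

-- Expression index written with the accessor function that the planner is
-- assumed to use for n1.attribute1 (instead of properties->>'attribute1').
CREATE INDEX idx_nodetype1_attribute1
ON graph_name."NodeType1" USING btree (
    agtype_access_operator(VARIADIC ARRAY[properties, '"attribute1"'::agtype])
);

-- GIN index on the whole properties column of the label table.
CREATE INDEX idx_nodetype1_properties
ON graph_name."NodeType1" USING gin (properties);

-- Refresh planner statistics for the label table.
ANALYZE graph_name."NodeType1";

-- Same filter expressed as a property map in the pattern, which is the form
-- assumed to be eligible for the GIN index.
SELECT * FROM cypher('graph_name', $$
    EXPLAIN
    MATCH (n1:NodeType1 {attribute1: '<value1>'})
    RETURN n1
$$) AS (plan text);
```

If the indexdef column from pg_indexes does not contain the same expression that appears in the plan's Filter line, the planner has no way to use that index, regardless of enable_seqscan.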

Questions:

Why does the MATCH query ignore available indexes and use sequential scans?
Are there any specific configurations or query optimizations required to enable index scans in Apache AGE for graph queries?
Could this be a limitation or a bug in Apache AGE?
Any help or insights from the community would be greatly appreciated!

If additional information, logs, or examples are needed, please let me know.

Thank you!

Labels: Stale (stale issues/PRs), question (further information is requested)
