TL;DR:
Slow Cypher query on a large dataset. Why?
QUERY:
EXPLAIN
WITH "(?i).test.*" as target
MATCH (AA:A)-[r:B]-(CC:C)
WHERE ANY(host IN AA.D WHERE host =~ target)
OPTIONAL MATCH (DD:D) WHERE CC.id =~ DD.CCid
OPTIONAL MATCH (EE:E) WHERE CC.id =~ EE.CCid
OPTIONAL MATCH (FF:F) WHERE CC.id =~ FF.CCid
OPTIONAL MATCH (GG:G) WHERE CC.id =~ GG.CCid
OPTIONAL MATCH (HH:H) WHERE CC.id =~ HH.CCid
WITH COLLECT(AA) + COLLECT(CC) + COLLECT(DD)
+ COLLECT(EE) + COLLECT(FF) + COLLECT(GG)
+ COLLECT(HH) as data
UNWIND data as datum
RETURN DISTINCT datum;
CONTEXT:
Without Neo4j Enterprise (procurement delays), I've been trying to performance-tune some queries by hand using EXPLAIN and PROFILE, with limited success. I'd like to leverage the wisdom of crowds at this point, if feasible.
Any insights into the performance characteristics of the query above, or into any obviously non-performant clauses in it, would be immensely appreciated.
Thanks,
Neo4j User