Multiple matches performance drop

Hello everyone,

I am facing a weird issue with Neo4j. I have a relatively large graph with about 2 million nodes, and I would like to run personalized PageRank on some lists of nodes. I use the following syntax to grab the nodes I need:
MATCH (a:type {id:value})
MATCH (b:type {id:value2})
MATCH (c:type {id:value3})
but it does not perform well.

More specifically, fetching 500 nodes, even without feeding them to PageRank, takes about a minute, and fetching 1000 takes about 10 minutes, which is not the linear increase I expected.
Using PROFILE reveals that cartesian products are formed, first for a, b, then for a, b, c, and so on. Given that id has a unique index and that I provide the right node label, is this performance drop for multiple matches expected? If not, what could be the culprit?

Thanks in advance,

In this kind of case, a cartesian product is expected and correct, and since these are unique indexes your result should only be a single row.
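To illustrate why the cartesian product itself should stay small here, below is a minimal Python sketch (not Neo4j code). It assumes each MATCH behaves like a list of result rows; the row names are made up for illustration.

```python
from itertools import product

# Each inner list stands for the rows returned by one MATCH.
# With a unique index on id, every MATCH yields exactly one row.
unique_matches = [["a"], ["b"], ["c"]]
unique_rows = list(product(*unique_matches))
print(len(unique_rows))

# Hypothetical contrast: if each MATCH returned two rows,
# the product would grow multiplicatively (2 * 2 * 2).
non_unique_matches = [["a1", "a2"], ["b1", "b2"], ["c1", "c2"]]
non_unique_rows = list(product(*non_unique_matches))
print(len(non_unique_rows))
```

So a cartesian product over single-row lookups still produces a single row; the plan operator alone does not explain a blow-up in result size.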

It would help to confirm the existence of an index on :type(id), and to see the PROFILE query plan with all elements expanded.

Thanks for the feedback.

One thing I should note, even though it may be clear from the first post, is that when I need to find N nodes, I perform N matches, so the final query has about N lines. From what I found online, this is considered bad practice. Maybe I should use some form of batching instead? Or would that not be relevant?

I think we would need to see the full query, with some description of what it's supposed to do, before we can make that call.

You should

  1. have an index or constraint
  2. use parameters
  3. use IN
MATCH (n:Type) WHERE n.id IN $params
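
As a rough sketch of how the three points could fit together on the client side, the Python below builds one fixed query string and splits the id list into parameter maps. It assumes the client is written in Python, which the thread does not state; the helper name `batched_params`, the batch size, and the sample ids are hypothetical, and the label `Type` and parameter `$params` follow the fragment above.

```python
from typing import Iterator

# One fixed query text, so it does not grow with the number of ids.
# Label, property, and parameter names are assumptions based on this thread.
QUERY = "MATCH (n:Type) WHERE n.id IN $params RETURN n"

def batched_params(ids: list, batch_size: int = 1000) -> Iterator[dict]:
    """Yield one parameter map per batch of ids (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(ids), batch_size):
        yield {"params": ids[start:start + batch_size]}

# Made-up ids for illustration only.
sample_ids = list(range(2500))
batches = list(batched_params(sample_ids, batch_size=1000))
print(len(batches))
```

Each parameter map would then be sent together with `QUERY`, for example through `session.run(QUERY, params)` in the official Python driver. Whether batching is needed at all for 500 to 1000 ids is not established here; a single list parameter may well be enough.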