Speeding up my graph update

Guys, how are you?

I am updating my graph to get rid of my Year nodes. I wrote the following query, and it has been running for roughly 8 hours but has completed only about 15% of the job so far:

call apoc.periodic.commit("
  WHERE afe.date is null
  WITH afe, y
  SET afe.date = date({year: y.value})

My feeling is that the query slows down over time because, as more nodes get updated, it becomes increasingly hard to find the remaining nodes with null values.

Question: would it be better to build an index on that property beforehand? Or would that make it worse, since the index would also have to be updated during the execution?

Is there any other way to speed things up?

Thanks in advance,

I believe you're correct. Because the query is re-executed on every iteration, each run matches and re-evaluates the properties of the same nodes again, including the ones that have already been updated.
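To make the slowdown concrete, here is a rough toy model in Python. It is only a sketch, not how Neo4j executes anything internally: it assumes each re-run scans the nodes in the same order from the start, and the names commit_style and iterate_style are made up for illustration. It simply counts how many nodes get examined when the match is repeated with a limit versus when it is done once and then batched.

```python
def commit_style(num_nodes: int, limit: int) -> int:
    """Toy model: re-run the match from the start on every round.

    Each round scans from the first node, skips nodes that already have a
    date, and stops once `limit` nodes without a date have been found.
    Returns the total number of nodes examined across all rounds.
    """
    dates = [None] * num_nodes
    examined = 0
    while True:
        batch = []
        for i, d in enumerate(dates):
            examined += 1
            if d is None:
                batch.append(i)
                if len(batch) == limit:
                    break
        if not batch:
            # Nothing left to update: the loop ends after one last full scan.
            break
        for i in batch:
            dates[i] = 2020  # placeholder value standing in for the new date
    return examined


def iterate_style(num_nodes: int, batch_size: int) -> int:
    """Toy model: match once, then apply the updates in batches.

    Returns the number of nodes examined by the single matching pass.
    """
    dates = [None] * num_nodes
    matched = [i for i, d in enumerate(dates) if d is None]  # one pass only
    examined = num_nodes
    for start in range(0, len(matched), batch_size):
        for i in matched[start:start + batch_size]:
            dates[i] = 2020
    return examined


if __name__ == "__main__":
    print(commit_style(10, 2), iterate_style(10, 2))
```

With 10 nodes and batches of 2, the repeated match examines 40 nodes in total while the single match examines 10. In this model the gap grows roughly with the square of the node count, which would be consistent with a job that keeps getting slower the further it gets.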

You may want to try apoc.periodic.iterate() instead. It's designed to match only once, then stream the results and process them in batches:

CALL apoc.periodic.iterate("
  WHERE afe.date is null
  RETURN afe, y",
  "SET afe.date = date({year: y.value})", {batchSize:1500}) YIELD total, batches, errorMessages
RETURN total, batches, errorMessages

The batching is handled for you here, so there's no need for an explicit LIMIT.


Hey Andrew,

Thank you for your answer. I will give apoc.periodic.iterate a go.

Could you clarify one thing? Should I build an index beforehand or not?



I tried the suggested APOC procedure and it is running way faster!

I did not create the index, as I wanted a fair comparison with the previous run.


Glad to hear it!

With your query, an index wouldn't have helped, as there's no property lookup here (null checks on node properties don't use an index lookup).