Cluster not processing write Cypher

We have a cluster of 3 servers (normal config, leader is read/write, 2 followers are read).

We're having an issue where WRITE queries don't complete; they eventually fail with an OOM exception.

The Cypher (with labels and properties obfuscated):

UNWIND ["01", "02", "03", "04", "05", "06", "07", "08", "09"] as pnum
MATCH (:Thing {thingUniqueNumber:pnum})-[r:REL_TO_DELETE]->()
RETURN pnum, r

returns in 13ms with 27 results.

Replacing the RETURN with DELETE r makes the query run for hours until an OOM message is logged (I'll update with the actual message if I can).

Possibly related: one of our followers stopped picking up the heartbeat and receiving replicated transactions from the leader after it was restarted earlier in the week. We cleared its cluster_state folder, as that was the recommended solution.

Thanks for any help!

-Mike French

A reboot of the cluster has resolved the issue. Further investigation returned this:
ERROR LEAK: ByteBuf.release() was not called before it's garbage-collected. See Reference counted objects for more information.

What MAY have caused that:
There were 2 sessions, one adding labels and one removing a different label from the same subset of nodes. That may have caused lock contention.

Hello Mike,

Which version are you using?
Are you able to reproduce it?

Enterprise version 3.5.5, and we haven't been able to reproduce it yet.

  1. Can you try to update to the latest maintenance release available?

  2. Did you notice any GC pauses on any node of the cluster?
    You should see messages like this in debug.log: Detected VM stop-the-world
    as mentioned in "Fatal error occurred when handling a client connection causes crash",
    and in neo4j.log: java.lang.OutOfMemoryError: Java heap space
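
If it helps with the two log checks above, here is a minimal Python sketch for scanning the logs on each cluster member. It only does a substring match on the two strings quoted above. The function names and the log paths in the usage note are hypothetical, not part of Neo4j, and the full wording of the log lines may differ between versions.

```python
from pathlib import Path

# Marker strings quoted in the reply above; the rest of each log line
# may vary by Neo4j version, so only a substring match is used.
GC_PAUSE_MARKER = "Detected VM stop-the-world"
OOM_MARKER = "java.lang.OutOfMemoryError"


def find_lines(lines, marker):
    """Return (line_number, text) pairs for every line containing marker."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if marker in line
    ]


def scan_log(path, marker):
    """Scan a single log file (hypothetical helper; the path is yours to supply)."""
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with Path(path).open(encoding="utf-8", errors="replace") as handle:
        return find_lines(handle, marker)
```

For example, `scan_log("logs/debug.log", GC_PAUSE_MARKER)` and `scan_log("logs/neo4j.log", OOM_MARKER)` would list the matching lines with their line numbers, assuming the logs live under `logs/` on your install. Running this on all three members should show whether the pauses were limited to the leader.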