We are indexing a property of our nodes, and the server is clearly doing something (the physical size of the index on disk keeps growing), but the job has been running for several days (more than a week) with no sign of completion. Several days of indexing on a powerful machine suggests to me that something is wrong, but I don't know what, so I'm asking for help here to diagnose the problem.
- The number of nodes is 999x10^6, essentially a billion.
- The property is a string hash: 64 completely random characters.
- We are using a Neo4j Enterprise Causal Cluster created from the Google Cloud Marketplace template. The cluster has just 3 core members:
- The leader has 32 CPU cores and 128 GB of RAM.
- The followers have 4 CPU cores and 128 GB of RAM.
- Memory is configured as recommended by neo4j-admin memrec.
- Memory usage for the Java process on the leader is reported as 64%.
- CPU usage sits consistently at just 10%.
- The index size keeps growing, although slowly:
neo4j-enterprise-causal-cluster-1-core-vm-1:/var/lib/neo4j/data/databases/graph.db$ while true; do echo "$(date -Iseconds) $(du -ck schema/index/native-btree-1.0/* | grep total)" ; sleep 60; done
2019-04-18T07:06:07+00:00 145382404 total
2019-04-18T07:07:07+00:00 145384212 total
2019-04-18T07:08:07+00:00 145386312 total
2019-04-18T07:09:07+00:00 145388496 total
2019-04-18T07:10:07+00:00 145390420 total
2019-04-18T07:11:07+00:00 145392748 total
- There is only one index being created at the moment, and it has been running for 6 days.
I don't know whether these numbers are "normal" for a problem of this size, but I would appreciate any help in diagnosing whether this is expected, or whether I should tweak some parameter or check a log to find out what is causing the slowdown.
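One sanity check on the numbers: the du -ck totals above are in KiB, so the samples imply a growth rate of only about 2 MiB per minute, which at this index's scale would mean weeks more of population. A minimal sketch of the arithmetic, with the values copied from the first and last samples:

```shell
# First and last "du -ck ... total" values above (KiB), 5 minutes apart
start=145382404
end=145392748
minutes=5
echo "growth: $(( (end - start) / minutes )) KiB/min"   # prints "growth: 2068 KiB/min"
```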
No, this is definitely not normal.
Which Neo4j version is this running?
What kind of disk setup do you have? Do you have any info about the disk performance?
How did you create the index?
How is the I/O load? What kind of disk did you provision?
How big is that store on disk?
Can you add this to your config?
- I'm using Neo4j 3.5.3 from the Google Cloud Marketplace (link)
- For disks I'm using Google's standard persistent disks. You can check the specs here: Storage options | Compute Engine Documentation | Google Cloud
- I created the index with a Cypher query from Neo4j Desktop: "CREATE INDEX ON :Transactions(hash)"
- The store of the DB can be checked below; disk size is 2 terabytes
neo4j-enterprise-causal-cluster-1-core-vm-1:/var/lib/neo4j/data/databases$ du -h *
The DB was created using the neo4j-import tool, and after that I launched the index creation.
How could I check I/O load for the DB?
I'll add that config to the DB and I will report back.
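On the I/O-load question: one way to check it, assuming a Linux VM with the sysstat package installed, is iostat. High %util with low throughput on the disk holding graph.db would point at the standard persistent disk being the bottleneck (device names vary per VM):

```shell
# Extended per-device I/O stats, refreshed every 5 seconds, 3 reports.
# Watch the %util and await columns for the disk holding graph.db.
if command -v iostat >/dev/null 2>&1; then
  iostat -x 5 3
else
  echo "iostat not found; install the sysstat package" >&2
fi
```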
Ok, yesterday I restarted the DB with
At the beginning it was very fast when I was checking call db.indexes, but now it is stuck. I have two clusters to test ideas about how to solve this: one has a leader with 128 GB of RAM and the other with 64 GB.
- In first cluster, index creation is stuck (although slowly advancing) at 37%
- In second cluster, index creation is stuck (although slowly advancing) at 15%
They were both launched within the same hour, and the index creation status seems to follow the memory ratios. Does that ring any bells? I'm running out of ideas. I don't know if it is relevant, but these machines have no swap, only RAM.
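To log those percentages over time without Neo4j Desktop, the db.indexes call mentioned above can be scripted through cypher-shell, in the spirit of the du loop earlier in the thread. A sketch — the address and password below are placeholders for your cluster's values:

```shell
# One-shot index population check; wrap in watch(1) or a loop for a log.
# In Neo4j 3.5, db.indexes() yields a 'state' (e.g. POPULATING/ONLINE)
# and a numeric 'progress' percentage for each index.
if command -v cypher-shell >/dev/null 2>&1; then
  cypher-shell -a bolt://localhost:7687 -u neo4j -p '<password>' \
    "CALL db.indexes() YIELD description, state, progress RETURN description, state, progress;"
else
  echo "cypher-shell not on PATH" >&2
fi
```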
Sorry for the delay, answer from our team:
Took a quick look - looks like they're on 3.5.3? I think they need to be on 3.5.5 or higher to get all the index population fixes.
(and will still need
Hi Michael. I'm running into a similar problem where index building is taking a prohibitively long time (~7 hours to index a node property on 200M nodes). I've tried using this blockBasedPopulation setting, but building the index with it enabled causes an out-of-memory crash. The index builds for a while (around an hour) with low memory usage; then memory usage spikes, and the DB gets OOM-killed.
Any ideas on how this could be resolved? I'm using 3.5.6.
It works! An index that was taking 7 hours to build now takes 45 minutes, and uses pretty much constant memory.
The type of disk, OS, and drivers/firmware also affect this.
Data imported on Ubuntu and then indexed took a few hours; the same data and database imported onto AWS Linux or CentOS 7.4 indexed in less than 10 minutes, with the same SSD and I/O provisioning on both OSes. Some OSes simply have better drivers than others.
This was happening to my database as well: with 488 GB of RAM on an AWS i3.16xlarge, the entire DB was being killed after 45 minutes. We are moving to 3.5.7 now to see if it fixes the issue.
Update: 3.5.7 has fixed it; the machine no longer dies on index creation.