Load-CSV very slow with millions of nodes

Two things:

1.) Use neo4j-admin import, documented here: Import - Operations Manual
We load 6 billion nodes and 10 billion relationships in 2-3 hours this way. You want to break the files down into distinct nodes and distinct pairs of nodes. If your numeric IDs are unique, you may be able to use the `--id-type=INTEGER` flag, which reduces the memory the import needs. A sketch is below.
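Roughly what the invocation looks like. This is a minimal sketch assuming Neo4j 4.x flag syntax (3.x used `--nodes:Person persons.csv` instead) and hypothetical file names `persons.csv` / `knows.csv`:

```bash
# Hypothetical headers: persons.csv needs "id:ID,name",
# knows.csv needs ":START_ID,:END_ID".
# --id-type=INTEGER tells the importer IDs are numeric, saving memory.
bin/neo4j-admin import \
    --nodes=Person=persons.csv \
    --relationships=KNOWS=knows.csv \
    --id-type=INTEGER
```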

2.) If you have to use some kind of LOAD CSV, load in parallel. Consider apoc.periodic.iterate, seen here: Neo4j APOC Procedures User Guide.

Relationships cannot be loaded in parallel unless you are sure the file is built so that no two CPUs will try to grab the same nodes, so it looks like you may be stuck with slow loading there. If you can be smart about sorting the data so that concurrent batches never race for the same nodes, I think it could be parallelized: with a batch size of 1000 and 8 CPUs, rows 1-1000 (one batch) can share nodes with each other, but the seven batches running alongside it (rows 1001-8000) must touch different nodes. It might be tough, but it could give a lift if the sort was figured out. Also, I bet Neo4j would appreciate it xD!

I would also suggest increasing the commit size to 100k; 5k is small. You have 30 GB of RAM to use when you load relationships, so you might be able to get away with higher, like 300k per batch. Both ideas are sketched below.
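For the node pass, parallel batching is safe because each row only touches its own node. A minimal sketch, assuming hypothetical file and property names (`persons.csv`, `id`, `name`):

```cypher
// Nodes: parallel:true is safe, each row MERGEs a single node keyed by its own id.
CALL apoc.periodic.iterate(
  "LOAD CSV WITH HEADERS FROM 'file:///persons.csv' AS row RETURN row",
  "MERGE (p:Person {id: toInteger(row.id)}) SET p.name = row.name",
  {batchSize: 100000, parallel: true});
```

For relationships, keep `parallel: false` unless you have done the sorting described above, but raise the batch size. Again a sketch with hypothetical column names (`src`, `dst`) in `knows.csv`:

```cypher
// Relationships: parallel:false avoids lock contention on shared nodes;
// a larger batchSize (100k here instead of 5k) cuts per-commit overhead.
CALL apoc.periodic.iterate(
  "LOAD CSV WITH HEADERS FROM 'file:///knows.csv' AS row RETURN row",
  "MATCH (a:Person {id: toInteger(row.src)})
   MATCH (b:Person {id: toInteger(row.dst)})
   MERGE (a)-[:KNOWS]->(b)",
  {batchSize: 100000, parallel: false});
```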