Link nodes hierarchically during LOAD CSV

Hi everyone,
I'm reaching out because I have a problem :slight_smile:
I have a CSV file I'm getting from an export that is structured as follows:

 L1  | L2  | L3  | L4  | ...
 id1 |     |     |     | ...
     | id2 |     |     | ...
     | id3 |     |     | ...
     |     | id4 |     | ...
     |     | id5 |     | ...
     |     |     | id6 | ...
     |     |     | id7 | ...
     | id8 |     |     | ...
 id9 |     |     |     | ...

As you can see, there is a hierarchy here between the rows:

  • id1
    • id2
    • id3
      • id4
      • id5
        • id6
        • id7
    • id8
  • id9

I'm trying to recreate that hierarchy in my graph using Cypher, but I honestly don't know where to begin. I already have a working import using LOAD CSV that creates all the nodes and some relationships. The only missing part is the ()-[:CHILD_OF]->() relationship.

Has anyone faced this situation already? Do you have a strategy and/or code to share?
Any help is very much appreciated :slight_smile:

It would be better for your CSV to be simpler. As it stands, the number of columns depends on the depth of the hierarchy, whereas for simple connections like these you should have a fixed number of columns.

In this case, it would be far easier to use a CSV formatted like:


Something like this, where everything you need for the relationship is represented on a single row (relationships represented in a CSV should not depend on other rows or on row ordering). Then do passes to MERGE the nodes (2 passes to avoid Eager operators), followed by a final pass to MATCH the nodes and MERGE the relationships between them.
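For anyone who would rather script the reshaping than do it by hand, here is a minimal Python sketch of one way to flatten the staircase export into one (child, parent) pair per row. It is only an illustration under some assumptions: the hierarchy columns are comma-separated with a header row, each data row has exactly one filled cell, and a node never sits more than one level deeper than the row before it. The function name `flatten_hierarchy` and the child/parent layout are made up for this example, not something from the original export.

```python
import csv
import io

def flatten_hierarchy(rows):
    """Turn staircase rows into (child, parent) pairs.

    Assumption: each row is a list of cells in which the first non-empty
    cell holds the node id, and that cell's column index is the node's depth.
    """
    pairs = []
    ancestors = []  # ancestors[d] is the most recent id seen at depth d
    for row in rows:
        filled = [(depth, cell.strip()) for depth, cell in enumerate(row) if cell.strip()]
        if not filled:
            continue  # skip blank lines
        depth, node_id = filled[0]
        if depth > len(ancestors):
            # A node skipped a level, so its parent cannot be determined.
            raise ValueError(f"{node_id} is more than one level below the previous row")
        # Forget everything at this depth or deeper, then the last
        # remaining entry (if any) is the parent.
        del ancestors[depth:]
        if ancestors:
            pairs.append((node_id, ancestors[-1]))
        ancestors.append(node_id)
    return pairs

# Sample data mirroring the table in the question (hierarchy columns only).
SAMPLE = """L1,L2,L3,L4
id1,,,
,id2,,
,id3,,
,,id4,
,,id5,
,,,id6
,,,id7
,id8,,
id9,,,
"""

reader = csv.reader(io.StringIO(SAMPLE))
next(reader)  # skip the header row
pairs = flatten_hierarchy(reader)

# Write the fixed two-column layout that a later LOAD CSV pass could read.
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["child", "parent"])  # hypothetical column names
writer.writerows(pairs)
print(out.getvalue())
```

Root nodes such as id1 and id9 produce no row, since they have no parent. Each output row then carries everything needed for one CHILD_OF relationship, independent of row order.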

Sounds like the nodes already exist. Add an index on them (if one doesn't exist already) to support quick matching, then MATCH the nodes and CREATE the relationships between them.

I followed your advice and it works perfectly. I added some Excel macros to reduce manual data transformations as much as possible. Thanks @andrew_bowman