Create a relationship based on the property value of the generated node

hilumen94 · May 23, 2022, 9:21am

Hello.
Currently, I have created about 1,000,000 nodes from the "citations file.csv" file, and each node has the properties {lensid, title, year, journal_name}. I created the nodes as follows.

LOAD CSV FROM 'file:///citations%20file.csv' AS ci
CREATE (jj:Journal{lensid:ci[0], title:ci[1], year:ci[2], journal_name:ci[3]})

Now I am trying to connect the nodes by creating relationships based on a unique identifier called lensid.
The matching identifiers are in a separate "reference file.csv" file, which has one (lensid, reference) pair per line. Each lensid has about 30 references on average, so the file contains approximately 30,000,000 pairs. Below is what I wrote to create the relationships. ref[0] is the lensid, and ref[1] is the identifier of a reference belonging to that lensid.

LOAD CSV FROM 'file:///reference%20file.csv' AS ref
WITH ref
MATCH (j:Journal{lensid:ref[1]})
WITH j, ref
MATCH(j2:Journal{lensid:ref[0]})
CREATE (j)-[:referenced]->(j2)

The question is this.
I ran the query above, but it has been running for 3 days and is still not finished.
It seems that the MATCH steps take most of the time.
Could somebody give me some advice on how to make the relationship creation more efficient?

I don't know if it's necessary, but the hardware information and RAM allocation are as follows.
CPU : intel i7-9750h
RAM : 32GB
dbms.memory.heap.initial_size=16G
dbms.memory.heap.max_size=16G
dbms.memory.pagecache.size=10G

Thank you for reading.

glilienfield · May 23, 2022, 10:59am

Do you have an index on the ‘lensid’ property for the Journal label? If not, each MATCH will perform a full scan of the Journal nodes, and the query does two such matches for every row of the CSV. You could use EXPLAIN to see the query plan.
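
As a minimal sketch of that check, assuming Neo4j 4.x syntax: the index name journal_lensid and the literal 'some-lensid' are placeholders I made up for illustration, not values from your data.

```cypher
// Create an index on Journal.lensid if one does not exist yet.
// "journal_lensid" is a hypothetical index name.
CREATE INDEX journal_lensid IF NOT EXISTS
FOR (j:Journal) ON (j.lensid);

// Show the plan without running the query.
// 'some-lensid' is a placeholder value.
EXPLAIN
MATCH (j:Journal {lensid: 'some-lensid'})
RETURN j;
```

If the index is being used, the plan should show a NodeIndexSeek operator. A NodeByLabelScan followed by a Filter means every Journal node is scanned for that match.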

Also, you shouldn’t need the WITH clauses in your query. The following should work too.

LOAD CSV FROM 'file:///reference%20file.csv' AS ref
MATCH (j:Journal{lensid:ref[1]})
MATCH (j2:Journal{lensid:ref[0]})
CREATE (j)-[:referenced]->(j2)
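
Separately from the index, and only as a suggestion beyond what was asked: with about 30,000,000 rows it may help to commit in batches rather than in one large transaction. A rough sketch, assuming Neo4j 4.x where USING PERIODIC COMMIT is available; the batch size of 10000 is an arbitrary choice of mine.

```cypher
// Commit after every 10000 rows instead of once at the end.
// The batch size is an assumption; tune it for your machine.
USING PERIODIC COMMIT 10000
LOAD CSV FROM 'file:///reference%20file.csv' AS ref
MATCH (j:Journal {lensid: ref[1]})
MATCH (j2:Journal {lensid: ref[0]})
CREATE (j)-[:referenced]->(j2)
```

In Neo4j Browser this needs the :auto prefix in front of the query. Newer versions replace this clause with CALL { ... } IN TRANSACTIONS, so check which form your version supports.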

hilumen94 · May 23, 2022, 11:50pm

The 'lensid' property is set on all Journal nodes.
I confirmed that the query you suggested also runs normally.
I will check the query plan as you mentioned.
Thank you for your reply.
