How to create relationship between two existing nodes by using node id?

Hello,

I am trying to create a relationship between two existing nodes. I am reading the node ID's from a CSV and creating the relationship with the following query:

LOAD CSV WITH HEADERS FROM "file:///8245.csv" AS f
MATCH (Ev:Event) where id(Ev) =f.first 
MATCH (Ev_sec:Event) where id(Ev_sec) = f.second
WITH Ev, Ev_sec
MERGE  (Ev) - [:DF_mat] - > (Ev_sec)

However, it is not changing anything the database. I am totally aware that this is not a safe way to create the relationship but I do not have any other key property for this node.
So, can anyone tell me what the problem here is?

Thanks!

relationship with the following query:
Hi @pinartyilmaz:

Please go the documentation on how to load csv.

  1. Use Match when you try to select something from Neo4j DB. In your case it should be Create/ Merge.

  2. Every object in Neo4j has a metadata id column and this column can’t be overwrite by user . The value of that property can we retrieved by using id(node-var). You yourself cannot initialise it.

  3. Please go to documentation on difference between Merge and Create in Neo4j. I used Create for Relationship so that every time it will create new edge . However based on your usecase you can use Merge too instead of Create.

Try below code for your data load.

LOAD CSV WITH HEADERS FROM "file:///8245.csv" AS f
Merge (Ev:Event{ID:f.first })
Merge( (Ev_sec:Event{ID:  f.second})
Create  (Ev) - [:DF_mat] - > (Ev_s

Hello @intouch_vivek,

Thank you for your reply!

I do not understand why I should Merge, I should use Match because I am not creating new nodes, my nodes are already existing in the database
Also, `(Ev:Event{ID:f.first }) will not work because I do not have any property called ID for event node. As you told, I am retrieving the node id by using id(node-var).

Ah ok.. thanks for clarifying that If you already have node present in the DB and you are using the current load to create a relationship.

in that case what do you mean by id(Ev)?
id(Ev) mean property id value of the respective node. This id is DB specific.

However if you have an custom property id respective to the node then use Ev.id
In short when you are saying id(Ev) then that means property id of a node created by Neo4j for it's own use.
Ev.id act same as other property of a node.

image
In the attached pic both the property are different is created by db and other is property created by user

If above is correct then please try
LOAD CSV WITH HEADERS FROM "file:///8245.csv" AS f
MATCH (Ev:Event {id :f.first })
MATCH (Ev_sec:Event{ id: f.second})
MERGE (Ev) - [:DF_mat] - > (Ev_sec)

Infact when you download node from db to csv id does not come.
However if by property id in the csv file you meant to say that

Yeah, I am using id(Ev) in terms of property id value which is given by DB. True id does not come, but I queried it and in my CSV I have id(node) values which are columns first and second. That is why I want to create a new relationship depending on that id values.
I am not sure whether you wrote the following line intentionally but it is unfortunately not working:

MATCH (Ev_sec:Event{id(Ev_sec:f.second)})

Yeah you are write it was typo mistake I corrected it.. When you are aware that you are intentionally using db Id .. Please let us know how have you stored that id into your csv.

Incase it is urgent and need help then we can connect on Skype and you can show me your issue

Thank you so much @intouch_vivek! But I solved my problem. So, I again queried for the ID(node) and this time I exported them as a string (by using toString(ID(node)) ). Then while loading to the database, I converted them to Integer. The query is as follows:

LOAD CSV WITH HEADERS FROM "file:///8245_new.csv" AS csvLine
match (ev:Event) where id(ev)=toInteger(csvLine.first)
match (ev_sec:Event) where id(ev_sec)=toInteger(csvLine.second)
merge (ev)-[:DF_mat]-> (ev_sec)

I know it is a very strange solution, but it worked!

Avoid using internal id(node)

especially for importing and matching data

Where possible, you should avoid using the internal id property. I had a similar habit from relational dbs to always link by a db-generated identity field, and it can only make your efforts more difficult. For example, there are many timing oddities with internal IDs, especially within one transaction after creating or deleting nodes.

Further, Neo4j reuses its internal ids when nodes and relationships are deleted. Meaning your LOAD CSV could be matching the wrong nodes, especially if you ever delete an :Event.

What to do instead

Consider how much more simple your import would be, using ev.id instead of id(ev) (they are two different properties), with proper indexing and constraints. See: Neo4j Docs: Cypher Schema

CREATE CONSTRAINT ON (e:Event) ASSERT e.id IS UNIQUE;
CREATE INDEX ON :Event(id);
MATCH (e:Event) SET e.id = id(e);

With the e.id property set to behave similar to a primary-key from relational land, you'll find that operations and queries become much simpler, and faster. With the one caveat that you'll need to set the id property whenever you create a new :Event. Better to find or create another mechanism for maintaining identity, like apoc.create.uuid, or a Best would be to use a hash/key from when the event was first recorded.

You could potentially use apoc.trigger to auto-generate a UUID on create, if it is really necessary.

CALL apoc.trigger.add('create-event-gen-id',"UNWIND {createdNodes} AS e set e.id=apoc.create.uuid()", {phase:'after'});

Import with UUID instead of id()

LOAD CSV WITH HEADERS FROM "file:///8245.csv" AS f
MATCH (e1:Event {id: f.first}), (e2:Event {id: f.second})
MERGE (e1)-[:DF_mat]->(e2)

This would have many added advantages:

  • Faster.
  • Less prone to breaking.
  • No need for conversions or type-juggling.
  • Statistically zero chance of collision (id reuse).
  • Portable between many formats and systems.
2 Likes

Hello @tony.chiboucas, I am totally aware that it is not the safest way, thank you so much for the idea you gave. I will do that!