Slow relationship creation in Neo4j from Python

Hi, I am trying to create relationships between nodes using Python (the neo4j library). Creating the nodes for the data is done, but creating the relationships among them is taking far longer than I expected (almost 12 hours for exactly 213867 relationships).
My main concern is that no process is using more than 20% of my CPU, especially the Zulu platform, which does the work, i.e. creating the relationships. Is there a way to speed up this process or to increase my CPU usage?

Hi @grpnpraveen !

Can you share your model and the queries you are using in order to do the import?

I do assume you already created some indexes in order to speed up the MATCH process.

Bennu

@Bennu how can I share the model with you here? By the way, it's 213867 relationships, and it just completed.
Yes, I put some ids on the nodes to match on.
Thanks in advance.

Hi @grpnpraveen!

You can always draw it yourself on the Arrows website.

Maybe you can share the queries with an EXPLAIN so we can check whether the indexes are being used properly by your query.

Bennu

@Bennu thanks for the reply.
I don't think the nodes are that complex. First, let me explain what I am facing.
I first read a CSV in which each row holds the ids of a movie's cast, i.e. "0991993|11688537|1190847", which are the ids of the cast members that need to be connected.
This is the code.
[image: screenshot of the code]

Please do ask if you have any questions :hugs:
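Since the code itself was only posted as a screenshot, here is a minimal sketch of how the CSV step described above could look. The column names movie_id and cast, the helper name cast_rows, and the pairing of this movie id with these cast ids in the sample are all assumptions for illustration, not taken from the original code. It also assumes actor ids are stored as integers, so the leading zero is dropped.

```python
import csv
import io


def cast_rows(csv_text, movie_col="movie_id", cast_col="cast"):
    """Yield one dict per (movie, cast member) pair found in the CSV text.

    The column names are hypothetical; adjust them to the real file.
    """
    for record in csv.DictReader(io.StringIO(csv_text)):
        # The cast cell looks like "0991993|11688537|1190847".
        for raw_id in record[cast_col].split("|"):
            raw_id = raw_id.strip()
            if raw_id:
                # int() drops leading zeros: "0991993" becomes 991993.
                yield {"movie_id": record[movie_col], "actor_id": int(raw_id)}


# Made-up sample row, only to show the shape of the output.
sample = "movie_id,cast\ntt0037711,0991993|11688537|1190847\n"
rows = list(cast_rows(sample))
```

Each resulting dict can then be passed as query parameters, one relationship per dict.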

Hi @grpnpraveen !

Can you share the execution plan this query produces?

EXPLAIN MATCH(p:actor),(m:movie)
WHERE p.id = 0991993 AND m.id = setMovieIdHere
return *

Sure @Bennu !
I used this; here, instead of 0958345, I put 958345:

WHERE p.id = 958345 AND m.id = "tt0037711"
return *

I have created the actor relationships for only 5000 movies! It took 12+ hours. I still need to add them for 85000 movies.

Hi @grpnpraveen !

Just as I thought. You have no indexes! If you trust us enough, try this:

CREATE INDEX ID_ACTOR for (n:actor) on (n.id);
CREATE INDEX ID_MOVIE for (n:movie) on (n.id);

Then change your query to:

MATCH(p:actor)
WHERE p.id = $actor_id
with p
MATCH (m:movie) 
WHERE m.id = $movie_id
WITH p,m
CREATE (p)-[r:`acted-in`]->(m)
RETURN type(r)
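From Python, a query like this could be run with parameters roughly as follows. This is only a sketch: the helper name link_all is made up, the connection URI and credentials in the usage comment are placeholders, and the relationship type is backtick-quoted because a hyphen in a name needs quoting in Cypher.

```python
LINK_QUERY = """
MATCH (p:actor) WHERE p.id = $actor_id
MATCH (m:movie) WHERE m.id = $movie_id
CREATE (p)-[r:`acted-in`]->(m)
RETURN type(r)
"""


def link_all(session, pairs):
    """Create one relationship per (actor_id, movie_id) pair.

    `session` is expected to behave like a neo4j driver session, i.e.
    offer run(query, **parameters). Returns the number of queries sent.
    """
    count = 0
    for actor_id, movie_id in pairs:
        # Values are passed as parameters, never formatted into the
        # query text, so the server can reuse the cached query plan.
        result = session.run(LINK_QUERY, actor_id=actor_id, movie_id=movie_id)
        result.consume()  # finish this query before sending the next one
        count += 1
    return count


# Usage with the real driver (placeholder URI and credentials):
# from neo4j import GraphDatabase
# driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
# with driver.session() as session:
#     link_all(session, [(958345, "tt0037711")])
# driver.close()
```

With the two indexes in place, each MATCH becomes an index lookup instead of a scan over every node with that label.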

Bennu

@Bennu right now I am creating the relationships for directors and writers, which may take some time.
[image]
I am confident enough to add those. I will, once these directors and writers are done.
If you don't mind, can you explain what this INDEX is?

Hi!

You should add indexes on your directors and writers as well.

Index in general:

Database index - Wikipedia.

Index in particular:

Bennu

PS: With this you can easily go from 12 hours to 12 minutes or much less.

Can I add this index even now? I have already added all the nodes.

Also,

CREATE INDEX ID_ACTOR for (n:actor) on (n.id);
CREATE INDEX ID_MOVIE for (n:movie) on (n.id);

What are ID_ACTOR and ID_MOVIE here?

Sure. But I strongly suggest you retry the whole process with the indexes online in order to verify everything that has been said.
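One way to confirm the indexes are online before rerunning the import is to list them and look at their state. The sketch below assumes a Neo4j version that supports the SHOW INDEXES command, and the helper name offline_indexes is made up for illustration.

```python
def offline_indexes(session, names=("ID_ACTOR", "ID_MOVIE")):
    """Return the index names from `names` that are missing or not ONLINE.

    `session` is expected to behave like a neo4j driver session whose
    run() result can be iterated to get records with "name" and "state".
    """
    states = {
        record["name"]: record["state"]
        for record in session.run("SHOW INDEXES")
    }
    # An index that is absent from the listing counts as not online.
    return [name for name in names if states.get(name) != "ONLINE"]
```

An empty list means both indexes exist and are ready; anything else is worth waiting for (or creating) before starting the import.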

Okay! Thanks a lot.

They are the names of the indexes.

Spend some time reading the articles shared. It may help a lot.

@Bennu OMG!! THANKS a million!
It just completed in 10 minutes for 5000 movies! :heart_eyes_cat:

Told ya :wink:

It's all about indexing