Code optimization to decrease run time within apoc.periodic.iterate

lavanya_kannan · February 3, 2020, 8:36pm

My current code below runs for a long time for 100001 alias nodes:


CALL apoc.periodic.iterate("MATCH (a:alias) RETURN a",
"Match 
path=((a) -- (c1:citation) -[p1]-> (t:BIOTERM) <-[p2]- (c2:citation) -- (b:alias))
WHERE id(a) < id(b) AND id(c1) <> id(c2)
With a, b, p1, p2, 2 as precision
 WITH a, b, p1, p2, 10^precision as factor
Create (a)-[e:through_topic]->(b)
Set e.weight= round(factor* (1/(2+p1.weight+p2.weight))) / factor", {batchSize:1000}) YIELD batches, total, errorMessages

When I ran it for a single alias, the query


Match 
path=((a:alias {name: 293} ) -- (c1:citation) -[p1]-> (t:BIOTERM) <-[p2]- (c2:citation) -- (b:alias))
WHERE id(a) < id(b) AND id(c1) <> id(c2)
With a, b, p1, p2, 2 as precision
 WITH a, b, p1, p2, 10^precision as factor
Create (a)-[e:through_topic]->(b)
Set e.weight= round(factor* (1/(2+p1.weight+p2.weight))) / factor

completed in 1 or 2 ms. Should I try to optimize my code, tune the batchSize of apoc.periodic.iterate, or both? Decreasing the batchSize did not help.

I ran EXPLAIN and PROFILE with

Thanks,
Lavanya

andrew_bowman · February 3, 2020, 8:55pm

You may want to rearrange your query somewhat, doing the heavy lifting of the MATCH and calculation in your driving query, and only doing the CREATE in the updating query:

CALL apoc.periodic.iterate("MATCH path=((a:alias) -- (c1:citation) -[p1]-> (t:BIOTERM) <-[p2]- (c2:citation) -- (b:alias))
WHERE id(a) < id(b) AND id(c1) <> id(c2)
WITH a, b, p1, p2, 2 as precision
WITH a, b, p1, p2, 10^precision as factor
WITH a, b, round(factor * (1/(2+p1.weight+p2.weight))) / factor as weight
RETURN a, b, weight",
"CREATE (a)-[e:through_topic]->(b)
SET e.weight = weight", {batchSize:5000}) YIELD batches, total, errorMessages
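The rounding expression in the driving query can be sanity-checked on its own against literal values. This is only a sketch: the weights 0.5 and 0.25 are made-up stand-ins for p1.weight and p2.weight, not values from the data in this thread.

```cypher
// Hypothetical weights standing in for p1.weight and p2.weight
WITH 0.5 AS w1, 0.25 AS w2, 2 AS precision
WITH w1, w2, 10^precision AS factor
// Same expression as in the driving query: scale, round, then unscale
RETURN round(factor * (1 / (2 + w1 + w2))) / factor AS weight
```

One thing worth checking: if the weight properties happen to be stored as integers, 1/(2+p1.weight+p2.weight) is an integer division in Cypher and evaluates to 0, in which case writing 1.0/(...) would avoid it. Whether that applies here depends on the data, which isn't shown.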

As for execution time, if you're seeing around 500k rows being processed for just a single alias, then yes I would expect that this could take a long time.
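To see how many rows a single alias contributes before running the whole batch job, the MATCH can be run by itself with a count in place of the CREATE. A minimal sketch, reusing the single-alias pattern from the question (name: 293 is simply the alias used there):

```cypher
// Count the paths (and so the relationships that would be created) for one alias
MATCH (a:alias {name: 293})--(c1:citation)-[p1]->(t:BIOTERM)<-[p2]-(c2:citation)--(b:alias)
WHERE id(a) < id(b) AND id(c1) <> id(c2)
RETURN count(*) AS rows
```

Multiplying a typical per-alias count by the number of alias nodes gives a rough idea of the total number of relationships the job has to create.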

You may also want to check your memory settings with neo4j-admin memrec.
