I am using Neo4j Community 3.5.17. I want to find the closest :Fraud node to a :Person node for each fid (the unique Person identifier) in a list, and I am using the following query. Please note that fid is currently stored as a string on the nodes, whereas the input and output files hold fid as an integer.
profile cypher runtime=interpreted
load csv with headers from 'file:///shortest_path_data/test.csv' as line
with line.fid as fid
match (n:Person) where n.fid=toString(fid)
with n
call apoc.path.expandConfig(n, {labelFilter:'/Fraud', maxLevel:10, optional:true, limit:1}) yield path
return toInteger(n.fid) as fid, length(path)/2 as distance;
This query worked for an input file of 30K fids, but for 300K fids it runs endlessly and triggers GC. Here is the debug log:
If you rerun the query with USING PERIODIC COMMIT (see LOAD CSV - Cypher Manual) so that it commits every 5k records, for example, does this provide any improvement?
I do have an index on :Person(fid). When I use periodic commit, I get the following errors:
Cannot use periodic commit in a non-updating query (line 1, column 36 (offset: 35))
"using periodic commit 5000 load csv with headers from 'file:///shortest_path_data/test.csv' as line with line.fid as fid match (n:Person) where n.fid=toString(fid) with n call apoc.path.expandConfig(n,{labelFilter:'/Fraud', maxLevel:10, optional:true, limit:1}) yield path return toInteger(n.fid) as fid,length(path)/2 as distance;"
Invalid input ':': expected <init> (line 1, column 36 (offset: 35))
":auto using periodic commit 5000 load csv with headers from 'file:///shortest_path_data/test.csv' as line with line.fid as fid match (n:Person) where n.fid=toString(fid) with n call apoc.path.expandConfig(n,{labelFilter:'/Fraud', maxLevel:10, optional:true, limit:1}) yield path return toInteger(n.fid) as fid,length(path)/2 as distance;"
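As the first error states, periodic commit is only accepted for updating queries, and this query only reads. The second error is consistent with :auto being a client-side command rather than Cypher, so the parser rejects the leading ':'. One possible workaround, shown here only as a sketch, is to have the client split the fids into batches and run the same lookup once per batch, so that each batch is a separate, smaller query. The $fids parameter below is hypothetical (it is not in the original query) and is assumed to be a list of string fids, for example 5000 per call, supplied by whatever client runs the query. Whether this avoids the GC pressure on the 300K input has not been verified.

```cypher
// Sketch only: $fids is a hypothetical parameter, a list of string fids
// (e.g. 5000 per call) supplied by the client instead of LOAD CSV.
UNWIND $fids AS fid
// fid is already a string here, so it can match n.fid via the :Person(fid) index
MATCH (n:Person {fid: fid})
// Same APOC call and config as in the original query
CALL apoc.path.expandConfig(n, {labelFilter: '/Fraud', maxLevel: 10, optional: true, limit: 1})
YIELD path
RETURN toInteger(n.fid) AS fid, length(path) / 2 AS distance;
```

The client would collect the rows returned for each batch and append them to the output file. The PROFILE prefix is left out here on the assumption that it is only needed while investigating the plan.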