Is there any way to query a small part of a big graph in a graph database?

I am running a GDS link prediction algorithm on a graph database with about 10 million nodes and relationships. Running it over the full set of nodes and relationships takes a huge amount of time.

CALL apoc.export.csv.query(
  "MATCH (p:Vehicle)-[r:takes_rent]->(p2:CUSTOMER)
   WHERE p2.BMCC IN ['1','10','11','12','14','18','19','2','20','21','22','23','24','25','27','2741','28','29','30','32','3351','36',
                     '4','4131','4215','4511','4722','4812','4814','4816','4899','4900',
                     '5','5039','5047','5065','5111','5199','5200','5211','5251','5261','5399','5411','5441','5511','5533','5541',
                     '5641','5651','5661','5697','5712','5722','5732','5733','5734','5811','5813','5814','5912','5940','5941','5942',
                     '5944','5947','5948','5950','5977','5992','6','7','7011','7210','7221','7230','7379','7399','7531','7629','763',
                     '7832','7911','7996','7997','7999','8021','8062','8099','9']
   RETURN DISTINCT p.ID AS Vehicle_ID, p2.ID AS customer_wallet, p2.BMCC AS BMCC_CODE,
          gds.alpha.linkprediction.adamicAdar(p, p2, {relationshipQuery: 'takes_rent'}) AS score",
  "prefescore_dump_adamic_ader.csv",
  {batchSize: 10000}
)

I am using the APOC library to dump the result to CSV for parallel processing, but the export takes a very long time. Is there any way to make this query faster, or how could I run it on a small sample of the graph?

Hi, you could use UNWIND: iterate over the list of BMCC values and create a separate file for each one.


Interestingly, the preferential attachment query finished in much less time, but the Adamic Adar and Common Neighbors test queries ran for a long time and never returned any output. So is this an algorithm performance issue? My instance has 374 GB of RAM and the Neo4j version is 3.5.11.
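
For context on the timing difference, these are the usual textbook definitions of the three scores, where N(x) is the set of neighbors of node x. This is only a sketch of why the costs can differ, not a profile of this particular database: preferential attachment needs just the two node degrees, while the other two have to intersect the neighbor sets of every scored pair.

```latex
\begin{align*}
\mathrm{PA}(x, y) &= |N(x)| \cdot |N(y)| \\
\mathrm{CN}(x, y) &= |N(x) \cap N(y)| \\
\mathrm{AA}(x, y) &= \sum_{u \in N(x) \cap N(y)} \frac{1}{\log |N(u)|}
\end{align*}
```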

I am learning Neo4j. Could you kindly give me a sample of the UNWIND statement format?
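
A minimal sketch of the UNWIND approach suggested above, assuming the same labels, properties, and GDS function as the original query. The three-item code list and the "adamic_adar_<code>.csv" file name pattern are placeholders chosen for illustration, and the current code is passed to the inner query through the `params` entry of the APOC config map.

```cypher
// Sketch only: the short list and the output file names are placeholders.
UNWIND ['1', '10', '11'] AS code
CALL apoc.export.csv.query(
  "MATCH (p:Vehicle)-[:takes_rent]->(p2:CUSTOMER)
   WHERE p2.BMCC = $code
   RETURN DISTINCT p.ID AS Vehicle_ID, p2.ID AS customer_wallet, p2.BMCC AS BMCC_CODE,
          gds.alpha.linkprediction.adamicAdar(p, p2, {relationshipQuery: 'takes_rent'}) AS score",
  "adamic_adar_" + code + ".csv",   // one output file per BMCC code
  {params: {code: code}}            // binds $code inside the inner query
)
YIELD file, rows
RETURN code, file, rows
```

Each iteration exports the pairs for one BMCC code, so a single code can be run first to see how long one slice takes. To try the scoring on a small sample before a full run, adding `WITH p, p2 LIMIT 1000` just before the inner RETURN restricts it to 1000 pairs.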