How to avoid factorial (n!) path?

In my project, Neo4J stores staff nodes and relationships between staff.

People who have been to the same place within A period of time will have A relationship. For example, A and B have arrived at the same scene within A day, and A and B will have an association relationship.

I need to focus on any one person and find all the people within three degrees of path depth. My search statement is as follows:

MATCH p = (x:user)-[r:REL*1..3]->(y:user)
where x.user_id = '20121'
WHERE testLength = LENGTH(p) + 1

When the site is small, there is no problem in searching in this way, and the path is obtained:


When there are more people in the place, the number of paths will change a lot. For example, when 10 people arrive at a place, the number of paths will be 10*9*8
When 10,000 people arrive at a site, the number of paths is 1000*999*988, which is a disaster


I do not know how to deal with this situation, hope to get help
This is my data script
export-1.txt (3.8 KB)
export-2.txt (16.7 KB)


I don't know if it's what you need,
but If you can use APOC procedures maybe you could write your query like this:

MATCH p = (x:user)-[r:REL*1..3]->(y:user)
where x.user_id = '20121'
with collect(NODES(p)) as nodes, collect(relationships(p)) as rels
return apoc.coll.toSet(apoc.coll.flatten(nodes)) as nodes, apoc.coll.toSet(apoc.coll.flatten(rels)) as rels

or with a map return:

return {nodes: apoc.coll.toSet(apoc.coll.flatten(nodes)), rels: apoc.coll.toSet(apoc.coll.flatten(rels))}

If your use case is to find distinct people within 3 degrees (and you don't actually care about distinct paths the way Cypher does) then you can use apoc.path.subgraphNodes(), a path expander procedure in APOC Procedures that is optimized for finding distinct nodes (and pruning any path found to a previously-encountered node).

MATCH (x:user)
WHERE x.user_id = '20121'
CALL apoc.path.subgraphNodes(x, {maxLevel:3, relationshipFilter:'REL>', labelFilter:'>user'}) YIELD node as y

This will only give distinct nodes back, it won't give back paths. If you want paths, then use apoc.path.spanningTree() instead, but be aware that it will only give back a single path per reached nodes, not multiple paths.

Thank you for your help. Maybe I didn't express it clearly in my question. Is there any way to query only the shortest path, for example, query only [P1 ->p2], [P1 ->p3]... [p1 - > p10] such a shortest path, exclude [p1 - > p3 - > p2], [p1 - > p4 - > p3 - > p2] such a path

Thanks for your help I need to find not only nodes, but also the shortest path between nodes. This is what I want: Nodes :[P2, P3, P4, P5, P6, P7, P8, P9, P10], rel: [p1 - > p2], [p1 - > p3], [p1 - > p4], [p1 - > p5], [p1 - > p6], [p1 - > p7], [p1 - > p8], [p1 - > p9], [p1 - > p10]
I do not know how to write such a statement, looking forward to your help

In this case you'll need to use apoc.path.spanningTree(), which does YIELD path instead of node.

The paths will automatically be shortest paths, since the procedure uses breadth-first expansion, and will only use the first path found to a node and no other.

1 Like

thanks a lot! The apoc.path.spanningTree() function solved my problem. thanks again!