Run All Pairs Shortest Path on a subset of a graph projection

I am using gds.alpha.allShortestPaths.stream() on a graph projection.

I want to run the algorithm on a subset of nodes that have a certain value for a property, but
the docs say that the allShortestPaths algorithm only accepts a graph projection as input. I don't think running the algorithm on the entire graph projection would be possible because it consists of ~15 million nodes.

I need something like this:

match (hgnc:Code {SAB:"HGNC"}) CALL algo.shortestPath.stream(hgnc) YIELD sourceNodeId, targetNodeId, distance WITH sourceNodeId, targetNodeId, distance WHERE gds.util.isFinite(distance) = true WITH source, target, distance WHERE source <> target RETURN source.name AS source, target.name AS target, distance

But with the All Pairs Shortest Path algorithm obviously.

Thank you!

Hello @benstear and welcome to the Neo4j community :slight_smile:

You should have a look at Cypher projection but make sure the nodes you select have relationships or the algorithm won't work:

CALL gds.graph.create.cypher(
    'my-cypher-graph',
    'MATCH (n:Code {SAB:"HGNC"})) RETURN id(n) AS id',
    'MATCH (n)-->(m) RETURN id(n) AS source, id(m) AS target'
)

Otherwise, there is the version without graph projection using Cypher shortestPath Functions:

MATCH (n:Code {SAB:"HGNC"})
WITH collect(n) AS nodes
UNWIND nodes AS a
UNWIND nodes AS b
WITH a, b
WHERE a <> b
MATCH p = shortestPath((a)-[*]-(b))
RETURN p

Or

MATCH (n:Code {SAB:"HGNC"})
WITH collect(n) AS nodes
UNWIND nodes AS a
UNWIND nodes AS b
WITH a, b
WHERE a <> b
MATCH p = allShortestPaths((a)-[*]-(b))
RETURN p

Regards,
Cobra

1 Like

Hi @cobra,

Thanks so much for the response. I think what I was looking for is shortestPath() because I want only the single shortest path between each of my nodes.

However, I am worried that my query might still be too large. I have about 40,000 nodes I need to find the shortest paths between.

Do you know if shortestPath() will automatically exclude the inverse shortestPath searches?

For example if you find the shortest path between Node A and Node B then you don't need to find the shortest path between Node B and Node A again.

I think I my have to manually exclude the inverse relationships as Michael Hunger did in his post here: All shortest paths between a set of nodes

Thanks again.

1 Like

No problem :slight_smile:

To be sure, I would recommend excluding them like Michael Hunger did in his query.

Regards,
Cobra

1 Like