Projecting multi-partite networks


I asked this on the Slack community as well. I am new to Neo4j and Cypher.

I am trying to run some graph queries over a bipartite network (the nodes are genes and microbes) in which genes and microbes are linked by an "associated" relationship. I want to run node centrality measures, not on the original network, but on a projection of it: for example, a graph that consists only of genes, where two genes are connected if they share an associated microbe and the weight of the projected relationship equals the sum of the individual weights. How do I do this in Cypher? I know I can use a Cypher projection with the various graph algorithms, but that runs the algorithm on the original graph structure.
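To pin down what the projection described above should produce, here is a minimal plain-Python sketch on a made-up toy edge list. The function name `project_genes` and the toy data are hypothetical, and it assumes that when two genes share several microbes the per-microbe sums are added together and that absolute values of Beta are used (as in the Cypher attempts in this thread). It is only meant as a reference for checking a Cypher result on a small sample, not as a way to process the full graph.

```python
from collections import defaultdict
from itertools import combinations


def project_genes(edges):
    """Project (gene, microbe, beta) edges onto weighted gene-gene pairs.

    Assumption: the weight of a gene pair is the sum of |beta| of both
    edges, accumulated over every microbe the two genes share.
    """
    # Group each microbe's neighbouring genes with their absolute weights.
    by_microbe = defaultdict(list)
    for gene, microbe, beta in edges:
        by_microbe[microbe].append((gene, abs(float(beta))))

    weights = defaultdict(float)
    for partners in by_microbe.values():
        # Every unordered pair of genes attached to the same microbe.
        for (g1, w1), (g2, w2) in combinations(partners, 2):
            if g1 == g2:
                continue  # skip self-pairs from parallel edges
            pair = tuple(sorted((g1, g2)))  # one key per unordered pair
            weights[pair] += w1 + w2
    return dict(weights)


# Hypothetical toy data: (gene, microbe, Beta)
toy_edges = [
    ("geneA", "microbe1", 0.5),
    ("geneB", "microbe1", -0.25),
    ("geneA", "microbe2", 1.0),
    ("geneB", "microbe2", 2.0),
    ("geneC", "microbe2", -1.5),
]
projected = project_genes(toy_edges)
```

Here geneA and geneB share two microbes, so their projected weight accumulates contributions from both.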

I have tried googling and reading through the docs and community page, but I don't know the terminology that Neo4j/Cypher uses for this process. Any help would be greatly appreciated.

Additional Information:
Neo4j version is 3.5 on a Linux system, using the community server edition (I don't have access to a GUI). I am primarily testing using the py2neo interface at the moment.
Number of nodes ~43 000
Number of relationships ~1.7e6

(I saw a recorded conference talk on something similar once in neo4j, but for the life of me I cannot find that video).

The following cypher is what I have tried:

MATCH (n1:gene)-[r]-(p:microbe)-[r2]-(n2:gene) 
RETURN id(n1) AS source, id(n2) AS target, abs(toFloat(r2:Beta)) + abs(toFloat(r:Beta)) AS weight

but this returns the following error (which may be py2neo specific):

CypherSyntaxError: Type mismatch: expected Node but was Relationship (line 1, column 110 (offset: 109))
"MATCH (n1:gene)-[r]-(p:microbe)-[r2]-(n2:gene) RETURN id(n1) AS source, id(n2) AS target, abs(toFloat(r2:Beta)) + abs(toFloat(r:Beta)) AS weight"

And also given a query like that, how do I ensure that the algorithm uses the above structure to determine the node centrality measures?

EDIT: Second attempt (noticed I should have used "." instead of ":")

MATCH (n1:gene)-[r:association]-(p:microbe)-[r2:association]-(n2:gene)
CALL apoc.create.relationship(n1, "infer", {weight: abs(toFloat(r.Beta)) + abs(toFloat(r2.Beta))}, n2) YIELD rel
RETURN rel;

(This appears to be very slow.) Is there a more efficient way to do this?

I then plan to run the graph algorithms using the Cypher projection:

CALL algo.<algorithm>(
'MATCH (n:gene) RETURN id(n) as id',
'MATCH (n1:gene)-[r:infer]-(n2:gene) RETURN id(n1) as source, id(n2) as target,r.weight as weight',

Is this the most efficient approach? Not even sure if this is correct :confused:

I have decided to go with the following solution. I still have to confirm its correctness and remove duplicate relationships, and I am not sure it is the most efficient approach.

CALL algo.<algorithm>(
'MATCH (n:gene) RETURN id(n) as id',
'MATCH (n:gene)-[r1:association]-(x:microbe)
WITH DISTINCT x,collect([n,abs(toFloat(r1.Beta))]) as inPairs
MATCH (x)-[r2:association]-(m:gene)
WITH inPairs, collect([m,abs(toFloat(r2.Beta))]) as outPairs
CALL  apoc.coll.partition(apoc.coll.flatten(,outPairs)),2) YIELD value
WITH apoc.coll.flatten(value) as ww
RETURN id(ww[0]) as source, id(ww[2]) as target, ww[1] + ww[3] as weight',
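One way the duplicate rows might be avoided without APOC is to aggregate in the relationship statement itself. The sketch below is untested against a real database and rests on several assumptions: that the Graph Algorithms plugin accepts `graph: 'cypher'` and `weightProperty` in its config map, that `algo.pageRank.stream` is an acceptable stand-in for `algo.<algorithm>`, and that `graph` is a py2neo `Graph` (anything with a compatible `run` method works). The names `NODE_QUERY`, `REL_QUERY` and `stream_centrality` are made up for illustration.

```python
# Node statement: every gene becomes a node of the projected graph.
NODE_QUERY = "MATCH (n:gene) RETURN id(n) AS id"

# Relationship statement: one row per unordered gene pair.
# id(n1) < id(n2) drops the mirrored (b, a) row and self-pairs, and
# sum(...) collapses the rows produced by several shared microbes.
REL_QUERY = (
    "MATCH (n1:gene)-[r1:association]-(:microbe)-[r2:association]-(n2:gene) "
    "WHERE id(n1) < id(n2) "
    "RETURN id(n1) AS source, id(n2) AS target, "
    "sum(abs(toFloat(r1.Beta)) + abs(toFloat(r2.Beta))) AS weight"
)


def stream_centrality(graph, procedure="algo.pageRank.stream"):
    """Run a streaming centrality procedure over the Cypher projection.

    Assumption: the procedure takes (nodeQuery, relQuery, config) and
    yields nodeId and score, as the streaming centrality procedures do.
    """
    call = (
        "CALL " + procedure + "($nodeQuery, $relQuery, "
        "{graph: 'cypher', weightProperty: 'weight'}) "
        "YIELD nodeId, score RETURN nodeId, score"
    )
    # The two statements are passed as query parameters, which avoids
    # nested quoting inside the CALL string.
    return graph.run(call, nodeQuery=NODE_QUERY, relQuery=REL_QUERY).data()
```

Because each pair is emitted in one direction only, whether this is enough depends on how the chosen algorithm treats direction under a Cypher projection, which I have not verified; replacing `<` with `<>` would emit both directions instead.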

Hi! Have you seen this helpful blog: Neo4j Graph Algorithms projecting a virtual graph – Graph people? It helped me create a projection from a bipartite graph similar to what you're describing so that I could use community detection algos.