Goal: compute analytic metrics on each node's nearest neighbors (nn) and on its nn plus next-nearest neighbors.
I created a bimodal graph from Kaggle's Marvel Universe Social Network:
CALL apoc.schema.assert({}, {Comic:['name'], Hero:['name']});
CALL apoc.load.csv('https://raw.githubusercontent.com/tomasonjo/neo4j-marvel/master/data/edges.csv') YIELD map AS row
MERGE (h:Hero {name:row.hero})
MERGE (c:Comic {name:row.comic})
MERGE (h)-[:APPEARS_IN]->(c)
I want to compute the density of each local ego net: the density around a specific node together with its nearest neighbors, and then with its nearest neighbors plus next-nearest neighbors. I want to do this for all Heroes in the graph (6,439).
The best solution appears to be the apoc.path.subgraphAll() procedure.
Using the following code I can get the subgraph for a specific Hero:
MATCH (h:Hero {name:"4-D MAN/MERCURIO"})
CALL apoc.path.subgraphAll(h, {maxLevel:2})
YIELD nodes, relationships
RETURN nodes, relationships
This returns the nodes and relationships of the subgraph around that Hero.
Replacing "RETURN" with "WITH" should let me work with just the nodes and relationships in the subgraph, but I can't figure out the syntax to reference just those results.
If I do:
MATCH (h:Hero {name:"4-D MAN/MERCURIO"})
CALL apoc.path.subgraphAll(h, {maxLevel:2})
YIELD nodes, relationships
RETURN count(nodes)
I get a result of 1, when the actual number of nodes in the subgraph is 82.
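The count of 1 is consistent with how the procedure returns its results: apoc.path.subgraphAll() yields a single row in which nodes and relationships are each a list, and count() counts rows, not list elements. A minimal sketch, assuming the same Hero as above, that measures the lists with size() instead:

```cypher
MATCH (h:Hero {name:"4-D MAN/MERCURIO"})
CALL apoc.path.subgraphAll(h, {maxLevel:2})
YIELD nodes, relationships
// nodes and relationships are lists on one row, so use size() rather than count()
RETURN size(nodes) AS nodeCount, size(relationships) AS relCount
```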
If I add a MATCH clause after "WITH", it reverts to the full graph and I get a count of all nodes:
MATCH (h:Hero {name:"4-D MAN/MERCURIO"})
CALL apoc.path.subgraphAll(h, {maxLevel:2})
YIELD nodes, relationships
WITH nodes, relationships
MATCH (n)
RETURN count(n)
Result: 19,090
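The full-graph count is consistent with how Cypher binds variables: MATCH (n) introduces a new variable n that has no connection to the nodes list, so it matches every node in the database. A minimal sketch, assuming the aim is to treat the subgraph's nodes as ordinary rows, is to UNWIND the list rather than MATCH again:

```cypher
MATCH (h:Hero {name:"4-D MAN/MERCURIO"})
CALL apoc.path.subgraphAll(h, {maxLevel:2})
YIELD nodes, relationships
// Turn the list into one row per node in the subgraph
UNWIND nodes AS n
// Keep only the Hero nodes; drop this line to count every node
WITH n WHERE n:Hero
RETURN count(n) AS heroesInSubgraph
```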
Is there an easy way to reference the nodes and relationships in the subgraph so I can count them or do other analytics on the actual subgraph that was returned?
I'll also want this capability when I add the ~900 villains to the graph and want to count how many are in the subgraph.
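As a rough sketch of the per-Hero calculation, the query below runs the procedure for every Hero and derives a density from the list sizes. It makes several assumptions that are not in the original post: density is taken as the simple undirected formula 2m / (n(n-1)), which ignores the bipartite structure of the graph; Villain is a hypothetical label for the nodes to be added later; and the subgraph is never empty, since it always contains the start node. Running this over all 6,439 Heroes may be slow.

```cypher
MATCH (h:Hero)
CALL apoc.path.subgraphAll(h, {maxLevel:2})
YIELD nodes, relationships
WITH h,
     size(nodes) AS n,                                   // nodes in the ego net
     size(relationships) AS m,                           // relationships among them
     size([x IN nodes WHERE x:Villain]) AS villains      // Villain is a hypothetical label
RETURN h.name AS hero, n, m, villains,
       // Assumed density formula for a simple undirected graph; guard against n = 1
       CASE WHEN n > 1 THEN 2.0 * m / (n * (n - 1)) ELSE 0.0 END AS density
ORDER BY density DESC
```

Changing maxLevel to 1 would restrict the ego net to the nearest neighbors only.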