Hello, I have a huge graph dataset loaded into Neo4j, and I need to project the whole graph to check its weakly connected components. Are there procedures or a standard way to delete nodes so that I can reduce the size of the graph?
Hi @chim3yy ,
Creating a graph projection is a common step in using Neo4j GDS. In short, you can either describe the graph with a Cypher query, which is called a "Cypher projection", or use a "native projection" by specifying the node labels and relationship types.
For creating graph projections, take a look at: https://neo4j.com/docs/graph-data-science/current/graph-create/
Creating a Cypher projection:
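As a rough sketch (not necessarily the snippet originally posted here), a Cypher projection of every node and relationship could look like the following. It assumes the GDS 1.x procedure name `gds.graph.create.cypher`; in GDS 2.x the equivalent procedure is `gds.graph.project.cypher`. The graph name `myCypherGraph` is a placeholder.

```cypher
// Assumption: GDS 1.x naming; 'myCypherGraph' is a placeholder name.
CALL gds.graph.create.cypher(
  'myCypherGraph',
  // Node query: must return a column named `id`.
  'MATCH (n) RETURN id(n) AS id',
  // Relationship query: must return `source` and `target` columns.
  'MATCH (a)-[]->(b) RETURN id(a) AS source, id(b) AS target'
)
YIELD graphName, nodeCount, relationshipCount
RETURN graphName, nodeCount, relationshipCount;
```

Both queries can carry a `WHERE` clause, which is what makes this form useful for projecting only part of a graph.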
Creating a native projection:
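A minimal sketch of a native projection (again, not necessarily the original snippet) that takes all labels and all relationship types by using the `'*'` wildcard. It assumes the GDS 1.x procedure name `gds.graph.create`; in GDS 2.x the equivalent is `gds.graph.project`. The graph name `myNativeGraph` is a placeholder.

```cypher
// Assumption: GDS 1.x naming; 'myNativeGraph' is a placeholder name.
// '*' projects every node label and every relationship type.
CALL gds.graph.create('myNativeGraph', '*', '*')
YIELD graphName, nodeCount, relationshipCount
RETURN graphName, nodeCount, relationshipCount;
```

To project only part of the graph, the wildcards can be replaced with a label or a list of labels, and likewise for relationship types.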
Then, for weakly connected components, see:
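For illustration, running weakly connected components in stream mode on an already projected graph might look like this. The sketch assumes a graph named `myNativeGraph` (a placeholder) exists in the graph catalog; it lists the largest components by node count.

```cypher
// Assumption: a projected graph called 'myNativeGraph' already exists.
CALL gds.wcc.stream('myNativeGraph')
YIELD nodeId, componentId
// Count how many nodes fall into each component.
RETURN componentId, count(*) AS size
ORDER BY size DESC
LIMIT 10;
```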
Thank you so much for your kind response and recommendations. My issue is that whenever I try to project the whole graph to run the weakly connected components algorithm, I get a memory error. So I was wondering whether I can reduce the graph by removing some nodes, but the problem is deciding which nodes to remove.
Actually, I want to scale down the graph using the random node sampling method, as was done in this blog (Sampling A Neo4j Database - DZone). The problem is that I have many nodes with various labels, so I am not sure how to do it. The main objective is to reduce the size of the graph so that I can project the sampled graph (taken from the large graph) and then run the weakly connected components algorithm. It's not feasible to project the whole large graph because it gives a memory error.
Does it make sense to do something like this?
MATCH (n1:label1)-[r]-(), (n2:label2)-[r2]-()
RETURN n1, r, n2, r2
And then I can export this graph using APOC.
Any suggestions in this regard would be very helpful.
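For the node sampling idea described above, one possible sketch is a Cypher projection that ignores labels entirely and keeps roughly one node in ten. This is an assumption-laden stand-in, not the method from the DZone blog: it filters on the internal node id with a modulo instead of drawing true random numbers, so that the node query and the relationship query select the same nodes. It assumes GDS 1.x procedure names (`gds.graph.create.cypher`; `gds.graph.project.cypher` in GDS 2.x), and `sampledGraph` and the factor `10` are placeholders.

```cypher
// Assumption: GDS 1.x naming; 'sampledGraph' and the modulus 10 are placeholders.
// Keep only nodes whose internal id is divisible by 10 (about 10% of nodes),
// and only relationships whose two endpoints are both in that sample.
CALL gds.graph.create.cypher(
  'sampledGraph',
  'MATCH (n) WHERE id(n) % 10 = 0 RETURN id(n) AS id',
  'MATCH (a)-[]->(b)
   WHERE id(a) % 10 = 0 AND id(b) % 10 = 0
   RETURN id(a) AS source, id(b) AS target'
)
YIELD graphName, nodeCount, relationshipCount
RETURN graphName, nodeCount, relationshipCount;

// Run weakly connected components on the sampled projection.
CALL gds.wcc.stream('sampledGraph')
YIELD nodeId, componentId
RETURN componentId, count(*) AS size
ORDER BY size DESC
LIMIT 10;
```

Because a relationship survives only when both of its endpoints are sampled, the projection keeps far fewer relationships than nodes, so the components found on the sample will generally be smaller and more fragmented than those of the full graph. Nothing is deleted from the database itself; only the in-memory projection is reduced.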