Load nodes of a densely connected graph

Hi,

I'm very new to neo4j and graph databases (and graph theory in general), so please bear with me if my questions seem basic. I couldn't find a solution by searching, although my issue is related to the thread "SDN findAll performance and populating entities at depth > 1" (post #9 by rkwasnicki).

I'm trying to map the Bitcoin Lightning Network in neo4j, using SDN 6 (via Spring Boot 2.6.x). The network currently consists of roughly 30K nodes and 80K channels (relationships) between those nodes, so the overall dataset is rather small. However, the graph is densely connected: you can get from any node to any other node using only a few channels (one of the reasons for loading this into neo4j is to get better data on this).

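// GraphNode.java
import java.util.HashSet;
import java.util.Set;

import org.springframework.data.annotation.Version;
import org.springframework.data.neo4j.core.schema.Id;
import org.springframework.data.neo4j.core.schema.Node;
import org.springframework.data.neo4j.core.schema.Property;
import org.springframework.data.neo4j.core.schema.Relationship;

@Node("LnNode")
public class GraphNode {

    // CREATE INDEX FOR (a:LnNode) ON (a.pubKey)

    @Id
    private String pubKey;

    @Property("alias")
    private String alias;

    @Version
    private Long version;

    @Property("capacity")
    private Long capacity;

    // Outgoing channels; each channel points at its target node.
    @Relationship(type = "CONNECTED_TO")
    private Set<GraphChannel> channels = new HashSet<>(4096);
}

// GraphChannel.java
import lombok.Data;

import org.springframework.data.neo4j.core.schema.GeneratedValue;
import org.springframework.data.neo4j.core.schema.Id;
import org.springframework.data.neo4j.core.schema.RelationshipProperties;
import org.springframework.data.neo4j.core.schema.TargetNode;

@Data
@RelationshipProperties
public class GraphChannel {

    @Id
    @GeneratedValue
    private Long id;

    private Long channelId;

    private Long capacity;

    @TargetNode
    private GraphNode targetNode;
}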

A lightning channel is unidirectional by its nature. When persisting the data I'll add a new channel to Node 1 (channels.add()) that has Node 2 as its @TargetNode.

This works fine when I'm persisting new data. However, when I load a node, it recursively loads the complete graph, because every node is connected, directly or indirectly, to every other node.
While the docs do mention this problem specifically and advise avoiding it, I couldn't find a recipe for how best to deal with it.
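
To make the effect concrete, here is a rough plain-Java sketch (no SDN involved) that counts how many nodes get touched when outgoing channels are followed to a given depth. The class name, the method name and the toy graph are all made up for illustration; the real loader inside SDN works differently, this only shows why an unbounded traversal ends up with the whole graph.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration: which nodes are reached when following
// outgoing channels up to a maximum number of hops.
public class ReachabilitySketch {

    // Returns every pubKey reachable from 'start' within 'maxDepth' hops (start included).
    public static Set<String> reachable(Map<String, List<String>> channels, String start, int maxDepth) {
        Set<String> seen = new HashSet<>();
        seen.add(start);
        Deque<String> frontier = new ArrayDeque<>();
        frontier.add(start);
        // Breadth-first expansion, one hop per iteration.
        for (int depth = 0; depth < maxDepth && !frontier.isEmpty(); depth++) {
            Deque<String> next = new ArrayDeque<>();
            for (String node : frontier) {
                for (String target : channels.getOrDefault(node, Collections.emptyList())) {
                    if (seen.add(target)) { // only expand nodes not visited before
                        next.add(target);
                    }
                }
            }
            frontier = next;
        }
        return seen;
    }

    public static void main(String[] args) {
        // Made-up toy graph: a -> b, a -> c, b -> c, c -> d, d -> a
        Map<String, List<String>> channels = new HashMap<>();
        channels.put("a", List.of("b", "c"));
        channels.put("b", List.of("c"));
        channels.put("c", List.of("d"));
        channels.put("d", List.of("a"));

        // Depth 1 reaches a, b and c; an unbounded traversal reaches all four nodes.
        System.out.println(reachable(channels, "a", 1).size());
        System.out.println(reachable(channels, "a", Integer.MAX_VALUE).size());
    }
}
```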

When I refresh the data in neo4j I want to fetch the complete graph anyway (to check for new/removed/updated channels and nodes), and it's fine if that takes a while, but currently it just hangs.
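
For context, the refresh check I have in mind looks roughly like the sketch below. It is plain Java and assumes I can get both the stored state and the fresh snapshot as a map from channelId to capacity; the class and method names are made up, and a real version would compare more fields than capacity.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical sketch: compare stored channels with a fresh snapshot.
public class ChannelDiffSketch {

    // Channel ids grouped by what happened to them.
    public static final class Diff {
        public final Set<Long> added = new HashSet<>();
        public final Set<Long> removed = new HashSet<>();
        public final Set<Long> updated = new HashSet<>();
    }

    // Both maps go from channelId to capacity.
    public static Diff diff(Map<Long, Long> stored, Map<Long, Long> fresh) {
        Diff result = new Diff();
        for (Map.Entry<Long, Long> entry : fresh.entrySet()) {
            Long id = entry.getKey();
            if (!stored.containsKey(id)) {
                result.added.add(id);          // only in the fresh snapshot
            } else if (!Objects.equals(stored.get(id), entry.getValue())) {
                result.updated.add(id);        // in both, but capacity changed
            }
        }
        for (Long id : stored.keySet()) {
            if (!fresh.containsKey(id)) {
                result.removed.add(id);        // only in the database
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Long, Long> stored = new HashMap<>(Map.of(1L, 100L, 2L, 200L, 3L, 300L));
        Map<Long, Long> fresh = new HashMap<>(Map.of(2L, 200L, 3L, 350L, 4L, 400L));

        Diff d = diff(stored, fresh);
        // Channel 4 is new, channel 1 is gone, channel 3 changed its capacity.
        System.out.println("added=" + d.added + " removed=" + d.removed + " updated=" + d.updated);
    }
}
```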

I can think of two solutions:

  1. Find a way to load nodes with their channels/targetNodes only up to a limited depth (1).
  2. Improve the fetching logic to check for already loaded instances of nodes/channels. Since the total data set is so small it can easily fit into memory.
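
For option 2, what I imagine is roughly the following plain-Java sketch: read one flat row per channel and wire the objects up in memory, using a map keyed by pubKey so that every node is instantiated exactly once. The Node, Channel and Row classes are simplified stand-ins I made up for this example (not the SDN entities above), and the Cypher in the comment is only my assumption of what such a flat query could look like.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: assemble the graph in memory from flat rows,
// reusing already created node instances via a map keyed by pubKey.
public class IdentityMapSketch {

    public static final class Node {
        public final String pubKey;
        public final List<Channel> channels = new ArrayList<>();
        public Node(String pubKey) { this.pubKey = pubKey; }
    }

    public static final class Channel {
        public final long channelId;
        public final Node target;
        public Channel(long channelId, Node target) {
            this.channelId = channelId;
            this.target = target;
        }
    }

    // One row per channel, e.g. from an (assumed) query such as:
    // MATCH (a:LnNode)-[c:CONNECTED_TO]->(b:LnNode) RETURN a.pubKey, c.channelId, b.pubKey
    public static final class Row {
        public final String sourcePubKey;
        public final long channelId;
        public final String targetPubKey;
        public Row(String sourcePubKey, long channelId, String targetPubKey) {
            this.sourcePubKey = sourcePubKey;
            this.channelId = channelId;
            this.targetPubKey = targetPubKey;
        }
    }

    public static Map<String, Node> assemble(List<Row> rows) {
        Map<String, Node> byPubKey = new HashMap<>();
        for (Row row : rows) {
            // computeIfAbsent returns the existing instance if the node was seen before.
            Node source = byPubKey.computeIfAbsent(row.sourcePubKey, Node::new);
            Node target = byPubKey.computeIfAbsent(row.targetPubKey, Node::new);
            source.channels.add(new Channel(row.channelId, target));
        }
        return byPubKey;
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(
                new Row("a", 1L, "b"),
                new Row("b", 2L, "a"),
                new Row("a", 3L, "c"));
        Map<String, Node> nodes = assemble(rows);
        // Cycles are harmless here: b's channel points back at the same 'a' instance.
        System.out.println(nodes.get("b").channels.get(0).target == nodes.get("a"));
    }
}
```

Nodes without any channel would not show up in such rows, so they would need a separate query.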

Any feedback and/or other ideas are highly appreciated!

Thanks

In my opinion, the best way to solve your problem (your option 1) is to use projections, especially multi-level projections: Documentation
This would mean that you have to create three projections:

interface GraphNodeProjection {
    // other things
    Set<GraphChannelProjection> getChannels();
}

interface GraphChannelProjection {
    // other things
    GraphNodeProjectionWithoutRelationship getTargetNode();
}

interface GraphNodeProjectionWithoutRelationship {
    // only other things
}

I hope this makes sense to you in combination with the linked documentation.

For 2., I don't know what you want to do with the data. If you do not want to persist changes, you can always use custom queries. Be aware that custom queries usually return just a slice of the existing data, and if you save/persist/update such incomplete entities, SDN might delete existing data from the database.

interface GraphNodeRepository extends Neo4jRepository<GraphNode, String> {

    @Query("MATCH (g:LnNode)-[c:CONNECTED_TO]->(otherNode:LnNode) WHERE g.pubKey = $param RETURN g, collect(c), collect(otherNode)")
    GraphNode loadFirstLevelNodes(@Param("param") String param);
}

This should give you an idea.


Thanks @gerrit.meier for the super quick reply!
I'll have a look at the projection docs and play around with both approaches.

Happy Holidays!
