How is link prediction different from existing graph DL models?

I am looking at some recommender models and am especially interested in graph models like LightGCN. Apart from the fact that Neo4j stores data natively as a graph, I am wondering whether GDS 1.7 can replicate similar graph deep learning (G-DL) models out there.

I would be interested in an article that compares the two in terms of prediction accuracy and performance.

Thanks!

Hi there,

I'll try to help:

First off: the Neo4j link prediction in the ML models catalog (https://neo4j.com/docs/graph-data-science/current/algorithms/ml-models/linkprediction/) uses a logistic regression algorithm. LightGCN, a stripped-down GCN model, still relies on learned embeddings. To the best of my knowledge, nobody has benchmarked the two models against each other.
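To make the logistic regression idea concrete, here is a minimal pure-Python sketch, not the GDS implementation. It assumes each node already has a small numeric vector, combines each node pair with an element-wise (Hadamard) product, and fits a logistic regression on those pair features. The node names, vectors, and the train_logreg and predict_link helpers are all made up for illustration.

```python
import math

def hadamard(a, b):
    """Element-wise product of two node vectors, used as the link feature."""
    return [x * y for x, y in zip(a, b)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(features, labels, lr=0.5, epochs=500):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log loss w.r.t. the raw score
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias

def predict_link(weights, bias, a, b):
    """Probability that a link exists between two nodes with vectors a and b."""
    x = hadamard(a, b)
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

# Hypothetical node vectors (assumption: in practice these would be node
# properties or embeddings already stored on the graph).
node_vectors = {
    "alice": [1.0, 0.1],
    "bob": [0.9, 0.2],
    "carol": [0.1, 1.0],
    "dave": [0.2, 0.9],
}
positive_pairs = [("alice", "bob"), ("carol", "dave")]   # existing links
negative_pairs = [("alice", "carol"), ("bob", "dave")]   # sampled non-links

pairs = positive_pairs + negative_pairs
features = [hadamard(node_vectors[u], node_vectors[v]) for u, v in pairs]
labels = [1] * len(positive_pairs) + [0] * len(negative_pairs)

weights, bias = train_logreg(features, labels)
p_linked = predict_link(weights, bias, node_vectors["alice"], node_vectors["bob"])
p_unlinked = predict_link(weights, bias, node_vectors["alice"], node_vectors["carol"])
print(f"alice-bob: {p_linked:.3f}, alice-carol: {p_unlinked:.3f}")
```

The point of the sketch is the split of responsibilities: the classifier itself is simple, so the prediction quality depends largely on how informative the node vectors fed into it are.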
However, for recommendations, the GraphSAGE model is often used and has proven to be very effective (https://neo4j.com/docs/graph-data-science/current/algorithms/graph-sage/). It lets you compute a vector representation of a node and its neighborhood, after which you can run several similarity algorithms from the Neo4j GDS library to compare and classify the vectors. The GraphSAGE model is also inductive: once trained, you can predict a vector for any new node that is added to the graph, and subsequently find vectors similar to the predicted one.
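As a rough illustration of that workflow, the pure-Python sketch below uses a single, untrained mean-aggregation step in the spirit of GraphSAGE (no learned weights, so it is a simplification and not the GDS algorithm). It embeds each node from its own features plus the mean of its neighbors' features, then embeds a newly added node the same way and ranks existing nodes by cosine similarity. All node names, features, and helper functions are assumptions for the example.

```python
import math

def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def embed(node_features, neighbor_features):
    """Concatenate a node's own features with the mean of its neighbors' features."""
    if neighbor_features:
        aggregated = mean_vector(neighbor_features)
    else:
        aggregated = [0.0] * len(node_features)
    return node_features + aggregated

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical graph: two tightly connected pairs of nodes.
features = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [0.1, 0.9]}
neighbors = {"a": ["b"], "b": ["a"], "c": ["d"], "d": ["c"]}

embeddings = {
    node: embed(features[node], [features[n] for n in neighbors[node]])
    for node in features
}

# Inductive step: a new node "e" arrives, linked to "a". Its embedding is
# computed with the same function; nothing is refit for the existing nodes.
new_features = [0.8, 0.2]
new_embedding = embed(new_features, [features["a"]])

# Rank existing nodes by similarity to the new node, most similar first.
ranked = sorted(
    ((node, cosine(new_embedding, emb)) for node, emb in embeddings.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
for node, score in ranked:
    print(node, round(score, 3))
```

In GDS the aggregation weights are learned during training, and the similarity step would be done with the library's similarity algorithms rather than a hand-written cosine function.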
Should you still want to do some more specific modeling, you can use the Neo4j Python driver together with one of the Deep Graph Library (DGL) models and neural network engines, and populate your Neo4j graph from your local environment, say a Colab notebook, as shown in this article:
Neo4j & DGL — a seamless integration | by Kristof Neys | Towards Data Science
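For the write-back part, a minimal sketch with the official Neo4j Python driver could look like the following. It is not taken from the article. The Item label, the itemId and embedding property names, the connection details, and the to_rows and write_embeddings helpers are all assumptions for illustration, and the embeddings dictionary stands in for whatever your externally trained model produces.

```python
# Assumed schema: nodes labeled Item, keyed by an itemId property.
WRITE_QUERY = """
UNWIND $rows AS row
MATCH (n:Item {itemId: row.id})
SET n.embedding = row.embedding
"""

def to_rows(embeddings):
    """Turn {node_id: vector} into a list of parameter maps for UNWIND."""
    return [
        {"id": node_id, "embedding": [float(x) for x in vector]}
        for node_id, vector in embeddings.items()
    ]

def write_embeddings(uri, user, password, embeddings):
    """Write externally computed embeddings back to Neo4j in one batch."""
    # Imported here so the helpers above work without the driver installed.
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver(uri, auth=(user, password))
    try:
        with driver.session() as session:
            session.run(WRITE_QUERY, rows=to_rows(embeddings))
    finally:
        driver.close()

# Hypothetical output of a model trained outside Neo4j (for example in DGL).
example_embeddings = {"item-1": [0.12, 0.98], "item-2": [0.87, 0.05]}
rows = to_rows(example_embeddings)
print(rows)

# Placeholder connection details; uncomment with real values to run:
# write_embeddings("bolt://localhost:7687", "neo4j", "password", example_embeddings)
```

Sending all rows in a single UNWIND keeps the number of round trips low; for large graphs you would likely split the rows into batches.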

Hope this helps, and happy modeling!

Kristof

Thank you, it helps a lot!