I need to apply a linear regression machine learning method to an existing Neo4j database to find node importance and correlation.
Any examples to learn from?
Hello @familylife103,
Are you using Neo4j 3.x or Neo4j 4.x?
If you want to measure node importance, a good place to start is the centrality algorithms (Centrality). They use the structure of the graph itself to measure the importance of nodes.
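To make "using the structure of the graph itself" concrete, here is a minimal sketch of the simplest centrality measure, degree centrality, in plain Python. It is not the GDS implementation; the edge list and the function name `degree_centrality` are hypothetical and only serve as an illustration.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Normalized degree centrality for an undirected edge list.

    Each node's score is its number of distinct neighbors divided by
    the maximum possible number of neighbors (n - 1).
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical toy graph: "a" is connected to every other node.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
scores = degree_centrality(edges)
for node, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(node, round(score, 3))
```

Other centrality algorithms (PageRank, betweenness, and so on) follow the same idea but weigh the structure differently.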
Similarly, the similarity algorithms (Similarity - Neo4j Graph Data Science) use standard metrics for correlation (Jaccard, cosine, Pearson, etc.) to measure the similarity of nodes.
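For reference, the three metrics named above can be sketched in a few lines of plain Python. This is only an illustration of the formulas, not how GDS computes them; the function names and sample inputs are made up.

```python
import math


def jaccard(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(x, y):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(p * q for p, q in zip(x, y))
    norm = math.sqrt(sum(p * p for p in x)) * math.sqrt(sum(q * q for q in y))
    return dot / norm if norm else 0.0


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    sd_x = math.sqrt(sum((p - mean_x) ** 2 for p in x))
    sd_y = math.sqrt(sum((q - mean_y) ** 2 for q in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


# Hypothetical inputs: neighbor sets for Jaccard, property vectors otherwise.
print(jaccard({1, 2, 3}, {2, 3, 4}))
print(cosine([1.0, 0.0], [0.0, 1.0]))
print(pearson([1, 2, 3], [2, 4, 6]))
```

Jaccard works on sets (for example, the neighbors of two nodes), while cosine and Pearson work on numeric vectors (for example, node properties or weights).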
GDS doesn't currently offer node regression, but there are some APOC procedures you can take a look at.
Thanks a lot for the info, @alicia.frame1. The APOC procedures are a great start, especially apoc.math.regr(). However, it is missing a lot of regression parameters. Lauren Shin did great work on this in "Graphs and ML: Multiple Linear Regression" (Lauren Shin, Towards Data Science).
However, it doesn't work in Neo4j 4.x. It would be great if it were integrated there.
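Until something like that is available, the quantities of a one-variable least-squares fit are easy to compute outside the database once the property values have been exported. The sketch below is plain Python and is not APOC's implementation; the function name and the sample values are hypothetical.

```python
def simple_linear_regression(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns a dict with the slope, intercept, coefficient of
    determination (r2), and the means of x and y.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    avg_x = sum(xs) / n
    avg_y = sum(ys) / n
    sxx = sum((x - avg_x) ** 2 for x in xs)
    syy = sum((y - avg_y) ** 2 for y in ys)
    sxy = sum((x - avg_x) * (y - avg_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = avg_y - slope * avg_x
    # For a one-variable fit, r2 equals the squared Pearson correlation.
    r2 = (sxy * sxy) / (sxx * syy) if syy else 1.0
    return {"slope": slope, "intercept": intercept, "r2": r2,
            "avg_x": avg_x, "avg_y": avg_y}


# Hypothetical property values read from nodes (x) and the target (y).
fit = simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
print(fit)
```

For more than one explanatory variable, a library such as scikit-learn is a more practical choice than hand-written formulas.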
Thanks for your reply. Maybe that's why I couldn't get the regression example to work on Neo4j: I'm using the latest version. Does that mean that if I change the version I will be able to use the regression functions?
Is there any example about this topic using Python connected to the Neo4j graph?
Check out the tutorials under our developer guide pages, here: Link Prediction with GDSL and scikit-learn - Developer Guides
We walk through using Neo4j with scikit-learn, SageMaker, and training models inside Neo4j.
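As a rough idea of what the Python side can look like, here is a minimal sketch that reads node properties with the official `neo4j` Python driver and fits a multiple linear regression with scikit-learn. It is not taken from the guides. The connection details, the `Item` label, and the `degree`, `age`, and `score` properties are all assumptions; replace them with your own schema.

```python
def rows_to_xy(rows, feature_keys, target_key):
    """Turn a list of dict rows into a feature matrix X and target list y.

    Rows with a missing (None) feature or target value are skipped.
    """
    X, y = [], []
    for row in rows:
        values = [row.get(k) for k in feature_keys] + [row.get(target_key)]
        if any(v is None for v in values):
            continue
        X.append([float(v) for v in values[:-1]])
        y.append(float(values[-1]))
    return X, y


def fetch_rows(uri, user, password, query):
    """Run a Cypher query and return each record as a plain dict."""
    from neo4j import GraphDatabase  # pip install neo4j

    driver = GraphDatabase.driver(uri, auth=(user, password))
    try:
        with driver.session() as session:
            return [record.data() for record in session.run(query)]
    finally:
        driver.close()


def fit_regression(X, y):
    """Fit ordinary least squares; return coefficients, intercept, and R^2."""
    from sklearn.linear_model import LinearRegression  # pip install scikit-learn

    model = LinearRegression().fit(X, y)
    return model.coef_, model.intercept_, model.score(X, y)


def run_example():
    # Hypothetical label and property names; adjust to your graph.
    query = ("MATCH (n:Item) "
             "RETURN n.degree AS degree, n.age AS age, n.score AS score")
    # Assumed local instance and credentials.
    rows = fetch_rows("bolt://localhost:7687", "neo4j", "password", query)
    X, y = rows_to_xy(rows, ["degree", "age"], "score")
    print(fit_regression(X, y))
```

Call `run_example()` against a running database. Because the regression runs in Python rather than inside Neo4j, this approach does not depend on a plugin being available for your Neo4j version.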
If you use Neo4j 3.5.x, you can use this package: https://towardsdatascience.com/graphs-and-ml-multiple-linear-regression-c6920a1f2e70