I am following the GraphSAGE documentation at https://neo4j.com/docs/graph-data-science/current/algorithms/graph-sage/ and have questions about this training procedure:
CALL gds.beta.graphSage.train(
  'persons',
  {
    modelName: 'exampleTrainModel',
    featureProperties: ['age', 'heightAndWeight'],
    aggregator: 'mean',
    activationFunction: 'sigmoid',
    sampleSizes: [25, 10]
  }
) YIELD modelInfo AS info
RETURN
  info.name AS modelName,
  info.metrics.didConverge AS didConverge,
  info.metrics.ranEpochs AS ranEpochs,
  info.metrics.epochLosses AS epochLosses
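(For context, the 'persons' in-memory graph was created beforehand following the projection from the same docs example. This is roughly what it looks like, so the exact options may differ slightly:)

CALL gds.graph.create(
  'persons',
  {
    // Project Person nodes together with the two feature properties used above
    Person: {
      label: 'Person',
      properties: ['age', 'heightAndWeight']
    }
  },
  {
    // Project KNOWS relationships as undirected
    KNOWS: {
      type: 'KNOWS',
      orientation: 'UNDIRECTED'
    }
  }
)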
The results shown in the documentation are:
didConverge | ranEpochs | epochLosses
yes         | 1         | [186.0494816886275, 186.04946806237382]
With only one epoch, how can there be two losses in the list? Shouldn't there be only one loss?
Also, I am using GDS 1.7 and Neo4j 4.3.4. I copied the same training code from the tutorial, but got this result:
didConverge | ranEpochs | epochLosses
false       | 1         | [186.0494681481392]
So there is only one loss for one epoch, which makes more sense. But 'didConverge' is false instead of 'true'. I also tried all the other examples on the GraphSAGE docs page; 'didConverge' is always 'false', although the embedding numbers look the same as in the examples. On my own dataset, 'didConverge' is also always 'false' across different hyperparameter choices. Based on this, could the didConverge implementation be incorrect?
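In case it helps, this is roughly how I generated the embeddings for that comparison, using the stream mode from the same docs page (graph and model names as above):

CALL gds.beta.graphSage.stream(
  'persons',
  {
    modelName: 'exampleTrainModel'
  }
) YIELD nodeId, embedding
// Resolve the internal node id back to the Person node for comparison with the docs
RETURN gds.util.asNode(nodeId).name AS person, embedding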
Is there a real dataset of reasonable size that demonstrates that the GraphSAGE implementation largely agrees with the original paper?