Hello!
I have a wonderful set of metrics exposed in Prometheus covering cluster and database health. However, I am not sure which of them are worth alerting on, as I am a bit of a newbie. Of course, I have the standard alerts on the following:
- whether the node is up
- whether the disk is filling up
- CPU/RAM usage
But when it comes to Neo4j-specific alerts, I am a bit lost. Vault provides a little guide https://s3-us-west-2.amazonaws.com/hashicorp-education/whitepapers/Vault/Vault-Consul-Monitoring-Guide.pdf for newbies like me, which was great. Does anyone know of anything similar for Neo4j, or can anyone nudge me in the right direction?
I was thinking of the following so far:
- the cluster leader changing frequently
- a high rate of page cache faults
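To make those two ideas a bit more concrete, here is a rough Python sketch of the checks I have in mind. Everything in it is my own assumption rather than anything from Neo4j or Prometheus: the function names, the alert names, and especially the thresholds (3 leader changes and 100 faults per second per window) are placeholders I made up, and the inputs are just values I would sample from the metrics over some time window.

```python
from typing import List, Sequence


def count_leader_changes(leader_samples: Sequence[str]) -> int:
    """Count how often the sampled leader differs from the previous sample."""
    return sum(
        1
        for previous, current in zip(leader_samples, leader_samples[1:])
        if previous != current
    )


def page_fault_rate(first_count: float, last_count: float, window_seconds: float) -> float:
    """Average page faults per second between two readings of a counter."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    # A counter that went backwards (e.g. after a restart) is treated as
    # zero faults for this window instead of producing a negative rate.
    return max(last_count - first_count, 0.0) / window_seconds


def evaluate_alerts(
    leader_samples: Sequence[str],
    first_faults: float,
    last_faults: float,
    window_seconds: float,
    max_leader_changes: int = 3,           # hypothetical threshold
    max_faults_per_second: float = 100.0,  # hypothetical threshold
) -> List[str]:
    """Return the names of the (hypothetical) alerts that would fire."""
    alerts = []
    if count_leader_changes(leader_samples) > max_leader_changes:
        alerts.append("leader_flapping")
    if page_fault_rate(first_faults, last_faults, window_seconds) > max_faults_per_second:
        alerts.append("page_faults_high")
    return alerts
```

In practice I would expect to express this as Prometheus alerting rules rather than Python, but I don't know which metrics and thresholds are sensible, which is really what I am asking.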
Thank you so much!