What to alert on?


I have a wonderful set of metrics exposed in Prometheus covering cluster and database health. However, I am not sure which of them are worth alerting on, as I am a bit of a newbie. Of course, I already have the standard alerts on the following:

  • whether the node is up
  • whether the disk is filling up
  • CPU/RAM usage

But with regard to Neo4j-specific alerts, I am a bit lost. Vault provides a little guide https://s3-us-west-2.amazonaws.com/hashicorp-education/whitepapers/Vault/Vault-Consul-Monitoring-Guide.pdf for newbies like me, which was great. Does anyone know of anything similar for Neo4j, or can anyone nudge me in the right direction?

I was thinking of the following so far:

  • the leader changing frequently
  • a high rate of page cache faults
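
To make those two ideas concrete, here is a rough Python sketch of the logic I have in mind. It is only an illustration: the sample data, the function names, and the thresholds are all made up, and it does not use any real Neo4j metric names. In Prometheus itself I would expect to express the same thing with the PromQL functions `changes()` and `rate()` over whichever metrics turn out to be the right ones.

```python
from typing import Hashable, List, Sequence, Tuple

# A sample is (unix timestamp in seconds, observed value).
LeaderSample = Tuple[float, Hashable]   # value = id of the current leader
CounterSample = Tuple[float, float]     # value = cumulative page fault count


def count_changes(samples: Sequence[LeaderSample], window_s: float, now: float) -> int:
    """Count how often the value differs between consecutive samples in the window."""
    recent: List[Hashable] = [v for t, v in samples if now - window_s <= t <= now]
    return sum(1 for prev, cur in zip(recent, recent[1:]) if prev != cur)


def counter_rate(samples: Sequence[CounterSample], window_s: float, now: float) -> float:
    """Average per-second increase of a cumulative counter over the window."""
    recent = [(t, v) for t, v in samples if now - window_s <= t <= now]
    if len(recent) < 2:
        return 0.0
    (t0, v0), (t1, v1) = recent[0], recent[-1]
    if t1 <= t0 or v1 < v0:
        # Bad timestamps or a counter reset: report nothing rather than guess.
        return 0.0
    return (v1 - v0) / (t1 - t0)


# Hypothetical thresholds; the right values depend on the cluster and workload.
MAX_LEADER_CHANGES = 2
MAX_FAULTS_PER_SECOND = 5.0


def leader_flapping(samples: Sequence[LeaderSample], window_s: float, now: float) -> bool:
    return count_changes(samples, window_s, now) > MAX_LEADER_CHANGES


def page_faults_high(samples: Sequence[CounterSample], window_s: float, now: float) -> bool:
    return counter_rate(samples, window_s, now) > MAX_FAULTS_PER_SECOND


if __name__ == "__main__":
    # Made-up scrape results, one sample per minute.
    leaders = [(0, "a"), (60, "a"), (120, "b"), (180, "a"), (240, "c")]
    faults = [(0, 100.0), (60, 400.0), (120, 1000.0)]
    print(leader_flapping(leaders, window_s=300, now=240))
    print(page_faults_high(faults, window_s=300, now=120))
```

What I do not know is which metrics to feed into this and what sensible thresholds look like, which is really the heart of my question.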

Thank you so much!