Best practice for specified modelling example

Hello everyone,
I am trying to develop a Neo4j demo project.
Using the GitLab API, I am trying to fetch merge request information, and I want to collect the comments posted on each merge request.

  • ER-1 Diagram
    Reduces the File node count, but I can't tell which comment belongs to which merge request.

  • ER-2 Diagram
    Increases the File node count, but we know which merge request each comment belongs to.

I am not sure which one is correct; both approaches have advantages and disadvantages.
Is it okay to increase the node count like that?
Or could I just add a property (merge request info) to the contains relationship between File and Comment?
Thank you for your time!
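
To make the last option concrete, here is a minimal in-memory sketch in plain Python (not Neo4j itself). The sample paths, merge request ids, the `merge_request` property name, and the `comments_for` helper are all hypothetical. It only illustrates the idea: one File node per path, as in ER-1, with the merge request stored on each contains relationship so comments can still be grouped by merge request.

```python
# Hypothetical sample data: (file path, merge request id, comment text).
comments = [
    ("src/app.py", "MR-1", "Please rename this variable."),
    ("src/app.py", "MR-2", "This branch is never reached."),
    ("README.md", "MR-2", "Typo in the install section."),
]

# One File node per path, as in ER-1, so the File node count stays low.
file_nodes = {path for path, _, _ in comments}

# Each contains relationship carries the merge request as a property
# (the property name "merge_request" is an assumption for this sketch).
contains = [
    {"file": path, "comment": text, "merge_request": mr}
    for path, mr, text in comments
]


def comments_for(merge_request):
    """Return the comment texts whose relationship names this merge request."""
    return [
        rel["comment"]
        for rel in contains
        if rel["merge_request"] == merge_request
    ]
```

With this shape, `comments_for("MR-2")` filters on the relationship property instead of needing a separate File node per merge request. Whether that filter is fast enough in a real database depends on the queries you plan to run.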

  1. Your arrows are backwards.
    • Think in terms of arrows pointing to parents.
  2. Graphs let you model your data so that it closely matches its real-world structure. Is File part of Project, or of Merge? Is Merge part of Project?
    • To my mind, File probably belongs to both, but I don't know about Merge.
  3. What are you going to be querying for?
    • Designing a graph database is about balancing complexity with performance. If you know what kind of results you want to get out, that should inform how best to structure the graph to make those queries faster.
  4. "...which one is correct... Is it okay to increase node count like that ?"
    • Both are "correct"; it's perfectly fine to create more nodes to have better data, or faster queries, or to more accurately represent reality.
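
As a rough illustration of points 1 and 2, here is a small plain-Python sketch (not Cypher) where every arrow points from a child to its parents. The node names and the exact parent choices are assumptions made for the example, not a recommended schema; in particular, attaching a comment directly to its merge request is only one possible answer to the question in point 2.

```python
# Child -> parents arrows. All labels and ids below are hypothetical.
parents = {
    "Comment:c1": ["File:src/app.py", "MergeRequest:MR-1"],
    "File:src/app.py": ["Project:demo", "MergeRequest:MR-1"],
    "MergeRequest:MR-1": ["Project:demo"],
    "Project:demo": [],
}


def ancestors(node):
    """Collect every node reachable by following parent arrows upward."""
    seen = []
    stack = [node]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.append(parent)
                stack.append(parent)
    return seen
```

Starting from a comment, `ancestors("Comment:c1")` reaches its file, its merge request, and the project, which is the kind of upward traversal that parent-pointing arrows make easy to reason about.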

Without knowing more about what you're really trying to do, why you're making this graph, I'd suggest something a little more representative of the data and history, without losing information:

Thank you for your detailed answers, they helped me a lot!