What is the best approach for populating nodes and relationships in Neo4j?
I am considering two possibilities:
- Load nodes with NULL values. In this case the relationships are defined beforehand, and the node and relationship properties are populated once we have the data.
- Create relationships only when the node can be created. In this case queries may need to use the OPTIONAL MATCH clause.
Our use case actually have
Could you say a little more about what your data source is? You would want to create nodes and relationships as you have them... there's no need for NULL nodes, as that relationship just doesn't exist yet. This lack of relationship is what takes the place of NULLs in tabular data... it's automatically implied if there is no relationship. I hope that makes sense. :)
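To make that concrete, here is a minimal sketch of "create it only when you have it". It only builds Cypher statements and parameters, so it runs without a database. The `Customer` and `Email` labels, the `HAS_EMAIL` relationship type, and the `customerId` and `email` field names are all made up for illustration; substitute whatever your model uses.

```python
def build_statements(record):
    """Return a list of (cypher, params) pairs for one source record.

    Hypothetical model: (:Customer)-[:HAS_EMAIL]->(:Email).
    A relationship is emitted only when the related value is present,
    so there is never a placeholder node holding NULLs.
    """
    customer_id = record["customerId"]

    # The customer node is always created (or matched if it already exists).
    statements = [(
        "MERGE (c:Customer {customerId: $customerId})",
        {"customerId": customer_id},
    )]

    # Only create the Email node and the relationship if we have an email.
    email = record.get("email")
    if email is not None:
        statements.append((
            "MATCH (c:Customer {customerId: $customerId}) "
            "MERGE (e:Email {address: $email}) "
            "MERGE (c)-[:HAS_EMAIL]->(e)",
            {"customerId": customer_id, "email": email},
        ))
    return statements


with_email = build_statements({"customerId": "c1", "email": "ann@example.com"})
without_email = build_statements({"customerId": "c2"})
```

A record without an email produces just the customer node. A later query that wants customers with or without an email would then use OPTIONAL MATCH on `HAS_EMAIL`, and the missing relationship plays the role the NULL would have played in a table.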
Thanks for your recommendation.
Our use case is to build a customer journey platform, and the data sources are multiple systems across the enterprise plus 3rd party data.
In most cases a user will be present in only one system.
If the same user later needs to be onboarded from a new system, I am wondering how the lookup across all the other systems would work.
My thinking at this point is that whenever we onboard a user from a new system into the graph, we need to look up whether the user is already present from any other system and add the attributes gathered from the new system to their profile. I am wondering how costly this will be at insertion time, considering we are looking across 10-15 systems.
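One way the lookup-then-enrich step could look is a single MERGE on a shared matching key, sketched below. This assumes every system can supply some common identifier (called `matchKey` here, for example a normalized email); that assumption, along with the `Customer` and `SourceSystem` labels, the `SEEN_IN` relationship, and the function name, is hypothetical and not something stated in this thread. The code only builds the query and parameters, so it runs without a database.

```python
ONBOARD_QUERY = (
    "MERGE (c:Customer {matchKey: $matchKey}) "
    "ON CREATE SET c.firstSeenIn = $system "
    "SET c += $props "
    "MERGE (s:SourceSystem {name: $system}) "
    "MERGE (c)-[:SEEN_IN]->(s)"
)


def build_onboard_query(system, match_key, attrs):
    """Build one upsert for a user arriving from `system`.

    MERGE finds the existing Customer if any earlier system already
    created it, otherwise it creates a new one. `SET c += $props` adds
    the new attributes to the profile without removing existing ones.
    """
    # Drop missing values so a sparse record from one system does not
    # overwrite or clear attributes collected from another system.
    props = {k: v for k, v in attrs.items() if v is not None}
    params = {"matchKey": match_key, "system": system, "props": props}
    return ONBOARD_QUERY, params


query, params = build_onboard_query(
    "crm", "ann@example.com", {"name": "Ann", "phone": None}
)
```

With this shape the number of source systems does not multiply the work: each insert is one lookup on the matching property, not one lookup per system. An index or uniqueness constraint on that property is what typically keeps the lookup cheap. If the systems do not share a reliable key, matching becomes an identity-resolution problem that a single MERGE will not solve.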
I'm not sure I understand enough about your system, but insertions and updates in Neo4j aren't costly. I would make sure to go over your data model/schema so that it makes sense for the types of queries you plan to run against it in the future. Are you combining the data from the different systems into a single graph database?
Yes, that is right. We are trying to create a unified view of clients across all systems in the enterprise.