Do I need foreign keys?

Hello,

I'm new to Neo4j and I'm setting up my first schema. I have what I'm sure is a dumb question, but I'm going to ask it anyway. Do I need foreign keys? To give an example...

I'm setting up a simple graph consisting of customer nodes and customer bank card nodes and I want to relate them using a [:CARD_BELONGS_TO] relationship.

Do I need to include the customer card number in my customer nodes or is it enough just to establish the relationship? Is there any advantage in terms of indexing the card number on the customer node when it comes to searching?

Thanks in advance for any help!

S

Hi,

You don't need any foreign key; the relationship is sufficient.

In Neo4j, traversing a relationship is really cheap: it is not computed at query time the way a join is in SQL, because the relationship itself is stored on disk.
So once you have your Customer node, you can retrieve its BankCard node(s) very quickly.

This is the key concept of a graph database. A graph database works locally: you find a node and then expand through the graph by following relationships.

A SQL database, on the other hand, works globally (on tables), using joins, a concept that comes from set theory.
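As a rough sketch of what that looks like in Cypher: the `Customer` and `BankCard` labels, the `customerId` and `number` properties, and the index name are my assumptions, not something from the original question. If you want fast lookups by card number, the index would go on the card node, not on the customer.

```cypher
// Optional: index the card number on the BankCard node itself
// (Neo4j 4.x+ syntax; label, property and index name are illustrative).
CREATE INDEX bank_card_number IF NOT EXISTS
FOR (card:BankCard) ON (card.number);

// Find a customer, then follow the relationship to its cards.
// No card number needs to be stored on the Customer node.
MATCH (c:Customer {customerId: $customerId})<-[:CARD_BELONGS_TO]-(card:BankCard)
RETURN card.number;

// Or start from an (indexed) card number and hop to the owner.
MATCH (card:BankCard {number: $cardNumber})-[:CARD_BELONGS_TO]->(c:Customer)
RETURN c;
```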

Hope that makes it clearer.

Cheers.

That's excellent, thanks for confirming!

S

As an aside, the value of a foreign key for me goes beyond what it represents in the schema to the implementation in the DB that enforces it as a constraint and supports operation as such. Thus my application is prevented from creating a row (node) with a column (property) value that is not contained within the set of the related table. Similarly, I (optionally) benefit from automatic deletion cascades.

I was drawn to this page because I am considering coding such validation in my application and was hoping to find similar functionality in Neo4j. But I suppose it's not the nature of the beast... ?

You may need to explain a bit more.

"my application is prevented from creating a row (node) with a column (property) value that is not contained within the set of the related table."

If I'm reading this right, you're referring to how you can't create a foreign key where the primary key value that it references doesn't exist, or where a deletion could cause the foreign key to no longer point toward an existing value in the other table.

In Neo4j, relationships are used in place of foreign keys or join tables. Relationships can only be between existing nodes, so in that respect, you cannot create a dangling relationship pointing to nothing, nor can you delete a node and leave a dangling relationship. Before you are allowed to delete a node, you must delete all its immediate relationships first, which can be done by using a DETACH DELETE clause (otherwise a DELETE of a node with relationships will error out).

In other words, relationships will always be consistent, connecting existing nodes, never allowed to be dangling with only a single node.
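A minimal illustration, assuming a `Customer` node identified by a hypothetical `customerId` property that still has cards attached:

```cypher
// Errors out if the customer still has any relationships,
// e.g. an incoming CARD_BELONGS_TO.
MATCH (c:Customer {customerId: $customerId})
DELETE c;

// Deletes the customer together with all of its relationships.
// The BankCard nodes on the other end are left in place.
MATCH (c:Customer {customerId: $customerId})
DETACH DELETE c;
```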

Yes, that's how Neo4j works, and yes, DETACH DELETE is a rough analog to the cascade I referred to. But a foreign key in an RDBMS prevents creation of a row, not a relation. Relations are ephemeral in an RDBMS, as they are only tangible by virtue of join operations. Joins are conceptually preprocessed with Neo4j, wherein lies a potential performance advantage at query time. But the application benefit of a foreign key is not implemented. The feature doesn't exist as a schema option.

Similarly, DETACH DELETE does not delete a referenced node. It merely removes a reference (Relationship) to the node associated with the deleted node via that reference. In that sense, there is certainly something "dangling." It's just a matter of how one defines dangling. It "unjoins", but it does not take the extra step to assure the referential integrity assured by a foreign key constraint.
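To make the gap concrete, here is roughly what the application has to write itself to get cascade-like behaviour. This is only a sketch; the labels and the `customerId` property are assumed for illustration.

```cypher
// Emulate ON DELETE CASCADE by hand: delete the customer
// and every card that belongs to it in one statement.
MATCH (c:Customer {customerId: $customerId})
OPTIONAL MATCH (card:BankCard)-[:CARD_BELONGS_TO]->(c)
DETACH DELETE card, c;
```

Nothing in the schema forces this; a plain DETACH DELETE on the customer would leave the cards behind as orphans.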

Or perhaps there is some capability I've overlooked? I don't want a node to exist if it contains a property with a value not contained within a constrained set defined by another set of nodes. As I understand it, my application must assure this explicitly.

Ah, so what you seem to be asking about would take the form (if implemented) of a constraint on relationships on a node, such that a certain type of node must have a certain relationship type present, and likely to a specific type of node as well.

While this isn't currently supported in our schema constraints, we do have a means to create an equivalent of triggers in Neo4j using a TransactionEventHandler as a kernel extension. APOC Procedures provides its own trigger procedures that can leverage this for you. You would have to write the logic in Cypher or in code to check for edited nodes of the types you are interested in and ensure they conform to this custom constraint.
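A sketch of what such a trigger might look like with APOC, for the card/customer example from earlier in the thread. Treat it as a starting point and not as a verified implementation: it assumes APOC is installed with `apoc.trigger.enabled=true`, the trigger name, labels and `number` property are made up, and on Neo4j 5 `apoc.trigger.add` is deprecated in favour of `apoc.trigger.install`.

```cypher
// Reject any transaction that creates a BankCard without an owner.
// Runs in the 'before' phase so a failed check rolls back the commit.
CALL apoc.trigger.add(
  'bankcard-requires-owner',
  "UNWIND $createdNodes AS n
   WITH n
   WHERE n:BankCard AND NOT (n)-[:CARD_BELONGS_TO]->(:Customer)
   CALL apoc.util.validate(true, 'BankCard %s has no owning Customer', [n.number])
   RETURN n",
  {phase: 'before'}
);
```

Note this only covers node creation. Deleting the relationship or the Customer afterwards would need its own check (for example against `$deletedRelationships`).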

Correct. I much appreciate the explicit pointers to guidance on implementing such via triggers and stored procedures. I'm sure you appreciate that you have referred me to a mountain of docs to support coding functionality that is merely a simple declarative feature of a typical RDBMS.

As I initially surmised, this type of thing just isn't in the nature of the rather schema-less Neo4j. That's not intended to be a harsh criticism. I like the loosey-goosey feel which is part of its appeal for my current application. I was just wishing for a particular convenience.

Thanks for your help.

That's fair. I'll pass on the feedback; it should be valuable when we get to the point of making a decision on these types of constraints.

I appreciate that too. But don't view it as a feature request - at least not from me. Again, the Neo4j approach to data organization is just different, and perhaps a foreign key constraint feature is not philosophically appropriate. As a general purpose tool, it should have a clean consistent design, not be a conglomeration of features.

To the case in point, which is my current application, I'm not even sure using a foreign key constraint is such a great idea if I step back and consider alternative ways of populating and maintaining my DB. It merely would have been sweet to have as I ambled along on my current path of rapid application development and exploration. In other words, the fact that I want to use it at all probably exposes the fact that I could be using the tool in a fundamentally more intelligent way. I'm not very proficient with Neo4j, so I don't have the intellectual foundation to make well-informed judgments about its design.

At the moment, I would say it wouldn't hurt to have the feature, but it might hurt to use it too often. Make sense?

For what it's worth, assuming end users are not directly writing Cypher against your graph, you do have the option of running MATCH (n:SomeRequiredNode { property: 'SomeValue' }) before any create/merge on another node (effectively requiring that foreign key).
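Spelled out for the customer/card case (labels and property names here are just placeholders), that pattern would look something like this:

```cypher
// If no Customer matches, the MATCH yields zero rows,
// so the CREATE never runs and no orphan card is written.
MATCH (c:Customer {customerId: $customerId})
CREATE (card:BankCard {number: $cardNumber})-[:CARD_BELONGS_TO]->(c)
RETURN card;
```

The application can then treat an empty result as the equivalent of a foreign key violation. It only protects writes that go through this query, of course.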

I think the constraint functionality described by @neo4jforum would add great value to Neo4j and, since he asked that it not be viewed as a feature request from him, I'd ask that it be a feature request from me :smiley:

I would, in fact, find it VERY useful in my application to be able to define native Neo4j composite unique constraints that consist of a combination of properties of nodes with certain labels and properties of related nodes, connected by specific relationships with certain properties.

By the same token, it would also be useful to add a constraint that requires a node to have a relationship of a specific type to another node with certain labels, analogous to a non-nullable foreign key in a SQL DB.