Neo4j transactions are ACID compliant, and I want to know how Neo4j guarantees consistency and durability.
The source code shows that files on disk are loaded into a page cache that Neo4j allocates. Once a file is mapped into the page cache, it should no longer be accessed directly through the file system, and the page cache keeps changes in memory.
Is it right that when a transaction contains a write operation, the new value is kept in memory, so that multiple copies of the data exist in memory? If so, how does Neo4j guarantee consistency? I see that Neo4j writes a log to guarantee atomicity, but it seems it cannot fully guarantee consistency, which is why the isolation level is read committed.
I know that Neo4j uses copy-on-write to improve write performance, but the updated value is still kept in memory, so how does Neo4j guarantee durability?
Finally, I want to ask how Neo4j performs write operations, how it ensures consistency, and what level of consistency is guaranteed. I have found very little documentation describing these.
I will keep studying the code in depth. I would be very grateful for any reply!
While Neo4j does use the pagecache for an in-memory graph, it writes changes to the graph to transaction logs on disk as well as to the pagecache. When a transaction is recorded in the transaction log, it is considered committed. The transaction logs are durable, and will survive unexpected shutdowns of Neo4j or the server itself.
At intervals, checkpoint operations flush changes from the transaction logs to the store files themselves, and perform any rotation and pruning of the transaction logs that is needed.
During startup, Neo4j looks for the last checkpoint in the transaction logs and replays all subsequent transactions to the pagecache. This is part of what's going on when you see Neo4j going through a recovery process on startup.
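To make the commit / checkpoint / recovery cycle above concrete, here is a minimal sketch of the general write-ahead logging idea. It is a toy model, not Neo4j's code: the `ToyStore` class, the file names `tx.log` and `store.json`, and the JSON format are all made up for illustration, and a plain dict stands in for the pagecache.

```python
import json
import os
import tempfile


class ToyStore:
    """Toy model: an append-only log, a store file, and an in-memory cache."""

    def __init__(self, directory):
        self.log_path = os.path.join(directory, "tx.log")        # hypothetical name
        self.store_path = os.path.join(directory, "store.json")  # hypothetical name
        self.cache = {}      # stands in for the pagecache
        self.recover()

    def commit(self, changes):
        # Append the changes to the log and force them to disk first;
        # only after that is the transaction treated as committed.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(changes) + "\n")
            log.flush()
            os.fsync(log.fileno())
        self.cache.update(changes)   # then apply to the in-memory copy

    def checkpoint(self):
        # Write the in-memory state to the store file, then prune the log.
        tmp = self.store_path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as store:
            json.dump(self.cache, store)
            store.flush()
            os.fsync(store.fileno())
        os.replace(tmp, self.store_path)
        open(self.log_path, "w").close()

    def recover(self):
        # Load the last checkpointed state, then replay the logged transactions.
        if os.path.exists(self.store_path):
            with open(self.store_path, encoding="utf-8") as store:
                self.cache = json.load(store)
        if os.path.exists(self.log_path):
            with open(self.log_path, encoding="utf-8") as log:
                for line in log:
                    self.cache.update(json.loads(line))


def demo():
    with tempfile.TemporaryDirectory() as d:
        store = ToyStore(d)
        store.commit({"a": 1})
        store.checkpoint()
        store.commit({"b": 2})      # logged, but never checkpointed
        del store                   # simulate a crash: in-memory state is lost
        return ToyStore(d).cache    # recovery replays the log
```

In this sketch `demo()` gets back both `a` and `b`, even though `b` never reached the store file, because it was in the log when the "crash" happened.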
When dirty pages are evicted from the pagecache, they are flushed to the store files. That way, even when performing recovery from the tx logs, the pagecache doesn't need to be large enough to contain all subsequent changes.
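The flush-on-eviction behaviour can be sketched roughly like this. Again this is only an illustration: `ToyPageCache` is a made-up class, a dict stands in for the store files, and the least-recently-used eviction order is an assumption of the sketch, not a description of Neo4j's actual eviction algorithm.

```python
from collections import OrderedDict


class ToyPageCache:
    def __init__(self, capacity, store):
        self.capacity = capacity
        self.store = store              # dict standing in for the store files
        self.pages = OrderedDict()      # page_id -> contents, oldest first
        self.dirty = set()              # pages changed since they were last flushed

    def write(self, page_id, contents):
        self.pages[page_id] = contents
        self.pages.move_to_end(page_id)
        self.dirty.add(page_id)
        self._evict_if_needed()

    def read(self, page_id):
        if page_id not in self.pages:   # page fault: load from the store
            self.pages[page_id] = self.store[page_id]
            self._evict_if_needed()
        self.pages.move_to_end(page_id)
        return self.pages[page_id]

    def _evict_if_needed(self):
        while len(self.pages) > self.capacity:
            page_id, contents = self.pages.popitem(last=False)
            if page_id in self.dirty:   # dirty pages are flushed on eviction
                self.store[page_id] = contents
                self.dirty.discard(page_id)


store = {}
cache = ToyPageCache(capacity=2, store=store)
for page_id, contents in enumerate("abc", start=1):
    cache.write(page_id, contents)      # the third write evicts and flushes page 1
```

With a capacity of two pages, three writes still succeed, because the oldest dirty page is written out to `store` when it is evicted and can be faulted back in later.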
For write transactions, as the transaction is processed, the heap holds transitory state as well as the final transactional changes resulting from the query. The commit process writes these to the tx logs and the pagecache.
As you said, Neo4j guarantees read-committed isolation. When used in a cluster, it promises causal consistency if you are using bookmarks, or between transactions within the same session. Otherwise you get eventual consistency as far as reading your own writes. If you only interact with the leader, then it's read-committed isolation, just as it is on a single instance.
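As a rough picture of what a bookmark buys you in a cluster, here is a toy model. It is not the Neo4j driver API: `ToyLeader`, `ToyFollower`, and the use of a plain transaction counter as the bookmark are all assumptions made for this sketch.

```python
class ToyLeader:
    def __init__(self):
        self.log = []                   # committed (key, value) pairs, in order

    def write(self, key, value):
        self.log.append((key, value))
        return len(self.log)            # "bookmark": id of the last committed tx


class ToyFollower:
    def __init__(self, leader):
        self.leader = leader
        self.applied = 0                # number of leader transactions applied so far
        self.data = {}

    def catch_up(self, up_to):
        for key, value in self.leader.log[self.applied:up_to]:
            self.data[key] = value
        self.applied = max(self.applied, up_to)

    def read(self, key, bookmark=0):
        # With a bookmark, the read waits (here: catches up) until the follower
        # has applied at least that transaction. Without one, it may be stale.
        if self.applied < bookmark:
            self.catch_up(bookmark)
        return self.data.get(key)


leader = ToyLeader()
follower = ToyFollower(leader)
bookmark = leader.write("x", 1)

stale = follower.read("x")                      # no bookmark: write not visible yet
fresh = follower.read("x", bookmark=bookmark)   # bookmark: own write is visible
```

Here `stale` misses the write while `fresh` sees it, which is the read-your-own-writes difference between eventual and causal consistency described above.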
Thank you!! I understand a lot more now.
So a transaction writes to the transaction logs on disk and writes its changes to the pagecache. When a transaction is recorded in the transaction log, it is considered committed. The dirty pages will be flushed to disk later.
But I still have some questions. Are updates made in place in the pagecache, rather than copy-on-write? If it is the former, that explains why reads are not repeatable. If it is the latter, does Neo4j copy the entire page and then modify it, and would that cause a write amplification problem?
Currently it updates in place in the pagecache, so reads are not repeatable.
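A tiny sketch of why in-place updates give non-repeatable reads under read committed. This is a toy model only: the shared dict stands in for the pagecache, and the function names are invented for the example.

```python
pagecache = {"balance": 100}        # shared committed state, updated in place


def reader_transaction(between_reads):
    first = pagecache["balance"]    # first read inside the transaction
    between_reads()                 # another transaction commits here
    second = pagecache["balance"]   # same read, repeated in the same transaction
    return first, second


def writer_commit():
    pagecache["balance"] = 50       # commit overwrites the value in place


result = reader_transaction(writer_commit)
```

The reader only ever sees committed values, but `result` holds two different values for the same key, because there is no older copy of the data left for it to keep reading from.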