I've been reloading the same dataset into the same database repeatedly (to optimize my data pipeline) and have observed that the store size keeps growing. What is the reason for this, and what can I do to release the space? I have a database admin role but do not have access to the server configuration.
Thanks in advance.
Can you provide details of:
a. What version of Neo4j are you using?
b. How are you determining store size? A simple size of the OS directory? Running :sysinfo from the browser? From some JMX metric? Or something else?
Thanks for replying. I'm using Neo4j Enterprise server 4.2.5 and determined the store size from Neo4j Browser using :sysinfo.
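@sabrina.liu Thanks for this detail. There is not enough information to know exactly what "reloading the same dataset" means, but if it involves deletes then you may have encountered the behavior described in the Knowledge Base article "Understanding Database Growth": deleting data does not shrink the store files, since the freed space is kept for reuse.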
You may also want to check whether transaction logs are contributing to the size you're seeing. These build up over time; I think the default is to keep 7 days of tx logs. You can adjust the retention settings to keep logs for a shorter period, or to keep only up to a certain size limit.
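Since you don't have access to the server configuration, here is a rough sketch of how an admin might inspect, and possibly adjust, the retention setting from Cypher. It assumes the Neo4j 4.x setting name dbms.tx_log.rotation.retention_policy, that the setting is reported as dynamic on your version, and that your role is allowed to call these procedures; the '2 days' value is purely illustrative.

```cypher
// List the current transaction log rotation settings (Neo4j 4.x names)
// and show whether each one can be changed at runtime.
CALL dbms.listConfig('dbms.tx_log.rotation')
YIELD name, value, dynamic
RETURN name, value, dynamic;

// Only if retention_policy shows dynamic = true:
// keep roughly 2 days of tx logs (illustrative value, not a recommendation).
CALL dbms.setConfigValue('dbms.tx_log.rotation.retention_policy', '2 days');
```

A value set this way is not written to neo4j.conf, so it would be lost on restart, and old logs are typically only pruned at the next checkpoint rather than immediately.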
The growth is transaction logs. They rotate and are deleted according to the transaction log retention policy in the Neo4j configuration. Transaction logs record the data you send to the database, so even if you throw the same data at it repeatedly to test Cypher, or change dates to ingest large amounts of data, the tx logs will grow. The retention policy prunes the logs on the configured schedule, after which the database size and disk usage go down.