Scaling Neo4j as data grows

We are hosting Neo4j on AWS (running on an EC2 instance) and are working with a single-node cluster. We are just getting started with Neo4j, and since we are in the dev phase we don't have a huge amount of data. However, we foresee that the data will grow exponentially pretty soon.

How do you propose we set up our Neo4j installation so that we can grow it as our data grows?

Hi there

What does "single node cluster" mean? What is your workload like (volume, read/write distribution, concurrent users, types of graph operations)? Different deployment strategies exist for different workloads, so it would really help if we had more info.

FYI: I wrote about "scale" and all its vague definitions quite some time ago on the Bruggen Blog, in "Autocompleting Neo4j - part 2/4 of a Googly Q&A".



Some extra information would be helpful. What is your data schema like? What kind of queries do you need to run against this data?

The performance tuning guide has a lot of good information on scaling Neo4j, but as with all databases, how best to do this depends on your data and query workload. See "Neo4j Performance Tuning - Developer Guides".