I have a Neo4j 4.1.0 Community Edition setup on an EC2 instance (Ubuntu 18.04) with 16 GB RAM. The database is 211 MB on disk (measured with du -hs /var/lib/neo4j/data/databases/neo4j/) and is made up of about 93K nodes of 3 labels with a single property each.
I have configured the following settings as suggested by neo4j-admin memrec.
I am running the following query, which takes about 6 minutes to complete:
MATCH (person:Person), (person:Person)-[r0:STUDIED_AT]-(college:College), (college:College)-[r]-(x) RETURN type(r) AS label, last(labels(x)) AS target, count(r) AS count ORDER BY count(r) DESC
Can someone help me understand why this query is taking so long to run although the size of the graph is pretty small and the system specs are good enough? Also, is there a way to speed up the execution considerably without modifying the query (because the query is coming from popoto.js and I do not have much control over it).
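One possible explanation can be sketched with simple arithmetic. Cypher does not bind the same relationship twice within one MATCH, so for a college with s STUDIED_AT relationships and k other relationships, each choice of r0 pairs with (s + k - 1) choices of r. The sketch below assumes that every STUDIED_AT relationship on a College node leads to a Person node and that there are no self-loops. The distribution of 90,000 students over 10 colleges is purely hypothetical, since the post does not give the actual figures.

```typescript
// Rough row-count estimate for the pattern
//   (person:Person)-[r0:STUDIED_AT]-(college:College), (college)-[r]-(x)
// For one college: every r0 (one per student) combines with every other
// relationship r on the same college, because r must differ from r0.
function rowsForCollege(students: number, others: number): number {
  if (students === 0) return 0;
  return students * (students + others - 1);
}

// Sums the estimate over a list of [students, others] pairs, one per college.
function totalRows(colleges: Array<[number, number]>): number {
  return colleges.reduce((sum, [s, o]) => sum + rowsForCollege(s, o), 0);
}

// Hypothetical distribution (an assumption, not taken from the post):
// 90,000 students spread evenly over 10 colleges, no other relationships.
const evenlySpread: Array<[number, number]> = Array.from(
  { length: 10 },
  () => [9000, 0] as [number, number]
);
console.log(totalRows(evenlySpread));
```

If the real data looks anything like this, the number of intermediate rows grows with the square of the students per college, so a small store on disk does not guarantee a small result set for the query to aggregate. Running the query with PROFILE would show the actual row counts.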
I have already tried the following:
CALL apoc.warmup.run()
Running the same query twice (expecting a better time on the second execution)
Creating an index on all three labels (I do not need to write to the DB, it is largely read-only).
A couple more questions:
What limits the size/number of requests to the DB? How can I accommodate more?
Is caching results possible? I know that Neo4j caches the DB and the query plans, but I am not sure whether results can be cached. I saw a feature request in the GitHub issues but I am not sure if it got addressed.
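On the caching question, one option for a largely read-only database is to cache results in the application layer, regardless of what the server itself offers. The following is a minimal sketch. The names `RunQuery` and `withResultCache` are hypothetical, and `run` stands in for whatever function actually sends Cypher to the database (for example, a thin wrapper around a driver session).

```typescript
type Row = Record<string, unknown>;
type RunQuery = (cypher: string, params: Record<string, unknown>) => Promise<Row[]>;

// Wraps a query function so that identical (cypher, params) calls made within
// `ttlMs` milliseconds share one result instead of hitting the database again.
// `now` is injectable so the expiry logic can be exercised without waiting.
function withResultCache(
  run: RunQuery,
  ttlMs: number,
  now: () => number = Date.now
): RunQuery {
  const cache = new Map<string, { expires: number; rows: Promise<Row[]> }>();
  return (cypher, params) => {
    // Note: the key depends on the property order of `params`.
    const key = JSON.stringify([cypher, params]);
    const hit = cache.get(key);
    if (hit && hit.expires > now()) return hit.rows;
    const rows = run(cypher, params);
    cache.set(key, { expires: now() + ttlMs, rows });
    // Do not keep failed queries in the cache.
    rows.catch(() => cache.delete(key));
    return rows;
  };
}
```

Caching the promise rather than the resolved rows means that concurrent identical requests also share a single database round trip. The time-to-live is a judgment call: with data that rarely changes it can be long.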
Unfortunately, I can't edit the query. It's created internally by a JS library that I am using for my application. So first I am trying to assess whether this performance is to be expected (given the size of the data and the machine configuration) and whether there is a way to configure Neo4j for faster performance.
Thanks for trying to help. Would you be able to comment on whether this performance is to be expected, given the size of the data and the machine configuration?
6 minutes seems outrageously long. Which instance type are you using?
I would love to see how much time is shaved off with the query rewrite.
Even if you can't "fix" the query, it's good to know if this helps.
Are you able to download the dataset and try it on a local Neo4j Desktop instance?
Just to see how it compares to the EC2 instance.
Thanks. I am trying to build a web interface for Neo4j to make a dataset available for users to explore. I figured out a way to edit the queries created by popoto on the server side, and that resolved the issue. Thanks for looking into this.
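The post does not say how the server-side editing was done. One way it could be approached, shown here only as a sketch, is a lookup table that swaps known slow queries for hand-written replacements before they are forwarded to Neo4j. The names `normalize`, `rewrites`, and `rewriteQuery` are hypothetical, and so is the replacement query. Note that the replacement is not equivalent to the original: it counts each relationship on a college once rather than once per student, so it only fits if those are the counts the interface needs.

```typescript
// Collapse whitespace so that formatting differences do not defeat the lookup.
function normalize(cypher: string): string {
  return cypher.replace(/\s+/g, " ").trim();
}

// The generated query, copied from the question.
const SLOW_QUERY =
  "MATCH (person:Person), (person:Person)-[r0:STUDIED_AT]-(college:College), " +
  "(college:College)-[r]-(x) RETURN type(r) AS label, last(labels(x)) AS target, " +
  "count(r) AS count ORDER BY count(r) DESC";

// Hypothetical replacement (an assumption, not the poster's actual fix).
// It checks that a college has at least one student instead of expanding
// every student, which changes the reported counts.
const REPLACEMENT_QUERY =
  "MATCH (college:College)-[r]-(x) WHERE (college)-[:STUDIED_AT]-(:Person) " +
  "RETURN type(r) AS label, last(labels(x)) AS target, " +
  "count(r) AS count ORDER BY count(r) DESC";

const rewrites = new Map<string, string>([[normalize(SLOW_QUERY), REPLACEMENT_QUERY]]);

// Returns the replacement for a known query, or the query unchanged otherwise.
function rewriteQuery(incoming: string): string {
  const replacement = rewrites.get(normalize(incoming));
  return replacement === undefined ? incoming : replacement;
}
```

A server endpoint sitting between the browser and the database would call `rewriteQuery` on each incoming statement and pass the result on. Exact-match lookup is deliberately conservative: any query that is not recognised goes through untouched.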
Hi, could you please share how you edit the queries created by popoto on the server side? I would also like to know how to write custom queries in popoto.js. I am trying to do it with the help of the schema, but your help would mean a lot to me. Thanks in advance.