Why isn't this measurement of performance consistent?

lingvisa · November 20, 2020, 3:41am

I have a query:

MATCH (m:Product)
WHERE  'beautify' IN  m._effect
RETURN properties(m) as properties LIMIT 1

I used a logger to measure the time spent on just the query:

print(cypher)
logger.info('before query ...')
result = tx.run(cypher)
logger.info('after query ...')

And the result through the Python Bolt Driver is below:

2020-11-19 19:33:54,841 - kg_api - INFO - 461 - before query ...
2020-11-19 19:33:54,895 - kg_api - INFO - 463 - after query ...

So the elapsed time is 895 - 841 = 54 milliseconds. However, the Neo4j Browser reports that the same query takes only 8 milliseconds:

Started streaming 1 records after 1 ms and completed after 8 ms.

I repeated the query many times in both environments, and the two measurements are still very different.
Are these two measurements comparable?

clem · November 24, 2020, 12:39am

Depending on the size of your DB, the query cache can cause some variability in speed. That is, if the results of the query (plus intermediate results) cause a lot of cache misses, then the query will slow down. If other queries evict from the cache the results you need later, your query will run slower again.

So, a lot depends on the size of your DB, the size of your memory, and how you configured your Neo4j DB (e.g. with more or less cache memory). There are probably some other factors I'm not aware of too.
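One way to see whether caching plays a role is to time the first run separately from the runs that follow it. The sketch below is only an illustration: `run_query` is a hypothetical zero-argument callable that executes the query and reads its result, and `time_cold_and_warm` is a name invented here, not part of any driver.

```python
import time


def time_cold_and_warm(run_query, warm_runs=5):
    """Time the first call separately from the calls that follow it.

    `run_query` is a hypothetical callable (an assumption of this sketch)
    that runs the query and consumes its result.
    """
    # First call: caches may not yet hold the data this query touches.
    start = time.perf_counter()
    run_query()
    cold_seconds = time.perf_counter() - start

    # Later calls: the same data has just been read, so it is more
    # likely to be served from memory.
    warm_seconds = []
    for _ in range(warm_runs):
        start = time.perf_counter()
        run_query()
        warm_seconds.append(time.perf_counter() - start)

    return cold_seconds, warm_seconds
```

If the first timing is much larger than the later ones, caching is a plausible contributor; if they are all similar, the difference probably comes from somewhere else.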

Joel · November 24, 2020, 3:09pm

a few thoughts

these are very different environments
my first guess is that the query you are measuring is very fast (we know it takes no more than 8 ms, right?) and that most of what you are measuring right now is the overhead each environment adds to start/stop a timer and to send/receive a query (not the query itself)
these times are close to the limits of the timer resolution; at this level of granularity the host operating system (and other tasks running on it) will introduce timing variance
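Since a single sample at this scale is noisy, repeating the measurement and looking at the spread helps. This is a minimal sketch, assuming `run_query` is a hypothetical zero-argument callable that sends the query and reads the result; `measure` is a name made up for this example. It uses `time.perf_counter_ns`, which is a high-resolution clock intended for measuring short intervals.

```python
import statistics
import time


def measure(run_query, repeats=100):
    """Run `run_query` repeatedly and summarise the timings in milliseconds.

    `run_query` is a hypothetical callable (an assumption of this sketch).
    """
    samples_ms = []
    for _ in range(repeats):
        start = time.perf_counter_ns()
        run_query()
        elapsed_ns = time.perf_counter_ns() - start
        samples_ms.append(elapsed_ns / 1_000_000)

    # The minimum approximates the best case with the least interference;
    # the median is less sensitive to occasional spikes than the mean.
    return {
        "min_ms": min(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "max_ms": max(samples_ms),
    }
```

A wide gap between `min_ms` and `max_ms` suggests that operating system scheduling or other load is contributing to the numbers.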

I suggest timing a longer-running query, one that takes more than a few seconds. Then you'll be measuring the query time instead of the startup/shutdown overhead. Make sure everything else is the same for both tests and remove all other variables (no load on the client, no load on the server, run both tests on the same machine as the database, etc.).

If we were comparing language drivers, you could start a timer, open the db connection, run 10,000 fast queries, then close the connection and stop the timer (you are comparing to the browser, so I didn't suggest this).
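The batch approach described above could be sketched roughly as follows. The three callables are hypothetical placeholders (assumptions of this sketch, not driver API): `open_connection` returns some connection object, `run_query` runs one fast query on it, and `close_connection` releases it. With the Python driver they might wrap opening a session, running the Cypher statement and consuming its result, and closing the session.

```python
import time


def time_batch(open_connection, run_query, close_connection, count=10_000):
    """Time connection setup, `count` queries, and teardown as one block.

    All three callables are hypothetical placeholders for this sketch.
    Returns (total_seconds, average_seconds_per_query).
    """
    start = time.perf_counter()
    connection = open_connection()
    try:
        for _ in range(count):
            run_query(connection)
    finally:
        # Close even if a query fails, so the connection is not leaked.
        close_connection(connection)
    total_seconds = time.perf_counter() - start

    return total_seconds, total_seconds / count
```

Averaging over many queries spreads the fixed connection cost and the timer overhead across the whole batch, so the per-query figure is less dominated by them.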
