Cypher subqueries -- why can't subqueries refer to variables from the enclosing query?

I'm in the process of moving my company's application from Neo4j 3.5 to 4.0, and I saw that with Neo4j 4.0, Cypher finally supports subqueries. This is great news, since this feature is long overdue. It's very difficult to write complex queries without subqueries, and until now APOC procedures have been required in lieu of actual subqueries.

However, reading the docs, I see the following:

  • A subquery cannot refer to variables from the enclosing query.

This seriously limits the usefulness of subqueries. APOC subqueries can be passed parameters, so they remain strictly more powerful than standard Cypher subqueries.
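To make the comparison concrete, here is a rough sketch of the APOC workaround using `apoc.cypher.run`, which accepts a map of parameters for the inner statement. The `Person` label, the `FRIEND_OF` relationship, and the `name` property are hypothetical and only serve as an illustration:

```cypher
// Hypothetical schema: (:Person)-[:FRIEND_OF]->(:Person)
MATCH (p:Person)
// Pass the outer variable p into the inner statement as the parameter $p
CALL apoc.cypher.run(
  'WITH $p AS p MATCH (p)-[:FRIEND_OF]->(f:Person) RETURN count(f) AS friendCount',
  {p: p}
) YIELD value
RETURN p.name AS name, value.friendCount AS friendCount
```

The inner statement is passed as a string and its results come back wrapped in the `value` map, which is part of why a native subquery that can see `p` directly would be preferable.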

Then I read further and see:

A correlated subquery is a subquery that uses variables defined outside of the CALL clause.

Okay great, sounds exactly like what I want. But then:

This functionality is currently only available in Neo4j Fabric. Find out more about this feature in Operations Manual → Fabric.

My understanding is that Neo4j Fabric is a way to shard data into different databases. What does that have to do with passing parameters to a subquery? I can't think of any connection. It's not like people using a single database don't have a use for subqueries.

I'm really interested to hear from someone who works for Neo4j on these design decisions, because from the outside it looks totally insane. Including subqueries without parameters is like including functions without arguments in a programming language. It misses the point entirely about why this kind of feature is useful.

Hello,

Subqueries are a feature in progress. Correlated subqueries are definitely intended and are not meant to remain Fabric-only, but they are not implemented yet. They're likely to show up with a 4.1.0 minor release. This was a matter of feature prioritization, and uncorrelated subqueries are a first step toward correlated ones.

The implementation of Fabric's correlated subqueries is different in some ways, so we could not just reuse it here. The work on Fabric's correlated subqueries will be referenced and will influence the implementation of correlated subqueries for the main project.

As for "why does Fabric even have this", the feature was needed to support dispatch of subqueries among multiple databases, as noted in the examples here.

I know it's kind of a bummer only getting half of the feature you were looking forward to. I've also been eagerly awaiting full subquery support (as well as other features we could build on it, such as actual conditional subquery structures), but we'll both have to wait just a little longer for this one.

With 4.1, correlated subqueries are here!
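As a minimal sketch of what this looks like in 4.1: a subquery can import a variable from the enclosing query by starting with a `WITH` clause that names it. The `Person` label, the `FRIEND_OF` relationship, and the `name` property below are hypothetical, chosen only for illustration:

```cypher
// Hypothetical schema: (:Person)-[:FRIEND_OF]->(:Person)
MATCH (p:Person)
CALL {
  WITH p  // import p from the enclosing query
  MATCH (p)-[:FRIEND_OF]->(f:Person)
  RETURN count(f) AS friendCount
}
RETURN p.name AS name, friendCount
```

The subquery runs once per incoming row, and the columns it returns (`friendCount` here) are added to that row in the outer query.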

Please see this article for how you can leverage this for your own use cases: