Feedback requested on proposal for Spark Cypher in Spark 3.0

Xiangrui Meng of Databricks recently posted this on the Apache Spark project users list:

Databricks and Neo4j contributors are looking to bring Cypher queries into the core Spark project as part of Spark 3.0 (slated for release mid-year 2019). This will build on elements from the Cypher for Apache Spark and GraphFrames projects.

All the details are in the links in Xiangrui's e-mail.

It would be great to see Neo4j and Spark community users express their support for, or add feedback on, this Spark Project Improvement Proposal before it goes to a vote in the Spark dev community.

The more detail you can provide on your interest in this, the better, but a simple +1 in reply to Xiangrui's post would be just great if you are short of time ...

Thanks, Alastair



It appears that the list is not the right medium for replying to Xiangrui's post with feedback or messages of support.

Please take the following steps to comment on the Spark users list:

  1. Subscribe to user@ by sending an email to

  2. Go to this link and click reply -> reply via mail client.

Thanks, and apologies for the mix-up.

Sorry, but this appears not to be a simple process.

If you do Step 1, you will get a mail that requires you to reply to confirm. Then and only then will you be able to perform Step 2 (reply to the users list).

Following user comments (thanks to everyone who pitched in with feedback), Xiangrui launched a vote on the proposal on the Spark dev list, and it closed yesterday with the following result:

Hi all,

The vote passed with the following +1s (* = binding) and no 0s/-1s:

  • Denny Lee
  • Jules Damji
  • Xiao Li*
  • Dongjoon Hyun
  • Mingjie Tang
  • Yanbo Liang*
  • Marco Gaido
  • Joseph Bradley*
  • Xiangrui Meng*

Please watch SPARK-25994 and join future discussions there. Thanks!


The binding votes are those cast by Apache Spark PMC members. This is a great outcome, reflecting a ton of work from various contributors and backers.

There's going to be a discussion about how Cypher can feed into the proposed international standard GQL at the forthcoming Fifth openCypher Implementers' Meeting in Berlin in early March.

This news about Spark Cypher adds to the importance of making the long-term transition from Cypher to GQL as easy as possible. (There are also reports of one or two additional industrial implementations of Cypher in the works.) The ever-growing interest in a standard graph query language shows how graph data management is beginning to go mainstream, in my view.