I need help troubleshooting a rather weird error. I have set up a (single-instance) Azure VM running Neo4j, following the official documentation, to feed data to an Azure Databricks cluster running Spark. I connected to the Neo4j VM via HTTP on port 7474 to populate it with some data. On the Databricks cluster, I installed the Neo4j Spark connector and followed this documentation, basically just setting the connection address and login credentials as Spark configuration parameters.
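For reference, the Spark config on the cluster looks roughly like this (host and password are placeholders; the `spark.neo4j.bolt.*` keys are the ones I took from the connector docs):

```
spark.neo4j.bolt.url bolt://<vm-public-ip>:7687
spark.neo4j.bolt.user neo4j
spark.neo4j.bolt.password <password>
```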
When I run a sample query via the Spark connector on the Databricks cluster, I can successfully establish a connection; however, it only returns empty data:
```scala
%scala
import org.neo4j.spark._

val neo = Neo4j(sc)
// => neo: org.neo4j.spark.Neo4j = org.neo4j.spark.Neo4j@7c444d23
```
```scala
%scala
val rdd = neo.cypher("MATCH (n:Person) RETURN id(n) as id ").loadRowRdd
rdd.count
// => rdd: org.apache.spark.rdd.RDD[org.apache.spark.sql.Row] = Neo4jRDD partitions Partitions(1,9223372036854775807,9223372036854775807,None) MATCH (n:Person) RETURN id(n) as id using Map()
// => res1: Long = 0
```
The same happens when I load the result as a DataFrame instead of an RDD.
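The cell was essentially the schema-less `loadDataFrame` variant (reconstructed here; it is the call the exception below names):

```scala
%scala
// same Cypher query, but loaded as a DataFrame without an explicit schema
val df = neo.cypher("MATCH (n:Person) RETURN id(n) as id").loadDataFrame
```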
This fails with:

```
java.lang.RuntimeException: Cannot infer schema-types from empty result, please use loadDataFrame(schema: (String,String)*)
```
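For completeness, the schema-typed variant the exception asks for would look roughly like this (the `"id" -> "long"` pair is my assumption, based on the query returning `id(n)`); as I understand it, this only skips schema inference and would still yield a 0-row DataFrame here:

```scala
%scala
// loadDataFrame with an explicit schema, as the exception suggests;
// the ("id", "long") pair is assumed from the query returning id(n)
val dfTyped = neo
  .cypher("MATCH (n:Person) RETURN id(n) as id")
  .loadDataFrame("id" -> "long")
```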
I can confirm that the query should in fact not return an empty DataFrame: connecting to the remote VM from my local Neo4j Desktop and running the same query there returns the expected `Person` nodes.
Where is my mistake here? Thanks in advance!
(For logs and specs, see below.)