Importing Data from Hive into Neo4j through a Spark application

Hello everyone,

I am fairly new to Neo4j and the world of graph databases, and I have some questions about importing data into Neo4j.

In my case, I have several related tables in Hive that I would like to load into Neo4j. I wonder whether I should write a Spark 2 application that acts as a bridge between the two technologies (e.g. transforming the Hive data with DataFrames and writing it through the Neo4j connector), or whether it is possible to convert the Hive contents into nodes and edges directly using the Hive JDBC driver. The second approach would be similar to the test described in this community post.
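To make the question concrete, here is a minimal sketch of the table-to-graph mapping that either approach would have to perform. It uses plain Python only; the `customers` and `orders` tables, their columns, the labels and the `PLACED` relationship type are all hypothetical stand-ins for the real Hive schema, and the helper functions are illustrative, not part of any Neo4j or Spark API.

```python
# Hypothetical rows standing in for two related Hive tables. In practice they
# would come from Spark DataFrames or from a Hive JDBC result set.
customers = [
    {"customer_id": 1, "name": "Ana"},
    {"customer_id": 2, "name": "Luis"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 25.0},
    {"order_id": 11, "customer_id": 1, "amount": 40.0},
    {"order_id": 12, "customer_id": 2, "amount": 15.5},
]


def node_query(label, key):
    """Build a batched, parameterized Cypher MERGE for one node label."""
    return (
        f"UNWIND $rows AS row "
        f"MERGE (n:{label} {{{key}: row.{key}}}) "
        f"SET n += row"
    )


def edge_query(src_label, src_key, rel_type, dst_label, dst_key):
    """Build a batched Cypher MERGE that links already-loaded nodes."""
    return (
        f"UNWIND $rows AS row "
        f"MATCH (a:{src_label} {{{src_key}: row.{src_key}}}) "
        f"MATCH (b:{dst_label} {{{dst_key}: row.{dst_key}}}) "
        f"MERGE (a)-[:{rel_type}]->(b)"
    )


# Each table becomes a set of nodes; the foreign key becomes a relationship,
# so it is dropped from the Order node properties.
order_nodes = [
    {k: v for k, v in o.items() if k != "customer_id"} for o in orders
]
edge_rows = [
    {"customer_id": o["customer_id"], "order_id": o["order_id"]} for o in orders
]

# (query, rows) pairs, in load order: nodes first, then relationships.
statements = [
    (node_query("Customer", "customer_id"), customers),
    (node_query("Order", "order_id"), order_nodes),
    (edge_query("Customer", "customer_id", "PLACED", "Order", "order_id"), edge_rows),
]

for query, rows in statements:
    print(query, "--", len(rows), "rows")
```

Each `(query, rows)` pair would then be sent to Neo4j with `rows` bound to the `$rows` parameter, for example through the official Python driver's `session.run(query, rows=rows)`. That step is left out here because it needs a running database.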

As I have not found many examples that include a Spark application, what would be the best approach in this case? Is it possible to load the data directly from Hive?

Furthermore, is it worth using another external tool such as StreamSets Data Collector?
I saw this video by their Technical Director, Pat Patterson, and it looks quite good.

Thanks in advance,

Álvaro López

Hi Álvaro,

Thanks for the mention! I've been integrating Data Collector with Neo4j for some time now - it works well. Feel free to ask questions here, or over at our community:


