How-To: Import CSV Data with Neo4j Desktop

The LOAD CSV command reads all values as a string. No matter how the value appears in a file, it will be loaded as a string with LOAD CSV. So, before we import, we want to ensure we convert any values that are non-string.

There are a variety of conversion functions in Cypher. The ones we will use for this exercise are as follows:

  • toInteger(): converts a value to an integer.

  • toFloat(): converts a value to a float (in this case, for monetary amounts).

  • datetime(): converts a value to a datetime.

We will look at the values in each CSV file to determine what needs to be converted.

Products.csv

The values in the products.csv files are for product ID, product name, and unit cost. Product ID looks like an integer value that increases with each row, so we can convert this to an integer using the toInteger() function in Cypher. Product name can remain a string since it consists of characters. The final column is the product unit cost. Though the sample values from our inspection are all whole numbers, we know that monetary amounts often have decimal place values, so we will convert these values to floats using the toFloat() function.

We can see the Cypher to handle all of these conversions below; however, we are still not loading the values into Neo4j yet. We are just viewing the CSV files with converted values.

LOAD CSV FROM 'file:///desktop-csv-import/products.csv' AS row
WITH toInteger(row[0]) AS productId, row[1] AS productName, toFloat(row[2]) AS unitCost
RETURN productId, productName, unitCost
LIMIT 3

Note that we are using collection positions (row[0], row[1], row[2]) to refer to the columns in the row and improve readability by using aliases to reference them in the return. In a file that has no headers, this is how to reference values in each position.

Orders.csv

The values in the orders.csv (per the column names) are for orderID, orderDate, and shipCountry. Again, we can evaluate the values and determine any conversions to apply.

OrderID looks like an integer, so we can convert that using the toInteger() function. The orderDate column is certainly in a date format and will require us to format it using the datetime() function. Finally, the shipCountry values are characters, so we can leave that as a string.

If you are using a version of Neo4j prior to 3.4, the datetime() function will not be available. That function was released in 3.4 and is supported from that version forward.

Just as we did with the last CSV file, let us look at the results of these conversions without importing the data.

LOAD CSV WITH HEADERS FROM 'file:///desktop-csv-import/orders.csv' AS row
WITH toInteger(row.orderID) AS orderId, datetime(replace(row.orderDate,' ','T')) AS orderDate, row.shipCountry AS country
RETURN orderId, orderDate, country
LIMIT 5

There was one tricky thing with this CSV in the orderDate column. Neo4j’s datetime uses the ISO 8601 format which uses the delimiter T between the date and time values. The CSV file does not have the 'T' joining the date and time values but has a space between them instead. We used the replace() function to change the space to the character 'T' and get the string into the expected format. Then, we wrapped the datetime() function around that to convert the changed string to a datetime value.

Order-details.csv

The values in the order-details.csv (from column names) are for productID, orderID, and quantity. Let us look at which ones need to be converted.

Our product ID is also from our products.csv file, where we converted that value to an integer. We will do the same here to ensure we match formats. The order ID field contains values from our orders.csv file, so we will match our previous conversion and translate this field to an integer, as well. The quantity field in this file is a numeric value. We can convert this to an integer with the toInteger() function we have been using.

The results of these conversions are in the code below. Remember that we still are not loading any data yet.

LOAD CSV WITH HEADERS FROM 'file:///desktop-csv-import/order-details.csv' AS row
WITH toInteger(row.productID) AS productId, toInteger(row.orderID) AS orderId, toInteger(row.quantity) AS quantityOrdered
RETURN productId, orderId, quantityOrdered
LIMIT 8

This is a companion discussion topic for the original entry at https://neo4j.com/developer/desktop-csv-import/