String Parsing in Cypher

I have sincerely tried to get this working, but I could use a hand. Can someone help me out?

What I would really like is a string function like split(someString, regex), but I don't see one.

Something like split("XX:22 13", <regex matching ":" and spaces>) yielding the list XX, 22, 13, which I could then unwind and use in something like merge (n:Node {name:"XX", location:"22", count:"13"})-[r:Knows]->(p:Person {name:"whatever"}).

I am new to Cypher but I did try this:

load csv with headers from "file:///file.csv" as row
merge (i:Test2 {testRow:row.num})
with i, split(row.impacted_plans, ";") as planNames
unwind planNames as planName
foreach (n in planNames | split(planName, " ")) as Parts
unwind Parts as Part
foreach (x in Parts | merge(n:node{name:Part}))

the error is....
Invalid input 'p': expected 't/T' or 'e/E' (line 5, column 28 (offset: 204))
"foreach (n in planNames | split(planName, " ") as Parts"

As a side note, this also does not work the way I thought it might. Can functions not be nested?
foreach (n in split(apoc.convert.toString(split("1-2:3", ":")), "-") | merge (x:test1{name:n}))

I am kind of new to this. Can you help? Thanks.

I'd suggest breaking down each step so you can see what the return values are (type and structure). For example, just to start looking at your situation, I started with a few simple queries:

return split("1-2:3", ":")

Which returned

["1-2", "3"]

and then

unwind split("1-2:3", ":") as a
return a

which output

"1-2"
"3"

and then perhaps, combine the two (there are many ways, just sketching here...)

unwind split("1-2:3", ":") as a 
unwind split(a, "-") as b 
return b

yields this

"1"
"2"
"3"
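
Applied to the CSV load from the question, the same pattern might look like the sketch below. FOREACH only accepts updating clauses (such as MERGE or SET) in its body and cannot be aliased with AS, which is why the parser rejected the original query; chaining two UNWINDs sidesteps it. The file name, column names, and labels are copied from the question, and this has not been run against the real data.

```cypher
// Sketch only: assumes file.csv has the columns num and impacted_plans,
// as in the question.
LOAD CSV WITH HEADERS FROM "file:///file.csv" AS row
MERGE (i:Test2 {testRow: row.num})
WITH i, row
// First split on ';' to get one row per plan name ...
UNWIND split(row.impacted_plans, ";") AS planName
// ... then split each plan name on spaces to get one row per part.
UNWIND split(planName, " ") AS part
MERGE (n:node {name: part})
RETURN count(n)
```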

Cool, thanks Joel. I also found two other solutions that I will post here to help people in a similar situation. :slight_smile:

load csv with headers from "file:///data.csv" as row
merge (inc:Test2 {testRow:row.num})
with inc, row
unwind split(row.impacted_plans, ';') as plans
merge (n:plan{incident_type:split(plans,' ')[0], incYear: split(split(plans,' ')[1],'-')[0], incNumber:toInteger(split(split(plans,' ')[1],'-')[1])})
return count(n)

and

UNWIND split(row.impacted_plans,';') as plans
WITH plans, apoc.text.regexGroups(plans,'([^ ]+)\\s([0-9]+)-([0-9]+)') as els
MERGE (n:Plan {incident_type:els[0][1],year:els[0][2],number:toInteger(els[0][3])})

hope this helps out other people that might have this same question..
thanks guys for helping out.. see you..
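
For anyone wondering about the els[0][1] indexing in the second solution: apoc.text.regexGroups returns one inner list per match, with the whole match at index 0 and the capture groups after it. A small sketch, assuming the APOC library is installed; the sample value 'INC 2019-42' is made up for illustration and is not from the real CSV.

```cypher
// 'INC 2019-42' is a hypothetical sample value.
// The backslash is doubled because Cypher string literals treat '\' as an escape character.
WITH apoc.text.regexGroups('INC 2019-42', '([^ ]+)\\s([0-9]+)-([0-9]+)') AS els
RETURN els[0][0] AS fullMatch,      // the whole matched text
       els[0][1] AS incidentType,   // first capture group
       els[0][2] AS year,           // second capture group
       toInteger(els[0][3]) AS number  // third capture group, converted to an integer
```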

Your idea is easier to read though. :slight_smile:

I'll second @Joel's comment to break up the Cypher and do returns as you go. It's a good way to ensure that what you think is happening is in fact happening.

Another option, if you know which characters you want to split on and there are several of them: replace them all with a single character, so that a single split on the cleaned string does the job.

Something like

WITH 'AA:12 34-BB' as initialString
WITH  replace(replace(replace(initialString, ':', ','), '-', ','), ' ', ',') as commaDelimitedString 
// => 'AA,12,34,BB'
RETURN split(commaDelimitedString, ',') as result

Just another way to do it. Some people don't like the nested replace, but sometimes it's the right tool.
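
If the APOC library is available, the regex-based split asked for at the top of the thread can also be approximated with apoc.text.split, which takes a regular expression as the separator. A minimal sketch on the same sample string, assuming APOC is installed:

```cypher
// Requires the APOC library.
// The character class [: -] matches ':', ' ' or '-'; the '+' treats a run of them as one separator.
WITH 'AA:12 34-BB' AS initialString
RETURN apoc.text.split(initialString, '[: -]+') AS result
// => ['AA', '12', '34', 'BB']
```

This avoids the nested replace calls, at the cost of depending on APOC.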

Hope that helps. Either way, enjoy graphing :slight_smile:

Mike

Thanks Mike! Adding this to my know-how palette. Thanks for taking the time to help me out!
