I have some data in two languages and I'm trying to write a query to add data. I already made a thread about modelling this data here: Best way to model this data
Because the data can come in English only, Chinese only, or both, I need to add it in different ways in order to merge people correctly, but I'm not sure I'm going about this in the best way. If it makes any difference, I am running the queries from Node.js.
Here's a simplified example of my incoming data:
{
  title: { zh: "《如夢之夢》", en: "A Dream Like a Dream" },
  entities: [
    {
      name: { zh: "賴聲川", en: "Stan Lai" },
      role: { zh: "編劇及導演", en: "Playwright / Director" },
    },
    {
      name: { zh: "馮蔚衡", en: "Fung Wai Hang" },
      role: { zh: "聯合導演", en: "Co-director" },
    },
    {
      name: { zh: "蘇玉華", en: "Louisa So" },
      role: { zh: "特邀主演", en: "Guest Leading Cast" },
    },
    {
      name: { zh: "雷思蘭", en: "Lui Si Lan" },
      role: { zh: "演員", en: "Cast" },
    },
  ],
}
This describes a 'Show', with a bunch of people credited in it. I want to create an Entity node for each.
For this data: If an entity with the same Chinese name but no English name exists, I want to merge and add the English name. If an entity with the same English name but no Chinese name exists, I want to merge and add the Chinese name. If an entity with the same English name but a different Chinese name exists, I want to create a new entity. Same for roles.
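To make the rules above concrete, here is a minimal in-memory sketch in JavaScript. It is not Cypher and does not touch the database: `resolveEntity`, the `{ zh, en }` record shape, and the `action` values are all hypothetical names used only for illustration. It assumes the incoming record has both names, and the "reuse" case (a node that already has both names) is my own assumption about what should happen there.

```javascript
// Hypothetical helper: decide what to do with one incoming bilingual name.
// `existing` is an in-memory stand-in for the Entity nodes already stored,
// each shaped like { zh: string|null, en: string|null }.
function resolveEntity(incoming, existing) {
  const { zh, en } = incoming;

  // Assumption: a node that already carries both names is reused unchanged.
  const full = existing.find((e) => e.zh === zh && e.en === en);
  if (full) return { action: "reuse", target: full };

  // Same Chinese name but no English name yet: add the English name.
  const zhOnly = existing.find((e) => e.zh === zh && e.en == null);
  if (zhOnly) return { action: "merge", target: zhOnly, set: { en } };

  // Same English name but no Chinese name yet: add the Chinese name.
  const enOnly = existing.find((e) => e.en === en && e.zh == null);
  if (enOnly) return { action: "merge", target: enOnly, set: { zh } };

  // Anything else (e.g. same English name, different Chinese name): new entity.
  return { action: "create", set: { zh, en } };
}
```

Note that if both a Chinese-only and an English-only node exist for the same person, this sketch only picks the Chinese-only one; combining the two nodes is a separate step.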
If my data had only Chinese:
name: { zh: "賴聲川", en: undefined },
role: { zh: "編劇及導演", en: undefined },
I would want to merge, but retain the English name if it already existed, simple enough. Same in reverse if I only had English. But if I do this, I will end up in some cases with two nodes for one person - sometimes they're credited only with an English name, sometimes with only a Chinese name.
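As a rough illustration of how the single-language case produces duplicates, here is a small in-memory JavaScript sketch. `upsertSingleLanguage` and the `store` array are hypothetical stand-ins for the Entity nodes, not anything from Neo4j or its driver.

```javascript
// In-memory stand-in for the Entity nodes; each record is { zh, en }.
// Hypothetical helper: match on the one name we have, create if missing.
function upsertSingleLanguage(store, lang, value) {
  // Any name already stored in the other language is left untouched.
  let node = store.find((e) => e[lang] === value);
  if (!node) {
    node = { zh: null, en: null, [lang]: value };
    store.push(node);
  }
  return node;
}

const store = [];
upsertSingleLanguage(store, "zh", "賴聲川"); // Chinese-only credit
upsertSingleLanguage(store, "en", "Stan Lai"); // English-only credit
// `store` now holds two separate records for what may be one person.
```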
Then later in the data, perhaps I will have a case where I have both their English and Chinese names together. In that case I will want to go and find those two nodes and merge them. So I found I can do something like this:
MATCH (a:Entity {name_zh: $name_zh}), (b:Entity {name_en: $name_en})
WHERE a <> b
CALL apoc.refactor.mergeNodes([a,b], {mergeRels: true, properties:"combine"}) YIELD node
RETURN node
But to be honest I'm not sure if I'm using this correctly, or if this is a 'bad idea'.
I'm having a hard time working out how to write the above logic using FOREACH and CASE, which I've seen is how conditional logic is meant to be done in Cypher. I'd be grateful for any help, but I will also continue trying to work it out myself - I'll add a query code example later.