Create a generalization node and corresponding relationships

Hello guys,

I have several graphs with the same structure: one central node called Perception_Group and several peripheral nodes called Perception. I need to compare Perception_Groups (PGs) by measuring the distance between the values of their peripheral Perception nodes. If the difference between two PGs is less than 0.2, I want to create a Generalization (G) node and connect all PGs that are similar to each other to it, using an INSTANCEOF relationship from each PG to the G node.
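To make the comparison concrete, here is a minimal read-only sketch of the distance measure, assuming the distance is the sum of absolute differences between Perception values that share a `name`, and using node 801 as the reference group. The column aliases `candidate`, `matchedPerceptions`, and `diff` are only illustrative.

```cypher
// Read-only: list every other group with its distance to group 801.
MATCH (pg1:Perception_Group)-[:RELATED]-(p1:Perception)
WHERE id(pg1) = 801
MATCH (pg2:Perception_Group)-[:RELATED]-(p2:Perception)
WHERE pg1 <> pg2 AND p1.name = p2.name
// Sum of absolute differences over Perception nodes that share a name.
RETURN id(pg2) AS candidate,
       count(*) AS matchedPerceptions,
       sum(abs(p1.value - p2.value)) AS diff
ORDER BY diff;
```

Because the join is on `p1.name = p2.name`, a Perception that exists in only one of the two groups contributes nothing to `diff`. The `matchedPerceptions` column shows how many pairs were actually compared.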

The following Cypher creates several Generalization nodes instead of just one for similar PGs.

Can you help me fix it or show me an alternative?

Thanks

MATCH (pg1:Perception_Group)-[r:RELATED]-(p1:Perception)
WHERE ID(pg1) = 801
WITH pg1, p1
MATCH (pg2:Perception_Group)-[:RELATED]-(p2:Perception)
WHERE pg1 <> pg2 AND p1.name = p2.name
WITH pg1,pg2, sum(abs(p1.value-p2.value)) AS diff
WHERE diff < 0.2
WITH  pg1,pg2,diff
MERGE (pg1)-[:INSTANCEOF {subType:'UE5'}]->(:GENERALIZATION{date: datetime()})<-[:INSTANCEOF {subType:'UE5'}]-(pg2);

I got this one working, but I feel there could be a shorter version. Also, how can I avoid duplicating the Generalization node every time I run the query?

MATCH (pg1:Perception_Group)-[r:RELATED]-(p1:Perception)
WHERE ID(pg1) = 801
WITH pg1, p1
MATCH (pg2:Perception_Group)-[:RELATED]-(p2:Perception)
WHERE pg1 <> pg2 AND p1.name = p2.name
WITH pg1,pg2, sum(abs(p1.value-p2.value)) AS diff
WHERE diff < 0.2
WITH  pg1,pg2,diff
MERGE (g:GENERALIZATION{date: datetime()})
MERGE (g)<-[:INSTANCEOF {subType:'UE5'}]-(pg2)
MERGE (g)<-[:INSTANCEOF {subType:'UE5'}]-(pg1);

I am trying this one to avoid duplicating Generalization nodes across multiple runs, but it is not working.

MATCH (pg1:Perception_Group)-[r:RELATED]-(p1:Perception)
WHERE ID(pg1) = 801
WITH pg1, p1
MATCH (pg2:Perception_Group)-[:RELATED]-(p2:Perception)
WHERE pg1 <> pg2 AND p1.name = p2.name
WITH pg1,pg2, sum(abs(p1.value-p2.value)) AS diff
WHERE diff < 0.2
WITH  pg1,pg2,diff
OPTIONAL MATCH (g:Generalization {id:$id1})-[:INSTANCEOF]-(pg1)
WITH pg1,pg2,diff,g
CALL apoc.do.when(g is null, 'MERGE (g1:GENERALIZATION) RETURN g1', 'RETURN g', {id:$id1, g:GENERALIZATION}) YIELD value
WITH value.g as g, pg1,pg2
MERGE (g)<-[:INSTANCEOF {subType:'UE5'}]-(pg2)
MERGE (g)<-[:INSTANCEOF {subType:'UE5'}]-(pg1);

This query is working. Any suggestions for optimization?

MATCH (pg1:Perception_Group)-[r:RELATED]-(p1:Perception)
WHERE ID(pg1) = 801
WITH pg1, p1
MATCH (pg2:Perception_Group)-[:RELATED]-(p2:Perception)
WHERE pg1 <> pg2 AND p1.name = p2.name
WITH pg1,pg2, sum(abs(p1.value-p2.value)) AS diff
WHERE diff < 0.2
WITH  pg1,pg2
OPTIONAL MATCH (g:Generalization)-[:INSTANCEOF]-(pg1)
WITH pg1,pg2,g
CALL apoc.do.when(g is null, 'MERGE (g:GENERALIZATION) RETURN g', 'RETURN g', {}) YIELD value
WITH value.g as g, pg1,pg2
MERGE (g)<-[:INSTANCEOF {subType:'UE5'}]-(pg2)
MERGE (g)<-[:INSTANCEOF {subType:'UE5'}]-(pg1);
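
One possible shorter variant, as a sketch rather than a definitive answer, drops APOC and relies on MERGE over a pattern whose start node is already bound. If pg1 already has an INSTANCEOF relationship to a GENERALIZATION node, that node is reused; otherwise the node and the relationship are created together. It assumes the label is spelled `GENERALIZATION` everywhere (labels are case-sensitive, so `:Generalization` and `:GENERALIZATION` are different labels). The `date` property is set only on creation, because a MERGE keyed on `datetime()` cannot match a node created in an earlier run.

```cypher
MATCH (pg1:Perception_Group)-[:RELATED]-(p1:Perception)
WHERE id(pg1) = 801
MATCH (pg2:Perception_Group)-[:RELATED]-(p2:Perception)
WHERE pg1 <> pg2 AND p1.name = p2.name
WITH pg1, pg2, sum(abs(p1.value - p2.value)) AS diff
WHERE diff < 0.2
// Collapse to one row per pg1 so the MERGE below runs once.
WITH pg1, collect(pg2) AS similar
// Reuse pg1's existing generalization, or create node and relationship.
MERGE (pg1)-[:INSTANCEOF {subType: 'UE5'}]->(g:GENERALIZATION)
  ON CREATE SET g.date = datetime()
WITH g, similar
UNWIND similar AS pg2
// Attach every similar group to the same generalization node.
MERGE (pg2)-[:INSTANCEOF {subType: 'UE5'}]->(g);
```

Running this twice should not add a second GENERALIZATION node for group 801. If pg1 is already linked to more than one GENERALIZATION node, the MERGE matches each of them and the similar groups are attached to all of them.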