Native projection property problem

Hello everyone,

I have been working on a graph project with more than 660,000,000 nodes and 1,500,000,000 relationships. I am trying to create a native projection using CALL gds.graph.project(). Some of my properties are arrays. When I write a projection like:
{
label: "Customer",
properties:
{
AGE: {defaultValue: 0},
GENDER_ARR: {defaultValue: [0,0,0]}
}
},
..rest of my code..

I get an error which says:
Failed to invoke procedure gds.graph.project: Caused by: java.lang.IllegalArgumentException: Specifying different default values for the same property with identical neoPropertyKey is not allowed, found propertyKey: GENDER_ARR with conflicting default values: [J@376f4bc8, [J@60ee48b0

But, this is the first time I am setting a default value.

After the projection, I have to run the fastRP algorithm on my graph, so the projection must succeed with the correct property value types. I tried setting the GENDER_ARR default value to null. The projection then succeeded, but I got an error when running fastRP. So the default value in the projection has to have the same type as the property value, which is an array.
If there are any suggestions or solutions for this issue, I would be happy to hear them. Thank you.

P.S. I am using Neo4j 4.4.23 and GDS 2.3.1.

I have the same problem ....

Hey,
@irem.tunaliarslan could you share your whole projection query? (at least the gds call part)

Independent of your project query, what error are you facing on fastRP?
Did you check your customers for missing GENDER_ARR properties?

@berkay.coskuner98 Are you running the same query and facing the same problem?

Hi @florentin_dorre ,

My query is logically the same, but the content is different. The problem occurs with array properties. In short, you cannot give an array property a default value. Let's leave the fastRP part out of it. I cannot use defaultValue on array properties with gds.graph.project(). That's the problem I am facing.

Hi @florentin_dorre , thank you for responding.

Whole projection query:

CALL gds.graph.project(
    "graph",
    {
        Label1:
        {
            label: "Label1",
            properties: 
            {
                AGE: {defaultValue: 0},
                GENDER_ARR: {defaultValue: [0,0,0]},
                CUSTOMER_FLG: {defaultValue: 0},
                AMOUNT: {defaultValue: 0.0}
            }
        },
        Label2:
        {
            label: "Label2",
            properties: 
            {
                AGE: {defaultValue: 0},
                GENDER_ARR: {defaultValue: [0,0,0]},
                CUSTOMER_FLG: {defaultValue: 0},
                AMOUNT: {defaultValue: 0.0}
            }
        },
        Label3:
        {
            label: "Label3",
            properties: 
            {
                AGE: {defaultValue: 0},
                GENDER_ARR: {defaultValue: [0,0,0]},
                CUSTOMER_FLG: {defaultValue: 0},
                AMOUNT: {defaultValue: 0.0}
            }
        },
        Label4:
        {
            label: "Label4",
            properties: 
            {
                AGE: {defaultValue: 0},
                GENDER_ARR: {defaultValue: [0,0,0]},
                CUSTOMER_FLG: {defaultValue: 0},
                AMOUNT: {defaultValue: 0.0}
            }
        },
        Label5:
        {
            label: "Label5",
            properties: 
            {
                AGE: {defaultValue: 0},
                GENDER_ARR: {defaultValue: [0,0,0]},
                CUSTOMER_FLG: {defaultValue: 0},
                AMOUNT: {defaultValue: 0.0}
            }
        },
        Label6:
        {
            label: "Label6",
            properties: 
            {
                AGE: {defaultValue: 0},
                GENDER_ARR: {defaultValue: [0,0,0]},
                CUSTOMER_FLG: {defaultValue: 0},
                AMOUNT: {defaultValue: 0.0}
            }
        },
        Label7:
        {
            label: "Label7",
            properties: 
            {
                AGE: {defaultValue: 0},
                GENDER_ARR: {defaultValue: [0,0,0]},
                CUSTOMER_FLG: {defaultValue: 0},
                AMOUNT: {defaultValue: 0.0}
            }
        },
        Label8:
        {
            label: "Label8",
            properties: 
            {
                AGE: {defaultValue: 0},
                GENDER_ARR: {defaultValue: [0,0,0]},
                CUSTOMER_FLG: {defaultValue: 0},
                AMOUNT: {defaultValue: 0.0}
            }
        }
    },
    ["*"]
    
)YIELD graphName AS graph, nodeProjection, nodeCount AS nodes, relationshipProjection, relationshipCount AS rels

This query produces the error I mentioned above.

Here is the fastRP query:

CALL gds.fastRP.stream(
    "graph",
    {
    embeddingDimension: 16,
    nodeLabels: ["*"],
    relationshipTypes: ["*"],
    nodeSelfInfluence: 0.6,
    propertyRatio: 0.1,
    featureProperties: ["AGE", "GENDER_ARR", "CUSTOMER_FLG", "AMOUNT"],
    randomSeed: 7
    }
    ) YIELD nodeId, embedding

And output of this query is:

Thank you! @irem.tunaliarslan

I found the bug in our projection code. Essentially, we check whether the defaultValue objects are the same. However, in this query every [0,0,0] creates a new object under the hood.
You should be able to work around the bug by using
WITH [0,0,0] AS defaultGenderArr CALL gds.graph.project(... GENDER_ARR: {defaultValue: defaultGenderArr}, ...) (this should also work for @berkay.coskuner98).
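
For anyone wondering what "the same object" means here, below is a minimal Java sketch of the general pitfall. It is not the actual GDS code: the class and method names (DefaultValueComparison, sameByObjectEquals, sameByContent) are made up for illustration. It only shows that two long[] arrays with identical contents are not equal under Object.equals, and that the [J@376f4bc8 in the error message is how Java prints a long[] by default.

```java
import java.util.Arrays;

// Hypothetical illustration only, not taken from the GDS code base.
class DefaultValueComparison {

    // Object.equals on arrays compares references, so two separately
    // created arrays are never equal, even with identical contents.
    public static boolean sameByObjectEquals(Object a, Object b) {
        return a.equals(b);
    }

    // Content-based comparison for long[] values; other types fall back to equals.
    public static boolean sameByContent(Object a, Object b) {
        if (a instanceof long[] && b instanceof long[]) {
            return Arrays.equals((long[]) a, (long[]) b);
        }
        return a.equals(b);
    }

    public static void main(String[] args) {
        long[] first = {0L, 0L, 0L};
        long[] second = {0L, 0L, 0L};

        System.out.println(sameByObjectEquals(first, second)); // false
        System.out.println(sameByContent(first, second));      // true
        // Prints something like [J@376f4bc8: "[J" is the JVM name for long[],
        // followed by an identity hash that differs per object.
        System.out.println(first);
    }
}
```

A comparison like sameByContent treats two [0,0,0] defaults as identical no matter how many array objects were created for them.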

Another option is to set the missing GENDER_ARR values in the data:
MATCH (n) WHERE (n:Label1 OR n:Label2 OR n:Label3 ... OR n:Label8) AND n.GENDER_ARR IS NULL SET n.GENDER_ARR = [0,0,0]

I will also create a card on our side to get this bug fixed.

Hi @florentin_dorre ,

Thank you for your reply. I will try your solution (WITH [0,0,0]). It will probably work, but the other option is not applicable because of the amount of data I have.

Some of my colleagues opened an issue about this about a year ago, but it is probably inactive now. If you want to take a look, here is the link: Transferring graph from neo4j-community-4.4.12 to neo4j-enterprise-4.4.18 creates problem when projecting graph. (Array properties)

Once this bug is fixed, what should we do?

Thank you so much.

Hello again @florentin_dorre , thank you for your advice.
I tried what you suggested, but I got the same error. Here is the query that I wrote:

WITH [0,0,0] AS defaultGenderArr
CALL gds.graph.project(
    "graph",
    {
        Label1:
        {
            label: "Label1",
            properties: 
            {
                AGE: {defaultValue: 0},
                GENDER_ARR: {defaultValue: defaultGenderArr},
                CUSTOMER_FLG: {defaultValue: 0},
                AMOUNT: {defaultValue: 0.0}
            }
        },
        Label2:
        {
            label: "Label2",
            properties: 
            {
                AGE: {defaultValue: 0},
                GENDER_ARR: {defaultValue: defaultGenderArr},
                CUSTOMER_FLG: {defaultValue: 0},
                AMOUNT: {defaultValue: 0.0}
            }
        },
        Label3:
        {
            label: "Label3",
            properties: 
            {
                AGE: {defaultValue: 0},
                GENDER_ARR: {defaultValue: defaultGenderArr},
                CUSTOMER_FLG: {defaultValue: 0},
                AMOUNT: {defaultValue: 0.0}
            }
        },
        Label4:
        {
            label: "Label4",
            properties: 
            {
                AGE: {defaultValue: 0},
                GENDER_ARR: {defaultValue: defaultGenderArr},
                CUSTOMER_FLG: {defaultValue: 0},
                AMOUNT: {defaultValue: 0.0}
            }
        },
        Label5:
        {
            label: "Label5",
            properties: 
            {
                AGE: {defaultValue: 0},
                GENDER_ARR: {defaultValue: defaultGenderArr},
                CUSTOMER_FLG: {defaultValue: 0},
                AMOUNT: {defaultValue: 0.0}
            }
        },
        Label6:
        {
            label: "Label6",
            properties: 
            {
                AGE: {defaultValue: 0},
                GENDER_ARR: {defaultValue: defaultGenderArr},
                CUSTOMER_FLG: {defaultValue: 0},
                AMOUNT: {defaultValue: 0.0}
            }
        },
        Label7:
        {
            label: "Label7",
            properties: 
            {
                AGE: {defaultValue: 0},
                GENDER_ARR: {defaultValue: defaultGenderArr},
                CUSTOMER_FLG: {defaultValue: 0},
                AMOUNT: {defaultValue: 0.0}
            }
        },
        Label8:
        {
            label: "Label8",
            properties: 
            {
                AGE: {defaultValue: 0},
                GENDER_ARR: {defaultValue: defaultGenderArr},
                CUSTOMER_FLG: {defaultValue: 0},
                AMOUNT: {defaultValue: 0.0}
            }
        }
    },
    ["*"]
    
)YIELD graphName AS graph, nodeProjection, nodeCount AS nodes, relationshipProjection, relationshipCount AS rels
RETURN graph, nodes, rels;

And here is the error:

As you can see I get the same error. Thank you.

Sorry, I could not find an immediate workaround for you.

I merged the bug fix and it will be part of the 2.6.3 release.
You can find the commit at https://github.com/neo-technology/graph-analytics/commit/5c52eb7b49f0a3e71e370edf1bfb93af574171b5 if you are curious.

Thank you @florentin_dorre. Is there an exact date for the release of version 2.6.3?

I would say in the next 2 weeks.

Where can we find out that the new version has arrived? And when the new version comes, we'll have to update our entire system, right?

You can find the latest version on the GitHub Releases page of neo4j/graph-data-science or at the Neo4j Deployment Center.

I would definitely recommend updating to at least the latest 4.4 patch version.
For 4.4.23, replacing the GDS jar manually could also work.

Thank you for all your responses and consideration @florentin_dorre

Hi @florentin_dorre , we updated our GDS version to 2.6.3.

Our GDS version is now 2.6.3 and our Neo4j version is 4.4.23.

When we try to run a projection query that includes arrays, we get the error:

Can you help me to fix this error?

Hey @irem.tunaliarslan ,
can you share the exception stack trace?
It's stored in the debug log.

Hello @florentin_dorre , sorry for the late response.

Unfortunately, I can't share the whole file but if you tell me in detail what you are looking for, we will try to share a part of it.

Inside the debug log, there should be a stack trace that contains gds.graph.project.
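
If sharing the whole log is not possible, one way to pull out only the relevant part is a small filter like the sketch below. It is an illustration under assumptions: the class name StackTraceExtractor, the 40-line context window, and passing the log path as the first argument are all made up here, and the number of lines after the marker that actually belong to the stack trace will vary.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Neo4j or GDS.
class StackTraceExtractor {

    // Returns every line containing the marker, plus up to `context`
    // lines that follow it (where the stack frames are expected to be).
    public static List<String> extract(List<String> lines, String marker, int context) {
        List<String> out = new ArrayList<>();
        int keepUntil = -1; // index of the last line still inside a context window
        for (int i = 0; i < lines.size(); i++) {
            if (lines.get(i).contains(marker)) {
                keepUntil = Math.max(keepUntil, i + context);
            }
            if (i <= keepUntil) {
                out.add(lines.get(i));
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java StackTraceExtractor <path-to-debug-log>");
            return;
        }
        // The path to the debug log is supplied by the caller.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        // 40 following lines is an arbitrary choice for this sketch.
        extract(lines, "gds.graph.project", 40).forEach(System.out::println);
    }
}
```

The output can then be reviewed and redacted before posting, which is easier than going through the full file.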