Apoc.text.urlencode seems to be adding extra % characters to URL for Wikidata

I am trying to follow the tutorial here: Neo4j: Enriching an existing graph by querying the Wikidata SPARQL API · Mark Needham (markhneedham.com)

When I get to the apoc.load.jsonParams section (right after Table 2), I get the following error:

Failed to invoke procedure apoc.load.jsonParams: Caused by: java.lang.RuntimeException: Can't read url or key https://query.wikidata.org/sparql?query=SELECT+*%250AWHERE+%257B+%253Fperson+wdt%253AP106+wd%253AQ10833314+%253B%250A++++++++++++++++rdfs%253Alabel+%2522Nick+Kyrgios%2522%2540en+%253B%250A++++++++++++++++wdt%253AP569+%253FdateOfBirth+%253B%250A++++++++++++++++wdt%253AP27+%255B+rdfs%253Alabel+%253FcountryName+%255D+.%250A+++++++filter%2528lang%2528%253FcountryName%2529+%253D+%2522en%2522%2529%250A%257D as json: Server returned HTTP response code: 400 for URL: https://query.wikidata.org/sparql?query=SELECT+*%250AWHERE+%257B+%253Fperson+wdt%253AP106+wd%253AQ10833314+%253B%250A++++++++++++++++rdfs%253Alabel+%2522Nick+Kyrgios%2522%2540en+%253B%250A++++++++++++++++wdt%253AP569+%253FdateOfBirth+%253B%250A++++++++++++++++wdt%253AP27+%255B+rdfs%253Alabel+%253FcountryName+%255D+.%250A+++++++filter%2528lang%2528%253FcountryName%2529+%253D+%2522en%2522%2529%250A%257D

It looks like extra "25"s are being inserted into the URL when it is sent to Wikidata ("%25" is the percent sign itself in URL encoding). I have tried cleaning the string with the replace() function, but that doesn't work.

When I manually remove the "25"s and paste the URL into my browser, it communicates with Wikidata just fine. Does anyone know how I can encode the URL without the extra "25"s?
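
One way to check whether the URL in the error has been encoded twice is to decode a piece of it once and see whether escapes are still left over. A minimal sketch in Python, outside Neo4j (the fragment is copied from the error message above; `urllib.parse.unquote` is only a stand-in decoder and is an assumption, not what Wikidata or APOC actually runs):

```python
from urllib.parse import unquote

# Fragment copied from the failing URL in the error message.
fragment = "WHERE+%257B+%253Fperson+wdt%253AP106"

# First pass turns every "%25" back into a bare "%".
once = unquote(fragment)
# Second pass decodes the escapes that the first pass uncovered.
twice = unquote(once)

print(once)
print(twice)
```

If the result of the first pass still contains escapes such as %7B and %3F, the string was percent-encoded a second time somewhere between the query and the HTTP request.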

  • Neo4j 4.2.7, Windows 10, Chrome

Can you please create a GitHub issue for APOC for this? Thank you!

Hmm, works for me:

WITH "SELECT *
WHERE { ?person wdt:P106 wd:Q10833314 ;
                rdfs:label \"" + "Nick Kyrgios" + "\"@en ;
                wdt:P569 ?dateOfBirth ;
                wdt:P27 [ rdfs:label ?countryName ] .
       filter(lang(?countryName) = \"en\")
}" AS sparql
CALL apoc.load.jsonParams(
  "https://query.wikidata.org/sparql?query=" + apoc.text.urlencode(sparql),
  { Accept: "application/sparql-results+json"},
  null
)
YIELD value
RETURN value

Hi Micheal,

I am having the same issue as reported by cobyyohacofa.

The query you just shared here does not work for me and gives the same exception. I am using the latest version of Neo4j, i.e., 4.3.4, with all the dependencies installed.
I encountered this issue while following this article: Making Sense of News, the Knowledge Graph Way

Try to use single quotes around your string so you don't need to escape the double quotes.
We're just using a library function that we expose through apoc.

URL encoding substitutes "%xx" escapes for all characters that are invalid in a URL parameter.

%25 is actually the percent sign itself, which also needs to be escaped.

Looks kinda like double encoding. What happens if you don't URLencode that string?
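
To illustrate what double encoding looks like, here is a small Python sketch, again outside Neo4j. `encode_n` is a hypothetical helper, and Python's `urllib.parse.quote` is only a stand-in encoder, not the library function that APOC wraps. It encodes a few tokens from the SPARQL query once and then twice:

```python
from urllib.parse import quote


def encode_n(text: str, times: int) -> str:
    """Percent-encode `text` repeatedly, `times` times in total."""
    for _ in range(times):
        # safe="" so that every reserved character is escaped, including "%".
        text = quote(text, safe="")
    return text


# Tokens taken from the SPARQL query in this thread.
tokens = ["?person", "wdt:P106", '"en"', "{"]

for token in tokens:
    print(token, "->", encode_n(token, 1), "->", encode_n(token, 2))
```

The twice-encoded forms, such as %253F and %2522, are the ones that appear in the error message, while the once-encoded forms (%3F, %22) are what the endpoint should receive.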

Hi Micheal,

Thanks for your reply!

After following your suggestion, my query now looks as follows:

MATCH (e:Entity)
// Prepare a SPARQL query
WITH 'SELECT *
      WHERE{
        ?item rdfs:label ?name .
        filter (?item = wd:' + e.wikiDataItemId + ')
        filter (lang(?name) = "en" ) .
      OPTIONAL{
        ?item wdt:P31 [rdfs:label ?class] .
        filter (lang(?class)="en")
      }}' AS sparql, e
// make a request to Wikidata
CALL apoc.load.jsonParams(
      'https://query.wikidata.org/sparql?query=' + 
    apoc.text.urlencode(sparql),
     { Accept: 'application/sparql-results+json'}, null)
YIELD value
UNWIND value['results']['bindings'] as row
FOREACH(ignoreme in case when row['class'] is not null then [1] else [] end | 
        MERGE (c:Class{name:row['class']['value']})
        MERGE (e)-[:INSTANCE_OF]->(c));

But it does not work either.

Failed to invoke procedure `apoc.load.jsonParams`: Caused by: java.lang.RuntimeException: Can't read url or key https://query.wikidata.org/sparql?query=SELECT+*%250D%250A++++++WHERE%257B%250D%250A++++++++%253Fitem+rdfs%253Alabel+%253Fname+.%250D%250A++++++++filter+%2528%253Fitem+%253D+wd%253AQ267298%2529%250D%250A++++++++filter+%2528lang%2528%253Fname%2529+%253D+%2522en%2522+%2529+.%250D%250A++++++OPTIONAL%257B%250D%250A++++++++%253Fitem+wdt%253AP31+%255Brdfs%253Alabel+%253Fclass%255D+.%250D%250A++++++++filter+%2528lang%2528%253Fclass%2529%253D%2522en%2522%2529%250D%250A++++++%257D%257D as json: Server returned HTTP response code: 400 for URL: https://query.wikidata.org/sparql?query=SELECT+*%250D%250A++++++WHERE%257B%250D%250A++++++++%253Fitem+rdfs%253Alabel+%253Fname+.%250D%250A++++++++filter+%2528%253Fitem+%253D+wd%253AQ267298%2529%250D%250A++++++++filter+%2528lang%2528%253Fname%2529+%253D+%2522en%2522+%2529+.%250D%250A++++++OPTIONAL%257B%250D%250A++++++++%253Fitem+wdt%253AP31+%255Brdfs%253Alabel+%253Fclass%255D+.%250D%250A++++++++filter+%2528lang%2528%253Fclass%2529%253D%2522en%2522%2529%250D%250A++++++%257D%257D

When I don't use apoc.text.urlencode, it gives me the following error:

Failed to invoke procedure apoc.load.jsonParams: Caused by: java.net.URISyntaxException: Illegal character in path at index 52: https://query.wikidata.org/sparql%3Fquery=SELECT%20*
%20%20%20%20%20%20WHERE%7B
%20%20%20%20%20%20%20%20%3Fitem%20rdfs:label%20%3Fname%20.
%20%20%20%20%20%20%20%20filter%20(%3Fitem%20=%20wd:Q267298)
%20%20%20%20%20%20%20%20filter%20(lang(%3Fname)%20=%20%22en%22%20)%20.
%20%20%20%20%20%20OPTIONAL%7B
%20%20%20%20%20%20%20%20%3Fitem%20wdt:P31%20%5Brdfs:label%20%3Fclass%5D%20.
%20%20%20%20%20%20%20%20filter%20(lang(%3Fclass)=%22en%22)
%20%20%20%20%20%20%7D%7D

I am using Neo4j version 4.3.4 and APOC version 4.3.0.2.

Is there an issue with my configuration setup?