Friday, 18 September 2020

PHP cURL: Encoding/Parsing (AWS Neptune using SPARQL)

I'm new to SPARQL, so this might be a dumb question, but I'm having trouble inserting data. I'm using the HTTPS REST endpoint to communicate with the DB.

I have the following triple:

<urn:data:Gen`000001>
<urn:data:content>
"This is a literal"

I urlencode() it because I thought that would avoid errors, since the result has a valid URI format:

<urn:data:Gen%60000001>
...

Yet the % character raises an error (if I use a - instead of the %, it works). Because of this answer, I tried escaping the % character with a backslash:

<urn:data:Gen\%60000001>
...

Then I tried using a URL instead of the URN:

<https://test.com/data/Gen%60000001>
<https://test.com/data/content>
...

But it keeps giving me the same error:

The requested URL returned error: 400 Bad Request

So what am I doing wrong? How can I escape the % character (and possibly other urlencoded characters)? Or should I ask: is calling urlencode() before submitting the right way to do this at all?

EDIT:

My PHP code:

$q = 'update=INSERT DATA { '.
        '<https://test.com/data/Gen%60000001> '.
        '<https://test.com/data/content> '.
        '"This is a literal." }';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,$config->db->public->neptune->cluster."/sparql");
curl_setopt($ch, CURLOPT_FAILONERROR, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 2); 
curl_setopt($ch, CURLOPT_TIMEOUT, 3);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $q);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$server_output = curl_exec($ch);

// Always define $error so the check below never reads an undefined variable.
$error = curl_errno($ch) ? curl_error($ch) : null;

if (!$error)
    echo $server_output;
else
    echo $error;

Again, if I just change % to - it works:

query=select * where {?s ?p ?o}
result:
  "s" : {
    "type" : "uri",
    "value" : "https://test.com/data/Gen-60000001"
  },
  "p" : {
    "type" : "uri",
    "value" : "https://test.com/data/content"
  },
  "o" : {
    "type" : "literal",
    "value" : "This is a literal."
  }

EDIT2 + Solution:

As Kelvin Lawrence pointed out, it is likely a parsing/encoding issue on PHP's side. I removed CURLOPT_FAILONERROR, and now these errors show up:

using URN:
    {
        "detailedMessage":"Malformed query: Lexical error at line 1, column 28. 
            Encountered: \"`\" (96), after : \"\"",
        "requestId":"***",
        "code":"MalformedQueryException"
    }

using URL:
    {
        "detailedMessage":"Malformed query: Lexical error at line 1, column 32. 
            Encountered: \"/\" (47), after : \"test.com\"",
        "requestId":"***",
        "code":"MalformedQueryException"}

The URN error suggests that Neptune receives (or decodes) the %60 as the original ` character, as it was before I urlencoded it. If I urlencode it once more before sending, it works as it should:

$q = 'update=INSERT DATA { '.
        '<https://test.com/data/'.urlencode(urlencode('Gen`000001')).'> '.
        '<https://test.com/data/content> '.
        '"This is a literal." }';

But why does Neptune receive it decoded, or why does it decode it? It is sent in the POST body, so it shouldn't need to be decoded, right? What am I missing?
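
One possible reading, offered as a sketch rather than a confirmed diagnosis of Neptune: when CURLOPT_POSTFIELDS is given a string, cURL sends it as application/x-www-form-urlencoded, and a server handling that content type percent-decodes each parameter value before using it. The snippet below uses PHP's parse_str() as a stand-in for that server-side decoding (an assumption, not Neptune's actual code) and compares the raw body with one whose value was encoded once as a whole via http_build_query(). The variable names are hypothetical.

```php
<?php
// The SPARQL update exactly as written in the code above.
$update = 'INSERT DATA { <https://test.com/data/Gen%60000001> '
        . '<https://test.com/data/content> "This is a literal." }';

// Stand-in for the server (assumption): parse_str() decodes a
// form-encoded body the way a form handler would.

// Body sent raw: the %60 inside the IRI is decoded to a backtick.
parse_str('update=' . $update, $rawFields);

// Body with the whole value encoded once: % becomes %25 on the wire,
// so the decoded value is identical to the original update string.
parse_str(http_build_query(['update' => $update]), $encodedFields);

echo $rawFields['update'], PHP_EOL;
echo $encodedFields['update'], PHP_EOL;
```

If this reading is right, encoding the complete parameter value once would have the same effect as the double urlencode() on the identifier, without having to treat individual parts of the query specially.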

EDIT3:

Another thing I should mention:

$q = 'query=select * where {?s ?p ?o}';

This query works perfectly, while the same query with newlines does not work:

$q = 'query=select * 
        where {
            ?s
            ?p 
            ?o
        }';

It gives me this error:

{
    "detailedMessage":"Malformed query: Encountered \"\" at line 1, column 9.\n
        Was expecting one of:\n    \"{\" ...\n    \"from\" ...\n    \"where\" ...\n    
        \"with\" ...\n    ",
    "requestId":"***",
    "code":"MalformedQueryException"
}

Why? I can work around this by keeping the query on one line, but that's not how it should work.
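
A guess at the cause, not verified against Neptune: the line breaks go into the form body unencoded, and the error at line 1, column 9 is consistent with the query being cut off right after "select * ". A minimal sketch of percent-encoding just the parameter value, so each newline travels as %0A and is restored when the body is decoded:

```php
<?php
$query = 'select *
        where {
            ?s
            ?p
            ?o
        }';

// Encode only the value; the "query=" key and separator stay literal.
$body = 'query=' . urlencode($query);

// The encoded body no longer contains a raw line break.
var_dump(strpos($body, "\n") === false);

// Decoding the value gives back the multi-line query unchanged.
var_dump(urldecode(substr($body, strlen('query='))) === $query);
```

The resulting $body could then be passed to CURLOPT_POSTFIELDS in place of the hand-built string.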


