Hemant Vishwakarma: Why doesn't the Python interpreter return the explicit SyntaxError message?

Wednesday, 10 July 2019

CPython's tokenizer.c shows that the tokenizer produces specific error messages.

For example, consider the part of the tokenizer that scans a decimal number. The literal 5_6 is accepted, but for 5__6 the tokenizer should raise a SyntaxError with the message "invalid decimal literal":

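static int
tok_decimal_tail(struct tok_state *tok)
{
    int c;

    while (1) {
        /* Consume a run of digits. */
        do {
            c = tok_nextc(tok);
        } while (isdigit(c));
        /* Any character other than '_' ends the digit tail. */
        if (c != '_') {
            break;
        }
        /* An underscore must be followed by a digit, so "5__6" fails here. */
        c = tok_nextc(tok);
        if (!isdigit(c)) {
            tok_backup(tok, c);
            syntaxerror(tok, "invalid decimal literal");
            return 0;
        }
    }
    /* Return the first character after the digits (0 signals an error). */
    return c;
}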
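
The underscore rule this function enforces can be checked from Python without reading the C source. The snippet below is only a sketch: is_valid_literal is a hypothetical helper name, not part of CPython, and it assumes an interpreter that supports underscores in numeric literals (Python 3.6 or later). It reports whether a string compiles as an expression, without looking at the error message.

```python
def is_valid_literal(source):
    """Return True if `source` compiles as an expression, False on SyntaxError.

    Hypothetical helper for illustration; not part of CPython.
    """
    try:
        compile(source, "<string>", "eval")
    except SyntaxError:
        return False
    return True


# A single underscore between digits is allowed; a doubled or
# trailing underscore is rejected.
for source in ("5_6", "5__6", "5_"):
    print(source, is_valid_literal(source))
```

Only the accept/reject decision is tested here, so the result does not depend on which message the interpreter chooses to report.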

From Python, I tried to get at the tokenizer's SyntaxError message:

In [12]: try: 
    ...:     eval('5__6') 
    ...: except SyntaxError as e: 
    ...:     print(e.args, e.filename, e.lineno, e.msg, e.text) 

('invalid token', ('<string>', 1, 2, '5__6')) <string> 1 invalid token 5__6
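
The same probe can be written as a standalone script, which makes it easy to run under several interpreters and compare what each reports. This is a minimal sketch: syntax_error_details is a hypothetical helper name, and it uses compile() rather than eval() so nothing is executed. The msg field may differ between Python versions; the session above shows "invalid token", and other versions may surface the tokenizer's own text instead.

```python
def syntax_error_details(source):
    """Compile `source`; return the SyntaxError fields, or None if it compiles.

    Hypothetical helper for illustration; not part of CPython.
    """
    try:
        compile(source, "<string>", "eval")
    except SyntaxError as exc:
        # These attributes are set on every SyntaxError raised by the compiler.
        return {
            "msg": exc.msg,
            "filename": exc.filename,
            "lineno": exc.lineno,
            "offset": exc.offset,
            "text": exc.text,
        }
    return None


details = syntax_error_details("5__6")
print(details)
```

Running this under each installed interpreter (for example python3.7 and a newer release) shows directly whether the generic or the tokenizer-specific message is reported.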

Is there any way to extract the SyntaxError message from the tokenizer?
