Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delayed error inspection #60

Merged

Conversation

MatthieuDartiailh
Copy link
Collaborator

@MatthieuDartiailh MatthieuDartiailh commented Mar 23, 2022

This attempts to mimic CPython behavior by calling invalid rule only in a second pass to speed up parsing on the happy path. It also includes changes python/cpython@390459d to avoid calling invalid rules from within without invalid rules.

With those changes, I thought I would be able to get rid of having both immediate and deferred raising of exceptions (store_ vs raise_) since looking at CPython it looks to me as if as soon as an exception is reported the parser exits. This fast exit is controlled by the error_indicator attribute on the parser state. However I hit one issue when enabling direct raising in invalid rules (so in CPython terms when looking for details about a syntax error):
f(i for i in range(10)) fails to parse because in t_primary the alternative a=t_primary b=genexp &t_lookahead fails and then the alternative a=t_primary '(' b=[arguments] ')' &t_lookahead trips through the invalid argument rule. This should parse and would mask any actual error.

I would very much appreciate if somebody could shed some light on this issue.

This will need some extra documentation and tests once the above has been addressed.

@MatthieuDartiailh
Copy link
Collaborator Author

MatthieuDartiailh commented Mar 24, 2022

I actually figured the issue. I was not checking for the number of args in the invalid_arguments rule when we matched a generator expr. I will do my best to update the PR ASAP.

@MatthieuDartiailh MatthieuDartiailh marked this pull request as ready for review March 24, 2022 10:23
@MatthieuDartiailh MatthieuDartiailh marked this pull request as draft March 24, 2022 10:23
@MatthieuDartiailh
Copy link
Collaborator Author

MatthieuDartiailh commented Mar 24, 2022

I have one issue remaining on which I would appreciate some insight @pablogsal @lysnikolaou which relates to the rule real_number.

This rule is basically invoked when we have the following:

x = 1 - 1j
match x:
    case 1 - 1j:
        y = 0

In this case we do not have a simple real number which would be handled by signed_number, and we will use the complex_number rule which will call the real_number rule. (line 8573 of https://github.com/python/cpython/main/Parser/parser.c). My issue is that currently we check that the real part of the literal is a complex number (through _PyPegen_ensure_real which calls PyComplex_CheckExact) which to me should never be true since I expect the real part of the literal to be parsed as either an int or a float. However it does work in CPython so I must be missing something.

@pablogsal
Copy link
Contributor

I have one issue remaining on which I would appreciate some insight @pablogsal @lysnikolaou which relates to the rule real_number.

Will try to investigate today or tomorrow if @lysnikolaou doesn't do it before. We are currently releasing 3.10.4 and 3.9.12 as we had some problems with the latest release and it may take a while.

@lysnikolaou
Copy link
Contributor

lysnikolaou commented Mar 24, 2022

I'm not sure what the intent of that part of the grammar is, and sadly, I wasn't around when this was introduced. What I can say though, is that NUMBER tokens and number expr_tys, as a result, contain the whole number, imaginary or real.

Even in the real_number rule, the NUMBER that gets parsed and will be passed to _PyPegen_ensure_real will be the complex number in its entirety, and that's why PyComplex_CheckExact returns true. You can check parsenumber_raw for how this is done.

Does that answer your question?

@MatthieuDartiailh
Copy link
Collaborator Author

MatthieuDartiailh commented Mar 24, 2022

Sorry for the noise.... I missed the absence of a leading ! in _PyPegen_ensure_real. I am not used enough to C. All good now.

@MatthieuDartiailh
Copy link
Collaborator Author

In case it was not clear from the status change. This is now ready for review. I would like to get it in before picking up again #41

Copy link
Contributor

@pablogsal pablogsal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

I have also played with this locally and I could not find anything odd. If we find edge cases, we can fix them in separate PRs.

Fantastic job @MatthieuDartiailh !

@pablogsal pablogsal merged commit e28fe4f into we-like-parsers:main Mar 30, 2022
@MatthieuDartiailh MatthieuDartiailh deleted the delayed-error-inspection branch March 30, 2022 20:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants