Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Imporve/fix some scopes reported by the JavaScript parser #1568

Merged
merged 12 commits into from
Oct 15, 2017

Conversation

b4n
Copy link
Member

@b4n b4n commented Oct 5, 2017

No description provided.

@b4n b4n assigned masatake and b4n Oct 5, 2017
@b4n b4n requested a review from masatake October 5, 2017 07:29
b4n referenced this pull request Oct 5, 2017
Handle scope more consistently, avoiding dealing with it in many places
in a fragile and sometimes incorrect manner.

This fixes most scopes in current test cases.  Handling of qualified
names is still questionable, but mostly follows the current handling to
prevent breakage.  Better handling of these would require tracking
which tag the qualified name actually refers to to be able to choose
the correct scope, but that's outside the scope of this patch.
@coveralls
Copy link

Coverage Status

Coverage decreased (-0.02%) to 85.956% when pulling 2f80aa5 on b4n:jscript/scope into 924c0ab on universal-ctags:master.

@masatake
Copy link
Member

masatake commented Oct 5, 2017

For newly added test cases, could you set --sort=no option?
Just put "--sort=no" to args.ctags file.

(In the future, I will put --sort=no to misc/units script.)

@b4n
Copy link
Member Author

b4n commented Oct 5, 2017

@masatake OK, but for JS I don't know if it makes much sense as the parser emits some tags out of order.

@b4n
Copy link
Member Author

b4n commented Oct 5, 2017

@masatake actually, what's the point in not sorting? I agree it's easier to check when the parser is emitting things in the order they appear in the file, but otherwise I kind of feel like the tests would relying too much on an implementation detail that never mattered. Namely, I don't have a really sensible way of "not sorting" the new test I added and that doesn't pass yet: which order should I put? None is really relevant.
Again, I'm not sure why you don't want sorting, but maybe we should have a mean to sort by line numbers?

@masatake
Copy link
Member

masatake commented Oct 6, 2017

@b4n, how about adding --fields=+n (or --fields=+{line}) instead of --sort=no ?
As you wrote, I intended making check easier with --sort=no option.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.02%) to 85.988% when pulling 9704a80 on b4n:jscript/scope into 924c0ab on universal-ctags:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.03%) to 85.999% when pulling fc26546 on b4n:jscript/scope into 924c0ab on universal-ctags:master.

@masatake
Copy link
Member

masatake commented Oct 6, 2017

I will review them in the week end. Thank you. BTW, is this change something to do with
#1566 (comment) ?

b4n added 11 commits October 6, 2017 17:27
Handle scope more consistently, avoiding dealing with it in many places
in a fragile and sometimes incorrect manner.

This fixes most scopes in current test cases.  Handling of qualified
names is still questionable, but mostly follows the current handling to
prevent breakage.  Better handling of these would require tracking
which tag the qualified name actually refers to to be able to choose
the correct scope, but that's outside the scope of this patch.
Fix a corner case of do-while loops where the semicolon is actually not
needed, not even implicitly.
Unnamed blocks do not create a new scope, leading to the declarations
inside being effectively at the root scope.
@b4n
Copy link
Member Author

b4n commented Oct 7, 2017

@masatake thanks! I just squashed the fixups and added --fields=+n to the failing test case as you suggested.
And no, it's unrelated to that issue, yet related to Geany. I was in the process of updating Geany's parser to current U-CTags for geany/geany#934 and realized one of the test cases was giving bad results, so I went ahead and spend a day fixing it -- which itself lead to finding other small issues and trying to fix them, and trying to increase the coverage.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.03%) to 85.999% when pulling bd3258c on b4n:jscript/scope into 924c0ab on universal-ctags:master.

dummy1 input.js /^ this.dummy1 = function() {}$/;" m line:18 class:doesnt_extend.Cls
dummy2 input.js /^ Cls.prototype.dummy2 = function() {$/;" m line:20 class:doesnt_extend.Cls
extend input.js /^function extend() {$/;" f line:7
extended input.js /^ Cls.prototype.extended = function() {$/;" m line:9 class:Cls
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very interesting.

"Cls" in the scope field is obviously useful information about "extended".
However, "extend" in which "extended" is assigned is also useful information.

None defines the meaning of "scope" field. I have just recognized that there are two type of scopes:
source code aspect scope and runtime scope. lexical scope and dynamic scope are more suitable terms?

For reading and understanding source code, source code aspect scope information is also useful. The
information can be regenerated with line: and end: fiedls but recording it in ctags is not bad idea.

Some parsers may records runtime scope and the other parsers may records source code aspect scope as the value for scope field. It may be too late to redefine the meaning of scope field. However, I would like to explore this area in the future.

@b4n, you have developed an IDE. I would like to hear your comments.

One of my idea is introducing new scope named sourceScope that is filled only when if source code aspect scope and runtime scope about a tag is not matched.

Copy link
Member

@masatake masatake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think introducing "unknown" kind is better.

b input.js /^var A = {}, b = 2;$/;" v
c input.js /^var c = 3, d = 4;$/;" v
d input.js /^var c = 3, d = 4;$/;" v
mem1 input.js /^A.mem1 = function() {return 42;}, A.mem2 = function() {return 43;}$/;" f function:A
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the kind of A, the scope of mem1, function?

I think resolving the kind for A is not easy. Especially if A is defined in another file.
One of technique I used is introducing "unknown" kind. See 563bf97 as an example.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that in JavaScript there's not really anything but objects, a class, a function, no real differences. Actually you can see a fun thing in Firefox's JS console:

>> class foo {}
undefined
>> print(foo)
"function foo() {
    
}"

Anyway, my point is that the effective high-level kind is very dynamic. In this example, A should be a class merely because it gets methods added, but it really becomes a class. So ideally A should be reported as a class, but when it's first seen and the tag emitted it looks like a naked object, hence the "variable" kind.

v5 input.js /^, v5 = 3$/;" v
f1_c1_m1 input.js /^ f1_c1_m1:$/;" m class:f1.f1_c1
f1_c1_m2 input.js /^ f1_c1_m2$/;" m class:f1.f1_c1
f1_c1 input.js /^ var f1_c1 = {$/;" c class:f1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that f1 is a function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. As mentioned somewhere it's a known limitation of the JS parser, with a FIXME in the code.

@masatake
Copy link
Member

masatake commented Oct 7, 2017

Have you ever consider to use corkAPI for representing scopes in jscritp.c?

For filling a scope field, a parser have to track both scope name and scope kind.
tokenInfo of jscript.c has only a structure field for scope name.
Adding a field of scope kind is an approach.
Using corkAPI is another approach.

In my experince, corkAPI works well in this case.

I can take over the rest of job for introducing corkAPI if you are o.k. In such case, I will merge this pull request first.

@masatake
Copy link
Member

masatake commented Oct 7, 2017

@b4n, after rebasing the lastet master branch where CTAGS_INLINE macro is introduced, could you consider chery-picking following changes?

masatake@690c456
masatake@d4a27f0
masatake@cacb33f
masatake@524dd1b

All of them are in found at https://github.com/masatake/ctags/commits/jscript-make-signature-field-sophisticated

@b4n
Copy link
Member Author

b4n commented Oct 8, 2017

@masatake I though about using that new cork API, but it would require quite some changes, especially as many tags are held back (to find the right kind). Maybe though it would be quite handy if we can change the kind of a tag after having prepared it.
Anyway, I'm not against using it at all, I just felt it would be a fairly larger undertaking I didn't want to start right away, but I'd be fine reviewing the changes, or trying it at some point.

@b4n
Copy link
Member Author

b4n commented Oct 8, 2017

I think introducing "unknown" kind is better.

Hum… maybe, I'm not sure. OK, it doesn't report something wrong and a client could special-case it, but OTOH the parser is generally right in the assumption, only a few cases are incorrect, so I'm not sure.

@masatake
Copy link
Member

masatake commented Oct 8, 2017

I found we have js-scope-resolution.b test case.
I'm sorry to be late to know that.

LGTM. Thank you.

@masatake
Copy link
Member

@b4h, should I merge this?

@b4n
Copy link
Member Author

b4n commented Oct 15, 2017

@masatake You can, yes.
I am in the process of importing your mentioned changes, but wanted to try and simplify a few things, but didn't get there yet. It's however mostly unrelated to this PR, so I think merging would be fine.
And I have another set of changes for the JS parser (for supporting Unicode identifiers and character escapes) in the works, so it's not like this one would be the last one anyway ☺️

@masatake masatake merged commit 2bc8de8 into universal-ctags:master Oct 15, 2017
@masatake
Copy link
Member

masatake commented Oct 15, 2017

Could you look at#1519? It will change the last argument of initTagEntry. It required a pointer to kindDefintion struct. With the change of #1579, it requires an integer instead. This change may have small impact on the code you are working.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants