Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restructure Variable Parsing to support underscores #3412

Merged
merged 6 commits into from
Jan 12, 2024
Merged

Restructure Variable Parsing to support underscores #3412

merged 6 commits into from
Jan 12, 2024

Conversation

bleaphar
Copy link
Contributor

@bleaphar bleaphar commented Jan 12, 2024

Made a few changes around the parser design so that underscores are supported in the name of a variable and added two tests for binding and the underscores in the variable respectively.

@@ -17,7 +17,7 @@ internal HttpVariableDeclarationNode(SourceText sourceText, HttpSyntaxTree? synt
{
}

public string VariableName => ChildTokens.Where(t => t.Kind == HttpTokenKind.Word).Select(t => t.Text).FirstOrDefault() ?? string.Empty;
public string VariableName => string.Join("", ChildTokens.Where(t => t.Kind == HttpTokenKind.Word || (t.Kind == HttpTokenKind.Punctuation && t.Text is "_")).Select(t => t.Text)) ?? string.Empty;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah looks like the above node is for the entire declaration and not just the variable name. I think it may be better to have a dedicated sub-node for the variable name (which could potentially also be used to parse the variable references that appear inside {{}} expressions).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could create a sub-node but I think the main utility would just be getting the value of the text and reconfiguring the parser to use this new node and I'm not sure that it is necessarily cleaner and worth the rework

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be wrong, but I think we'll need to parse variable names once we start interpreting expressions and I think we may need to start doing that for the named requests feature (to implement support for this syntax for dotting into the named requests).

We may be able to sidestep this by using a separate / simpler (regex-based?) parser just for parsing the expression contents. Not sure if that is the path we will decide to go down though...

In any case, your call to keep this simple for now seems reasonable considering that we don't need to parse variable names in other places yet.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think a regex is the right path. The different approaches within the same parser will make it harder to understand, and the regex isn't going to support language services as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 I had the same thoughts. Having a dedicated sub node could also make the implementation of features like go to definition / find all references simpler / more natural.

@@ -17,7 +17,7 @@ internal HttpVariableDeclarationNode(SourceText sourceText, HttpSyntaxTree? synt
{
}

public string VariableName => ChildTokens.Where(t => t.Kind == HttpTokenKind.Word).Select(t => t.Text).FirstOrDefault() ?? string.Empty;
public string VariableName => string.Join("", ChildTokens.Where(t => t.Kind == HttpTokenKind.Word || (t.Kind == HttpTokenKind.Punctuation && t.Text is "_")).Select(t => t.Text)) ?? string.Empty;
Copy link
Contributor

@shyamnamboodiripad shyamnamboodiripad Jan 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we report an error if a variable name contains characters like - that are not allowed? Perhaps this is not super urgent - but having dedicated parsing for variable names like we discussed above may make such error reporting easier.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This suggests an alternative approach, which is to parse everything but whitespace into the variable declaration node, and then report diagnostics for invalid characters.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that characters like - will be allowed on the RHS of the variable declaration (if they appear inside a string literal or embedded expression for example) which is why I think having dedicated parsing for the name portion may be helpful.

Copy link
Contributor

@shyamnamboodiripad shyamnamboodiripad Jan 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah never mind, I was confused by the naming. I assumed the HttpVariableDeclarationNode encompasses the entire declaration (including the = and the expression on the RHS). Looks like that is the HttpVariableDeclarationAndAssignmentNode - So please discount the above confusion from my comments above :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I also wonder if the naming could be made less confusing - but perhaps its ok since it was not as confusing once I looked at the names of all the involved node types :))

@@ -17,7 +17,7 @@ internal HttpVariableDeclarationNode(SourceText sourceText, HttpSyntaxTree? synt
{
}

public string VariableName => ChildTokens.Where(t => t.Kind == HttpTokenKind.Word).Select(t => t.Text).FirstOrDefault() ?? string.Empty;
public string VariableName => string.Join("", ChildTokens.Where(t => t.Kind == HttpTokenKind.Word || (t.Kind == HttpTokenKind.Punctuation && t.Text is "_")).Select(t => t.Text)) ?? string.Empty;
Copy link
Contributor

@shyamnamboodiripad shyamnamboodiripad Jan 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this could return the wrong thing if there happened to be an _ in the value portion of a variable declaration. Could you please add a test for that? (Again, I think having dedicated sub nodes for the variable name and value may make things easier.)

Copy link
Contributor

@shyamnamboodiripad shyamnamboodiripad Jan 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah never mind, I was confused by the naming. I assumed the HttpVariableDeclarationNode encompasses the entire declaration (inlcuding the = and the expression on the RHS). Looks like that is the HttpVariableDeclarationAndAssignmentNode :)

Since HttpVariableDeclarationNode only includes the name portion in addition to the @ prefix, perhaps we could just take this.Text (which will strip away any leading and trailing trivia) and then strip out the first @ character.

In other words, would something like below work?

Suggested change
public string VariableName => string.Join("", ChildTokens.Where(t => t.Kind == HttpTokenKind.Word || (t.Kind == HttpTokenKind.Punctuation && t.Text is "_")).Select(t => t.Text)) ?? string.Empty;
public string VariableName => this.Text.SubString(1);

@@ -194,6 +194,10 @@ private HttpVariableDeclarationNode ParseVariableDeclaration()
{
ConsumeCurrentTokenInto(node);
}
else if(CurrentToken is { Text: "_" })
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since these three branches perform the same action, you can combine them into this:

if (CurrentToken is { Text: "@" } or { Kind: HttpTokenKind.Word } or { Text: "_" })
{
    ConsumeCurrentTokenInto(node);
}

@jonsequitur jonsequitur merged commit e035978 into dotnet:main Jan 12, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants