Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

List the token kinds in child documentation in SyntaxNodes #2168

Merged

Conversation

natikgadzhi
Copy link
Contributor

Motivation

Changes

  • Changes Child.documentation to be a computed property that wraps documentation provided when declaring the child, with the list of possible token choices. This should be extensible in the future to add more documentation for other child kinds as well.
  • Provides additional func in GrammarGenerator to generate token choices in a list.
  • Includes the generated code in the same commit.

Copy link
Contributor Author

@natikgadzhi natikgadzhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ahoppen, @kimdv, here's the first attempt!

This pull request handles .token children, but not .keyword yet. I would prefer to keep this small and get this one merged, then immediately follow up with .keyword (to cover #1987 (comment)) for easier review.

How do you feel about the approach? What can I improve?

@natikgadzhi natikgadzhi force-pushed the codegeneration/children-token-choices branch 2 times, most recently from 3bf5a7f to 5344894 Compare September 8, 2023 05:50
@Matejkob Matejkob mentioned this pull request Sep 8, 2023
13 tasks
Copy link
Contributor Author

@natikgadzhi natikgadzhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ahoppen ready for a quick review.

I'm not 100% sold on putting the templating code in Child, but maybe it's good for now — perhaps we can merge this, and then add .kind comment to, and then I could move the template logic out.

The initial reason I've put this code in Child is because in SyntaxNodeFile, we can do let documentation = ... and then render it in the builder. But, we're iterating over children already in the result builder, so not like we can do some free-standing code in there.


/// A docc comment describing the child, including the trivia provided when
/// initializing the ``Child``, and the list of possible token choices inferred automatically.
public var documentation: SwiftSyntax.Trivia {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The part that I don't quite like about this is that we're essentially templating the doc comment for Child in here. But for Node, it's in templates/SyntaxNodeFile.swift. So this feels like we have a way of storing templates — in templates, but this sort of dilutes that distinction. "Don't put logic in your views", as we say in the web apps world.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have quite a bit of logic in the templates (I’m not even sure if templates really is the right term here but it’s what we have right now)… My take on this has always been that it’s not too bad because we control the entire pipeline (no other clients of the model and we even control the inputs) and thus we can get away with a slightly more sloppy separation of concerns.

We could make this part of GrammarGenerator and then stitch the documentationSummary and the Tokens section together in SyntaxNodesFile.swift and SyntaxTraitsFile.swift.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think perhaps this is a little bit bigger than one occurrence. The pattern is as follows:

  • A file in templates/, let's say SyntaxNodesFile.swift has a result builder that renders the code. Suddenly, you need to generate something that is not easily available as an object property or a function on a Node or a Child.
  • You can't comfortably put that code in the result builder, so you shove that code somwehere.
  • So far, I've seen extensions of LayoutNode (in the other PR of mine) and now I'm doing a function on Child.

That feels slightly unorthodox, but not too bad:

  • templates should be as easy to read as possible. Sure, we're not SwuftUI here, but the closer we can get in terms of readability, the better.
  • Node and Child and other definitions should ideally contain the information you need to generate the code, but not the generation code itself — it's like the view layer and data layer to me.
  • Seems like Utils / extensions solve that problem partially, but since we're extending Child / Node, it feels like we're injecting view code in the model.

I'm uncertain if my opinion is valid — I might not have enough exposure and experience. But so far, my idea is something like:

  • Add templates/helpers/*.Swift
  • Make enums or structs that you can't instantiate with static functions that are clean. They take the object in, and render syntax out. They can only depend on their arguments, without side effects.

@ahoppen want me to start a discussion / post on forums on this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we aren’t super strict on the distinction between the pure data layer and the view layer in CodeGeneration and you probably notice this a lot more than I do because I haven’t been doing much view-based / UI development in a few years, where you think a lot more about that separation.

But then also, the distinction is a little bit muddy IMO. Just taking the documentation as an example. I think we both agree that the documentation should be a property of a node/child but should that documentation already contain the generated Token section? I think you can argue in either direction.

That being said, if you have ideas how to make that distinction clearer, I’m all in favor for them. One thing that’s important to me, though, is that we don’t introduce too many new abstraction layers that make the code less readable. Since CodeGeneration is still fairly reasonable in size and I don’t expect it to grow too much faster in the future, I would prefer an unorthodox decision here and there to an overengineered system of abstraction layers.

CodeGeneration/Sources/SyntaxSupport/Child.swift Outdated Show resolved Hide resolved
@natikgadzhi natikgadzhi force-pushed the codegeneration/children-token-choices branch from 8ee7df4 to 6bc0968 Compare September 8, 2023 20:44
@@ -908,6 +945,9 @@ public struct ActorDeclSyntax: DeclSyntaxProtocol, SyntaxHashable, _LeafDeclSynt
}
}

/// ### Tokens
///
/// For syntax trees generated by the parser, this is guaranteed to be the following kind: `'actor'`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kimdv, the actor keyword doc showed up!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would make it actor and remove single quote

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed that in here: #2168 (comment)

But what I missed is I've put backticks back, but left single quotes in.

To confirm, what I want to do here is:

/// For syntax trees generated by the parser, this is guaranteed to be of kind \actor`.`

Sounds good?

Copy link
Member

@ahoppen ahoppen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could merge this as is but if you want to clean the code a little bit more, I’ve got a couple suggestions inline.


/// A docc comment describing the child, including the trivia provided when
/// initializing the ``Child``, and the list of possible token choices inferred automatically.
public var documentation: SwiftSyntax.Trivia {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have quite a bit of logic in the templates (I’m not even sure if templates really is the right term here but it’s what we have right now)… My take on this has always been that it’s not too bad because we control the entire pipeline (no other clients of the model and we even control the inputs) and thus we can get away with a slightly more sloppy separation of concerns.

We could make this part of GrammarGenerator and then stitch the documentationSummary and the Tokens section together in SyntaxNodesFile.swift and SyntaxTraitsFile.swift.

@@ -121,6 +127,11 @@ public protocol EffectSpecifiersSyntax: SyntaxProtocol {
set
}

/// ### Tokens
///
/// For syntax trees generated by the parser, this is guaranteed to be the following kinds:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would read slightly nicer if this was

Suggested change
/// For syntax trees generated by the parser, this is guaranteed to be the following kinds:
/// For syntax trees generated by the parser, this is guaranteed to be one of the following kinds:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me factor this in, I agree. Then, I would ask to merge this, with a condition that I start an issue and follow up on extracting repetitive helper / template code into helpers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me

Comment on lines 119 to 124
// Grab the documentation secions, filter out empty ones, and join them
// with an empty documentation line inbetween.
return [documentationSummary, tokenChoicesTrivia]
.filter { !$0.isEmpty }
.joined(separator: separator)
.reduce(SwiftSyntax.Trivia(), { $0.appending($1) })
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the third copy of this code. I think it would be nice if we could extract this to something like

extension Trivia {
  init(joiningDocComments: [Trivia]) {  }
}

And maybe that would be motivation to extract the generation of the Token section to GrammarGenerator so that we can put that extension on Trivia in the generate-swift-syntax module.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly! Triggered me, too: #2168 (comment)

I'll try to choose the right spot for it. There's a helper that generates Trivia from String in Utils — I think I will take that helper too, and move it + this new helper in CodeGeneration/SwiftSupport/Helpers or CodeGeneration/Utils/Helpers, with docc comment on helpers enum that outlines why they are here.

Should be a nice proof of concept. Let me think for a bit whether that should be a separate PR though — I think the discussion on it might take a little bit, and I want to scope my attention 🙃

@natikgadzhi natikgadzhi force-pushed the codegeneration/children-token-choices branch 2 times, most recently from 1653444 to 2862db3 Compare September 12, 2023 00:12
Copy link
Contributor Author

@natikgadzhi natikgadzhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost ready! One more cleanup for consistency in node.documentation and I'll try to convince you to merge it ;)

I think this is good to go. There's additional cleanup that is possible (aligning Node.documentation and Child.documentation), but they're good enough as is I think, and I can return to them later if that's okay with you.

@@ -22,24 +22,12 @@ import Utils
func syntaxNode(nodesStartingWith: [Character]) -> SourceFileSyntax {
SourceFileSyntax(leadingTrivia: copyrightHeader) {
for node in SYNTAX_NODES.compactMap(\.layoutNode) where nodesStartingWith.contains(node.kind.syntaxType.description.first!) {
let documentationSections = [
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this is clean, there's a way to make this even nicer — we can rename Node.documentation to Node.documenationSummary and make a computed property documentation that works like Child.documentation and internally joins them together.

I'll push that change in a bit to this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. In particular, it would be nice to be consistent with how we stitch together the comment sections.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we agree on this, then perhaps I should include that cleanup in this PR.

I'm hesitant to bring new abstractions, or even move things into Grammar yet — I want to understand it's place better. But what we can do, is we can absolutely make Node and Child consistent with each other. I'm clear on how to do that, and have everything to prepare a commit.

from: """
### Tokens

For syntax trees generated by the parser, this is guaranteed to be one of the following \(choices)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sorry, maybe I should have been cleaner. In the singular, it should be “this is guaranteed to be the following kind” (“one of” should only be there in the plural form). Overall I think it might be easiest to just have two completely separate branches for the singular and plural case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦🏼 that's my mistake — your comment was perfectly clear, I was rushing to get this in, and I need to learn to pace myself.

I'll clean it up.

@@ -22,24 +22,12 @@ import Utils
func syntaxNode(nodesStartingWith: [Character]) -> SourceFileSyntax {
SourceFileSyntax(leadingTrivia: copyrightHeader) {
for node in SYNTAX_NODES.compactMap(\.layoutNode) where nodesStartingWith.contains(node.kind.syntaxType.description.first!) {
let documentationSections = [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. In particular, it would be nice to be consistent with how we stitch together the comment sections.

@natikgadzhi
Copy link
Contributor Author

I've just pushed up a commit that addresses a couple of review items:

  • @kimdv, I dropped single quotes from token choices list.
  • @ahoppen, hopefully I've made the wording cleaner. I like it more this way.
  • I moved some of the string generation into GrammarGenerator for now.

TODOs:

  • I'm deliberately not doing further refactoring / extraction of generation code, or extra documentation on what should go into GrammarGenerator in the future — I would love to work more on related problems to learn and listen to users first. Are we okay with that for now?
  • I noticed that if the child already has documentation screen passed from the declaration, it might look awkward, like this:
    /// The `@` sign.
    ///
    /// ### Tokens
    /// 
    /// For syntax trees generated by the parser, this is guaranteed to be `@` 
    public var atSign: TokenSyntax {
    
    @ahoppen, are we okay with these, and will remove manual comment? We could make some logit that omits the generated comment but only if it's a single choice and if child.documentation is not empty, but that's hacky.
  • This needs a rebase now.

Looking to get your thumbs up on the above, but will do a rebase in a sec to make the PR cleaner.

For children of syntax nodes that are TokenSyntax, list the kinds of
tokens it can contain. Fixes swiftlang#1987.
- Moves complexity of generating the comment into `Child.swift` and uses
  `grammar.grammar()` directly.
- Appends a `TriviaPiece.newlines(1)` between trivia pieces when merging
  with the original comment.
- Uses approach similar to `SyntaxNodesFile.swift` to concatenating
  documentation pieces.
@natikgadzhi natikgadzhi force-pushed the codegeneration/children-token-choices branch from a61bb8d to c59ce25 Compare September 15, 2023 01:25
// We are actually handling this node now
try! StructDeclSyntax(
"""
// MARK: - \(node.kind.syntaxType)

\(documentation)
\(SwiftSyntax.Trivia(joining: [node.documentation, node.grammar, node.containedIn]))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ahoppen I think you told me there's this code in three spots, but I only caught two: here, and in Child.swift. Where's the third one?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There’s another one in SyntaxCollectionsFile.swift:21

Copy link
Member

@ahoppen ahoppen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor nitpick, otherwise looks good to merge to me.

return """
### Tokens

For syntax trees generated by the parser, this is guaranteed to be \(grammar.grammar(for: choices.first!))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And a very minor nitpick.

Suggested change
For syntax trees generated by the parser, this is guaranteed to be \(grammar.grammar(for: choices.first!))
For syntax trees generated by the parser, this is guaranteed to be \(grammar.grammar(for: choices.first!)).

@ahoppen
Copy link
Member

ahoppen commented Sep 16, 2023

  • I'm deliberately not doing further refactoring / extraction of generation code, or extra documentation on what should go into GrammarGenerator in the future — I would love to work more on related problems to learn and listen to users first. Are we okay with that for now?

Yes, let’s leave any further refactorings to follow-up PRs.

  • I noticed that if the child already has documentation screen passed from the declaration, it might look awkward, like this:

    /// The `@` sign.
    ///
    /// ### Tokens
    /// 
    /// For syntax trees generated by the parser, this is guaranteed to be `@` 
    public var atSign: TokenSyntax {
    

    @ahoppen, are we okay with these, and will remove manual comment? We could make some logit that omits the generated comment but only if it's a single choice and if child.documentation is not empty, but that's hacky.

Yes, that’s OK

- Improves readability by omitting single quotes inside backticks
  (@kimdv)
- Cleans up the language on possible token choices (@ahoppen)
- Moves token choices generation into `GrammarGenerator`, but leaves the
  documentation property in `Child` so it's consistent with how `Node`
  works.
- Regenerated sources with the changes after this cleanup, and after
  rebaising on main.
@natikgadzhi natikgadzhi force-pushed the codegeneration/children-token-choices branch from c59ce25 to be28272 Compare September 16, 2023 20:24
@natikgadzhi
Copy link
Contributor Author

@ahoppen, thank you for showing up and reviewing on a Saturday! Cleaned up the nit, and cleaned up SwiftCollectionsFile to align. Then ran the generator to ensure the generated sources stay intact.

@ahoppen
Copy link
Member

ahoppen commented Sep 17, 2023

@swift-ci Please test

@ahoppen
Copy link
Member

ahoppen commented Sep 17, 2023

Looks like the code isn’t formatted correctly. Could you run format.py?

We've introduced `Trivia(joining: [Trivia], separator: [Trivia])` to
generate doc comments from multiple pieces, and already use this
approach in `SyntaxNodesFile` for nodes themselves, and for node
children documentation.

This commit switches from joining pieces manually to using
`Trivia(joining:)` in `SyntaxCollectionsFile` so that we use the same
approach everywhere.
auto-merge was automatically disabled September 17, 2023 18:45

Head branch was pushed to by a user without write access

@natikgadzhi natikgadzhi force-pushed the codegeneration/children-token-choices branch from be28272 to 4b76a66 Compare September 17, 2023 18:45
@natikgadzhi
Copy link
Contributor Author

natikgadzhi commented Sep 17, 2023 via email

@ahoppen
Copy link
Member

ahoppen commented Sep 18, 2023

@swift-ci Please test

@kimdv
Copy link
Contributor

kimdv commented Sep 18, 2023

@swift-ci please test windows

@ahoppen ahoppen merged commit c937cb7 into swiftlang:main Sep 18, 2023
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

For children of syntax nodes that are TokenSyntax, list the kinds of tokens it can contain
3 participants