Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add syntax highlight to datafusion-cli #8918

Merged
merged 5 commits into from
Jan 22, 2024
Merged

Conversation

trungda
Copy link
Contributor

@trungda trungda commented Jan 19, 2024

Which issue does this PR close?

#8701

Rationale for this change

More user-friendly and readability when using datafusion-cli.

What changes are included in this PR?

Add SQL syntax highlighting capability for datafusion-cli: SyntaxHighlighter.

Are these changes tested?

  1. Added unit tests to validate the ANSI string generated;
  2. Manually tested.
    Screenshot from 2024-01-19 14-29-03

Are there any user-facing changes?

Users of datafusion-cli.

@trungda trungda changed the title use tokenizer Add syntax highlight to datafusion-cli Jan 19, 2024
@trungda
Copy link
Contributor Author

trungda commented Jan 19, 2024

The current implementation is not the best. If the input string is invalid, e.g., unterminated: select * from tab_1 where col_1 = ', the whole string will be _un_highlighted. I think a better approach is to make change to sqlparser crate to expose a token iterator just so we can keep highlighting as we scan through the original string, and only stop when we encounter a failure. Something along this line: https://github.com/sqlparser-rs/sqlparser-rs/blob/main/src/tokenizer.rs#L555-L558.

I am happy to go down that path too. Let me know :)

On that note, can we add datafusion-cli to members attribute in [workspace] just so it will be recognized by the rust-analyzer plugin in IDEs?

@trungda trungda marked this pull request as ready for review January 19, 2024 22:41
Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really love this one, thanks @trungda
I checked it in the codespace, it has some bugs mostly on when you type a key word and then press left arrow.

@comphead
Copy link
Contributor

image

@comphead
Copy link
Contributor

image

@trungda
Copy link
Contributor Author

trungda commented Jan 20, 2024

image

Thanks for the feedback @comphead . What is your terminal setup? I am not able to repro yet. The current approach is to take the whole input string and tokenize the input as a whole, the arrow keys shouldn't matter.

Screenshot from 2024-01-20 13-38-32

@Weijun-H
Copy link
Member

Awesome feature! Thanks, @trungda 👍 . I am wondering if could make the default color lighter because the current version is hard to track. Maybe you could consider the DuckDB setting.
image

@alamb
Copy link
Contributor

alamb commented Jan 21, 2024

I am happy to go down that path too. Let me know :)

That would be great -- thank you @trungda (this is somewhat related); apache/datafusion-sqlparser-rs#1094

On that note, can we add datafusion-cli to members attribute in [workspace] just so it will be recognized by the rust-analyzer plugin in IDEs?

The reason it is not part of the workspace is that as datafusion-cli we want to release a binary and thus have the versions locked (via a checked in Cargo.lock) However, for the datafusion crates we release them as a library where having Cargo.lock is not the recommended.

This is not clear I will add it to the README

BTW what I do is to "open" the datafusion/datafusion-cli project as its own top level project in my IDE (rather than open datafusion)

Update: #8938

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @trungda -- I tried this out and it was great.

Once CI passes I think we could merge it.

In terms of changing the default colors / improving the highlighting, it would be great to improve this PR but also I think we can iterate in the future as well

Another feature that would be neat would be a CLI flag to enable/disable the highlighting so people who find the color scheme not to their liking could turn it off

datafusion-cli/src/highlighter.rs Show resolved Hide resolved
datafusion-cli/src/highlighter.rs Show resolved Hide resolved
@trungda
Copy link
Contributor Author

trungda commented Jan 22, 2024

Thanks @alamb ! I've updated the license + changed the color to be lighter as suggested.

That would be great -- thank you @trungda (this is somewhat related); apache/datafusion-sqlparser-rs#1094

This is great! I'll look into that. It'd be easier if somehow we know the position in the original string where the tokernization fails just so we can just copy from that position to the highlighted string.

Another feature that would be neat would be a CLI flag to enable/disable the highlighting so people who find the color scheme not to their liking could turn it off

Sounds great! Let me look into that after this PR.

Copy link
Member

@Weijun-H Weijun-H left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks @trungda

Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks @trungda We can fix highlighting in following PRs

@alamb alamb merged commit f39841b into apache:main Jan 22, 2024
22 checks passed
@trungda trungda deleted the cli-syntax-highlight branch January 22, 2024 19:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants