Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix skipping Jupyter cells with unknown %% magic #4462

Merged
merged 3 commits into from
Sep 20, 2024

Conversation

AleksMat
Copy link
Contributor

@AleksMat AleksMat commented Sep 18, 2024

Description

This CL fixes an inconsistency of when to skip formatting of Jupyter cells with unknown magic methods.

Black correctly skips formatting in this case:

%%unknown_custom_magic
<code in another language that black shouldn't format>

However if the cell started with some empty lines black still tried to format it and failed:



%%unknown_custom_magic
<code in another language that black shouldn't format>

Also if the cell started with a comment black tried to format it:

# Some comment
%%unknown_custom_magic
<code in another language that black shouldn't format>

(This example isn't possible in normal Jupyter notebooks but it works in Colab.)

This PR ensures that black will skip formatting cells with a custom magic function which is preceded by any number of empty lines and lines with comments.

Additional notes:

  • I moved validate_cell function from __init__.py to handle_ipynb_magics.py module because it belongs there. (It also makes my maintenance of Pyink fork a bit easier if there are less things in __init__.py module.) But if there is a reason to keep it in __init__.py I can move it back.

  • Function _get_code_start could also be implemented like this:

    def _get_code_start(src: str) -> str:
        match = re.search(r"^[\s\n]*([^#\s\n].*)$", src, flags=re.M)
        return match.group(1) if match else ""

    It might be a bit more efficient but it is also less readable. Let me know if you would prefer this implementation.

Checklist - did you ...

  • Add an entry in CHANGES.md if necessary?
  • Add / update tests if necessary?
  • Add new / update outdated documentation?

start of the line and returns it. If such line doesn't exist, it returns an
empty string.
"""
for match in re.finditer(".+", src):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for match in re.finditer(".+", src):
for match in src.splitlines():

Any reason not to use this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might be a slight performance difference because re.finditer will stop splitting string once the first line with code is found while src.splitlines() will always split the entire string by lines. But I haven't benchmarked anything and this probably won't affect much the performance of the entire formatting process.

I'm also ok with switching to src.splitlines() if that is the preferred way.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I guess this is fine.

@JelleZijlstra JelleZijlstra merged commit 8d9d18c into psf:main Sep 20, 2024
46 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants