Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression in 3.12.5 causing non-deterministic failures in Black #123821

Closed
hauntsaninja opened this issue Sep 7, 2024 · 8 comments
Closed

Regression in 3.12.5 causing non-deterministic failures in Black #123821

hauntsaninja opened this issue Sep 7, 2024 · 8 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) release-blocker type-bug An unexpected behavior, bug, or error

Comments

@hauntsaninja
Copy link
Contributor

hauntsaninja commented Sep 7, 2024

Bug report

Bug description:

#122063 causes Black to fail non-deterministically.

To reproduce:

printf 'import argparse
import string


S1 = f"{argparse.OPTIONAL[:9]=}"
S2 = (string.ascii_uppercase + string.ascii_lowercase)
' > test.py

python3.12 -m pip install --no-cache-dir --upgrade black --target tmpenv
for i in $(seq 1 10); do PYTHONPATH=tmpenv ./python.exe -m black --check --diff test.py; done

You now non-deterministically get:

error: cannot format test.py: INTERNAL ERROR: Black produced code that is not equivalent to the source.  Please report a bug on https://github.com/psf/black/issues.  This diff might be helpful: /var/folders/0y/.../T/....log

Note I think you need a non-debug build and probably a Mac

cc @pablogsal

CPython versions tested on:

3.12

@hauntsaninja hauntsaninja added type-bug An unexpected behavior, bug, or error release-blocker labels Sep 7, 2024
@hauntsaninja
Copy link
Contributor Author

Also cc @JelleZijlstra

@hauntsaninja
Copy link
Contributor Author

hauntsaninja commented Sep 7, 2024

Repro that doesn't depend on Black:

import ast
import sys

PROG = """import argparse
import string


S1 = f"{argparse.OPTIONAL[:9]=}"
S2 = (string.ascii_uppercase + string.ascii_lowercase)
"""

outputs = set()
i = 0
while len(outputs) < 2:
    output = ast.dump(ast.parse(PROG), indent=4)
    outputs.add(output)
    i += 1
    print(f"{i} iterations...")

outputs = list(outputs)
print(len(outputs), "different outputs")

print(outputs[0])
print("\n\n\n")
print(outputs[1])
print("\n\n\n")

import difflib
sys.stdout.writelines(
    difflib.unified_diff(
        outputs[0].splitlines(keepends=True),
        outputs[1].splitlines(keepends=True),
        fromfile="output1", tofile="output2"
    )
)

On Mac I can reproduce the error^ on ff3bc82 (v3.12.5) and a9daa4f, but on Linux repro is much rarer (I think I did get one, but maybe I've confused myself)

@picnixz picnixz added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Sep 7, 2024
@hauntsaninja
Copy link
Contributor Author

hauntsaninja commented Sep 7, 2024

Okay, was getting some confusing bisect results, but I think this is fixed on latest 3.12 by #123265

hauntsaninja added a commit to hauntsaninja/black that referenced this issue Sep 7, 2024
Fixes psf#4446
See python/cpython#123821

It's possible this is too strict? We could instead do this anytime the
AST safety check fails, but feels weird to have that happen
non-deterministically
@hauntsaninja hauntsaninja closed this as not planned Won't fix, can't repro, duplicate, stale Sep 7, 2024
hauntsaninja added a commit to psf/black that referenced this issue Sep 7, 2024
Fixes #4446
See python/cpython#123821

It's possible this is too strict? We could instead do this anytime the
AST safety check fails, but feels weird to have that happen
non-deterministically
@pablogsal
Copy link
Member

Okay, was getting some confusing bisect results, but I think this is fixed on latest 3.12 by #123265

I'm a bit confused. That should not be able to affect black at all. It's just initializing the tokeniser structure consistently with zeros

@JelleZijlstra
Copy link
Member

The bug affected Black's AST safety check, which parses the code before and after formatting to see if it's the same. So if the uninitialized memory in the tokenizer got used somehow, it's plausible that it could affect Black. I'd also expect other code that does ast.parse to occasionally behave erratically.

@pablogsal
Copy link
Member

Ahhhhh, I misunderstood that this PR was the regression instead that this PR was the fix!

@pablogsal
Copy link
Member

Man time to setup a valgrind Buildbot 👀

@pablogsal
Copy link
Member

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) release-blocker type-bug An unexpected behavior, bug, or error
Projects
Development

No branches or pull requests

4 participants