Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python Launcher for Windows (py.exe) breaks on non-python shebang line #94399

Closed
CAD97 opened this issue Jun 29, 2022 · 27 comments
Closed

Python Launcher for Windows (py.exe) breaks on non-python shebang line #94399

CAD97 opened this issue Jun 29, 2022 · 27 comments
Assignees
Labels
OS-windows type-bug An unexpected behavior, bug, or error

Comments

@CAD97
Copy link

CAD97 commented Jun 29, 2022

Bug report

Given a non-python shebang line, e.g. #!/usr/bin/env bash, py.exe falls over with

❯ py ./x.py
Unable to create process using '/usr/bin/env bash ./x.py'

py.exe should not be trying to interpret non-python shebang lines on Windows. If py has been invoked (especially if manually invoked rather than implicitly by ftype association), the intent is to interpret the file as a python script.

I originally hit this with rust-lang/rust's x.py script, which is currently using /usr/bin/env bash to attempt to launch Python 3 across all OSes, whether they provide a python executable or just python3. (MSYS makes this more fun, as it does directly interpret and use the shebang line, rather than calling the ftype registered handler.)

Your environment

❯ winget list python
Name            Id                                     Version     Available   Source
-------------------------------------------------------------------------------------
Python 3        Python.Python.3                        3.10.4150.0 3.10.5150.0 winget
Python Launcher {691AAAA1-FE86-4973-8DA2-6AA2B3327562} 3.10.7751.0

Microsoft Windows
Version 21H2 (OS Build 22000.739)

@CAD97 CAD97 added the type-bug An unexpected behavior, bug, or error label Jun 29, 2022
@eryksun
Copy link
Contributor

eryksun commented Jun 29, 2022

The launcher implements the virtual shebang "#!/usr/bin/env <name>" to search the directories in the PATH environment variable for the given name. The search is implemented by WinAPI SearchPathW(). If the name has no extension, SearchPathW() is called for each of the PATHEXT extensions in succession until the file is found. For example, the launcher will search for "bash.COM", "bash.EXE", and so on.

Chances are that a developer has "%SystemRoot%\System32\bash.exe" from WSL. However, the launcher is distributed as a 32-bit application, in which case accessing "%SystemRoot%\System32" gets redirected to the 32-bit system directory "%SystemRoot%\SysWOW64", in which "bash.exe" likely doesn't exist.

If the system happens to have bash from an MSYS2 or Cygwin installation in PATH, then that version of the shell will be found.

@CAD97
Copy link
Author

CAD97 commented Jun 29, 2022

And if no bash is available (as is the case on my machine1), we end up with this error that at the very least reads as if it's trying to run some application named /usr/bin/env. If the shebang is resolved, calling the resolved program is at least reasonable. If the shebang fails to resolve, though, it seems reasonable to (print a warning and) interpret the file with the default python, as running py is a good hint that the user believes the file to be a python script.

Footnotes

  1. Time and time again I'm the Windows dev who pops in to say that unixisms can't be assumed to work on Windows.

@eryksun
Copy link
Contributor

eryksun commented Jun 30, 2022

There is no difference between a py.exe process that's executed manually versus a file association, so any behavior change for the env virtual command would have to apply to all cases.

For the new implementation of the launcher (3.11+), I think it would be okay to use the default Python when env is passed a non-Python command, if that's not already the case. (I haven't had time to review most of the new launcher code in "PC/launcher2.c".) The new behavior could also be corrected to search for "python" commands that include a version specifier, i.e. "python<version><.exe>" (e.g. "python3.11.exe"). SearchPathW() doesn't reliably append the optional lpExtension value (e.g. it will append it to ".\python3.11" but not to "python3.11"), so the ".exe" extension has to be added manually if it's missing. Fall back on a registered command if SearchPathW() fails.

For the old launcher, it at least shouldn't try to pass the env command to CreateProcessW(). The launcher should fail if the search fails, and the error message should be clear. The user may not know whether or how the launcher supports the env virtual command. The documentation doesn't specify the behavior except for env python.

@CAD97
Copy link
Author

CAD97 commented Jun 30, 2022

For the new implementation of the launcher (3.11+), I think it would be okay to use the default Python when env is passed a non-Python command, if that's not already the case. (I haven't had time to review most of the new launcher code in "PC/launcher2.c".)

I glanced at the implementation; launcher2.c falls back to the no-shebang behavior if the shebang is not a python shebang or a registered py.ini command. So the new launcher already fixes this issue / doesn't implement this unexpected behavior 🎉

cpython/PC/launcher2.c

Lines 900 to 921 in 68fb032

} else if (_findCommand(search, command, commandLength)) {
search->executableArgs = &command[commandLength];
search->executableArgsLength = shebangLength - commandLength;
debug(L"# Treating shebang command '%.*s' as %s\n",
commandLength, command, search->executablePath);
} else if (_shebangStartsWith(command, commandLength, L"python", NULL)) {
search->tag = &command[6];
search->tagLength = commandLength - 6;
search->oldStyleTag = true;
search->executableArgs = &command[commandLength];
search->executableArgsLength = shebangLength - commandLength;
if (search->tag && search->tagLength) {
debug(L"# Treating shebang command '%.*s' as 'py -%.*s'\n",
commandLength, command, search->tagLength, search->tag);
} else {
debug(L"# Treating shebang command '%.*s' as 'py'\n",
commandLength, command);
}
} else {
debug(L"# Found shebang command but could not execute it: %.*s\n",
commandLength, command);
}

The user may not know whether or how the launcher supports the env virtual command. The documentation doesn't specify the behavior except for env python.

Which is actually part of my argument that py should assume the script is a python script, and not try to implement env. I know that I at least as a Windows user, where ftype is determined solely by extension/metadata and not by shebang, would expect running ./x.py or especially py ./x.py to always use the python interpreter.

Even on a unixy platform where a shebang is respected for ./x.py, a user can still run e.g. python3 ./x.py, which will interpret as python and not try to dispatch to the shebang-specified interpreter. Any *.py file which cannot be interpreted as Python is going to cause problems one way or another.

The new behavior could also be corrected to search for "python" commands that include a version specifier

launcher2.c already does implement this behavior. A shebang of (vcommand) python{version} will be sliced directly up into calling py -{version}.

SearchPathW() doesn't reliably append the optional lpExtension value

It's documented as "the extension is added only if the specified file name does not end with an extension," so yeah, if the base name contains a ., the caller has to recognize that it's not an extension and append the proper extension manually. (And I'm surprised that .\python3.11 functions for anything at all, given that is not a file name, and in this API you pass the path separately.)

For the old launcher, it at least shouldn't try to pass the env command to CreateProcessW(). The launcher should fail if the search fails, and the error message should be clear.

I just checked the code, and I think this line is the offending one that puts the full shebang line into the command buffer to be run. Just removing that should make maybe_handle_shebang fall through and return (signalling that no shebang is present). Alternatively (or additionally), an additional outparam to parse_shebang is needed for when cp == NULL signals that command lookup failed.

@eryksun
Copy link
Contributor

eryksun commented Jun 30, 2022

Which is actually part of my argument that py should assume the script is a python script, and not try to implement env.

It's a bad idea to change the basic behavior of the old launcher, which has been in place for a long time. But it seems reasonable to clean it up to not call CreateProcessW() if the env search fails and to provide a clearer error message in that case.

Even on a unixy platform where a shebang is respected for ./x.py, a user can still run e.g. python3 ./x.py, which will interpret as python and not try to dispatch to the shebang-specified interpreter.

The launcher wasn't implemented to take a private parameter when it's invoked for a file association, as opposed to explicitly invoked on the command line, so there's no way to reliably distinguish these cases for shebang handling. That said, if you specify a version, the shebang is ignored, e.g. run py -3 .\x.py.

It's documented as "the extension is added only if the specified file name does not end with an extension," so yeah, if the base name contains a ., the caller has to recognize that it's not an extension and append the proper extension manually. (And I'm surprised that .\python3.11 functions for anything at all, given that is not a file name, and in this API you pass the path separately.)

SearchPathW() can be called with qualified paths. This occurs all the time, e.g. as called by the internal implementation of CreateProcessW(). In a search context in Windows, partially qualified paths usually aren't resolved solely (if at all) against the current working directory. For example, "spam\python" will be resolved against each directory in the lpPath search path until a match is found. SearchPathW() will try to append lpExtension only if the name has no extension. The exception, however, is a filename that explicitly refers to the current working directory by beginning with a "." or ".." component, such as ".\python3.11" or "..\python3.11". These cases are resolved only against the current working directory. This is the inconsistent case, because SearchPathW() will try appending lpExtension regardless of whether lpFileName already has an extension (e.g. when searching for ".\python3.11", it can find ".\python3.11.exe" if ".exe" is the optional extension). We have to avoid this inconsistency, and we need to support base filenames that Windows sees as having an extension, e.g. "python3.11". Thus the ".exe" extension must be added manually instead of relying on lpExtension.

@CAD97
Copy link
Author

CAD97 commented Jun 30, 2022

The user may not know whether or how the launcher supports the env virtual command. The documentation doesn't specify the behavior except for env python.

Not to hurt my own case, but

The /usr/bin/env form of shebang line has one further special property. Before looking for installed Python interpreters, this form will search the executable PATH for a Python executable. This corresponds to the behaviour of the Unix env program, which performs a PATH search.

This could be read as documenting that #! /usr/bin/env x searches PATH for the (assumed to be) python executable x. I'd argue that it doesn't, and only the four listed forms are documented as supported, but in the face of the existing implementation, that documentation can be argued to refer to that behavior.

@eryksun
Copy link
Contributor

eryksun commented Jun 30, 2022

This could be read as documenting that #! /usr/bin/env x searches PATH for the (assumed to be) python executable x. I'd argue that it doesn't, and only the four listed forms are documented as supported, but in the face of the existing implementation, that documentation can be argued to refer to that behavior.

I don't think it was the intention to document support for anything but a "python" executable, with some supported extension from PATHEXT (".com", ".exe", etc), per the listed case "/usr/bin/env python". However, the old launcher doesn't limit itself to the documented behavior. It also has strange behavior in refusing to search PATH for versioned "python" names such as "python3", while happily searching for "bash".

@CAD97
Copy link
Author

CAD97 commented Jul 3, 2022

While I should have gathered this from when I skimmed the code: the path lookup behavior isn't limited to /usr/bin/env, but happens if you just write a non-python name. This means that #! py will happily result in an infinite loop of recursively creating new py processes 😄

@zooba
Copy link
Member

zooba commented Jul 29, 2022

If I'm reading this right, I accidentally fixed everything in the new launcher and we don't need this issue anymore? (That's my favourite kind of fix.)

This means that #! py will happily result in an infinite loop of recursively creating new py processes

Yep 😄 Might be worth catching that specific case, but py -3.10 should be fine, so it's not entirely trivial. Spelling it *nix style is better, of course, if only because it'll work on more platforms.

@CAD97
Copy link
Author

CAD97 commented Jul 29, 2022

If I'm reading this right, I accidentally fixed everything in the new launcher and we don't need this issue anymore? (That's my favourite kind of fix.)

Yes. As the new py.exe is currently implemented, it doesn't implement the undocumented behavior of the old py.exe of interpreting shebangs that aren't for python{version}, so means that it will just call #!/usr/bin/env bash with the python interpreter.

(This is imho the correct behavior. Calling py means "launch python, matching the file-requested version", not "launch via shebang, unless there isn't a shebang, in which case use python.")

@eryksun
Copy link
Contributor

eryksun commented Jul 29, 2022

If I'm reading this right, I accidentally fixed everything in the new launcher and we don't need this issue anymore?

The new launcher doesn't implement the search semantics of "/usr/bin/env python". The behavior is documented as follows:

The /usr/bin/env form of shebang line has one further special property. Before looking for installed Python interpreters, this form will search the executable PATH for a Python executable. This corresponds to the behaviour of the Unix env program, which performs a PATH search.

In the old launcher, the virtual "env" command is implemented to work like a PATH search in a CLI shell, by calling SearchPathW() for all PATHEXT extensions. (CreateProcessW() itself calls SearchPathW(), but only with ".EXE" as the optional extension, not all PATHEXT extensions.) The old launcher doesn't search PATH for versioned "python" commands such as "python3" or "python3.10", though I always thought it should, with care taken to add the ".exe" extension for names such as "python3.10" (i.e. ".10" is not an extension). The old launcher's "env" virtual command also searches for any command that doesn't start with "python", which is undocumented behavior that the new launcher doesn't have to implement. I say it's undocumented because the documentation refers to the " /usr/bin/env form of shebang line", which is only listed as "/usr/bin/env python".

@zooba
Copy link
Member

zooba commented Aug 1, 2022

Yeah, from a security POV I'm happier not letting the launcher launch any application just because someone double-clicked on a .py file (though of course that file may well launch it, and likely more reliably, but easier to detect).

At the same time I don't want to exclude non-CPython runtimes from being detected, and only allowing (effectively) python.exe (or PATHEXT) to be searched could do that.

I think on balance, I'm coming down on the "run anything" path, and we recommend against installing or using the launcher in secure contexts. Which I normally recommend against anyway, just to remove a few potential hijacks, so the advice won't change.

To be clear about the semantics:

  • when the shebang line matches /usr/bin/env (\S+)(.+)
  • search each directory on PATH for the first capture, following SearchPath semantics
  • if found, pretend it's a python.exe-like CLI and run it
  • if not, continue as we do today (which means failing if the first capture isn't python.*

@eryksun
Copy link
Contributor

eryksun commented Aug 1, 2022

The old launcher doesn't try ShellExecuteExW(), so using PATHEXT isn't important. Anyway, the old find_on_path() ignores the proper search priority of PATH directories. The correct search has to check for each extension before advancing to the next directory, which can't be implemented with SearchPathW(). I think the launcher should use a single SearchPathW() call, as CreateProcessW() does. If the given name has no extension or ends in a version specifier (e.g. "python3.10-32"), manually add ".exe" instead of using the lpExtension parameter of SearchPathW(). But don't bother to validate the version or binary type of the file if it has a version specifier; assume it's correct. Some Python distributions do include versioned executable names, e.g. ActivePython does.

@zooba
Copy link
Member

zooba commented Aug 2, 2022

@eryksun Would appreciate your review of my PR.

Interestingly, SearchPath also checks the application directory like CreateProcess (which makes sense, though surprised me a little), which needed the env var so that tests wouldn't break in the build directory. Forcing safe search paths seems like an easy decision, but it would be nice to have all the options that LoadLibraryEx has...

@eryksun
Copy link
Contributor

eryksun commented Aug 3, 2022

Interestingly, SearchPath also checks the application directory like CreateProcess

If lpPath is NULL, SearchPathW() uses RtlGetSearchPath(). OTOH, CreateProcessW() uses RtlGetExePath(). By default they're the same.

SearchPathW() default:

>>> import ctypes
>>> ntdll = ctypes.WinDLL('ntdll')
>>> kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
>>> sp = ctypes.c_wchar_p()
>>> exe_sp = ctypes.c_wchar_p()
>>> os.environ.get('NoDefaultCurrentDirectoryInExePath') is None
True
>>> os.environ['PATH'] = 'PATH'

>>> ntdll.RtlGetSearchPath(ctypes.byref(sp))
0
>>> print(*sp.value.split(';'), sep='\n')
C:\Program Files\Python311
.
C:\Windows\SYSTEM32
C:\Windows\system
C:\Windows
PATH

In the above, "C:\Program Files\Python311" is the application directory. A command that includes one or more path separators (e.g. "dir\command") is resolved against each directory in the search path, not just against the current working directory, and the application directory has precedence.

CreateProcessW() default:

>>> ntdll.RtlGetExePath('command', ctypes.byref(exe_sp))
0
>>> print(*exe_sp.value.split(';'), sep='\n')
C:\Program Files\Python311
.
C:\Windows\SYSTEM32
C:\Windows\system
C:\Windows
PATH  

SearchPathW() safe mode:

>>> kernel32.SetSearchPathMode(1)
1
>>> ntdll.RtlGetSearchPath(ctypes.byref(sp))
0
>>> print(*sp.value.split(';'), sep='\n')
C:\Program Files\Python311
C:\Windows\SYSTEM32
C:\Windows\system
C:\Windows
.
PATH

An application can also pass an explicit lpPath value that further limits the search, e.g. to just the directories in the "PATH" environment variable.

CreateProcessW() safe mode (no implicit current directory):

>>> os.environ['NoDefaultCurrentDirectoryInExePath'] = '1'
>>> ntdll.RtlGetExePath('command', ctypes.byref(exe_sp))
0
>>> print(*exe_sp.value.split(';'), sep='\n')
C:\Program Files\Python311
C:\Windows\SYSTEM32
C:\Windows\system
C:\Windows
PATH

The current directory must remain in a CreateProcessW() search if the command contains one or more backslashes:

>>> os.environ['NoDefaultCurrentDirectoryInExePath'] = '1'
>>> ntdll.RtlGetExePath('dir\\command', ctypes.byref(exe_sp))
0
>>> print(*exe_sp.value.split(';'), sep='\n')
C:\Program Files\Python311
.
C:\Windows\SYSTEM32
C:\Windows\system
C:\Windows
PATH

@zooba
Copy link
Member

zooba commented Aug 3, 2022

Okay, so I could just load up PATH and pass it in directly to avoid the extra searches. I'll have to think about whether that's preferable to the default behaviour or not, since it would only affect cases that we don't construct ourselves. But it might be helpful if e.g. you copy py.exe into a venv Scripts directory to have it prefer the venv even if it's not activated.

@zooba
Copy link
Member

zooba commented Aug 3, 2022

Decided to go with the PATH-only search, which means I had to revise the tests. Also discovered a few tweaks were needed to properly handle converting lines like /usr/bin/python3.12_d.exe (optional _d and .exe) into py -3.12 equivalent. (We'll lose the debug option, but at least will still find something. Running py_d.exe will get you a debug runtime.)

miss-islington pushed a commit to miss-islington/cpython that referenced this issue Aug 3, 2022
…'/usr/bin/env' shebang lines (pythonGH-95582)

(cherry picked from commit 67840ed)

Co-authored-by: Steve Dower <steve.dower@python.org>
zooba added a commit that referenced this issue Aug 3, 2022
@CAD97
Copy link
Author

CAD97 commented Aug 3, 2022

I still personally think py shouldn't be in the business of interpreting non-python shebangs, but I'll defer to y'all here. Especially since pulling a python from PATH (and especially venv) is reasonable, and there's no good way to say "is this executable a python interpreter."

As a compromise: would it be reasonable to print a warning if launching an interpreter that is not python{}? (E.g. #!/usr/bin/env bash would print a warning and then call bash on the file, assuming e.g. WSL bash is available.)

miss-islington added a commit that referenced this issue Aug 3, 2022
…bin/env' shebang lines (GH-95582)

(cherry picked from commit 67840ed)

Co-authored-by: Steve Dower <steve.dower@python.org>
@zooba
Copy link
Member

zooba commented Aug 4, 2022

As a compromise: would it be reasonable to print a warning if launching an interpreter that is not python{}?

Also not pypy, and not jython, and not ipy, and not ... etc, etc. There's a decent enough range of legitimate interpreters that aren't named python that I wouldn't feel comfortable relegating them to some kind of second-class "you get warned if you run this" status.

If someone creates a .py file and gives it a shebang to call bash, that person is lying about what their file does :-) But they could just as easily use /python and then immediately call bash anyway, so it doesn't really provide any additional security to filter the shebang line.

@zooba zooba closed this as completed Aug 4, 2022
@CAD97
Copy link
Author

CAD97 commented Aug 4, 2022

For completeness sake, the reason rust-lang/rust's x.py currently uses a bash shebang line is that python doesn't work anymore on many distros due to the python2/python3 issue.

x.py would like to run as "python3 if available, otherwise python2," and only requiring either to be present. Since python is often not available (or a shim saying to use python2 or python3 or to apt get python-is-pythonN if you're lucky), the remaining somewhat portable way to get the behavior is a bash polyglot prefix which calls the first available option of py -3, python3, python, python2. Unfortunately, on windows with py, this has surprising results, as

  • if a cygwin bash is available, it works;
  • if a WSL bash is available, it "works" but now you're running under WSL instead of native; and
  • if bash is not available, the old launcher gives a horrible error, or the new launcher (I think?) falls back to just trying the default python.

x.py is likely changing again to have a python3 shebang line and providing separate x.sh and x.ps1 scripts to do the python interpreter sniffing, and imho this is the better solution than a bash polyglot prefix. (As such, no further action is required; this issue can be considered resolved, both in terms of the new launcher having a better failure case and rust x.py no longer being bit by surprise WSL.) But it still stands that a somewhat reasonable path of getting #! python-style "whichever python is available" leads to a bash shebang and hitting this behavior in py which can be unexpected.

It'd be nice to call out in the docs that a env shebang will call whatever is looked up in PATH even if it isn't a python to hopefully make the landing a little smoother if some other project hits this in the future.

@zooba
Copy link
Member

zooba commented Aug 4, 2022

It'd be nice to call out in the docs that a env shebang will call whatever is looked up in PATH even if it isn't a python to hopefully make the landing a little smoother if some other project hits this in the future.

That would be a nice little tweak (edit as a new issue). Though it should be noted that it definitely assumes it's a Python interpreter, and will launch and pass arguments to it as if it is a Python interpreter. It just doesn't have any way to verify whether it's actually a Python runtime or not (at least until it fails when it doesn't know how to launch a .py file).

@ChrisDenton
Copy link

Hm, could there be a simple way for Python interpreters to register with the py launcher on install? This wouldn't be a security feature but would act as a sanity check.

@zooba
Copy link
Member

zooba commented Aug 8, 2022

Yes, it's PEP 514, and it already works. I can't see us extending venv to use it though, as those are usually meant to be transient, and will just be deleted by users (which won't clean up the registry).

@CAD97
Copy link
Author

CAD97 commented Aug 13, 2022

That would be a nice little tweak (edit as a new issue).

Followup filed as gh-95946 with initial attempted wording at gh-95947.

@Khold6458
Copy link

Sorry if this isn't the best place to post. It appears I ran into some behavior that was changed and is closely related to the discussion here.

#! C:\full\path\to\python used to work to run a script with a specific python executable. It appears this is nowhere in the documentation, but there are various internet posts that include advice to use this behavior to specify a python.exe within a venv, for example #! C:\venv\Scripts\python.

I think there is mention of this exact use case in this issue but I don't quite understand it:

it might be helpful if e.g. you copy py.exe into a venv Scripts directory to have it prefer the venv even if it's not activated.

Should this behavior be restored? In either case, what are some alternative (perhaps better or more correct?) ways to achieve this?

@zooba
Copy link
Member

zooba commented Oct 25, 2022

A new issue would be best, yeah.

I'm surprised that I missed this, to be honest, it's something that I've definitely used before. I don't think there's any harm in adding it back though.

@eryksun
Copy link
Contributor

eryksun commented Oct 25, 2022

#! C:\full\path\to\python used to work to run a script with a specific python executable. It appears this is nowhere in the documentation, but there are various internet posts that include advice to use this behavior to specify a python.exe within a venv, for example #! C:\venv\Scripts\python.

A shebang with a fully-qualified executable should be supported by the launcher. You should be able to associate a script with a particular virtual environment. However, that's a separate issue from the discussion about supporting the "/usr/bin/env" virtual command. We need a new issue for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
OS-windows type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

5 participants