Allow custom location for .venv #371

Closed
PhilipVinc opened this issue Jul 17, 2023 · 25 comments

@PhilipVinc

I wish there was a way to specify a folder where rye should keep all virtual environments, and provide a simple command to allow activating that environment.

For example, this could be an env variable or configuration flag such as

RYE_VIRTUALENV_DEPOT="~/Documents/rye-envs"

and the environments could be stored in directories named NAME-HHH, where NAME is the name of the project and HHH is some hash of its path, for example.
(I am aware that this means that changing the path of the project would make rye create a new venv... but I don't have a great alternative in mind).

Why?

I keep all the clones of the repositories I work on in a Dropbox folder that is synced between my laptop and my workstation. That way I don't need to remind myself to commit my work every time I switch computers, and I have my stashes with me at all times.

However, my workstation and laptop have different architectures (Mac and linux) so I cannot keep the virtual environments in the same folder.
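
The NAME-HHH naming scheme proposed above could be sketched roughly as follows. This is only an illustration: rye has no such setting, and the function name `depot_env_dir`, the choice of SHA-256, and the 8-character digest length are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def depot_env_dir(project_dir: str, depot: str = "~/Documents/rye-envs") -> Path:
    """Return <depot>/<NAME>-<HHH> for a project directory (hypothetical helper)."""
    project = Path(project_dir).expanduser().resolve()
    # Hash the absolute project path so that two projects sharing a name
    # (for example two clones of the same repository) get distinct environments.
    digest = hashlib.sha256(str(project).encode("utf-8")).hexdigest()[:8]
    return Path(depot).expanduser() / f"{project.name}-{digest}"
```

Because the hash is derived from the absolute path, moving the project changes the directory name, which is exactly the drawback acknowledged in the proposal.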

@cnpryer
Contributor

cnpryer commented Jul 17, 2023

I'm curious about this. What would it take to convince you that a local .venv is better? (I'm not saying it is.)

@PhilipVinc
Author

PhilipVinc commented Jul 18, 2023

Maybe let me say something more about my setup.
I use direnv + pyenv to set up my projects.
In short, I have an .envrc file in my project directories that specifies the version of python I want to use in this folder, and the name of the virtual environment (which I have to specify by hand). This normally all goes to a .venv folder in the current workspace, but you can override a global env variable and set it up such that all such environments are located in a specific folder. I set it up such that it is located in ~/Documents/python_envs/{env-name}/{env-version}.

cat .envrc
layout pyenv-venv 3.11.2 netket

This allows me to cd into the workspace and direnv will activate the correct environment (and create it if needed). I then simply install dependencies with poetry, pip or another package manager.
That's where I would like to use a more advanced package manager such as rye to make this process smoother.

What I need is to keep my workspace in sync across different computers with different architectures, with minimal friction.
If there is a better approach than what I am doing that marries with rye, I would be happy to know it, but I feel like there is no point in discussing whether I should connect to a machine remotely or whatnot.

By the way, storing 'binary' blobs and installed packages somewhere other than the project folder is not terrible for isolation. This is exactly what Julia's package manager does: it is designed such that all package data is kept in a global folder (~/.julia/depot) and only the Project and Manifest files (in Python terms, only the pyproject and lock files) are kept in the working directory.

@mitsuhiko
Collaborator

One thing I was considering was adding a way to have .venv be a symlink to a managed location. I am not sure if getting rid of .venv is a good idea in general, as it means that you need to consider different setups again, which makes documentation and everything around it harder. But I do understand the general desire.

One way to square this would be to have a flag in the rye config that turns on out-of-tree virtualenvs where only a symlink is placed and the actual virtualenv is placed elsewhere.
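
A minimal sketch of the out-of-tree idea described above, assuming a hypothetical helper `link_out_of_tree_venv` and a managed root directory chosen by the user. It only creates the target directory and the `.venv` symlink; building the actual virtualenv inside the target is left out, and nothing here reflects an existing rye option.

```python
import os
from pathlib import Path


def link_out_of_tree_venv(project_dir: Path, managed_root: Path) -> Path:
    """Create the real environment directory under managed_root and point .venv at it."""
    # Assumption: environments are keyed by project directory name only.
    target = managed_root / project_dir.name
    target.mkdir(parents=True, exist_ok=True)

    link = project_dir / ".venv"
    if link.is_symlink():
        # Replace an existing (possibly stale or dangling) link.
        link.unlink()
    elif link.exists():
        # Refuse to clobber a real in-tree virtualenv.
        raise FileExistsError(f"{link} exists and is not a symlink")

    os.symlink(target, link, target_is_directory=True)
    return target
```

Tools that look for `.venv` in the project root keep working, since they follow the link, while the bulky contents live elsewhere.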

@PhilipVinc
Author

A symlink would not be compatible with my personal setup, because I sync those folders with dropbox (which will kill the symlink). In general, I would prefer to keep the virtual environment outside of folders I sync on the internet but of course that's just my personal point of view.

Out of curiosity, what would be the added complication of having a global depot for the virtual environments, besides having to add and maintain a sort of rye activate function that activates the local virtual environment?

@mitsuhiko
Collaborator

mitsuhiko commented Jul 19, 2023

because I sync those folders with dropbox (which will kill the symlink)

Does Dropbox kill symlinks? I thought they are retained.

Out of curiosity, what would be the added complication of having a global depot for the virtual environments, beside having to add and maintain a sort of rye activate function that activates the local virtual environment?

If rye goes down the path of too many different incompatible setups, it means every user and every script interacting with a rye-managed environment needs to start learning about all of those peculiarities. I'm not a huge fan of needing virtualenvs at all, but the goal is to have a consistent setup as much as possible. I do realize, though, that it might be entirely impossible to avoid.

@Zander-1024
Contributor

Why don't you try to get dropbox to ignore this folder?

@PhilipVinc
Author

I can't, because I have one such folder in every python project I work on (~several tens) and would have to do this on every device.

@doolio

doolio commented Aug 29, 2023

I use direnv + pyenv to setup my projects.

cat .envrc
layout pyenv-venv 3.11.2 netket

Shouldn't it just be layout pyenv 3.11.2? Why pyenv-venv? In any case, rye removes the need of pyenv. FYI, as an alternative to pyenv there is asdf or rtx which use pyenv under the hood for python but can be used for other languages/tools.

Alternatives to rye that allow both centrally located environments and environments in the project root, and that you did not mention above, are hatch and pdm. Unlike rye, though, they don't install different Python versions for you, so you would still need something like pyenv. Unlike Poetry and Hatch, but like rye, PDM is not limited to a specific build backend; users have the freedom to choose any PEP 621 compliant build backend they prefer.

+1 for rye to allow centrally located environments (perhaps as the default) and environments in the project root.

@mitsuhiko
Collaborator

I'm closing this. I have no desire to add support for other locations into the tool in an attempt to standardize what I believe the correct behavior is (the one true location).

@simonw

simonw commented Feb 4, 2024

Just dropped by to request this same feature, for the same reason: I keep hundreds of my Python projects in a Dropbox folder (as insurance against someone stealing my laptop before I've run git commit on them) and I use pipenv to manage virtual environments purely because I want those environments to live outside of Dropbox, so I don't end up backing up hundreds of environments to Dropbox unnecessarily.

@mitsuhiko
Collaborator

@simonw I wonder if it would make more sense if rye had an option where it automatically sets the necessary flags on .venv to prevent syncing, e.g. com.dropbox.ignored in the case of Dropbox. Then you can leave those files where they are, but .venv won't end up replicated to Dropbox.
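
For illustration, setting that attribute by hand might look like the sketch below. The helper `dropbox_ignore_command` is hypothetical, and the command lines are assumptions based on my reading of Dropbox's help pages for macOS (`xattr`) and Linux (`attr`); newer Dropbox builds may expect a different attribute, so check the current Dropbox documentation before relying on this.

```python
import sys
from typing import List


def dropbox_ignore_command(path: str, platform: str = sys.platform) -> List[str]:
    """Build the command that marks `path` as ignored by Dropbox (assumed syntax)."""
    if platform == "darwin":
        # macOS: write the extended attribute with the xattr tool.
        return ["xattr", "-w", "com.dropbox.ignored", "1", path]
    if platform.startswith("linux"):
        # Linux: the attr tool stores it in the user.* namespace.
        return ["attr", "-s", "com.dropbox.ignored", "-V", "1", path]
    raise NotImplementedError(f"no known command for platform {platform!r}")
```

The returned list can be passed to `subprocess.run(cmd, check=True)` right after the environment is created, so the step would not have to be repeated by hand for every project.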

@dedeswim

I would also appreciate this feature. For my set-up, I have limited storage in my $HOME directory, where we usually store code, as it is on a mounted disk not managed by me, and I usually store virtual environments on the local SSD of the machine. I also tried to move .venv on the local SSD and then soft-link the real location to .venv in the project directory, but I get

error: failed to canonicalize path `/home/.../.venv/bin/python`
  Caused by: No such file or directory (os error 2)
error: could not write production lockfile for project

I would also be happy to not have this feature fully supported, and to manually symlink .venv every time I sync a project for the first time, but this doesn't seem to work at the moment.

@mitsuhiko
Collaborator

@dedeswim how do you do this with node_modules?

@dedeswim

Thanks for the reply.

I'm not a node user, so it's the first time I've faced this issue.

What I do with e.g., Conda, is to set CONDA_HOME to some place on the local SSD, but of course this is a bit different from what happens with rye.

@mitsuhiko
Collaborator

I wonder how much of that type of usage is coming from the fact that people historically did it. Imagine you were never given that opportunity in the first place, would you still set up your computer this way? A lot of the performance benefits of modern tools require the virtualenv and the cache to be co-located on the same disk. Which in this case means that ~/.rye and where your code lives should be on the same mount.

While we do not leverage this much in rye today, uv already leverages this to some degree. So while there might be some flag in the future to automatically relocate some backing information of a virtualenv to a different mount, it will always come with disadvantages.

@dedeswim

Unfortunately I wasn't the one setting up the computer (as it is half-managed by the university I work at), but, if it were my choice, I agree that I would set things up so that everything would be on the same disk.

The advantage of the non-local disk where I usually store code is that it is synchronised across machines and is redundantly backed up. This makes it easy to jump between machines (in case resources on one are busy to run the experiments I need), and at the same time I don't need to worry about losing my code (in case I forget to commit). Instead, on the local disk I can store "disposable" files, like a virtual environment.

Anyways, thanks for your time thinking about this, I will think of another way to use rye even without this feature because I honestly really like it!

@tomerha

tomerha commented May 5, 2024

In case it matters, I unfortunately find rye unusable for some of my setups where I have the project workdir mounted on a remote server over sshfs and then the .venv realpath moves between local and remote executions :(

It would be really nice to support this feature.

@mitsuhiko
Collaborator

@tomerha how do you deal with this in node?

@tomerha

tomerha commented May 9, 2024

@tomerha how do you deal with this in node?

I don't use node, just Python 😇

With Python, for now I use hatch which works for me since the venv is external

@mitsuhiko
Collaborator

I understand that it works, but supporting it has a lot of downsides requiring explicit configuration to discover the venv. It means that every tool needs to start learning where that venv is.

Since the same issue must exist in node they either must have found a solution there, the issue is not pressing enough or it’s unresolved and there must be some discussion around it.

@PhilipVinc
Author

What do you mean by "every tool needs to start learning where that venv is"?

Isn't it enough for rye to know where the venv is, with rye's Python shims sorting out the rest? Or do you mean that the VIRTUAL_ENV variable must also be set correctly?

@mitsuhiko
Collaborator

It's not enough if just rye knows where the venv is, because the goal is that your stuff "just works" across tooling. If we establish .venv as the canonical location of a virtualenv, then any tooling can automatically work for as long as the venv is in a synced state. This means editors can rely on being able to auto-discover it.

Today a lot of the bad DX from python comes from tools accidentally falling back to global python or requiring manual configuration for which virtualenv to use resulting in a lot of extra cost to be paid everywhere.
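
To make the auto-discovery argument concrete, here is a rough sketch of what an editor or script could do if `.venv` in the project tree is the only location it has to know about. The function name `find_project_venv` is made up for this example; the check relies on `pyvenv.cfg`, the marker file that standard virtual environments contain.

```python
from pathlib import Path
from typing import Optional


def find_project_venv(start: Path) -> Optional[Path]:
    """Walk upwards from `start` and return the first .venv that looks like a virtualenv."""
    start = start.resolve()
    for directory in (start, *start.parents):
        candidate = directory / ".venv"
        # pyvenv.cfg marks the root of a virtual environment.
        if (candidate / "pyvenv.cfg").is_file():
            return candidate
    return None
```

No configuration is needed here: the tool needs only the file it is working on. With a configurable location, every such tool would additionally have to read rye's settings, which is the cost described above.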

@tomerha

tomerha commented May 9, 2024

I'm not that familiar with npm or node, but from what I understood the similar issue is with node_modules?

I didn't try myself but it looks like with node it can be manually worked around using bind mounts (see npm/npm#7120 (comment)).

However, this solution doesn't work with rye: when the venv is created, rye tries to create a symlink to the Python interpreter, and this fails since symlinks on sshfs don't behave as expected (e.g. with -o follow_symlinks they're translated to regular file mirrors from the server side).

@mitsuhiko
Collaborator

Rye should not manipulate owners so in theory that step is not necessary to begin with. What however might be an issue is that the same trick will run into rye’s behavior to re-create the venv. That might require further changes.

Maybe the right way forward here would be to make an issue for precisely the situation we need to solve for and to see what solutions we can come up with.

@phromo

phromo commented May 20, 2024

@tomerha how do you deal with this in node?

FYI, in node there is an alternative to npm, pnpm, which solves this by creating symlinks inside myproject/node_modules to ~/.pnpm/

Personally I think creating a .venv symlink is good enough. There are plenty of sync scenarios besides dropbox. I use the unison file synchronizer and it has the follow option that controls whether links are followed or copied as-is. That said, unison also has ignore directives, so I can just ignore .venv.

For me the main rationale for using symlinks would be to be able to nuke all forgotten venvs, to free up disk space, in a central place once in a while... and also I like that I can du -sh ./myproject and see the "true size" and not the including-venv-size.

zanieb added a commit to astral-sh/uv that referenced this issue Sep 3, 2024
…ONMENT` (#6834)

Allows configuration of the (currently hard-coded) path to the virtual
environment in projects using the `UV_PROJECT_ENVIRONMENT` environment
variable.

If empty, we'll ignore it. If a relative path, it will be resolved
relative to the workspace root. If an absolute path, we'll use that.

This feature targets use in Docker images and CI. The variable is
intended to be set once in an isolated system and used for all uv
operations.

We do not expose a CLI option or configuration file setting — we may
pursue those later but I see them as lower priority. I think a
system-level environment variable addresses the most pressing use-cases
here.

This doesn't special-case the system environment. Which means that you
can use this to write to the system Python environment. I would
generally strongly recommend against doing so. The insightful comment
from @edmorley at
#5229 (comment)
provides some context on why. More generally, `uv sync` will remove
packages from the environment by default. This means that if the system
environment contains any packages relevant to the operation of the
system (that are not dependencies of your project), `uv sync` will break
it. I'd only use this in Docker or CI, if anywhere. Virtual environments
have lots of benefits, and it's only [one line to "activate"
them](https://docs.astral.sh/uv/guides/integration/docker/#using-the-environment).

If you are considering using this feature to use Docker bind mounts for
developing in containers, I would highly recommend reading our [Docker
container development
documentation](https://docs.astral.sh/uv/guides/integration/docker/#developing-in-a-container)
first. If the solutions there do not work for you, please open an issue
describing your use-case and why.

We do not read `VIRTUAL_ENV` and do not have plans to at this time.
Reading `VIRTUAL_ENV` is high-risk, because users can easily leave an
environment active and use the uv project interface today. Reading
`VIRTUAL_ENV` would be a breaking change. Additionally, uv is
intentionally moving away from the concept of "active environments" and
I don't think syncing to an "active" environment is the right behavior
while managing projects. I plan to add a warning if `VIRTUAL_ENV` is
set, to avoid confusion in this area (see
#6864).

This does not directly enable centrally managed virtual environments. If
you set `UV_PROJECT_ENVIRONMENT` to an absolute path and use it across
multiple projects, they will clobber each other's environments. However,
you could use this with something like `direnv` to achieve "centrally
managed" environments. I intend to build a prototype of this eventually.
See #1495 for more details on this use-case.

Lots of discussion about this feature in:

- astral-sh/rye#371
- astral-sh/rye#1222
- astral-sh/rye#1211
- #5229
- #6669
- #6612

Follow-ups:

- #6835 
- #6864
- Document this in the project concept documentation (can probably
re-use some of this post)

Closes #6669
Closes #5229
Closes #6612
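
The resolution rule stated in the commit message above (empty is ignored, relative paths resolve against the workspace root, absolute paths are used as-is) can be sketched as follows. This is not uv's implementation: the function names are hypothetical, and the `.venv` fallback for the unset case is an assumption based on the location the thread treats as the default. The second helper shows one way to derive a distinct path per project, since the message warns that sharing one absolute path across projects makes them clobber each other.

```python
import os
from pathlib import Path
from typing import Dict, Mapping


def resolve_project_environment(
    workspace_root: Path, environ: Mapping[str, str] = os.environ
) -> Path:
    """Mirror the described handling of UV_PROJECT_ENVIRONMENT (illustrative only)."""
    value = environ.get("UV_PROJECT_ENVIRONMENT", "")
    if not value:
        # Empty or unset: ignored. Assumption: fall back to .venv in the workspace root.
        return workspace_root / ".venv"
    path = Path(value)
    # Absolute paths are used as-is; relative ones resolve against the workspace root.
    return path if path.is_absolute() else workspace_root / path


def central_environment(workspace_root: Path, depot: Path) -> Dict[str, str]:
    """Variables a direnv-style hook could export to give each project its own env."""
    # Assumption: the project directory name is unique enough to key the environment.
    return {"UV_PROJECT_ENVIRONMENT": str(depot / workspace_root.name)}
```

A per-directory hook could export the result of `central_environment` before uv is invoked in that project.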