Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Also emit source information for paths #8988

Closed
wants to merge 2 commits into from

Conversation

jonhoo
Copy link
Contributor

@jonhoo jonhoo commented Dec 17, 2020

Previously, the location for path-based (i.e., local) packages was
redacted from metadata output. This was done way back when so that
Cargo.lock would not include path sources:
#7483 (comment)

This PR makes this redaction optional so that cargo metadata and
friends will output the source location, which is useful for tools
like rustfmt which need to know the paths to local packages (see #7483
and rust-lang/rustfmt#4599).

Interestingly enough, no tests fail even though
disable_path_serialization is never currently called. I don't know if
that's because there's no test that cover where redaction is needed, or
because redaction is simply no longer needed (presumably because
lockfile generation now uses custom logic). If it is the latter, this PR
can be simplified to not support opting into redaction.

Fixes #7483

Previously, the location for path-based (i.e., local) packages was
redacted from metadata output. This was done way back when so that
`Cargo.lock` would not include path sources:
rust-lang#7483 (comment)

This PR makes this redaction optional so that `cargo metadata` and
friends _will_ output the source location, which is useful for tools
like `rustfmt` which need to know the paths to local packages (see rust-lang#7483
and rust-lang/rustfmt#4599).

Interestingly enough, no tests fail even though
`disable_path_serialization` is never currently called. I don't know if
that's because there's no test that cover where redaction is needed, or
because redaction is simply no longer needed (presumably because
lockfile generation now uses custom logic). If it is the latter, this PR
can be simplified to not support opting into redaction.

Fixes rust-lang#7483
@rust-highfive
Copy link

r? @alexcrichton

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Dec 17, 2020
@jonhoo
Copy link
Contributor Author

jonhoo commented Dec 17, 2020

The failing test is due to disable_path_serialization being unused. The tests otherwise pass. I'll push with an #[allow(dead_code)] attribute so we can tests passing for now.

@ehuss
Copy link
Contributor

ehuss commented Dec 17, 2020

I don't think this is something that can be changed, because tools will be depending on that value being null.

Also, for rustfmt's use case, it needs the path to the package. The SourceId is intended to be opaque, and that shouldn't be used for determining the path. I was roughly thinking of adding a new field for that path information. EDIT (Or maybe manifest_path is enough, I don't remember.)

As for fixing rustfmt's specific use case, there's also a problem where --no-deps doesn't include path dependencies if it is not an explicit workspace. I seem to recall working on that, but I can't find it.

@jonhoo
Copy link
Contributor Author

jonhoo commented Dec 17, 2020

Hmm, interesting. Wouldn't adding a field be just as breaking? In what cases would consumers specifically expect the field to be null for paths specifically?

The problem with using manifest_path is, as @calebcartwright outlines in #7483 (comment), that manifest_path doesn't exist on dependency nodes, which is where rustfmt needs it I think. And presumably adding it there would also be a breaking change. I agree with you though that source should probably be considered opaque, so not sure what the best path forward here is.

I'd be happy to write up a PR that includes path dependencies even if --no-deps is given, but that also seems like it might break some downstream consumers? I suppose we could go to a --format-version=2?

@calebcartwright
Copy link
Member

Also, for rustfmt's use case, it needs the path to the package

@ehuss - we need to know where on the file system to locate the path-based dependencies, which are indeed typically local implicit workspace members, but can also be some random other package on the file system (I've no idea what that use case would be but we technically support that today)

@ehuss
Copy link
Contributor

ehuss commented Dec 17, 2020

Wouldn't adding a field be just as breaking?

New fields are allowed, as they are usually ignored by things like serde. I thought that was documented, but I can only see it mentioned in the rustc docs.

In what cases would consumers specifically expect the field to be null for paths specifically?

Whenever it wants to collect all local packages (like this).

The problem with using manifest_path is

Yea, it's a bit of a thorny issue. I was working on rust-lang/rustfmt#4247, but never really finished. The idea is that cargo fmt should use --no-deps, and Cargo should provide the path dependencies in the packages array. There is some risk that could confuse existing tools. That's really tough to evaluate how risky. That might be why I didn't finish, I recall this being a little tricky.

@jonhoo
Copy link
Contributor Author

jonhoo commented Dec 17, 2020

@ehuss Hmm, yeah, that's tricky. If new fields are indeed ignored and we're willing to add one, how about just adding a manifest-path each dependency node then? I'm more inclined to go that way than include local path dependencies in packages since the latter seems like it is more likely to have an adverse impact on other tooling. I have some spare cycles and would be happy to push forward on a PR here if you think that's a reasonable path.

@ehuss
Copy link
Contributor

ehuss commented Dec 17, 2020

how about just adding a manifest-path each dependency node then?

I don't think that would address the cargo fmt use case, as the intent is to switch to --no-deps (for performance), and with --no-deps the resolve is null.

@jonhoo
Copy link
Contributor Author

jonhoo commented Dec 17, 2020

Hmm, wouldn't it be sufficient for rustfmt to look at packages[].dependencies[] if those contained a manifest_path if they are local paths? Otherwise I'm guessing that I'm missing something.

@calebcartwright
Copy link
Member

wouldn't it be sufficient for rustfmt to look at packages[].dependencies[] if those contained a manifest_path if they are local paths

if the dependency nodes (crucially with --no-deps) contain something we can use to find the local manifest path then yes, that'd solve our need

@alexcrichton
Copy link
Member

I fear I may be missing some context on this, as I don't understand why we'd need more paths available than we already have with each package's manifest path listed. Can you elaborate a bit more where the current output falls short?

(e.g. if this is a rustfmt thing, can all packages be iterated over with --no-deps and their Cargo.toml is found via manifest_path?)

@calebcartwright
Copy link
Member

(e.g. if this is a rustfmt thing, can all packages be iterated over with --no-deps and their Cargo.toml is found via manifest_path?)

We already do this today for the packages, but #7483 is about our need to be able to locate the manifest file for dependencies of those packages. Implicit workspace members are not included in the packages collection, and with --no-deps the available information on the dependency nodes is stripped, leaving rustfmt with no way to correctly and consistently locate the manifests for those implicit workspace members.

That forces us to eschew the --no-deps flags which is causing various other downsides and challenges that are unnecessary for the purposes of formatting code

@alexcrichton
Copy link
Member

Ah ok, that makes sense. I don't think that "just add the information to dependencies" is what you want to fix this, though, because you presumably want all transtive path dependencies of a crate, not just the first-level path dependencies outside of the workspace, right?

I don't think that Cargo has any equivalent to that right now, you almost want cargo metadata --just-the-path-dependencies which doesn't currently exist.

@calebcartwright
Copy link
Member

because you presumably want all transtive path dependencies of a crate, not just the first-level path dependencies outside of the workspace, right?

That's correct and currently we do that transitive walking within cargo fmt once we get the paths for the direct deps from the metadata

you almost want cargo metadata --just-the-path-dependencies which doesn't currently exist.

That would be fantastic and allow us to simplify things a bit on the rustfmt side, but even being able to get the paths for the direct dependencies without having to require network connectivity, index/performance hit, etc. would be a huge win

@alexcrichton
Copy link
Member

Ok mostly just wanted to confirm. This means that "just add info to dependencies" is not going to work for rustfmt's use case as-is I think, since it will require more invasive changes in Cargo. Unfortunately though even implementing something like --just-the-path-dependencies in Cargo would be difficult to do. We don't actually know whether path dependencies in [patch] will be used unless we run the resolver, which naturally relies on everything being online.

Would it be possible for rustfmt to run cargo metadata --no-deps multiple times? That means if we include more information about dependencies rustfmt could follow the dependency links outside of the workspace itself. This would not handle path dependencies in [patch] correctly, however.

@calebcartwright
Copy link
Member

Would it be possible for rustfmt to run cargo metadata --no-deps multiple times?

We already do this today (albeit without --no-deps for the discussed reasons) to walk any and all transitive local dependencies, we just have to run cargo metadata in the corresponding directory for every discovered local dep in that tree

This means that "just add info to dependencies" is not going to work for rustfmt's use case as-is
That means if we include more information about dependencies rustfmt could follow the dependency links outside of the workspace itself.

I'm not following the distinction between these two, could you elaborate?

At the end of the day, the most minimal thing that would solve our need on the rustfmt side is the ability to determine the manifest path for implicit workspace members (even just direct local deps on packages would suffice) from cargo without having to the required network connectivity, downloading dependencies, etc. that happens today.

We're happy to work with whatever makes sense on the cargo side (understand there's some complexity here), but if there was some field foo on the dependency nodes in the cargo metadata --no-deps output that provided the path to the manifest for local dependencies, that would indeed solve our problem.

@jonhoo
Copy link
Contributor Author

jonhoo commented Dec 17, 2020

Just for comparison, I took a stab at a PR for surfacing manifest paths just for direct local dependencies in #8994.

@alexcrichton
Copy link
Member

Oh interesting, @calebcartwright how come you invoke cargo metadata once-per-package? Without specifying --no-deps I thought cargo metadata would basically give you the same information every time you invoke it. (in that it should give you full information about all the dependencies).

I'm not following the distinction between these two, could you elaborate?

At the end of the day, the most minimal thing that would solve our need on the rustfmt side is the ability to determine the manifest path for implicit workspace members (even just direct local deps on packages would suffice) from cargo without having to the required network connectivity, downloading dependencies, etc. that happens today.

What I'm imagining is that rustfmt would always invoke cargo metadata with the --no-deps flag so no network stuff happens. In that scenario when you first run it you'll initially learn of all workspace members and their dependency paths. I believe with #8994 you could then also learn about the path dependencies those workspace members have which are not themselves workspace members. That would allow you to iteratively, for all those non-workspace members, execute cargo metadata --no-deps to discover the full local dependency graph.

Would that work for rustfmt? If so #8994 definitely sounds like a good PR to me! Although it also seems good in isolation :)

@calebcartwright
Copy link
Member

What I'm imagining is that rustfmt would always invoke cargo metadata with the --no-deps flag so no network stuff happens. In that scenario when you first run it you'll initially learn of all workspace members and their dependency paths. I believe with #8994 you could then also learn about the path dependencies those workspace members have which are not themselves workspace members. That would allow you to iteratively, for all those non-workspace members, execute cargo metadata --no-deps to discover the full local dependency graph

Yes, that's precisely what we'd like to be able to, and is essentially what we do today just sans --no-deps

Oh interesting, @calebcartwright how come you invoke cargo metadata once-per-package? Without specifying --no-deps I thought cargo metadata would basically give you the same information every time you invoke it

Does the metadata output contain all the target and manifest path information for all dependencies for all packages? The core data points cargo fmt is grabbing to pass to rustfmt are the entry point files (obtained from the targets) and the edition.

We first run cargo metadata in the cwd, and then if there's local deps at some foo/bar.. path (or ../../../whatever) for any of the (non dependency) packages then we run the metadata commands in those respective local dep directories, recursively as needed for all local deps.

I don't recall that initial metadata output containing all the information we need for that entire tree of local deps but could certainly be mistaken.

@alexcrichton
Copy link
Member

I thought that the metadata information contained all that, but I could also be remembering wrong!

In any case it sounds like we should perhaps close this in favor of #8994?

@jonhoo jonhoo closed this Dec 18, 2020
@jonhoo
Copy link
Contributor Author

jonhoo commented Dec 18, 2020

@alexcrichton I went ahead and closed this -- #8994 seems like a much simpler way to get at this then, seeing as it still solves the need that rustfmt has 🎉

bors added a commit that referenced this pull request Jan 6, 2021
metadata: Supply local path for path dependencies

This is potentially a simpler way to address #7483 than #8988.

/cc rust-lang/rustfmt#4247

Fixes #7483.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Include path/url of local dependencies in cargo metadata --no-deps output
5 participants