Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: try making cargo tree's Graph from a UnitGraph #11379

Closed
wants to merge 34 commits into from

Conversation

alsuren
Copy link

@alsuren alsuren commented Nov 15, 2022

What does this PR try to resolve?

This is an attempt at implementing step 1 from #9599 (comment) - a function which converts from a UnitGraph to a cargo::ops::tree::graph::Graph

This would make it easier to find edge cases where cargo tree disagrees with cargo build.

How should we test and review this PR?

TODO (this PR is currently WIP)

Demonstrate how you test this change and guide reviewers through your PR.
With a smooth review process, a pull request usually gets reviewed quicker.

If you don't know how to write and run your tests, please read the guide:
https://doc.crates.io/contrib/tests

Additional information

  • minimal test
    • think a bit harder about how/whether to expose this code (currently hidden behind a CARGO_TREE_FROM_UNIT_GRAPH environment variable for simplicity)
  • actually implement Graph::from_bcx() (to create a Graph from a UnitGraph)
  • run everything in the current test suite using Graph::from_bcx() rather than graph::build() and fix any discrepancies
    • think about how cargo::core::compiler::unit_dependencies::State::deps() should handle filtering for --target=all (it seems like tree has a copy-pasta version of it. It might be possible to reconcile the two, but it feels a bit meaningless)
  • get feedback on whether the team likes this approach (if not then that's fine - I need to build something like this for cargo quickbuild anyway, so it won't be wasted effort).

Failing tests to chip away at:

  • features2::minimal_download --target=all
  • features2::tree_all --target=all
  • features::same_name
  • features_namespaced::tree - new code fails to print feat1, but cargo build --features=a actually builds bar with features ["feat1", "feat2"] (if I'm reading the trace output correctly), so I think the old code and test are also wrong, but in a different way. hrm. Nevermind. I changed how I was linking my features up and the test now passes. Maybe I will revisit this later to make sure I was misreading the cargo build trace output.
  • features_namespaced::tree_no_implicit
  • tree::cyclic_features
  • tree::dep_kinds
  • tree::depth_limit
  • tree::dev_dep_cycle_with_feature
  • tree::dev_dep_cycle_with_feature_nested
  • tree::dev_dep_feature
  • tree::duplicates
  • tree::features
  • tree::filters_target (HACK: hacked part of it to separate host and target versions of foo, and disabled the --target=all bits for now)
  • tree::format
  • tree::host_dep_feature
    • we're adding a second bar as a normal dep (with optdep in there)
    • build-dependencies also has two versions of bar. Not sure why (the cause will probably be similar for both).
  • tree::invert_with_build_dep
  • tree::proc_macro_features - This is duplicating somedep (once as "host" and once as "target" maybe? I need to add some debugging to make sure). One of them also has the optdep dependency. I think this might be a sign that I've successfully fixed cargo tree doesn't handle transitive duplications properly #9599 ;-)
  • tree::prune
  • tree::virtual_workspace
  • tree::workspace_features_are_local
  • tree_graph_features::dep_feature_various - this is a big mess. I will dig into this last.
  • tree_graph_features::features_enables_inactive_target - --target=all
  • tree_graph_features::graph_features_ws_interdependent - This is completely missing the feat2 dependency (dependencies.b = {path="../b", features=["feat2"]} so this is pretty glaring and should be easy to spot/debug. Done
  • tree_graph_features::slash_feature_name
  • weak_dep_features::tree - panic in graph.dep_name_map[&package_index] in add_cli_features(). I'm probably failing to uphold an invariant somewhere.
  • weak_dep_features::weak_namespaced

@rustbot
Copy link
Collaborator

rustbot commented Nov 15, 2022

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @weihanglo (or someone else) soon.

Please see the contribution instructions for more information.

@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 15, 2022
@alsuren
Copy link
Author

alsuren commented Nov 15, 2022

Progress report for today:

my simple_from_unit_graph test now passes (so top-level build-deps and dev-deps are now supported)

If I apply this patch:

diff --git a/crates/cargo-test-support/src/lib.rs b/crates/cargo-test-support/src/lib.rs
index 4ce0162bf..513d4f4f8 100644
--- a/crates/cargo-test-support/src/lib.rs
+++ b/crates/cargo-test-support/src/lib.rs
@@ -414,6 +414,7 @@ impl Project {
         let mut execs = self.process(&cargo);
         if let Some(ref mut p) = execs.process_builder {
             p.env("CARGO", cargo);
+            p.env("CARGO_TREE_FROM_UNIT_GRAPH", "1");
             p.arg_line(cmd);
         }
         execs

then it causes failures from the 25 tests

[edit: copied to the top comment to give a better indication of progress]

Something to work on tomorrow 😁 .

@alsuren
Copy link
Author

alsuren commented Nov 20, 2022

Progress report for the weekend:

Apart from the tests that pass --target=all, I only have about 3 failing tests. I can probably chip away at these, but I'm not sure what to do about --target=all.

Questions:
a) Do you think that it's worth trying to teach cargo::core::compiler::unit_dependencies about --target=all or is it possible to deprecate it?
b) cargo tree -p $package seems to mean something slightly different from cargo build -p $package (cargo tree seems to use the whole workspace for calculating the dep tree, and then print a filtered tree by using $package as the root). Do I have this right, or do I just have bugs in my code/understanding? I'm tempted to make a cargo build --stop-after-package $package that does the same thing as cargo tree -p $package. Does this seem like a reasonable idea?
c) at the moment, I'm only adding complexity (because the existing code is useful as an oracle), but in theory this approach would let us remove a bunch of complexity by removing the existing Graph-building machinery. Is this a thing we want to do? Should I have a stab at it, and see what my diffstat looks like with the old code removed?
d) is this PR fleshed-out enough to determine whether we even want to go down this route? Is there anything else that would make it easier to make that decision?

I doubt that this PR will be in a reasonable state to merge before I get swallowed up by my new job, but it will hopefully serve as a reference of what is possible and what is easier/harder if you go down this route. I will therefore close this PR on Tuesday, whatever state it's in. I will leave the branch around in case someone wants to use it as inspiration.

@weihanglo
Copy link
Member

Haven't really looked deeply into your change, but I do appreciate what you've done. The amount of code added so far are less than I expected. It is indeed a great exploration. Thank you!

a) Do you think that it's worth trying to teach cargo::core::compiler::unit_dependencies about --target=all or is it possible to deprecate it?

I don't think so, at least for the latter. They have different meanings. We already have ForceAllTargets to disable target platform filters. Maybe we can look into that without too many churns on the other parts of code?

b) cargo tree -p $package seems to mean something slightly different from cargo build -p $package (cargo tree seems to use the whole workspace for calculating the dep tree, and then print a filtered tree by using $package as the root). Do I have this right, or do I just have bugs in my code/understanding? I'm tempted to make a cargo build --stop-after-package $package that does the same thing as cargo tree -p $package. Does this seem like a reasonable idea?

Package filtering process happens here, so ws_resolve should resolve what is passed incargo tree -p. Did you find any abnormal behaviour? For --stop-after-package, I am not sure what it is used for, though usually Cargo team takes it serious when adding a new flag.

c) at the moment, I'm only adding complexity (because the existing code is useful as an oracle), but in theory this approach would let us remove a bunch of complexity by removing the existing Graph-building machinery. Is this a thing we want to do? Should I have a stab at it, and see what my diffstat looks like with the old code removed?

Go ahead, or put it into another module, so we can have a clear view of the diff :)

d) is this PR fleshed-out enough to determine whether we even want to go down this route? Is there anything else that would make it easier to make that decision?

Even the final result reduces the complexity, I'll still ask about the extensibility, such as handling an issue like #10593 (comment). I think ehuss has some concerns on sharing code, so having ehuss taking a look is better once it is in a relatively good shape.

@alsuren
Copy link
Author

alsuren commented Nov 21, 2022

Haven't really looked deeply into your change, but I do appreciate what you've done. The amount of code added so far are less than I expected. It is indeed a great exploration. Thank you!

a) Do you think that it's worth trying to teach cargo::core::compiler::unit_dependencies about --target=all or is it possible to deprecate it?

I don't think so, at least for the latter. They have different meanings. We already have ForceAllTargets to disable target platform filters. Maybe we can look into that without too many churns on the other parts of code?

Ah. I think I might have misunderstood that part of the code. For some reason, I thought ForceAllTargets was about the --all-targets flag (equivalent to specifying --lib --bins --tests --benches --examples). Thanks. I'll have a go at using that machinery.

b) cargo tree -p $package seems to mean something slightly different from cargo build -p $package (cargo tree seems to use the whole workspace for calculating the dep tree, and then print a filtered tree by using $package as the root). Do I have this right, or do I just have bugs in my code/understanding? I'm tempted to make a cargo build --stop-after-package $package that does the same thing as cargo tree -p $package. Does this seem like a reasonable idea?

Package filtering process happens here, so ws_resolve should resolve what is passed incargo tree -p. Did you find any abnormal behaviour? For --stop-after-package, I am not sure what it is used for, though usually Cargo team takes it serious when adding a new flag.

I think this may be due to bugs in my understanding again. I was under the impression that it was possible to ask cargo tree to "show me how cargo would build this workspace, but only show me things below $package" in the tree.

  • dig into the the tests that I chalked up to this discrepancy and see if I can resolve them.

c) at the moment, I'm only adding complexity (because the existing code is useful as an oracle), but in theory this approach would let us remove a bunch of complexity by removing the existing Graph-building machinery. Is this a thing we want to do? Should I have a stab at it, and see what my diffstat looks like with the old code removed?

Go ahead, or put it into another module, so we can have a clear view of the diff :)

I will do the "another module" thing first, and then do "deleting the old code" as a separate PR against my own fork if I get time.

d) is this PR fleshed-out enough to determine whether we even want to go down this route? Is there anything else that would make it easier to make that decision?

Even the final result reduces the complexity, I'll still ask about the extensibility, such as handling an issue like #10593 (comment). I think ehuss has some concerns on sharing code, so having ehuss taking a look is better once it is in a relatively good shape.

I suppose this is a philosophical thing: is cargo tree a teaching/exploring tool (in which case, clarity and expressiveness might be more important) or is it a debugging tool (in which case correctness might be more important)?

Fundamentally, cargo tree does not work on the Unit level, and there will always be discrepancies between it and cargo build. This is something that I didn't appreciate when I started this project. It will certainly help when I next have time to work on cargo quickbuild.

Comment on lines +153 to +154
// FIXME: we're creating this here, but build_ctx() is also creating one.
// Can we refactor things so that they are shared?
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// FIXME: we're creating this here, but build_ctx() is also creating one.
// Can we refactor things so that they are shared?
// FIXME: we're creating this here, but create_bcx() is also creating one.
// Can we refactor create_bcx() into multiple functions so that they can be shared?

@alsuren
Copy link
Author

alsuren commented Nov 22, 2022

I've reached the end of my timebox for this.

I briefly tried propagating ForceAllTargets, but I ran out of time before I got any --target=all tests passing. I also didn't get time to do any cleaning up, so it's still a mess. Sorry.

I still think that this is a promising idea.

Advantages:

  • The Graph building code in graph::from_bcx() does not need to be recursive anymore. This means that it is easier to follow, and doesn't have to worry about infinite loops.
    • It does a pass to add all of the Package nodes, then a pass to add all of the links (which may add Feature nodes if feature edges are enabled)
    • The pass to add cli features at the end is currently still the old recursive add_internal_features() code, but it might be possible to flatten that out too (not sure).
  • We now match the behaviour of UnitGraph - build-dependencies are ignored if there is no build.rs unit.

As an aside, I wonder if we could use a similar approach to support displaying artifact dependencies and build-scripts. You would want a -e units or something, and a Node::Unit variant, a bit like how Node::Feature works (I have not thought about how it would interact with -e features). Going via UnitGraph for that feature would make things a lot easier, so if you are already going via UnitGraph for everything else, you might have an advantage.

Risks:

  • I didn't have time to look into the --target=all tests, and a few other things, so there are 6 failing tests. Any one of them might be difficult enough to solve that it kills this idea dead.
  • we still rely on calling out to the resolver to work out why each of the UnitDeps are there, because they don't contain enough information on their own. This could be considered to be a bad thing, but also may be considered to be a good thing, if we need extension points.
  • I'm still using a few of the old builder functions, like add_internal_features(), and they might be correcting for sloppiness in how I'm building the tree. It might be valuable to rewrite them in terms of UnitGraph to make sure we're not missing anything.
  • Please don't let my prototype-quality code convince you that it's a bad idea: it is my first attempt at hacking on cargo internals, so it's a bit of a mess.

Thanks for all of your support in this exploration. Hopefully it contains some useful ideas. Even if it doesn't, I have learned a lot.

Thanks again.

David.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cargo tree doesn't handle transitive duplications properly
3 participants