eRFC: Cargo build system integration #2136

Merged 12 commits on Feb 1, 2018

Member

@aturon aturon commented Sep 1, 2017

This experimental RFC lays out a high-level plan for improving Cargo's ability to integrate with other build systems and environments. As an experimental RFC, it opens the door to landing unstable features in Cargo to try out ideas, but not to stabilizing those features, which will require follow-up RFCs. It proposes a variety of features which, in total, permit a wide spectrum of integration cases -- from customizing a single aspect of Cargo to letting an external build system run almost the entire show.

Rendered

@aturon aturon added the T-cargo label (Relevant to the Cargo team, which will review and decide on the RFC.) Sep 1, 2017
@aturon aturon self-assigned this Sep 1, 2017
@aturon
Member Author

aturon commented Sep 1, 2017

cc @rust-lang/cargo @jsgf @Sid0 @acmcarther @luser @rillian

@aturon aturon changed the title from "RFC: Cargo build system integration" to "eRFC: Cargo build system integration" Sep 1, 2017
handling for native dependencies, and so on. Addressing these concerns well
means adding new points of extensibility or control to Cargo.

- **Homogenous build systems** like [Bazel], where there is a single prevailing
Member

homogeneous

Member

@chris-morgan chris-morgan Sep 2, 2017

That citation indicates that homogeneous is still more popular (“and around a third of citations for the word now use the form homogenous”). Homogenous was an error that has only gained any legitimacy because so many people make it! (As I like to say: language is a popularity contest.)

added point of extensibility should ease build system integration for another
round of customers.

- **For the homoegenous build system case**, we will immediately pursue
Member

homogeneous

into a larger build system. This finer division is left as a question for
experimentation.

## Specifics for the homogenous build system case
Member

homogeneous etc...


Reliably building native dependencies in a cross-platform way
is... challenging. Today, Rust offers some help with this through crates like
[`gcc`] and `[pkgconfig]`, which provide building blocks for writing build
Member

This and the one below should be [`pkgconfig`] so that the link behaves :)

@sunshowers sunshowers left a comment

Thanks for writing this! Here's some really early feedback after a quick read.

mirror thereof), but organizations can choose whether to manage their own crates
through a custom registry (more on that below) or some other means.

### Using crates managed by a crate registry

By homogeneous build systems I assume you mean Bazel, Buck etc. Is this particular case something that an organization has expressed interest in? I'm not really sure how this is going to work with the build hermeticity/reproducibility constraints that Buck has.

For context, at Facebook we use the same system to manage dependencies from crates.io as we do to manage external C, C++ or Python dependencies. So our current workflow fits the "unmanaged crates" case somewhat better.

This is somewhat the story for Bazel as well.

It's also very noticeable, when working with other languages whose dependencies are typically managed, that these build systems really want you to vendor your dependencies. Cargo, like mvn or ivy for the JVM, tries to push you in the opposite direction, with remote but managed dependencies.

I don't really think one is better than the other but they do conflict in assumptions and philosophy.

for guidance during the planning stage.

When *developing* a crate, it should be possible to invoke Cargo commands as
usual. We do this via a plugin. When invoking, for example, `cargo build`, the

Would it be possible to invoke the native build system directly instead of using cargo build?


I'm not sure if this is too low-level for this RFC, but it might be worth talking a bit about where build artifacts go. With Buck, build artifacts always go into buck-out in the root of the monorepo. This is advantageous for a few reasons:

  • it avoids polluting random subdirectories
  • it saves on Watchman file monitoring resources

Would tools like the RLS know to read build artifacts from the buck-out directory instead of from target/rls?

example, altering the way they build the native dependency, or the version they
use -- there's no clear heads-up that something may need to be adjusted within
the external build system. It might be possible, however, to use
version-specific whitelisting to side-step this issue.

I think this is my biggest concern here. I don't think changes in the way native deps are built currently cause a semver bump. This change would seemingly propel build.rs into an API, so should any substantive changes here cause a semver bump?

If build.rs builds the native dependency with a set of features today (./configure --with-something), will it be the job of the native build system to ensure that its own copy of that dependency has those features turned on?

One important concern is: how do you depend on code from other languages, which
is being managed by the external build system? That's a narrow version of a more
general question around *native dependencies*, which will be addressed
separately in a later section.

Hmm, I'm not sure the later section really addresses how homogeneous build systems will tend to use native dependencies. That section seems to be restricted to how packages on crates.io specify native dependencies. That's an important part of the story, of course. But ideally, authors of Rust crates within Facebook's monorepo would not get Cargo involved in their native dependencies at all. They would use standard Buck rules to specify their native dependencies, and Buck would build those deps and provide libraries that Rust could link against.

Member

Hmm... That's very similar to what I've been doing for OS projects to link the kernel with external assembly code, except the other way around. I have found it easier to compile my Rust code into a .a, compile my other stuff into .o files, and pass them all to ld.

The annoying part of this approach is that if you want to use any other Rust libraries (e.g. liballoc or libcollections) you also need to obtain them and pass them to either rustc or ld... It gets to be a bit messy if you're not careful.

Member Author

I've updated this section to try to make things more clear. The setup is really very simple: Cargo/rustc don't need to know much at all about these native deps, other than their location. The build system manages the rest.
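
As a rough illustration of "Cargo/rustc only need the location", here is a minimal sketch of a helper a build script could use when the native library has already been built by the external build system. The `cargo:rustc-link-search` and `cargo:rustc-link-lib` directives are existing Cargo build-script output; the function name, the example path, and the library name are assumptions made for illustration and do not come from the eRFC.

```rust
/// Build the lines a build script would print so that rustc can locate and
/// link a prebuilt static library.
///
/// ASSUMPTION: `dir` is wherever the external build system placed its output;
/// how that path reaches the build script (environment variable, generated
/// file, ...) is not specified by the eRFC and is left open here.
fn link_directives(dir: &str, lib: &str) -> Vec<String> {
    vec![
        // Tell rustc where to search for native libraries.
        format!("cargo:rustc-link-search=native={}", dir),
        // Tell rustc to link the named static library found there.
        format!("cargo:rustc-link-lib=static={}", lib),
    ]
}
```

A `build.rs` `main` would print each returned line to stdout. Nothing in the sketch builds the native code itself; that stays with the external build system.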

| Build lowering | A build plan: a series of steps that must be run in sequence, including rustc and binary invocations | Build scripts, plugins |
| Build execution | Compiled artifacts | Caching |

The first stage, dependency resolution, is the most complex; it's where our

Any chance you could avoid the use of first-person pronouns here? "We" and "our" are a little confusing.

"it's where Cargo's model of semver comes into play" etc seem clearer.


- **For the homoegenous build system case**, we will immediately pursue
extensibility points that will enable the external build system to perform
many of the tasks that Cargo does today--but while still meeting our

What sort of timeline are you looking at for "immediately pursu[ing] extensibility points"?


The first two steps -- dependency resolution and build configuration -- need to
operate on an entire dependency graph at once. Build lowering, by contrast, can
be performed for any crate in isolation.

This could be turned into a column of the table for clarity. "Scope" -> "Entire dependency graph"/"Individual crate"
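
To make the suggested "Scope" column concrete, here is a small sketch that encodes what the quoted paragraph states. The type and function names are invented for illustration. The scope of build execution is not given in the quoted text, so it is deliberately left unspecified.

```rust
/// The four conceptual stages named in the eRFC text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Stage {
    DependencyResolution,
    BuildConfiguration,
    BuildLowering,
    BuildExecution,
}

/// The two values proposed for the "Scope" column.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Scope {
    EntireDependencyGraph,
    IndividualCrate,
}

/// Scope of each stage, as far as the quoted paragraph states it.
fn scope(stage: Stage) -> Option<Scope> {
    match stage {
        // "The first two steps ... need to operate on an entire dependency graph at once."
        Stage::DependencyResolution | Stage::BuildConfiguration => {
            Some(Scope::EntireDependencyGraph)
        }
        // "Build lowering, by contrast, can be performed for any crate in isolation."
        Stage::BuildLowering => Some(Scope::IndividualCrate),
        // Not stated in the quoted paragraph; left unspecified rather than guessed.
        Stage::BuildExecution => None,
    }
}
```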

kinds of use-cases (or "customers") involved here:

- **Mixed build systems**, where building already involves a variety of
language- or proeject-specific build systems. For this use case, the desire is

Typo here. Proeject should be project


### Using "unmanaged" crates

In some cases, an organization may want to employ a their own strategy for
Member

a their

@aturon
Member Author

aturon commented Sep 2, 2017

Oy, also cc @joshtriplett of course!

@aturon
Member Author

aturon commented Sep 2, 2017

@Sid0 I've just pushed significant clarifications for the homogeneous case and for native dependencies. Please let me know if the result aligns with your understanding.

@jpakkane

jpakkane commented Sep 2, 2017

A while ago I wrote a bit on dependencies and related things in general.

@joshtriplett
Member

As written, this looks very good to me. In particular, I like that you've provided a detailed look at the problem, and the collected discussions and information we've obtained, while holding off on providing concrete solutions.

There are a couple of aspects I'd appreciate some revisions on, in being both a little less specific about solutions and a little more specific about problems.

I'm not quite comfortable yet with us emphasizing "build plans" or "Cargo plugins" as possible solutions. I do think it's important that we emphasize the idea of letting Cargo provide "builds as a library" that can be woven together as part of a larger build system, but I'd really like to see that explained in a way that takes the "midlayer mistake" into account, and especially, I think describing it as "Cargo plugins" implies "fit your plugin into Cargo" rather than "use Cargo's steps as a library and override some of them". Would you consider generalizing this a little bit, and discussing it a bit more, focusing on the problem and the properties of a correct solution, rather than any specific solution? (Please feel free to take some text from the last paragraph of my pre-RFC if that helps.)

This paragraph from the eRFC seems very promising to me:

Note that the four-step division above is a useful conceptual model, but is ultimately too coarse-grained; we will likely need to divide up Cargo's functionality into numerous small pieces that can be re-used when integrating into a larger build system. This finer division is left as a question for experimentation.

I'd just like to see other parts of the eRFC match that more in spirit.

On the flip side, there are also certain critical properties we'd like a Cargo-based build system to provide, and there's an important aspect here that Cargo (and the crates.io ecosystem) exists to provide policy, not just mechanism. I think it's worth emphasizing, in the eRFC, that Cargo is intentionally a somewhat opinionated build system, precisely because it's solving a very difficult problem that looks easier than it is and is hard to get right in all the details. And part of the goal here is to provide the flexibility to integrate Cargo with other build systems, distributions, and external dependencies, while continuing to provide policy as appropriate, not abandoning that and just serving up mechanism alone.

Do those points seem like reasonable additions, to you?

@vorner

vorner commented Sep 2, 2017

I wonder if using Rust completely without Cargo is even considered as a possibility. Here, the RFC doesn't seem to think so.

I guess Cargo does a lot of useful work, but maybe part of that work is somewhat „artificial“. If I build a C application, I build just my sources, and a build system that compiles each .c file into a .o file is reasonably straightforward. If I have a dependency, I don't build it, but expect it to already live in my system. This approach doesn't seem to be possible right now (e.g. a lot of code I write depends on serde… it would make sense to have an already compiled serde somewhere in the system, as a dynamic library or something, and reuse it).

Anyway, that's just kind of brainstorming and I'm not sure this is completely related to the point of the RFC, so feel free to ignore. In general, I'm in favour of this RFC ‒ it at least states the problems and some ideas, even though I'm not 100% sure what the next actionable steps would be after accepting it.

@withoutboats
Contributor

@vorner this RFC is specifically a cargo team RFC to make it easier to integrate cargo into larger build systems. Invoking rustc without cargo is out of scope for this RFC.


### Using crates managed by the build system

Many organization want to employ a their own strategy for maintaining and
Contributor

nit: "employ a their own strategy"

@matklad
Member

matklad commented Sep 16, 2017

@aturon for the other two concerns, which boil down to "it's not obvious that people would want to use Cargo for internal crates at all": I think we should explicitly mention "imposing less control" for internal crates in a homogeneous build system (that is, removing Cargo from the equation completely), and explain why this route won't be able to fulfill our goals.

@aturon
Member Author

aturon commented Sep 19, 2017

@matklad I've pushed an update that, I think, should address your concerns. In particular, it commits the Cargo team to working with the Dev Tools team to fully enumerate the needs of tools, and to explore the lightest weight way of providing that information (which may just be something that a build tool generates, rather than something that Cargo needs to mediate).

@matklad
Member

matklad commented Sep 20, 2017

Thanks @aturon !

@rfcbot resolved cargo_for_internal_crates

@matklad
Member

matklad commented Sep 20, 2017

@rfcbot resolved bazel_workflows

@matklad
Member

matklad commented Sep 20, 2017

@rfcbot reviewd

@matklad
Member

matklad commented Sep 20, 2017

@rfcbot reviewed

:)

@rfcbot
Collaborator

rfcbot commented Sep 20, 2017

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot added the final-comment-period label (Will be merged/postponed/closed in ~10 calendar days unless new substantial objections are raised.) and removed the proposed-final-comment-period label (Currently awaiting signoff of all team members in order to enter the final comment period.) Sep 20, 2017
@joshtriplett
Member

joshtriplett commented Sep 22, 2017

@aturon

rfcbot doesn't know that @joshtriplett has joined the Cargo team, but we'll block on his review as well :-)

Already discussed in the Cargo call, but:

👍

@sunshowers

sunshowers commented Sep 22, 2017

I reread this with all your changes, and I have nothing substantive to add. One change I really like is the possibility to just use an interchange format. I think that has the potential to, if not eliminate plugins completely, at least substantially simplify them.

Thanks!

@aturon
Member Author

aturon commented Sep 26, 2017

I've started up a thread to try to get a firmer grasp on the needs of Rust tools. Please join in!

@rfcbot
Collaborator

rfcbot commented Sep 30, 2017

The final comment period is now complete.

@aturon
Member Author

aturon commented Feb 1, 2018

Ooops, this somehow didn't get merged before. Doing so now!

I didn't open a tracking issue on the Rust repo, since this is for Cargo. I'm not sure yet how we'll want to organize/track work there, but I'll try to post back on this thread once there's something to look at.

@aturon aturon merged commit 96dce82 into rust-lang:master Feb 1, 2018
@jjpe

jjpe commented Feb 7, 2018

Could someone update the rendered link in the OP to a working URL? Unfortunately it's broken ATM...

@carols10cents
Member

Updated the rendered link to link to rust-lang/rfcs instead of aturon/rfcs. I wish this was automated!

@matklad
Member

matklad commented Apr 9, 2018

cc rust-lang/cargo#5332
