Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement instrumentation RFC #3535

Merged
23 commits merged into from
Jul 7, 2020
Merged

Conversation

nojb
Copy link
Collaborator

@nojb nojb commented Jun 7, 2020

This PR implements the instrumentation RFC #3526.

It mostly builds upon what was done in #3403. The main difference is that instrumentation backends are referred to by their runtime library. Individual commits are discussed at the end.

The fact that we refer to instrumentation backends by the name of their runtime library complexifies the implementation considerably, as knowing the precise list of ppx preprocessors (the pps field) must be delayed until libraries can be resolved.

Thinking about this point, and also from the point of view of having simple semantics, I wonder if we shouldn't drop the emphasis on "instrumentation" as a concept, and just think of the feature as a way to turn off ppx rewriters easily on a per-context or per-workspace basis. This is all the implementation does after all.

If we go this way, we could simplify the syntax quite a bit, dropping the instrumentation.backend field (any ppx would be eligible), and merge the instrumentation field with pps field, as in

(executable
 (name foo)
 (preprocess (pps a b (instrumentation c) d)))

(we would use a different name than instrumentation). It would also clears up a bit the semantics, since right now it is not clear how the list of instrumentations are merged with the list of "ordinary" ppx preprocessors.

Additionally, we would no longer refer to the ppx by the name of their runtime library but by their own name, which would simplify the implementation as resolution of libraries is no longer needed. Maybe a bit uglier from a CLI perspective, but semantically clearer in my opinion and simpler.

Detailed description of the commits

There is some back and forth in the code which I would clean up before merging, but I left it as is for a first review as it illustrates some of the issues discussed above.

The main issue, as mentioned above, is that one must wait until resolution of libraries is possible in order to know the precise list of ppx preprocessors.

  • Commit 1, 2 and 3 are routine: adding the instrumentation.backend field to library, adding the instrument_with field to the workspace file, and CLI command-line logic.

  • Commit 4 is the main part of the PR. Currently, the contents of the pps field are stored as a (Loc.t * Lib_name.t) list obtained at parsing time (the Lib_name.t is the name of the ppx preprocessor). However, we can no longer do this in this PR, because we do not know the name of the ppx preprocessor at parsing time, only that of the associated runtime library. So we must keep an intermediate ("unresolved") data structure that will eventually "resolve" to this list of ppx rewriters.

    In this commit, I introduce a new type

    type t = Ordinary of Loc.t * Lib_name.t | Instrumentation of Loc.t * Lib_name.t
    

    where the Ordinary case corresponds to a ppx rewriter as written by the user, and Instrumentation corresponds to one coming from (instrumentation ...) field (the payload is the library name that declares the instrumentation.backend field). This new type needs to be propagated everywhere until the point where the Instrumentation case can be resolved to a "ordinary" ppx name.

  • Commits 4 and 5 extend the above to libraries. There is a dependency issue because one must keep the "unresolved" Dune_file.Preprocessor_map.t in the Lib_info.t created at parsing time so that it can be "resolved" inside Lib. The problem is that Dune_file already depends on Lib_info, so this is not possible. To work around this, I moved the whole Preprocess and Preprocess_map modules of Dune_file to their own module Preprocess.

Just to insist on this point: you can see how the implementation would be much simpler if we just referred to ppx's by their names instead of going through their runtime libraries.

What is left to do

@nojb nojb requested a review from aalekseyev as a code owner June 7, 2020 18:30
src/dune/lib.ml Outdated
let pps =
let resolve =
match db with
| None -> assert false (* FIXME ??? *)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The db argument is None if allow_overlaps is true further below. Not sure what this is supposed to be for, so I didn't handle this case yet.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's just an artifact of using Lib_name.t for comparison.

@ghost
Copy link

ghost commented Jun 9, 2020

To work around this, I moved the whole Preprocess and Preprocess_map modules of Dune_file to their own module Preprocess.

Ok, that seems like a good change anyway, dune_file.ml is getting big...

bin/common.ml Outdated Show resolved Hide resolved
src/dune/preprocess.ml Outdated Show resolved Hide resolved
@ghost
Copy link

ghost commented Jun 9, 2020

Regarding (using bisect_ppx ...), as mentioned on slack let's drop support entirely, the new feature is more general and cover the bisect use case. Is there anything special to do for versioning BTW?

@ghost
Copy link

ghost commented Jun 9, 2020

Answering myself: it's the Syntax.since stuff

@ghost
Copy link

ghost commented Jun 12, 2020

@nojb I'm adding [WIP] in the title, as I understand it's still in progress

@ghost ghost changed the title Implement instrumentation RFC [WIP] Implement instrumentation RFC Jun 12, 2020
@nojb
Copy link
Collaborator Author

nojb commented Jun 12, 2020

@nojb I'm adding [WIP] in the title, as I understand it's still in progress

Yes, thanks. I will give the final touches shortly (I was waiting for confirmation of the approach first, which we did during the last meeting).

@nojb nojb force-pushed the instrumentation_framework branch 4 times, most recently from 4f8d3fb to 1df02eb Compare June 13, 2020 12:12
@nojb nojb requested a review from emillon as a code owner June 13, 2020 12:12
@nojb nojb force-pushed the instrumentation_framework branch from 1df02eb to 75faa82 Compare June 13, 2020 12:14
@nojb
Copy link
Collaborator Author

nojb commented Jun 13, 2020

@nojb I'm adding [WIP] in the title, as I understand it's still in progress

Yes, thanks. I will give the final touches shortly (I was waiting for confirmation of the approach first, which we did during the last meeting).

I have finished all the remaining points (including doc and tests), and the PR is ready for review (I removed [WIP] from the title). It can be read commit by commit. I disabled the bisect_ppx tests because bisect_ppx needs to declare an instrumentation backend in order to be able to use the new infrastructure.

@aantron If you have the time it would be great if you could give it a test drive as well with bisect_ppx.

@nojb nojb changed the title [WIP] Implement instrumentation RFC Implement instrumentation RFC Jun 13, 2020
@nojb nojb force-pushed the instrumentation_framework branch from 75faa82 to 309f9ce Compare June 13, 2020 13:04
@ghost
Copy link

ghost commented Jun 23, 2020

@rgrinberg could you do the code review?

@nojb, just one suggestion regarding the documentation: it seems to me that defining an instrumentation backend is a more "expert" feature. i.e. most users will only be faced with using instrumentation and will never define their own backend. So personally I would put the documentation about defining a backend at the end and start with the more general usage. WDYT?

@nojb
Copy link
Collaborator Author

nojb commented Jun 23, 2020

@nojb, just one suggestion regarding the documentation: it seems to me that defining an instrumentation backend is a more "expert" feature. i.e. most users will only be faced with using instrumentation and will never define their own backend. So personally I would put the documentation about defining a backend at the end and start with the more general usage. WDYT?

Sounds good, I'll amend.

By the way, the CI failure is that as a side-effect of this patch, (pps) gets converted to no-preprocessing. I will see if I can move the check for (pps) forward in order to preserve current behaviour.

@ghost
Copy link

ghost commented Jun 24, 2020

Ack, I was wondering if we should just accept this change of semantic, but not accepting an empty (pps) seems better.

@nojb
Copy link
Collaborator Author

nojb commented Jun 24, 2020

Ack, I was wondering if we should just accept this change of semantic, but not accepting an empty (pps) seems better.

Checking for a syntactically empty (pps) is easy when parsing dune files, but the current check is a bit better than that because if I understand correctly it filters out the flags in the field. I haven't had time to look at it in detail though.

@ghost
Copy link

ghost commented Jun 24, 2020

IIRC, we should be able to do this check at parsing time. i.e. the flags and pp names are split textually.

@nojb nojb force-pushed the instrumentation_framework branch from ea2d216 to 5229892 Compare June 25, 2020 03:56
@nojb
Copy link
Collaborator Author

nojb commented Jun 25, 2020

IIRC, we should be able to do this check at parsing time. i.e. the flags and pp names are split textually.

OK, I pushed a commit rearranging the doc and another checking for an empty (pps) at parsing time. Note that currently, this check is lazier, as it is performed only when actually building the library/executable in question.

src/dune/workspace.mli Outdated Show resolved Hide resolved
src/dune/main.mli Outdated Show resolved Hide resolved
@nojb nojb force-pushed the instrumentation_framework branch from 9b3cbcc to 550f445 Compare June 25, 2020 07:01
@@ -565,7 +566,8 @@ let all_packages t =
- A memoized function for finding packages by names (see [find]).

- A [Memo.Lazy.t] storing the set of all packages (see [root_packages]). *)
let create ~stdlib_dir ~paths ~version ~lib_config =
let create ~paths ~version ~lib_config =
let stdlib_dir = lib_config.Lib_config.stdlib_dir in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

version seems to be present in Lib_config.t. Perhaps we should unify these.

nojb and others added 15 commits July 7, 2020 07:55
Signed-off-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
Signed-off-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
Signed-off-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
Signed-off-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
Signed-off-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
Signed-off-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
Signed-off-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
Signed-off-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
Signed-off-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
Signed-off-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
Signed-off-by: Rudi Grinberg <me@rgrinberg.com>
It's already passed there as a string

Signed-off-by: Rudi Grinberg <me@rgrinberg.com>
Signed-off-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
… lib deps

Signed-off-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
Signed-off-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
@nojb nojb force-pushed the instrumentation_framework branch from c9cbca9 to 028f70e Compare July 7, 2020 06:10
@nojb nojb closed this Jul 7, 2020
@nojb nojb reopened this Jul 7, 2020
@nojb
Copy link
Collaborator Author

nojb commented Jul 7, 2020

I rebased the PR. I also did some small tests with bisect_ppx, everything seems OK. One thing that came up is that bisect_ppx the runtime library is actually called bisect_ppx.runtime and the ppx rewriter is called bisect_ppx, so it made more sense to attach the instrumentation backend to the ppx rewriter (referring to itself) rather than to the runtime library so that users can write --instrument-with bisect_ppx instead of --instrument-with bisect_ppx.runtime.

Travis seems not to be working. Other than that, feel free to merge if you think it is OK.

@ghost
Copy link

ghost commented Jul 7, 2020

Indeed, for the attachment I suppose it's whatever makes the most sense. In general, I expect that authors will attach it to the library named after the package.

I've clicked on "Update branch", let see what happens with Travis

@ghost
Copy link

ghost commented Jul 7, 2020

All the checks are passing now, thanks @nojb for this work!

BTW, once a new version of Dune and landmarks and/or bisect_ppx are released, a blog post to announce the new feature would be great :)

@ghost ghost merged commit 4f0fa1b into ocaml:master Jul 7, 2020
@nojb
Copy link
Collaborator Author

nojb commented Jul 7, 2020

Thanks! OK for the blog post, I am a little short on time right now, but I will try to co-opt @mlasson :)

@nojb
Copy link
Collaborator Author

nojb commented Aug 14, 2020

@rgrinberg Sorry, but it looks like I forgot to update the CHANGES entry for this one (it still contains the mention of bisect_ppx of #3403).

@rgrinberg
Copy link
Member

rgrinberg commented Aug 14, 2020 via email

@nojb
Copy link
Collaborator Author

nojb commented Aug 14, 2020

That’s not a problem. Just send a PR now :) Rudi.

#3707

@nojb nojb deleted the instrumentation_framework branch August 14, 2020 06:16
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants