Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Guess diagnostics #1941

Closed
wants to merge 8 commits into from
Closed

Guess diagnostics #1941

wants to merge 8 commits into from

Conversation

oli-obk
Copy link
Contributor

@oli-obk oli-obk commented Mar 3, 2017

rustfix should be the semantic companion to rustfmt (which is purely syntactical) and automatically change code to be more idiomatic.

To do this, we need to distinguish between automatically applicable code replacements (suggestions) and heuristics (guesses)

Rendered

cc @nrc, @Manishearth, @mcarton, @llogiq, @petrochenkov

some previous discussion at rust-lang/rust#39458

oli-obk and others added 7 commits March 2, 2017 15:14
graphviz dot code:

```dot
digraph G {
     A [ label = "Does the message contain\ncode snippets derived\nfrom real code?"]
     A -> B [ label = "Yes"]
     A -> C [ label = "No"]
     B [ label = "Are there multiple\nsnippets that may apply?"]
     B -> "Guess" [ label = "Yes" ]
     B -> D [ label = "No" ]
     D [ label = "Are there situations\nwhere the snippet is plain wrong?"]
     D -> E [ label = "Yes" ]
     D -> "Suggestion" [ label = "No" ]
     E [ label = "Can you detect those\nsituations programmatically?"]
     E -> G [ label = "Yes" ]
     E -> "Guess" [ label = "No" ]
     G [ label = "Suggestion where correct,\nGuess or no code hint otherwise"]


     C [ label = "Is the message reasonably short?"]
     C -> "Note" [ label = "Yes" ]
     C -> "Help" [ label = "No" ]
}
```
@Manishearth
Copy link
Member

cc @killercup

arbitrary limits on the number of displayed items is removed. This limit
is instead enforced by the command line diagnostic backend.

## Json diagnostics backend API changes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Json/JSON


## Json diagnostics backend API changes

The json object gets a new field `guesses` which is an array of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/json/JSON

= help: `std::collections::btree_map::Iter`
= help: `std::collections::btree_set::Iter`
= help: `std::collections::hash_map::Iter`
= help: and 8 other candidates
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's up with all these lines starting with "help:" anyway? Probably not part of this RFC but this does not look helpful to me. Starting the first line with a (highlighted) "help" should suffice.


The following chart should clarify the decision process for subdiagnostics

![chart](https://cloud.githubusercontent.com/assets/332036/23554529/be228b76-0025-11e7-99e7-627d60f5a328.png)
Copy link
Member

@killercup killercup Mar 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! For sake of keeping this self contained you should either reproduce the state machine in prose, or include the chart as ASCII, though. I spent a few minutes to do the latter with asciiflow.com, feel free to copy this:

                             +--------------------------+
                             | Does the message contain |
                    +--------+ snippets derived         +--------------+
                    |        | from real code?          |              |
                Yes |        +--------------------------+              | No
                    |                                                  |
        +-----------v--------------+                          +--------v----------+
    +---+ Are there multiple       +---+                      | Is the message    |
    |   | snippets that may apply? |   |                      | reasonably short? |
    |   +--------------------------+   |                      +---+-----------+---+
Yes |                                  | No                       |           |
    |                                  |                      Yes |           | No
    |              +-------------------v---------+                |           |
    |              | Are there situations where  |            +---v--+     +--v---+
    |              | the snippet is plain wrong? |            | Note |     | Help |
    |              +---+-----------------------+-+            +------+     +------+
    |                  |                       |
    |              Yes |                       | No
    |                  |                       |
    |    +-------------v----------------+    +-v----------+
    |    | Can you detect those         |    | Suggestion |
    |    | situations programmatically? |    +------------+
    |    +------+-----------------+-----+
    |           |                 |
    |           | No          Yes |
    |           |                 |
 +--v----+      |        +--------v------------------------+
 | Guess <------+        | Suggestion where correct,       |
 +-------+               | Guess or no code hint otherwise |
                         +---------------------------------+

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dot code is in the commit message: aed4d38

I'll always push the dot code together with changes in the image.

You can use an online tool like http://sandbox.kidstrythisathome.com/erdos/ to generate new images and edit the graph.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, nice, I hadn't seen that. Maybe put it in an addendum section?

`Guess` objects. Every `Guess` object contains an array of span +
snippet pairs stating which part of the code should be replaced
with what replacment. There is no need for all guesses to replace
the same piece of code or even require the same number of replacements.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was the JSON structure ever defined in an RFC? All I could find was #1855 (currently open) which has some links. If not, it would be great to define (or at least note) the schema of the JSON objects here, either in Rust syntax or in JSON Schema syntax.

The current iteration of rustfix uses the compiler's JSON output. The Language Server spec may contain its own definition of how to represent these annotations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better for rustfix to directly use the Diagnostics API like RLS wants to do

@nrc nrc added T-compiler Relevant to the compiler team, which will review and decide on the RFC. T-dev-tools Relevant to the development tools team, which will review and decide on the RFC. labels Mar 6, 2017
@nrc
Copy link
Member

nrc commented Mar 6, 2017

Quoting myself (also in the RFC, thanks!):

My work flow for 'automatic application' is that the user opts in to doing this
in an IDE or rustfix tool, possibly with some further input.
I don't think there is a need or expectation that after such an application
the resulting code is guaranteed to be logically correct or even to compile.
So, I think a suggestion with placeholders would be fine as would a suggestion with multiple choices.
In other words, I'd rather make suggestions more flexible than add another category of error.

To clarify slightly - I don't think that the compiler can ever be 100% confident that a change is correct. Therefore, you always need user input to make changes. Given that, it doesn't seem useful to me to be able to express how confident the compiler is in making changes, since it won't affect the UI actually presented to the user.

To expand on why I think there is a cost to adding another category of error suggestion:

  • we already have 3 categories (suggestion, note, help) for this kind of operation. I think this is complex enough
  • I fear that this will end up being exposed to users one way or another, eventually, and then they'll have to deal with this complexity too
  • We (compiler developers) already fail to use the existing categories correctly, I don't think we will do any better with more.

@oli-obk
Copy link
Contributor Author

oli-obk commented Mar 6, 2017

I can't argue point 1 as it is entirely subjective and I've been writing Lint's for too long to be able to judge this even remotely objectively.

Point 2: I don't see how this can be exposed to users, since it's only possible to be part of an existing error or warning message. Could you elaborate on this?

Edit: I'd have no problem with changing the suggested back end api to only have a book flag to state the confidence level, and then adding a lint that forbids using that flag in the command line diagnostic output
Point 3: adding the guess category actually will simplify this, since we can then enforce a no-code-snippet-in-help-or-note rule (by using a lint that checks this invariant). Since we now have guarantees about the code replacements, I can write a craterbomb extension to check the result of applying suggestions and guesses to crates. If suggestions turn up to fail the compilation or tests of the crate, we add such a test case to rustc. If a guess turns out to be unambiguous and its application turns up no compilation or test failures, we change it to be a suggestion.

Making the lint categories automatically checkable will make them correctly used instead of having to rely on reviews.

Also I trust rustfmt not to break my code, if it does it's a bug as I understand it. Why can't we treat suggestions the same?

@nrc
Copy link
Member

nrc commented Mar 6, 2017

Point 2: I don't see how this can be exposed to users, since it's only possible to be part of an existing error or warning message. Could you elaborate on this?

We currently expose "help" vs "note" to users, there is nothing to stop us exposing "guess" at some point (this is basically a 'slippery slope' argument). More realistically, an IDE or tool exposes this to users, rather than rustc doing so.

since we can then enforce a no-code-snippet-in-help-or-note rule

a lint cannot tell the difference between "make this variable a reference" and "make x into &x"

If suggestions turn up to fail the compilation or tests of the crate

This is not good enough, a change could compile, but still be incorrect. For example (and I realise this would be a guess, not a suggestion or whatever, but the point stands for all suggestions and this is an easy example): error is "x does not need to be mutable" suggestion is to turn let mut x = ... into let x = ..., however, that fix is incorrect even though it compiles, because what the programmer should do to fix this is insert x += 1; on the next line.

Making the lint categories automatically checkable will make them correctly used instead of having to rely on reviews.

This is impossible, how can a lint know if you should be confident enough to make a change without a user opting in (Or opting in in bulk, etc.) vs a mere guess at correctness?

Also I trust rustfmt not to break my code, if it does it's a bug as I understand it. Why can't we treat suggestions the same?

Rustfmt does not change code, it only changes whitespace (well, pretty much, there are some other simple transforms, all provably semantics-preserving). The code before and after rustfmt'ing must always be semantically equivalent (and trivially so). The changes that the compiler and Clippy suggest are not trivially equivalent, by a long shot.

@oli-obk
Copy link
Contributor Author

oli-obk commented Mar 6, 2017

a lint cannot tell the difference between "make this variable a reference" and "make x into &x"

A lint can tell whether there's a dataflow from a call to span_snippet to the text for a span_help/span_note call.

This is not good enough, a change could compile, but still be incorrect. For example (and I realise this would be a guess, not a suggestion or whatever, but the point stands for all suggestions and this is an easy example): error is "x does not need to be mutable" suggestion is to turn let mut x = ... into let x = ..., however, that fix is incorrect even though it compiles, because what the programmer should do to fix this is insert x += 1; on the next line.

This is exactly why I opened this RFC. In clippy we have many suggestions which we consider suboptimal, due to the situation you described. But we also have suggestions which are so very uncontroversial, that we want a way to mark them as automatically applicable.

This is impossible, how can a lint know if you should be confident enough to make a change without a user opting in (Or opting in in bulk, etc.) vs a mere guess at correctness?

The confidence would come from testing the lints through cargobomb. So developers would choose categories through best-effort (or just use guesses and let cargobomb detect that the guess should be a suggestion).

The code before and after rustfmt'ing must always be semantically equivalent (and trivially so). The changes that the compiler and Clippy suggest are not trivially equivalent, by a long shot.

While clippy has quite a few nontrivial changes, I think the list below shows that clippy has a lot of trivial changes which are often not more complex than rustfmt's if (x) to if x conversion, but most of them cannot be done by a purely syntactical analysis, due to type information being required for the decision of linting.

Note: Please don't argue the idiomaticality of lint suggestions in this PR, just whether the transformations are trivial (or incorrect)

The lint list of trivial clippy transformations:

  • assign_op_pattern ( x = x + 1 should be x += 1)
  • assign_ops (x += 1 should be x = x + 1)
  • bool_comparison (x == true should be x)
  • char_lit_as_u8 ('a' as u8 should be b'a')
  • clone_on_copy (x.clone() should be *x if x: Copy)
  • cmp_nan (x == f32::NAN should be x.is_nan())
  • cmp_null (x == ptr::null() should be x.is_null())
  • collapsible_if (if x { } else { if y {} } should be if x {} else if y {}
  • deref_addrof (*&x should be x)
  • explicit_into_iter_loop (for i in v.into_iter() should be for i in v)
  • explicit_iter_loop (for i in v.iter() should be for i in &v)
  • for_kv_map (for (k, _) in &map should be for k in map.keys() if map is in a whitelist of maps that support this
  • get_unwrap (v.get(n).unwrap() should be v[n] if v is in a whitelist of indexeables)
  • if_let_redundant_pattern_matching (if let Ok(_) = res should be if res.is_ok())
  • if_let_some_result (if let Some(x) = res.ok() should be if let Ok(x) = res)
  • if_not_else (if !cond { a } else { b } should be if cond { b } else { a }
  • iter_cloned_collect (slice.cloned().collect() should be slice.to_vec())
  • iter_nth (x.iter().nth(i) should be x.get(i) if x is in a whitelist of indexeables)
  • iter_skip_next (x.skip(n).next() should be x.nth(n))
  • len_zero (x.len() == 0 should be x.is_empty())
  • manual_swap (let t = b; b = a; a = t; should be swap(&mut a, &mut b)
  • map_clone (iter.map(|e| e.clone()) should be iter.cloned())
  • needless_bool (if c { true } else { false } should be c)
  • needless_borrow (let x: &i32 = &&5 should be let x: &i32 = &5)
  • needless_return (fn foo (x: usize) -> usize { return x; } should be fn foo(x: usize) -> usize(x))
  • neg_multiply (x * -1 should be -x)
  • ok_expect (res.ok().expect("foo") should be res.expect("foo") if res's error type impls Debug)
  • option_map_unwrap_or (x.map(y).unwrap_or(z) should be x.map_or(z, y))
  • option_map_unwrap_or_else (see option_map_unwrap_or)
  • ptr_arg (fn foo(&Vec<u32>) should be fn foo(&[u32]))
  • range_zip_with_len ((0..).zip(x.iter()) should be x.iter().enumerate())
  • redundant_closure (|x| foo(x) should be foo if foo is a function)
  • redundant_closure_call ((|| 42)() should be 42)
  • redundant_pattern (y @ _ should just be y)
  • search_is_some (iter.find(f).is_some() should be iter.any(f))
  • short_circuit_statement (f() && g(); should be if f() { g(); }
  • should_assert_eq (assert!(x == y) should be assert_eq!(x, y) if both impl Debug)
  • single_char_pattern (s.split("x") should be s.split('x'))
  • single_match(_else) (match x { Some(val) => f(val), _ => { stuff() } } should be if let Some(val) = x { f(val) } else { stuff() }
  • string_lit_as_bytes ("foo".as_bytes() should be b"foo")
  • toplevel_ref_arg (let ref x = foo should be let x = &foo)
  • transmute_ptr_to_ref (transmute<*const T, &T>(p) should be &*p)
  • unnecessary_cast (2i32 as i32 should be 2i32)
  • unnecessary_mut_passed (foo(&mut v) should be foo(&v) if foo's argument is a &T)
  • unseparated_literal_suffix (123832i32 should be 123832_i32)
  • while_let_loop (handwritten for loop should be for loop)
  • zero_ptr (0 as *const _ should be ptr_null())
  • match_bool (match cond { true => { a }, false => { b } } should be if cond { a } else { b })
  • match_ref_pats (match x { &A => {} } should be match *x { A => {}})

The lint list of nontrivial clippy transformations:

  • chars_next_cmp (name.chars().next() == Some('_') should be name.starts_with('_')
  • explicit_counter_loop (for i in 0..v.len() { v[i].something() } should be for el in &mut v { el.something() }
  • filter_map (_.filter(_).map(_) should be _.filter_map(_)
  • filter_next (_.filter(_).next() should be _.find(_)

The lint list of possibly wrong clippy transformations:

  • clone_double_ref (foo.clone() should be (*foo).clone() if foo: &&T
  • cmp_owned (x.to_owned() == y should be x == y)
  • let_and_return ({ let x = ..; x } should just be x)
  • map_entry
  • match_same_arms (match x { A => {}, B => {} } should be match x { A | B => {} })
  • misrefactored_assign_op (a += a + b should be a += b changes semantics)
  • needless_pass_by_value (fn foo(v: Vec<i32>) -> usize { v.len() } should be fn foo(v: &[i32]) -> usize { v.len() }, changes API)
  • nonminimal_bool (simplifies complex bool expressions, but ignores short circuiting)
  • or_fun_call (foo.or(bar(x)) should be foo.or_else(|| bar(x))

The lint list of lints that detect bugs and have a suggestion

  • for_loop_over_option (for v in opt should be if let Some(v) = opt) most likely a bug, so suggestion is still a bug
  • for_loop_over_result (see for_loop_over_option)
  • identity_op (x + 0 should just be x)
  • almost_swapped
  • approx_constant
  • min_max (min(0, max(100, x)) should be max(0, min(100, x)))
  • needless_update (Point { x: 1, y: 0, ..zero_point } should not have ..zero_point if Point has only x and y as fields)
  • zero_prefixed_literal (0123 should be 123)

@Manishearth
Copy link
Member

More realistically, an IDE or tool exposes this to users, rather than rustc doing so.

An easy way to avoid this is to have an extra boolean or enum argument on span_suggestion and/or make it an extra boolean flag in the json. This was in fact my original proposal, where they're all suggestions, but with a bool on the json.

@nrc
Copy link
Member

nrc commented Mar 6, 2017

A lint can tell whether there's a dataflow from a call to span_snippet to the text for a span_help/span_note call.

Sorry, could you explain a bit more how this helps please?

The confidence would come from testing the lints through cargobomb. So developers would choose categories through best-effort (or just use guesses and let cargobomb detect that the guess should be a suggestion).

This relies on the fact that the project has enough test coverage to detect errors, which seems optimistic to rely on. This also seems like a big project to implement.

Note: Please don't argue the idiomaticality of lint suggestions in this PR, just whether the transformations are trivial (or incorrect)

So I won't argue idiomaticality of individual lints, but suffice it to say there are more than a few on the 'trivial' list that I don't agree with in all situations and would be uncomfortable to blanket apply to my code. I think about half of those lints are suitable for automatic application (with others being almost certainly correct, but perhaps justified by the context, or IMO not that trivial). In any case, I think that there is enough room for doubt in automatic application that whether these things should be automatic or opt-in (i.e., suggestions or guesses) should be debated, and that is what leads to grey areas and messy boundaries and ultimately more complexity.

I will say, that when considering the Clippy lints in detail, I see the motivation for this much more clearly than when considering mainly the compiler's suggestions (which, under my current thinking should basically all be guesses). I suppose that a Clippy-specific solution is not possible because of its tight integration with the compiler? Perhaps a confidence flag on errors that only Clippy uses?

@nrc
Copy link
Member

nrc commented Mar 6, 2017

An easy way to avoid this is to have an extra boolean or enum argument on span_suggestion and/or make it an extra boolean flag in the json.

I don't think this guarantees avoiding the situation, but I agree it does make it less likely

@Manishearth
Copy link
Member

What about lints like unused imports in the compiler?

Yes, that often means that someone has forgotten something, but I think a difference between rustfmt and rustfix is that you generally just run-and-forget rustfmt whereas rustfix is meant to be reviewed a bit more carefully. Not too much more, but a bit more.

I'm fine with a confidence flag on errors only used by clippy, though the json needs to support it too.

@nrc
Copy link
Member

nrc commented Mar 6, 2017

What about lints like unused imports in the compiler? ...whereas rustfix is meant to be reviewed a bit more carefully

ISTM that being confident enough that a user should not opt-in to a change, but not so confident that they have to review the diff afterwards is a bad place to be, UX-wise. Unused imports seems like exactly the kind of lint where you should opt-in because it can be solved two different ways.

@oli-obk
Copy link
Contributor Author

oli-obk commented Mar 7, 2017

A lint can tell whether there's a dataflow from a call to span_snippet to the text for a span_help/span_note call.

Sorry, could you explain a bit more how this helps please?

My idea is that we don't have any code snippets whatsoever in span_help or span_note. The following piece of code, uses the result from a call to snippet somewhere in the arguments of span_help.

let x = snippet(e.span);
sess.span_help(e.span, &format!("look at this {}, it's the reason", x));

By tracing the dataflow in the MIR of this piece of code, we can automatically detect this kind of code and forbid it inside rustc. We already have some lints in clippy than enforce our own invariants. These lints make little sense outside clippy. We also have lints helping with the API of serde. Such specialized lints can help us enforce the correct usage of code hints vs help/note.

The confidence would come from testing the lints through cargobomb. So developers would choose categories through best-effort (or just use guesses and let cargobomb detect that the guess should be a suggestion).

This relies on the fact that the project has enough test coverage to detect errors, which seems optimistic to rely on.

True, but I believe it's ok for some bugs to slip through.

This also seems like a big project to implement.

Not really, all the heavy duty work has been done. I'll just insert an intermediate call to (a future version of) rustfix (which knows about suggestions and guesses) and recompile with the same compiler instead of a different one like cargobomb does now.

Perhaps a confidence flag on errors that only Clippy uses?
An easy way to avoid this is to have an extra boolean or enum argument on span_suggestion and/or make it an extra boolean flag in the json.

I don't see how this is different from my proposal. If you don't want suggestions in rustc, but only guesses, it would be easy enough to simply not expose span_suggestion anymore, but implement it in clippy. Rustc would then only have span_guess{es}.

In fact, adding another field to span_suggestion, be it a bool or an enum, will make them harder to use, not easier.

In case you are talking about the backend API, I'd be happy to change it to just be a boolean flag instead of an enum, but I'd still want to make that explicit CodeHint change instead of squashing code hints into the regular diagnostics. If it were just a bool flag, the command line diagnostics would simply ignore it, while json would do something useful with it. Would this reduce your fear of accidental exposure of the suggestion<->guess difference to users?

ISTM that being confident enough that a user should not opt-in to a change, but not so confident that they have to review the diff afterwards is a bad place to be, UX-wise.

I agree with you there. And I have a solution that I'm absolutely curious about what others think of it:

Only automatically apply suggestions for lints which are deny.

My reasoning is that if you opt in to have a lint break your build if it triggers, then you also opt in for it to automatically fix your code. Rustfix would treat all warn lints' suggestion as if they were guesses. This also means, that if a deny lint produces a guess, it's still just a guess and your code doesn't get autofixed. see below, this behaviour would be way too confusing.

@Ixrec
Copy link
Contributor

Ixrec commented Mar 7, 2017

I agree with you there. And I have a solution that I'm absolutely curious about what others think of it:

Only automatically apply suggestions for lints which are deny.

To me using deny normally feels like "compiler, don't let me get away with being lazy about this", i.e. it's for things I need to do, like deny(missing_docs). Using it to inform automatic code changes sounds like "compiler, please fix this for me whenever it comes up here (but only in this specific part of my code for some reason)" which just seems weird to me. I can't think of an example of an automatic code transformation rustc/clippy could apply which I'd want only in some parts of my code, much less exactly the parts where I'd deny() the corresponding warning.

@oli-obk
Copy link
Contributor Author

oli-obk commented Mar 7, 2017

To me using deny normally feels like "compiler, don't let me get away with being lazy about this", i.e. it's for things I need to do, like deny(missing_docs). Using it to inform automatic code changes sounds like "compiler, please fix this for me whenever it comes up here (but only in this specific part of my code for some reason)" which just seems weird to me.

True... Ignore what I said then...

Then the only nice solution left is a rustfix.toml (was actually my first idea to solve this, but somehow I got that deny business into my mind), which requires opt-in for anything even remotely controversial.

I can't think of an example of an automatic code transformation rustc/clippy could apply which I'd want only in some parts of my code, much less exactly the parts where I'd deny() the corresponding warning.

The idea is that you'd deny them for your crate. It could be taken even further to only do the automatic fixing if the lint was denied for the entire crate.

An example would be char_lit_as_u8. You'd never deny that for just one function or statement. You either do it for the entire crate, or not.

@burdges burdges mentioned this pull request Mar 7, 2017
@solson
Copy link
Member

solson commented Mar 8, 2017

On some classes of errors, Clang provides so-called "Fix-It" hints (search for the term here) that tell you exactly what needs to be changed to fix the error. And not only do they tell you, they're also understood by IDE tools like vim's YouCompleteMe plugin. In vim I can hit a keybind over an erroneous line of C++ with a fix-it error and have it immediately, automatically fixed.

I'd like it if rustc itself had a notion of fix-it errors. They would be especially easy to represent in the JSON error output for tools, and we can look at Clang for ideas on how to render them for humans.

Then, to achieve the aims of this RFC, we could have clippy optionally emit fix-its in the exact same format, so IDEs could apply them the same way they apply rustc fix-its.

A rustfix tool could simply be a command-line tool that runs rustc/clippy, collects the fix-it errors, and applies them all (perhaps based on some rules or interactively questioning the user).

@gnzlbg
Copy link
Contributor

gnzlbg commented Mar 8, 2017

On some classes of errors, Clang provides so-called "Fix-It" hints (search for the term here) that tell you exactly what needs to be changed to fix the error.

Note also that clang-tidy can automatically apply (most of) this fix-its for you, and that the tool states that doing so will never break your code (it states that it won't apply fixes that result in compilation errors).

I have been using clang-tidy for over a year regularly and they should rename the clang-tidy -fix option to clang-tidy -yolo. Beyond toy examples specifically crafted to make clang-tidy fix-its work, I have yet to see a real code-base that is not completely ruined destroyed obliterated annihilated by clang-tidy. Applying fix-its automatically with the tool, even the most trivial ones, has not yet produced output that compiles, not even once. And I have to say that I use it daily, so the problems that it discovers are minor incremental issues. It still manages to produce builds with 1000s or 10000s of compilation errors, which I always consider an amazing feat.

And I love clang-tidy, is like clippy but for C++, it is really good, but fixing C++ code automatically seems very hard.

IMO a rustfix tool with a --fix option that automatically apply the changes should apply each change one-by-one, recompile, run tests, and if nothing breaks, keep the change, and move to the next one, otherwise, ignore that particular change. This is slow, but I can do something else in the mean time, or have this as a commit hook and be guaranteed that it will always work. Unless we can do a crater run and no crates in crates.io are broken by the tool, applying changes automatically without quality-control should be performed by a --yolo option instead.

EDIT: Fun fact: I also remember one case in which clang-tidy never terminates. It has two lints, one that tells you that an argument is going out of scope, and that if you pass it to a function you should consume it using std::move, and another one that tells you that doing a std::move on Copy types is misleading, because it just copies the memory, so you should avoid it to make the cost explicit. The one that adds std::move did not know about the corner case of Copy types, so it would add std::move on e.g. std::array<int, N> and the other lint would remove it, in a loop, fun times :D (this is fixed now though, but it shows that combinations of lints might interact in interesting ways, and as the number of lints explodes, its hard to keep all these interactions in mind while running new lints, and also to test all possible combinations against each other) .

@oli-obk
Copy link
Contributor Author

oli-obk commented Mar 8, 2017

fixing C++ code automatically seems very hard.

It is. Rust is much easier IMO.

We could eliminate this entire rfc by simply adding a whitelist to rustfix. All Lint's in the list get automatically applied. This moves the burden to rustfix instead of the lint producer. The whitelist could be modified by a rustfix.toml. @killercup opinions on such a whitelist?

@killercup
Copy link
Member

We could eliminate this entire rfc by simply adding a whitelist to rustfix. All Lint's in the list get automatically applied.

Assuming you mean the "rustfix" CLI tool, sure. But "rustfix" as the general idea of automatically fixing Rust code? That would mean every editor plugin that uses rustc's suggestion output would need to have such a list. A list, I might add, that is highly specific to rustc and clippy versions.

@oli-obk
Copy link
Contributor Author

oli-obk commented Mar 8, 2017

Such a list will be necessary anyway, as there are obvious preferences as to what should be automatically fixed and what not. Similar to what should be linted being decided by allow and warn attributes.

So how about a fix attribute for suggestions and guesses, where suggestions are fix by default and guesses are not. Then you can opt into new fixes like opting into new lints

@notriddle
Copy link
Contributor

cmp_nan (x == f32::NAN should be x.is_nan())

This transformation is not semantics preserving. The semantics-preserving version is:

cmp_nan (x == f32::NAN should be false)

@Manishearth
Copy link
Member

It's "preserving the semantics you actually meant in the first place", partialeq with NaN is completely useless.

@nrc
Copy link
Member

nrc commented Mar 8, 2017

Could Clippy talk directly to Rustfix? At this point it seems like this facility is purely for Clippy to communicate to rustfix, so it seems sub-optimal to have to alter the compiler.

@Manishearth
Copy link
Member

I'm really unconvinced that the compiler will never make use of this.

In general I feel like builtin APIs are the way to go for something you eventually want IDEs to use. Also, most of clippy will live in rust-lang/rust soon anyway, so that distinction would be blurred anyway.


I think it's suboptimal to basically duplicate an API (and have to figure out a way to thread args down to a sideloaded plugin). The compiler will be talking to rustfix too -- regardless of whether or not the compiler will be making certain-suggestions (as opposed to "guesses"). Rustfix already has an interactive mode where it takes all compiler suggestions and asks you individually if you want to apply them. So regardless of the situation on guesses, the compiler will be talking to rustfix via this API. Clippy uses the same API for diagnostics already, and thus talks to rustfix the same way. This is already the case, regardless of future situations of a "guess vs certain" API.

The only difference is that clippy wants to pass an extra bit of information when communicating to rustfix (a bit of information which I feel the compiler would find useful too, but I digress). I think it's overkill to open a separate side channel to transmit this single bit of information.

@eulerdisk
Copy link

I'm playing around with the rust compiler these days. I'm trying to improve E0046, printing out copy-pastable ready-to-use trait method implementations (with unimplemented! body).
The compiler is the right place to generate such suggestion, because the compiler can infer the generic types from the impl giving correct method signatures.
That would be a "implement missing methods" quickfix action, which is one of the most useful feature that an IDE can have.
[The "hard" part is done, but it crashes when the trait method has generic params. My knowledge of the rust internals is limited, so maybe I should seek some mentoring 😭 ]
[It infers correct method signatures for generic trait params though. ]

I was thinking of a method to show such type of suggestions and then I see this RFC, just in time.
I'm currently working on methods, but the same logic apply for associated types and consts. In that case some kind of placeholder mechanism is necessary though.

@oli-obk
Copy link
Contributor Author

oli-obk commented Mar 9, 2017

@eulerdisk for now you can just use span_suggestion and use _ or ... as a placeholder. We'll figure out what to do with the suggestion once this RFC gets resolved.

@nrc
Copy link
Member

nrc commented Mar 9, 2017

Also, most of clippy will live in rust-lang/rust soon anyway

I don't think we're planning on doing this. AFAIK, the plan is that rustup will support pulling in Clippy to give it the privs it needs to work with the compiler even on stable. It will still live in its own repo and interact with the compiler via plugin mechanisms.

The current effort is to move things out of the Rust repo (e.g., liblog, rustdoc), not move more things in.

Anyway, I think you misunderstood my (admittedly over-brief) question - I'm not suggesting a side-channel as such (although I guess that might work), I'm wondering if there is a more general/scalable solution than adding specific complexity to a compiler-internal API to support the needs of a plugin and an external tool.

I would prefer, for example, a generic way for plugins to add information to JSON error messages (and perhaps human-readable error messages too), which in Clippy's case could be a confidence field on suggestions.

@oli-obk
Copy link
Contributor Author

oli-obk commented Mar 9, 2017

We could simply add a hashmap to the back end api. But I don't see the benefit. Clippy is so closely tied to the compiler, and any tool using the diagnostic api will be, too, that you either need to start creating a stable api or do it as it's done now and modify the api to fit the use cases.

In the case of this PR, the single bool change to the back end api is so negligible, I fail to see how adding it can make anyone's life harder. We can even drop the frontend api suggested and implement it in clippy. The only change to rustc would be in the json back end and in an "unrelated" change allowing multiple suggestions per diagnostic.

@eddyb
Copy link
Member

eddyb commented Mar 9, 2017

@nrc FWIW what I understood is we'd ship clippy by building it against the compiler at release time.

@Manishearth
Copy link
Member

I don't think we're planning on doing this. AFAIK, the plan is that rustup will support pulling in Clippy to give it the privs it needs to work with the compiler even on stable. It will still live in its own repo and interact with the compiler via plugin mechanisms.

That was the original plan, and I personally am okay with it. However, acrichto wants there to be per-push blocking CI, without a complicated sync model (a very sensible ask IMO), which means that libclippy_lints will have to live in rust-lang/rust (even if cargo-clippy lives out of tree)

@oli-obk
Copy link
Contributor Author

oli-obk commented Mar 11, 2017

I am not sure how to proceed on this RFC.

I could add a generic way to add data to diagnostics, but would prefer to do that once (if ever) there are more than one uses of such an API. In my book an underused API is worse than a adding a specific just-for-a-single-use-case field to an API that will never be stable to best of my knowledge.

@gnzlbg
Copy link
Contributor

gnzlbg commented Mar 11, 2017 via email

@oli-obk
Copy link
Contributor Author

oli-obk commented Mar 24, 2017

There has been no activity on this RFC for two weeks.

The current state is that there are four ways to continue:

  1. Don't do any of this
  2. Go through the full changes as the RFC stands now (adding an enum that differentiates between a single suggestion and one or many guesses)
  3. Add a single bool flag denoting that the suggestion has full confidence in being correct.
  4. Add a generic mechanism to diagnostics for providing custom user defined diagnostics

My opinions and comments to these four ways

  1. 😢
  2. I created the RFC, so I obviously support going through with it
  3. I'd be fine reducing the RFC to a single bool flag which can trivially be ignored in the regular diagnostics
  4. Total overkill in my opinion, especially since I don't see any use case other this RFC. I'd be willing to implement this if other use cases crop up, but until then I'd prefer to do 2. or 3.

@oli-obk
Copy link
Contributor Author

oli-obk commented Mar 24, 2017

Or what's the worst that could happen?

I think the issue is that once we start adding tool specific stuff to internal APIs, it becomes "easier" to add more and more in the future and then the API becomes big and messy.

@pnkfelix
Copy link
Member

nominating: I suspect this is best addressed by the tools team (not the compiler one), but in any case, there's been no activity (beyond that of the RFC author) for over two weeks, so it would be good to get a shepherd assigned (assuming that's what the tools team does...)

@oli-obk
Copy link
Contributor Author

oli-obk commented Apr 25, 2017

This has been dormant for a month. Can someone ping the tools team as suggested?

@Manishearth
Copy link
Member

cc @rust-lang/tools

@aturon
Copy link
Member

aturon commented Apr 29, 2017

@oli-obk I think the problem is that @nrc is on parental leave :-)

FWIW, the high-level idea here sounds great to me, and I agree with @Manishearth that this facility seems highly likely to be eventually used by the compiler, not just Clippy. Adding the feature will likely cause us to start finding opportunities.

I also agree with @oli-obk that jumping straight to a generic plugin system seems like premature abstraction.

What's the stabilization path here, and how difficult is the initial migration? Could we land this on a more or less experimental basis?

@oli-obk
Copy link
Contributor Author

oli-obk commented Apr 29, 2017

Since plugins are unstable, there's no stabilization path. We can even leave it out of the json for now and require rustfix to hook into the diagnostics api directly.

I can create a PR with one of the impls. I don't want to go around @nrc 's back and move this along with one of my fancier designs. I suggest we add the bool flag, since that has the most minimal impact, and can be converted into a more general system later easily

@Manishearth
Copy link
Member

I think we have to worry about the stabilization of the JSON format. But that can be appended to with minimal issues I guess.

@nrc
Copy link
Member

nrc commented May 4, 2017

The tools team discussed this in our meeting today. The general sentiment was that the motivation here should be addressed in as specific and minimally disruptive way as possible.

We would like to propose trying the bool flag and seeing how that works. In order to avoid the flag riding the trains to stability, by default, the compiler should not include the flag in the JSON output. A compiler plugin or tool (such as Clippy) should be able to opt-in to the flag existing either at compile time or run time, whichever is more convenient to implement. When generating suggestions (or other diagnostic) it should be optional whether the flag is provided or not.

Thus, we see the following situation for testing things out: Clippy turns the flag on and Clippy lints set the confidence levels. Rustfix can read this flag to do automatic corrections. Under normal usage, the compiler does not issue the flag. Compiler errors mostly do not set the flag, but can incrementally do so where useful. We'll evaluate how useful this is along the usual path to stability. If this sounds OK to the stakeholders in this thread, then we don't think it needs a new RFC and can just be implemented.

Assuming the above is OK, then this RFC should enter FCP with intent to close (@rfcbot FCP close).

@oli-obk
Copy link
Contributor Author

oli-obk commented May 4, 2017

I'm all on board with this, but have no idea what you mean by opt-in. It can't be a cfg since that would require rebuilding rustc. Would it be allright if we didn't show the flag in the diagnostics api at all, but instead just added it to the structures in the background? Clippy could then experiment with the flag without inconveniencing every span_suggestion call in rustc

@oli-obk oli-obk closed this May 4, 2017
@nrc
Copy link
Member

nrc commented May 5, 2017

but have no idea what you mean by opt-in

I meant that there should be some bool somewhere like use_confidence_flag which Clippy sets to true and rustc leaves false otherwise, and that the confidence flag only appears in the diagnostics if use_confidence_flag is true.

Would it be allright if we didn't show the flag in the diagnostics api at all, but instead just added it to the structures in the background?

This seems fine too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-compiler Relevant to the compiler team, which will review and decide on the RFC. T-dev-tools Relevant to the development tools team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.