Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove ConstValue::Slice #105653

Closed
wants to merge 3 commits into from
Closed

Remove ConstValue::Slice #105653

wants to merge 3 commits into from

Conversation

oli-obk
Copy link
Contributor

@oli-obk oli-obk commented Dec 13, 2022

r? @RalfJung

fixes #105536

This only removes Slice, we can do ZST and Scalar individually to judge their performance effect

@rustbot
Copy link
Collaborator

rustbot commented Dec 13, 2022

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @RalfJung (or someone else) soon.

Please see the contribution instructions for more information.

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 13, 2022
@rustbot
Copy link
Collaborator

rustbot commented Dec 13, 2022

Some changes occurred to the CTFE / Miri engine

cc @rust-lang/miri

Some changes occurred in src/tools/clippy

cc @rust-lang/clippy

Some changes occurred to the CTFE / Miri engine

cc @rust-lang/miri

Some changes occurred in compiler/rustc_codegen_cranelift

cc @bjorn3

@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 13, 2022

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@bors
Copy link
Contributor

bors commented Dec 13, 2022

⌛ Trying commit d9b98ef0f3dd6dc20cf835a154c5bf3bee839767 with merge 45be547054cbb06ab494b34fe16bb68a559385b2...

Comment on lines 178 to 179
Immediate::ScalarPair(a, b) => ConstValue::ByRef {
alloc: ecx.tcx.intern_const_alloc(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is silly, we start with an MPlace, then get the immediate, and then intern again?

We should just make sure that try_as_immediate is false for ScalarPair, then this match arm here is unreachable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we... don't always start with an MPlace. I tried this originally.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait no there are more hacks here than I remember. Just saw this comment

        // It is guaranteed that any non-slice scalar pair is actually ByRef here.
        // When we come back from raw const eval, we are always by-ref. The only way our op here is
        // by-val is if we are in destructure_mir_constant, i.e., if this is (a field of) something that we
        // "tried to make immediate" before. We wouldn't do that for non-slice scalar pairs or
        // structs containing such.

But, still, the ScalarPair arm should then be unreachable, right?

How far away are we from being able to remove destructure_mir_constant entirely, porting all its users to valtrees?

Copy link
Member

@RalfJung RalfJung Dec 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this shows that we have an impedence mismatch if Immediate can contain ScalarPair but we still need to be able to convert arbitrary OpTy into ConstValue. But it's still a plus if we can do the ScalarPair/Slice handling once centrally in op_to_const rather than each codegen backend having to do this.

FWIW to fix #105536 it would probably be enough to change only try_as_immediate... then maybe the former intern_const_alloc arm here becomes unreachable?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we... don't always start with an MPlace. I tried this originally.

Ah, I think ConstProp will also go through this and it can produce pairs. I hope the new ConstProp will use valtrees instead...

@RalfJung
Copy link
Member

Something is odd with ByRef const printing: the ByRef points to an allocation that contains a wide reference, which does not get printed. That allocation however contains a pointer to the actual string data, and that gets printed.

@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 13, 2022

Something is odd with ByRef const printing: the ByRef points to an allocation that contains a wide reference, which does not get printed. That allocation however contains a pointer to the actual string data, and that gets printed.

yea, this is because the ByRef contains an &'tcx Allocation instead of an AllocId, causing that allocation to not get picked up

@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 13, 2022

@rust-timer build 45be547054cbb06ab494b34fe16bb68a559385b2

@rust-timer

This comment has been minimized.

@RalfJung
Copy link
Member

Something is odd with ByRef const printing: the ByRef points to an allocation that contains a wide reference, which does not get printed. That allocation however contains a pointer to the actual string data, and that gets printed.

yea, this is because the ByRef contains an &'tcx Allocation instead of an AllocId, causing that allocation to not get picked up

That's a bug though (and doesn't seem to be fixed by the last commit) -- from the current rendering it is impossible to tell which constant in the source refers to which of the allocations at the bottom.

@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 13, 2022

That's a bug though (and doesn't seem to be fixed by the last commit)

it's preexisting. I'm not fixing everything that's wrong about constants in this PR, just one "small" thing 😆

@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 13, 2022

from the current rendering it is impossible to tell which constant in the source refers to which of the allocations at the bottom.

that's only relevant for when the source above actually mentions alloc ids. It's a bit too much info printed at the bottom now, I can filter that out if you want

@RalfJung
Copy link
Member

it's preexisting. I'm not fixing everything that's wrong about constants in this PR, just one "small" thing laughing

Hm, fair. Can you open an issue to track this?

@RalfJung
Copy link
Member

that's only relevant for when the source above actually mentions alloc ids. It's a bit too much info printed at the bottom now, I can filter that out if you want

Yeah, if we don't show the allocID there's little point in printing the corresponding allocation.

@rustbot rustbot added the T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) label Dec 13, 2022
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (45be547054cbb06ab494b34fe16bb68a559385b2): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.6% [0.2%, 1.5%] 21
Regressions ❌
(secondary)
7.1% [0.3%, 22.8%] 19
Improvements ✅
(primary)
-0.4% [-0.4%, -0.4%] 1
Improvements ✅
(secondary)
-0.7% [-0.7%, -0.7%] 1
All ❌✅ (primary) 0.6% [-0.4%, 1.5%] 22

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.8% [0.6%, 4.1%] 5
Regressions ❌
(secondary)
8.0% [1.1%, 19.7%] 30
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-2.0% [-2.5%, -1.5%] 2
All ❌✅ (primary) 1.8% [0.6%, 4.1%] 5

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.8% [1.2%, 2.8%] 5
Regressions ❌
(secondary)
16.6% [1.7%, 37.1%] 16
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 1.8% [1.2%, 2.8%] 5

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 22, 2023
@oli-obk
Copy link
Contributor Author

oli-obk commented May 22, 2023

paired with #111768 the LLVM regressions go away entirely.

There is still a major regression in the coercions benchmark that I don't think is resolvable. creating a single Allocation is simply more expensive than creating a value of ConstValue::Slice.

We could explore using some SmallVec or similar for all the small Allocations that are created everywhere.

@RalfJung
Copy link
Member

These are small allocations for string literals in the source, I assume? Can those generate a valtree instead of an Allocation?

@oli-obk
Copy link
Contributor Author

oli-obk commented May 23, 2023

These are small allocations for string literals in the source, I assume? Can those generate a valtree instead of an Allocation?

yea, that's the plan, getting rid of ConstValue in const patterns entirely and only having ValTrees there. I think I wanted to remove ConstValue::Slice first, as I thought it was annoying to move to valtrees otherwise, but I can't remember why I thought that. I'll do the valtree stuff first.

bors added a commit to rust-lang-ci/rust that referenced this pull request May 30, 2023
Optimize scalar and scalar pair representations loaded from ByRef in llvm

in rust-lang#105653 I noticed that we were generating suboptimal LLVM IR if we had a `ConstValue::ByRef` that could be represented by a `ScalarPair`. Before rust-lang#105653 this is probably rare, but after it, every slice will go down this suboptimal code path that requires LLVM to untangle a bunch of indirections and translate static allocations that are only used once to read a scalar pair from.
RalfJung pushed a commit to RalfJung/miri that referenced this pull request May 31, 2023
Optimize scalar and scalar pair representations loaded from ByRef in llvm

in rust-lang/rust#105653 I noticed that we were generating suboptimal LLVM IR if we had a `ConstValue::ByRef` that could be represented by a `ScalarPair`. Before rust-lang/rust#105653 this is probably rare, but after it, every slice will go down this suboptimal code path that requires LLVM to untangle a bunch of indirections and translate static allocations that are only used once to read a scalar pair from.
antoyo pushed a commit to rust-lang/rustc_codegen_gcc that referenced this pull request Jun 3, 2023
Optimize scalar and scalar pair representations loaded from ByRef in llvm

in rust-lang/rust#105653 I noticed that we were generating suboptimal LLVM IR if we had a `ConstValue::ByRef` that could be represented by a `ScalarPair`. Before rust-lang/rust#105653 this is probably rare, but after it, every slice will go down this suboptimal code path that requires LLVM to untangle a bunch of indirections and translate static allocations that are only used once to read a scalar pair from.
@oli-obk
Copy link
Contributor Author

oli-obk commented Jun 20, 2023

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 20, 2023
@bors
Copy link
Contributor

bors commented Jun 20, 2023

⌛ Trying commit 246d190 with merge d870714f42cfdb6491316e5c358010b2268d6e93...

@bors
Copy link
Contributor

bors commented Jun 20, 2023

☀️ Try build successful - checks-actions
Build commit: d870714f42cfdb6491316e5c358010b2268d6e93 (d870714f42cfdb6491316e5c358010b2268d6e93)

1 similar comment
@bors
Copy link
Contributor

bors commented Jun 20, 2023

☀️ Try build successful - checks-actions
Build commit: d870714f42cfdb6491316e5c358010b2268d6e93 (d870714f42cfdb6491316e5c358010b2268d6e93)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (d870714f42cfdb6491316e5c358010b2268d6e93): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.4% [0.2%, 0.9%] 17
Regressions ❌
(secondary)
13.3% [0.2%, 27.6%] 19
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.4% [0.2%, 0.9%] 17

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.3% [0.5%, 2.7%] 3
Regressions ❌
(secondary)
8.7% [2.2%, 22.2%] 18
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-3.2% [-4.4%, -2.5%] 4
All ❌✅ (primary) 1.3% [0.5%, 2.7%] 3

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
27.4% [4.7%, 46.3%] 15
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) - - 0

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.3% [0.0%, 1.2%] 74
Regressions ❌
(secondary)
0.2% [0.0%, 0.4%] 15
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.3% [0.0%, 1.2%] 74

Bootstrap: 658.605s -> 656.253s (-0.36%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 20, 2023
@oli-obk
Copy link
Contributor Author

oli-obk commented Jun 21, 2023

The coercions regression is almost entirely due to the allocation and deallocation of allocations for slice constants:

51,379,440  ???:<rustc_const_eval::interpret::memory::AllocRef<rustc_middle::mir::interpret::AllocId, ()>>::read_scalar
44,257,269  ???:rustc_const_eval::interpret::intern::intern_const_alloc_recursive::<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>
33,685,073  ???:<rustc_const_eval::interpret::memory::AllocRefMut<rustc_middle::mir::interpret::AllocId, ()>>::write_scalar
24,221,336  ???:free
19,857,105  ???:<rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::read_immediate_raw
19,683,306  ???:rustc_const_eval::const_eval::slice_for_alloc
18,808,545  ???:<rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::write_immediate_no_validate
16,252,930  ???:<rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::allocate
15,352,449  ???:malloc
15,109,539  ???:<rustc_middle::ty::context::TyCtxt>::mk_const_alloc
14,941,453  ???:<rustc_const_eval::interpret::intern::InternVisitor<rustc_const_eval::const_eval::machine::CompileTimeInterpreter> as rustc_const_eval::interpret::visitor::ValueVisitor<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::visit_value
13,762,350  ???:rustc_const_eval::const_eval::eval_queries::op_to_const
12,668,915  ???:<rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::statement
12,121,936  ???:_rjem_je_tcache_bin_flush_small
11,646,331  ???:<rustc_middle::ty::context::CtxtInterners>::intern_ty
11,490,183  ???:<hashbrown::raw::RawTable<(rustc_middle::ty::context::InternedInSet<rustc_middle::mir::interpret::allocation::Allocation>, ())>>::reserve_rehash::<hashbrown::map::make_hasher<rustc_middle::ty::context::InternedInSet<rustc_middle::mir::interpret::allocation::Allocation>, (), core::hash::BuildHasherDefault<rustc_hash::FxHasher>>::{closure
10,616,661  ???:<rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::ref_to_mplace

Maybe removing this for mir constants isn't actually the right way forward. We could start mirroring LLVM and have a full on ConstValue::ScalarPair?

@RalfJung
Copy link
Member

Hm, I'm trying to understand/remember what this is actually all about. What is ConstValue::Slice used for? I doubt "consts that evaluate to a slice" can be very relevant; also those already have allocations created for them anyway (for the return place) so that can't be the perf hit I assume.

So is this about slice literals that appear in the source, and having an efficient representation for those? That should be valtrees though? coercions has tons of slice literals so I guess this is the right trace. But why is there a ton of mk_const_alloc calls in that benchmark?

@RalfJung RalfJung added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Sep 5, 2023
@oli-obk
Copy link
Contributor Author

oli-obk commented Sep 14, 2023

closing in favor of #115764 preserving the AllocId

@oli-obk oli-obk closed this Sep 18, 2023
@apiraino apiraino removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Oct 12, 2023
RalfJung pushed a commit to RalfJung/rust-analyzer that referenced this pull request Apr 20, 2024
Optimize scalar and scalar pair representations loaded from ByRef in llvm

in rust-lang/rust#105653 I noticed that we were generating suboptimal LLVM IR if we had a `ConstValue::ByRef` that could be represented by a `ScalarPair`. Before rust-lang/rust#105653 this is probably rare, but after it, every slice will go down this suboptimal code path that requires LLVM to untangle a bunch of indirections and translate static allocations that are only used once to read a scalar pair from.
RalfJung pushed a commit to RalfJung/rust-analyzer that referenced this pull request Apr 27, 2024
Optimize scalar and scalar pair representations loaded from ByRef in llvm

in rust-lang/rust#105653 I noticed that we were generating suboptimal LLVM IR if we had a `ConstValue::ByRef` that could be represented by a `ScalarPair`. Before rust-lang/rust#105653 this is probably rare, but after it, every slice will go down this suboptimal code path that requires LLVM to untangle a bunch of indirections and translate static allocations that are only used once to read a scalar pair from.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
perf-regression Performance regression. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

slice::from_raw_parts returns a different address in const context for u8
8 participants