[1.; 10] generates worse code than [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.] #56333

jrmuizel · 2018-11-29T03:06:04Z

Here's an example

pub struct L {
    a: [f64; 10],
}

pub struct Allocation<'a, T: 'a> {
    f: &'a mut T,
}

impl<'a, T> Allocation<'a, T> {
    pub fn init(self, value: T) {
        *self.f = value;
    }
}

#[inline(never)]
pub fn foo(a: Allocation<L>) {
    a.init(L {
        a: [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
    });
}

#[inline(never)]
pub fn bar(a: Allocation<L>) {
    a.init(L { a: [1.; 10] });
}

gives

.LCPI0_0:
  .quad 4607182418800017408
  .quad 4607182418800017408
example::foo:
  movaps xmm0, xmmword ptr [rip + .LCPI0_0]
  movups xmmword ptr [rdi], xmm0
  movups xmmword ptr [rdi + 16], xmm0
  movups xmmword ptr [rdi + 32], xmm0
  movups xmmword ptr [rdi + 48], xmm0
  movups xmmword ptr [rdi + 64], xmm0
  ret

.LCPI1_0:
  .quad 4607182418800017408
  .quad 4607182418800017408
example::bar:
  sub rsp, 88
  movaps xmm0, xmmword ptr [rip + .LCPI1_0]
  movaps xmmword ptr [rsp], xmm0
  movaps xmmword ptr [rsp + 16], xmm0
  movaps xmmword ptr [rsp + 32], xmm0
  movaps xmmword ptr [rsp + 48], xmm0
  movaps xmmword ptr [rsp + 64], xmm0
  movaps xmm0, xmmword ptr [rsp + 64]
  movups xmmword ptr [rdi + 64], xmm0
  movaps xmm0, xmmword ptr [rsp + 48]
  movups xmmword ptr [rdi + 48], xmm0
  movaps xmm0, xmmword ptr [rsp + 32]
  movups xmmword ptr [rdi + 32], xmm0
  movaps xmm0, xmmword ptr [rsp + 16]
  movups xmmword ptr [rdi + 16], xmm0
  movaps xmm0, xmmword ptr [rsp]
  movups xmmword ptr [rdi], xmm0
  add rsp, 88
  ret

which has an additional copy of the array.

nagisa · 2018-12-06T14:53:14Z

We generate an explicit loop for all in-line repeat initializers, which is why the codegen is usually worse than for plain literals.
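A rough way to picture the difference, as a hand-written sketch rather than actual compiler output: the literal form can be stored straight into the destination, while the repeat form behaves roughly like a loop that fills a temporary, which is then copied out. The function names `fill_literal` and `fill_repeat_as_loop` are hypothetical and exist only for this illustration. Whether the temporary survives optimization depends on the compiler and LLVM version.

```rust
use std::mem::MaybeUninit;

/// Literal form: every element is spelled out in the source.
pub fn fill_literal(dst: &mut [f64; 10]) {
    *dst = [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.];
}

/// Hand-written approximation of `[1.; 10]` lowered to an explicit loop.
/// This is an assumption about the general shape, not real rustc output.
pub fn fill_repeat_as_loop(dst: &mut [f64; 10]) {
    // Temporary array that the loop fills element by element.
    let mut tmp: MaybeUninit<[f64; 10]> = MaybeUninit::uninit();
    let base = tmp.as_mut_ptr() as *mut f64;
    for i in 0..10 {
        // SAFETY: i < 10, so the write stays inside `tmp`.
        unsafe { base.add(i).write(1.0) };
    }
    // SAFETY: all 10 elements were initialized by the loop above.
    // This assignment is the extra copy seen in `bar`, unless the
    // optimizer manages to remove the temporary.
    *dst = unsafe { tmp.assume_init() };
}
```

Both functions leave the same values in `dst`; only the path the data takes differs.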

jrmuizel · 2018-12-13T19:37:12Z

I filed an LLVM bug about this: https://bugs.llvm.org/show_bug.cgi?id=40011

ebkalderon · 2019-10-13T21:32:04Z

Looking at the comment responding to that bug, it seems that it might be an LLVM pass ordering issue. Just curious, but is this something that we can fix or work around on our own fork of LLVM? Or is this a more general issue that will need to be resolved upstream?

nagisa · 2019-10-13T21:44:25Z

We prefer to have as few differences between our fork and upstream as possible, and the differences we do have must pull their weight to warrant backporting them every time we bump LLVM.

steveklabnik · 2020-08-26T19:27:32Z

Triage: no change

jrmuizel · 2021-03-13T20:00:06Z

This appears to be fixed by #81451.

jrmuizel mentioned this issue Nov 29, 2018

Avoid picture primitive copies via VecHelper servo/webrender#3362

Merged

oli-obk added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. A-codegen Area: Code generation labels Nov 29, 2018

nikic added the I-slow Issue: Problems and improvements with respect to performance of generated code. label Dec 13, 2018

nikic mentioned this issue Dec 16, 2018

Unnecessary memcpy when using array initialization shorthand #56882

Closed

sbechet mentioned this issue Apr 22, 2020

not enough memory? or How to create Resource object directly in heap w/o using Stack abelykh0/stm32f103-vga-rs#1

Closed

jrmuizel closed this as completed Mar 13, 2021
