Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid picture primitive copies via VecHelper #3362

Merged
merged 1 commit into from
Nov 28, 2018

Conversation

kvark
Copy link
Member

@kvark kvark commented Nov 27, 2018

This is a successor of #3360 that avoids the borrow checker dance via RAII
Addresses part of #3358


This change is Reviewable

// Matches the definition of SK_ScalarNearlyZero in Skia.
const NEARLY_ZERO: f32 = 1.0 / 4096.0;

/// A typesafe helper that separates new value construction from
/// vector growing, which allows LLVM to elide the value copy.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"and ideally construct the element in place"


impl<'a, T> Allocation<'a, T> {
#[inline(always)]
pub fn init(self, value: T) -> usize {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add:
// writing is safe because alloc() ensured enough capacity and Allocation holds a mutable borrow to prevent anyone else from breaking this invariant.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure this actually avoids the memcpy / memmove? At least for self arguments a bit back we couldn't work around it like this, see rust-lang/rust#42763.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed in playground. In fn foo() changing the push(xxx) to alloc().init(xxx) removes the memcpy.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I filed rust-lang/rust#56333 for a problem contributing to the first example not working well in the playground.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like SmallVec also causes this to not work. I filed rust-lang/rust#56356 about that.

@@ -338,7 +338,7 @@ impl FrameBuilder {
let mut profile_counters = FrameProfileCounters::new();
profile_counters
.total_primitives
.set(self.prim_store.prim_count);
.set(self.prim_store.prim_count());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be nice if this only ran if profiling was on.

@kvark kvark changed the title Avoid picture primitive copies via VecHelper [WIP] Avoid picture primitive copies via VecHelper Nov 28, 2018
@kvark
Copy link
Member Author

kvark commented Nov 28, 2018

I'm not able to use @jrmuizel 's tool that find memcopies, since IR validation fails on webrender crate. However, I tried to reproduce this in playground, and in both cases (vec.push(x) and vec.alloc().init(x)) the compiler created the data on the stack first and then issued a single memcpy, so it doesn't appear to be fixing the copy.

Is my playground experiment incorrect? Otherwise, what assumptions are wrong?

@bors-servo
Copy link
Contributor

☔ The latest upstream changes (presumably #3359) made this pull request unmergeable. Please resolve the merge conflicts.

@jrmuizel
Copy link
Collaborator

jrmuizel commented Nov 28, 2018 via email

@kvark
Copy link
Member Author

kvark commented Nov 28, 2018

@jrmuizel indeed, I confirmed it with the playground experiment.
Therefore, we should proceed with this PR.

@kvark kvark changed the title [WIP] Avoid picture primitive copies via VecHelper Avoid picture primitive copies via VecHelper Nov 28, 2018
@kvark
Copy link
Member Author

kvark commented Nov 28, 2018

@bors-servo r=jrmuizel

@bors-servo
Copy link
Contributor

📌 Commit d680e32 has been approved by jrmuizel

@bors-servo
Copy link
Contributor

⌛ Testing commit d680e32 with merge be65949...

bors-servo pushed a commit that referenced this pull request Nov 28, 2018
Avoid picture primitive copies via VecHelper

This is a successor of #3360 that avoids the borrow checker dance via RAII
Addresses part of #3358

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/webrender/3362)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

☀️ Test successful - status-appveyor, status-taskcluster
Approved by: jrmuizel
Pushing be65949 to master...

@bors-servo bors-servo merged commit d680e32 into servo:master Nov 28, 2018
@kvark kvark deleted the picture-copies2 branch November 28, 2018 17:31
@jrmuizel
Copy link
Collaborator

It looks like this may not have worked:

memcpy of 1592 in _ZN9webrender22display_list_flattener20DisplayListFlattener20pop_stacking_context17hc7e8ce6ec929e9eaE @
  write<webrender::picture::PicturePrimitive> /rustc/6f93e93af6f823948cc13d2938957757c6486d88/src/libcore/ptr.rs:739
  init<webrender::picture::PicturePrimitive> webrender/src/util.rs:33
  pop_stacking_context webrender/src/display_list_flattener.rs:1220
memcpy of 1592 in _ZN9webrender22display_list_flattener20DisplayListFlattener20pop_stacking_context17hc7e8ce6ec929e9eaE @
  write<webrender::picture::PicturePrimitive> /rustc/6f93e93af6f823948cc13d2938957757c6486d88/src/libcore/ptr.rs:739
  init<webrender::picture::PicturePrimitive> webrender/src/util.rs:33
  pop_stacking_context webrender/src/display_list_flattener.rs:1187
memcpy of 1592 in _ZN9webrender22display_list_flattener20DisplayListFlattener21push_stacking_context17h500d71065754d89aE @
  write<webrender::picture::PicturePrimitive> /rustc/6f93e93af6f823948cc13d2938957757c6486d88/src/libcore/ptr.rs:739
  init<webrender::picture::PicturePrimitive> webrender/src/util.rs:33
  cut_flat_item_sequence webrender/src/display_list_flattener.rs:2207
  push_stacking_context webrender/src/display_list_flattener.rs:994
memcpy of 1592 in _ZN9webrender22display_list_flattener20DisplayListFlattener20pop_stacking_context17hc7e8ce6ec929e9eaE @
  write<webrender::picture::PicturePrimitive> /rustc/6f93e93af6f823948cc13d2938957757c6486d88/src/libcore/ptr.rs:739
  init<webrender::picture::PicturePrimitive> webrender/src/util.rs:33
  pop_stacking_context webrender/src/display_list_flattener.rs:1158
memcpy of 1592 in _ZN9webrender22display_list_flattener20DisplayListFlattener12flatten_item17hfd2b2318f8c75547E @
  write<webrender::picture::PicturePrimitive> /rustc/6f93e93af6f823948cc13d2938957757c6486d88/src/libcore/ptr.rs:739
  init<webrender::picture::PicturePrimitive> webrender/src/util.rs:33
  pop_all_shadows webrender/src/display_list_flattener.rs:1564
  flatten_item webrender/src/display_list_flattener.rs:789
memcpy of 1592 in _ZN9webrender22display_list_flattener20DisplayListFlattener20pop_stacking_context17hc7e8ce6ec929e9eaE @
  write<webrender::picture::PicturePrimitive> /rustc/6f93e93af6f823948cc13d2938957757c6486d88/src/libcore/ptr.rs:739
  init<webrender::picture::PicturePrimitive> webrender/src/util.rs:33
  pop_stacking_context webrender/src/display_list_flattener.rs:1117

I wonder if llvm gets sad the enum variant is too big.

@jrmuizel
Copy link
Collaborator

It looks like SmallVec is contributing to the sadness:

#![crate_type = "lib"]

extern crate smallvec;

use smallvec::SmallVec;

#[derive(Default)]
pub struct L {
    a: SmallVec<[f64; 16]>,
    b: SmallVec<[f64; 16]>,
    c: SmallVec<[f64; 16]>
}

pub struct Allocation<T> {
    f: *mut T,
}
use std::ptr;

impl<T> Allocation<T> {
    pub fn init(self, value: T) {
        unsafe { ptr::write(self.f, value) };
    }
}

#[inline(never)]
pub fn bar(a: Allocation<L>) {
    a.init(L{a: SmallVec::new(), b: SmallVec::new(), c: SmallVec::new()});
}

compiles to

playground::bar:
	pushq	%rbx
	subq	$720, %rsp
	movq	%rdi, %rbx
	xorps	%xmm0, %xmm0
	movaps	%xmm0, 288(%rsp)
	movaps	%xmm0, (%rsp)
	movaps	%xmm0, 144(%rsp)
	leaq	432(%rsp), %rdi
	movq	%rsp, %rsi
	movl	$144, %edx
	callq	memcpy@PLT
	leaq	576(%rsp), %rdi
	leaq	144(%rsp), %rsi
	movl	$144, %edx
	callq	memcpy@PLT
	leaq	288(%rsp), %rsi
	movl	$432, %edx
	movq	%rbx, %rdi
	callq	memcpy@PLT
	addq	$720, %rsp
	popq	%rbx
	retq

moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this pull request Nov 29, 2018
…4c9170a70ef7 (WR PR #3362). r=kats

servo/webrender#3362

Differential Revision: https://phabricator.services.mozilla.com/D13355

--HG--
extra : moz-landing-system : lando
jankeromnes pushed a commit to jankeromnes/gecko that referenced this pull request Nov 29, 2018
bors-servo pushed a commit that referenced this pull request Nov 30, 2018
Reduce data copies during internation

Excessive copying in `fn intern()` is something we noticed yesterday with @jrmuizel .
This PR attempts to improve them in two ways (each in a separate commit):
  1. reduce the `Update` enum size by moving the data out
  2. adopt entry-like API to accommodate the common pattern of `if index == v.len() { v.push(xxx) } else { v[index] = xxx; }` without panics between element construction and actual assignment. It builds upon the `VecHelper` work of #3362.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/webrender/3366)
<!-- Reviewable:end -->
@jrmuizel
Copy link
Collaborator

jrmuizel commented Feb 2, 2019

The smallvec example now generates better code with Nightly:

playground::bar: # @playground::bar
# %bb.0:
	subq	$440, %rsp              # imm = 0x1B8
	xorps	%xmm0, %xmm0
	movaps	%xmm0, (%rsp)
	movaps	%xmm0, 144(%rsp)
	movaps	%xmm0, 288(%rsp)
	movq	%rsp, %rsi
	movl	$432, %edx              # imm = 0x1B0
	callq	*memcpy@GOTPCREL(%rip)
	addq	$440, %rsp              # imm = 0x1B8
	retq

However it looks like there's still an extra memcpy

@jrmuizel
Copy link
Collaborator

jrmuizel commented Feb 2, 2019

I filed a follow up at rust-lang/rust#58082

gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified that referenced this pull request Oct 3, 2019
…4c9170a70ef7 (WR PR #3362). r=kats

servo/webrender#3362

Differential Revision: https://phabricator.services.mozilla.com/D13355

UltraBlame original commit: 54d56b9a77fb9d41eff3ce1662ba5fff538365d2
gecko-dev-updater pushed a commit to marco-c/gecko-dev-comments-removed that referenced this pull request Oct 3, 2019
…4c9170a70ef7 (WR PR #3362). r=kats

servo/webrender#3362

Differential Revision: https://phabricator.services.mozilla.com/D13355

UltraBlame original commit: 54d56b9a77fb9d41eff3ce1662ba5fff538365d2
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified-and-comments-removed that referenced this pull request Oct 3, 2019
…4c9170a70ef7 (WR PR #3362). r=kats

servo/webrender#3362

Differential Revision: https://phabricator.services.mozilla.com/D13355

UltraBlame original commit: 54d56b9a77fb9d41eff3ce1662ba5fff538365d2
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants