define <vscale x 2 x i64> @rhadds_v2i64(<vscale x 2 x i64> %s0, <vscale x 2 x i64> %s1) {
; SVE-LABEL: rhadds_v2i64:
; SVE: // %bb.0: // %entry
; SVE-NEXT: eor z2.d, z0.d, z1.d
; SVE-NEXT: orr z0.d, z0.d, z1.d
; SVE-NEXT: asr z1.d, z2.d, #1
; SVE-NEXT: sub z0.d, z0.d, z1.d
; SVE-NEXT: ret
;
; SVE2-LABEL: rhadds_v2i64:
; SVE2: // %bb.0: // %entry
; SVE2-NEXT: ptrue p0.d
; SVE2-NEXT: srhadd z0.d, p0/m, z0.d, z1.d
; SVE2-NEXT: ret
  %s0s = sext <vscale x 2 x i64> %s0 to <vscale x 2 x i128>
  %s1s = sext <vscale x 2 x i64> %s1 to <vscale x 2 x i128>
  %add = add <vscale x 2 x i128> %s0s, splat (i128 1)
  %add2 = add <vscale x 2 x i128> %add, %s1s
  %s = ashr <vscale x 2 x i128> %add2, splat (i128 1)
  %result = trunc <vscale x 2 x i128> %s to <vscale x 2 x i64>
  ret <vscale x 2 x i64> %result
}
folded to:
define <vscale x 2 x i64> @rhadds_v2i64(<vscale x 2 x i64> %s0, <vscale x 2 x i64> %s1) {
  %s0s = sext <vscale x 2 x i64> %s0 to <vscale x 2 x i128>
  %s1s = sext <vscale x 2 x i64> %s1 to <vscale x 2 x i128>
  %not = xor <vscale x 2 x i128> %s0s, splat (i128 -1)
  %sub = sub <vscale x 2 x i128> %s1s, %not
  %s = ashr <vscale x 2 x i128> %sub, splat (i128 1)
  %result = trunc <vscale x 2 x i128> %s to <vscale x 2 x i64>
  ret <vscale x 2 x i64> %result
}
Targets such as aarch64 will fold `add (add x, y), 1` to `sub y, (xor x, -1)`, preventing the combineShiftToAVG AVGCEIL matching code from identifying the new pattern. This is a particular problem for the SVE/SVE2 tests as the above code will no longer compile.
Noticed while investigating a regression for DAG topological sorting, which will fold the add to `sub y, (xor x, -1)` earlier than usual.
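
For reference, the two IR forms above compute the same value, since `xor x, -1` equals `-x - 1` in two's complement and so `y - (xor x, -1)` equals `x + y + 1`. The following is a minimal Python sketch of that arithmetic only, not of the DAG combine itself. The helper names (`wrap_signed`, `avgceil_add_form`, `avgceil_sub_form`, `avgceil_narrow_form`) are hypothetical and do not correspond to anything in LLVM. The third helper mirrors the `eor`/`orr`/`asr`/`sub` sequence from the SVE check lines.

```python
def wrap_signed(value, bits):
    """Wrap an integer to a two's-complement signed value of the given width."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def avgceil_add_form(x, y, bits=64):
    """Original pattern: trunc(ashr(add(add(sext x, 1), sext y), 1))."""
    wide = 2 * bits
    total = wrap_signed(wrap_signed(x + 1, wide) + y, wide)
    return wrap_signed(total >> 1, bits)  # Python's >> is an arithmetic shift


def avgceil_sub_form(x, y, bits=64):
    """Folded pattern: trunc(ashr(sub(sext y, xor(sext x, -1)), 1))."""
    wide = 2 * bits
    not_x = wrap_signed(x ^ -1, wide)  # xor with all-ones is bitwise NOT
    diff = wrap_signed(y - not_x, wide)
    return wrap_signed(diff >> 1, bits)


def avgceil_narrow_form(x, y, bits=64):
    """Same-width expansion seen in the SVE output: (x | y) - ((x ^ y) >> 1)."""
    return wrap_signed((x | y) - ((x ^ y) >> 1), bits)


# Spot-check a few i64 edge cases; all three forms should agree.
samples = [0, 1, -1, 2**63 - 1, -(2**63), 12345, -98765]
for a in samples:
    for b in samples:
        expected = avgceil_add_form(a, b)
        assert avgceil_sub_form(a, b) == expected
        assert avgceil_narrow_form(a, b) == expected
```

Because the forms are arithmetically interchangeable, a matcher that only looks for the `add`-based shape would miss the `sub`/`xor` shape even though it denotes the same rounding average.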