
empty_permute decomposition #2698

Merged
merged 1 commit into main on Apr 17, 2024

Conversation

@apbose (Collaborator) commented Mar 19, 2024

This adds a decomposition for empty_permuted, as an extension to support aten::empty_like.
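
For context, a minimal sketch of what the decomposition could look like, assuming empty_permuted receives the tensor size as args[0] and the physical layout as args[1] (as discussed in the review below); the actual registration mechanism in the repository may differ:

import torch

# Sketch: decompose aten.empty_permuted into aten.empty + aten.permute.
# args[0] is the logical size, args[1] the physical layout permutation.
def empty_permuted_decomposition(*args, **kwargs) -> torch.Tensor:
    empty_size = args[0]
    empty_permute = args[1]
    # Build the inverse permutation: physical dimension i maps back to
    # logical dimension empty_permute[i].
    perm = [0] * len(empty_size)
    for permute_index, permute_element in enumerate(empty_permute):
        perm[permute_element] = permute_index
    return torch.empty(
        [empty_size[i] for i in empty_permute], **kwargs
    ).permute(perm)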

@github-actions bot added labels: component: tests, component: lowering, component: api [Python], component: dynamo on Mar 19, 2024
@github-actions bot requested a review from gs-olive on March 19, 2024 21:02
@apbose force-pushed the empty_permuted_decomposition branch from dcfe61d to 6abe7ce on April 5, 2024 00:15
Comment on lines +443 to +450
fx_graph = torch.fx.symbolic_trace(emptyLike())
unexpected_ops_seen, expected_ops_unseen = lower_graph_testing(
    fx_graph,
    inputs,
    expected_ops=expected_ops,
    unexpected_ops=unexpected_ops,
    min_block_size=1,
)
@gs-olive (Collaborator) commented:
Could you show a printout of what the original and final graphs look like in this case? I want to verify that there is not a circular issue where empty_permuted generates empty_like, and vice versa.

@apbose (Collaborator, Author) commented Apr 12, 2024:
With the empty_permuted decomposition, the graph is the following.

Pre-AOT Autograd graph:

graph():
   %l_x_ : torch.Tensor [num_users=1] = placeholder[target=L_x_]
   %add : [num_users=2] = call_function[target=torch.ops.aten.add](args = (%l_x_, %l_x_), kwargs = {})
   %empty_like_default : [num_users=1] = call_function[target=torch.ops.aten.empty_like.default](args = (%add,), kwargs = {})
   %add_1 : [num_users=1] = call_function[target=operator.add](args = (%empty_like_default, %add), kwargs = {})
   return (add_1,)

Post-AOT Autograd graph:

graph():
   %arg0_1 : [num_users=1] = placeholder[target=arg0_1]
   %clone : [num_users=1] = call_function[target=torch.ops.aten.clone.default](args = (%arg0_1,), kwargs = {})
   %add : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%clone, %clone), kwargs = {})
   %empty : [num_users=1] = call_function[target=torch.ops.aten.empty.memory_format](args = ([3, 2],), kwargs = {dtype: torch.float32, layout: torch.strided, device: cuda:0, pin_memory: False})
   %permute : [num_users=1] = call_function[target=torch.ops.aten.permute.default](args = (%empty, [0, 1]), kwargs = {})
   %add_1 : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%permute, %add), kwargs = {})
   return (add_1,)

Graph after constant folding:

graph():
   %arg0_1 : [num_users=1] = placeholder[target=arg0_1]
   %add : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%arg0_1, %arg0_1), kwargs = {})
   %_frozen_param0 : [num_users=1] = get_attr[target=_frozen_param0]
   %add_1 : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%_frozen_param0, %add), kwargs = {})
   return (add_1,)

Post-lowering passes Autograd graph:

graph():
   %arg0_1 : [num_users=1] = placeholder[target=arg0_1]
   %add : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%arg0_1, %arg0_1), kwargs = {})
   %_frozen_param0 : [num_users=1] = get_attr[target=_frozen_param0]
   %add_1 : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%_frozen_param0, %add), kwargs = {})
   return (add_1,)

Without the decomposition, the graph is
Pre-AOT Autograd graph:

graph():
    %l_x_ : torch.Tensor [num_users=1] = placeholder[target=L_x_]
    %add : [num_users=2] = call_function[target=torch.ops.aten.add](args = (%l_x_, %l_x_), kwargs = {})
    %empty_like_default : [num_users=1] = call_function[target=torch.ops.aten.empty_like.default](args = (%add,), kwargs = {})
    %add_1 : [num_users=1] = call_function[target=operator.add](args = (%empty_like_default, %add), kwargs = {})
    return (add_1,)

Post-AOT Autograd graph:

graph():
    %arg0_1 : [num_users=1] = placeholder[target=arg0_1]
    %clone : [num_users=1] = call_function[target=torch.ops.aten.clone.default](args = (%arg0_1,), kwargs = {})
    %add : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%clone, %clone), kwargs = {})
    %empty_permuted : [num_users=1] = call_function[target=torch.ops.aten.empty_permuted.default](args = ([3, 2], [0, 1]), kwargs = {dtype: torch.float32, layout: torch.strided, device: cuda:0, pin_memory: False})
    %add_1 : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%empty_permuted, %add), kwargs = {})
    return (add_1,)

Graph after constant folding:

graph():
    %arg0_1 : [num_users=1] = placeholder[target=arg0_1]
    %add : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%arg0_1, %arg0_1), kwargs = {})
    %_frozen_param0 : [num_users=1] = get_attr[target=_frozen_param0]
    %add_1 : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%_frozen_param0, %add), kwargs = {})
    return (add_1,)

Post-lowering passes Autograd graph:

graph():
    %arg0_1 : [num_users=1] = placeholder[target=arg0_1]
    %add : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%arg0_1, %arg0_1), kwargs = {})
    %_frozen_param0 : [num_users=1] = get_attr[target=_frozen_param0]
    %add_1 : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%_frozen_param0, %add), kwargs = {})
    return (add_1,)

So empty_like decomposes into empty_permuted, which in turn decomposes into empty.memory_format. The test above does not raise an error even though empty.memory_format is unsupported, because constant folding removes the op.

I am working on empty.memory_format support in PR #2745.
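
(For reference, a module shaped roughly like the following would produce the Pre-AOT graph shown above; this is an illustrative reconstruction, and the actual emptyLike test class in the PR may differ.)

import torch

# Hypothetical reconstruction of the traced test module.
class emptyLike(torch.nn.Module):
    def forward(self, x):
        c = torch.add(x, x)                        # -> aten.add
        y = torch.ops.aten.empty_like.default(c)   # -> aten.empty_like.default
        return y + c                               # -> operator.add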

@gs-olive (Collaborator) commented:

In the above example, the Pre-AOT graph shows:

   %empty_like_default : [num_users=1] = call_function[target=torch.ops.aten.empty_like.default](args = (%add,), kwargs = {})

Since there is only one argument in args, what is empty_permute = args[1] defined as in the decomposition for that case?

@apbose (Collaborator, Author) commented Apr 16, 2024:

In the above case, with the AOT decomposition, the operation decomposes to

 %empty_permuted : [num_users=1] = call_function[target=torch.ops.aten.empty_permuted.default](args = ([3, 2], [0, 1]), kwargs = {dtype: torch.float32, layout: torch.strided, device: cuda:0, pin_memory: False})

The args[1] in this case is [0, 1], since it keeps the shape in its original order.
I am not sure how it arrives at exactly [0, 1], but I assume it comes from the internal AOT lowering heuristics.
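
(As a quick, hedged illustration of the identity layout: for a contiguous input, empty_permuted receives the identity order as its physical layout, which is why args[1] is [0, 1] in this 2-D case.)

import torch

# For a contiguous 2-D tensor, the physical layout handed to
# empty_permuted is the identity permutation [0, 1].
y = torch.ops.aten.empty_permuted.default([3, 2], [0, 1], dtype=torch.float32)
assert y.shape == torch.Size([3, 2])
assert y.is_contiguous()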

@gs-olive (Collaborator) left a review:

Overall looks good to me - added one clarifying question


@apbose merged commit 0b29987 into main on Apr 17, 2024
16 of 21 checks passed
peri044 pushed a commit that referenced this pull request Apr 19, 2024
laikhtewari pushed a commit that referenced this pull request May 24, 2024