Make type checking true read-only visitor #4829

asl · 2024-07-23T21:46:09Z

TypeChecking / TypeInference is the pass that is executed all the time during the compilation:

After each program transformation to update the type map
As nested pass on subtrees during MethodInstance evaluation
In some other cases, e.g. during static asserts evaluations

On one downstream program I counted more than 70 top-level (on the P4Program node) TypeInference invocations. Nested invocations are far more numerous. Only in a few cases (~2 out of these 70) does it perform any transformation, e.g. inserting casts.

The real issue is that TypeInference is a Transform: so it does cloning of the IR even in so-called "read-only mode" and nothing really changes.

This PR restructures the code so that the base functionality can be called from both an Inspector and a Transform. The only downside is that for Transform case we might have double-cloning, but this seems to cause negligible impact.

As a result, the functionality is split into:

ReadOnlyTypeInference which is an Inspector. It is used for TypeCheck, inside MethodInstance::resolve and read-write TypeInference learner
TypeInference that is a Transform. For debugging purposes I left the readOnly interface here as the cost to support it is essentially free.
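
The split described above can be sketched roughly as follows. This is a toy model, not p4c code: `Expr`, `InferenceBase`, `checkAssignSource`, `ReadOnlyInference` and `RewritingInference` are hypothetical names. The only idea taken from the PR is that the shared logic accepts const nodes and returns either the same node or a newly created one, so it can sit behind both a read-only and a rewriting front end.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Toy IR node (hypothetical; stands in for an IR::Node subclass).
struct Expr {
    std::string text;
    int width;  // bit width
};
using ExprPtr = std::shared_ptr<const Expr>;

// Shared logic: sees only const nodes and never mutates them.
class InferenceBase {
 public:
    std::map<const Expr *, int> typeMap;  // side table filled while checking

 protected:
    // Returns `src` itself when no change is needed, otherwise a new node
    // that wraps it in a cast to the destination width.
    ExprPtr checkAssignSource(const ExprPtr &src, int dstWidth) {
        typeMap[src.get()] = src->width;
        if (src->width == dstWidth) return src;
        return std::make_shared<Expr>(
            Expr{"(bit<" + std::to_string(dstWidth) + ">)" + src->text, dstWidth});
    }
};

// Read-only front end: reports whether the tree is fine as it stands.
class ReadOnlyInference : public InferenceBase {
 public:
    bool check(const ExprPtr &src, int dstWidth) {
        return checkAssignSource(src, dstWidth) == src;
    }
};

// Rewriting front end: hands the (possibly new) node back to the caller.
class RewritingInference : public InferenceBase {
 public:
    ExprPtr apply(const ExprPtr &src, int dstWidth) {
        return checkAssignSource(src, dstWidth);
    }
};
```

In this sketch the read-only front end allocates nothing when the widths already match, which mirrors the common case described above where no casts are inserted.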

Overall we are saving:

no clone for each node during type checking and when TypeInference is used in a read-only manner
as a result, much less malloc traffic
saves on internal Transform structures and much less complicated logic

Fixes #4815

asl · 2024-07-23T21:48:30Z

Removing these overheads seems to yield a 5-10% improvement in runtime (I have not measured memory pressure, but it is clear it would be lower):

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `test/gtestp4c-main --gtest_filter=P4CParserUnroll.switch_20160512` | 4.860 ± 0.115 | 4.653 | 5.064 | 1.07 ± 0.04 |
| `test/gtestp4c --gtest_filter=P4CParserUnroll.switch_20160512` | 4.534 ± 0.108 | 4.291 | 4.667 | 1.00 |

And similar times for downstream projects:

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `main` | 58.350 ± 1.484 | 57.078 | 61.983 | 1.07 ± 0.03 |
| `this PR` | 54.343 ± 0.723 | 53.538 | 56.008 | 1.00 |

asl · 2024-07-23T21:48:59Z

Certainly the better solution would be making Transform do lazy cloning, but it is a terribly invasive change affecting everything :)

asl · 2024-07-23T22:46:00Z

I have not yet checked why, but it seems we used to emit errors twice. E.g. testdata/p4_16_errors_outputs/issue3727.p4-stderr contains:

issue3727.p4(7): [--Werror=type-error] error: f2(1) is not invoking an action
    actions = {f2(1);}
               ^^^^
issue3727.p4(7): [--Werror=type-error] error: f2(1) is not invoking an action
    actions = {f2(1);}
               ^^^^
[--Werror=type-error] error: Error while analyzing t

now we produce:

p4c/testdata/p4_16_errors/issue3727.p4(7): [--Werror=type-error] error: f2(1) is not invoking an action
    actions = {f2(1);}
               ^^^^

asl · 2024-07-24T00:48:30Z

> I have not yet checked why, but it seems we used to emit errors twice. E.g. testdata/p4_16_errors_outputs/issue3727.p4-stderr contains:

Ok, looks like there was a subtle bug within error reporter that is now simply not being triggered. The example code in question is:

        if (getContext()->node->is<IR::ActionListElement>()) {
            typeError("%1% is not invoking an action", expression);
            return expression;
        }

Previously expression was IR::MethodCallExpression *; now it is const IR::MethodCallExpression *. Eventually the typeError call ends up in diagnose in error_reporter.h, which has the following set of overloads:

    template <class T, typename = std::enable_if_t<Util::has_SourceInfo_v<T>>, typename... Args>
    void diagnose(DiagnosticAction action, const int errorCode, const char *format,
                  const char *suffix, const T *node, Args &&...args) {
       ...
    }

    template <class T, typename = std::enable_if_t<Util::has_SourceInfo_v<T>>, typename... Args>
    void diagnose(DiagnosticAction action, const int errorCode, const char *format,
                  const char *suffix, const T &node, Args &&...args) {
     ...
    }

    template <typename... Args>
    void diagnose(DiagnosticAction action, const char *diagnosticName, const char *format,
                  const char *suffix, Args &&...args) {
     ...
    }

When we pass a non-const Node* pointer to this set of overloads, the third one is selected: of the first two, the second is chosen, essentially accepting the Node* by const reference, but it is then SFINAE-disabled because the has_SourceInfo check cannot succeed with T = Node*. As a result, all the checks for duplicate emission, which are based on SourceInfo, are bypassed.

So, in this PR we started to pass const IR::MethodCallExpression *, and therefore a different overload is selected; as a result, no duplicate diagnostic is emitted.
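
The overload behaviour can be reproduced in isolation. The following is a reduced sketch, not the real error_reporter.h: `Node`, `has_source_info_v` and the two-argument `diagnose` signatures are simplified stand-ins, and each overload just returns a label so the selection is visible. With a non-const pointer, the forwarding-reference overload is an identity match while the `const T *` overload needs a qualification conversion, so the generic one wins.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Stand-in for an IR node carrying source info (hypothetical).
struct Node {
    int line = 0;
};

// Simplified replacement for Util::has_SourceInfo_v.
template <class T>
inline constexpr bool has_source_info_v = std::is_base_of_v<Node, T>;

// (1) node passed by pointer
template <class T, typename = std::enable_if_t<has_source_info_v<T>>, typename... Args>
std::string diagnose(const char * /*format*/, const T * /*node*/, Args &&...) {
    return "pointer";
}

// (2) node passed by reference; removed by SFINAE when T is deduced as Node*
template <class T, typename = std::enable_if_t<has_source_info_v<T>>, typename... Args>
std::string diagnose(const char * /*format*/, const T & /*node*/, Args &&...) {
    return "reference";
}

// (3) generic fallback with no source-info handling
template <typename... Args>
std::string diagnose(const char * /*format*/, Args &&...) {
    return "generic";
}

// Helper showing one way to force overload (1) for a non-const pointer.
inline std::string diagnoseNode(const char *format, Node *node) {
    return diagnose(format, static_cast<const Node *>(node));
}
```

Casting to a pointer-to-const at the call site, as `diagnoseNode` does here, is one way to steer a non-const pointer to the source-info-aware overload in this reduced model.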

Frankly speaking, I do not understand why we have that accept-by-reference overload at all, as only nodes are expected to have source info and all nodes are heap-allocated objects. I think we should just remove it.

fruffy · 2024-07-24T13:30:21Z

> Certainly the better solution would be making Transform do lazy cloning, but it is a terribly invasive change affecting everything :)

Why not duplicate Transform, modify the duplicated version, then use it only for TypeInference? If that works we can also use the modified version in other places.

A further complication is that, by design, many IR objects have const pointers which requires a copy.

asl · 2024-07-24T15:33:04Z

> Why not duplicate Transform, modify the duplicated version, then use it only for TypeInference? If that works we can also use the modified version in other places.

Sorry, I do not follow. What do you mean by "duplicate"? Are you suggesting making a copy of 180k SLOC of implementation?

fruffy · 2024-07-24T16:51:31Z

> > Why not duplicate Transform, modify the duplicated version, then use it only for TypeInference? If that works we can also use the modified version in other places.
>
> Sorry, I do not follow. What do you mean by "duplicate"? Are you suggesting making a copy of 180k SLOC of implementation?

On the note of

> The real issue is that TypeInference is a Transform: so it does cloning of the IR even in so-called "read-only mode" and nothing really changes.

Can you implement another version of Transform that does lazy cloning which is then used by TypeInference?

Do you also need to change the IR implementation or many parts of TypeInference?

grg · 2024-07-24T16:55:46Z

> Can you implement another version of Transform that does lazy cloning which is then used by TypeInference?

LazyTransform! 🙂

asl · 2024-07-24T18:23:29Z

> Do you also need to change the IR implementation or many parts of TypeInference?

The IR is the same. A few places in TypeInference were changed so that they no longer modify nodes in-place, but rather return them as the result of preorder / postorder. See the commits in this PR for each of these cases; these changes were made prior to the refactoring.
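
As an illustration of "return a new node instead of modifying in-place", here is a minimal sketch in the spirit of the "Recreate Mux node on change" commit. `Expr`, `Mux` and this `postorder` signature are hypothetical and much simpler than the p4c visitor API; the point is only that the input stays const and a copy is made solely when a child actually changed.

```cpp
#include <cassert>
#include <memory>

// Toy expression node (hypothetical).
struct Expr {
    int width;
};
using ExprPtr = std::shared_ptr<const Expr>;

// Toy conditional expression: cond ? ifTrue : ifFalse.
struct Mux {
    ExprPtr cond, ifTrue, ifFalse;
};
using MuxPtr = std::shared_ptr<const Mux>;

// Receives the original (const) node plus the possibly replaced branches.
// Returns the original when nothing changed, otherwise a fresh node.
MuxPtr postorder(const MuxPtr &mux, const ExprPtr &newTrue, const ExprPtr &newFalse) {
    if (newTrue == mux->ifTrue && newFalse == mux->ifFalse) return mux;
    auto copy = std::make_shared<Mux>(*mux);  // the only clone, made on demand
    copy->ifTrue = newTrue;
    copy->ifFalse = newFalse;
    return copy;
}
```

Because the original is never written to, the same function body is usable from a read-only pass, which can simply compare the result with its input.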

> Can you implement another version of Transform that does lazy cloning which is then used by TypeInference?

Well, this would require some rethinking of how the IR is created / modified. Note that for lazy cloning we'd need to:

"Inform" the transform that a change was made
Propagate changes up to the context (might be tricky, especially in postorder)

This drastically changes the overall visitor concept: currently lots of things expect that everything is a "fresh clone" and that "there is also an original node somewhere". LazyTransform would change these semantics heavily. I have some thoughts on how this might be implemented, but it is a much larger and more complicated change than factoring out the generic functionality into a common base class and inheriting the implementation depending on the intent.

grg · 2024-07-24T17:16:39Z

frontends/p4/typeChecking/readOnlyTypeInference.cpp

+DEFINE_PREORDER(IR::EntriesList)
+DEFINE_PREORDER(IR::Type_SerEnum)
+
+#undef DEFINE_PREORDER


Duplicated on line 162

grg · 2024-07-24T19:15:24Z

frontends/p4/typeChecking/typeCheckDecl.cpp

+    /*
+      The type of the member is initially set in the Type_SerEnum preorder visitor.
+      Here we check additional constraints and we may correct the member.
+      if (done()) return member;
+    */


This seems inconsistent with other comment styles in the file -- should it be double-slash?

asl · 2024-07-24

Yes, it should be. However, this was the original comment; I just moved the function to a different file (see https://github.com/p4lang/p4c/pull/4829/files#diff-316382055471bab8faf77716e614a5508e70e44c56e52096f08ff5ebe6ee7e9dL355). Also note that the original code had the if (done()) return member shortcut commented out. I have not dug into why that was required.

grg · 2024-07-24T19:45:40Z

> The only downside is that for Transform case we might have double-cloning, but this seems to cause negligible impact.

Verifying that I correctly understand the reason for this:

To support the inspect read-only (Inspector) pass, the pointers that are passed to the TypeInferenceBase class are const IR::<Type> * instead of IR::<Type> *. In order to modify the object being pointed at, we need to clone it first.
Now when running the non-read-only (Transform) pass, Transform::apply_visitor clones each node before visiting, and TypeInferenceBase then creates a second clone.

To avoid the second copy, would either of these be a viable solution:

Have both const and non-const pointer versions of all TypeInferenceBase preorder/postorder functions that modify the object being passed in. Then ReadOnlyTypeInference can invoke the const pointer version and TypeInference can invoke the non-const pointer version. My big concern with this approach is that it would introduce lots of code duplication. (If it was possible to identify early if cloning is required, the const version could do an up-front clone and invoke the non-const version of each function.)
Add a boolean read_only field to TypeInferenceBase, and a "clone-or-unconst" function that clones the node when the read_only flag is true, and does a const_cast on the node if read-only is false to remove the constness. I avoid const_cast, but maybe this is one place where it is appropriate?
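
The second option could look roughly like the sketch below. It is purely illustrative: `Node`, `InferenceBase` and `cloneOrUnconst` are hypothetical names, not code from this PR, and ownership is left to the caller with raw pointers for brevity. Note the assumption that makes the `const_cast` branch legal: the object must not have been created as a const object, which would hold if the Transform had already produced a fresh non-const clone.

```cpp
#include <cassert>

// Toy node with a clone() method (hypothetical).
struct Node {
    int value = 0;
    Node *clone() const { return new Node(*this); }
};

class InferenceBase {
    bool readOnly;

 public:
    explicit InferenceBase(bool ro) : readOnly(ro) {}

    // Read-only mode: the input must stay untouched, so hand out a copy.
    // Read-write mode: the caller already owns a fresh clone, so just drop
    // the const qualifier (valid only if the object is not really const).
    template <class T>
    T *cloneOrUnconst(const T *node) const {
        return readOnly ? node->clone() : const_cast<T *>(node);
    }
};
```

The trade-off is the one raised in the comment: the second clone disappears, at the price of a `const_cast` whose safety depends on a convention the compiler cannot check.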

grg · 2024-07-24T19:54:28Z

I should add that since you said that the double-clone seems to have negligible impact, it's probably not worth attempting either of the suggestions in my previous comment.

But can you please extend the TypeInferenceBase comment in typeChecker.h to explain that a double-clone is occurring, in case someone is looking at this later on?

asl · 2024-07-24T20:19:21Z

> I should add that since you said that the double-clone seems to have negligible impact, it's probably not worth attempting either of the suggestions in my previous comment.

The reason why I said the impact is negligible is the following:

Normal Transform also does a double clone. The first clone is made before preorder / postorder (so they receive a pointer to a non-const Node); the second clone is made if preorder / postorder returns something that is not the node they received.
It turns out that when casts are inserted (non-read-only mode), only very few places (I count 5 or 6 node kinds) modified things in-place. All other cases created new nodes and returned them from preorder / postorder, resulting in double-cloning. Essentially, I made it so that no method changes things in-place. And these cases are rare (see the first commits in this PR for this).
A few places that manually visit their children in a specific order now do an explicit clone when necessary.

And yes, the additional double-cloning only happens in the very few TypeInference invocations where it inserts casts (~1 or 2 out of 70+ total).

Hope this makes the things a bit more clear :)

ChrisDodd · 2024-07-30T06:39:42Z

> Certainly the better solution would be making Transform do lazy cloning, but it is a terribly invasive change affecting everything :)

I've been trying to think of a way of doing this. One possibility would be to change the preorder/postorder methods to take and return a "smart" pointer type that holds both a const pointer to the original node and a non-const pointer to the clone, which would initially be nullptr. The first modification of the node would clone it. Template magic would be involved to make this all work cleanly.
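
A very small sketch of such a handle is shown below. It is an assumption-laden toy: `LazyPtr`, `read`, `write` and `modified` are invented names, the clone is owned by the handle through `std::unique_ptr`, and none of the hard parts mentioned here (integration with preorder/postorder and Visitor::Context) are addressed.

```cpp
#include <cassert>
#include <memory>

// Toy node (hypothetical).
struct Node {
    int value = 0;
};

// Copy-on-write handle: keeps the original const pointer and creates the
// clone only on the first write access.
template <class T>
class LazyPtr {
    const T *original;
    std::unique_ptr<T> clone;  // stays null until write() is called

 public:
    explicit LazyPtr(const T *node) : original(node) {}

    // Read access never clones; it sees the clone once one exists.
    const T *read() const { return clone ? clone.get() : original; }

    // Write access clones the original exactly once.
    T *write() {
        if (!clone) clone = std::make_unique<T>(*original);
        return clone.get();
    }

    // Lets a visitor ask whether anything needs to be propagated upwards.
    bool modified() const { return clone != nullptr; }
};
```

A visitor built on such a handle could use `modified()` to decide whether a parent needs to be rebuilt, which is the "inform the transform that a change was made" requirement raised earlier in the thread.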

For incremental improvement, the way to do this sort of thing is to start by writing a new Visitor subclass replacement for Transform (maybe called LazyTransform?) and then experiment with rewriting existing Transform subclasses to use it. A big complication is how it interacts with Visitor::Context -- we need the smart pointers in there too, but perhaps that could be managed by careful updating.

ChrisDodd · 2024-07-30T06:47:12Z

> Frankly speaking, I do not understand why we have that accept-by-reference overload at all, as only nodes are expected to have source info and all nodes are heap-allocated objects. I think we should just remove it.

It's for dealing with inline fields -- where an IR::Node subclass appears directly as a field within another IR::Node subclass, rather than as a pointer. While we could require that using such a field in an error message requires an explicit & to get a pointer to it, that would be error prone.

asl · 2024-07-30T08:21:52Z

> For incremental improvement, the way to do this sort of thing is to start by writing a new Visitor subclass replacement for Transform (maybe called LazyTransform?) and then experiment with rewriting existing Transform subclasses to use it. A big complication is how it interacts with Visitor::Context -- we need the smart pointers in there too, but perhaps that could be managed by careful updating.

Another tricky part is the various side maps: they may capture the original const node (e.g. in preorder), but if any of the children change, they will need to be updated to point to the cloned object...

ChrisDodd · 2024-09-03T23:49:47Z

frontends/p4/typeChecking/typeCheckDecl.cpp

-                decl = cloned;
-            }
-        }
+        if (decl->initializer != nullptr) visit(decl->initializer);


There's no need to check for nullptr here, as calling visit on a nullptr is a noop.

asl · 2024-09-04

Yeah... but this was the original code :)

ChrisDodd · 2024-09-04T03:26:58Z

> > For incremental improvement, the way to do this sort of thing is to start by writing a new Visitor subclass replacement for Transform (maybe called LazyTransform?) and then experiment with rewriting existing Transform subclasses to use it. A big complication is how it interacts with Visitor::Context -- we need the smart pointers in there too, but perhaps that could be managed by careful updating.
>
> Another tricky part is the various side maps: they may capture the original const node (e.g. in preorder), but if any of the children change, they will need to be updated to point to the cloned object...

That's already tricky -- any Modifier/Transform that runs currently will invalidate any data structure outside the IR DAG that refers to IR nodes. Perhaps we need a way of registering such a data structure with the visitor infrastructure so that it can be automatically updated to reflect any clones that occur, but I'm not sure how that would best work. Some sort of callback to be called whenever a node is cloned (or replaced? Transform may completely replace a node with a newly created node that might not even be the same class)
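
One possible shape for such a callback is sketched below, only to make the idea concrete. Everything here is hypothetical (`CloneNotifier`, `onClone`, the `TypeTable` side map and `mirrorOnClone`); it covers only the "node was cloned" case and says nothing about replacement by a node of a different class.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy node (hypothetical).
struct Node {
    int id;
};

// Visitor-side registry: side data structures subscribe to clone events.
class CloneNotifier {
    using Callback = std::function<void(const Node *, const Node *)>;
    std::vector<Callback> listeners;

 public:
    void onClone(Callback cb) { listeners.push_back(std::move(cb)); }

    // Clones a node and tells every listener about the (original, copy) pair.
    std::unique_ptr<Node> clone(const Node *original) {
        auto copy = std::make_unique<Node>(*original);
        for (auto &cb : listeners) cb(original, copy.get());
        return copy;
    }
};

// Example side map keyed by node address, kept in sync through the callback.
using TypeTable = std::map<const Node *, std::string>;

inline void mirrorOnClone(CloneNotifier &notifier, TypeTable &table) {
    notifier.onClone([&table](const Node *from, const Node *to) {
        auto it = table.find(from);
        if (it != table.end()) table[to] = it->second;  // copy entry to clone
    });
}
```

The lambda captures the table by reference, so in this sketch the table has to outlive the notifier; a real design would also need a way to unregister listeners.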

asl added the core Topics concerning the core segments of the compiler (frontend, midend, parser) label Jul 23, 2024

asl requested review from vlstill and ChrisDodd July 23, 2024 21:46

asl added the run-sanitizer Use this tag to run a Clang+Sanitzers CI run. label Jul 24, 2024

asl mentioned this pull request Jul 24, 2024

Ensure correct overload for diagnose() is called even in Transform context #4830

Merged

fruffy requested a review from grg July 24, 2024 13:27

grg reviewed Jul 24, 2024

grg approved these changes Jul 25, 2024

asl force-pushed the type-inference-visitor branch 2 times, most recently from 74172b3 to 1e3daeb Compare July 29, 2024 16:32

ChrisDodd reviewed Jul 30, 2024

asl force-pushed the type-inference-visitor branch from 1e3daeb to 64b4b8b Compare September 2, 2024 01:52

asl added the run-validation Use this tag to trigger a Validation CI run. label Sep 2, 2024

asl force-pushed the type-inference-visitor branch from f362104 to c27ea8f Compare September 2, 2024 19:03

ChrisDodd approved these changes Sep 4, 2024

asl added 18 commits September 3, 2024 21:14 (each Signed-off-by: Anton Korobeynikov <anton@korobeynikov.info>):

80a1b35 Ensure we create new ReturnStatement node should the initializer change.
8aacb81 Recreate ForIn node if collection changes during type checking
892c47a Reduce nesting level
ccab76d Move few methods to proper place
cfc11b9 Properly re-create Declaration_Variable during type checking
6f8abdc Use better implementations
fb9f720 Recreate Mux node on change
8478ea7 Recreate SerEnumMember node on change
c49fe15 Recreate ConstructorCallExpression on change
cd35540 Make type checking of declaration instance to return new node
2233e59 Do not change SwitchStatement in-place
2c1176a Factor out type inference code into a common base class. Implement typechecker as a Visitor with reduced overhead
d7c6a33 Switch to ReadOnlyTypeInference in couple of places
c886b83 Unbreak unity builds
e362109 Pacify -Werror bots
27c26b9 Fix new linting errors
733689d Simplify clone() dance
ea6a303 Simplify error formatting

asl force-pushed the type-inference-visitor branch from e983e97 to ea6a303 Compare September 4, 2024 04:14

asl added this pull request to the merge queue Sep 4, 2024

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Sep 4, 2024

asl added this pull request to the merge queue Sep 4, 2024

Merged via the queue into p4lang:main with commit 83fa5f3 Sep 4, 2024
18 checks passed

asl deleted the type-inference-visitor branch September 4, 2024 07:13
