-
Notifications
You must be signed in to change notification settings - Fork 14.4k
[SLP] Fix cost estimation of external uses with wrong VF #148185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
gbossu
wants to merge
1
commit into
llvm:main
Choose a base branch
from
gbossu:gbossu.slp.fix.external.use.cost
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+42
−2
Open
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the things I do find surprising is that the scalar load of
%a7
is kept, instead of being extracted from the last lane ofTMP1
. I guess it is fine, but costing and codegen are inconsistent. Changing that behaviour is out of scope for this PR.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cost of the original scalar is less than the extractelement, so prefer to stick with the original scalar instead of generating extractelement instruction
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to extract this test into a separate file and make it generating the remarks with the cost to see how it affects the cost
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cost is actually the same, because the Lane idx is essentially ignored by TTI implementations of
getVectorInstrCost()
. However, querying the cost for a lane that is out-of-bounds is still wrong.In my local experiments with REVEC, I ended up querying the extraction cost of a sub-vector with an index that is out of bounds, and that did trigger an assert in
getShuffleCost()
. I just cannot find a test that reproduces the issue without bringing in more changes.Do you still want me to separate the test in a different file and show the costs (even though they would not change)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it is true. In many cases, an extract from Index 0 is cheaper than an extract from other lanes, so getVectorInstrCost checks for the lane index.
I think you can see the difference if the original vector factor occupies more than a single register. In this case, some type legalization (spanning across registers) may add to the cost.
Would be good to try to adjust the test to either cause a crash, or show cost changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I wasn't very precise, I meant
getVectorInstrCost
implementations don't generally check if the lane is in bounds, but they do indeed check for specific values like0
or-1
.As mentioned in the description, the added assert does trigger crashes if
BundleWidth
isn't correctly computed. I thought this would be enough to show that the parameters given togetVectorInstrCost
are incorrect.