Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AbstractArrayAssignmentRestrictions: implement PHPCSUtils, bug fixes, modern PHP and more #2266

Merged
merged 17 commits into from
Jul 4, 2023

Conversation

jrfnl
Copy link
Member

@jrfnl jrfnl commented Jun 25, 2023

Review notes

This PR is the result of a comprehensive review of the AbstractArrayAssignmentRestrictionsSniff class.

While this PR makes significant improvements, there is still more wrong with this abstract:

  • The text quotes are stripped off the value before it gets passed to the callback, meaning the callback can no longer determine what type of value was received, which could lead to false positives/negatives.
    Example: https://3v4l.org/YH5Rp (and yes, this means that even though numeric literals with underscores should be supported, we cannot do so reliably).
  • A compound key, like [ $prefix . 'posts_per_page' ], will be flagged as if the key were 'posts_per_page' as the key does not get captured in its entirety.
  • Along the same lines, the fact that the callback method gets a string value for $key and $value is wrong as it doesn't allow the sniffs to walk the tokens for a proper analysis of the key/value.

To fix all of that, would IMO require changing the method signature for the callback() method.
Adjusting the method signature is non-trivial as it would instantly cause fatal errors for all sniffs using the abstract due to the signature mismatch: https://3v4l.org/0tf0c

Alternatively, adding a second callback method option which would receive stack pointers could be considered.

In both cases, that would be quite a big breaking change and even though we are at 3.0.0, I have not seen any issues reported asking for a change along these lines, so I don't think making that breaking change is justified.

If anything, I think we may need to introduce a replacement for this abstract at some point (which passes the tokens to the callback) and deprecate this one. Then dev-users can migrate to the replacement at their own pace.

That is outside the scope of this PR however. This PR attempts to make what's there better in the full knowledge that this is an abstract which shouldn't exist in its current form.

Commit details

AbstractArrayAssignmentRestrictions: add some extra tests

... to raise code coverage and cover some situations previously untested.

Tests have been added to the WordPress.WP.PostsPerPage sniff, which contains a @covers tag for the abstract.

AbstractArrayAssignmentRestrictions: improve documentation

Most notably: the value for 'message' as was appears to have led to sniff devs not using the key, which was never the intention.

AbstractArrayAssignmentRestrictions: minor readability fix

AbstractArrayAssignmentRestrictions: implement PHPCSUtils

AbstractArrayAssignmentRestrictions: bug fix - improve comment tolerance [1]

Non-code style sniffs should ignore most whitespace and comment tokens.

The AbstractArrayAssignmentRestrictions did not do so correctly, which could lead to false negatives.

This commit fixes the handling of comments between the "key" and the assignment operator by the sniff.

Includes tests via the WordPress.WP.PostsPerPage sniff.

Related to #763

AbstractArrayAssignmentRestrictions: bug fix - improve comment tolerance [2]

Non-code style sniffs should ignore most whitespace and comment tokens.

The AbstractArrayAssignmentRestrictions did not do so correctly, which could lead to false negatives.

This commit fixes the handling of comments between the assignment operator and the assigned value.

Includes tests via the WordPress.WP.PostsPerPage sniff.

Related to #763

AbstractArrayAssignmentRestrictions: minor efficiency tweak [1]

Get rid of two unnecessary in_array() function calls.

AbstractArrayAssignmentRestrictions: minor efficiency tweak [2]

The position of the T_EQUAL token has already been determined before, so let's use the predetermined value for efficiency.

Includes renaming the local variable to be slightly more descriptive.

AbstractArrayAssignmentRestrictions: minor efficiency tweak [3]

No need to start looping the inner array if the $key is not one of the target keys in the current group.

AbstractArrayAssignmentRestrictions: precision (bug) fix

Check if the array "key" token found is a text string token as well as checking that the content is non-numeric.

This is also an efficiency fix as code like $query_args[] = 300; would previously be analyzed with key [, which is clearly nonsense.

Along the same lines, a numeric string key would previously also be analyzed, due to the quotes around the number, while PHP auto-casts numeric string keys to integers (see: https://3v4l.org/uVFsO ), which means they are not the target of this sniff and therefore should be disregarded.
This also meant that the $key passed to the callback() method could previously potentially be an integer (as the same type casts happens when we store the numeric string key to the $inst array), while it is documented as string only.

Note: this didn't lead to false positives as nobody would set the "key" to search for to [, but it was still wrong.

Includes tests via the WordPress.WP.PostsPerPage sniff.

AbstractArrayAssignmentRestrictions: bug fix - simplify the $inst(ances) array

As things were, the sniff would store the encountered "instances" in a multi-dimensional array looking like this:

array(
    (string) 'key' => array(
        [0] => array(
            [0] => (string) 'value',
            [1] => (int) line number,
        ),
        ...
    )
)

Now, the sniffs listens to four different tokens:

  • T_DOUBLE_ARROW
  • T_CLOSE_SQUARE_BRACKET
  • T_CONSTANT_ENCAPSED_STRING
  • T_DOUBLE_QUOTED_STRING

For the first two, there could only ever be one entry in the lowest level of the array as it looks at each array item individually due to the use of the T_DOUBLE_ARROW token for array declarations and the T_CLOSE_SQUARE_BRACKET token being used for assignments to an existing array.

For the last two, the found string is analyzed to see if it is a query string and parsed if it is. In that case, there could be multiple entries in the array.

However.... when PHP parses a query string, there will never be multiple entries for the same key: https://3v4l.org/n3Iv9

So, I cannot fanthom why having multiple entries for the same key would be useful for this sniff. Only the last found entry should be taken into account.
There were also no tests (anywhere) covering a query string containing a duplicate key.

So, taking the above into account, this commit simplifies the entries are stored in the $inst array to this:

array(
    (string) 'key' => array(
        [0] => (string) 'value',
        [1] => (int) line number,
    )
)

This array format change will now prevent incorrect messages about a "high" value which is subsequently overwritten by a "low" value anyway.

Includes tests via the WordPress.WP.PostsPerPage sniff.

AbstractArrayAssignmentRestrictions: make the contents of $inst more comprehensible

Even though the array is only used locally, let's use keys for the entries to make the code more readable/comprehensible.

AbstractArrayAssignmentRestrictions: report the error for array assignments on the key pointer

For array assignments, the error would previously be reported on a ] or => token, while those tokens in and of themselves are not where the problem lies.

As the sniff reports based on the key (with an optional value check in the callback() method), I think it is more appropriate and more correct for the sniff to report on the stack pointer to the token identified as the key.

To achieve this, a new array entry is added to the $inst array so the loop throwing the error has access to that information.

For query strings, the error will continue to be thrown on the text string token containing the query string.

The difference this commit makes can be seen when using the code report offered by PHPCS, like so --report=code.

Includes tests via the WordPress.WP.PostsPerPage sniff.

AbstractArrayAssignmentRestrictions: allow for PHP 7.4+ null coalesce equals

While other assignment operators all lead to calculations, which means we cannot reliably determine the actual value, the null coalesce assignment operator provides a default value, which means we can analyze the value as if it were the value set.

Includes tests via the WordPress.WP.PostsPerPage sniff.

AbstractArrayAssignmentRestrictions: add test with PHP 8.0 match

Includes tests via the WordPress.WP.PostsPerPage sniff.

AbstractArrayAssignmentRestrictions: bug fix - improve value capturing

This commit improves the way the AbstractArrayAssignmentRestrictionsSniff captures the value of an array key assignment before it gets passed on to the callback for further examination.

As things were previously, the sniff would go wrong in two cases:

  1. Without a trailing comma after an array item, the long/short array close bracket would be included in the value, while it shouldn't be.
    This was reported in issue AbstractArrayAssignmentRestrictionsSniff doesn't correctly capture array values with no trailing comma #2211.
  2. The sniff would not always capture a complete value when the value contained a function call, sub-array, closure etc; basically anything which can contain a comma or semi-colon, but where that comma/semi-colon is in an "inner" construct and not the "outer" construct which is being captured.
    This was reported in issue False positive "High pagination limit" when posts_per_page is set to a function. #2027

The fix included in this commit fixes the first issue completely and a partial fix for the second issue.

Includes tests via the WordPress.WP.PostsPerPage sniff.

Fixes #2211

Note: a follow-up PR which builds onto this commit will address #2027.


🆕 AbstractArrayAssignmentRestrictionsSniff: don't strip quotes off value

As the value capturing is now more likely to (correctly) contain multiple tokens, stripping the quotes of a value is plain incorrect as wit would lead to a value like this:

$var['key'] = 'my' . 10 . 'something';

... to be passed on as my' . 10 . 'something - i.e. without the surrounding quotes, but with the non-matching quotes inside the value.

I'm proposing to remove the call to TextStrings::stripQuotes() to allow the callback() to receive a more accurate and actionable value.

This will be a potentially breaking change for some extenders, so will need a clear entry in the dev upgrade guide. (It doesn't impact the WPCS native sniffs extending the abstract at this time)

Copy link
Member

@GaryJones GaryJones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question, one small typo fix, and a suggestion to update 1c969ab commit to include the # before the issue references so they get automatically tagged to this commit/PR.

WordPress/AbstractArrayAssignmentRestrictionsSniff.php Outdated Show resolved Hide resolved
WordPress/Tests/WP/PostsPerPageUnitTest.inc Outdated Show resolved Hide resolved
jrfnl added 16 commits June 26, 2023 17:30
... to raise code coverage and cover some situations previously untested.

Tests have been added to the `WordPress.WP.PostsPerPage` sniff, which contains a `@covers` tag for the abstract.
Most notably: the value for `'message'` as was appears to have led to sniff devs not using the key, which was never the intention.
…nce [1]

Non-code style sniffs should ignore most whitespace and comment tokens.

The `AbstractArrayAssignmentRestrictions` did not do so correctly, which could lead to false negatives.

This commit fixes the handling of comments between the "key" and the assignment operator by the sniff.

Includes tests via the `WordPress.WP.PostsPerPage` sniff.
…nce [2]

Non-code style sniffs should ignore most whitespace and comment tokens.

The `AbstractArrayAssignmentRestrictions` did not do so correctly, which could lead to false negatives.

This commit fixes the handling of comments between the assignment operator and the assigned value.

Includes tests via the `WordPress.WP.PostsPerPage` sniff.
Get rid of two unnecessary `in_array()` function calls.
The position of the `T_EQUAL` token has already been determined before, so let's use the predetermined value for efficiency.

Includes renaming the local variable to be slightly more descriptive.
No need to start looping the inner array if the `$key` is not one of the target keys in the current group.
Check if the array "key" token found is a text string token as well as checking that the content is non-numeric.

This is also an efficiency fix as code like `$query_args[] = 300;` would previously be analyzed with key `[`, which is clearly nonsense.

Along the same lines, a numeric string key would previously also be analyzed, due to the quotes around the number, while PHP auto-casts numeric string keys to integers (see: https://3v4l.org/uVFsO ), which means they are not the target of this sniff and therefore should be disregarded.
This also meant that the `$key` passed to the `callback()` method could previously potentially be an integer (as the same type casts happens when we store the numeric string key to the `$inst` array), while it is documented as `string` only.

Note: this didn't lead to false positives as nobody would set the "key" to search for to `[`, but it was still wrong.

Includes tests via the `WordPress.WP.PostsPerPage` sniff.
…nces) array

As things were, the sniff would store the encountered "instances" in a multi-dimensional array looking like this:
```
array(
    (string) 'key' => array(
        [0] => array(
            [0] => (string) 'value',
            [1] => (int) line number,
        ),
        ...
    )
)
```

Now, the sniffs listens to four different tokens:
* `T_DOUBLE_ARROW`
* `T_CLOSE_SQUARE_BRACKET`
* `T_CONSTANT_ENCAPSED_STRING`
* `T_DOUBLE_QUOTED_STRING`

For the first two, there could only ever be one entry in the lowest level of the array as it looks at each array item individually due to the use of the `T_DOUBLE_ARROW` token for array declarations and the `T_CLOSE_SQUARE_BRACKET` token being used for assignments to an existing array.

For the last two, the found string is analyzed to see if it is a query string and parsed if it is. In that case, there could be multiple entries in the array.

However.... when PHP parses a query string, there will never be multiple entries for the same key: https://3v4l.org/n3Iv9

So, I cannot fanthom why having multiple entries for the same key would be useful for this sniff. Only the last found entry should be taken into account.
There were also no tests (anywhere) covering a query string containing a duplicate key.

So, taking the above into account, this commit simplifies the entries are stored in the `$inst` array to this:
```
array(
    (string) 'key' => array(
        [0] => (string) 'value',
        [1] => (int) line number,
    )
)
```

This array format change will now prevent incorrect messages about a "high" value which is subsequently overwritten by a "low" value anyway.

Includes tests via the `WordPress.WP.PostsPerPage` sniff.
…e comprehensible

Even though the array is only used locally, let's use keys for the entries to make the code more readable/comprehensible.
…nments on the key pointer

For array assignments, the error would previously be reported on a `]` or `=>` token, while those tokens in and of themselves are not where the problem lies.

As the sniff reports based on the key (with an optional value check in the `callback()` method), I think it is more appropriate and more correct for the sniff to report on the stack pointer to the token identified as the key.

To achieve this, a new array entry is added to the `$inst` array so the loop throwing the error has access to that information.

For query strings, the error will continue to be thrown on the text string token containing the query string.

The difference this commit makes can be seen when using the `code` report offered by PHPCS, like so `--report=code`.

Includes tests via the `WordPress.WP.PostsPerPage` sniff.
… equals

While other assignment operators all lead to calculations, which means we cannot reliably determine the actual value, the null coalesce assignment operator provides a default value, which means we can analyze the value as if it were the value set.

Includes tests via the `WordPress.WP.PostsPerPage` sniff.
Includes tests via the `WordPress.WP.PostsPerPage` sniff.
This commit improves the way the `AbstractArrayAssignmentRestrictionsSniff` captures the value of an array key assignment before it gets passed on to the callback for further examination.

As things were previously, the sniff would go wrong in two cases:
1. Without a trailing comma after an array item, the long/short array close bracket would be included in the value, while it shouldn't be.
    This was reported in issue 2211.
2. The sniff would not always capture a complete value when the value contained a function call, sub-array, closure etc; basically anything which can contain a comma or semi-colon, but where that comma/semi-colon is in an "inner" construct and not the "outer" construct which is being captured.
    This was reported in issue 2027

The fix included in this commit fixes the first issue completely and a partial fix for the second issue.

Includes tests via the `WordPress.WP.PostsPerPage` sniff.

Fixes 2211
@jrfnl jrfnl force-pushed the feature/abstractarrayassignmentrestrictions-review branch from 1c969ab to 02b4581 Compare June 26, 2023 15:33
@jrfnl
Copy link
Member Author

jrfnl commented Jun 26, 2023

a suggestion to update 1c969ab commit to include the # before the issue references so they get automatically tagged to this commit/PR.

I don't do that on purpose. Quite often in my workflow, I rebase and (force-)push branches numerous times over their lifetime. If I include the # in the commit message, each of those pushes results in a new "entry" on the ticket which can be very noisy.

Instead I add the # in the PR description once something is PRed, which ensures the ticket and the PR are linked anyway, but then without the commit noise.

As the value capturing is now more likely to (correctly) contain multiple tokens, stripping the quotes of a value is plain incorrect as wit would lead to a value like this:
```php
$var['key'] = 'my' . 10 . 'something';
```
... to be passed on as `my' . 10 . 'something` - i.e. without the surrounding quotes, but with the non-matching quotes inside the value.

I'm proposing to remove the call to `TextStrings::stripQuotes()` to allow the `callback()` to receive a more accurate and actionable value.

This will be a potentially breaking change for _some_ extenders, so will need a clear entry in the dev upgrade guide. (It doesn't impact the WPCS native sniffs extending the abstract at this time)
@jrfnl
Copy link
Member Author

jrfnl commented Jun 27, 2023

❗ I've added one additional commit to this PR, please check it carefully as it contains a breaking change, but one which IMO must be made.

@jrfnl
Copy link
Member Author

jrfnl commented Jul 4, 2023

@GaryJones Do you still want to have a look at that new commit or can this be merged ?

@GaryJones GaryJones merged commit 76c0b13 into develop Jul 4, 2023
35 checks passed
@GaryJones GaryJones deleted the feature/abstractarrayassignmentrestrictions-review branch July 4, 2023 11:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

AbstractArrayAssignmentRestrictionsSniff doesn't correctly capture array values with no trailing comma
3 participants