Block delimiter: add new Scanner class. #44158
base: trunk
The `Block_Delimiter` class introduced `next_delimiter()` and `scan_delimiters()`, which made it possible to parse the block structure of a document in a memory-efficient way. Unfortunately, two fundamental interface choices, returning a new class instance for every block delimiter and relying on a generator function, limited the CPU performance frontier of that class as a replacement for `parse_blocks()`. This PR introduces `Block_Scanner`, modeled more directly after the HTML API and informed by refactors incorporating `Block_Delimiter`. The class mutates itself and must be instantiated before scanning. The tradeoff is that it runs much faster while maintaining the same near-zero memory overhead. A separate class is introduced because of the scale of the interface change and to allow seamless refactoring of code that already relies on `scan_delimiters()`.
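To make the interface difference concrete, here is a minimal sketch of a self-mutating scanner. It is not the package's implementation: `Toy_Scanner`, `is_closer()`, and `get_block_name()` are hypothetical names invented for illustration, and the regular expression only recognizes simple opening and closing delimiters (no void blocks, JSON attributes, or freeform content). Only the `next_delimiter(): bool` shape is taken from this PR.

```php
<?php
// Hypothetical toy class: shows the "advance a cursor in place" pattern,
// not the real Block_Scanner from this PR.
final class Toy_Scanner {
	/** @var string Document being scanned. */
	private $source_text;
	/** @var int Byte offset where the next search starts. */
	private $offset = 0;
	/** @var string|null Name of the block at the current delimiter. */
	private $block_name = null;
	/** @var bool Whether the current delimiter is a closer. */
	private $is_closer = false;

	public function __construct( string $source_text ) {
		$this->source_text = $source_text;
	}

	// Moves to the next delimiter by mutating this object.
	// No per-delimiter object is allocated and no generator is involved.
	public function next_delimiter(): bool {
		$matched = preg_match(
			'~<!--\s+(/)?wp:([a-z][a-z0-9_-]*(?:/[a-z][a-z0-9_-]*)?)\s~',
			$this->source_text,
			$m,
			PREG_OFFSET_CAPTURE,
			$this->offset
		);
		if ( 1 !== $matched ) {
			return false;
		}
		$this->is_closer  = '' !== $m[1][0];
		$this->block_name = $m[2][0];
		$this->offset     = $m[0][1] + strlen( $m[0][0] );
		return true;
	}

	public function is_closer(): bool {
		return $this->is_closer;
	}

	public function get_block_name(): ?string {
		return $this->block_name;
	}
}

// Usage: collect the names of all opening delimiters.
$scanner = new Toy_Scanner(
	'<!-- wp:group --><!-- wp:paragraph --><p>Hi</p><!-- /wp:paragraph --><!-- /wp:group -->'
);
$names = array();
while ( $scanner->next_delimiter() ) {
	if ( ! $scanner->is_closer() ) {
		$names[] = $scanner->get_block_name();
	}
}
```

The caller reads state from the one scanner object after each successful call, which is the same general style the HTML API uses; the cost is that any value needed later must be copied out before advancing.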
Pull Request Overview
This PR introduces a new high-performance, mutable `Block_Scanner` class as a replacement for the legacy `Block_Delimiter`, updates tests to cover it, and adds stubs, documentation, and changelog entries to support it.
- Add `WP_HTML_Span` stub and include it in test bootstrap
- Implement `Block_Scanner` and extensive PHPUnit tests
- Update README and changelog to describe and document the new scanner
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
File | Description |
---|---|
tests/stubs/class-wp-html-span.php | Add stub for WP_HTML_Span |
tests/php/bootstrap.php | Load the new stub in test bootstrap |
tests/php/Block_Scanner_Test.php | New PHPUnit tests covering all scanner behaviors |
src/class-block-scanner.php | Implementation of Block_Scanner |
changelog/add-block-scanner-delimiter | Changelog entry for the added scanner |
README.md | Document new Block_Scanner and legacy Block_Delimiter |
Comments suppressed due to low confidence (1)
projects/packages/block-delimiter/src/class-block-scanner.php:235
- The docblock for `next_delimiter()` describes support for a `$freeform_blocks` parameter, but the implementation currently ignores it. Please update the documentation to note that freeform scanning is not yet implemented, or implement the parameter behavior to match the docs.
public function next_delimiter( string $freeform_blocks = 'skip' ): bool { // phpcs:ignore VariableAnalysis.CodeAnalysis.VariableAnalysis.UnusedVariable
}

$json_span = substr( $this->source_text, $this->json_at, $this->json_length );
$parsed = json_decode( $json_span, null, 512, JSON_OBJECT_AS_ARRAY | JSON_INVALID_UTF8_SUBSTITUTE );
Passing `null` as the second argument to `json_decode()` relies on default behavior. Explicitly pass `true` (for associative arrays) to make the intent clearer and avoid ambiguity.
- $parsed = json_decode( $json_span, null, 512, JSON_OBJECT_AS_ARRAY | JSON_INVALID_UTF8_SUBSTITUTE );
+ $parsed = json_decode( $json_span, true, 512, JSON_OBJECT_AS_ARRAY | JSON_INVALID_UTF8_SUBSTITUTE );
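For context on this suggestion: on PHP 7.2 or later, a `null` second argument defers to the `JSON_OBJECT_AS_ARRAY` flag, so both spellings should decode objects as associative arrays. The standalone check below is a sketch to confirm that behavior; the sample JSON string is made up for illustration and does not come from the PR.

```php
<?php
// Sample input invented for this check; any JSON object would do.
$json = '{"align":"wide","style":{"color":"red"}}';

// null + JSON_OBJECT_AS_ARRAY: the flag decides, so objects become arrays.
$with_flag = json_decode( $json, null, 512, JSON_OBJECT_AS_ARRAY | JSON_INVALID_UTF8_SUBSTITUTE );

// null without the flag: objects become stdClass instances.
$without_flag = json_decode( $json, null, 512, JSON_INVALID_UTF8_SUBSTITUTE );

// Explicit true: objects become arrays regardless of the flag.
$explicit_true = json_decode( $json, true, 512, JSON_INVALID_UTF8_SUBSTITUTE );

var_dump( is_array( $with_flag ) );
var_dump( $without_flag instanceof stdClass );
var_dump( $with_flag === $explicit_true );
```

Under that reading, the change is about readability rather than behavior: the result only differs if the flag is later removed from the call.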
 * not surrounded by block delimiters. Defaults to `skip`.
 * @return bool Whether a block delimiter was matched.
 */
public function next_delimiter( string $freeform_blocks = 'skip' ): bool { // phpcs:ignore VariableAnalysis.CodeAnalysis.VariableAnalysis.UnusedVariable
[nitpick] The `next_delimiter()` method is very large and handles multiple responsibilities. Consider extracting parts of its logic (e.g., scanning for comment boundaries, parsing JSON spans) into private helper methods to improve readability and maintainability.
- The Block Delimiter package provides an efficient, streaming parser for working with WordPress block structure without the memory overhead of `parse_blocks()`. It's designed for scenarios where you need to inspect, find, or modify specific blocks without parsing the entire block tree.
+ The Block Delimiter package provides efficient, streaming parsers for working with WordPress block structure without the memory overhead of `parse_blocks()`. It's designed for scenarios where you need to inspect, find, or modify specific blocks without parsing the entire block tree.
[nitpick] Minor grammar: change “parsers” to singular “parser” since the sentence describes the package itself.
- The Block Delimiter package provides efficient, streaming parsers for working with WordPress block structure without the memory overhead of `parse_blocks()`. It's designed for scenarios where you need to inspect, find, or modify specific blocks without parsing the entire block tree.
+ The Block Delimiter package provides an efficient, streaming parser for working with WordPress block structure without the memory overhead of `parse_blocks()`. It's designed for scenarios where you need to inspect, find, or modify specific blocks without parsing the entire block tree.
Code Coverage Summary
1 file is newly checked for coverage.
Proposed changes:
This adds a new class to the package. See matching PRs:
Other information:
Jetpack product discussion
Does this pull request change what data or activity we track or use?
Testing instructions: