-
-
Notifications
You must be signed in to change notification settings - Fork 367
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow developer to control FpmHandler timeout #770
Closed
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This was referenced Oct 9, 2020
Closing this as it's a bad solution anyway. |
mnapoli
added a commit
that referenced
this pull request
Jan 4, 2022
When Lambda times out with the PHP-FPM layer, the logs written by the PHP script are never flushed to stderr by PHP-FPM. That means they never reach CloudWatch, which makes timeouts really hard to debug. With this change, Bref waits for the FPM response until 1 second before the actual Lambda timeout (via a connection timeout on the FastCGI connection). If Bref reaches that point, it will ask PHP-FPM to gracefully restart the PHP-FPM worker, which: - flushes the logs (logs end up in CloudWatch, which is great) - restarts a clean FPM worker, without doing a full FPM restart (which may take longer) Follow up of #770, #772, #895 May address some of #862 Note: this does not change anything for the Function layer (only affects FPM). Also this does not show a full stack track of the place in the code where the timeout happens (#895 did). Still it's an improvement over the current status.
Do you remember why it was a bad solution? I follow the same path and I managed to get something working: #1133 |
mnapoli
added a commit
that referenced
this pull request
Jan 4, 2022
When Lambda times out with the PHP-FPM layer, the logs written by the PHP script are never flushed to stderr by PHP-FPM. That means they never reach CloudWatch, which makes timeouts really hard to debug. With this change, Bref waits for the FPM response until 1 second before the actual Lambda timeout (via a connection timeout on the FastCGI connection). If Bref reaches that point, it will ask PHP-FPM to gracefully restart the PHP-FPM worker, which: - flushes the logs (logs end up in CloudWatch, which is great) - restarts a clean FPM worker, without doing a full FPM restart (which may take longer) Follow up of #770, #772, #895 May address some of #862 Note: this does not change anything for the Function layer (only affects FPM). Also this does not show a full stack track of the place in the code where the timeout happens (#895 did). Still it's an improvement over the current status.
mnapoli
added a commit
that referenced
this pull request
Jan 4, 2022
When Lambda times out with the PHP-FPM layer, the logs written by the PHP script are never flushed to stderr by PHP-FPM. That means they never reach CloudWatch, which makes timeouts really hard to debug. With this change, Bref waits for the FPM response until 1 second before the actual Lambda timeout (via a connection timeout on the FastCGI connection). If Bref reaches that point, it will ask PHP-FPM to gracefully restart the PHP-FPM worker, which: - flushes the logs (logs end up in CloudWatch, which is great) - restarts a clean FPM worker, without doing a full FPM restart (which may take longer) Follow up of #770, #772, #895 May address some of #862 Note: this does not change anything for the Function layer (only affects FPM). Also this does not show a full stack track of the place in the code where the timeout happens (#895 did). Still it's an improvement over the current status.
The setter was bad because it required users to write their own |
mnapoli
added a commit
that referenced
this pull request
Feb 14, 2023
When Lambda times out with the PHP-FPM layer, the logs written by the PHP script are never flushed to stderr by PHP-FPM. That means they never reach CloudWatch, which makes timeouts really hard to debug. With this change, Bref waits for the FPM response until 1 second before the actual Lambda timeout (via a connection timeout on the FastCGI connection). If Bref reaches that point, it will ask PHP-FPM to gracefully restart the PHP-FPM worker, which: - flushes the logs (logs end up in CloudWatch, which is great) - restarts a clean FPM worker, without doing a full FPM restart (which may take longer) Follow up of #770, #772, #895 May address some of #862 Note: this does not change anything for the Function layer (only affects FPM). Also this does not show a full stack track of the place in the code where the timeout happens (#895 did). Still it's an improvement over the current status.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Problem
Bref currently don't support APIs that take longer than 30 seconds, even if running behing Application Load Balancer, which don't have the same restriction as API Gateway.
Diagnostics
Lambda Timeout: Specified on the Lambda Resource (SAM/Serverless), it is currently limited by AWS up to 15 minutes. I have set this to 180 seconds to support APIs that take up to 3 minutes.
PHP Max Execution Time: ini configuration that comes by default set to 27s1. We are able to increase this value by providing our own ini configuration2.
Unix Read Write Timeout: The focus of this Pull Request. Currently, The Unix Socket has a timeout of 30 seconds3. Any request that reaches the 30 seconds mark gets the following exception:
Solutions
In order to support requests that takes longer than 30 seconds we have a few options.
Environment Variable. BREF can expose one environment variable that controls the Unix timeout. Ideally, if AWS Lambda had a standard environment variable containing the time specified in the Lambda Timeout (item 1), that would be perfect for us to intercept and use. Unfortunately, AWS does not offer that and although it could solve the issue, I don't know if it's fair to ask Bref to expose yet another environment variable for us. As a user, I'm not against this option, but I sympathize that it's not ideal for Bref maintenance.
Allow users to set the Timeout via the setter. This pull request adds that ability. It is, however, a setter method on an
@internal
class, which doesn't make much sense because Bref internally don't need this setter. The fact that this class is internal makes it a strong argument to decline this change.Users can write their own FpmHandler. This option requires users to copy/paste the entire FpmHandler class while still being subject to the same issues as the previous option.
Usage
Where users would be able to call the
setReadWriteTimeout
method? We would have to mix 2 layers: FPM and Function. The FPM layer brings all the necessary files to get FPM to work. The Function layer allows us to return our own handler4.The final result looks like this:
In the Template, we want the
function
layer to come in the end as it will overwrite files brought by the previous layer5Our Handler would point to a functional handler:
The Handler File could instantiate and modify the Fpm object:
Caveat / Gotcha
1- The Function layer will also overwrite the
php.ini
from the FPM Layer. We're still able to patch that up inside our projects, but it's something users would need to keep in mind.2- Changes to
Function/bootstrap.php
could theoretically negatively affect this setup, but I say theoretically because we're still respecting the contract from Function: we're returning a valid Handler.3- Changes to
Fpm/bootstrap
would not take effect on people doing a setup like this. This one is less theoretical as any improvements made to Fpm Bref would not be seen by people doing this.4- The Lambda handler is no longer the
index.php
and the custom Handler has to instantiateFpmHandler
specifying where theindex.php
is located.5- FpmHandler is internal.
Conclusion
While writing all of this, I'm starting to think that this is not a good idea for Bref. I will create the PR anyway as it could be useful information for future users even if it gets declined. I think this PR could be a good place to get the discussion rolling on:
1- Should Bref support APIs that take longer than 30 seconds behind ALB?
2- Are there any alternatives that better fit Bref philosophy?
3- Are there ways to mitigate the caveats / gotchas?
Reference