Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Control FPM timeout to give time to flush out logs #1106

Closed
wants to merge 1 commit into from
Closed

Conversation

deleugpn
Copy link
Member

One of the most common issues we face is with Lambda timeout that doesn't signal anything (logs or etc) when running Lambda inside a VPC without a NAT. This proposal makes it so that FPM timeout faster than Lambda so that there's enough time to at least flush out logs and help developers debug why their lambda is timing out.

One thing to consider is that this solution is FPM-only and doesn't improve the situation on the other layers.

Copy link
Member

@t-richard t-richard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for starting this 👍

Even if it's in draft, here are some comments 😊
Also, I know @Nyholm did something similar that was only working with the function layer. Maybe combining both could help us solve this once and for all 🤞

$phpFpm = new FpmHandler($handlerFile);

$phpFpm->setTimeout((int) $timeout);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be called only when a timeout env Var is provided?

Because the variable is set by the serverless framework plugin, it may not be set when deploying with sam/cdk/cloudformation

defineBrefFpmTimeout(name) {
const timeout = this.serverless.service.functions[name].timeout ??
this.serverless.service.provider?.environment?.timeout ??
29 // We assume API Gateway Limit.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the default be 6 seconds (to be confirmed) , the default timeout for functions in the serverless framework?

If a timeout is not explicitly defined, sls would deploy it with a timeout of 6 seconds and fpm would timeout at 28 seconds. This would defeat the purpose of this PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a tricky thing. If we go with 6 seconds here, anybody deploying without Serverless will suddenly see a huge timeout drop in their API Gateway. I know that if they're using API gateway they cannot have it bigger than 30 seconds. For ALB, the use case is much more limited and much more likely to be used by people aware of how AWS works.
Choosing a large amount because of API Gateway will cover the most common case and anybody having issues with 6 second timeout can always increase it to 30 seconds.

@deleugpn
Copy link
Member Author

deleugpn commented Dec 5, 2021

What I don't like about this change:

  • New Environment Variable added which increases configuration and maintenance surface
  • Only work on FPM
  • The Serverless Plugin is defining a BREF_FPM_TIMEOUT on ALL functions, not just functions using the FPM Layer.

@mnapoli
Copy link
Member

mnapoli commented Jan 8, 2022

🤦 somehow I mixed #770 and this PR.

So yeah #1133 builds on the solution explored here but without the environment variable/setter, all thanks an internal method of the FastCGI client library that isn't actually internal 🎉 : https://github.com/brefphp/bref/pull/1133/files#diff-9218ae2e7645f6996fa4502fbc2df967394f63138e768cf614c73875aa607047R121-R123

That allows us to set the timeout at runtime, based on the "context" object (instead of passing the timeout via an environment variable).

@mnapoli mnapoli added the bug label Jan 8, 2022
@mnapoli
Copy link
Member

mnapoli commented Jan 8, 2022

Thanks @deleugpn for exploring this lead!

@mnapoli mnapoli closed this Jan 8, 2022
@mnapoli mnapoli deleted the timeout branch January 8, 2022 17:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants