Periodically seeing fatal Timeout error messages #3740
Comments
Looks a bit like a memory leak, and the error is just the final thing before the process dies, not the actual root cause.
I see that the versions provided in the thread are also relatively old (…)
I agree that this looks like a memory leak, but I don't know if we're leaking or if we're just in the stack when the crash happens. Does disabling OpenTelemetry prevent the leak/process crash? A minimal reproduction would be extremely helpful.
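One quick way to test that suggestion is to gate SDK startup behind an environment variable so the same build can run with tracing off. This is a minimal sketch under assumptions (a NodeSDK-based setup; the OTEL_ENABLED flag name is hypothetical), not the reporter's actual code:

// Hedged sketch: gate OpenTelemetry startup behind an env flag so the
// process can be run with tracing disabled to see if the crash stops.
// OTEL_ENABLED is a hypothetical flag name, not an official variable.
import { NodeSDK } from '@opentelemetry/sdk-node';

if (process.env.OTEL_ENABLED !== 'false') {
  const sdk = new NodeSDK({
    // exporter and instrumentation configuration would go here
  });
  sdk.start();
}
// If the crash disappears with OTEL_ENABLED=false, the SDK (or an
// exporter) is the likely source of the memory growth.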
Adding some extra info if it's helpful. We're seeing the same issue intermittently throughout the day with the following package versions:
I haven't had time to set up a minimal repro yet, but if I get some time, I'll try to get one together. We did not see these errors before setting up OpenTelemetry. They started the same day we turned it on. Some example stack traces:
@gblock0 a minimal repro is always welcome. 🙂 Are you having the same problem as OP (with it actually running into the heap limit)? Otherwise, it may be better to open another issue to keep things separate. 🤔 If it is not running out of memory, the problem could be that the server/collector is sometimes unavailable for export, or that exports occasionally time out (due to network issues, for instance). 🤔 If that is the case, and it's a network issue, and you'd just like to get rid of the logs, then there are two options:
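The two options themselves were not captured in this copy of the thread. Based on the surrounding discussion, a plausible reading is (1) raising the diag log level so transient export errors are not logged, and (2) giving exports a longer timeout. The sketch below uses real APIs from @opentelemetry/api and @opentelemetry/sdk-trace-base, but it is a reconstruction, not the maintainer's verbatim suggestion; the timeout value is illustrative.

import { diag, DiagConsoleLogger, DiagLogLevel } from '@opentelemetry/api';
import { BatchSpanProcessor } from '@opentelemetry/sdk-trace-base';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';

// Option 1 (assumed): lower the diag verbosity so transient export
// failures are no longer logged. DiagLogLevel.NONE silences diag output
// entirely; DiagLogLevel.ERROR would keep only errors.
diag.setLogger(new DiagConsoleLogger(), DiagLogLevel.NONE);

// Option 2 (assumed): allow exports more time before they are declared
// timed out. exportTimeoutMillis defaults to 30000 in sdk-trace-base.
const processor = new BatchSpanProcessor(new OTLPTraceExporter(), {
  exportTimeoutMillis: 60000, // illustrative value
});
// Register `processor` with your TracerProvider / NodeSDK as usual.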
Looking closer at the error messages, it looks like you're using an exporter that is not hosted in this repository but over at https://github.com/GoogleCloudPlatform/opentelemetry-operations-js
Thanks for the quick reply @pichlermarc! I don't think we are running out of memory, but I'll double check. Sorry for making things confusing. I'll dig into the diag logger some and take a look at the GCP exporter. Thanks for the pointers!
@laurencefass are you still seeing the same problem with a more recent version? A while ago we fixed a memory leak that sounds similar to this: #4115
@laurencefass closing this issue, as we can't reproduce it on the current release. If you find that this is still happening with the most recent versions, please open a new issue and link this one.
What happened?
Steps to Reproduce
I am running my OTel-enabled app and I am seeing fatal timeout errors during periods of no activity. See the error details below.
Expected Result
No fatal error.
Actual Result
This happens after a long but undefined period of inactivity.
2023-04-18 12:11:55 {"stack":"Error: Timeout\n at Timeout._onTimeout (/app/node_modules/@opentelemetry/sdk-trace-base/src/export/BatchSpanProcessorBase.ts:153:16)\n at listOnTimeout (internal/timers.js:557:17)\n at processTimers (internal/timers.js:500:7)","message":"Timeout","name":"Error"}
2023-04-18 12:12:49
2023-04-18 12:12:49 <--- Last few GCs --->
2023-04-18 12:12:49
2023-04-18 12:12:49 [5388:0xffff86fac380] 8046853 ms: Scavenge (reduce) 2038.5 (2044.7) -> 2038.3 (2046.7) MB, 5.8 / 0.0 ms (average mu = 0.128, current mu = 0.113) allocation failure
2023-04-18 12:12:49 [5388:0xffff86fac380] 8046863 ms: Scavenge (reduce) 2039.4 (2050.7) -> 2039.2 (2051.2) MB, 7.2 / 0.0 ms (average mu = 0.128, current mu = 0.113) allocation failure
2023-04-18 12:12:49 [5388:0xffff86fac380] 8046872 ms: Scavenge (reduce) 2040.2 (2045.2) -> 2040.0 (2048.2) MB, 7.3 / 0.0 ms (average mu = 0.128, current mu = 0.113) allocation failure
2023-04-18 12:12:49
2023-04-18 12:12:49
2023-04-18 12:12:49 <--- JS stacktrace --->
2023-04-18 12:12:49
2023-04-18 12:12:49 FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
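The GC log above shows the heap pinned near the ~2 GB default limit. A lightweight way to confirm steady growth during idle periods, sketched here as a general debugging aid rather than something suggested in the thread, is to sample process.memoryUsage() on a timer:

// Hedged debugging sketch: log heap usage once a minute so a slow leak
// becomes visible long before the V8 heap limit is reached.
const mb = (n: number) => (n / 1024 / 1024).toFixed(1);

setInterval(() => {
  const { heapUsed, heapTotal, rss } = process.memoryUsage();
  console.log(`heapUsed=${mb(heapUsed)}MB heapTotal=${mb(heapTotal)}MB rss=${mb(rss)}MB`);
}, 60000).unref(); // unref() keeps this timer from holding the process open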
Additional Details
OpenTelemetry Setup Code
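The reporter's actual setup code was in a collapsed section that is not captured here. For orientation only, a minimal setup consistent with the stack trace (BatchSpanProcessor) and the GCP exporter the maintainer identified might look like the following sketch; every detail of it is an assumption, not the original code.

// Hypothetical reconstruction, not the reporter's code.
import { NodeSDK } from '@opentelemetry/sdk-node';
import { BatchSpanProcessor } from '@opentelemetry/sdk-trace-base';
import { TraceExporter } from '@google-cloud/opentelemetry-cloud-trace-exporter';

const sdk = new NodeSDK({
  // BatchSpanProcessor appears in the reported stack trace; the GCP
  // exporter is inferred from the maintainer's comment above.
  spanProcessor: new BatchSpanProcessor(new TraceExporter()),
});
sdk.start();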
package.json
Relevant log output