Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return an error when script is not found rather than crash with SIGSEGV #49276

Closed

Conversation

brianathere
Copy link

Related to #43205

This change handles the case of the script not being found with an error and a message, rather than a SIGSEGV. Makes it easier to debug when this happens.

Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
have the right to submit it under the open source license
indicated in the file; or

(b) The contribution is based upon previous work that, to the best
of my knowledge, is covered under an appropriate open source
license and I have the right under that license to submit that
work with modifications, whether created in whole or in part
by me, under the same open source license (unless I am
permitted to submit under a different license), as indicated
in the file; or

(c) The contribution was provided directly to me by some other
person who certified (a), (b) or (c) and I have not modified
it.

(d) I understand and agree that this project and the contribution
are public and that a record of the contribution (including all
personal information I submit with it, including my sign-off) is
maintained indefinitely and may be redistributed consistent with
this project or the open source license(s) involved.

@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/loaders
  • @nodejs/modules
  • @nodejs/vm

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. needs-ci PRs that need a full CI run. labels Aug 21, 2023
@aduh95
Copy link
Contributor

aduh95 commented Aug 22, 2023

Can you add tests please?

@bnoordhuis
Copy link
Member

The fix doesn't look Obviously Correct(TM) to me. I'd summarize it as "do something different when the key is missing from the map" but why is the key missing in the first place?

@brianathere
Copy link
Author

Yes, I'll work up a test.

As to correctness. Yes, this doesn't fix why the item is missing from the map. When I researched that problem I found there are multiple reasons for why that may be, and see other tickets open for over. ayear while people have been working on those.

What lead me to this is to make it more ergonomic to discover why NodeJS is crashing when you encoutner this. In my case I'd imported some code into my project, and when using it in a certain way, it would SIGSEGV with zero context. As it was a large code base, I had no idea where to start looking for the issue. With this change, I received a nice clean backtrace in javascript at the call site where this happened, and it was very clear why it happened. I was then able to make a simple fix on the javascript side. (I'll try to use this as the basis of a test, the code was calling async import in a timer that was running after the module was unloaded, I believe).

I believe this fix will help reduce the number of place people experience SIGSEV with no context, and help find valid bugs or workaround in the javascript code they right.

@bnoordhuis
Copy link
Member

I sympathize but applying bandaids without understanding the underlying cause is not something we do.

@brianathere
Copy link
Author

Related tickets that would be easier to debug with this change:

#44211
#45449
#44219
#43205
#40058

@brianathere
Copy link
Author

Finally, while true that I don't see an easy path to correct the error, I believe it's better to report the error to the caller than to end the process.

As to this being a bandaid, I think that's a bit rude. In other sections of this same module, when an invariant has been violated, instead of crashing, the code returns an error to the caller. I'm following the same pattern, and I found it very helpful as the developer using this code to have the information. I see other users reporting similar issues, and believe with this information they would be able to find a workaround rather than file a ticket about a SIGSEGV. That's not a bandaid, it's a tool.

@arcanis
Copy link
Contributor

arcanis commented Aug 22, 2023

I sympathize but applying bandaids without understanding the underlying cause is not something we do.

It's not a bandaid if it still crashes though (it's not a workaround, just a clearer error) - and it avoids causing a SIGSEGV, which I imagine may have security implications.

@devsnek
Copy link
Member

devsnek commented Aug 22, 2023

didn't @joyeecheung have a proper fix for this?

@brianathere
Copy link
Author

I saw some work to fix a root cause of the script now being present, not sure if that was the only possible cause. I believe this change is safe and makes a nice step forward and is compatible with other work that closes cases where the script is missing.

@joyeecheung
Copy link
Member

joyeecheung commented Aug 23, 2023

#48510 should fix the issues related to module/script lifetime (it's pending a Node.js update in the V8 CI for the necessary V8 patch to pass the CI). Though I agree having a promise rejection is somewhat better than just crashing at SIGSEV. Personally though I would prefer doing

    auto it = env->id_to_script_map.find(id);
    CHECK_NE(it, env->id_to_script_map.end());

Similar to what we do for functions below, instead of rejecting a promise. That should also give you some useful information in the logs (usually, the source location of the assertion, or a message that helps you find the location of the assertion). An assertion certainly makes more sense than SIGSEV at a invalid pointer. A promise rejection less so, because I don't think you are going to see useful stack trace with a promise rejected from the C++ land and the rejection can be mishandled in the user land. If we do want to use a promise, I think this needs a test, you can borrow test/es-module/test-dynamic-import-script-lifetime.js in #48510

@brianathere
Copy link
Author

brianathere commented Aug 23, 2023

My personal experience, the promise rejection made it immediately clear which JavaScript code triggered the problem allowing me to quickly work around the issue. An assertion wouldn't have been helpful.

Excellent test, i will reuse it.

@joyeecheung
Copy link
Member

joyeecheung commented Aug 23, 2023

the promise rejection made it immediately clear which JavaScript code triggered the problem

Do you actually get a meaningful stack trace from this rejection though? Normally the stack trace isn't useful (or there isn't a stack at all, for the C++ land rejections), see: #49045

@brianathere
Copy link
Author

Yes, super useful trace to a closure that was attempting an async import from a timer. Exact smoking gun. Of course, other issues may not be as easy to catch.

@joyeecheung
Copy link
Member

I see, I guess in this case the promise is passed back to V8 so V8 would thread the stacks together. So a promise is probably more helpful then - though I am not sure how this helps users find a fix, what they need to do as a workaround is holding a reference to the script somewhere to ensure it outlive the potential import. Perhaps that should be noted in the error message, the downside though is that this is essentially trading crashing for leaking, so when the bug gets fixed, they would have an unnecessary leaky workaround in their code unless they remember to remove it..

@brianathere
Copy link
Author

In my case I had a simple change I could make to how the script was handled. Yes, a workaround, but easy.

Copy link
Member

@bnoordhuis bnoordhuis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proposed CHECK would be okay but this PR is not a proper solution, as explained above. Red X to prevent this from landing as-is.

@brianathere
Copy link
Author

The check is far less useful. See the rest of the thread of comments.

@brianathere
Copy link
Author

I'm going to drop this, then, since I don't see the point in writing a test to test an assertion asserts, and the maintainer is fine with a SIGSEGV. Leaving it open for now in case this helps others find a way to debug when Node crashes.

@bnoordhuis
Copy link
Member

The check is far less useful.

I think the point you're missing is that we maintainers (or at least: I do) strongly suspect that the missing map key is a bug in node, cause unknown.

Crashing in C++ land is appropriate because that hopefully points us to the root cause. Turning it into a runtime JS exception is not because a) users having to work around node bugs is never appropriate, and b) it doesn't help us get closer to the root cause.

I'm going to close this but thanks anyway for the PR.

@bnoordhuis bnoordhuis closed this Aug 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ Issues and PRs that require attention from people who are familiar with C++. needs-ci PRs that need a full CI run.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants