Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test_runner: cancel on termination #43549

Conversation

MoLow
Copy link
Member

@MoLow MoLow commented Jun 23, 2022

refs: #43505 (comment)
this PR marks all tests that did not end as canceled and reports when the process is terminated

CC @tniessen @benjamingr @nodejs/test_runner

@nodejs-github-bot nodejs-github-bot added needs-ci PRs that need a full CI run. test_runner Issues and PRs related to the test runner subsystem. labels Jun 23, 2022
@MoLow
Copy link
Member Author

MoLow commented Jun 23, 2022

Cc @cjihrig

@aduh95
Copy link
Contributor

aduh95 commented Jun 23, 2022

/cc @nodejs/test_runner

@Linkgoron Linkgoron added the request-ci Add this label to start a Jenkins CI on a PR. label Jun 26, 2022
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Jun 26, 2022
@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@MoLow MoLow force-pushed the test-runner-cancel-running-tests-on-terminate branch from c37ca32 to 9f56f6f Compare June 29, 2022 09:37
@benjamingr benjamingr added the request-ci Add this label to start a Jenkins CI on a PR. label Jun 29, 2022
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Jun 29, 2022
@nodejs-github-bot
Copy link
Collaborator

@benjamingr benjamingr added the request-ci Add this label to start a Jenkins CI on a PR. label Jun 30, 2022
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Jun 30, 2022
@nodejs-github-bot
Copy link
Collaborator

@MoLow MoLow force-pushed the test-runner-cancel-running-tests-on-terminate branch from ca3ad9a to db19e9a Compare July 3, 2022 07:02
@benjamingr benjamingr added request-ci Add this label to start a Jenkins CI on a PR. labels Jul 3, 2022
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Jul 3, 2022
@nodejs-github-bot
Copy link
Collaborator

@MoLow MoLow force-pushed the test-runner-cancel-running-tests-on-terminate branch from 7bf3ba6 to e50bff8 Compare July 3, 2022 10:03
aduh95 pushed a commit to aduh95/node-core-test that referenced this pull request Jul 9, 2022
PR-URL: nodejs/node#43549
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Reviewed-By: Nitzan Uziely <linkgoron@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
aduh95 pushed a commit to aduh95/node-core-test that referenced this pull request Jul 9, 2022
PR-URL: nodejs/node#43549
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Reviewed-By: Nitzan Uziely <linkgoron@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
targos pushed a commit that referenced this pull request Jul 12, 2022
PR-URL: #43549
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Reviewed-By: Nitzan Uziely <linkgoron@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
@tniessen
Copy link
Member

Sorry, I didn't have time to review this. Looks like I cannot cancel synchronous tests using SIGINT (ctrl+c) anymore with this change 😕

@MoLow
Copy link
Member Author

MoLow commented Jul 14, 2022

Sorry, I didn't have time to review this. Looks like I cannot cancel synchronous tests using SIGINT (ctrl+c) anymore with this change 😕

@tniessen can you help me reproduce? I assume your test is not blocking the eventloop?

@tniessen
Copy link
Member

I assume your test is not blocking the eventloop?

It is, that's why I said "synchronous tests". But without a SIGINT handler, it should still be possible to ctrl+c the script.

@MoLow
Copy link
Member Author

MoLow commented Jul 14, 2022

@tniessen I am not sure there is an easy way to allow both handling when the process is blocked and reporting the test tree on termination.
any solution I can think of (workers, vm) has it penalties

@tniessen
Copy link
Member

I am not sure there is an easy way to allow both handling when the process is blocked and reporting the test tree on termination.

@MoLow But "reporting the test tree on termination" does not seem to work either, at least not when the event loop is blocked. If I put while(1) in a test, the test_runner ignores SIGINT and SIGTERM with this PR and the new timeout option does not help either, so I have to forcefully terminate the process using SIGKILL, in which case reporting does not work.

any solution I can think of (workers, vm) has it penalties

Yes, my concern here is very similar to some of my concerns in #43490, which also seem difficult to address within the same application thread.

@MoLow
Copy link
Member Author

MoLow commented Jul 15, 2022

@tniessen IL give it a few more days of think.
My main concern is ending up with invalid tap output. Your example suffers from that as well.

Like other discussions, I'd be glad if it was configurable, Wich raises some questions about what the default should be and what would be the API for global configuration.

@benjamingr
Copy link
Member

One thing to explore is adding the handler in C++ land, but even if we do that the core issue is that the test JS and the test framework are running in the same context on the same isolate so a while(1) would freeze it (and all the objects of the report live there too).

I'm not sure the infinite loop case is worth porting the test runner logic to C++ or spinning up a worker either.

@MoLow
Copy link
Member Author

MoLow commented Jul 17, 2022

My biggest concern here is that an ungraceful exit will result in invalid/incomplete TAP output,
I think the default should be graceful exit is on - and add the ability to turn it off.
@tniessen does that sound ok for you?

@tniessen
Copy link
Member

I think the default should be graceful exit is on - and add the ability to turn it off.
@tniessen does that sound ok for you?

I don't think I would call the current behavior graceful given that it makes ctrl+c / SIGINT / kill / SIGTERM unreliable.

As a user, I think a test runner should be simple and reliable, and it should only interfere with my script and workflow where that is absolutely necessary or a clear improvement, at least by default. Fancy features are nice if and only if they are reliable and don't introduce flakiness or disturb my workflow (which is also why I was against short default timeouts in #43490). But that's just my personal opinion and there are far more knowledgeable people involved.

@Linkgoron
Copy link
Member

IMO if you ctrl+c I don't think Node should offer a "graceful exit" solution if it's at the cost of actually killing the process in some cases.

@MoLow
Copy link
Member Author

MoLow commented Jul 17, 2022

I don't think I would call the current behavior graceful given that it makes ctrl+c / SIGINT / kill / SIGTERM unreliable.

the alternative is making the TAP output unreliable - which I think is much more important and more users expect to be valid

@tniessen
Copy link
Member

My biggest concern here is that an ungraceful exit will result in invalid/incomplete TAP output,

I don't think I would call the current behavior graceful given that it makes ctrl+c / SIGINT / kill / SIGTERM unreliable.

the alternative is making the TAP output unreliable - which I think is much more important and more users expect to be valid

@MoLow Could you elaborate on this? When would not catching SIGINT / SIGKILL make TAP output unreliable in a way that is relevant? I happily use tape, which does not interfere with signals at all, as far as I know.

@MoLow
Copy link
Member Author

MoLow commented Jul 18, 2022

@tniessen here is an example:

./node <<EOF
const test = require('node:test');

test('finite test', () => {});
test('infinite test', (t, done) =>  setTimeout(done, 1000000000));
EOF

when hitting ctr+c on this program - it produces a valid TAP output only when the SIGINT handler is attached

BTW, this tape example also results in a truncated TAP output:

./node <<EOF
const test = require('tape');

test('finite test', (t) => t.end());

test('infinite test', async (t) => {
    t.plan(1)
    return new Promise(r => setTimeout(r, 1000000000))
});
EOF

@ljharb
Copy link
Member

ljharb commented Jul 18, 2022

I don’t think anyone has ever filed an issue on tape about this. If you hit control-C, why would you have any expectation of valid output? Test suites (the user code) aren’t generally cleanly cancellable anyways.

@MoLow
Copy link
Member Author

MoLow commented Jul 18, 2022

I don’t think anyone has ever filed an issue on tape about this. If you hit control-C, why would you have any expectation of valid output? Test suites (the user code) aren’t generally cleanly cancellable anyways.

it doesnt have to be ctr+c, it can also be a ci system killing the process

@ljharb
Copy link
Member

ljharb commented Jul 18, 2022

Right - I’m saying that valid output is only expected when the process completes normally.

@MoLow
Copy link
Member Author

MoLow commented Jul 18, 2022

Discussed this with @Linkgoron - I will open a PR to make this opt-in

@tniessen
Copy link
Member

tniessen commented Jul 24, 2022

the test JS and the test framework are running in the same context on the same isolate so a while(1) would freeze it

I just learned (thanks to #43941 (comment)) that the CLI test runner starts tests as child processes. That should allow us to work around this restriction (albeit not if a test runs as node my-test.js). This could allow useful timeouts and proper handling of signals.

Edit: this appears to have been implemented at least in part in #43977.

targos pushed a commit that referenced this pull request Jul 31, 2022
PR-URL: #43549
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Reviewed-By: Nitzan Uziely <linkgoron@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
guangwong pushed a commit to noslate-project/node that referenced this pull request Oct 10, 2022
PR-URL: nodejs/node#43549
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Reviewed-By: Nitzan Uziely <linkgoron@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. needs-ci PRs that need a full CI run. test_runner Issues and PRs related to the test runner subsystem.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants