[EventPipe] Block EventPipeProvider Deletion for ongoing callbacks #106040

mdh1418 (Member) commented Aug 6, 2024

Fixes #80666

This PR aligns EventPipe's Unregister logic with ETW's by blocking EventPipe's DeleteProvider while callbacks are in flight, so that the GCHandle is not freed before a callback completes. (ETW has its own lock for ETW commands and callbacks.)

Our initial attempt to add a corresponding EventPipe lock showed that a lock should not be held while the callback is invoked, because doing so breaks concurrent-callback scenarios.
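As a rough illustration of the general hazard (not the specific failure hit in the first attempt), the sketch below contrasts invoking a callback while a lock is held with releasing the lock first. Everything here is hypothetical: `provider_lock`, `sample_callback`, and both `invoke_*` functions are made up for the example, and a pthread mutex stands in for whatever lock the runtime would use. The callback simply tries to take the same lock, as any concurrent callback or caller that needs it would.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical provider lock; a stand-in, not the runtime's actual lock. */
static pthread_mutex_t provider_lock = PTHREAD_MUTEX_INITIALIZER;

/* What the callback saw when it tried the lock: 0 if free, EBUSY if held. */
static int callback_lock_result;

/* A callback that needs the provider lock (assumption for illustration). */
static void sample_callback(void) {
    callback_lock_result = pthread_mutex_trylock(&provider_lock);
    if (callback_lock_result == 0)
        pthread_mutex_unlock(&provider_lock);
}

/* Anti-pattern: the lock is still held while the callback runs. */
int invoke_inside_lock(void) {
    pthread_mutex_lock(&provider_lock);
    sample_callback();
    pthread_mutex_unlock(&provider_lock);
    return callback_lock_result;
}

/* Alternative: snapshot what is needed under the lock, release, then invoke. */
int invoke_outside_lock(void) {
    void (*cb)(void);
    pthread_mutex_lock(&provider_lock);
    cb = sample_callback;   /* copy callback state while protected */
    pthread_mutex_unlock(&provider_lock);
    cb();                   /* run with no lock held */
    return callback_lock_result;
}
```

With the first shape the callback finds the lock busy, so anything else that needs it has to wait for the callback to finish; with the second shape the lock is free for the whole duration of the callback.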

In this PR, we track the EventPipeProvider's callbacks that have been prepared but not yet invoked (i.e. in-flight callbacks), and leverage a signal set/wait to block the EventPipe Provider's deferred deletion.
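A minimal sketch of the counter-plus-signal idea described above, assuming a pthread mutex and condition variable in place of the runtime's own signal primitive. The names `provider_t`, `callbacks_pending`, `callbacks_done`, `delete_requested`, `fake_handle_t`, and `run_demo` are illustrative and are not taken from ep-provider.c.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a provider; field names are illustrative only. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  callbacks_done;    /* signaled when pending drops to zero */
    unsigned        callbacks_pending; /* prepared but not yet completed */
    bool            delete_requested;  /* no new callbacks once set */
} provider_t;

/* Stand-in for the object that the GCHandle keeps reachable. */
typedef struct { int alive; int seen_alive; } fake_handle_t;

/* Prepare: count the callback as in flight unless deletion has begun. */
bool provider_prepare_callback(provider_t *p) {
    pthread_mutex_lock(&p->lock);
    bool ok = !p->delete_requested;
    if (ok) p->callbacks_pending++;
    pthread_mutex_unlock(&p->lock);
    return ok;
}

/* Invoke with no lock held, then decrement and wake any waiting deleter. */
void provider_invoke_callback(provider_t *p, void (*cb)(void *), void *data) {
    cb(data);
    pthread_mutex_lock(&p->lock);
    if (--p->callbacks_pending == 0)
        pthread_cond_broadcast(&p->callbacks_done);
    pthread_mutex_unlock(&p->lock);
}

/* Delete: block until every in-flight callback has completed. */
void provider_delete(provider_t *p) {
    pthread_mutex_lock(&p->lock);
    p->delete_requested = true;
    while (p->callbacks_pending > 0)
        pthread_cond_wait(&p->callbacks_done, &p->lock);
    pthread_mutex_unlock(&p->lock);
}

/* Callback: record whether the handle was still alive when it ran. */
static void record_handle_state(void *data) {
    fake_handle_t *h = data;
    h->seen_alive = h->alive;
}

typedef struct { provider_t *p; fake_handle_t *h; } worker_args_t;

static void *worker(void *arg) {
    worker_args_t *a = arg;
    provider_invoke_callback(a->p, record_handle_state, a->h);
    return NULL;
}

/* Returns 1 if the callback ran while the handle was still alive. */
int run_demo(void) {
    provider_t p;
    fake_handle_t h = { 1, -1 };
    worker_args_t args = { &p, &h };
    pthread_t t;

    pthread_mutex_init(&p.lock, NULL);
    pthread_cond_init(&p.callbacks_done, NULL);
    p.callbacks_pending = 0;
    p.delete_requested = false;

    if (!provider_prepare_callback(&p)) return -1;
    if (pthread_create(&t, NULL, worker, &args) != 0) return -1;

    provider_delete(&p); /* returns only after the callback has finished */
    h.alive = 0;         /* "free the handle" only now */

    pthread_join(t, NULL);
    pthread_cond_destroy(&p.callbacks_done);
    pthread_mutex_destroy(&p.lock);
    return h.seen_alive;
}
```

The callback itself runs outside the lock, so concurrent callbacks are not serialized; only the counter update and the wait are protected.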


Repro

Reproduced the crash by:

Console.WriteLine("MIHW Ready to create TestEventSource");
Console.ReadKey();
var testEventSource = new TestEventSource();
Console.WriteLine("MIHW Ready to dispose TestEventSource");
Console.ReadKey();
testEventSource.Dispose();
Console.WriteLine("MIHW TestEventSource disposed.");

[EventSource(Name = "TestEventSource")]
public class TestEventSource : EventSource
...
  1. Launching the above sample app in WinDbg
  2. Connecting an EventPipe session via dotnet-trace collect --providers TestEventSource -p <pid of app from dotnet-trace ps>
  3. Setting a breakpoint in ep-provider.c provider_invoke_callback at the callback invocation
  4. Closing the EventPipe session (Enter or Ctrl+C) to hit that breakpoint, then freezing the thread
  5. Continuing the application logic by disposing the EventSource
  6. Unfreezing the previously frozen thread and continuing

This resulted in a NullReferenceException crash.
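The ordering that the repro forces can be replayed deterministically in a small single-threaded model. This is a simplified sketch under stated assumptions, not the runtime's code: `fake_provider_t`, `prepared_callback_t`, and every function below are hypothetical, and a NULL pointer stands in for the freed GCHandle that the frozen callback later tries to use.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model: the provider holds a handle that the callback needs. */
typedef struct { const int *gchandle; } fake_provider_t;

/* A callback that has been prepared but not yet invoked (the frozen thread). */
typedef struct { fake_provider_t *provider; } prepared_callback_t;

/* Steps 3-4: the callback is prepared, then its thread is frozen. */
prepared_callback_t prepare_callback(fake_provider_t *p) {
    prepared_callback_t cb = { p };
    return cb;
}

/* Step 5: Dispose frees the handle without waiting for in-flight callbacks. */
void dispose_without_waiting(fake_provider_t *p) {
    p->gchandle = NULL;
}

/* Step 6: the thread resumes; -1 models the failure on a missing handle. */
int invoke_callback(prepared_callback_t cb) {
    return cb.provider->gchandle != NULL ? 0 : -1;
}

/* Replays steps 3-6 in order and returns the callback's result. */
int replay_unfixed_ordering(void) {
    int target = 42;
    fake_provider_t provider = { &target };
    prepared_callback_t cb = prepare_callback(&provider);
    dispose_without_waiting(&provider);
    return invoke_callback(cb);
}
```

In this model the callback finds the handle gone because nothing made Dispose wait for it.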

Testing

Performed the same steps as above with the changes in this PR; Dispose is now blocked until the callback completes.

jkotas (Member) commented Aug 6, 2024

Can you re-enable the disabled test?

jkotas (Member) commented Aug 6, 2024

The change LGTM, but I am not deeply familiar with EventPipe code.

dotnet deleted a comment from azure-pipelines bot Aug 6, 2024

jkotas (Member) commented Aug 6, 2024

/azp run runtime-coreclr outerloop

Azure Pipelines successfully started running 1 pipeline(s).

noahfalk (Member) left a comment

Few comments inline but mostly looks good!

src/native/eventpipe/ep-provider.c (outdated, resolved)
src/native/eventpipe/ep-provider.h (outdated, resolved)
src/native/eventpipe/ep.c (resolved)
lateralusX self-requested a review August 7, 2024 07:32

mdh1418 (Member, Author) commented Aug 7, 2024

/azp run runtime-coreclr outerloop

Azure Pipelines successfully started running 1 pipeline(s).

Rename counter
Add more comments describing the blocking behavior
Add comments for potential deadlock scenario
mdh1418 force-pushed the eventpipe_block_unregister_for_callbacks_counter_signal_impl branch from 7878c1d to d28ed8e August 7, 2024 17:32

mdh1418 (Member, Author) commented Aug 7, 2024

/azp run runtime-coreclr outerloop

Azure Pipelines successfully started running 1 pipeline(s).

mdh1418 (Member, Author) commented Aug 7, 2024

/ba-g The failing tests uncaught by build analysis are #104905 and #103347, not sure why build analysis didn't recognize the match in 103347.

mdh1418 merged commit 1d2e841 into dotnet:main Aug 7, 2024
199 of 210 checks passed
mdh1418 deleted the eventpipe_block_unregister_for_callbacks_counter_signal_impl branch August 7, 2024 21:15
github-actions bot locked and limited conversation to collaborators Sep 7, 2024
Successfully merging this pull request may close these issues.

tracing/eventpipe/eventsourceerror/eventsourceerror/eventsourceerror failure