Simplify diagnostics story for custom ALC #51069

Closed
davidfowl opened this issue Apr 11, 2021 · 13 comments
Labels
area-AssemblyLoader-coreclr, untriaged (new issue has not been triaged by the area owner)

Comments

@davidfowl
Member

Description

When issues like these come up, it's a game of whack-a-mole to figure out which objects from a custom ALC are rooted by objects in the default ALC. Can we do something low-tech to make this a bit easier? Maybe an SOS command.

Regression?

No

cc @vitek-karas @janvorli

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Apr 11, 2021
@dotnet-issue-labeler

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@ghost

ghost commented Apr 11, 2021

Tagging subscribers to this area: @vitek-karas, @agocke, @CoffeeFlux, @VSadov
See info in area-owners.md if you want to be subscribed.


@jkotas
Member

jkotas commented Apr 11, 2021

Can we do something low tech to make this a bit easier? Maybe an SOS command.

The SOS command for this is gcroot. Here is an example of how it can be used to investigate leaking ALCs: #31377 (comment)

Leaking ALCs are managed memory leaks, so most tutorials and tricks for debugging managed memory leaks are applicable too.

@davidfowl
Member Author

Yeah, I know, but the repro I have is extremely unreliable, so maybe it's a bug. I think it would be interesting to be able to identify which ALC an object came from using SOS.

@davidfowl
Member Author

Oh that thread helped. So LoaderAllocator is the magic I should be looking for.
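
For readers following along, a rough sketch of the SOS workflow this refers to, based on the unloadability doc linked later in the thread. The address is a placeholder rather than real output, and the `!` prefix assumes WinDbg (in dotnet-dump or lldb the commands are typed without it):

```text
# List the LoaderAllocator objects still on the managed heap.
# A collectible ALC that failed to unload leaves one behind.
!dumpheap -type LoaderAllocator

# Show every root that keeps a given LoaderAllocator alive.
!gcroot -all <address of a LoaderAllocator from the previous output>
```

The roots reported by gcroot point at whatever still holds the load context alive, such as a static field, a GC handle, or a frame on a thread's stack.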

@davidfowl
Member Author

Turns out event sources are really bad in this regard (probably all diagnostics, really). They're not self-contained enough to be unloadable.

@vitek-karas
Member

Just curious - do you think it happens to all event sources, or just those somebody is listening to? In theory we should be able to "do nothing" if nobody's listening... but maybe not.

Also - this should be fixable - we just need to make the references to event sources weak references (and handle the case where it goes away).
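
As a rough illustration of the weak-reference idea, here is a minimal sketch of a registry that tracks objects without keeping them alive. It is not the runtime's actual EventSource bookkeeping; `WeakRegistry`, `Register`, and `Snapshot` are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical registry: remembers registered objects through weak
// references, so registration alone never roots them.
sealed class WeakRegistry<T> where T : class
{
    private readonly List<WeakReference<T>> _entries = new List<WeakReference<T>>();

    public void Register(T item)
    {
        lock (_entries)
        {
            _entries.Add(new WeakReference<T>(item));
        }
    }

    // Returns the objects that are still alive and prunes entries whose
    // targets were collected (the "handle the case where it goes away" part).
    public List<T> Snapshot()
    {
        var live = new List<T>();
        lock (_entries)
        {
            _entries.RemoveAll(entry =>
            {
                if (entry.TryGetTarget(out var target))
                {
                    live.Add(target);
                    return false; // still alive, keep the entry
                }
                return true; // target is gone, drop the entry
            });
        }
        return live;
    }
}
```

Callers only hold strong references for as long as they use the list returned by `Snapshot`, so an object from a collectible ALC can still be collected between snapshots.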

@davidfowl
Member Author

I hope we can solve it in general. EventSources basically register with ETW or EventPipe and end up getting finalized. Now it makes sense why I was seeing this repro be unreliable, at least when I was trying to narrow it down. I'll try to make a repro today.

@janvorli
Member

@davidfowl are you aware of the following doc I have written on using and debugging unloadability? https://docs.microsoft.com/en-us/dotnet/standard/assembly/unloadability#debug-unloading-issues

@davidfowl
Member Author

Nope! That's super useful! Also, I figured out this issue; it's a pretty cool one 😄

@vitek-karas
Member

@davidfowl Can you please create a separate issue for the EventSource unloadability? (Since you have the data/samples)

@davidfowl
Member Author

Turns out EventSources weren't the problem. Well, I should say they get finalized eventually.

Here's the explanation: dotnet/aspnetcore#31637 (comment)

TL;DR, async stack unwinding will root things (which makes sense). So extra care is needed when you execute async code in another load context. It's super easy to keep things alive.

@davidfowl
Member Author

We should add this one to the doc. Basically the equivalent of ExecuteAndUnload, but the async version.
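
For reference, the synchronous ExecuteAndUnload pattern from the unloadability doc looks roughly like the sketch below. It is paraphrased from memory, not the doc's exact listing, and it assumes the loaded assembly has a `Main(string[])` entry point and that an absolute path to it is passed as the first argument. It does not attempt the async variant suggested here.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Runtime.Loader;

// Collectible load context. Returning null from Load defers
// dependency resolution to the default context.
class TestAssemblyLoadContext : AssemblyLoadContext
{
    public TestAssemblyLoadContext() : base(isCollectible: true) { }

    protected override Assembly? Load(AssemblyName name) => null;
}

static class Program
{
    // NoInlining keeps the strong references (alc, a, args) in this frame,
    // so none of them survive on the caller's stack after the method returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static void ExecuteAndUnload(string assemblyPath, out WeakReference alcWeakRef)
    {
        var alc = new TestAssemblyLoadContext();
        Assembly a = alc.LoadFromAssemblyPath(assemblyPath);

        // Only a weak reference escapes, so it can be used to observe the unload.
        alcWeakRef = new WeakReference(alc, trackResurrection: true);

        var args = new object[] { new string[] { "Hello" } };
        _ = a.EntryPoint?.Invoke(null, args);

        // Unload only initiates the unload; it completes once nothing roots the ALC.
        alc.Unload();
    }

    static void Main(string[] args)
    {
        ExecuteAndUnload(args[0], out WeakReference alcWeakRef);

        // Give the GC a bounded number of attempts to collect the context.
        for (int i = 0; alcWeakRef.IsAlive && i < 10; i++)
        {
            GC.Collect();
            GC.WaitForPendingFinalizers();
        }

        Console.WriteLine($"Unload success: {!alcWeakRef.IsAlive}");
    }
}
```

An async equivalent would need the same discipline: per the comment above, async state can root objects from the other load context, so only the weak reference should still be reachable once the awaited work has finished.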

@ghost ghost locked as resolved and limited conversation to collaborators May 12, 2021