Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[release/7.0] Unload MsQuic after checking for QUIC support to free resources (#75163) #75330

Closed
wants to merge 1 commit into from

Conversation

rzikm
Copy link
Member

@rzikm rzikm commented Sep 9, 2022

Manual backport of #75163
Fixes #74629

cc: @wfurt

Customer Impact

When using HttpClient, we also check whether the running platform supports QUIC (to enable HTTP/3). However, the way we are checking QUIC support causes many threads to be allocated in the native MsQuic library (2* number of logical cores). This causes unnecessary resource increase even if the process does not end up using HTTP/3 at all (HTTP/3 is opt-in). This would therefore cause regression in memory usage when upgrading to .NET 7 in such cases.

The change builds on fix in msquic -- therefore it includes also update of Docker images to version with the right msquic version (2.1.1).

Affected platforms include Windows 11, Windows Server 2022, many Linux platforms with msquic package installed.

Note: The same problem exists also in 6.0 and we had 1 customer with ~50-core machine who noticed that and were surprised to see so many extra threads they didn't know about / they didn't need. At minimum it will help avoid confusion in such cases.

Testing

Functional tests suite passes as part of the CI, resource consumption was checked manually.

Risk

Low, the fix consists of gracefully unloading MsQuic library from the process after checking QUIC support. The library is reloaded only when actually needed.

…et#75163)

* Revert "Revert "Unload MsQuic after checking for QUIC support to free resources. (dotnet#74749)" (dotnet#74984)"

This reverts commit 953f524.

* update helix images

* update helix images

* Improve diagnostics when opening MsQuic

Co-authored-by: Radek Zikmund <radekzikmund@microsoft.com>
@dotnet dotnet deleted a comment from dotnet-issue-labeler bot Sep 9, 2022
@ghost
Copy link

ghost commented Sep 9, 2022

Tagging subscribers to this area: @dotnet/area-system-reflection-metadata
See info in area-owners.md if you want to be subscribed.

Issue Details

This reverts commit 953f524.

  • update helix images

  • update helix images

  • Improve diagnostics when opening MsQuic

Co-authored-by: Radek Zikmund radekzikmund@microsoft.com

Fixes Issue

main PR

Description

Customer Impact

Regression

Testing

Risk

Package authoring signed off?

IMPORTANT: If this change touches code that ships in a NuGet package, please make certain that you have added any necessary package authoring and gotten it explicitly reviewed.

Author: rzikm
Assignees: rzikm
Labels:

area-System.Reflection.Metadata, new-api-needs-documentation

Milestone: -

@ghost
Copy link

ghost commented Sep 9, 2022

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details

This reverts commit 953f524.

  • update helix images

  • update helix images

  • Improve diagnostics when opening MsQuic

Co-authored-by: Radek Zikmund radekzikmund@microsoft.com

Fixes Issue

main PR

Description

Customer Impact

Regression

Testing

Risk

Package authoring signed off?

IMPORTANT: If this change touches code that ships in a NuGet package, please make certain that you have added any necessary package authoring and gotten it explicitly reviewed.

Author: rzikm
Assignees: rzikm
Labels:

area-System.Net.Quic

Milestone: -

@rzikm rzikm requested a review from a team September 9, 2022 08:08
@karelz karelz added this to the 7.0.0 milestone Sep 9, 2022
@danmoseley
Copy link
Member

Approved -- significant customer impact, new feature, and hitting existing customers that don't use it. Would certainly service for it.

But, more eyes would be good at this point. @stephentoub would it be possible for you to review before we merge here?

}
finally
{
if (!IsQuicSupported)
// Unload the library, we will load it again when we actually use QUIC
NativeLibrary.Free(msQuicHandle);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought @jkotas had advised against actually unloading it. Did we decide it makes sense to anyway?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also note that this is not unloading it completely. The library for msquic tracing support stays around: #74710 (comment) .

Copy link
Member Author

@rzikm rzikm Sep 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can keep the same benefit without unloading the library from the process, we just need to save the IntPtr handle to a static field. The threads should all get deallocated by MsQuicClose. The dangling libmsquic.lttng.so is something I have not foreseen when writing this change.

return new MsQuicApi(apiTable);
}

ThrowHelper.ThrowIfMsQuicError(openStatus);
Copy link
Member

@stephentoub stephentoub Sep 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

throw ThrowHelper.GetExceptionForMsQuicStatus(openStatus);

and then remove the next line?

@karelz karelz added the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Sep 9, 2022
@karelz
Copy link
Member

karelz commented Sep 9, 2022

Yeah, I have suspicion that the mysterious memory corruption we are chasing may be caused by unloading msquic as @jkotas suggested above. Too high risk for not-so-large value. Let's hold off merging this PR. We will likely not take it into 7.0 at all.

@danmoseley
Copy link
Member

Closing per comments above.

@danmoseley danmoseley closed this Sep 9, 2022
@karelz
Copy link
Member

karelz commented Sep 12, 2022

FYI: The mysterious memory corruption was chased down to be unrelated problem in LTTNG triggered in our test environment - see #74795 for a test-only fix.

We have CR feedback addressed in #75441 -- calling MsQuicClose without unloading msquic DLL.

Given the churn in the area, I would like to see a little bit more bake time in main for the final changes. Once we have confidence it didn't break anything and is not triggering problems in LTTNG or elsewhere, I would be fine bringing it into 7.0. That will be already too late for GA, so we will leave it for servicing request if it confuses customers enough and if/when we get complains from customers.
@stephentoub @jkotas would you suggest something else?

@rzikm rzikm deleted the 75163-release-7.0 branch September 12, 2022 15:08
@stephentoub
Copy link
Member

stephentoub commented Sep 12, 2022

I would like to see this fixed for .NET 7; it is a regression from .NET 6 (I just tried with .NET 6 on Win11, and without setting the experimental HTTP/3 flag and just making a simple HttpClient request, msquic is never loaded and the extra threads aren't created) that pretty much every app is going to end up paying even if HTTP/3 is never used. I think it's fine to hold off for a few days or however long to ensure this actually addresses the problem without resulting in new instabilities, and it can be fixed post-RC2 for 7.0.0 or in servicing for 7.0.1.

@karelz
Copy link
Member

karelz commented Sep 12, 2022

Agreed in offline discussion. @rzikm feel free to reopen the PR and merge your PR-feedback. We will mark it NO-MERGE and decide how long we want to confirm it didn't cause any regression in main.

@stephentoub
Copy link
Member

That said, can we take another look at whether we can do more on the SocketsHttpHandler side to even avoid touching quic at all unless there's a demonstrated need? e.g. with the default version of a request being HTTP/1.1 and the default version policy of RequestVersionOrLower, couldn't we avoid initializing HTTP/3 support at all unless someone makes a request with either a different policy or explicitly requesting HTTP/3?

@jkotas
Copy link
Member

jkotas commented Sep 12, 2022

We will mark it NO-MERGE and decide how long we want to confirm it didn't cause any regression in main.

My preference would be to push the Quic fixes to release/7.0 sooner rather than later (e.g. after one day of bake time). Main is not getting a lot of scrutiny outside our CI system currently, so you are unlikely to get a lot more extra signal by waiting longer.

@rzikm
Copy link
Member Author

rzikm commented Sep 13, 2022

I was unable to reopen this PR, so I created new at #75521

@ghost ghost locked as resolved and limited conversation to collaborators Oct 13, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Net.Quic NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants