[release/7.0] Unload MsQuic after checking for QUIC support to free resources (#75163) #75330
…et#75163)
* Revert "Revert "Unload MsQuic after checking for QUIC support to free resources. (dotnet#74749)" (dotnet#74984)"
  This reverts commit 953f524.
* update helix images
* update helix images
* Improve diagnostics when opening MsQuic
Co-authored-by: Radek Zikmund <radekzikmund@microsoft.com>
Tagging subscribers to this area: @dotnet/area-system-reflection-metadata
Tagging subscribers to this area: @dotnet/ncl
Approved -- significant customer impact, new feature, and hitting existing customers that don't use it. Would certainly service for it. But, more eyes would be good at this point. @stephentoub would it be possible for you to review before we merge here?
    }
    finally
    {
        if (!IsQuicSupported)
            // Unload the library, we will load it again when we actually use QUIC
            NativeLibrary.Free(msQuicHandle);
I thought @jkotas had advised against actually unloading it. Did we decide it makes sense to anyway?
Also note that this is not unloading it completely. The library for msquic tracing support stays around: #74710 (comment).
I think we can keep the same benefit without unloading the library from the process; we just need to save the IntPtr handle to a static field. The threads should all get deallocated by MsQuicClose. The dangling libmsquic.lttng.so is something I did not foresee when writing this change.
        return new MsQuicApi(apiTable);
    }

    ThrowHelper.ThrowIfMsQuicError(openStatus);
throw ThrowHelper.GetExceptionForMsQuicStatus(openStatus);
and then remove the next line?
Yeah, I have a suspicion that the mysterious memory corruption we are chasing may be caused by unloading msquic, as @jkotas suggested above. The risk is too high for a not-so-large value. Let's hold off merging this PR. We will likely not take it into 7.0 at all.
Closing per comments above.
FYI: The mysterious memory corruption was chased down to an unrelated problem in LTTNG triggered in our test environment - see #74795 for a test-only fix. We have CR feedback addressed in #75441 -- calling [...]. Given the churn in the area, I would like to see a little bit more bake time in main for the final changes. Once we have confidence that it didn't break anything and is not triggering problems in LTTNG or elsewhere, I would be fine bringing it into 7.0. That will already be too late for GA, so we will leave it for a servicing request if it confuses customers enough and if/when we get complaints from customers.
I would like to see this fixed for .NET 7. It is a regression from .NET 6: I just tried with .NET 6 on Win11, and without setting the experimental HTTP/3 flag, a simple HttpClient request never loads msquic and the extra threads aren't created. In .NET 7, pretty much every app is going to end up paying this cost even if HTTP/3 is never used. I think it's fine to hold off for a few days or however long to ensure this actually addresses the problem without resulting in new instabilities, and it can be fixed post-RC2 for 7.0.0 or in servicing for 7.0.1.
Agreed in offline discussion. @rzikm feel free to reopen the PR and merge your PR-feedback. We will mark it NO-MERGE and decide how long we want to confirm it didn't cause any regression in main.
That said, can we take another look at whether we can do more on the SocketsHttpHandler side to even avoid touching quic at all unless there's a demonstrated need? e.g. with the default version of a request being HTTP/1.1 and the default version policy of RequestVersionOrLower, couldn't we avoid initializing HTTP/3 support at all unless someone makes a request with either a different policy or explicitly requesting HTTP/3?
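To make the suggestion above concrete, here is a minimal sketch of the kind of check it describes. The `Http3Gate` class and `MightUseHttp3` method are hypothetical names for illustration only; this is not how SocketsHttpHandler is implemented, just one way the "is there a demonstrated need" test could look using the public `HttpRequestMessage.Version` and `VersionPolicy` properties.

```csharp
using System;
using System.Net;
using System.Net.Http;

// Hypothetical helper, not part of SocketsHttpHandler or this PR.
public static class Http3Gate
{
    // Returns true only when the request could end up on HTTP/3,
    // i.e. when probing for QUIC support would actually be useful.
    public static bool MightUseHttp3(HttpRequestMessage request)
    {
        // RequestVersionOrHigher allows an upgrade to HTTP/3
        // regardless of the version the request starts with.
        if (request.VersionPolicy == HttpVersionPolicy.RequestVersionOrHigher)
        {
            return true;
        }

        // RequestVersionOrLower and RequestVersionExact can only reach
        // HTTP/3 when the request itself asks for version 3.0 or above.
        return request.Version >= HttpVersion.Version30;
    }
}
```

With the defaults (HTTP/1.1 and RequestVersionOrLower) this returns false, so under this sketch a plain `new HttpRequestMessage()` would never trigger QUIC initialization.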
My preference would be to push the Quic fixes to release/7.0 sooner rather than later (e.g. after one day of bake time). Main is not getting a lot of scrutiny outside our CI system currently, so you are unlikely to get a lot more extra signal by waiting longer.
I was unable to reopen this PR, so I created a new one at #75521
Manual backport of #75163
Fixes #74629
cc: @wfurt
Customer Impact
When using HttpClient, we also check whether the running platform supports QUIC (to enable HTTP/3). However, the way we check QUIC support causes many threads to be allocated in the native MsQuic library (2 * number of logical cores). This increases resource usage unnecessarily even if the process never ends up using HTTP/3 (HTTP/3 is opt-in), and would therefore cause a regression in memory usage when upgrading to .NET 7 in such cases.
The change builds on a fix in msquic, so it also includes an update of the Docker images to a version with the right msquic version (2.1.1).
Affected platforms include Windows 11, Windows Server 2022, and many Linux platforms with the msquic package installed.
Note: The same problem also exists in 6.0, and we had 1 customer with a ~50-core machine who noticed it and was surprised to see so many extra threads they didn't know about and didn't need. At minimum, the fix will help avoid confusion in such cases.
Testing
The functional test suite passes as part of CI; resource consumption was checked manually.
Risk
Low. The fix consists of gracefully unloading the MsQuic library from the process after checking QUIC support. The library is reloaded only when actually needed.
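For readers unfamiliar with the pattern, the probe-then-unload idea can be sketched roughly as follows. This is an illustrative sketch only, not the code from this PR: `NativeProbe` and `IsAvailable` are hypothetical names, the sketch always frees the handle (whereas the diff above frees it only when QUIC is unsupported), and it omits the MsQuic-specific open/close calls. It relies only on the public `NativeLibrary.TryLoad`, `TryGetExport`, and `Free` APIs.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper for illustration; not the System.Net.Quic implementation.
public static class NativeProbe
{
    // Loads a native library, checks that it exposes the given export,
    // and unloads it again so nothing stays resident after the check.
    public static bool IsAvailable(string libraryName, string exportName)
    {
        if (!NativeLibrary.TryLoad(libraryName, out IntPtr handle))
        {
            return false; // library is not installed or cannot be loaded
        }

        try
        {
            return NativeLibrary.TryGetExport(handle, exportName, out _);
        }
        finally
        {
            // Unload after the check; a caller that really needs the
            // library would load it again at that point.
            NativeLibrary.Free(handle);
        }
    }
}
```

A caller might use it as `NativeProbe.IsAvailable("msquic.dll", "MsQuicOpenVersion")` on Windows; both strings are assumptions here and would differ per platform.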