Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make non-file URL support optional in XmlUrlResolver #83495

Closed
Tracked by #93172
MichalStrehovsky opened this issue Mar 16, 2023 · 11 comments · Fixed by #84169
Closed
Tracked by #93172

Make non-file URL support optional in XmlUrlResolver #83495

MichalStrehovsky opened this issue Mar 16, 2023 · 11 comments · Fixed by #84169
Assignees
Labels
api-approved API was approved in API review, it can be implemented area-System.Xml blocking Marks issues that we want to fast track in order to unblock other important work linkable-framework Issues associated with delivering a linker friendly framework
Milestone

Comments

@MichalStrehovsky
Copy link
Member

MichalStrehovsky commented Mar 16, 2023

Make non-file URL support optional in System.Xml

Currently some root-level XML APIs take URI as string as an input (type is string but they contain Uri in the name, i.e. inputUri and similar). Typically those APIs are used with file paths but also support URLs. By default URI handling for those APIs is handled by XmlUrlResolver. We have changed XML to not use XmlUrlResolver in many places already (i.e. some XML constructs allow specyfing links, i.e. DTD, schema) - we have switched them to use null resolver making this forbidden. Root level APIs are the only place we're still using XmlUrlResolver by default because it was a trade-off at the moment - since user explicitly asks to load something we decided that behavior was acceptable.

Per original request (in the bottom of this proposal) in trimming scenarios that causes us to pull in networking dependency during trimming causing those relatively rarer scenarios (path is the most common scenario) to increase size by around 3MB.

Proposal is to have default resolver controllable with switch which allows or not URLs (pick different resolver).

Usage

For trimming scenarios we will not resolve URLs by default - only file system resolving will be supported. To opt-in set System.Xml.XmlResolver.IsNetworkingEnabledByDefault to true with RuntimeHostConfigurationOption in your csproj.

For non-trimming scenarios switch is opt-out and URL resolution needs to be turned off (in concern for backward compatibility).

AppContext.SetSwitch("System.Xml.XmlResolver.IsNetworkingEnabledByDefault", false);

API proposal

Add following switch (other switches start with Switch.):

System.Xml.XmlResolver.IsNetworkingEnabledByDefault

with value true by default for non-trimming scenarios and false for trimming.

Add public API with new resolver - while this is not strictly necessary to unblock reported scenario it complements it and allows to not access network (and good trimming) also in other scenarios which are very common (i.e. validating XML with schema with references in the same folder etc.).

namespace System.Xml;

public abstract partial class XmlResolver
{
// Existing:
// public static System.Xml.XmlResolver ThrowingResolver { get; }
public static System.Xml.XmlResolver FileSystemResolver { get; }
// also note: replacement for XmlUrlResolver which is separate public class
}

Original request (by @MichalStrehovsky)

We should consider a feature switch port of dotnet/corefx#31510.

Experimentally stubbing out XmlDownloadManager.GetNonFileStreamAsync I get these results with Native AOT for a simple app that just does XDocument.Load("");:

Baseline: 7,531,520 bytes
Stubbed: 3,795,456 bytes

We would have a use case for it in crossgen2 and ILC because both use XML, but neither require non-filestream support. This currently drags in the entire network stack for no reason.

@MichalStrehovsky MichalStrehovsky added area-System.Xml linkable-framework Issues associated with delivering a linker friendly framework labels Mar 16, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Mar 16, 2023
@ghost
Copy link

ghost commented Mar 16, 2023

Tagging subscribers to this area: @dotnet/area-system-xml
See info in area-owners.md if you want to be subscribed.

Issue Details

We should consider a feature switch port of dotnet/corefx#31510.

Experimentally stubbing out XmlDownloadManager.GetNonFileStreamAsync I get these results with Native AOT for a simple app that just does XDocument.Load("");:

Baseline: 7,531,520 bytes
Stubbed: 3,795,456 bytes

We would have a use case for it in crossgen2 and ILC because both use XML, but neither require non-filestream support. This currently drags in the entire network stack for no reason.

Author: MichalStrehovsky
Assignees: -
Labels:

area-System.Xml, linkable-framework

Milestone: -

@MichalStrehovsky MichalStrehovsky added linkable-framework Issues associated with delivering a linker friendly framework and removed linkable-framework Issues associated with delivering a linker friendly framework labels Mar 16, 2023
@ghost
Copy link

ghost commented Mar 16, 2023

Tagging subscribers to 'linkable-framework': @eerhardt, @vitek-karas, @LakshanF, @sbomer, @joperezr
See info in area-owners.md if you want to be subscribed.

Issue Details

We should consider a feature switch port of dotnet/corefx#31510.

Experimentally stubbing out XmlDownloadManager.GetNonFileStreamAsync I get these results with Native AOT for a simple app that just does XDocument.Load("");:

Baseline: 7,531,520 bytes
Stubbed: 3,795,456 bytes

We would have a use case for it in crossgen2 and ILC because both use XML, but neither require non-filestream support. This currently drags in the entire network stack for no reason.

Author: MichalStrehovsky
Assignees: -
Labels:

area-System.Xml, untriaged, linkable-framework

Milestone: -

@krwq krwq added this to the 8.0.0 milestone Mar 16, 2023
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Mar 16, 2023
@eerhardt
Copy link
Member

Alternatively, what if we added a new API that didn’t do web requests and then we didn’t need a feature switch?

‘XDocument.LoadFromFile’

@stephentoub
Copy link
Member

stephentoub commented Mar 16, 2023

Alternatively, what if we added a new API that didn’t do web requests and then we didn’t need a feature switch?

‘XDocument.LoadFromFile’

If that worked, I'd be in favor of it; the less we need to rely on feature switches, the better. But unfortunately I don't think it's that simple. I expect you'll find that any use of XmlReaders, regardless of how they're constructed, will end up rooting that GetNonFileStreamAsync method. Someone would need to prototype reworking the innards to remove those choke points.

@MichalStrehovsky
Copy link
Member Author

Yes, the web stack is included even if the Stream overload of the constructor is called.

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Mar 31, 2023
@krwq
Copy link
Member

krwq commented Mar 31, 2023

I've created a draft PR with suggested approach: #84169

The approach is adding new resolver which resolves URIs with file scheme (accessible with XmlResolver.FileSystemResolver) and switch which changes so that we use it rather than XmlUrlResolver. In most situation we never do any resolving for XMLs but XmlUrlResolver is still used to resolve URIs for root level APIs, the switch changes so that we use file resolver instead and never access the network

@krwq krwq added api-ready-for-review API is ready for review, it is NOT ready for implementation blocking Marks issues that we want to fast track in order to unblock other important work labels Apr 3, 2023
@terrajobst
Copy link
Member

terrajobst commented Apr 4, 2023

Video

  • Let's do both:
    1. Let's investigate if we can refactor XML so that non-URL based use doesn't result in static dependencies of networking
    2. Let's add a big switch that allows to turn off all networking from XML
  • The default behavior
    • It's on by default for now
    • We should look into making it a breaking change to not allow networking by default, but we shouldn't tie it to trimming
    • This would be an extension of what we already did (i.e. not resolving DTDs)
  • Name:
    • We prefer IsNetworkingEnabledByDefault
    • We should name it like we'd name an API, which means we'd need a type were we'd put it
    • System.Xml.XmlResolver.IsNetworkingEnabledByDefault

@terrajobst terrajobst added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review API is ready for review, it is NOT ready for implementation labels Apr 4, 2023
@ghost ghost added in-pr There is an active PR which will close this issue when it is merged and removed in-pr There is an active PR which will close this issue when it is merged labels May 4, 2023
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jun 4, 2023
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Jul 17, 2023
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jul 18, 2023
@MichalStrehovsky
Copy link
Member Author

Reopening to track SDK and docs work (feel free to close if this is tracked elsewhere and sorry for the noise in that case). This would be hard/impossible to use without that:

@eiriktsarpalis
Copy link
Member

Reopening to track SDK and docs work

It's unlikely this will be actioned upon unless follow-up issues are opened in the relevant repos. Closing this one and opening new issues there.

@MichalStrehovsky
Copy link
Member Author

Reopening to track SDK and docs work

It's unlikely this will be actioned upon unless follow-up issues are opened in the relevant repos. Closing this one and opening new issues there.

If this is to get the issue off of the team's 8.0 query, it accomplishes it. But note that adding docs+exposing this from the SDK side has been traditionally done by the feature team.

Filing issues in the docs/sdk repo is usually a pointless exercise because if you look at the repo history, there are many issues going years without triage/milestone. It is the issue graveyard and nobody is on point to fix these for you. I don't believe they are included in any product queries. So if the Xml team actually wants customers to benefit from this, I would suggest tracking it as 8.0 here.

@eiriktsarpalis
Copy link
Member

But note that adding docs+exposing this from the SDK side has been traditionally done by the feature team.

Filing issues on the docs repo is something that has worked for us in the past, even in cases where the changes were being made by the area owners.

Filing issues in the docs/sdk repo is usually a pointless exercise because if you look at the repo history, there are many issues going years without triage/milestone. It is the issue graveyard and nobody is on point to fix these for you.

Fair. I'll move that issue back to runtime but no promises we'll get this done in time for RC1.

@ghost ghost locked as resolved and limited conversation to collaborators Sep 2, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.Xml blocking Marks issues that we want to fast track in order to unblock other important work linkable-framework Issues associated with delivering a linker friendly framework
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants