Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adapt System.CommandLine (v2) in more places #72082

Merged
merged 18 commits into from
Sep 23, 2022

Conversation

am11
Copy link
Member

@am11 am11 commented Jul 13, 2022

Continuation of #66929; switches ilc and crossgen2 to System.CommandLine.

In 4731fdb, CG2 was switched from System.CommandLine (v1) to an internal CommandLine utility. Since then, the performance and memory consumption in System.CommandLine has improved quite a lot (in the effort to make it AoT friendly). SDK and some tools in runtime repo are already using System.CommandLine v2 beta4.

I did some quick testing in bash by running System.Private.CoreLib's crossgen'ing common (captured from clr subset build):

$ cat cmd.sh
/runtime/dotnet.sh \
  /runtime/artifacts/bin/coreclr/Linux.x64.Debug/crossgen2/crossgen2.dll \
  -o:/runtime/artifacts/bin/coreclr/Linux.x64.Debug/System.Private.CoreLib.dll \
  -r:/runtime/artifacts/bin/coreclr/Linux.x64.Debug/IL/*.dll \
  --targetarch:x64 -O --verify-type-and-field-layout \
  /runtime/artifacts/bin/coreclr/Linux.x64.Debug/IL/System.Private.CoreLib.dll \
  --perfmap-format-version:1 --perfmap --perfmap-path:/runtime/artifacts/bin/coreclr/Linux.x64.Debug/

then used GNU's time command to get process-level stats (10 iterations):

$ command time -v sh -c 'for i in $(seq 10); do sh cmd.sh; done'

Results:

Before After
  User time (seconds): 1382.99
  System time (seconds): 34.68
  Percent of CPU this job got: 493%
  Elapsed (wall clock) time (h:mm:ss or m:ss): 4:47.05
  Average shared text size (kbytes): 0
  Average unshared data size (kbytes): 0
  Average stack size (kbytes): 0
  Average total size (kbytes): 0
  Maximum resident set size (kbytes): 1522000
  Average resident set size (kbytes): 0
  Major (requiring I/O) page faults: 30242
  Minor (reclaiming a frame) page faults: 4960009
  Voluntary context switches: 371025
  Involuntary context switches: 324012
  Swaps: 0
  File system inputs: 5528
  File system outputs: 372160
  Socket messages sent: 0
  Socket messages received: 0
  Signals delivered: 0
  Page size (bytes): 4096
  Exit status: 0
  User time (seconds): 1386.69
  System time (seconds): 34.94
  Percent of CPU this job got: 490%
  Elapsed (wall clock) time (h:mm:ss or m:ss): 4:49.87
  Average shared text size (kbytes): 0
  Average unshared data size (kbytes): 0
  Average stack size (kbytes): 0
  Average total size (kbytes): 0
  Maximum resident set size (kbytes): 1496544
  Average resident set size (kbytes): 0
  Major (requiring I/O) page faults: 30658
  Minor (reclaiming a frame) page faults: 5020581
  Voluntary context switches: 393065
  Involuntary context switches: 340019
  Swaps: 0
  File system inputs: 232
  File system outputs: 372160
  Socket messages sent: 0
  Socket messages received: 0
  Signals delivered: 0
  Page size (bytes): 4096
  Exit status: 0

speed is pretty much in noise range; memory and I/O has slightly improved as we are now accessing options lazily from ParseResult (in most cases).


The last tool on internal plan is dotnet-pgo. If this PR gets accepted, I will work on that and delete the internal utility in a follow-up PR.

@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Jul 13, 2022
@MichalStrehovsky
Copy link
Member

Do you have stats for crossgenning a hello world? The main problem with S.CommandLine was startup time. CoreLib is a huge assembly and not very representative of what we're crossgenning normally. Most assemblies are a lot smaller than CoreLib. Last time we were looking at S.CommandLine, the startup overhead for compiling hello world was like 25%.

@am11
Copy link
Member Author

am11 commented Jul 13, 2022

@MichalStrehovsky, I was motivated by aot related improvements made in System.CommandLine. The speed overhead is becoming insignificant.

helloworld script:

$ /runtime/dotnet.sh \
  /runtime/artifacts/bin/coreclr/Linux.x64.Release/crossgen2/crossgen2.dll \
  -o:./helloworld.ni.dll \
  -r:/runtime/artifacts/bin/coreclr/Linux.x64.Release/IL/*.dll \
  --targetarch:x64 -O --verify-type-and-field-layout \
  /helloworld/bin/Release/net7.0/helloworld.dll \
  --perfmap-format-version:1 --perfmap --perfmap-path:/tmp/

Results from 100 iterations:

BeforeAfter
  User time (seconds): 42.80
  System time (seconds): 9.66
  Percent of CPU this job got: 100%
  Elapsed (wall clock) time (h:mm:ss or m:ss): 0:52.21
  Average shared text size (kbytes): 0
  Average unshared data size (kbytes): 0
  Average stack size (kbytes): 0
  Average total size (kbytes): 0
  Maximum resident set size (kbytes): 128484
  Average resident set size (kbytes): 0
  Major (requiring I/O) page faults: 203893
  Minor (reclaiming a frame) page faults: 2864972
  Voluntary context switches: 16773
  Involuntary context switches: 3897
  Swaps: 0
  File system inputs: 4440
  File system outputs: 1616
  Socket messages sent: 0
  Socket messages received: 0
  Signals delivered: 0
  Page size (bytes): 4096
  Exit status: 0
  User time (seconds): 43.60
  System time (seconds): 10.49
  Percent of CPU this job got: 100%
  Elapsed (wall clock) time (h:mm:ss or m:ss): 0:53.99
  Average shared text size (kbytes): 0
  Average unshared data size (kbytes): 0
  Average stack size (kbytes): 0
  Average total size (kbytes): 0
  Maximum resident set size (kbytes): 131844
  Average resident set size (kbytes): 0
  Major (requiring I/O) page faults: 210921
  Minor (reclaiming a frame) page faults: 2916726
  Voluntary context switches: 16441
  Involuntary context switches: 3525
  Swaps: 0
  File system inputs: 0
  File system outputs: 1600
  Socket messages sent: 0
  Socket messages received: 0
  Signals delivered: 0
  Page size (bytes): 4096
  Exit status: 0

@@ -0,0 +1,289 @@
// Licensed to the .NET Foundation under one or more agreements.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is derived from Common/CommandLine/CommandLineHelpers.cs. The main difference is that it doesn't depend on anything from Common/CommandLine/* and LINQ. The MakeReproPackage method in this file uses System.CommandLine's ParseResult instead of Internal.CommandLine's ArgumentSyntax.

In a follow-up PR, I will delete src/coreclr/tools/Common/CommandLine/* after switching dotnet-pgo (its last consumer) to System.CommandLine. ILVerify will also be switched to use this helper during that work (without the need for <DefineConstants>ILVerify, #if ILVERIFY).

@am11
Copy link
Member Author

am11 commented Jul 13, 2022

cc @jkotas, @MichalStrehovsky, PTAL. Diffs without whitespace change will make it a bit easier to review. I switched to use file level namespace in Program.cs to decrease the overall indentation,

ilc: --insructionset, crossgen2: --instruction-set

Should we align them?

@jkotas
Copy link
Member

jkotas commented Jul 13, 2022

I switched to use file level namespace in Program.cs to decrease the overall indentation,

That should be separate change. We want to be consistent across the whole repro - either agree to do it everywhere or keep it as is.

@@ -14,6 +14,7 @@ public enum TargetArchitecture
Unknown,
ARM,
ARM64,
ARMEL,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this used for?

This would mean that all places the check for TargetArchitecture.ARM have to check for TargetArchitecture.ARMEL too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is used for (easier) strongly typed binding. In Program.ctor, it overrides ARMEL -> ARM since the rest of the implementation expects it with a flag to indicate if ARMEL ABI is to be used.

Alternatively, I can use the old approach to capture the flag during parsing (it is already using custom step in call to Option<T> ctor for --targetarch).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would be better - since we do not expect this value to be valid, except in command line parsing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW #43706 (comment) is the previous conversation on why we don't have TargetArchitecture.Armel.

@jkotas
Copy link
Member

jkotas commented Jul 13, 2022

ilc: --insructionset, crossgen2: --instruction-set

Yes, I think it makes sense to align these.

@am11
Copy link
Member Author

am11 commented Jul 13, 2022

We want to be consistent across the whole repro

Sure, I will revert it. git grep '^namespace.*;' src/coreclr/tools/aot on main shows that only ReadyToRunDiagnosticsConstants.cs is using file namespace.

Yes, I think it makes sense to align these.

I will update ILC to use --instruction-set, which is typically used in multi-word arg names.

@jkotas
Copy link
Member

jkotas commented Jul 13, 2022

@trylek @mangod9 Are you ok with taking this conversion for .NET 7 or would you prefer to wait until we fork for .NET 8?

@trylek
Copy link
Member

trylek commented Jul 13, 2022

No pushback from me, the perf results look reasonable, I believe we can take it for .NET 7.

@mangod9
Copy link
Member

mangod9 commented Jul 13, 2022

perhaps we should run the crossgen2 outerloop CI to ensure it doesn't regress anything?

@am11 am11 force-pushed the feature/deps/system.commandline_usage branch from a0112eb to f896f93 Compare July 13, 2022 22:43
@MichalStrehovsky
Copy link
Member

@MichalStrehovsky, I was motivated by aot related improvements made in System.CommandLine. The speed overhead is becoming insignificant.

So now it's about 3-4% slower E2E than with Internal.CommandLine. It's probably not blocking anymore. I would certainly not mind if it was faster though - Internal.CommandLine has similar ergonomics and feels like a fair benchmark.

For completeness, here's the size:

crossgen2 trimmed, compiled with R2R: 33,472,412 bytes -> 34,388,937 bytes (2.7% regression)
crossgen2 compiled with NativeAOT: 19,021,529 bytes -> 19,891,963 bytes (4.6% regression)

Not blocking either; could be better.

Cc @jonsequitur @adamsitnik

@jkotas
Copy link
Member

jkotas commented Jul 14, 2022

1MB worth of code just to parse the command line the same way as before feels like a lot.

@am11
Copy link
Member Author

am11 commented Jul 14, 2022

1MB worth of code just to parse the command line the same way as before feels like a lot.

Isn't it 0.91 MB (R2R) and 0.87 MB (AoT) of difference?

@jkotas
Copy link
Member

jkotas commented Jul 14, 2022

Yes, I was rounding it. (Also, the existing lightweight parser is some code too.)

@am11
Copy link
Member Author

am11 commented Jul 14, 2022

Ah, I am sorry. I was reading it on a smaller screen and read it as eleven instead of one 😅

I agree that we should continue to improve System.CommandLine (size, memory and speed), following the footsteps of @adamsitnik. 😎

@MichalStrehovsky
Copy link
Member

@MichalStrehovsky Do you know what would need to be fixed to get this 246 kB of satellite assemblies trimmed? crossgen2 is not localized so it doesn't make sense for it to be carrying this.

I don't know much about how satellite assemblies work - would it be sufficient to have a mode in illink that simply doesn't copy these over?

illink currently copies all of these over if a ResourceManager..ctor call is seen.

I've also filed dotnet/command-line-api#1800 in the command line repo to make the satellite assemblies smaller. 5% of the size of these assemblies is a pompous assembly description that doesn't need to be there.

@ViktorHofer
Copy link
Member

I don't know much about how satellite assemblies work - would it be sufficient to have a mode in illink that simply doesn't copy these over?

In case you want to avoid pulling in the extra satellite assemblies and just a default locale, you can set the following in the project file: <SatelliteResourceLanguages>en-US</SatelliteResourceLanguages>

https://docs.microsoft.com/en-us/dotnet/core/project-sdk/msbuild-props#satelliteresourcelanguages

@jkotas
Copy link
Member

jkotas commented Jul 19, 2022

In case you want to avoid pulling in the extra satellite assemblies and just a default locale, you can set the following in the project file: en-US

Nice! #72478

@AntonLapounov
Copy link
Member

Looks like that System.CommandLine exposed a bug in the runtime. Build OSX x64 release Runtime_Debug is crashing with:

 Emitting R2R PE file: /Users/runner/work/1/s/artifacts/obj/Microsoft.NETCore.App.Crossgen2/Release/net7.0/osx-x64/S.P.C.tmp
  
  Assert failure(PID 93502 [0x00016d3e], Thread: 279072 [0x44220]): SyncTableEntry::GetSyncTableEntry()[sbIndex].m_Object == obj
      File: /Users/runner/work/1/s/src/coreclr/vm/syncblk.cpp Line: 2083
      Image: /Users/runner/work/1/s/artifacts/bin/coreclr/OSX.x64.Release/crossgen2/osx-x64/publish/crossgen2

I just hit the same assertion failure running crossgen2 with GCStress=3 on Windows ARM64 (without this PR).

@trylek
Copy link
Member

trylek commented Aug 25, 2022

@AntonLapounov - is the failure deterministic? I suspect it's worth investigating, I'm just wondering how hard it is to repro.

@AntonLapounov
Copy link
Member

It took more than one day on my old ARM64 laptop to hit this. I have a dump if anyone is interested.

@MichalStrehovsky
Copy link
Member

One more thing I noticed is that if the root command has arguments (like in this case the list of input files), System.CommandLine will send all unmatched command line switches as the file names instead of erroring out. I don't know if there's a way not to do that.

So crossgen2.exe input1.dll some_path\*.dll --SomeSwitchNameWithATypo would treat --SomeSwitchNameWithATypo as a file name and we will crash with some nonsensical stack.

The Internal.CommandLine parser that we use right now will instead tell the user that --SomeSwitchNameWithATypo is an unknown switch.

@am11
Copy link
Member Author

am11 commented Aug 27, 2022

I think we shouldn't care about user experience of crossgen2.exe any more than dotnet.exe (that uses System.CommandLine and has much vaster public usage).

Of course if we can configure the behavior of the open ended args, that would be nice.

To wit:

$ ~/.dotnet8/dotnet --foo
The command could not be loaded, possibly because:
  * You intended to execute a .NET application:
      The application '--foo' does not exist.
  * You intended to execute a .NET SDK command:
      No .NET SDKs were found.

@am11
Copy link
Member Author

am11 commented Sep 23, 2022

Resolved merge conflicts and unified --parallelism defaults (ported https://github.com/dotnet/runtime/pull/74956/files#diff-58625e1414cc2e244a276695b0572cc258c7ac4697f8e4326614bd2745fc27aa to ilc).

@jkotas, tree is opened for .NET 8, should we get this in?

@jkotas jkotas added the blocked Issue/PR is blocked on something - see comments label Sep 23, 2022
@jkotas
Copy link
Member

jkotas commented Sep 23, 2022

There are ongoing discussions about the future of System.CommandLine. I would prefer for to wait for those discussions to come to conclusion.

@am11
Copy link
Member Author

am11 commented Sep 23, 2022

ILVerify, R2Rdump etc. are depending on the same SystemCommandLine package. Getting this in will make it easier to upgrade all dependents to System.CommandLine's future release, when it's time to update the package.

tbh, merge conflicts in this PR are not fun to resolve.

@jonsequitur
Copy link
Contributor

One more thing I noticed is that if the root command has arguments (like in this case the list of input files), System.CommandLine will send all unmatched command line switches as the file names instead of erroring out. I don't know if there's a way not to do that.

It should be able to handle this, @MichalStrehovsky. I couldn't reproduce the issue. Configuring more specific expected types and adding validators might help.

If the parser is throwing an exception, that's unambiguously a bug. Do you have a stack trace for this issue?

@jkotas
Copy link
Member

jkotas commented Sep 23, 2022

It should be able to handle this, @MichalStrehovsky. I couldn't reproduce the issue.

dotnet tool install dotnet-ilverify --global --version 7.0.0-rc.1.22426.10`
ilverify --referenci c:\Windows\Microsoft.NET\Framework\v4.0.30319\mscorlib.dll

It prints Error: No files matching --referenci. It would be better if it printed Error: Bad option --referenci.

@jkotas
Copy link
Member

jkotas commented Sep 23, 2022

@jonsequitur I have opened dotnet/command-line-api#1860 on this

@jkotas
Copy link
Member

jkotas commented Sep 23, 2022

Getting this in will make it easier to upgrade all dependents to System.CommandLine's future release, when it's time to update the package.

It is not necessary correct. It can be more work to update the package if there are breaking changes.

I agree with that there are number of tools depending on System.CommandLine, and having two more is not going to change the update problem much. So let's get this merged. There is a lot of runway in .NET 8 to get the issues sorted out.

Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@jkotas jkotas merged commit d0559e6 into dotnet:main Sep 23, 2022
jkotas added a commit that referenced this pull request Sep 26, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Oct 27, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-crossgen2-coreclr blocked Issue/PR is blocked on something - see comments community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants