Improve perf of ActivatorUtilities.CreateInstance() #91290

Merged
merged 10 commits on Sep 11, 2023

steveharter
Member

@steveharter steveharter commented Aug 29, 2023

Fixes #91186

We should consider porting to v8 since this addresses a ~3x perf regression (or more) in certain cases.

This recovers the performance lost to the v8 regressions in ActivatorUtilities.CreateInstance, plus a lot more. The regressions were caused by the new v8 "keyed services" feature and are most pronounced when a service's constructor takes many other services as parameters.

The optimizations include caching of ConstructorInfo-related state, in order of perf gain:

  • The values from [FromKeyedServicesAttribute] attribute lookup (the v8 regression)
  • Caching of Type.GetConstructors() along with each ConstructorInfo.GetParameters() result
  • The values from [ActivatorUtilitiesConstructorAttribute] attribute lookup.
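
The caching idea in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `KeyAttribute`, `CachedConstructor`, and `ConstructorCache` are hypothetical names, and `KeyAttribute` is a local stand-in for `[FromKeyedServices]` so that the sketch compiles without extra packages. It assumes a per-type cache is acceptable and ignores concerns such as collectible assemblies.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;
using System.Reflection;

// Hypothetical stand-in for [FromKeyedServices]; defined locally so the sketch is self-contained.
[AttributeUsage(AttributeTargets.Parameter)]
public sealed class KeyAttribute : Attribute
{
    public KeyAttribute(object key) => Key = key;
    public object Key { get; }
}

// Per-constructor state that would otherwise be recomputed on every CreateInstance call.
public sealed class CachedConstructor
{
    public CachedConstructor(ConstructorInfo ctor)
    {
        Info = ctor;
        Parameters = ctor.GetParameters(); // reflection call made once per constructor
        Keys = new object?[Parameters.Length];
        for (int i = 0; i < Parameters.Length; i++)
        {
            // Attribute lookup made once per parameter instead of once per call.
            Keys[i] = Parameters[i].GetCustomAttribute<KeyAttribute>()?.Key;
        }
    }

    public ConstructorInfo Info { get; }
    public ParameterInfo[] Parameters { get; }
    public object?[] Keys { get; } // null entry means "no key on this parameter"
}

public static class ConstructorCache
{
    private static readonly ConcurrentDictionary<Type, CachedConstructor[]> s_cache = new();

    // Returns the cached constructor metadata for a type, computing it on first use.
    public static CachedConstructor[] Get(Type type) =>
        s_cache.GetOrAdd(type, static t =>
        {
            ConstructorInfo[] ctors = t.GetConstructors();
            var result = new CachedConstructor[ctors.Length];
            for (int i = 0; i < ctors.Length; i++)
            {
                result[i] = new CachedConstructor(ctors[i]);
            }
            return result;
        });
}
```

With a cache like this, repeated activations of the same type pay only a dictionary lookup for metadata, which is consistent with the gain growing as the constructor takes more parameters.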

Benchmarks from issue #91186: ~2.1x-4.5x faster than 8.0 and ~1.3x-2.4x faster than 7.0 and with fewer allocs

7.0

| Method | Mean | Error | StdDev | Median | Min | Max | Gen0 | Allocated |
|---|---|---|---|---|---|---|---|---|
| CreateA | 147.9 ns | 1.07 ns | 0.95 ns | 148.1 ns | 146.0 ns | 149.3 ns | 0.0198 | 208 B |
| CreateB | 249.1 ns | 2.42 ns | 2.15 ns | 249.7 ns | 245.1 ns | 251.8 ns | 0.0405 | 432 B |
| CreateC | 566.5 ns | 2.64 ns | 2.21 ns | 567.0 ns | 560.4 ns | 569.2 ns | 0.0610 | 648 B |

8.0

| Method | Mean | Error | StdDev | Median | Min | Max | Gen0 | Allocated |
|---|---|---|---|---|---|---|---|---|
| CreateA | 134.5 ns | 1.13 ns | 1.00 ns | 134.5 ns | 133.2 ns | 136.2 ns | 0.0196 | 208 B |
| CreateB | 473.9 ns | 6.06 ns | 5.06 ns | 471.3 ns | 467.9 ns | 481.4 ns | 0.0386 | 456 B |
| CreateC | 1,929.6 ns | 28.26 ns | 25.05 ns | 1,927.4 ns | 1,895.7 ns | 1,978.6 ns | 0.0484 | 600 B |

This PR

| Method | Mean | Error | StdDev | Median | Min | Max | Gen0 | Allocated |
|---|---|---|---|---|---|---|---|---|
| CreateA | 63.68 ns | 1.372 ns | 1.580 ns | 62.95 ns | 61.99 ns | 67.40 ns | 0.0167 | 176 B |
| CreateB | 146.39 ns | 2.143 ns | 1.789 ns | 145.91 ns | 143.86 ns | 150.72 ns | 0.0372 | 392 B |
| CreateC | 424.74 ns | 1.828 ns | 1.621 ns | 425.22 ns | 420.82 ns | 427.47 ns | 0.0473 | 496 B |

Existing benchmarks: CreateInstance_5 is ~3.5x faster than 8.0 and ~3.3x faster than 7.0, with half the allocs

7.0

| Method | Mean | Error | StdDev | Median | Min | Max | Gen0 | Allocated |
|---|---|---|---|---|---|---|---|---|
| GetService_0Injected | 22.12 ns | 0.483 ns | 0.536 ns | 21.82 ns | 21.62 ns | 23.13 ns | 0.0023 | 24 B |
| GetService_1Injected | 21.55 ns | 0.464 ns | 0.456 ns | 21.38 ns | 21.09 ns | 22.61 ns | 0.0023 | 24 B |
| GetService_2Injected | 22.56 ns | 0.328 ns | 0.274 ns | 22.45 ns | 22.31 ns | 23.22 ns | 0.0030 | 32 B |
| GetService_3Injected | 23.72 ns | 0.115 ns | 0.090 ns | 23.70 ns | 23.63 ns | 23.96 ns | 0.0038 | 40 B |
| GetService_4Injected | 25.86 ns | 0.194 ns | 0.162 ns | 25.86 ns | 25.52 ns | 26.19 ns | 0.0045 | 48 B |
| GetService_5Injected | 29.45 ns | 0.330 ns | 0.292 ns | 29.37 ns | 29.14 ns | 30.00 ns | 0.0053 | 56 B |
| CreateInstance_0 | 98.69 ns | 0.826 ns | 0.772 ns | 98.54 ns | 97.18 ns | 99.68 ns | 0.0076 | 80 B |
| CreateInstance_1 | 343.67 ns | 3.327 ns | 2.950 ns | 342.32 ns | 340.31 ns | 349.16 ns | 0.0252 | 264 B |
| CreateInstance_2 | 444.98 ns | 2.626 ns | 2.456 ns | 445.19 ns | 440.92 ns | 448.47 ns | 0.0342 | 360 B |
| CreateInstance_3 | 169.82 ns | 1.253 ns | 1.172 ns | 170.01 ns | 167.93 ns | 172.14 ns | 0.0154 | 168 B |
| CreateInstance_4 | 505.42 ns | 5.853 ns | 5.475 ns | 505.84 ns | 495.27 ns | 515.61 ns | 0.0385 | 408 B |
| CreateInstance_5 | 529.31 ns | 4.613 ns | 4.089 ns | 529.38 ns | 523.04 ns | 538.25 ns | 0.0467 | 496 B |
| Factory_1Injected_2Explicit | 28.16 ns | 0.167 ns | 0.156 ns | 28.18 ns | 27.93 ns | 28.38 ns | 0.0038 | 40 B |
| Factory_1Injected_2Explicit_OutOfOrder | 29.03 ns | 0.152 ns | 0.142 ns | 29.04 ns | 28.71 ns | 29.21 ns | 0.0037 | 40 B |
| Factory_3Explicit | 10.24 ns | 0.121 ns | 0.101 ns | 10.24 ns | 10.09 ns | 10.42 ns | 0.0038 | 40 B |
| Factory_3Injected | 65.30 ns | 0.802 ns | 0.751 ns | 65.15 ns | 64.29 ns | 66.78 ns | 0.0038 | 40 B |

8.0

| Method | Mean | Error | StdDev | Median | Min | Max | Gen0 | Allocated |
|---|---|---|---|---|---|---|---|---|
| GetService_0Injected | 11.73 ns | 0.281 ns | 0.300 ns | 11.59 ns | 11.402 ns | 12.38 ns | 0.0022 | 24 B |
| GetService_1Injected | 12.16 ns | 0.278 ns | 0.273 ns | 12.08 ns | 11.757 ns | 12.67 ns | 0.0023 | 24 B |
| GetService_2Injected | 14.00 ns | 0.287 ns | 0.268 ns | 13.88 ns | 13.665 ns | 14.60 ns | 0.0030 | 32 B |
| GetService_3Injected | 14.93 ns | 0.112 ns | 0.105 ns | 14.91 ns | 14.812 ns | 15.11 ns | 0.0038 | 40 B |
| GetService_4Injected | 17.65 ns | 0.132 ns | 0.123 ns | 17.67 ns | 17.364 ns | 17.85 ns | 0.0046 | 48 B |
| GetService_5Injected | 18.72 ns | 0.131 ns | 0.122 ns | 18.70 ns | 18.500 ns | 18.90 ns | 0.0053 | 56 B |
| CreateInstance_0 | 112.66 ns | 0.547 ns | 0.484 ns | 112.62 ns | 111.720 ns | 113.60 ns | 0.0074 | 80 B |
| CreateInstance_1 | 341.42 ns | 1.247 ns | 0.974 ns | 341.34 ns | 339.797 ns | 343.13 ns | 0.0247 | 264 B |
| CreateInstance_2 | 477.02 ns | 2.629 ns | 2.459 ns | 476.49 ns | 473.083 ns | 481.34 ns | 0.0327 | 360 B |
| CreateInstance_3 | 176.83 ns | 1.213 ns | 1.135 ns | 176.74 ns | 174.181 ns | 178.78 ns | 0.0158 | 168 B |
| CreateInstance_4 | 520.00 ns | 1.763 ns | 1.472 ns | 519.93 ns | 518.101 ns | 523.24 ns | 0.0372 | 408 B |
| CreateInstance_5 | 558.50 ns | 4.572 ns | 4.277 ns | 557.23 ns | 553.463 ns | 566.70 ns | 0.0401 | 432 B |
| Factory_1Injected_2Explicit | 17.59 ns | 0.128 ns | 0.114 ns | 17.60 ns | 17.408 ns | 17.78 ns | 0.0038 | 40 B |
| Factory_1Injected_2Explicit_OutOfOrder | 17.77 ns | 0.405 ns | 0.338 ns | 17.74 ns | 17.384 ns | 18.62 ns | 0.0038 | 40 B |
| Factory_3Explicit | 10.09 ns | 0.126 ns | 0.112 ns | 10.09 ns | 9.910 ns | 10.30 ns | 0.0038 | 40 B |
| Factory_3Injected | 34.62 ns | 0.286 ns | 0.239 ns | 34.60 ns | 34.202 ns | 35.16 ns | 0.0038 | 40 B |

This PR

| Method | Mean | Error | StdDev | Median | Min | Max | Gen0 | Allocated |
|---|---|---|---|---|---|---|---|---|
| GetService_0Injected | 11.21 ns | 0.265 ns | 0.248 ns | 11.09 ns | 10.95 ns | 11.74 ns | 0.0023 | 24 B |
| GetService_1Injected | 14.85 ns | 0.098 ns | 0.092 ns | 14.86 ns | 14.72 ns | 15.05 ns | 0.0023 | 24 B |
| GetService_2Injected | 14.53 ns | 0.125 ns | 0.117 ns | 14.54 ns | 14.34 ns | 14.72 ns | 0.0030 | 32 B |
| GetService_3Injected | 14.72 ns | 0.126 ns | 0.118 ns | 14.69 ns | 14.50 ns | 14.97 ns | 0.0038 | 40 B |
| GetService_4Injected | 18.26 ns | 0.157 ns | 0.146 ns | 18.25 ns | 18.04 ns | 18.49 ns | 0.0045 | 48 B |
| GetService_5Injected | 19.25 ns | 0.101 ns | 0.084 ns | 19.23 ns | 19.12 ns | 19.42 ns | 0.0053 | 56 B |
| CreateInstance_0 | 32.25 ns | 0.199 ns | 0.176 ns | 32.25 ns | 31.98 ns | 32.63 ns | 0.0046 | 48 B |
| CreateInstance_1 | 76.11 ns | 0.719 ns | 0.638 ns | 75.94 ns | 75.33 ns | 77.72 ns | 0.0113 | 120 B |
| CreateInstance_2 | 107.63 ns | 1.046 ns | 0.979 ns | 107.85 ns | 105.58 ns | 109.16 ns | 0.0158 | 168 B |
| CreateInstance_3 | 71.51 ns | 1.264 ns | 1.121 ns | 71.11 ns | 70.08 ns | 73.63 ns | 0.0084 | 88 B |
| CreateInstance_4 | 130.70 ns | 1.313 ns | 1.164 ns | 130.44 ns | 129.37 ns | 133.38 ns | 0.0186 | 200 B |
| CreateInstance_5 | 158.43 ns | 1.394 ns | 1.304 ns | 158.12 ns | 156.94 ns | 160.70 ns | 0.0203 | 216 B |
| Factory_1Injected_2Explicit | 18.26 ns | 0.134 ns | 0.125 ns | 18.23 ns | 18.01 ns | 18.48 ns | 0.0038 | 40 B |
| Factory_1Injected_2Explicit_OutOfOrder | 17.80 ns | 0.170 ns | 0.151 ns | 17.77 ns | 17.54 ns | 18.09 ns | 0.0038 | 40 B |
| Factory_3Explicit | 10.23 ns | 0.093 ns | 0.087 ns | 10.22 ns | 10.09 ns | 10.42 ns | 0.0038 | 40 B |
| Factory_3Injected | 34.31 ns | 0.269 ns | 0.225 ns | 34.27 ns | 33.93 ns | 34.73 ns | 0.0037 | 40 B |

@ghost

ghost commented Aug 29, 2023

Tagging subscribers to this area: @dotnet/area-extensions-dependencyinjection
See info in area-owners.md if you want to be subscribed.

Issue Details

Fixes #91186

We should consider porting to v8 since this addresses a ~3x perf regression in certain cases.

This gets back the v8 regressions in ActivatorUtilities.CreateInstance plus a lot more. The regressions were caused by the new v8 "keyed services" feature and are most pronounced when a service's constructor contains many other services.

The optimizations include caching of ConstructorInfo-related state, in order of perf gain:

  • The values from [FromKeyedServicesAttribute] attribute lookup (the v8 regression)
  • Caching of Type.GetConstructors() along with each ConstructorInfo.Parameters
  • The values from [ActivatorUtilitiesConstructorAttribute] attribute lookup.
  • Use of the new ConstructorInvoker.
Author: steveharter
Assignees: steveharter
Labels: tenet-performance, area-Extensions-DependencyInjection
Milestone: -

…onstructorInvoker for possible risk of callstack usage
@steveharter
Member Author

/azp run runtime-extra-platforms

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Member

@buyaa-n buyaa-n left a comment

Looks good to me, thank you!

@steveharter steveharter merged commit 00937b2 into dotnet:main Sep 11, 2023
103 of 105 checks passed
@steveharter steveharter deleted the DiPerf9 branch September 11, 2023 13:07
@steveharter
Member Author

/backport to release/8.0

@github-actions
Contributor

Started backporting to release/8.0: https://github.com/dotnet/runtime/actions/runs/6146835734

Development

Successfully merging this pull request may close these issues.

NET 8.0 Preview 7 Performance Regression: ActivatorUtilities.CreateInstance
5 participants