This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

test_norm, test_batchnorm and test_layer_norm dramatically slow down #20092

Closed
barry-jin opened this issue Mar 25, 2021 · 3 comments

Comments

@barry-jin
Contributor

barry-jin commented Mar 25, 2021

It looks like test_norm, test_batchnorm and test_layer_norm became the slowest tests after the openmp submodule was removed in #19953.
unix-cpu Python3: CPU pytest slowest 50 before this commit
https://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-cpu/detail/master/2485/pipeline/282

[2021-03-01T06:10:05.320Z] ============================= slowest 50 durations =============================
[2021-03-01T06:10:05.320Z] 246.53s call     tests/python/unittest/test_operator.py::test_broadcast_binary_op

[2021-03-01T06:10:05.320Z] 185.37s call     tests/python/unittest/test_operator.py::test_order

[2021-03-01T06:10:05.320Z] 110.09s call     tests/python/unittest/test_operator.py::test_psroipooling

[2021-03-01T06:10:05.320Z] 97.39s call     tests/python/unittest/test_operator.py::test_layer_norm[float32-0.001-0.001-in_shape_l1-finite_grad_check_l1-0]

[2021-03-01T06:10:05.320Z] 95.54s call     tests/python/unittest/test_operator.py::test_layer_norm[float64-0.0001-0.0001-in_shape_l2-finite_grad_check_l2-0]

[2021-03-01T06:10:05.321Z] 90.15s call     tests/python/unittest/test_operator.py::test_layer_norm[float64-0.0001-0.0001-in_shape_l2-finite_grad_check_l2-1]

[2021-03-01T06:10:05.321Z] 88.62s call     tests/python/unittest/test_operator.py::test_layer_norm[float32-0.001-0.001-in_shape_l1-finite_grad_check_l1-1]

[2021-03-01T06:10:05.321Z] 71.50s call     tests/python/unittest/test_operator.py::test_convolution_dilated_impulse_response

[2021-03-01T06:10:05.321Z] 71.45s call     tests/python/unittest/test_operator.py::test_bilinear_resize_op

[2021-03-01T06:10:05.321Z] 43.48s call     tests/python/unittest/test_operator.py::test_convolution_independent_gradients

[2021-03-01T06:10:05.321Z] 42.75s call     tests/python/unittest/test_operator.py::test_stack

[2021-03-01T06:10:05.321Z] 24.36s call     tests/python/unittest/test_operator.py::test_multi_proposal_op

[2021-03-01T06:10:05.321Z] 24.18s call     tests/python/unittest/test_operator.py::test_laop_2

[2021-03-01T06:10:05.321Z] 21.11s call     tests/python/unittest/test_operator.py::test_layer_norm[float16-0.01-0.01-in_shape_l0-finite_grad_check_l0-1]

[2021-03-01T06:10:05.321Z] 20.08s call     tests/python/unittest/test_operator.py::test_layer_norm[float16-0.01-0.01-in_shape_l0-finite_grad_check_l0-0]

[2021-03-01T06:10:05.321Z] 18.79s call     tests/python/unittest/test_operator.py::test_reduce

[2021-03-01T06:10:05.321Z] 10.60s call     tests/python/unittest/test_operator.py::test_batchnorm[True-False-False-shape2-BatchNorm]

[2021-03-01T06:10:05.321Z] 10.39s call     tests/python/unittest/test_operator.py::test_batchnorm_training

[2021-03-01T06:10:05.321Z] 10.11s call     tests/python/unittest/test_operator.py::test_batchnorm[True-True-False-shape2-BatchNorm]

[2021-03-01T06:10:05.321Z] 9.05s call     tests/python/unittest/test_operator.py::test_l2_normalization

unix-cpu Python3: CPU pytest slowest 50 for this commit
https://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-cpu/detail/PR-19953/3/pipeline/282


[2021-03-01T11:36:42.961Z] ============================= slowest 50 durations =============================

[2021-03-01T11:36:42.961Z] 600.14s call     tests/python/unittest/test_operator.py::test_layer_norm[float64-0.0001-0.0001-in_shape_l2-finite_grad_check_l2-1]

[2021-03-01T11:36:42.961Z] 591.61s call     tests/python/unittest/test_operator.py::test_layer_norm[float32-0.001-0.001-in_shape_l1-finite_grad_check_l1-1]

[2021-03-01T11:36:42.961Z] 543.72s call     tests/python/unittest/test_operator.py::test_layer_norm[float16-0.01-0.01-in_shape_l0-finite_grad_check_l0-1]

[2021-03-01T11:36:42.961Z] 541.24s call     tests/python/unittest/test_operator.py::test_layer_norm[float32-0.001-0.001-in_shape_l1-finite_grad_check_l1-0]

[2021-03-01T11:36:42.961Z] 483.78s call     tests/python/unittest/test_operator.py::test_batchnorm[False-True-False-shape2-BatchNorm]

[2021-03-01T11:36:42.961Z] 438.33s call     tests/python/unittest/test_operator.py::test_batchnorm[False-False-False-shape2-BatchNorm]

[2021-03-01T11:36:42.961Z] 428.82s call     tests/python/unittest/test_operator.py::test_norm

[2021-03-01T11:36:42.961Z] 406.73s call     tests/python/unittest/test_operator.py::test_batchnorm[True-False-False-shape2-BatchNorm]

[2021-03-01T11:36:42.961Z] 363.46s call     tests/python/unittest/test_operator.py::test_layer_norm[float64-0.0001-0.0001-in_shape_l2-finite_grad_check_l2-0]

[2021-03-01T11:36:42.961Z] 340.72s call     tests/python/unittest/test_operator.py::test_reduce

[2021-03-01T11:36:42.961Z] 297.12s call     tests/python/unittest/test_operator.py::test_layer_norm[float16-0.01-0.01-in_shape_l0-finite_grad_check_l0-0]

[2021-03-01T11:36:42.961Z] 262.40s call     tests/python/unittest/test_operator.py::test_batchnorm[True-True-False-shape2-BatchNorm]

[2021-03-01T11:36:42.961Z] 200.51s call     tests/python/unittest/test_operator.py::test_laop_2

[2021-03-01T11:36:42.961Z] 177.58s call     tests/python/unittest/test_operator.py::test_broadcast_binary_op

[2021-03-01T11:36:42.961Z] 163.79s call     tests/python/unittest/test_operator.py::test_batchnorm[False-False-True-shape2-BatchNorm]

[2021-03-01T11:36:42.961Z] 154.08s call     tests/python/unittest/test_operator.py::test_batchnorm[True-False-False-shape2-SyncBatchNorm]

[2021-03-01T11:36:42.961Z] 150.28s call     tests/python/unittest/test_operator.py::test_batchnorm[False-True-True-shape2-BatchNorm]
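To put numbers on the slowdown, the two "slowest durations" listings can be compared per test. The snippet below is only a rough sketch: the helper names `parse_durations` and `slowdowns` are made up for illustration, and the `BEFORE`/`AFTER` strings hold just four lines copied from the logs above rather than the full CI output.

```python
import re

# Matches pytest "slowest durations" lines such as:
#   [2021-03-01T06:10:05.320Z] 246.53s call     tests/...::test_name
DURATION_RE = re.compile(r"(\d+(?:\.\d+)?)s\s+call\s+(\S+)")


def parse_durations(log_text):
    """Return {test_id: seconds} for every 'call' line in a durations log."""
    durations = {}
    for line in log_text.splitlines():
        match = DURATION_RE.search(line)
        if match:
            durations[match.group(2)] = float(match.group(1))
    return durations


def slowdowns(before, after):
    """Return (test_id, after/before ratio) pairs for tests in both logs, largest ratio first."""
    common = before.keys() & after.keys()
    pairs = [(test_id, after[test_id] / before[test_id]) for test_id in common]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# A few lines copied from the two logs above (not the full listings).
BEFORE = """
[2021-03-01T06:10:05.320Z] 246.53s call     tests/python/unittest/test_operator.py::test_broadcast_binary_op
[2021-03-01T06:10:05.321Z] 90.15s call     tests/python/unittest/test_operator.py::test_layer_norm[float64-0.0001-0.0001-in_shape_l2-finite_grad_check_l2-1]
[2021-03-01T06:10:05.321Z] 24.18s call     tests/python/unittest/test_operator.py::test_laop_2
[2021-03-01T06:10:05.321Z] 18.79s call     tests/python/unittest/test_operator.py::test_reduce
"""

AFTER = """
[2021-03-01T11:36:42.961Z] 600.14s call     tests/python/unittest/test_operator.py::test_layer_norm[float64-0.0001-0.0001-in_shape_l2-finite_grad_check_l2-1]
[2021-03-01T11:36:42.961Z] 340.72s call     tests/python/unittest/test_operator.py::test_reduce
[2021-03-01T11:36:42.961Z] 200.51s call     tests/python/unittest/test_operator.py::test_laop_2
[2021-03-01T11:36:42.961Z] 177.58s call     tests/python/unittest/test_operator.py::test_broadcast_binary_op
"""

ranked = slowdowns(parse_durations(BEFORE), parse_durations(AFTER))
for test_id, ratio in ranked:
    # Print only the part after "::" to keep the output short.
    print(f"{ratio:6.1f}x  {test_id.split('::')[-1]}")
```

On this sample, test_reduce shows the largest relative slowdown (18.79s to 340.72s), while test_broadcast_binary_op actually got faster, so the regression does not seem to affect every operator test equally. Tests that appear in only one of the two top-50 lists (test_norm, for example) are skipped by this comparison.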

Originally posted by @barry-jin in #20091 (comment)

@leezu
Contributor

leezu commented Mar 25, 2021

May be fixed by #20093
Let's check the CI timing

@akarbown
Contributor

@barry-jin, can we close this issue? It seems to be resolved by #20367, doesn't it?

@barry-jin
Contributor Author

barry-jin commented Oct 15, 2021

Closed via #20367. Thanks @bgawrych for the fix!
