This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Flaky test test_quantization_mkldnn.test_requantize_int32_to_int8 #11747

Closed
KellenSunderland opened this issue Jul 13, 2018 · 13 comments · Fixed by #16709


@KellenSunderland
Contributor

Example Failure
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/master/1175/pipeline

Output

FAIL: test_quantization_mkldnn.test_requantize_int32_to_int8

Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/nose/case.py", line 198, in runTest
self.test(*self.arg)
File "/usr/local/lib/python3.5/dist-packages/nose/util.py", line 620, in newfunc
return func(*arg, **kw)
File "/work/mxnet/tests/python/mkl/../unittest/common.py", line 157, in test_new
orig_test(*args, **kwargs)
File "/work/mxnet/tests/python/mkl/../quantization/test_quantization.py", line 127, in test_requantize_int32_to_int8
check_requantize((3, 4, 10, 10))
File "/work/mxnet/tests/python/mkl/../quantization/test_quantization.py", line 123, in check_requantize
assert_almost_equal(qdata_int8.asnumpy(), qdata_int8_np)
File "/work/mxnet/python/mxnet/test_utils.py", line 493, in assert_almost_equal
raise AssertionError(msg)
AssertionError:
Items are not equal:
Error 1562.500000 exceeds tolerance rtol=0.000010, atol=0.000000. Location of maximum error:(0, 3, 8, 0), a=-63.000000, b=-64.000000
a: array([[[[ -72, 106, 79, ..., 73, -43, 126],
[ -74, -118, -46, ..., -18, 44, -37],
[ 0, 13, -93, ..., -117, -123, -56],...
b: array([[[[ -72, 106, 79, ..., 73, -43, 126],
[ -74, -118, -46, ..., -18, 44, -37],
[ 0, 13, -93, ..., -117, -123, -56],...
-------------------- >> begin captured logging << --------------------
common: INFO: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=2054829694 to reproduce.
--------------------- >> end captured logging << ---------------------
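For context, the reported value "Error 1562.500000" is consistent with a relative-violation metric of the form |a - b| / (atol + rtol * |b|). This is an assumption about how mxnet.test_utils.assert_almost_equal scores the mismatch, not taken from its source, but the arithmetic lines up exactly for the values in the log:

```python
# Hedged sketch: reproduce the reported error figure for the failing element,
# assuming the violation metric is |a - b| / (atol + rtol * |b|).
a, b = -63.0, -64.0          # values at the location of maximum error
rtol, atol = 1e-5, 0.0       # tolerances from the assertion message
error = abs(a - b) / (atol + rtol * abs(b))
print(error)  # prints: 1562.5
```

A one-unit difference between two int8 values is tiny in absolute terms, but against rtol=1e-5 it produces a violation factor in the thousands, which is why the assertion trips.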

@apeforest
Contributor

Thanks for filing this issue. We will investigate this flaky test.

@xinyu-intel
Contributor

I cannot reproduce this locally. By the way, why does only Python 3 fail?

@KellenSunderland
Contributor Author

@xinyu-intel: This is a rarely occurring non-deterministic failure, so the fact that it happens in Python 3 as opposed to Python 2 (or any other configuration) is probably just a coincidence.

@marcoabreu
Contributor

marcoabreu commented Jul 30, 2018

Hi,

please run the following commands:

  1. ci/build.py -p ubuntu_cpu -i
  2. PYTHONPATH=/work/mxnet/python python3 tools/flakiness_checker.py -s 2127644814 test_quantization_mkldnn.test_requantize_int32_to_int8

The output will be:


INFO:root:Testing: /work/mxnet/tests/python/mkl/test_quantization_mkldnn.py:test_requantize_int32_to_int8
INFO:root:No test seed provided, using random seed
test_quantization_mkldnn.test_requantize_int32_to_int8 ... [INFO] 351 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=2127644814 to reproduce.
FAIL

======================================================================
FAIL: test_quantization_mkldnn.test_requantize_int32_to_int8
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/usr/local/lib/python3.5/dist-packages/nose/util.py", line 620, in newfunc
    return func(*arg, **kw)
  File "/work/mxnet/tests/python/mkl/../quantization/common.py", line 172, in test_new
    orig_test(*args, **kwargs)
  File "/work/mxnet/tests/python/mkl/../quantization/test_quantization.py", line 127, in test_requantize_int32_to_int8
    check_requantize((3, 4, 10, 10))
  File "/work/mxnet/tests/python/mkl/../quantization/test_quantization.py", line 123, in check_requantize
    assert_almost_equal(qdata_int8.asnumpy(), qdata_int8_np)
  File "/work/mxnet/python/mxnet/test_utils.py", line 493, in assert_almost_equal
    raise AssertionError(msg)
AssertionError:
Items are not equal:
Error 1562.500000 exceeds tolerance rtol=0.000010, atol=0.000000.  Location of maximum error:(1, 3, 1, 7), a=-63.000000, b=-64.000000
 a: array([[[[-119,   61,  -58, ...,    8,  123,   10],
         [  97,   79,   11, ...,   37, -106,  -13],
         [  82,   53, -125, ...,  104,   90,  112],...
 b: array([[[[-119,   61,  -58, ...,    8,  123,   10],
         [  97,   79,   11, ...,   37, -106,  -13],
         [  82,   53, -125, ...,  104,   90,  112],...
-------------------- >> begin captured logging << --------------------
common: INFO: 351 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=2127644814 to reproduce.
--------------------- >> end captured logging << ---------------------

This output is entirely deterministic given the seed. For reference, I'm running on a g3.8xlarge.
@pengzhao-intel @xinyu-intel

@pengzhao-intel
Contributor

Sorry, I missed this one previously.
@xinyu-intel will keep looking into the issue.

@xinyu-intel
Contributor

@KellenSunderland @marcoabreu @reminisce Hi, I think we should set the absolute tolerance to one ULP: when checking whether two int8 arrays are equal, an absolute error of 1 should be allowed.
In this case, we convert int32 data to float32 and then to int8. In the last step, (np.sign(data) * np.minimum(np.abs(data) * scale + 0.5, quantized_range)).astype('int8'), adding 0.5 to implement rounding can introduce a one-ULP truncation error.
For example, the MKL-DNN float 63.4999 will not round up to 64 after adding 0.5, but the NumPy float 63.5000 will. A float32 absolute error of 1e-4 is thus amplified to 1 after conversion to int8.
I'll start a PR to fix it.
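The boundary case described above can be sketched as follows. Here requantize is a hypothetical re-implementation of the NumPy reference path, built from the expression quoted in the comment; the exact helper names in the test suite may differ.

```python
import numpy as np

# Hypothetical sketch of the requantization rounding step described above.
# Adding 0.5 before truncating with astype('int8') implements
# round-half-away-from-zero; quantized_range clamps to the int8 range.
def requantize(data, scale, quantized_range=127.0):
    return (np.sign(data) * np.minimum(np.abs(data) * scale + 0.5,
                                       quantized_range)).astype('int8')

# A tiny float32 discrepancy straddling the .5 boundary flips the result
# by one int8 ULP: 63.4999 truncates to 63, while 63.5000 rounds up to 64.
a = np.float32(63.4999)  # e.g. the value one backend computes
b = np.float32(63.5000)  # e.g. the value the reference path computes
print(requantize(a, 1.0), requantize(b, 1.0))  # prints: 63 64
```

This is why an absolute tolerance of 1 (one int8 ULP) is the appropriate comparison for the quantized outputs, rather than a tight rtol on the raw integers.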

@marcoabreu
Contributor

Great, thanks a lot @xinyu-intel !

@marcoabreu
Contributor

#12040

@ChaiBapchya
Contributor

This also occurred on the unrelated PR #16692:
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-cpu/detail/PR-16692/1/pipeline

======================================================================
FAIL: test_quantization.test_requantize_int32_to_int8
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/work/mxnet/tests/python/quantization/common.py", line 177, in test_new
    orig_test(*args, **kwargs)
  File "/work/mxnet/tests/python/quantization/test_quantization.py", line 186, in test_requantize_int32_to_int8
    check_requantize_with_symbol((3, 4, 10, 10))
  File "/work/mxnet/tests/python/quantization/test_quantization.py", line 181, in check_requantize_with_symbol
    assert_almost_equal(qdata_int8.asnumpy(), qdata_int8_np)
  File "/work/mxnet/python/mxnet/test_utils.py", line 627, in assert_almost_equal
    raise AssertionError(msg)
AssertionError:
Items are not equal:
Error 1562.500000 exceeds tolerance rtol=1.000000e-05, atol=1.000000e-20 (mismatch 0.166667%).
Location of maximum error: (0, 2, 3, 4), a=-63.00000000, b=-64.00000000
 ACTUAL: array([[[[ -98,  -25,  -99, ...,  -93,   -8,   37],
         [  94,  -57,  -94, ...,  -84,   83,   60],
         [ -47,  112,   95, ...,  107, -112,    4],...
 DESIRED: array([[[[ -98,  -25,  -99, ...,  -93,   -8,   37],
         [  94,  -57,  -94, ...,  -84,   83,   60],
         [ -47,  112,   95, ...,  107, -112,    4],...
-------------------- >> begin captured stdout << ---------------------

*** Maximum errors for vector of size 1200:  rtol=1e-05, atol=1e-20

  1: Error 1562.500000  Location of error: (0, 2, 3, 4), a=-63.00000000, b=-64.00000000
  2: Error 1562.500000  Location of error: (2, 2, 0, 2), a=63.00000000, b=64.00000000
@pengzhao-intel
Contributor

@xinyu-intel please take a look at this flaky test.

@ChaiBapchya
Contributor

@zixuanweeei
Contributor

Hi, @ChaiBapchya. Perhaps #16709 was not backported to the 1.6.x branch. Would you mind confirming that? If so, I think we need to backport the patch to 1.6.x as well along with #17993.

@ChaiBapchya
Contributor

Backported that patch. Thanks for pointing it out. Weird how it didn't get selected in the previous cherry-picks.
