
[Numpy] Add qr backward for wide inputs with nrows < ncols #18757

Merged
merged 1 commit into apache:master from D-Roberts:qr_back_wide_v2 on Jul 20, 2020

Conversation

D-Roberts
Contributor

@D-Roberts D-Roberts commented Jul 19, 2020

As titled. This is a resubmit of #18197. In addition, tests were re-verified for robustness.

MXNET_TEST_COUNT=10000 pytest -v ~/workspace/incubator-mxnet/tests/python/unittest/test_numpy_op.py::test_np_linalg_qr

incubator-mxnet/tests/python/unittest/test_numpy_op.py::test_np_linalg_qr PASSED  [100%]

For a given input, the obtained gradient has the same values as TensorFlow's. The implemented method is similar to the one implemented in TensorFlow.

Here are cross-checked examples:

import mxnet as mx
import mxnet.numpy as np
import numpy as _np
_np.random.seed(42)
data_np = _np.random.uniform(-1, 1, (3, 5)).astype(_np.float32)
data = np.array(data_np, dtype='float32')
data.attach_grad()
with mx.autograd.record():
    ret = np.linalg.qr(data)
mx.autograd.backward(ret)
print(data.grad)
[[ 0.7569422   0.5140486  -0.48962986 -0.48962957 -0.48962957]
 [-2.414882   -1.2380642  -1.6602778  -1.6602782  -1.6602782 ]
 [ 0.2763981  -0.5044659   0.06115001  0.06115055  0.06115055]]

import tensorflow as tf
import numpy as _np
_np.random.seed(42)
data_np = _np.random.uniform(-1, 1, (3, 5)).astype(_np.float32)
data = tf.convert_to_tensor(data_np)
with tf.GradientTape() as g:
    g.watch(data)
    ret = tf.linalg.qr(data)
print(g.gradient(ret, data))
tf.Tensor(
[[ 0.75694233  0.51404876 -0.48962957 -0.48962957 -0.48962957]
 [-2.414882   -1.2380638  -1.6602784  -1.6602781  -1.6602781 ]
 [ 0.276398   -0.50446594  0.0611502   0.06115052  0.06115052]], shape=(3, 5), dtype=float32)

At a high level, the methodology is: partition the wide input A (m rows, n columns, m < n) into two blocks X (the first m columns) and Y (the remaining n - m columns), and split R (from A = QR) into U (m x m, upper triangular) and V. Then X = QU, so X_grad follows from the gradient derivation for the square case (m = n), applied with an adjusted Q_grad; Y_grad is obtained separately from V_grad. A_grad is then the concatenation of X_grad and Y_grad. A minimal sketch of this reduction follows.
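For illustration only, here is a minimal NumPy sketch of the reduction (not the PR's C++ kernel; the helper names qr_backward_square and qr_backward_wide are made up for this example). It assumes the standard square-case formula A_grad = (dQ + Q copyltu(M)) R^-T with M = R dR^T - dQ^T Q, where copyltu copies the lower triangle of M onto the upper triangle:

import numpy as _np

def qr_backward_square(q, r, dq, dr):
    # Square case (m == n): M = R dR^T - dQ^T Q, symmetrized by copying
    # the lower triangle onto the upper (copyltu), then solved against R^T.
    m = r @ dr.T - dq.T @ q
    copyltu = _np.tril(m) + _np.tril(m, -1).T
    # (dq + q @ copyltu) @ inv(r).T, computed via a linear solve
    return _np.linalg.solve(r, (dq + q @ copyltu).T).T

def qr_backward_wide(a, q, r, dq, dr):
    # Wide case (m < n): A = [X | Y], R = [U | V], with X = Q U.
    m = a.shape[0]
    y = a[:, m:]                 # Y block of the input
    u, du = r[:, :m], dr[:, :m]  # U block of R and its incoming gradient
    dv = dr[:, m:]               # incoming gradient of the V block
    dy = q @ dv                                       # gradient of Y
    dx = qr_backward_square(q, u, dq + y @ dv.T, du)  # adjusted Q_grad
    return _np.concatenate([dx, dy], axis=1)

With q, r = _np.linalg.qr(data_np) and dq, dr set to ones (the default head gradients in the examples above), this should reproduce the printed data.grad, up to float32 rounding.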

Changes

  • qr backward wide input
  • tests

@mxnet-bot

Hey @D-Roberts , Thanks for submitting the PR
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

  • To trigger all jobs: @mxnet-bot run ci [all]
  • To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [website, miscellaneous, centos-gpu, windows-gpu, unix-cpu, unix-gpu, windows-cpu, edge, sanity, clang, centos-cpu]


Note:
Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

@D-Roberts
Contributor Author

@mxnet-bot run ci [edge, clang]

@mxnet-bot

Jenkins CI successfully triggered : [edge, clang]

@D-Roberts D-Roberts changed the title Add qr backward for wide inputs with nrows < ncols [Numpy] Add qr backward for wide inputs with nrows < ncols Jul 20, 2020
@D-Roberts
Contributor Author

@szha Here is the replacement PR. What needs to be done about the codecov/project failure?

@leezu
Contributor

leezu commented Jul 20, 2020

@D-Roberts it's a bug and you can ignore it: #18421 (comment)

Could you elaborate on how the PR differs from the previous version? Do you mean the test was adapted ("tests were re-verified for robustness")? If not, do you know why the previous version failed CI while your current version passes?

Thank you

@D-Roberts
Contributor Author

D-Roberts commented Jul 20, 2020

Hi @leezu
By 're-verified tests' I really meant that I ran the tests more than 3 times with MXNET_TEST_COUNT=10000 :).

I also made the following changes to the tests (as compared to the previous PR):

  1. I removed the expensive analytical and numerical Jacobian calculations that were included in the initial PR. These were originally included to demonstrate that the analytical method I was implementing was correct and passed central-difference numerical checks. At the time of the initial PR, 3 months ago, MXNet was the only framework implementing this qr backward method for wide inputs. In the meantime, TensorFlow has implemented it, so the method is already verified via central differences (a generic central-difference check is sketched after this list). Note in the example in the description that the results from the MXNet and TF implementations match up to float32 rounding.

  2. I included a larger matrix in the test input for one test case, removed a couple of redundant test cases to reduce test run time, and relaxed atol and rtol for dtype float32 to avoid flaky test runs due to numerical rounding error.
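For reference, a central-difference gradient check for a scalar-valued function works along these lines (a minimal sketch, not the removed test code; central_diff_grad is a hypothetical helper):

import numpy as _np

def central_diff_grad(f, x, eps=1e-4):
    # Estimate df/dx entrywise: perturb one entry at a time and take
    # (f(x + eps) - f(x - eps)) / (2 * eps). f maps an ndarray to a scalar.
    grad = _np.zeros_like(x)
    for idx in _np.ndindex(*x.shape):
        orig = x[idx]
        x[idx] = orig + eps
        f_plus = f(x)
        x[idx] = orig - eps
        f_minus = f(x)
        x[idx] = orig  # restore the entry before moving on
        grad[idx] = (f_plus - f_minus) / (2.0 * eps)
    return grad

# Example: compare against the analytical gradient of a scalar head,
# e.g. central_diff_grad(lambda a: _np.linalg.qr(a)[1].sum(), data_np.copy())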

The error that led to the "stale PR" failing CI came from a calculation in the numerical Jacobian used for the central-difference check. The code broke after updates to NumPy and MXNet numpy when the source array is float32 and the dtype used in the view is float64:

jacobian_num[row, :] = diff.asnumpy().ravel().view(dtype)
[2020-07-17T17:41:26.700Z] E           ValueError: When changing to a larger dtype, its size must be a divisor of the total size in bytes of the last axis of the array.
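This NumPy behavior is easy to reproduce on its own (a standalone sketch, independent of the test harness): viewing a float32 buffer as float64 fails whenever the byte size of the last axis is not a multiple of 8:

import numpy as _np

a = _np.zeros(3, dtype=_np.float32)  # 3 * 4 = 12 bytes, not divisible by 8
try:
    a.view(_np.float64)
except ValueError as e:
    print(e)
# -> When changing to a larger dtype, its size must be a divisor of the
#    total size in bytes of the last axis of the array.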

@szha
Member

szha commented Jul 20, 2020

I'm comfortable merging this. @leezu?

@leezu
Contributor

leezu commented Jul 20, 2020

Thank you @D-Roberts!

@leezu leezu merged commit 1aec483 into apache:master Jul 20, 2020
@D-Roberts D-Roberts deleted the qr_back_wide_v2 branch December 21, 2020 19:00