Disable -DUSE_TVM_OP on GPU builds #18204

leezu · 2020-04-30T07:40:04Z

Needs to be fixed first before it can be tested on CI.

mxnet-bot · 2020-04-30T07:40:06Z

Hey @leezu, thanks for submitting the PR.
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

To trigger all jobs: @mxnet-bot run ci [all]
To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [miscellaneous, sanity, clang, website, windows-cpu, centos-gpu, centos-cpu, unix-cpu, unix-gpu, edge, windows-gpu]

Note:
Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

marcoabreu · 2020-04-30T07:49:29Z

Is this issue caused by using the g4 instances, or do we have any clue why they suddenly broke?

leezu · 2020-04-30T18:24:52Z

I think this has been broken since it was introduced.
It's unclear why it triggers non-deterministic failures in some cases but deterministic failures in others. I think that may be due to the error message being swallowed in some cases.

Also, CI hasn't switched to G4 yet.

leezu · 2020-04-30T18:26:38Z

@mxnet-bot run ci [unix-gpu, windows-gpu, centos-gpu]

mxnet-bot · 2020-04-30T18:26:43Z

Jenkins CI successfully triggered : [windows-gpu, centos-gpu, unix-gpu]

marcoabreu · 2020-04-30T18:27:00Z

So how did CI pass if it was broken? 🤔

leezu · 2020-04-30T18:29:44Z

> So how did CI pass if it was broken? 🤔

Those features were not tested by CI. Other tests that do rely on the broken feature are disabled, for example:

https://github.com/apache/incubator-mxnet/blob/1496c91871b9d81d6a18785bdc8a1c3450bedbca/tests/python/unittest/test_deferred_compute.py#L327-L329
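A guard of the kind used to disable such tests might look like the sketch below. It is not copied from the linked lines: `should_skip_tvmop_test`, `run_test_if_supported`, and their boolean parameters are hypothetical names, and in a real test the two values would come from the MXNet runtime (the test context and the build's feature flags) rather than being passed in by hand.

```python
def should_skip_tvmop_test(on_gpu: bool, tvm_op_enabled: bool) -> bool:
    """Return True when a test that relies on TVMOp should be skipped.

    Hypothetical helper: the broken combination discussed in this thread
    is a GPU build with TVM_OP enabled, so only that case is skipped.
    """
    return on_gpu and tvm_op_enabled


def run_test_if_supported(test_fn, on_gpu: bool, tvm_op_enabled: bool) -> str:
    """Run test_fn unless the guard says to skip; report what happened."""
    if should_skip_tvmop_test(on_gpu, tvm_op_enabled):
        return "skipped"
    test_fn()
    return "ran"
```

A test skipped this way still counts as passing, which is one way CI can stay green while the underlying feature is broken.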

But the functionality in #18083 triggers the broken parts of TVMOP, and at this stage I think we should disable the TVMOP-on-GPU feature instead of changing the way #18083 uses the mx.np API.
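As a rough illustration of what disabling the flag on GPU builds amounts to, the sketch below assembles CMake options for a build. The helper `cmake_flags` is hypothetical and is not part of the MXNet CI scripts; `-DUSE_TVM_OP` comes from this PR's title, while pairing it with `-DUSE_CUDA` and keeping it enabled on CPU-only builds are assumptions made for the example.

```python
from typing import List


def cmake_flags(use_cuda: bool) -> List[str]:
    """Build a list of CMake options, enabling TVMOp only on CPU builds."""
    flags = ["-DUSE_CUDA={}".format("ON" if use_cuda else "OFF")]
    # Assumption: TVMOp stays on for CPU-only builds and is switched off
    # whenever CUDA is on, mirroring the intent of this PR's title.
    flags.append("-DUSE_TVM_OP={}".format("OFF" if use_cuda else "ON"))
    return flags
```

Calling `cmake_flags(use_cuda=True)` yields a flag list with `-DUSE_TVM_OP=OFF`, so GPU builds never compile the TVMOp code paths that #18083 would otherwise exercise.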

jinboci · 2020-06-11T06:37:15Z

@leezu Hi, I am fixing the issue #17840. Would reverting changes in this PR reenable all the TVMOp tests?

leezu · 2020-06-13T01:10:26Z

Thank you @jinboci for fixing the issue! Reverting this PR will indeed re-enable the previous test setup, but there may be some conflicts during the revert due to subsequent changes to the CI. You can try it though.

Instead of reverting, my recommendation is to take a look at which tests this PR removed and think about what the best strategy for testing the TVM_OP feature would be. For example, we could consider testing it only on the CentOS system tests or only on the Ubuntu system tests. But maybe this is not ideal. Do you have any recommendation about the best testing strategy?

Sorry for the late response.

jinboci · 2020-06-13T03:15:17Z

@leezu Thank you very much for your suggestions. I don't have recommendations about testing strategy yet; however, I will take a look at the test code and see if there are better ways for testing. Thanks!

Due to issues apache#17886 apache#17840

Disable -DUSE_TVM_OP on GPU builds

4418c5e

leezu requested review from aaronmarkham and marcoabreu as code owners April 30, 2020 07:40

leezu mentioned this pull request Apr 30, 2020

Changes to mxnet.metric #18083

Merged

7 tasks

marcoabreu approved these changes Apr 30, 2020

leezu merged commit 03fdfe0 into apache:master May 1, 2020

leezu deleted the disabletvmgpu branch May 1, 2020 00:13

leezu mentioned this pull request May 1, 2020

TVMOp doesn't work well with GPU builds #17840

Open

waytrue17 mentioned this pull request May 26, 2020

[v1.7.x] update jetson dockerfile to support CUDA 10.0 #18339

Merged

1 task

jinboci mentioned this pull request Jun 12, 2020

Restoring TVMOp tests #18542

Draft

7 tasks

AntiZpvoh pushed a commit to AntiZpvoh/incubator-mxnet that referenced this pull request Jul 6, 2020

Disable -DUSE_TVM_OP on GPU builds (apache#18204)

d5d762d

Due to issues apache#17886 apache#17840

ChaiBapchya mentioned this pull request Aug 16, 2020

[CI][1.x] Cherrypick: Upgrade unix gpu toolchain (#18186) #18785

Merged
