New SEACAS tests failing in ATDM Trilinos builds starting on 7/19/2018 and 7/23/2018 #3183
@bartlettroscoe The new capability in aprepro is the addition of exodus query capability, so the extra data may be related to linking aprepro now to a parallel hdf5 and netcdf library... I will take a look.
@gsjaardema, if you don't want to deal with the hassle of fixing those tests on 'mutrino', we can just disable them on 'mutrino'. I can provide detailed instructions on how to do that.
@bartlettroscoe Yes, that would be good to do.
@bartlettroscoe How do I disable the tests on mutrino?
Hopefully this documentation will allow any Trilinos developer to selectively disable tests for the ATDM Trilinos builds. It is meant to support disabling tests as part of trilinos#3183, but it will be used in many future issues as well.
@gsjaardema, if you have some time, can you please take a look at the documentation I wrote as part of PR #3211 that explains how to do this? Specifically, can you read over the new section shown in this PR branch at https://github.com/bartlettroscoe/Trilinos/blob/3183-disable-tests-info/cmake/std/atdm/README.md#disabling-failing-tests and then comment in that PR if you see any problems or if anything is not clear? If you don't find anything wrong, can you please approve that PR so that we can merge it? Then hopefully it will be straightforward to disable these specific tests for the builds on 'mutrino' (which you can do in a new PR). Note the sub-process "Temporarily disable the failing code or test". PR #3211 just provides the technical details on how to disable the test.
Thanks!
Kris
On Jul 31, 2018, at 11:36 AM, Roscoe A. Bartlett wrote:
@bartlettroscoe How do I disable the tests on mutrino?
@gsjaardema, if you have some time, can you please take a look at the documentation that explains how to do this that I wrote as part of the PR #3211? Specifically, can you read over the new section shown in this PR branch at:
* https://github.com/bartlettroscoe/Trilinos/blob/3183-disable-tests-info/cmake/std/atdm/README.md#disabling-failing-tests
and then comment in that PR if you see any problems or if anything is not clear? If you don't find anything wrong, can you please approve that PR so that we can merge it?
Then hopefully it will be straightforward how to disable these specific tests for the builds on 'mutrino' (which you can do in a new PR).
Note the sub-process Temporarily disable the failing code or test <https://snl-wiki.sandia.gov/display/CoodinatedDevOpsATDM/Triaging+and+addressing+ATDM+Trilinos+Failures#TriagingandaddressingATDMTrilinosFailures-5.Makesuretheissueisaddressedinatimelyway:>.
This PR #3211 is just providing the technical details on how to disable the test.
The executables are built in a parallel build, but they are really serial, so when they are run some extra info is output to stderr, which ends up messing with the textual comparison with the expected "gold" output files. For now, disable the tests until we can figure out a better way of running them. This should address #3183.
PR #3213 was just merged that should disable these tests on the 'mutrino' builds. After we get confirmation tomorrow that these tests are disabled, we can add the "Disabled Tests" label and move on.
@gsjaardema, the good news is that the 'mutrino' build SEACAS test results today no longer show the disabled tests as failing. The bad news is that the test SEACASAprepro_aprepro_test_dump_reread is failing in both of the builds. When the test passes, the TEST_3 step ("diff" "-w" "test-filter.dump" "test-reread.dump") returns 0; when it fails, the diff reports _pmi_alps_init lines whose timestamps differ by one second.
This looks like the same problem you mentioned above with this annoying STDERR output on 'mutrino'. The reason we did not notice this before is that these are randomly failing tests, and the day that I looked at to create this issue did not happen to have these tests failing. Can we disable this test as well in the 'mutrino' builds? And can we refactor all of the common test disables into a single *.cmake file?
Yes. That would be good to do. Should I do it or do you want to?
.. greg
On Thu, Aug 2, 2018 at 1:43 PM Roscoe A. Bartlett wrote:
@gsjaardema,
The good news is that the 'mutrino' build SEACAS test results today shown here <https://testing-vm.sandia.gov/cdash/queryTests.php?project=Trilinos&date=2018-08-02&filtercombine=and&filtercombine=and&filtercount=5&showfilters=1&filtercombine=and&field1=buildname&compare1=65&value1=Trilinos-atdm-&field2=status&compare2=62&value2=passed&field3=status&compare3=62&value3=notrun&field4=groupname&compare4=61&value4=ATDM&field5=testname&compare5=65&value5=SEACAS> don't show the following tests as failing:
- SEACASAprepro_aprepro_array_test
- SEACASAprepro_aprepro_command_line_include_test
- SEACASAprepro_aprepro_command_line_vars_test
- SEACASAprepro_aprepro_unit_test
- SEACASAprepro_lib_aprepro_lib_array_test
- SEACASAprepro_lib_aprepro_lib_unit_test
- SEACASExodus_exodus_unit_tests_nc5_env
(because they have been disabled)
The bad news is that it does show that the test SEACASAprepro_aprepro_test_dump_reread is failing in both of the builds. The test SEACASAprepro_aprepro_test_dump_reread appeared in testing on 7/23/2018 (and therefore must have been added in a PR merged to 'develop' on 7/22/2018) and then started randomly failing as shown in this query <https://testing-vm.sandia.gov/cdash/queryTests.php?project=Trilinos&date=2018-08-02&filtercombine=and&filtercombine=and&filtercount=5&showfilters=1&filtercombine=and&field1=buildname&compare1=65&value1=Trilinos-atdm-&field2=groupname&compare2=61&value2=ATDM&field3=testname&compare3=61&value3=SEACASAprepro_aprepro_test_dump_reread&field4=site&compare4=61&value4=mutrino&field5=buildstarttime&compare5=83&value5=2018-07-15>.
When the test passes, as shown here <https://testing-vm.sandia.gov/cdash/testDetails.php?test=51055647&build=3759381>, it shows:
================================================================================
TEST_3
Running: "diff" "-w" "test-filter.dump" "test-reread.dump"
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
TEST_3: Return code = 0
TEST_3: Pass criteria = Zero return code [PASSED]
TEST_3: Result = PASSED
================================================================================
When it fails, as shown here <https://testing-vm.sandia.gov/cdash/testDetails.php?test=51563099&build=3790512>, it shows:
================================================================================
TEST_3
Running: "diff" "-w" "test-filter.dump" "test-reread.dump"
--------------------------------------------------------------------------------
1,2c1,2
< Thu Aug 2 11:20:16 2018: [unset]:_pmi_alps_init:alps_get_placement_info returned with error -1
< Thu Aug 2 11:20:16 2018: [unset]:_pmi_init:_pmi_alps_init returned -1
---
> Thu Aug 2 11:20:17 2018: [unset]:_pmi_alps_init:alps_get_placement_info returned with error -1
> Thu Aug 2 11:20:17 2018: [unset]:_pmi_init:_pmi_alps_init returned -1
--------------------------------------------------------------------------------
TEST_3: Return code = 1
TEST_3: Pass criteria = Zero return code [FAILED]
TEST_3: Result = FAILED
================================================================================
This looks like the same problem you mentioned above with this annoying STDERR output on 'mutrino'. The reason we did not notice this before is that these are randomly failing tests, and the day that I looked at to create this issue did not happen to have these tests failing.
Can we disable this test as well in the 'mutrino' builds? And can we refactor all of the common test disables into a single *.cmake file as described here <https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md#disable-a-test-for-several-or-all-builds-on-a-specific-platform>?
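The failing diff above differs only in the timestamps of the PMI/ALPS startup messages that leak into the two dump files. One possible alternative to disabling the test would be to filter those lines out before comparing. The following is a minimal Python sketch of that idea, not what SEACAS or Trilinos actually does; the regular expression and the function names strip_pmi_noise and dumps_match are assumptions made for illustration.

```python
import re

# Assumption: the noise lines all look like the two shown in the failing
# output, i.e. they contain "[unset]:_pmi_alps_init:" or "[unset]:_pmi_init:".
PMI_NOISE = re.compile(r"\[unset\]:_pmi_(alps_)?init:")


def strip_pmi_noise(text):
    """Return the lines of `text` with the PMI/ALPS startup messages removed."""
    return [line for line in text.splitlines() if not PMI_NOISE.search(line)]


def dumps_match(filter_dump, reread_dump):
    """Compare two dumps roughly like `diff -w`, ignoring the PMI noise lines."""

    def normalize(text):
        # `diff -w` ignores all whitespace, so drop it from every kept line.
        return ["".join(line.split()) for line in strip_pmi_noise(text)]

    return normalize(filter_dump) == normalize(reread_dump)
```

With such a filter, two dumps whose only difference is the second at which the PMI messages were printed would compare equal, while a real difference in the aprepro output would still be reported.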
@gsjaardema, I can create a new PR to disable this last test after the PR #3051 gets merged. Otherwise, these PRs could conflict (depending on how smart git/github is).
These tests were shown failing again in the build
I can't explain what happened, but somehow the commit a133d86 dropped the disables that @gsjaardema added in commit 5059d8a. This was a simple file renaming, so I can't understand how it deleted these lines. Very scary. I will add the disables again.
This also adds back the disables for several SEACAS tests that got removed when the file INTEL-RELEASE-OPENMP.cmake got renamed to the file INTEL-RELEASE-OPENMP-HSW.cmake (not clear how that happened).
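Since the disables were lost silently during a file rename, a small check comparing the set of disabled tests before and after a change might catch this kind of regression earlier. The following is a minimal Python sketch under stated assumptions: it assumes each disable is written on its own line as ATDM_SET_ENABLE(<full_test_name>_DISABLE ON), which should be verified against the ATDM README, and the function names disabled_tests and dropped_disables are hypothetical.

```python
import re

# Assumption: disables appear one per line in the form
#   ATDM_SET_ENABLE(<full_test_name>_DISABLE ON)
# Lines that start with "#" (comments) are not matched.
DISABLE_LINE = re.compile(
    r"^\s*ATDM_SET_ENABLE\(\s*(\w+)_DISABLE\s+ON\s*\)", re.MULTILINE
)


def disabled_tests(cmake_text):
    """Return the set of test names disabled in the given *.cmake file text."""
    return set(DISABLE_LINE.findall(cmake_text))


def dropped_disables(old_text, new_text):
    """Return the sorted test names disabled in old_text but not in new_text."""
    return sorted(disabled_tests(old_text) - disabled_tests(new_text))
```

For the rename described above, old_text would be the contents of INTEL-RELEASE-OPENMP.cmake before the rename and new_text the contents of INTEL-RELEASE-OPENMP-HSW.cmake after it; a non-empty result would flag the dropped disables.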
FYI: PR #3251 should contain the necessary disables and removed duplication. @gsjaardema or @fryeguy52, can you please approve PR #3251 so that I can merge once PR testing is complete?
FYI: As shown in this query, the test
FYI: As shown in this query, the only ATDM group SEACAS tests failing since 8/8/2018 are the tests:
mostly on 'white' (except for one failure each on 'chama' and 'serrano' for some reason) and the failures on 'white' are addressed in #3288. In any case, the test failures called out in this Issue seem to be fixed. Therefore, I think we can close this issue. Closing as complete.
New SEACAS tests failing in ATDM Trilinos builds starting on 7/19/2018 and 7/23/2018
CC: @trilinos/seacas, @gsjaardema (pushed breaking commits?), @kddevin (Trilinos Data Services Product Lead)
Next Action Status
PR #3213, merged on 8/1/2018, disabled most of these tests in the 'mutrino' builds as of 8/2/2018; this was later fixed in PR #3251, merged on 8/8/2018. No test failures since 8/8/2018 as of 8/29/2018.
Description
As shown in this query for the builds today, the tests:
are failing in the builds:
and the tests:
are failing in the builds:
As shown in this query showing failing SEACAS tests going back to 7/10/2018, the test
SEACASExodus_exodus_unit_tests_nc5_env
started failing on 7/19/2018 and the other tests started failing on 7/23/2018. There were several PRs merged the days before these dates by @gsjaardema, so it is not clear which changes caused these new failures, but it seems likely that one or more of the commits in these merged PRs triggered them. Also, the test
SEACASAprepro_aprepro_test_dump_reread
added in one of these PRs appeared on 7/23/2018 and then started randomly failing as shown in this query. When the test passes like shown here, it shows: when it fails like shown here, it shows:
Steps to reproduce
These failures should be reproducible on the machines 'hansen' or 'shiller' and 'mutrino' using the instructions in:
For example, for the failures on 'hansen'/'shiller', the specific instructions are given at:
For example, after cloning Trilinos, the following commands should reproduce the test failures on 'hansen' or 'shiller' with: