pmem2_mem_ext failure because of Valgrind not supporting avx512f. #5640

Open
grom72 opened this issue May 15, 2023 · 3 comments
Labels

  • CI: Disabled (Temporarily disabled from testing)
  • libpmem2 (libpmem- and libpmem2-related)
  • Priority: 4 low
  • Type: Bug (A previously unknown bug in PMDK)
  • Type: External Bug (A bug not in PMDK)

Comments

@grom72
Contributor

grom72 commented May 15, 2023

ISSUE: pmem2_mem_ext/TEST[1-4]: failed under valgrind (self-hosted, rhel, RUNTESTS.py --force-enable pmemcheck)

Environment Information

  • PMDK package version(s): 37292d4
  • OS(es) version(s): Rocky Linux 9.1 (Blue Onyx)
  • ndctl version(s): 71.1
  • kernel version(s): 5.14.0-162.6.1.el9_1.x86_64
  • compiler, libraries, packaging and other related tools version(s):

Please provide a reproduction of the bug:

pmem2_mem_ext/TEST1: failed under valgrind (self-hosted, rhel, RUNTESTS.py --force-enable pmemcheck)
with error message:

pmem2_mem_ext/TEST1: FAILED	(short/debug/pmemcheck/page/wc_workaround: on/variant: avx512f)
Pattern: memmove_mov_avx512f occurs 0 times. One expected. Type: C Flag id: 0
pmem2_mem_ext/TEST1: FAILED	(short/debug/pmemcheck/page/wc_workaround: off/variant: avx512f)
Pattern: memmove_mov_avx512f occurs 0 times. One expected. Type: C Flag id: 0
pmem2_mem_ext/TEST1: FAILED	(short/debug/pmemcheck/page/wc_workaround: default/variant: avx512f)
Pattern: memmove_mov_avx512f occurs 0 times. One expected. Type: C Flag id: 0
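
The failure message reports how many times an expected function name appears in the libpmem2 log. A minimal sketch of that kind of check follows. The helper name count_pattern is hypothetical and this is not the PMDK test framework's actual code; it only assumes log lines shaped like the ones quoted in this issue.

```python
import re


def count_pattern(log_text: str, pattern: str) -> int:
    """Count log lines whose bracketed function name starts with `pattern`."""
    # Log lines look like:
    # ": <15> [memcpy_t_sse2.c:218 memmove_mov_sse2_empty] dest ... len 1024"
    func_re = re.compile(r"\[\S+:\d+ (\w+)\]")
    count = 0
    for line in log_text.splitlines():
        match = func_re.search(line)
        if match and match.group(1).startswith(pattern):
            count += 1
    return count


# Lines copied from the pmem2_4.log excerpt in this issue.
SAMPLE_LOG = """\
: <3> [init.c:597 pmem2_arch_init] using movnt SSE2
: <15> [memcpy_t_sse2.c:218 memmove_mov_sse2_empty] dest 0x8000400 src 0x8000000 len 1024
: <15> [persist.c:125 pmem2_drain]
"""

print(count_pattern(SAMPLE_LOG, "memmove_mov_avx512f"))  # 0
print(count_pattern(SAMPLE_LOG, "memmove_mov_sse2"))     # 1
```

With the SSE2 code path selected, the avx512f pattern is never logged, which matches the "occurs 0 times. One expected." message.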

## How often bug is revealed: (always, often, rare):  always


## Actual behavior:

pmem2_mem_ext/TEST4: SETUP (short/debug/pmemcheck/byte/wc_workaround: off/variant: avx512f)
Last 9 lines of /home/test-user/actions-runner/_work/pmdk/pmdk/src/test/pmem2_mem_ext/pmemcheck4.log below (whole file has 9 lines):
==1341046== pmemcheck-1.0, a simple persistent store checker
==1341046== Copyright (c) 2014-2020, Intel Corporation
==1341046== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==1341046== Command: /home/test-user/actions-runner/_work/pmdk/pmdk/src/test/pmem2_mem_ext/pmem2_mem_ext /mnt/pmem0/pmem2_mem_ext_4/testfile C 1024 0
==1341046== Parent PID: 1320522
==1341046==
==1341046==
==1341046== Number of stores not made persistent: 0
==1341046== ERROR SUMMARY: 0 errors
Last 30 lines of /home/test-user/actions-runner/_work/pmdk/pmdk/src/test/pmem2_mem_ext/pmem2_4.log below (whole file has 55 lines):
: <3> [init.c:597 pmem2_arch_init] using movnt SSE2
: <3> [map_posix.c:293 pmem2_map_new] cfg 0x4232f50 src 0x4232fa0 map_ptr 0x1ffefffa40
: <3> [source_posix.c:143 pmem2_source_alignment] type 2
: <4> [source_posix.c:173 pmem2_source_alignment] alignment 4096
: <3> [source_posix.c:92 pmem2_source_size] type 2
: <4> [source_posix.c:132 pmem2_source_size] file length 4194304
: <4> [map_posix.c:140 map_reserve] system choice 0x7e09000
: <4> [map_posix.c:149 map_reserve] hint 0x8000000
: <15> [map_posix.c:204 file_map] reserve 0x8000000 len 4194304 proto 3 flags 10 fd 8 offset 0 map_sync 0x1ffefff97f
: <4> [map_posix.c:228 file_map] mmap with MAP_SYNC succeeded
: <3> [map_posix.c:477 pmem2_map_new] mapped at 0x8000000
: <15> [auto_flush_linux.c:140 pmem2_auto_flush]
: <15> [auto_flush_linux.c:175 pmem2_auto_flush] Start traversing region: /sys/bus/nd/devices/region0
: <3> [auto_flush_linux.c:86 check_domain_in_region] region_path: /sys/bus/nd/devices/region0
: <3> [auto_flush_linux.c:30 check_cpu_cache] domain_path: /sys/bus/nd/devices/region0/persistence_domain
: <15> [auto_flush_linux.c:64 check_cpu_cache] detected persistent_domain: memory_controller
: <15> [auto_flush_linux.c:69 check_cpu_cache] cpu_cache not in persistent_domain: /sys/bus/nd/devices/region0/persistence_domain
: <3> [map_posix.c:522 pmem2_map_new] using libpmem2 default async mover
: <3> [mover.c:181 mover_new] map 0x42344f0, vdm 0x1ffefff958
: <6> [ravl.c:395 ravl_emplace]
: <3> [map.c:44 pmem2_map_get_size] map 0x42344f0
: <3> [map.c:32 pmem2_map_get_address] map 0x42344f0
: <15> [memcpy_t_sse2.c:218 memmove_mov_sse2_empty] dest 0x8000400 src 0x8000000 len 1024
: <15> [persist.c:125 pmem2_drain]
: <15> [init.c:26 memory_barrier]
: <3> [map_posix.c:580 pmem2_map_delete] map_ptr 0x1ffefffa40
: <6> [ravl.c:526 ravl_find]
: <6> [ravl.c:526 ravl_find]
: <6> [ravl.c:547 ravl_remove]
: <3> [libpmem2.c:44 libpmem2_fini]
Last 3 lines of /home/test-user/actions-runner/_work/pmdk/pmdk/src/test/pmem2_mem_ext/out4.log below (whole file has 3 lines):
pmem2_mem_ext/TEST4: START: pmem2_mem_ext default avx avx512f movdir64b
/home/test-user/actions-runner/_work/pmdk/pmdk/src/test/pmem2_mem_ext/pmem2_mem_ext /mnt/pmem0/pmem2_mem_ext_4/testfile C 1024 0
pmem2_mem_ext/TEST4: DONE
Last 0 lines of /home/test-user/actions-runner/_work/pmdk/pmdk/src/test/pmem2_mem_ext/err4.log below (whole file has 0 lines):
Last 3 lines of /home/test-user/actions-runner/_work/pmdk/pmdk/src/test/pmem2_mem_ext/trace4.log below (whole file has 3 lines):
{pmem2_mem_ext.c:86 main} pmem2_mem_ext/TEST4: START: pmem2_mem_ext default avx avx512f movdir64b
/home/test-user/actions-runner/_work/pmdk/pmdk/src/test/pmem2_mem_ext/pmem2_mem_ext /mnt/pmem0/pmem2_mem_ext_4/testfile C 1024 0
{pmem2_mem_ext.c:143 main} pmem2_mem_ext/TEST4: DONE
pmem2_mem_ext/TEST4: FAILED (short/debug/pmemcheck/byte/wc_workaround: off/variant: avx512f)
Pattern: memmove_mov_avx512f occurs 0 times. One expected. Type: C Flag id: 0
pmem2_mem_ext/TEST4: SETUP (short/debug/pmemcheck/byte/wc_workaround: default/variant: sse2)
pmem2_mem_ext/TEST4: PASS [06.076 s]
pmem2_mem_ext/TEST4: SETUP (short/debug/pmemcheck/byte/wc_workaround: default/variant: avx)
pmem2_mem_ext/TEST4: PASS [06.150 s]
pmem2_mem_ext/TEST4: SETUP (short/debug/pmemcheck/byte/wc_workaround: default/variant: avx512f)
Last 9 lines of /home/test-user/actions-runner/_work/pmdk/pmdk/src/test/pmem2_mem_ext/pmemcheck4.log below (whole file has 9 lines):
==1341106== pmemcheck-1.0, a simple persistent store checker
==1341106== Copyright (c) 2014-2020, Intel Corporation
==1341106== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==1341106== Command: /home/test-user/actions-runner/_work/pmdk/pmdk/src/test/pmem2_mem_ext/pmem2_mem_ext /mnt/pmem0/pmem2_mem_ext_4/testfile C 1024 0
==1341106== Parent PID: 1320522
==1341106==
==1341106==
==1341106== Number of stores not made persistent: 0
==1341106== ERROR SUMMARY: 0 errors
Last 30 lines of /home/test-user/actions-runner/_work/pmdk/pmdk/src/test/pmem2_mem_ext/pmem2_4.log below (whole file has 54 lines):
: <3> [init.c:597 pmem2_arch_init] using movnt SSE2
: <3> [map_posix.c:293 pmem2_map_new] cfg 0x4232f50 src 0x4232fa0 map_ptr 0x1ffefffa60
: <3> [source_posix.c:143 pmem2_source_alignment] type 2
: <4> [source_posix.c:173 pmem2_source_alignment] alignment 4096
: <3> [source_posix.c:92 pmem2_source_size] type 2
: <4> [source_posix.c:132 pmem2_source_size] file length 4194304
: <4> [map_posix.c:140 map_reserve] system choice 0x7e09000
: <4> [map_posix.c:149 map_reserve] hint 0x8000000
: <15> [map_posix.c:204 file_map] reserve 0x8000000 len 4194304 proto 3 flags 10 fd 8 offset 0 map_sync 0x1ffefff99f
: <4> [map_posix.c:228 file_map] mmap with MAP_SYNC succeeded
: <3> [map_posix.c:477 pmem2_map_new] mapped at 0x8000000
: <15> [auto_flush_linux.c:140 pmem2_auto_flush]
: <15> [auto_flush_linux.c:175 pmem2_auto_flush] Start traversing region: /sys/bus/nd/devices/region0
: <3> [auto_flush_linux.c:86 check_domain_in_region] region_path: /sys/bus/nd/devices/region0
: <3> [auto_flush_linux.c:30 check_cpu_cache] domain_path: /sys/bus/nd/devices/region0/persistence_domain
: <15> [auto_flush_linux.c:64 check_cpu_cache] detected persistent_domain: memory_controller
: <15> [auto_flush_linux.c:69 check_cpu_cache] cpu_cache not in persistent_domain: /sys/bus/nd/devices/region0/persistence_domain
: <3> [map_posix.c:522 pmem2_map_new] using libpmem2 default async mover
: <3> [mover.c:181 mover_new] map 0x42344f0, vdm 0x1ffefff978
: <6> [ravl.c:395 ravl_emplace]
: <3> [map.c:44 pmem2_map_get_size] map 0x42344f0
: <3> [map.c:32 pmem2_map_get_address] map 0x42344f0
: <15> [memcpy_t_sse2.c:218 memmove_mov_sse2_empty] dest 0x8000400 src 0x8000000 len 1024
: <15> [persist.c:125 pmem2_drain]
: <15> [init.c:26 memory_barrier]
: <3> [map_posix.c:580 pmem2_map_delete] map_ptr 0x1ffefffa60
: <6> [ravl.c:526 ravl_find]
: <6> [ravl.c:526 ravl_find]
: <6> [ravl.c:547 ravl_remove]
: <3> [libpmem2.c:44 libpmem2_fini]
Last 3 lines of /home/test-user/actions-runner/_work/pmdk/pmdk/src/test/pmem2_mem_ext/out4.log below (whole file has 3 lines):
pmem2_mem_ext/TEST4: START: pmem2_mem_ext default avx avx512f movdir64b
/home/test-user/actions-runner/_work/pmdk/pmdk/src/test/pmem2_mem_ext/pmem2_mem_ext /mnt/pmem0/pmem2_mem_ext_4/testfile C 1024 0
pmem2_mem_ext/TEST4: DONE
Last 0 lines of /home/test-user/actions-runner/_work/pmdk/pmdk/src/test/pmem2_mem_ext/err4.log below (whole file has 0 lines):
Last 3 lines of /home/test-user/actions-runner/_work/pmdk/pmdk/src/test/pmem2_mem_ext/trace4.log below (whole file has 3 lines):
{pmem2_mem_ext.c:86 main} pmem2_mem_ext/TEST4: START: pmem2_mem_ext default avx avx512f movdir64b
/home/test-user/actions-runner/_work/pmdk/pmdk/src/test/pmem2_mem_ext/pmem2_mem_ext /mnt/pmem0/pmem2_mem_ext_4/testfile C 1024 0
{pmem2_mem_ext.c:143 main} pmem2_mem_ext/TEST4: DONE
pmem2_mem_ext/TEST4: FAILED (short/debug/pmemcheck/byte/wc_workaround: default/variant: avx512f)

@grom72 grom72 added the Type: Bug A previously unknown bug in PMDK label May 15, 2023
@grom72 grom72 added this to the 1.13 on GHA milestone May 15, 2023
@grom72 grom72 changed the title pmem2_mem_ext/TEST[1-4: failed pmemcheck pmem2_mem_ext/TEST[1-4]: failed pmemcheck May 15, 2023
@grom72 grom72 added QA: CI .github/ and utils/ related to automated testing libpmem2 libpmem- and libpmem2-related labels May 15, 2023
@grom72
Contributor Author

grom72 commented May 15, 2023

See https://github.com/pmem/pmdk/actions/runs/4969298952/jobs/8892382347#step:6:4640
This test should be skipped, as it is for MOVDIR64B.

pmem2_mem_ext/TEST4: FAILED	(short/debug/pmemcheck/byte/wc_workaround: default/variant: avx512f)
Pattern: memmove_mov_avx512f occurs 0 times. One expected. Type: C Flag id: 0
pmem2_mem_ext/TEST5: SETUP	(short/debug/pmemcheck/byte/wc_workaround: on/variant: movdir64b)
pmem2_mem_ext/TEST5: SKIP: MOVDIR64B unavailable
pmem2_mem_ext/TEST5: SETUP	(short/debug/pmemcheck/byte/wc_workaround: off/variant: movdir64b)
pmem2_mem_ext/TEST5: SKIP: MOVDIR64B unavailable
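
One way to express the requested behaviour is to base the skip decision on the features the process under test can actually see, rather than on a probe run outside Valgrind. The sketch below is hypothetical: the names variant_outcome and NEEDED_FEATURE are illustrative and not part of the PMDK test framework. It only mirrors the SKIP message format shown above.

```python
# Hypothetical mapping from test variant to the CPU feature it needs.
NEEDED_FEATURE = {
    "sse2": None,  # baseline on x86_64, nothing extra needed
    "avx": "avx",
    "avx512f": "avx512f",
    "movdir64b": "movdir64b",
}


def variant_outcome(variant: str, features_seen: set) -> str:
    """Return "RUN" or a SKIP message for a test variant.

    `features_seen` must describe what the process under test can see
    (for example under Valgrind), not what the bare CPU offers.
    """
    needed = NEEDED_FEATURE[variant]
    if needed is None or needed in features_seen:
        return "RUN"
    return f"SKIP: {needed.upper()} unavailable"


# Assumed example input: only AVX is visible to the process under Valgrind.
under_valgrind = {"avx"}
for name in ("sse2", "avx", "avx512f", "movdir64b"):
    print(name, "->", variant_outcome(name, under_valgrind))
```

Under that assumption the avx512f variant would be reported as skipped in the same way as the movdir64b variant, instead of failing on the missing log pattern.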

@grom72
Contributor Author

grom72 commented May 15, 2023

See #4715 ;)

@grom72 grom72 added Exposure: Medium Priority: 3 medium Type: External Bug A bug not in PMDK CI: Disabled Temporarily disabled from testing and removed Exposure: Medium QA: CI .github/ and utils/ related to automated testing labels May 15, 2023
@grom72 grom72 changed the title pmem2_mem_ext/TEST[1-4]: failed pmemcheck pmem2_mem_ext failure because of Valgrind not supporting avx512f. May 18, 2023
grom72 added a commit to grom72/pmdk that referenced this issue May 18, 2023
Disable avx512f until Valgrind provides support for avx512f instructions.
See: pmem#5640

Signed-off-by: Tomasz Gromadzki <tomasz.gromadzki@intel.com>
grom72 added a commit to grom72/pmdk that referenced this issue May 18, 2023
Disable avx512f tests until
Valgrind provides support for avx512f instructions.

See: pmem#5640

Signed-off-by: Tomasz Gromadzki <tomasz.gromadzki@intel.com>
@grom72
Contributor Author

grom72 commented May 18, 2023

It looks like Valgrind does not support avx512f.
is_cpu_feature_present(0x7, EBX_IDX, bit_AVX512F); returns 0 under Valgrind but 1 when the function is called directly on the CPU.
During the pmem2_mem_ext test setup, the cpufd program is used to determine whether avx512f is available.
This program uses the cpu.c source code directly, without any Valgrind instrumentation, and correctly detects avx512f availability.
The pmem2_mem_ext test itself, however, runs under Valgrind, where libpmem2 library initialization does not detect avx512f support.

<libpmem2>: <1> [out.c:209 out_init] pid 1572171: program: /home/tgromadz/repos/pmdk/src/test/pmem2_mem_ext/pmem2_mem_ext
<libpmem2>: <1> [out.c:211 out_init] libpmem2 version 0.0
<libpmem2>: <1> [out.c:215 out_init] src version: 1.13.0+git36.g850a09941
<libpmem2>: <1> [out.c:223 out_init] compiled with support for Valgrind pmemcheck
<libpmem2>: <1> [out.c:228 out_init] compiled with support for Valgrind helgrind
<libpmem2>: <1> [out.c:233 out_init] compiled with support for Valgrind memcheck
<libpmem2>: <1> [out.c:238 out_init] compiled with support for Valgrind drd
<libpmem2>: <1> [out.c:243 out_init] compiled with support for shutdown state
<libpmem2>: <1> [out.c:248 out_init] compiled with libndctl 63+
<libpmem2>: <3> [libpmem2.c:29 libpmem2_init] 
<libpmem2>: <3> [init.c:560 pmem2_arch_init] 
<libpmem2>: <3> [init.c:472 pmem_cpuinfo_to_funcs] 
<libpmem2>: <4> [cpu.c:135 is_cpu_clflush_present] CLFLUSH supported
<libpmem2>: <3> [init.c:475 pmem_cpuinfo_to_funcs] clflush supported
<libpmem2>: <4> [cpu.c:147 is_cpu_clflushopt_present] CLFLUSHOPT not supported
<libpmem2>: <4> [cpu.c:159 is_cpu_clwb_present] CLWB not supported
<libpmem2>: <4> [cpu.c:123 is_cpu_genuine_intel] CPU vendor: GenuineIntel
<libpmem2>: <3> [init.c:517 pmem_cpuinfo_to_funcs] WC workaround forced to 1
<libpmem2>: <3> [init.c:527 pmem_cpuinfo_to_funcs] WC workaround = 1
<libpmem2>: <4> [cpu.c:171 is_cpu_avx_present] AVX supported
<libpmem2>: <3> [init.c:272 use_avx_memcpy_memset] avx supported
<libpmem2>: <3> [init.c:276 use_avx_memcpy_memset] PMEM_AVX set to 0
<libpmem2>: <4> [cpu.c:183 is_cpu_avx512f_present] AVX512f not supported
<libpmem2>: <4> [cpu.c:196 is_cpu_movdir64b_present] movdir64b not supported
<libpmem2>: <3> [init.c:588 pmem2_arch_init] using clflush
<libpmem2>: <3> [init.c:599 pmem2_arch_init] using movnt SSE2
<libpmem2>: <3> [map_posix.c:293 pmem2_map_new] cfg 0x8032d30 src 0x8032db0 map_ptr 0x1ffeffec50

Below is a fragment of the log from running pmem2_mem_ext directly from the command line:

<libpmem2>: <1> [out.c:209 out_init] pid 1569697: program: /home/tgromadz/repos/pmdk/src/test/pmem2_mem_ext/pmem2_mem_ext
<libpmem2>: <1> [out.c:211 out_init] libpmem2 version 0.0
<libpmem2>: <1> [out.c:215 out_init] src version: 1.13.0+git36.g850a09941
<libpmem2>: <1> [out.c:223 out_init] compiled with support for Valgrind pmemcheck
<libpmem2>: <1> [out.c:228 out_init] compiled with support for Valgrind helgrind
<libpmem2>: <1> [out.c:233 out_init] compiled with support for Valgrind memcheck
<libpmem2>: <1> [out.c:238 out_init] compiled with support for Valgrind drd
<libpmem2>: <1> [out.c:243 out_init] compiled with support for shutdown state
<libpmem2>: <1> [out.c:248 out_init] compiled with libndctl 63+
<libpmem2>: <3> [libpmem2.c:29 libpmem2_init] 
<libpmem2>: <3> [init.c:560 pmem2_arch_init] 
<libpmem2>: <3> [init.c:472 pmem_cpuinfo_to_funcs] 
<libpmem2>: <4> [cpu.c:135 is_cpu_clflush_present] CLFLUSH supported
<libpmem2>: <3> [init.c:475 pmem_cpuinfo_to_funcs] clflush supported
<libpmem2>: <4> [cpu.c:147 is_cpu_clflushopt_present] CLFLUSHOPT supported
<libpmem2>: <3> [init.c:483 pmem_cpuinfo_to_funcs] clflushopt supported
<libpmem2>: <4> [cpu.c:159 is_cpu_clwb_present] CLWB supported
<libpmem2>: <3> [init.c:496 pmem_cpuinfo_to_funcs] clwb supported
<libpmem2>: <4> [cpu.c:123 is_cpu_genuine_intel] CPU vendor: GenuineIntel
<libpmem2>: <3> [init.c:527 pmem_cpuinfo_to_funcs] WC workaround = 1
<libpmem2>: <4> [cpu.c:171 is_cpu_avx_present] AVX supported
<libpmem2>: <3> [init.c:272 use_avx_memcpy_memset] avx supported
<libpmem2>: <3> [init.c:280 use_avx_memcpy_memset] PMEM_AVX enabled
<libpmem2>: <4> [cpu.c:183 is_cpu_avx512f_present] AVX512f supported
<libpmem2>: <3> [init.c:372 use_avx512f_memcpy_memset] avx512f supported
<libpmem2>: <3> [init.c:380 use_avx512f_memcpy_memset] PMEM_AVX512F enabled
<libpmem2>: <4> [cpu.c:196 is_cpu_movdir64b_present] movdir64b not supported
<libpmem2>: <3> [init.c:584 pmem2_arch_init] using clwb
<libpmem2>: <3> [init.c:595 pmem2_arch_init] using movnt AVX512F

For that reason, the test will be disabled until Valgrind properly supports avx512f.
See https://bugs.kde.org/show_bug.cgi?id=383010 as a reference.
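
The two libpmem2 logs above differ in the feature lines emitted from cpu.c. A small sketch of how that difference could be extracted automatically is shown here. The helper names parse_features and feature_diff are hypothetical, and the regular expression assumes the exact "[cpu.c:N is_cpu_X_present] X supported / not supported" wording seen in these logs.

```python
import re

# Matches e.g. "[cpu.c:183 is_cpu_avx512f_present] AVX512f not supported"
FEATURE_RE = re.compile(
    r"\[cpu\.c:\d+ is_cpu_(\w+)_present\] \S+ (not supported|supported)"
)


def parse_features(log_text: str) -> dict:
    """Map a feature name (e.g. "avx512f") to True/False as logged by cpu.c."""
    features = {}
    for match in FEATURE_RE.finditer(log_text):
        features[match.group(1)] = match.group(2) == "supported"
    return features


def feature_diff(native: dict, valgrind: dict) -> list:
    """List features reported natively but not under Valgrind."""
    return sorted(
        name for name, ok in native.items()
        if ok and not valgrind.get(name, False)
    )


# Feature lines copied from the two logs quoted in this comment.
VALGRIND_LOG = """\
<libpmem2>: <4> [cpu.c:135 is_cpu_clflush_present] CLFLUSH supported
<libpmem2>: <4> [cpu.c:147 is_cpu_clflushopt_present] CLFLUSHOPT not supported
<libpmem2>: <4> [cpu.c:159 is_cpu_clwb_present] CLWB not supported
<libpmem2>: <4> [cpu.c:171 is_cpu_avx_present] AVX supported
<libpmem2>: <4> [cpu.c:183 is_cpu_avx512f_present] AVX512f not supported
<libpmem2>: <4> [cpu.c:196 is_cpu_movdir64b_present] movdir64b not supported
"""
NATIVE_LOG = """\
<libpmem2>: <4> [cpu.c:135 is_cpu_clflush_present] CLFLUSH supported
<libpmem2>: <4> [cpu.c:147 is_cpu_clflushopt_present] CLFLUSHOPT supported
<libpmem2>: <4> [cpu.c:159 is_cpu_clwb_present] CLWB supported
<libpmem2>: <4> [cpu.c:171 is_cpu_avx_present] AVX supported
<libpmem2>: <4> [cpu.c:183 is_cpu_avx512f_present] AVX512f supported
<libpmem2>: <4> [cpu.c:196 is_cpu_movdir64b_present] movdir64b not supported
"""

print(feature_diff(parse_features(NATIVE_LOG), parse_features(VALGRIND_LOG)))
# ['avx512f', 'clflushopt', 'clwb']
```

On these excerpts the comparison shows that avx512f is not the only feature reported differently: clflushopt and clwb are also reported as unsupported under Valgrind.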

@grom72 grom72 added Priority: 4 low and removed Priority: 3 medium Type: Bug A previously unknown bug in PMDK labels May 18, 2023
@grom72 grom72 removed this from the 1.13 on GHA milestone May 18, 2023
@grom72 grom72 self-assigned this May 18, 2023
@grom72 grom72 modified the milestones: 1.14, 1.15 May 18, 2023
@janekmi janekmi added the Type: Bug A previously unknown bug in PMDK label Sep 14, 2023
@janekmi janekmi modified the milestones: 2.0.1, 2.0.2 Nov 23, 2023
@janekmi janekmi modified the milestones: 2.1.0, 2.x Feb 8, 2024