AMD ROCm on Windows does not work - hipErrorSharedObjectInitFailed #2119

Open
vasicvuk opened this issue Apr 16, 2024 · 1 comment
vasicvuk commented Apr 16, 2024

Installed the latest version of the AMD drivers. Graphics card: 7900 XTX.

No vmfb found. Compiling and saving to D:\nodeai shark\euler_scale_model_input_1_512_512_rocm_fp16.vmfb
Configuring for device:rocm://0
could not execute `iree-run-module --dump_devices=rocm`
Did not find ROCm architecture from `--iree-rocm-target-chip` flag
 or from `iree-run-module --dump_devices=rocm` command.
Using gfx1100 as ROCm arch for compilation.
Saved vmfb in D:\nodeai shark\euler_scale_model_input_1_512_512_rocm_fp16.vmfb.
Loading module D:\nodeai shark\euler_scale_model_input_1_512_512_rocm_fp16.vmfb...
Traceback (most recent call last):
  File "C:\Users\xxx\AppData\Local\Temp\_MEI61762\gradio\queueing.py", line 489, in call_prediction
    output = await route_utils.call_process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\xxx\AppData\Local\Temp\_MEI61762\gradio\route_utils.py", line 232, in call_process_api
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\xxx\AppData\Local\Temp\_MEI61762\gradio\blocks.py", line 1561, in process_api
    result = await self.call_function(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\xxx\AppData\Local\Temp\_MEI61762\gradio\blocks.py", line 1191, in call_function
    prediction = await utils.async_iteration(iterator)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\xxx\AppData\Local\Temp\_MEI61762\gradio\utils.py", line 519, in async_iteration
    return await iterator.__anext__()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\xxx\AppData\Local\Temp\_MEI61762\gradio\utils.py", line 512, in __anext__
    return await anyio.to_thread.run_sync(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "anyio\to_thread.py", line 56, in run_sync
  File "anyio\_backends\_asyncio.py", line 2134, in run_sync_in_worker_thread
  File "anyio\_backends\_asyncio.py", line 851, in run
  File "C:\Users\xxx\AppData\Local\Temp\_MEI61762\gradio\utils.py", line 495, in run_sync_iterator_async
    return next(iterator)
           ^^^^^^^^^^^^^^
  File "C:\Users\xxx\AppData\Local\Temp\_MEI61762\gradio\utils.py", line 666, in gen_wrapper
    yield from f(*args, **kwargs)
  File "ui\txt2img_ui.py", line 194, in txt2img_inf
  File "apps\stable_diffusion\src\schedulers\sd_schedulers.py", line 141, in get_schedulers
  File "apps\stable_diffusion\src\schedulers\shark_eulerdiscrete.py", line 147, in compile
  File "apps\stable_diffusion\src\schedulers\shark_eulerdiscrete.py", line 123, in _import
  File "apps\stable_diffusion\src\utils\utils.py", line 187, in compile_through_fx
  File "apps\stable_diffusion\src\utils\utils.py", line 84, in _compile_module
  File "shark\shark_inference.py", line 232, in load_module
    params = load_flatbuffer(
             ^^^^^^^^^^^^^^^^
  File "shark\iree_utils\compile_utils.py", line 517, in load_flatbuffer
    vmfb, config, temp_file_to_unlink = load_vmfb_using_mmap(
                                        ^^^^^^^^^^^^^^^^^^^^^
  File "shark\iree_utils\compile_utils.py", line 448, in load_vmfb_using_mmap
    ctx.add_vm_module(mmaped_vmfb)
  File "iree\runtime\system_api.py", line 271, in add_vm_module
  File "iree\runtime\system_api.py", line 268, in add_vm_modules
RuntimeError: Error registering modules: C:\actions-runner\w\SRT\SRT\c\experimental\rocm\status_util.c:31: INTERNAL; rocm driver error 'hipErrorSharedObjectInitFailed' (303): shared object initialization failed; while invoking native function hal.executable.create; while calling import;
[ 1]   native hal.executable.create:0 -
[ 0] bytecode module@1:284 -
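For context on the log above: SHARK could not read the ROCm architecture from the `--iree-rocm-target-chip` flag or from `iree-run-module --dump_devices=rocm`, so it fell back to `gfx1100`. A minimal sketch of that fallback order (the function and parameter names here are illustrative, not SHARK's actual code):

```python
import re


def pick_rocm_arch(flag_value=None, dump_output=None, default="gfx1100"):
    """Choose a ROCm target chip, mirroring the fallback order in the log:
    1. an explicit --iree-rocm-target-chip flag value,
    2. an arch parsed from `iree-run-module --dump_devices=rocm` output,
    3. a hard-coded default (gfx1100).
    """
    if flag_value:
        return flag_value
    if dump_output:
        # Look for a gfx identifier (e.g. "gfx1100") in the device dump.
        match = re.search(r"gfx[0-9a-f]+", dump_output)
        if match:
            return match.group(0)
    return default
```

If the card is not actually a gfx1100 part, this default produces a vmfb compiled for the wrong chip, which is one way to arrive at `hipErrorSharedObjectInitFailed` at load time.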

rohit-mp commented Aug 4, 2024

Similar issue, but it seems to be trying to access something from the D: drive when my system doesn't even have one.

local-sync devices are available.
local-task devices are available.
vulkan devices are available.
metal devices are not available.
cuda devices are not available.
hip devices are available.
Clearing .mlir temporary files from a prior run. This may take some time...
Clearing .mlir temporary files took 0.0000 seconds.
gradio temporary image cache located at C:\Users\rohit\Downloads\sd\shark_tmp\gradio. You may change this by setting the GRADIO_TEMP_DIR environment variable.
No temporary images files to clear.
diffusers\utils\outputs.py:63: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
Running on local URL:  http://0.0.0.0:8080


To create a public link, set `share=True` in `launch()`.


[LOG] Submitting Request...


[LOG] Initializing new pipeline...


[LOG] Pipeline initialized with pipe_id: stabilityai_stable-diffusion-2-1-base_1_64_512x512_fp16_gfx1100.


[LOG] Preparing pipeline...

Missing files: clip.vmfb, unet.vmfb, vae_decode.vmfb, clip.safetensors, unet.safetensors, vae_decode.safetensors

huggingface_hub\file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
Saved params to C:\Users\rohit\Downloads\sd\models\stabilityai_stable-diffusion-2-1-base_fp16\clip.safetensors

Compiling to rocm with flags: ['--iree-hal-target-backends=rocm', '--iree-rocm-target-chip=gfx1100', '--iree-opt-const-eval=false', '--iree-vm-bytecode-module-output-format=flatbuffer-binary', '--iree-global-opt-propagate-transposes=true', '--iree-opt-outer-dim-concat=true', '--iree-vm-target-truncate-unsupported-floats', '--iree-llvmgpu-enable-prefetch=true', '--iree-opt-data-tiling=false', '--iree-opt-aggressively-propagate-transposes=true', '--iree-flow-enable-aggressive-fusion', '--iree-global-opt-enable-fuse-horizontal-contractions=true', '--iree-codegen-gpu-native-math-precision=true', '--iree-codegen-llvmgpu-use-vector-distribution=true', '--iree-preprocessing-pass-pipeline=builtin.module(iree-preprocessing-transpose-convolution-pipeline, iree-global-opt-raise-special-ops, util.func(iree-preprocessing-pad-to-intrinsics))', '--iree-codegen-transform-dialect-library=C:\\Users\\rohit\\Downloads\\sd\\models\\stabilityai_stable-diffusion-2-1-base_1_64_512x512_fp16_gfx1100\\attention_and_matmul_spec_wmma.mlir']

Saved to C:\Users\rohit\Downloads\sd\models\stabilityai_stable-diffusion-2-1-base_1_64_512x512_fp16_gfx1100\clip.mlir

Saved to C:\Users\rohit\Downloads\sd\models\stabilityai_stable-diffusion-2-1-base_1_64_512x512_fp16_gfx1100\clip.vmfb

huggingface_hub\file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
Saved params to C:\Users\rohit\Downloads\sd\models\stabilityai_stable-diffusion-2-1-base_fp16\unet.safetensors

Compiling to rocm with flags: ['--iree-hal-target-backends=rocm', '--iree-rocm-target-chip=gfx1100', '--iree-opt-const-eval=false', '--iree-vm-bytecode-module-output-format=flatbuffer-binary', '--iree-global-opt-propagate-transposes=true', '--iree-opt-outer-dim-concat=true', '--iree-vm-target-truncate-unsupported-floats', '--iree-llvmgpu-enable-prefetch=true', '--iree-opt-data-tiling=false', '--iree-opt-aggressively-propagate-transposes=true', '--iree-flow-enable-aggressive-fusion', '--iree-global-opt-enable-fuse-horizontal-contractions=true', '--iree-codegen-gpu-native-math-precision=true', '--iree-codegen-llvmgpu-use-vector-distribution=true', '--iree-preprocessing-pass-pipeline=builtin.module(iree-preprocessing-transpose-convolution-pipeline, iree-global-opt-raise-special-ops, util.func(iree-preprocessing-pad-to-intrinsics))', '--iree-codegen-transform-dialect-library=C:\\Users\\rohit\\Downloads\\sd\\models\\stabilityai_stable-diffusion-2-1-base_1_64_512x512_fp16_gfx1100\\attention_and_matmul_spec_wmma.mlir']

Saved to C:\Users\rohit\Downloads\sd\models\stabilityai_stable-diffusion-2-1-base_1_64_512x512_fp16_gfx1100\unet.mlir

Saved to C:\Users\rohit\Downloads\sd\models\stabilityai_stable-diffusion-2-1-base_1_64_512x512_fp16_gfx1100\unet.vmfb

huggingface_hub\file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
Saved params to C:\Users\rohit\Downloads\sd\models\stabilityai_stable-diffusion-2-1-base_fp16\vae_decode.safetensors

Compiling to rocm with flags: ['--iree-hal-target-backends=rocm', '--iree-rocm-target-chip=gfx1100', '--iree-opt-const-eval=false', '--iree-vm-bytecode-module-output-format=flatbuffer-binary', '--iree-global-opt-propagate-transposes=true', '--iree-opt-outer-dim-concat=true', '--iree-vm-target-truncate-unsupported-floats', '--iree-llvmgpu-enable-prefetch=true', '--iree-opt-data-tiling=false', '--iree-opt-aggressively-propagate-transposes=true', '--iree-flow-enable-aggressive-fusion', '--iree-global-opt-enable-fuse-horizontal-contractions=true', '--iree-codegen-gpu-native-math-precision=true', '--iree-codegen-llvmgpu-use-vector-distribution=true', '--iree-preprocessing-pass-pipeline=builtin.module(iree-preprocessing-transpose-convolution-pipeline, iree-global-opt-raise-special-ops, util.func(iree-preprocessing-pad-to-intrinsics))', '--iree-codegen-transform-dialect-library=C:\\Users\\rohit\\Downloads\\sd\\models\\stabilityai_stable-diffusion-2-1-base_1_64_512x512_fp16_gfx1100\\attention_and_matmul_spec_wmma.mlir']

Saved to C:\Users\rohit\Downloads\sd\models\stabilityai_stable-diffusion-2-1-base_1_64_512x512_fp16_gfx1100\vae_decode.mlir

Saved to C:\Users\rohit\Downloads\sd\models\stabilityai_stable-diffusion-2-1-base_1_64_512x512_fp16_gfx1100\vae_decode.vmfb

All necessary files found.


[LOG] Loading pipeline to device rocm.

huggingface_hub\file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
Traceback (most recent call last):
  File "C:\Users\rohit\AppData\Local\Temp\_MEI66202\gradio\queueing.py", line 527, in process_events
    response = await route_utils.call_process_api(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rohit\AppData\Local\Temp\_MEI66202\gradio\route_utils.py", line 270, in call_process_api
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rohit\AppData\Local\Temp\_MEI66202\gradio\blocks.py", line 1847, in process_api
    result = await self.call_function(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rohit\AppData\Local\Temp\_MEI66202\gradio\blocks.py", line 1445, in call_function
    prediction = await utils.async_iteration(iterator)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rohit\AppData\Local\Temp\_MEI66202\gradio\utils.py", line 629, in async_iteration
    return await iterator.__anext__()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rohit\AppData\Local\Temp\_MEI66202\gradio\utils.py", line 622, in __anext__
    return await anyio.to_thread.run_sync(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "anyio\to_thread.py", line 56, in run_sync
  File "anyio\_backends\_asyncio.py", line 2177, in run_sync_in_worker_thread
  File "anyio\_backends\_asyncio.py", line 859, in run
  File "C:\Users\rohit\AppData\Local\Temp\_MEI66202\gradio\utils.py", line 605, in run_sync_iterator_async
    return next(iterator)
           ^^^^^^^^^^^^^^
  File "C:\Users\rohit\AppData\Local\Temp\_MEI66202\gradio\utils.py", line 788, in gen_wrapper
    response = next(iterator)
               ^^^^^^^^^^^^^^
  File "apps\shark_studio\api\sd.py", line 315, in shark_sd_fn_dict_input
  File "apps\shark_studio\api\sd.py", line 438, in shark_sd_fn
  File "apps\shark_studio\api\sd.py", line 244, in prepare_pipe
  File "turbine_models\custom_models\sd_inference\sd_pipeline.py", line 372, in load_pipeline
  File "turbine_models\model_runner.py", line 66, in __init__
  File "iree\runtime\system_api.py", line 191, in __init__
    self._vm_context = _binding.VmContext(
                       ^^^^^^^^^^^^^^^^^^^
RuntimeError: Error creating vm context with modules: D:\a\SRT\SRT\c\experimental\rocm\status_util.c:31: INTERNAL; rocm driver error 'hipErrorSharedObjectInitFailed' (303): shared object initialization failed; mismatched target chip? missing/wrong bitcode directory?; while invoking native function hal.executable.create; while calling import;
[ 1]   native hal.executable.create:0 -
[ 0] bytecode compiled_clip.__init:3500 [
    <stdin>:408:10
      at <stdin>:376:12,
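The error text itself hints at the likely cause: "mismatched target chip? missing/wrong bitcode directory?" (the `D:\a\SRT\SRT\...` path is the source path baked into the runtime binary at build time, not a path on the local machine). One sanity check is to compare the chip the vmfb was compiled for, taken from the logged compile flags, against the arch the device actually reports. A sketch under those assumptions (helper name and return strings are illustrative):

```python
def check_target_chip(compile_flags, device_arch):
    """Compare the chip a vmfb was compiled for (from the
    --iree-rocm-target-chip flag in the compile-flag list) against the
    arch the device reports; a mismatch is one known cause of
    hipErrorSharedObjectInitFailed."""
    prefix = "--iree-rocm-target-chip="
    compiled_for = next(
        (f[len(prefix):] for f in compile_flags if f.startswith(prefix)),
        None,
    )
    if compiled_for is None:
        return "no target chip flag found; compiler default was used"
    if compiled_for != device_arch:
        return f"mismatch: compiled for {compiled_for}, device is {device_arch}"
    return "ok"
```

For a 7900 XTX (gfx1100) the flags logged above should pass this check, so a failure on that hardware may instead point at the runtime's bitcode/device-library lookup on Windows.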
