ParallelRunner hangs on Linux Server #4176

Open
Dekermanjian opened this issue Sep 18, 2024 · 3 comments
Labels
Community Issue/PR opened by the open-source community

Comments

@Dekermanjian

Dekermanjian commented Sep 18, 2024

Description

I have a pipeline that I would like to run using the ParallelRunner. When I run this pipeline on my local Windows machine, it works just fine. However, when I run the exact same pipeline on a Linux server (Rocky Linux), it hangs at the dataset-loading stage.

  • Kedro version used (pip show kedro or kedro -V): 0.19.8
  • Python version used (python -V): 3.11.8
  • Operating system and version: Rocky Linux version 8.10
@noklam
Contributor

noklam commented Sep 18, 2024

Can you provide some more context? If possible, please share a simplified version of the repository so that we can try to reproduce the issue locally.

merelcht added the Community (Issue/PR opened by the open-source community) label Sep 18, 2024
@Dekermanjian
Author

Dekermanjian commented Sep 18, 2024

@noklam Yeah, of course. Let me try to put together something simple that will hang on the server and then I'll share the repo with you.

@Dekermanjian
Author

Dekermanjian commented Sep 18, 2024

@noklam Okay, I figured out why it is not working. I just don't understand why it doesn't work on Linux but it does on Windows. Here is a simple example: https://github.com/Dekermanjian/test-parallel-runner
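
One platform difference that may be relevant: on Python 3.11, `multiprocessing` defaults to the `spawn` start method on Windows and to `fork` on Linux. Whether that is what causes the hang here is an assumption and has not been confirmed in this thread. The sketch below only reports which start method the interpreter would use, so the two machines can be compared; the function name `start_method_report` is made up for illustration.

```python
import multiprocessing as mp
import platform


def start_method_report() -> dict:
    """Collect the process start-method facts that differ between platforms."""
    return {
        "platform": platform.system(),
        # Default start method for this interpreter on this OS.
        "default_start_method": mp.get_start_method(),
        # Every start method this platform supports.
        "available_start_methods": mp.get_all_start_methods(),
    }


if __name__ == "__main__":
    print(start_method_report())
```

Running this on both the Windows machine and the Rocky Linux server would show whether the two environments start worker processes differently.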

The reason it is not working on the Linux server is that I am loading a parquet file in my settings.py file. When I load that file in the simple example, the ParallelRunner hangs at the dataset-loading stage. If you comment that line out (line 6), it works. You can generate the data by running the notebook I created.
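
A possible workaround, sketched under the assumption that the hang is tied to the parquet read happening at import time in settings.py: defer the read behind a cached function so that importing the module does no I/O. The path `data/01_raw/reference.parquet` and the name `get_reference_data` are hypothetical and do not come from the linked repository.

```python
from functools import lru_cache
from pathlib import Path

# Hypothetical location; the thread does not say which file settings.py loads.
REFERENCE_PATH = Path("data/01_raw/reference.parquet")


@lru_cache(maxsize=1)
def get_reference_data(path: str = str(REFERENCE_PATH)):
    """Read the parquet file on first call rather than at import time."""
    # Imported lazily so that importing this module stays free of heavy work.
    import pandas as pd

    return pd.read_parquet(path)
```

Code that needs the data calls `get_reference_data()` when it actually uses it; repeated calls in the same process reuse the cached DataFrame. This has not been tested against the reproduction repository.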

Sorry, let me add the command to run: kedro run --runner=ParallelRunner -p data_processing

3 participants