Guard against duplicate builder_kwargs/config_kwargs in load_dataset_… #7622

Open: Shohail-Ismail wants to merge 1 commit into main from fix-build-kwarg-conflict


@Shohail-Ismail Shohail-Ismail commented Jun 17, 2025

…builder (#4910)

What does this PR do?

Fixes an edge case in load_dataset_builder by raising a TypeError when the same key appears in both builder_kwargs and config_kwargs.
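
For context, here is a minimal sketch of the plain Python behavior that is presumably behind the linked issue: unpacking two dictionaries that share a key into a single call raises a TypeError on its own, with a message that does not say which dictionary is at fault. The function `build` and both dictionaries are hypothetical stand-ins, not code from the datasets library.

```python
def build(**kwargs):
    """Hypothetical stand-in for a builder constructor (not the datasets API)."""
    return kwargs


# Hypothetical inputs that share the key "base_path".
builder_kwargs = {"base_path": "a"}
config_kwargs = {"base_path": "b"}

message = None
try:
    # Python rejects the call because "base_path" is supplied twice.
    build(**builder_kwargs, **config_kwargs)
except TypeError as err:
    message = str(err)

print(message)
```

An explicit check before such a call makes it possible to report the conflicting keys by name.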

Implementation details

  • Added a guard clause in load_dataset_builder to detect duplicate keys between builder_kwargs and config_kwargs

  • Wrote a unit test in tests/test_load_duplicate_keys.py to verify the exception is raised correctly
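
The two points above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the helper name `check_no_duplicate_kwargs`, the error message wording, and the test body are all hypothetical.

```python
def check_no_duplicate_kwargs(builder_kwargs, config_kwargs):
    """Raise TypeError if any key appears in both dictionaries.

    Hypothetical helper; the name and message are assumptions for illustration.
    """
    duplicates = sorted(set(builder_kwargs) & set(config_kwargs))
    if duplicates:
        raise TypeError(
            "Keys present in both builder_kwargs and config_kwargs: "
            + ", ".join(duplicates)
        )


def test_duplicate_keys_raise():
    """The guard should reject overlapping keys and name them."""
    try:
        check_no_duplicate_kwargs({"base_path": "a"}, {"base_path": "b"})
    except TypeError as err:
        assert "base_path" in str(err)
    else:
        raise AssertionError("expected TypeError for duplicate keys")


def test_disjoint_keys_pass():
    """Disjoint dictionaries should pass the guard without raising."""
    assert check_no_duplicate_kwargs({"base_path": "a"}, {"name": "x"}) is None


test_duplicate_keys_raise()
test_disjoint_keys_pass()
```

Sorting the overlapping keys keeps the error message deterministic, which makes it easier to assert on in a test.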

Fixes

Closes #4910

Reviewers

@zach-huggingface
@SunMarc

Would appreciate your review if you have time - thanks!

@Shohail-Ismail Shohail-Ismail force-pushed the fix-build-kwarg-conflict branch from e740409 to 61de0c5 Compare June 17, 2025 18:35

Shohail-Ismail commented Jul 2, 2025

Hi folks, this PR fixes the duplicate-kwargs edge case and includes a unit test. Would love a review when you have a moment!

@zach-huggingface
@SunMarc

Successfully merging this pull request may close these issues.

Identical keywords in build_kwargs and config_kwargs lead to TypeError in load_dataset_builder()
1 participant