ModelInfo bug #2186

Closed · Narsil opened this issue Apr 2, 2024 · 18 comments · Fixed by #2190

Labels: bug (Something isn't working)

Comments

@Narsil (Contributor) commented Apr 2, 2024

### Describe the bug

Cannot load model information for some repos.

### Reproduction

```python
from huggingface_hub import HfApi

api = HfApi()
api.model_info("CohereForAI/c4ai-command-r-v01")
```

### Logs

```shell
TypeError: SafeTensorsInfo.__init__() got an unexpected keyword argument 'sharded'
```
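For context, a minimal sketch of the failure mode (using a simplified stand-in class, not the actual `hf_api.py` definition): a dataclass raises a `TypeError` for any keyword argument it does not declare, so a new field in the server payload breaks older clients.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class SafeTensorsInfo:  # simplified stand-in for the real class
    parameters: List[Dict[str, int]]
    total: int

# The Hub started returning an extra "sharded" key in its payload;
# forwarding it to the constructor reproduces the same error:
SafeTensorsInfo(parameters=[], total=0, sharded=True)
# TypeError: SafeTensorsInfo.__init__() got an unexpected keyword argument 'sharded'
```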


### System info

```shell
- huggingface_hub version: 0.22.2
- Platform: Linux-5.15.0-1048-aws-x86_64-with-glibc2.31
- Python version: 3.11.6
- Running in iPython ?: No
- Running in notebook ?: No
- Running in Google Colab ?: No
- Token path ?: /data/token
- Has saved token ?: True
- Who am I ?: Narsil
- Configured git credential helpers:
- FastAI: N/A
- Tensorflow: 2.16.1
- Torch: 2.2.1
- Jinja2: 3.1.2
- Graphviz: N/A
- keras: 3.1.1
- Pydot: N/A
- Pillow: 10.2.0
- hf_transfer: 0.1.5
- gradio: 4.16.0
- tensorboard: N/A
- numpy: 1.26.4
- pydantic: 2.6.4
- aiohttp: 3.8.5
- ENDPOINT: https://huggingface.co
- HF_HUB_CACHE: /data/hub
- HF_ASSETS_CACHE: /data/assets
- HF_TOKEN_PATH: /data/token
- HF_HUB_OFFLINE: False
- HF_HUB_DISABLE_TELEMETRY: False
- HF_HUB_DISABLE_PROGRESS_BARS: None
- HF_HUB_DISABLE_SYMLINKS_WARNING: False
- HF_HUB_DISABLE_EXPERIMENTAL_WARNING: False
- HF_HUB_DISABLE_IMPLICIT_TOKEN: False
- HF_HUB_ENABLE_HF_TRANSFER: False
- HF_HUB_ETAG_TIMEOUT: 10
- HF_HUB_DOWNLOAD_TIMEOUT: 10
```

@Narsil added the bug (Something isn't working) label on Apr 2, 2024
@kresimirfijacko commented Apr 2, 2024

It happened to me as well while I was working (no changes in dependencies/code, etc.). I guess it is something related to the HF API?

@Wauplin (Contributor) commented Apr 2, 2024

Thanks for reporting @Narsil @kresimirfijacko! Will check if this is a breaking change server-side and open a PR to fix it client-side anyway.

@kresimirfijacko

> Thanks for reporting @Narsil @kresimirfijacko! Will check if this is a breaking change server-side and open a PR to fix it client-side anyway.

Yeah, it's probably something server-side. I experienced it in vLLM, which uses huggingface-hub under the hood:

```shell
python -u -m vllm.entrypoints.openai.api_server \
    --model Qwen/Qwen1.5-72B-Chat-GPTQ-Int4 \
    ...
```

It all of a sudden stopped working. When I changed --model to an absolute path on disk, it worked OK.

@kubs0ne commented Apr 2, 2024

Hey, I have the same issue; it started happening today around 3 PM. I cannot use the vLLM server or the huggingface-cli to download the model. Everything returns the same error:

```shell
TypeError: SafeTensorsInfo.__init__() got an unexpected keyword argument 'sharded'
```

@Supermax197

Workaround: add a `sharded` field to `SafeTensorsInfo` in `hf_api.py`, like this:

```python
from dataclasses import asdict, dataclass
from typing import Dict, List, Optional

@dataclass
class SafeTensorsInfo(dict):
    parameters: List[Dict[str, int]]
    total: int
    sharded: Optional[bool] = None

    def __post_init__(self):  # hack to keep SafeTensorsInfo backward compatible
        self.update(asdict(self))
```
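With this patch applied, the reproduction above should return a `ModelInfo` object instead of raising the `TypeError`. Note this is a local stopgap only; see the proper fixes further down the thread.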

@0-hero commented Apr 2, 2024

+1

@Whylickspittle

> Workaround: add a `sharded` field to `SafeTensorsInfo` in `hf_api.py` (see the snippet above).

It works, bro!

@KevinNaidoo commented Apr 2, 2024

Just hit the same issue. It was working earlier today.

```shell
docker run --runtime nvidia --gpus all \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HUGGING_FACE_HUB_TOKEN=<secret>" \
    -p 8000:8000 \
    --ipc=host \
    vllm/vllm-openai:latest \
    --model TheBloke/Mixtral-8x7B-Instruct-v0.1-AWQ --quantization awq --tensor-parallel-size 4
```

@jyotsnar commented Apr 2, 2024

Same issue here too. Started today.

@Supermax197

I think the server side is fixed, so this workaround is deprecated.

> Workaround: add a `sharded` field to `SafeTensorsInfo` in `hf_api.py` (see the snippet above).

@Wauplin (Contributor) commented Apr 2, 2024

Hey everyone, thanks for quickly reporting issues and suggesting a workaround. The failure is indeed due to a server-side change, and we are discussing solutions to mitigate it. In the meantime, I opened #2190 to fix the issue client-side (which will make the class future-proof). To get an immediate fix, please install from this branch:

```shell
pip install git+https://github.com/huggingface/huggingface_hub@2186-fix-safetensors-info
```

EDIT: no need to install a new version of huggingface_hub. A server-side fix has been deployed, making the fix above optional. See #2186 (comment).
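For the curious, a minimal sketch of one way to make such a dataclass tolerate unknown server fields (an illustration of the idea only, not necessarily the exact approach taken in #2190):

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

@dataclass
class SafeTensorsInfo:
    parameters: List[Dict[str, int]]
    total: int
    sharded: Optional[bool] = None

    @classmethod
    def from_api_response(cls, data: Dict[str, Any]) -> "SafeTensorsInfo":
        # Keep only keys the dataclass declares and silently drop anything
        # new the server may add, so future fields cannot break the client.
        known = cls.__dataclass_fields__.keys()
        return cls(**{k: v for k, v in data.items() if k in known})
```

`from_api_response` here is a hypothetical helper name; the point is simply to filter the payload before it reaches `__init__`.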

@binhnq94 commented Apr 2, 2024

How can I re-upload my model after a 24h training process died because of this bug?
I still have the model folder locally.
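For reference, re-uploading an existing local folder can be done with `HfApi.upload_folder`; a minimal sketch (the repo id and path below are placeholders, not values from this thread):

```python
from huggingface_hub import HfApi

api = HfApi()
# Placeholders: substitute your own repo id and local model directory.
api.upload_folder(
    repo_id="your-username/your-model",
    folder_path="/path/to/local/model",
    repo_type="model",
)
```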

@martina-zxy

+1

@momo-exaion commented Apr 2, 2024

Thank you @Wauplin! As a reminder, you can use this syntax to include the optional dependencies:

```shell
pip install "huggingface_hub[cli,hf_transfer] @ git+https://github.com/huggingface/huggingface_hub@2186-fix-safetensors-info"
```

EDIT: no need to install a new version of huggingface_hub. A server-side fix has been deployed, making the fix above optional. See #2186 (comment).

@youkaichao

> I think the server side is fixed, so this workaround is deprecated.

Is this fixed? I still get this error :(

@Wauplin (Contributor) commented Apr 2, 2024

A fix was deployed a few minutes ago. This should be fixed for everyone without updating any dependencies. Sorry again for the inconvenience, and thanks everyone for your reactivity on this 🤗
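For completeness, the original reproduction should now succeed on any released version without changes; a quick check (the printed attributes are just illustrative fields of the returned `ModelInfo`):

```python
from huggingface_hub import HfApi

info = HfApi().model_info("CohereForAI/c4ai-command-r-v01")
print(info.sha)          # commit hash of the repo's main revision
print(info.safetensors)  # the SafeTensorsInfo that previously failed to parse
```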

@binhnq94 commented Apr 4, 2024

> Hey everyone, thanks for quickly reporting issues and suggesting a workaround. [...] To get an immediate fix, please install from this branch:
>
> pip install git+https://github.com/huggingface/huggingface_hub@2186-fix-safetensors-info

I got an error: `git checkout -q 2186-fix-safetensors-info` did not run successfully.
Now that this is fixed, which huggingface_hub version should I use?

@Wauplin (Contributor) commented Apr 4, 2024

> I got an error: `git checkout -q 2186-fix-safetensors-info` did not run successfully.
> Now that this is fixed, which huggingface_hub version should I use?

Yes, sorry, that PR has been merged and is now on main. But another fix has been deployed server-side, meaning you don't even need to update your dependencies. Any huggingface_hub version from PyPI will work correctly.
