504 Server error when running comet-score using multiple machines #162
Comments
Hmm, this seems to be a problem downloading the model, on the HF side. Have you tried again recently?
It could be that the HF Hub was down for a period.
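If the 504 is a transient Hub outage, retrying the download with backoff may be enough. The following is a minimal sketch, not part of COMET: `call_with_retries` and its parameters are hypothetical names introduced here, and the jittered delay is an assumption about what helps when many jobs hit the Hub at the same time.

```python
import random
import time


def call_with_retries(fn, max_attempts=5, base_delay=1.0, max_delay=60.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call fn() until it succeeds or max_attempts calls have failed.

    Hypothetical helper: fn is any zero-argument callable, for example a
    lambda that wraps the model download.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                # Out of attempts: surface the last error to the caller.
                raise
            # Exponential backoff capped at max_delay, with full jitter so
            # that parallel jobs do not all retry at the same moment.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))
```

With unbabel-comet installed, the download could be wrapped as `call_with_retries(lambda: download_model("Unbabel/wmt22-comet-da"))`, where `download_model` is imported from `comet`.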
@Smu-Tan have you solved your problem? I'm getting the same error when downloading the model.
@ricardorei Hi, I ran the code
and got this exception:
After checking, I found that available_legacy_metrics in comet/models/download_utils.py does not have the corresponding key-value pair. Can you update this file, or tell me how to download the model directly from HF? My current version of unbabel-comet is 2.2.0.
Hey! Hmm, this is weird. available_legacy_metrics should only be consulted when the model is not found on Hugging Face. What is your huggingface-hub version? Can you send me the pip freeze output?
OK, the following is the pip freeze list: the huggingface-hub version is huggingface-hub==0.16.4. I upgraded it to huggingface-hub==0.19.4, but it still fails with the same error :)
The problem was solved by manually downloading the model from the Hugging Face repo. Thanks.
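One way to do the manual download is to fetch the model once, before launching the parallel jobs, and reuse the local checkpoint. This is a sketch under assumptions: `predownload` and `checkpoint_in_snapshot` are hypothetical helper names, and the `checkpoints/model.ckpt` layout inside the repo is assumed rather than confirmed in this thread. `snapshot_download` is the `huggingface_hub` function that appears in the traceback.

```python
import os


def checkpoint_in_snapshot(snapshot_dir):
    """Return the checkpoint path inside a downloaded model snapshot.

    Assumption: the model repo stores its weights at checkpoints/model.ckpt.
    """
    path = os.path.join(snapshot_dir, "checkpoints", "model.ckpt")
    if not os.path.isfile(path):
        raise FileNotFoundError(f"No checkpoint found at {path}")
    return path


def predownload(repo_id, cache_dir=None):
    """Download a model snapshot once and return its checkpoint path."""
    # Imported lazily so the path helper above works without the package.
    from huggingface_hub import snapshot_download

    snapshot_dir = snapshot_download(repo_id=repo_id, cache_dir=cache_dir)
    return checkpoint_in_snapshot(snapshot_dir)
```

Running `predownload("Unbabel/wmt22-comet-da")` once on a node with network access, and then passing the returned path to `comet-score --model`, should keep the individual array tasks from contacting the Hub at all.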
You have to acknowledge the model's license on the web, then perform a CLI login before downloading it from your code.
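For models that require accepting a license, the login step can also be done from Python instead of the CLI. This is a rough sketch: `login_from_env` is a hypothetical helper, and reading the token from an `HF_TOKEN` environment variable is an assumption about how the job is configured. `login` is the `huggingface_hub` function.

```python
import os


def login_from_env(var="HF_TOKEN"):
    """Log in to the Hugging Face Hub with a token taken from the environment.

    Hypothetical helper: the variable name is an assumption; use whatever
    variable your job actually exports.
    """
    token = os.environ.get(var)
    if not token:
        raise RuntimeError(f"Set {var} to a Hugging Face access token first.")
    # Imported lazily so the check above runs even without the package.
    from huggingface_hub import login

    login(token=token)
```

The license itself still has to be accepted on the model's web page with the same account; the token only authenticates the download.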
I forgot about this issue. Thanks for answering, @mohataher.
SOLVED - had the same issue |
🐛 Bug
Hi! A 504 server error is encountered when running multiple comet-score scripts. See below:

Traceback (most recent call last):
  File "/home/stan1/anaconda3/envs/prefix_mt/lib/python3.9/site-packages/huggingface_hub/utils/_errors.py", line 261, in hf_raise_for_status
    response.raise_for_status()
  File "/home/stan1/anaconda3/envs/prefix_mt/lib/python3.9/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 504 Server Error: Gateway Time-out for url: https://huggingface.co/api/models/Unbabel/wmt22-comet-da/revision/main

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/stan1/anaconda3/envs/prefix_mt/lib/python3.9/site-packages/comet/models/__init__.py", line 46, in download_model
    model_path = snapshot_download(
  File "/home/stan1/anaconda3/envs/prefix_mt/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/stan1/anaconda3/envs/prefix_mt/lib/python3.9/site-packages/huggingface_hub/_snapshot_download.py", line 186, in snapshot_download
    repo_info = api.repo_info(repo_id=repo_id, repo_type=repo_type, revision=revision, token=token)
  File "/home/stan1/anaconda3/envs/prefix_mt/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/stan1/anaconda3/envs/prefix_mt/lib/python3.9/site-packages/huggingface_hub/hf_api.py", line 1868, in repo_info
    return method(
  File "/home/stan1/anaconda3/envs/prefix_mt/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/stan1/anaconda3/envs/prefix_mt/lib/python3.9/site-packages/huggingface_hub/hf_api.py", line 1678, in model_info
    hf_raise_for_status(r)
  File "/home/stan1/anaconda3/envs/prefix_mt/lib/python3.9/site-packages/huggingface_hub/utils/_errors.py", line 303, in hf_raise_for_status
    raise HfHubHTTPError(str(e), response=response) from e
huggingface_hub.utils._errors.HfHubHTTPError: 504 Server Error: Gateway Time-out for url: https://huggingface.co/api/models/Unbabel/wmt22-comet-da/revision/main

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/stan1/anaconda3/envs/prefix_mt/lib/python3.9/site-packages/comet/models/__init__.py", line 51, in download_model
    checkpoint_path = download_model_legacy(model, saving_directory)
  File "/home/stan1/anaconda3/envs/prefix_mt/lib/python3.9/site-packages/comet/models/download_utils.py", line 224, in download_model_legacy
    raise Exception(
Exception: Unbabel/wmt22-comet-da is not in the available_legacy_metrics or is a valid checkpoint folder.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/stan1/anaconda3/envs/prefix_mt/bin/comet-score", line 8, in <module>
    sys.exit(score_command())
  File "/home/stan1/anaconda3/envs/prefix_mt/lib/python3.9/site-packages/comet/cli/score.py", line 154, in score_command
    model_path = download_model(cfg.model, saving_directory=cfg.model_storage_path)
  File "/home/stan1/anaconda3/envs/prefix_mt/lib/python3.9/site-packages/comet/models/__init__.py", line 53, in download_model
    raise KeyError(f"Model {model} not supported by COMET.")
KeyError: Model Unbabel/wmt22-comet-da not supported by COMET.
To Reproduce
Here's the reproduction script template; please ignore the task and seed settings.
#!/bin/bash
RESULT_DIR=zero-shot
TASKS=(zs)
SEEDS=(1234)
SRCAR=('de' 'nl' 'sv' 'da' 'is')
TGTAR=('de' 'nl' 'sv' 'da' 'is')
# Map every (task, seed, source, target) combination to a unique index and
# run only the combination that matches this Slurm array task.
for (( t=0; t<${#TASKS[@]}; t++ ))
do
  for (( s=0; s<${#SEEDS[@]}; s++ ))
  do
    first_id=$((t*${#SEEDS[@]}+s))
    for (( i=0; i<${#SRCAR[@]}; i++ ))
    do
      second_id=$((first_id*${#SRCAR[@]}+i))
      for (( j=0; j<${#TGTAR[@]}; j++ ))
      do
        third_id=$((second_id*${#TGTAR[@]}+j))
        if [ "$third_id" -eq "$SLURM_ARRAY_TASK_ID" ]
        then
          SRC=${SRCAR[i]}
          TGT=${TGTAR[j]}
          # Skip pairs where source and target language are the same.
          if [[ "$SRC" != "$TGT" ]]
          then
            echo "SRC-TGT: $SRC-$TGT"
            SOURCE_SENT=${RESULT_DIR}/${SRC}-${TGT}/test-src.txt
            HYPOTHESIS=${RESULT_DIR}/${SRC}-${TGT}/test-sys.txt
            REFERENCE=${RESULT_DIR}/${SRC}-${TGT}/test-ref.txt
            comet-score -s "${SOURCE_SENT}" -t "${HYPOTHESIS}" -r "${REFERENCE}" --quiet --only_system > "${RESULT_DIR}/${SRC}-${TGT}/test_comet.txt"
          fi
        fi
      done
    done
  done
done
Environment
OS: Linux (slurm)
comet version: newest