
[v0.1.3] Added Hugging Face Hub support. #26

Merged · dbolya merged 6 commits into facebookresearch:main from v0.1.3 on Mar 2, 2024

Conversation

@dbolya (Contributor) commented on Jan 26, 2024

Added Hugging Face Hub support using PyTorchModelHubMixin, but only if huggingface-hub is installed. Also added a config object as a member variable of Hiera that keeps track of the args to the __init__ function so that the model can be reconstructed later.
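For illustration, a minimal sketch of how a mixin-based class with config capture could look (the constructor arguments below are placeholders, not Hiera's real signature):

import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin  # optional dependency

class Hiera(nn.Module, PyTorchModelHubMixin):
    def __init__(self, embed_dim: int = 96, num_heads: int = 1, **kwargs):
        super().__init__()
        # Record the constructor arguments so the model can be rebuilt later
        # by from_pretrained (placeholder fields, not the actual Hiera args).
        self.config = dict(embed_dim=embed_dim, num_heads=num_heads, **kwargs)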

Sample usage:

import hiera

model = hiera.hiera_tiny_224(pretrained=True)

# Save locally
model.save_pretrained("hiera-tiny-224", config=model.config)

# Save to the Hub
model.push_to_hub("dbolya/hiera-tiny-224", config=model.config)

# Load from the Hub
from hiera import Hiera
model = Hiera.from_pretrained("dbolya/hiera-tiny-224")

If huggingface-hub is not installed, all of these functions will raise a RuntimeError.
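One way the optional dependency could be handled (a sketch, not necessarily the PR's exact approach): import the mixin if available, otherwise substitute a stub whose Hub-related methods raise.

try:
    from huggingface_hub import PyTorchModelHubMixin
except ImportError:
    class PyTorchModelHubMixin:
        # Stub so the class definition still works without huggingface-hub;
        # any Hub-related call raises a RuntimeError instead.
        def _missing(self, *args, **kwargs):
            raise RuntimeError("Install huggingface-hub to use this feature: pip install huggingface-hub")

        save_pretrained = push_to_hub = _missing

        @classmethod
        def from_pretrained(cls, *args, **kwargs):
            raise RuntimeError("Install huggingface-hub to use this feature: pip install huggingface-hub")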

Thanks @NielsRogge for guidance on this (#25).

@NielsRogge

Looks great to me!

We're also improving the Mixin class to support sharding large checkpoints (as in the Transformers library), as well as safetensors serialization by default (right now pickle is being used). cc @luccabb

@dbolya (Contributor, Author) commented on Feb 7, 2024

@chayryali Here's a script to upload all the existing models:

import hiera

org = "facebook"

model_names = [x for x in hiera.__dict__ if x.startswith("hiera_") and x.endswith("224")]
print(f"Uploading models: {model_names}")

for model_name in model_names:
    model_func = getattr(hiera, model_name)
    print()

    # Only upload the default checkpoints for now
    for checkpoint in [model_func.default]:
        model = model_func(pretrained=True, checkpoint=checkpoint)

        hf_model_name = model_name.replace("_", "-")
        hf_chkpt_name = checkpoint.replace("_", "-")

        print(f"Pushing {hf_model_name} checkpoint {hf_chkpt_name}...")
        model.push_to_hub(f"{org}/{hf_model_name}", config=model.config)

Since pushing to named branches doesn't seem to work, I've decided to only upload the default (in1k / k400) finetuned models, as those are likely the most useful to most people.

@dbolya (Contributor, Author) commented on Feb 7, 2024

@chayryali Here's a version that appends .<checkpoint_name> to the repo name, in case we want to upload all checkpoints.

import hiera

org = "facebook"

model_names = [x for x in hiera.__dict__ if x.startswith("hiera_") and x.endswith("224")]
print(f"Uploading models: {model_names}")

for model_name in model_names:
    model_func = getattr(hiera, model_name)
    print()

    # Upload every available checkpoint for this model
    for checkpoint in model_func.checkpoints:
        model = model_func(pretrained=True, checkpoint=checkpoint)

        repo_name = f"{org}/{model_name}.{checkpoint}"
        print(f"Pushing {model_name} checkpoint {checkpoint} as {repo_name}...")

        model.push_to_hub(repo_name, config=model.config)

@NielsRogge commented on Feb 10, 2024

Great! Would you like to push them to the Meta organization (assuming you're still at Meta ;) )? Otherwise I could do it.

Also note that we could make the Hiera models work directly with the Transformers library by following this guide: https://huggingface.co/docs/transformers/custom_models#using-a-model-with-custom-code. This is a different approach from the mixin class: rather than integrating push_to_hub and from_pretrained within a custom package, it lets your models be loaded through the Transformers library itself.
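For reference, a rough sketch of what that guide's custom-code route looks like (class names and fields here are illustrative, not taken from this PR):

from transformers import PretrainedConfig, PreTrainedModel

class HieraConfig(PretrainedConfig):
    model_type = "hiera"

    def __init__(self, embed_dim: int = 96, **kwargs):
        self.embed_dim = embed_dim
        super().__init__(**kwargs)

class HieraModel(PreTrainedModel):
    config_class = HieraConfig

    def __init__(self, config):
        super().__init__(config)
        # ... build the actual network here ...

# After registering, the classes can be pushed to the Hub and loaded with
# AutoModel.from_pretrained(..., trust_remote_code=True).
HieraConfig.register_for_auto_class()
HieraModel.register_for_auto_class("AutoModel")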

@chayryali (Contributor)
Thanks @dbolya, @NielsRogge! Pushed to the Hub!

@dbolya (Contributor, Author) commented on Feb 29, 2024

Thanks @chayryali! Though it seems like they were uploaded as safetensors somehow. When I ran the command, it uploaded as a pytorch_model.bin. @NielsRogge, do you know why that's the case? The ModelMixin class doesn't seem to work with safetensors as far as I can tell. Or am I missing something? I currently get an error looking for pytorch_model.bin when trying to load them.

@NielsRogge

Hi, awesome to see all models were uploaded! 🥳 🙌

Regarding the use of safetensors by default: I asked for the Mixin class to be updated since the Transformers library also does this, given that it's a safer format. This was included in the new release: https://github.com/huggingface/huggingface_hub/releases/tag/v0.21.0.

Are you fine with the use of safetensors? Download metrics will work now for each of the model repositories on the hub.

You can also now define your config attributes as a dataclass rather than a dictionary (both are supported): https://huggingface.co/docs/huggingface_hub/main/en/guides/integrations#a-concrete-example-pytorch.
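A rough sketch of what a dataclass-based config might look like with the mixin, per that guide (field names here are illustrative, not Hiera's real config):

from dataclasses import dataclass
import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin

@dataclass
class HieraConfig:
    embed_dim: int = 96
    num_heads: int = 1

class Hiera(nn.Module, PyTorchModelHubMixin):
    def __init__(self, config: HieraConfig):
        super().__init__()
        # Per the linked guide, recent huggingface_hub versions can serialize
        # a dataclass config to config.json when saving/pushing the model.
        self.config = config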

@NielsRogge

Also, as a more general comment, I was wondering how we could make researchers at Meta aware of this Mixin class, which we provide to easily push model checkpoints to the Hub, reload them using from_pretrained, get download metrics, etc.

There are many repositories on GitHub that all seem to be leveraging Torch Hub, which does not allow for easy discovery and doesn't come with model cards/tags (e.g. you can't easily find an "image classification" model among models stored on Torch Hub) or download metrics. Some examples:

https://github.com/facebookresearch/segment-anything
https://github.com/facebookresearch/dinov2
https://github.com/facebookresearch/ImageBind
https://github.com/facebookresearch/ijepa
https://github.com/facebookresearch/jepa
https://github.com/facebookresearch/nougat
https://github.com/facebookresearch/audioseal
https://github.com/facebookresearch/co-tracker

Do you have any suggestions regarding how we could make researchers at Meta more aware of this tooling?

I would like to improve the discoverability of models developed at Meta.

E.g. people can now type "hiera" in the search bar on hf.co, the models can be tagged with "image classification", etc., which helps with discovering these models, finding documentation, and so on.

Kind regards,

Niels

@Wauplin commented on Mar 1, 2024

> Thanks @chayryali! Though it seems like they were uploaded as safetensors somehow. When I ran the command, it uploaded as a pytorch_model.bin. @NielsRogge, do you know why that's the case? The ModelMixin class doesn't seem to work with safetensors as far as I can tell. Or am I missing something? I currently get an error looking for pytorch_model.bin when trying to load them.

Hi @dbolya, as @NielsRogge mentioned, safe serialization to .safetensors files has shipped in the latest version of huggingface_hub, and it looks like the weights were uploaded to the Hub with that version. This means that if you want to load models from the Hub, you simply have to update your huggingface_hub version (pip install -U huggingface_hub) and it should work correctly. For the record, the newest version is able to load from both pytorch_model.bin and model.safetensors, so updating the lib will not break loading of previous weights :)

@dbolya (Contributor, Author) commented on Mar 2, 2024

Thanks @NielsRogge, @Wauplin! Safetensors is perfectly fine (and actually preferred). I've updated the PR to include a version check for hub version >= 0.21.0, which solved the issue.
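For reference, a guard along these lines could look like this (a sketch; the PR's actual check may differ):

import huggingface_hub
from packaging import version

# Safetensors serialization in the mixin requires huggingface_hub >= 0.21.0.
if version.parse(huggingface_hub.__version__) < version.parse("0.21.0"):
    raise RuntimeError("Please update huggingface_hub: pip install -U huggingface_hub")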

As for the discussion: speaking from experience, everyone just does what other people are doing. Everyone had a Torch Hub integration, and it was easy enough to implement, so that's why we included it. If a critical mass of projects starts using the Hugging Face Hub and it's easy to integrate, I'm sure the same thing will happen.

I think huggingface/huggingface_hub#2079 is a good start for that, since one of the blockers for implementing huggingface hub vs. torch hub was how the config is handled and saved.

And with that, I think this version is ready. I'll merge now, but feel free to continue the discussion in this thread.

dbolya merged commit 42d33a7 into facebookresearch:main on Mar 2, 2024
1 check passed
dbolya deleted the v0.1.3 branch on March 2, 2024 at 04:15
@NielsRogge

Great to see the integration!

Some further comments:

  • I see the README mentions that models can be saved as follows:

model.save_pretrained("hiera-base-224", config=model.config)

However, the config is now automatically saved when using save_pretrained, so there's no need to explicitly pass it anymore; you can just do (cc @Wauplin):

model.save_pretrained("hiera-base-224")

  • It would be great to add model cards to the Hiera models on the Hub (model cards are just READMEs, so one could add a minimal model card explaining what the model is, the paper it links to, how to run it, etc.). I've opened a PR here as a demonstration.

Regarding making more researchers at Meta aware of this, I could start by opening pull requests on the repositories mentioned above. If you know of any other ways to make more people aware of this, let me know! 🤗
