[CVPR-2023 Workshop@NFVLR] Official PyTorch implementation of "Learning CLIP Guided Visual-Text Fusion Transformer for Video-based Pedestrian Attribute Recognition"
transformer
pedestrian-attribute-recognition
multi-modal-fusion
video-based-attribute-recognition
visual-text-fusion
Updated Jun 11, 2024 - Python