Person reidentification (Re-ID) aims to match specific pedestrians across nonoverlapping camera views. Because datasets differ dramatically from one another, transferring a Re-ID model trained on a source domain to a target domain is challenging. Some leading unsupervised domain adaptation (UDA) Re-ID methods use clustering to generate pseudolabels and then optimize the model on the target domain, but these pseudolabels are inevitably noisy. To address these issues, we propose a framework named mini-transformer with pooling (MTP), which facilitates the generation of higher-quality pseudolabels by improving the model's feature representation capability. First, we introduce an effective mini-transformer (MT) that can be placed directly after the CNN as a feature extractor to capture long-range dependencies. Then, we design two pooling methods, global hybrid pooling (GHP) and global subvalue pooling (GSVP), to suit the mini-transformer's capability without increasing computational complexity. Specifically, GHP retains more global information, and GSVP retains more discriminative information. Finally, experiments on four mainstream UDA Re-ID tasks demonstrate that MTP achieves mAP and rank-1 accuracy competitive with current state-of-the-art methods, suggesting that our technique is simple but effective. Beyond UDA Re-ID, MTP can also be extended to supervised retrieval tasks.
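The clustering-based pseudolabel step mentioned above can be illustrated with a minimal sketch. This is not the MTP pipeline: the abstract does not say which clustering algorithm is used, so the example substitutes a simple threshold-linking scheme over cosine distance, and the names `cosine_distance`, `pseudolabels`, and the `eps` threshold are hypothetical. Feature vectors are assumed to be nonzero.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two nonzero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def pseudolabels(features, eps=0.1):
    """Assign a cluster id to every sample (illustrative stand-in only).

    Samples closer than `eps` are linked, and each connected group
    becomes one pseudo-identity. Isolated samples get the label -1,
    mirroring how outliers are often left out of training.
    """
    n = len(features)
    labels = [None] * n
    next_id = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        labels[i] = -2  # temporary "visited" marker
        stack, members = [i], []
        while stack:
            j = stack.pop()
            members.append(j)
            for k in range(n):
                if labels[k] is None and cosine_distance(features[j], features[k]) < eps:
                    labels[k] = -2
                    stack.append(k)
        if len(members) == 1:
            labels[i] = -1  # no neighbour within eps: treat as noise
        else:
            for m in members:
                labels[m] = next_id
            next_id += 1
    return labels


# Toy target-domain features: two tight groups and one outlier.
features = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0], [0.05, 0.99], [-1.0, -1.0]]
labels = pseudolabels(features)
```

In this toy setting, the quality of the labels depends entirely on how well the features separate identities, which is the motivation the abstract gives for strengthening the feature extractor.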
Transformers
Performance modeling
Data modeling
Cameras
Head
Visual process modeling
Visualization