This repo is the official implementation of "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" as well as the follow-ups. It currently includes code and models for the following tasks:

- Image Classification: Included in this repo.
- Object Detection and Instance Segmentation: See Swin Transformer for Object Detection.
- Semantic Segmentation: See Swin Transformer for Semantic Segmentation.
- Video Action Recognition: See Video Swin Transformer.
- Semi-Supervised Object Detection: See Soft Teacher.
- SSL: Contrastive Learning: See Transformer-SSL.
- SSL: Masked Image Modeling: See get_started.md#simmim-support.
- Mixture-of-Experts: See get_started for more instructions.
- Feature-Distillation: See Feature-Distillation.

News:

- Nvidia's FasterTransformer now supports Swin Transformer V2 inference, which brings significant speed improvements on T4 and A100 GPUs.
- Models and code of Feature Distillation are released. Please refer to Feature-Distillation for details and the checkpoints (FD-EsViT-Swin-B, FD-DeiT-ViT-B, FD-DINO-ViT-B, FD-CLIP-ViT-B, FD-CLIP-ViT-L).
- Merged SimMIM, a Masked Image Modeling based pre-training approach applicable to Swin and SwinV2 (and also applicable to ViT and ResNet). Please refer to "get started with SimMIM" to play with SimMIM pre-training.
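To make the SimMIM item above concrete: masked image modeling pre-training hides a random subset of image patches and trains the network to reconstruct them. Below is a minimal, hedged sketch of the random patch-masking step only; the function name and the 0.6 mask ratio are illustrative assumptions, not the exact configuration used in this repo.

```python
import random

def random_patch_mask(num_patches: int, mask_ratio: float = 0.6):
    """Return a boolean list marking which patches are masked.

    In SimMIM-style pre-training, masked patches are replaced by a
    learnable mask token and the model regresses their raw pixels.
    Names and defaults here are illustrative assumptions.
    """
    num_masked = int(num_patches * mask_ratio)
    mask = [True] * num_masked + [False] * (num_patches - num_masked)
    random.shuffle(mask)  # randomize which patches are hidden
    return mask

# Example: a 224x224 image split into 16x16 patches gives 14*14 = 196 patches
mask = random_patch_mask(14 * 14)
print(sum(mask))  # number of masked patches
```

The actual pre-training recipe (mask token injection, prediction head, loss) lives in the SimMIM code linked from the getting-started guide.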