This uses the ViTImageProcessor and ViTForImageClassification classes to extract logits from the model's classifier and show the result to the user as a label. It could be an awesome utility if you have a large image-based corpus and need to label it.
Also, learning the representation in a self-supervised fashion works, and the idea is pretty simple: minimize the loss between the teacher's predicted representation and the student's.
Check this out for more details: https://arxiv.org/pdf/2104.14294.pdf
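To make the teacher/student idea concrete, here is a rough sketch of that kind of loss in plain Python: a cross-entropy between the teacher's softmax output and the student's. The function names, the temperature values and the centering vector are my own illustrative assumptions, not code from the paper, and a real training loop would do this on batched tensors.

```python
import math

def softmax(values, temperature):
    """Numerically stable softmax of values / temperature."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def log_softmax(values, temperature):
    """Numerically stable log-softmax of values / temperature."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    log_total = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_total for s in scaled]

def distillation_loss(student_out, teacher_out, center,
                      student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy between teacher and student distributions.

    The temperatures and the centering step are assumed settings for
    illustration. The teacher output is treated as a fixed target.
    """
    centered = [t - c for t, c in zip(teacher_out, center)]
    teacher_probs = softmax(centered, teacher_temp)
    student_log_probs = log_softmax(student_out, student_temp)
    return -sum(p * lp for p, lp in zip(teacher_probs, student_log_probs))
```

The loss is small when the student puts its probability mass where the teacher does, and large when it does not, so minimizing it pulls the student toward the teacher.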
Just a few lines of code and you are able to classify images.
Check this out: https://huggingface.co/facebook/dino-vitb8
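Those few lines could look roughly like the sketch below. The checkpoint name "google/vit-base-patch16-224" is my assumption (any checkpoint with a trained classification head should work the same way), and classify and top_label are hypothetical helper names used only for illustration.

```python
def top_label(logits, id2label):
    """Return the label whose logit is largest."""
    best_index = max(range(len(logits)), key=lambda i: logits[i])
    return id2label[best_index]

def classify(image_path, checkpoint="google/vit-base-patch16-224"):
    """Classify one image file; the checkpoint is an assumed default."""
    # Imported lazily so top_label stays usable without these packages.
    import torch
    from PIL import Image
    from transformers import ViTForImageClassification, ViTImageProcessor

    processor = ViTImageProcessor.from_pretrained(checkpoint)
    model = ViTForImageClassification.from_pretrained(checkpoint)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, num_labels)

    return top_label(logits[0].tolist(), model.config.id2label)
```

Calling classify("photo.jpg") with a local image path (the file name here is just a placeholder) returns the predicted label as a string.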
Class label – jersey, T-shirt, tee shirt
Leveraging HuggingFace, AWS SageMaker, Flask and basic Python scripting, we build fast!