Transformers in computer vision: ViT architectures, tips, tricks and improvements