VinVL (Visual features in Vision-Language) is an object-attribute detection model that specializes in image encoding. If you’re unfamiliar with vision-language (VL) systems, they are driven by machine learning and provide a way to search images with a text query or to search text for a matching image. These systems can also give natural language descriptions of the content within an image. VL systems typically combine two components: image encoding and vision-language fusion. Microsoft Research says VinVL is an image encoding model that works alongside existing VL fusion modules to produce accurate image/text matching results.
Breakthrough
VinVL has topped leaderboards across a range of VL benchmarks, such as Microsoft’s own COCO Image Captioning, Novel Object Captioning, and Visual Question Answering (VQA). Furthermore, the new model surpasses human performance on the nocaps leaderboard by a large margin.
“VinVL has demonstrated great potential in improving image encoding for VL understanding. Our newly developed image encoding model can benefit a wide range of VL tasks, as illustrated by examples in this paper. Despite the promising results we obtained, such as surpassing human performance on image captioning benchmarks, our model is by no means reaching the human-level intelligence of VL understanding.
“Interesting directions of future works include: (1) further scale up the object-attribute detection pretraining by leveraging massive image classification/tagging data, and (2) extend the methods of cross-modal VL representation learning to building perception-grounded language models that can ground visual concepts in natural language and vice versa like humans do.”
Microsoft says it will fold VinVL into Azure Cognitive Services. That means it will be available to customers of the platform, which works across services such as LinkedIn and Office 365. Additionally, the project will be open source and available to all developers.