Volume - 4 | Issue - 4 | December 2022
DOI: 10.36548/jiip.2022.4.009
Published: 27 January, 2023
Caption generation has long been of interest to researchers in artificial intelligence. The ability to train a system to describe an image or scene accurately has broad applications in robotic vision, management, and many other areas. The purpose of this study is to analyze multiple transfer learning strategies and to create a unique system for improving caption accuracy. To increase object relevance, image feature vectors are constructed using multiple state-of-the-art models and fed into an attention-based encoder-decoder transformer network. The model is evaluated on benchmark datasets such as MS-COCO using metrics such as Bilingual Evaluation Understudy (BLEU).
Keywords: Image retrieval, caption generator, artificial intelligence, generative adversarial network, image analysis
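The abstract names Bilingual Evaluation Understudy (BLEU) as an evaluation criterion. As a rough illustration of what that metric measures, the following is a minimal sketch of sentence-level BLEU with clipped n-gram precision and a brevity penalty. It is not the authors' evaluation code: the function names (`ngram_counts`, `modified_precision`, `bleu`), the whitespace tokenization, the uniform weights over 1- to 4-grams, and the absence of smoothing are all assumptions made for this example.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate against several references."""
    cand = ngram_counts(candidate, n)
    if not cand:
        return 0.0
    # For each n-gram, the largest count seen in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in ngram_counts(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand.items())
    return clipped / sum(cand.values())


def bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform n-gram weights (assumed)."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # without smoothing, any zero precision gives a zero score
    log_avg = sum(math.log(p) for p in precisions) / max_n
    c = len(candidate)
    # Reference length closest to the candidate length (ties: the shorter one).
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    brevity_penalty = 1.0 if c > r else math.exp(1.0 - r / c)
    return brevity_penalty * math.exp(log_avg)


# Hypothetical captions, tokenized by whitespace for illustration only.
generated = "a dog runs on the grass".split()
human = ["a dog runs on the grass".split(),
         "a dog is running across a lawn".split()]
score = bleu(generated, human)
```

A generated caption that matches one reference exactly scores 1.0, and a caption sharing no words with any reference scores 0.0. Published results are usually corpus-level scores with smoothing, so numbers from this sketch are not comparable to reported figures.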