Bhatt, Dvijesh, and Priyank Thakkar. “Improving Narrative Coherence in Dense Video Captioning through Transformer and Large Language Models.” Journal of Innovative Image Processing 7, no. 2 (June 3, 2025): 333–361. Accessed March 20, 2026. https://irojournals.com/iroiip/article/view/783.