Volume - 7 | Issue - 2 | June 2025
Published: 27 May 2025
This study presents a novel interpretability pipeline for image forgery localization that integrates GAN-generated adversarial forgeries with Grad-CAM visual explanations. The objective is to assess whether a deep learning classifier can not only detect but also spatially localize manipulated regions in digital images. A Deep Convolutional GAN (DCGAN) is trained to generate realistic forged patches, which are synthetically embedded into clean images to create new forgery instances. These synthetic images are then analyzed by an EfficientNet-based binary classifier. To elucidate the model's spatial focus, Grad-CAM is employed to visualize class-discriminative regions of interest. The analysis incorporates attention scores, Intersection over Union (IoU), recall, F1 score, MSE, and SSIM, enabling comprehensive comparison between heatmaps and ground-truth forged areas. Despite high attention scores, the results indicate poor localization performance, with IoU and pixel-wise F1 scores at zero: although the classifier reliably detects forged images, its spatial interpretation lacks the precision needed to pinpoint manipulations. Layer-wise visualization further reveals that the model's deep layers capture high-level features at the expense of precise spatial localization. These findings demonstrate that GAN-generated examples can expose meaningful limits of current interpretability methods, highlighting a disconnect between visual saliency and true spatial alignment and underscoring the need for more granular explanatory tools in image forensics. The proposed framework offers a scalable testbed for interpretability benchmarking in adversarial scenarios, establishes a precedent for evaluating explanations on synthetic forgeries, and contributes to the development of more explainable and robust AI models in high-stakes visual domains. Future research may explore embedding-aware Grad-CAM variants or localization-aware training objectives to improve forensic trustworthiness.
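To make the evaluation protocol concrete, the following is a minimal sketch (not the authors' implementation) of how a Grad-CAM heatmap can be scored against a ground-truth forgery mask using the metrics named in the abstract. The function name `localization_scores`, the 0.5 binarization threshold, and the assumption that the heatmap is normalized to [0, 1] and resized to the mask's resolution are illustrative choices, not details taken from the paper.

```python
# Sketch: scoring a Grad-CAM heatmap against a binary forgery mask.
# Assumes heatmap values in [0, 1] and matching spatial dimensions.
import numpy as np

def localization_scores(heatmap: np.ndarray, gt_mask: np.ndarray,
                        threshold: float = 0.5) -> dict:
    """Binarize the heatmap and compare it with the ground-truth mask."""
    pred = heatmap >= threshold          # predicted forged pixels
    gt = gt_mask.astype(bool)            # ground-truth forged pixels

    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    union = np.logical_or(pred, gt).sum()

    iou = tp / union if union else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    mse = float(np.mean((heatmap - gt_mask.astype(float)) ** 2))
    return {"iou": iou, "recall": recall, "f1": f1, "mse": mse}

# A heatmap that fires far from the forged patch scores IoU = 0 and
# F1 = 0 even when its peak attention values are high, mirroring the
# saliency-vs-localization disconnect the study reports.
if __name__ == "__main__":
    h = np.zeros((64, 64)); h[:8, :8] = 0.9        # attention: top-left
    m = np.zeros((64, 64)); m[40:56, 40:56] = 1.0  # forgery: bottom-right
    print(localization_scores(h, m))               # iou=0.0, f1=0.0
```

This toy example shows why high attention scores and zero IoU can coexist: the two metrics measure different things (heatmap intensity versus spatial overlap), which is exactly the gap the study highlights.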
Keywords: Image Forgery Detection; GAN; Grad-CAM; Interpretability; Deep Learning; Attention Score; Adversarial Forgeries; EfficientNet; Localization; Explainable AI