Comparison of Deep Learning Approaches for Fake Image Classification
Keywords:
Fake image detection, generative artificial intelligence, deep learning, loss functions, attention mechanisms
Abstract
Rapid advances in generative artificial intelligence models have made it increasingly difficult for humans to distinguish fake images from real ones, creating serious risks to security, ethics, and information reliability. This study systematically investigates deep learning-based approaches for the automatic detection of AI-generated images. Experiments are conducted on the CIFAKE dataset, a balanced, publicly available benchmark comprising 60,000 real and 60,000 synthetic images. Three EfficientNet architectures of increasing depth and capacity (EfficientNet-B0, EfficientNet-B3, and EfficientNet-B6) are each evaluated with three loss function variants: standard cross-entropy (CE), attention-enhanced cross-entropy (Attn+CE), and an attention-based composite loss function (Attn+Composite).
To analyze the effect of random initialization, every model is trained with five different random seeds. Classification performance is evaluated using Accuracy, Precision, Recall, F1-score, and PR-AUC, and prediction reliability is assessed using Expected Calibration Error (ECE). In addition, the statistical significance of performance differences between variants is examined using the McNemar test. The results show that the mid-depth EfficientNet-B3 architecture provides a more balanced trade-off between performance and stability than the other two architectures. While the Attn+CE loss function yields meaningful improvements for specific architectures, the Attn+Composite variant does not consistently outperform simpler alternatives. Overall, this study presents a comprehensive evaluation that jointly considers architectural depth, loss function design, classification performance, and calibration reliability in fake image detection.
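As an illustration of the two reliability checks named in the abstract, the following is a minimal sketch of how ECE and a McNemar test could be computed from per-image predictions. It is not taken from the study: the function names, the choice of 10 equal-width confidence bins, and the use of the exact (binomial) form of the McNemar test rather than the chi-square approximation are all assumptions made here for illustration.

```python
from math import comb


def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE over equal-width confidence bins (n_bins=10 is an assumption)."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # Map a confidence in [0, 1] to a bin; conf == 1.0 falls in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        # Weight each bin's |accuracy - confidence| gap by its share of samples.
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


def mcnemar_exact(correct_a, correct_b):
    """Two-sided exact McNemar test on paired per-image correctness flags.

    Returns (b, c, p_value), where b counts images only model A classified
    correctly and c counts images only model B classified correctly.
    """
    b = sum(1 for a, o in zip(correct_a, correct_b) if a and not o)
    c = sum(1 for a, o in zip(correct_a, correct_b) if not a and o)
    n = b + c
    if n == 0:
        return b, c, 1.0  # the two models never disagree
    # Binomial tail under H0: each disagreement favours A or B with prob. 0.5.
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2.0 * tail)
```

Here `confidences` would hold the probability each model assigns to its predicted class, and `correct` a 0/1 flag per test image. The McNemar test uses only the images on which the two compared variants disagree, which is why it suits paired comparisons on a shared test set.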