1 Department of Business Administration, International American University, Los Angeles, CA 90010, USA.
2 Department of Engineering/Industrial Management, Westcliff University, Irvine, CA 92614, USA.
3 Department of Mathematics and Natural Sciences, BRAC University, Dhaka, Bangladesh.
4 Department of Information Technology, Westcliff University, Irvine, CA 92614, USA.
5 Department of Management Information System, International American University, Los Angeles, CA 90010, USA.
International Journal of Science and Research Archive, 2025, 15(01), 1860-1873
Article DOI: 10.30574/ijsra.2025.15.1.1166
Received on 13 March 2025; revised on 22 April 2025; accepted on 24 April 2025
Skin cancer is a major cause of death, which makes early detection essential. This study presents LEVit, an explainable and class-balanced deep learning framework for multiclass skin lesion classification. LEVit is a hybrid architecture that combines a Vision Transformer (ViT) with a Convolutional Neural Network (CNN). We evaluated LEVit on two benchmark dermoscopic datasets: HAM10000, which consists of 10,015 images across 7 classes, and ISIC 2019, which consists of 25,331 images across 8 classes. Both datasets are notably class-imbalanced. To address this, we applied augmentation techniques to oversample the minority classes, producing a uniform class distribution and improving the model's ability to generalize. Through its integrated self-attention and convolutional modules, LEVit captures both local lesion textures and global spatial relationships. We compared its performance against four state-of-the-art models (NASNet, SqueezeNet, SE-Net, and Xception) on four metrics: F1 Score, Specificity, Matthews Correlation Coefficient (MCC), and Precision-Recall Area Under the Curve (PR AUC). LEVit achieved an F1 Score of 98.11% and a PR AUC of 98.57% on ISIC 2019, and an F1 Score of 96.11% and a PR AUC of 96.62% on HAM10000. For interpretability, we used Grad-CAM to generate class-specific heatmaps that highlight the lesion regions influencing the model's predictions. This work demonstrates that balanced training and a hybrid architecture can improve both classification accuracy and interpretability in skin cancer diagnostics, addressing limitations of existing models and supporting progress toward reliable clinical applications.
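As a reading aid for three of the metrics named above, the following is a minimal sketch of how F1 Score, Specificity, and MCC can be computed from a multiclass confusion matrix. It is not the authors' evaluation code. The macro (unweighted per-class) averaging is an assumption, since the abstract does not state the averaging scheme; the function names and the toy matrix are hypothetical; and PR AUC is omitted because it requires per-class prediction scores rather than hard labels.

```python
from math import sqrt


def per_class_counts(cm):
    """Return (tp, fp, fn, tn) for each class of a square matrix cm[true][pred]."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    counts = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # true class k, predicted otherwise
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true class otherwise
        tn = total - tp - fn - fp
        counts.append((tp, fp, fn, tn))
    return counts


def macro_f1(cm):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    scores = []
    for tp, fp, fn, _ in per_class_counts(cm):
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def macro_specificity(cm):
    """Unweighted mean of per-class TN / (TN + FP), one-vs-rest."""
    scores = []
    for _, fp, _, tn in per_class_counts(cm):
        denom = tn + fp
        scores.append(tn / denom if denom else 0.0)
    return sum(scores) / len(scores)


def multiclass_mcc(cm):
    """Multiclass generalization of the Matthews Correlation Coefficient."""
    n = len(cm)
    s = sum(sum(row) for row in cm)                           # total samples
    c = sum(cm[k][k] for k in range(n))                       # correct predictions
    t = [sum(cm[k]) for k in range(n)]                        # true count per class
    p = [sum(cm[i][k] for i in range(n)) for k in range(n)]   # predicted count per class
    num = c * s - sum(tk * pk for tk, pk in zip(t, p))
    den = sqrt((s * s - sum(pk * pk for pk in p)) * (s * s - sum(tk * tk for tk in t)))
    return num / den if den else 0.0


if __name__ == "__main__":
    # Toy 3-class confusion matrix for illustration only; not data from the study.
    toy = [[8, 1, 1],
           [0, 9, 1],
           [2, 0, 8]]
    print(round(macro_f1(toy), 4),
          round(macro_specificity(toy), 4),
          round(multiclass_mcc(toy), 4))
```

On a balanced test set, macro averaging and sample-weighted averaging coincide, which is one reason a uniform class distribution makes reported scores easier to compare across classes.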
Skin cancer; Vision transformer; Deep learning; Explainable AI (XAI); Medical imaging.
Anamul Haque Sakib, Md Ismail Hossain Siddiqui, Sanjida Akter, Abdullah Al Sakib and Mohammad Rasel Mahmud. LEVit-Skin: A balanced and interpretable transformer-CNN model for multi-class skin cancer diagnosis. International Journal of Science and Research Archive, 2025, 15(01), 1860-1873. Article DOI: https://doi.org/10.30574/ijsra.2025.15.1.1166.
Copyright © 2025. The author(s) retain the copyright of this article. This article is published under the terms of the Creative Commons Attribution License 4.0.







