Impact of Feature Selection on the Performance of KNN and SVM in Heart Disease Prediction

Dhiyaussalam
M. Helmy Noor
Herlinawati
Isna Wardiah

Abstract

Feature selection plays a vital role in improving the performance of machine learning models by eliminating irrelevant or redundant attributes. This study investigates the impact of feature selection on the classification performance of K-Nearest Neighbors (KNN) and Support Vector Machine (SVM) classifiers in predicting heart disease. Using the UCI Heart Disease dataset, which includes 13 input features, feature importance scores were calculated with a Random Forest model. A threshold-based method was then applied to retain the most informative features. After iterative testing of importance thresholds, a value of 0.03 yielded the best results, reducing the feature set from 13 to 9 attributes. Classification models were trained and evaluated on both the full and the reduced feature sets. Performance was assessed using accuracy, precision, recall, and F1-score, and validated with 5-fold cross-validation. The results show substantial performance gains after feature selection. The accuracy of the KNN classifier improved from 83% to 92%, with notable gains in recall and F1-score for the positive class. Similarly, SVM reached 92% accuracy, with improved precision and more stable overall performance. These findings suggest that data-driven feature reduction both simplifies the model and enhances its predictive power. This study systematically compares the effects of feature selection on two distinct machine learning algorithms and offers practical insights for optimizing medical prediction models in clinical decision support systems.
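The workflow summarized in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code. It assumes scikit-learn, uses a synthetic 13-feature dataset from `make_classification` as a stand-in for the UCI Heart Disease data, and picks hyperparameters (`n_estimators=200`, `n_neighbors=5`, an RBF kernel, standard scaling) that the abstract does not specify. Only the 0.03 importance threshold and the 5-fold cross-validation come from the text.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# ASSUMPTION: synthetic stand-in with 13 features; the study uses the
# UCI Heart Disease dataset, which is not loaded here.
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=5, n_redundant=2,
    random_state=0,
)

# Step 1: score every feature with a Random Forest.
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
importances = forest.feature_importances_

# Step 2: keep the features whose importance reaches the threshold
# reported in the abstract (0.03).
THRESHOLD = 0.03
keep = importances >= THRESHOLD
X_reduced = X[:, keep]

# Step 3: compare KNN and SVM on the full and reduced feature sets
# with 5-fold cross-validation. Hyperparameters are assumptions.
models = {
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
}
feature_sets = {"full": X, "reduced": X_reduced}

results = {}
for model_name, model in models.items():
    for set_name, data in feature_sets.items():
        scores = cross_val_score(model, data, y, cv=5, scoring="accuracy")
        results[(model_name, set_name)] = scores.mean()

print(f"features kept: {int(keep.sum())} of {X.shape[1]}")
for (model_name, set_name), accuracy in results.items():
    print(f"{model_name:>3} on {set_name:<7} features: mean accuracy {accuracy:.3f}")
```

On synthetic data the numbers will not match the figures reported above. Note also that this sketch computes the importances on all samples before cross-validating. A stricter setup would fit the selection step inside each training fold, for example with scikit-learn's `SelectFromModel` in a pipeline, so that the held-out fold never influences which features are kept.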

Article Details

How to Cite
Dhiyaussalam, Noor, M. H., Herlinawati, & Wardiah, I. (2025). Impact of Feature Selection on the Performance of KNN and SVM in Heart Disease Prediction. Tech : Journal of Engineering Science, 1(1), 14–25. https://doi.org/10.69836/tech.v1i1.353
Author Biography

Dhiyaussalam, Politeknik Negeri Banjarmasin

Informatics Engineering
