Impact of Feature Selection on the Performance of KNN and SVM in Heart Disease Prediction
Abstract
Feature selection plays a vital role in improving the performance of machine learning models by eliminating irrelevant or redundant attributes. This study investigates the impact of feature selection on the classification performance of K-Nearest Neighbors (KNN) and Support Vector Machine (SVM) in predicting heart disease. Using the UCI Heart Disease dataset, which includes 13 input features, feature importance scores were calculated with a Random Forest model. A threshold-based method was then applied to retain only the most informative features. After iterative testing of importance thresholds, a value of 0.03 yielded the best results, reducing the feature set from 13 to 9 attributes. Classification models were trained and evaluated on both the full and the reduced feature sets. Performance was assessed using accuracy, precision, recall, and F1-score and validated with 5-fold cross-validation. The results show clear performance gains after feature selection. The accuracy of the KNN classifier improved from 83% to 92%, with notable gains in recall and F1-score for the positive class. Similarly, SVM achieved 92% accuracy, with improved precision and more stable overall performance. These findings suggest that data-driven feature reduction both simplifies the model and enhances its predictive power. This study systematically compares the effects of feature selection on two distinct machine learning algorithms and offers practical insights for optimizing medical prediction models in clinical decision support systems.
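The workflow described in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the authors' code: it assumes scikit-learn, uses a synthetic 13-feature dataset from `make_classification` as a stand-in for the UCI Heart Disease data, and picks hyperparameters (`n_estimators=200`, `n_neighbors=5`, an RBF kernel) that the abstract does not specify. Only the 0.03 importance threshold, the two classifiers, the four metrics, and the 5-fold cross-validation come from the text.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# ASSUMPTION: synthetic stand-in data with 13 features; replace with the
# real UCI Heart Disease features and binary target to reproduce the study.
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, n_redundant=2, random_state=0
)

IMPORTANCE_THRESHOLD = 0.03  # threshold reported in the abstract


def make_selector():
    """Random Forest importance filter (forest settings are assumptions)."""
    forest = RandomForestClassifier(n_estimators=200, random_state=0)
    return SelectFromModel(forest, threshold=IMPORTANCE_THRESHOLD)


def build_pipeline(classifier, select_features):
    """Scale, optionally filter features by importance, then classify."""
    steps = [StandardScaler()]
    if select_features:
        steps.append(make_selector())
    steps.append(classifier)
    return make_pipeline(*steps)


# How many features survive the threshold when fitted on all the data.
n_selected = int(make_selector().fit(X, y).get_support().sum())
print(f"features kept at threshold {IMPORTANCE_THRESHOLD}: {n_selected} of {X.shape[1]}")

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
metrics = ["accuracy", "precision", "recall", "f1"]
classifiers = {
    "KNN": lambda: KNeighborsClassifier(n_neighbors=5),  # k is an assumption
    "SVM": lambda: SVC(kernel="rbf"),                    # kernel is an assumption
}

results = {}
for name, make_clf in classifiers.items():
    for label, select in [("all features", False), ("selected features", True)]:
        scores = cross_validate(
            build_pipeline(make_clf(), select), X, y, cv=cv, scoring=metrics
        )
        results[(name, label)] = {m: scores[f"test_{m}"].mean() for m in metrics}
        summary = ", ".join(f"{m}={v:.3f}" for m, v in results[(name, label)].items())
        print(f"{name:3s} | {label:17s} | {summary}")
```

Placing the selector inside the pipeline means the importances are recomputed on each training fold, which avoids leaking information from the held-out fold into the selection step. The abstract does not say whether the study did this or computed the importances once on the full dataset, so the scores from this sketch are not expected to match the reported figures.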
This work is licensed under a Creative Commons Attribution 4.0 International License.