Impact of Feature Selection on the Performance of KNN and SVM in Heart Disease Prediction

Main Article Content

Dhiyaussalam
M. Helmy Noor
Herlinawati
Isna Wardiah

Abstract

Feature selection plays an important role in improving the performance of machine learning models by removing irrelevant or redundant attributes. This study investigates the impact of feature selection on the classification accuracy of K-Nearest Neighbors (KNN) and Support Vector Machine (SVM) in predicting heart disease. Using the UCI Heart Disease dataset, which originally contains 13 features, feature importance scores were computed with a Random Forest model. A threshold-based method was then applied to identify and retain the most informative features. Iterative testing of importance thresholds showed that a value of 0.03 gave the best results, reducing the feature set from 13 to 9 attributes. The classification models were trained and evaluated on both the full and the reduced feature sets. Performance was assessed using accuracy, precision, recall, and F1-score, and validated with 5-fold cross-validation. The results show a significant improvement in performance after feature selection. The accuracy of the KNN classifier rose from 83% to 92%, with notable gains in recall and F1-score for the positive class. Similarly, SVM reached 92% accuracy, with better precision and more stable overall performance. These findings indicate that data-driven feature reduction both simplifies the models and improves their predictive power. The study systematically compares the effect of feature selection on two different machine learning algorithms and offers practical insights for optimizing medical prediction models in clinical decision support systems.
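The threshold step and the evaluation metrics named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the helper names `select_by_threshold` and `classification_metrics` are hypothetical, the inclusive cutoff (`score >= threshold`) is an assumption, and the toy importance scores and labels are invented purely for illustration. They are not values from the study or from the UCI data.

```python
from typing import Dict, List, Sequence


def select_by_threshold(importances: Dict[str, float],
                        threshold: float = 0.03) -> List[str]:
    """Return the names of features whose importance is at least `threshold`.

    Assumption: the cutoff is inclusive (score >= threshold).
    """
    return [name for name, score in importances.items() if score >= threshold]


def classification_metrics(y_true: Sequence[int],
                           y_pred: Sequence[int],
                           positive: int = 1) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for the `positive` class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # Guard every ratio against a zero denominator.
    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Invented importance scores for five placeholder features (they sum to 1,
# as Random Forest importances do). These are NOT the study's scores.
toy_importances = {
    "feat_a": 0.40,
    "feat_b": 0.30,
    "feat_c": 0.25,
    "feat_d": 0.03,
    "feat_e": 0.02,
}

selected = select_by_threshold(toy_importances, threshold=0.03)
print("Selected features:", selected)

# Invented labels and predictions, used only to exercise the metric code.
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print("Metrics:", classification_metrics(toy_true, toy_pred))
```

In practice the importance scores would come from a fitted Random Forest, and the metrics would be averaged over the five cross-validation folds. The sketch isolates the two calculations so each can be checked by hand.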

Article Details

How to Cite
Dhiyaussalam, Noor, M. H., Herlinawati, & Wardiah, I. (2025). Impact of Feature Selection on the Performance of KNN and SVM in Heart Disease Prediction. Tech : Journal of Engineering Science, 1(1), 14–25. https://doi.org/10.69836/tech.v1i1.353
Section
Articles
Author Biography

Dhiyaussalam, Politeknik Negeri Banjarmasin

Informatics Engineering
