Comparative Study between Parallel K-Means and Parallel K-Medoids with Message Passing Interface (MPI)
Data mining combines techniques such as classification and clustering to extract useful information from datasets. Clustering is among the most widely used data mining techniques today. K-Means and K-Medoids are two of the most commonly used clustering algorithms because they are easy to implement, efficient, and give good results. Besides extracting important information, the time required for mining is also a concern, since real-world applications now produce huge volumes of data. This research compares the clustering results and time performance of the K-Means and K-Medoids algorithms, parallelized on a High Performance Computing (HPC) cluster using the Message Passing Interface (MPI) library. The results show that the K-Means algorithm gives a smaller sum of squared errors (SSE) than K-Medoids, and that the parallel algorithms using MPI run faster than their sequential counterparts.
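As an illustration of the two quantities compared in this study, the following minimal sketch computes the SSE and performs one data-parallel K-Means update. It is not the authors' implementation: the function names (`assign`, `sse`, `local_stats`, `kmeans_step`) and the round-robin partitioning are assumptions made for this example, and the MPI ranks are simulated with plain Python lists so that the code runs without an MPI installation. In an actual MPI program, the summation over `partials` would typically be carried out by a collective reduction such as `MPI_Allreduce`.

```python
from math import dist  # Euclidean distance, Python 3.8+


def assign(points, centers):
    """Return, for each point, the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            for p in points]


def sse(points, centers, labels):
    """Sum of squared Euclidean distances from each point to its center."""
    return sum(dist(p, centers[l]) ** 2 for p, l in zip(points, labels))


def local_stats(chunk, centers):
    """Work done by one (simulated) rank: per-cluster coordinate sums and counts."""
    k, d = len(centers), len(centers[0])
    sums = [[0.0] * d for _ in range(k)]
    counts = [0] * k
    for p, l in zip(chunk, assign(chunk, centers)):
        counts[l] += 1
        for a in range(d):
            sums[l][a] += p[a]
    return sums, counts


def kmeans_step(points, centers, n_workers):
    """One K-Means update with the data split across n_workers simulated ranks."""
    # Assumption: round-robin partitioning; any disjoint split gives the same result.
    chunks = [points[r::n_workers] for r in range(n_workers)]
    partials = [local_stats(c, centers) for c in chunks]
    k, d = len(centers), len(centers[0])
    new_centers = []
    for j in range(k):
        # Global reduction step (what an all-reduce would do across real ranks).
        n = sum(counts[j] for _, counts in partials)
        if n == 0:
            new_centers.append(list(centers[j]))  # keep the center of an empty cluster
        else:
            new_centers.append(
                [sum(sums[j][a] for sums, _ in partials) / n for a in range(d)]
            )
    return new_centers


points = [(0, 0), (0, 2), (10, 0), (10, 2)]
centers = kmeans_step(points, [(0, 0), (10, 0)], n_workers=2)
print(centers, sse(points, centers, assign(points, centers)))
```

Because each rank only needs the current centers and its own share of the data, the per-iteration communication is limited to the cluster sums and counts, which is the usual reason this scheme scales with the number of processes. K-Medoids restricts each center to an actual data point, whereas the mean of a cluster minimizes the squared Euclidean error for that cluster, which is consistent with the smaller SSE reported for K-Means.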
Copyright (c) 2017 Fhira Nhita
This work is licensed under a Creative Commons Attribution 4.0 International License.
Manuscript submitted to IJoICT has to be an original work of the author(s), contains no element of plagiarism, and has never been published or is not being considered for publication in other journals. Author(s) shall agree to assign all copyright of published article to IJoICT. Requests related to future re-use and re-publication of major or substantial parts of the article must be consulted with the editors of IJoICT.