Implementation of K-Means++ Algorithm for Store Customers Segmentation Using Neo4J

  • Arief Chaerudin
  • Danang Triantoro Murdiansyah (Universitas Telkom)
  • Mahmud Imrona
Keywords: K-Means++, K-Means, Neo4J, customer segmentation


In the era of data and information, data has become one of the most valuable assets, and it yields useful information when it is processed properly. One example of data processing in business is customer segmentation, which identifies and groups customers according to certain categories. Analysis of the resulting segments can inform more effective target markets, more efficient budgets, more accurate marketing or promotion strategies, and more. Since segmentation aims to separate customers into several categories, or clusters, a clustering algorithm can be used. In this research, customers are segmented by their income and expenditure values. The clustering method is the K-Means++ algorithm, implemented using Neo4J, and it is compared against standard K-Means. The result obtained in this study is that K-Means++ produces better clusters than K-Means in terms of silhouette score.
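
The comparison described in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' Neo4J implementation: the data points are invented (income, expenditure) pairs, and the function names (kmeanspp_init, random_init, lloyd, silhouette) are hypothetical helpers written for this illustration. It shows K-Means++ seeding (each new centre is drawn with probability proportional to its squared distance from the nearest centre already chosen), the usual K-Means refinement, and the mean silhouette score used to compare the two initialisations.

```python
import math
import random

# Invented (income, expenditure) pairs for illustration only; not the paper's data.
POINTS = [(20, 15), (22, 18), (25, 12), (18, 20),
          (55, 50), (60, 48), (58, 55), (52, 52),
          (95, 85), (100, 90), (98, 80), (105, 88)]

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeanspp_init(points, k, rng):
    """K-Means++ seeding: first centre uniform at random, each later centre
    drawn with probability proportional to the squared distance to the
    nearest centre chosen so far. Assumes at least k distinct points."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(dist2(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers

def random_init(points, k, rng):
    """Standard K-Means seeding: k distinct points chosen uniformly."""
    return rng.sample(points, k)

def assign(points, centers):
    """Index of the nearest centre for every point."""
    return [min(range(len(centers)), key=lambda j: dist2(p, centers[j]))
            for p in points]

def lloyd(points, centers, iters=100):
    """Alternate assignment and centroid update until the centres stop moving."""
    centers = list(centers)
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == j]
            # Keep the old centre if a cluster ends up empty.
            new_centers.append(tuple(sum(c) / len(members) for c in zip(*members))
                               if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return assign(points, centers), centers

def silhouette(points, labels):
    """Mean silhouette score; singleton clusters contribute 0."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def mean_score(init, k=3, runs=20):
    """Average silhouette over several seeded runs of one initialisation."""
    scores = []
    for seed in range(runs):
        labels, _ = lloyd(POINTS, init(POINTS, k, random.Random(seed)))
        scores.append(silhouette(POINTS, labels))
    return sum(scores) / len(scores)

print("K-Means++ mean silhouette:", round(mean_score(kmeanspp_init), 3))
print("K-Means   mean silhouette:", round(mean_score(random_init), 3))
```

Averaging over several seeds matters here because both initialisations are random. A single run can favour either method, so any comparison on real data should report results across repeated runs.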




How to Cite
Chaerudin, A., Murdiansyah, D. T., & Imrona, M. (2021). Implementation of K-Means++ Algorithm for Store Customers Segmentation Using Neo4J. Indonesian Journal on Computing (Indo-JC), 6(1), 53-60.
Computational and Simulation
