TY - JOUR
T1 - Fuzzy kernel K-medoids clustering algorithm for uncertain data objects
AU - Tavakkol, Behnam
AU - Son, Youngdoo
N1 - Publisher Copyright: © 2021, The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature.
PY - 2021/8
Y1 - 2021/8
AB - Most data mining algorithms are designed for traditional data objects, referred to as certain data objects. Certain data objects contain no uncertainty information and are represented by a single point. Capturing uncertainty can improve the performance of algorithms, as they may generate more accurate results. There are different ways of modeling uncertainty for data objects; two of the most popular are (1) considering a group of points for each object and (2) considering a probability density function (pdf) for each object. Objects modeled in these ways are referred to as uncertain data objects. Fuzzy clustering is a well-established field of research for certain data. Fuzzy clustering algorithms generate degrees of membership for the assignment of objects to clusters, which gives the flexibility to express that an object can belong to more than one cluster. To the best of our knowledge, there is only one existing fuzzy clustering algorithm for uncertain data in the literature. That algorithm, however, cannot properly create non-convex clusters and therefore performs poorly on uncertain data sets with arbitrary-shaped clusters, that is, clusters that are non-convex, unconventional, and possibly nonlinearly separable. In this paper, we propose a novel fuzzy kernel K-medoids clustering algorithm for uncertain objects that works well on data sets with arbitrary-shaped clusters. We show through several experiments on synthetic and real data that the proposed algorithm outperforms the competitor algorithms: certain fuzzy K-medoids and uncertain fuzzy K-medoids.
KW - Clustering
KW - Fuzzy
KW - Kernel method
KW - Uncertain data
UR - http://www.scopus.com/inward/record.url?scp=85106471911&partnerID=8YFLogxK
U2 - 10.1007/s10044-021-00983-z
DO - 10.1007/s10044-021-00983-z
M3 - Article
AN - SCOPUS:85106471911
SN - 1433-7541
VL - 24
SP - 1287
EP - 1302
JO - Pattern Analysis and Applications
JF - Pattern Analysis and Applications
IS - 3
ER -