Proceedings of the Nineteenth International Conference on Computational Intelligence and Security (CIS 2023)
December 1 – 4, 2023, Haikou, China

SPACE2: Dual Space Learning for Categorical Data Clustering

Fanqi Nie1,a, Pengcheng Yan1,b, Yiqun Zhang1,c, Fangqing Gu1,d, Yang Lu2 and Yue Zhang3

1Guangdong University of Technology, Guangzhou, China.

2Xiamen University, Xiamen, China.

3Guangdong Polytechnic Normal University, Guangzhou, China.

ABSTRACT

Cluster analysis of unlabeled categorical data is crucial in many practical applications. Unlike numerical data, which lie in an explicit distance space, categorical data consist of qualitative values that have no well-defined similarities in advance, so the choice of metric is often critical to the success of clustering. However, categorical data metrics are usually defined from specific prior knowledge and inherit its limitations, so a single metric rarely serves clustering well across different datasets. This paper therefore proposes first to learn a fusion of complementary metrics and then to adapt that fusion to the clustering task, so that clusters are explored more appropriately. Experiments illustrate the superiority and stability of the proposed method.

Keywords: Metric learning, Clustering, Categorical data.


