Proceedings of the Nineteenth International Conference on Computational Intelligence and Security (CIS 2023)
December 1 – 4, 2023, Haikou, China
SPACE2: Dual Space Learning for Categorical Data Clustering
1Guangdong University of Technology, Guangzhou, China.
2Xiamen University, Xiamen, China.
3Guangdong Polytechnic Normal University, Guangzhou, China.
ABSTRACT
Cluster analysis of unlabeled categorical data is crucial in many practical applications. Unlike numerical data, which lie in an explicit distance space, categorical data consist of qualitative values that have no inherently well-defined similarities, so the choice of metric is often critical to the success of cluster analysis. However, categorical data metrics are usually defined from specific prior knowledge and inherit its limitations, and a single metric rarely serves clustering well across different datasets. This paper therefore proposes to first learn a fusion of complementary metrics and then adapt that fusion to the clustering task, so that clusters are explored more appropriately. Experiments illustrate the superiority and stability of the proposed method.
Keywords: Metric learning, Clustering, Categorical data.

