Review of Visualization Methods for Categorical Data in Cluster Analysis

Jana Cibulková, Barbora Kupková

Statistika, 102(4): 396-408
https://doi.org/10.54694/stat.2022.4

Abstract
The paper focuses on visualization methods suitable for presenting the outcomes of cluster analysis of categorical data, specifically nominal data. Since nominal data have no inherent order, their graphical representation is often challenging or very limited. This paper aims to provide a list of common visualization methods for cluster analysis of objects characterized by nominal variables. First, various plot types for cluster analysis (such as the clustering scatter plot, dendrogram, and icicle plot) are presented, and their suitability for presenting clusters of nominal data is discussed. Then, we study approaches to sorting nominal values on chart axes in a way that improves visualization of the data. Lastly, we introduce a simple alternative to the clustering scatter plot for nominal data that makes the final visualization of the clustering solution more efficient, since the patterns and groups in the data become more apparent. The suggested method is demonstrated on illustrative examples.

Keywords
Cluster analysis, nominal data, hierarchical clustering, visualization
