Clustering methods are used to identify groups of similar objects in multivariate data sets collected from a variety of fields. In this project, k-means clustering, hierarchical clustering, and spectral clustering are provided in the sample code.
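To make the simplest of the three methods concrete, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is not the project's sample code: the function name `kmeans`, its parameters, and the random initialisation are assumptions chosen for illustration only.

```python
import math
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Illustrative Lloyd's algorithm on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Assumed initialisation: k distinct points drawn at random as centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Stop once the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy data: two well-separated groups of 2-D points.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
toy_labels, toy_centroids = kmeans(toy_points, k=2)
```

Hierarchical and spectral clustering follow the same pattern of taking a feature matrix and returning one cluster label per document, but they are more involved and are usually taken from a library rather than written by hand.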
We crawled articles from different areas of Wikipedia. The topics of these articles form a hierarchy (given by Wikipedia), as shown in Figure 1. We selected 10 documents from each of the finest-grained topic categories (the leaf nodes in the figure). The documents are provided to you in featurized form.
Each document has undergone the following pre-processing steps:
NMI: 0.608, Purity: 0.500
NMI: 0.514, Purity: 0.425
NMI: 0.689, Purity: 0.581
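For reference, scores of the kind reported above can be computed from the true topic labels and the predicted cluster labels. The sketch below is an assumption about how they might be calculated, not the project's own evaluation code: the function names are hypothetical, and NMI is normalised by the arithmetic mean of the two entropies, which is only one of several conventions in use.

```python
import math
from collections import Counter


def purity(true_labels, pred_labels):
    """Fraction of points that fall in the majority class of their cluster."""
    clusters = {}
    for t, p in zip(true_labels, pred_labels):
        clusters.setdefault(p, Counter())[t] += 1
    return sum(max(c.values()) for c in clusters.values()) / len(true_labels)


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(true_labels, pred_labels):
    """Mutual information divided by the mean of the two entropies (assumed convention)."""
    n = len(true_labels)
    joint = Counter(zip(true_labels, pred_labels))
    t_counts = Counter(true_labels)
    p_counts = Counter(pred_labels)
    mi = sum(
        (c / n) * math.log(c * n / (t_counts[t] * p_counts[p]))
        for (t, p), c in joint.items()
    )
    h_t, h_p = entropy(true_labels), entropy(pred_labels)
    if h_t + h_p == 0:
        # Both labelings put everything in one group; treat as a perfect match.
        return 1.0
    return mi / ((h_t + h_p) / 2)


# Toy check: a relabelled but otherwise perfect clustering, and an uninformative one.
truth = [0, 0, 1, 1]
perfect = [1, 1, 0, 0]
uninformative = [0, 1, 0, 1]
```

Both scores lie between 0 and 1, and neither depends on how the cluster IDs are numbered. Purity tends to rise as the number of clusters grows, which is one reason it is usually reported alongside NMI.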