Tuesday 31 August 2021

Hierarchical clustering labels based on their merging order in Python

Let's say I have a hierarchical clustering like the one in the diagram below. To get the cluster labels, I need to define a suitable threshold distance. For example, if I put the threshold at 0.32, I would probably get 3 clusters, and if I set it around 3.5, I would get 2 clusters from this diagram.
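For reference, the threshold-based approach could look roughly like the sketch below. The five 1-D points are made up for illustration (they are not the data behind the diagram), and single linkage is an arbitrary choice; `fcluster` with `criterion="distance"` cuts the tree at a fixed distance.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical 1-D data for p1..p5 (row 0 is p1, row 4 is p5).
X = np.array([[10.0], [1.0], [0.0], [0.2], [1.3]])

# Build the merge tree; "single" linkage is only an assumption here.
Z = linkage(X, method="single")

# Cut the tree at a fixed distance: points whose merge distance is
# at or below the threshold end up with the same label.
labels_low = fcluster(Z, t=0.5, criterion="distance")
labels_high = fcluster(Z, t=1.0, criterion="distance")

print(len(set(labels_low)), "clusters at t=0.5")
print(len(set(labels_high)), "clusters at t=1.0")
```

The drawback is visible here: the number of clusters depends entirely on picking `t` by eye for each dataset.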

Instead of using a threshold at some fixed distance, I would like to get the cluster labels based on the merging order.

I would like to define the clusters based on the merges: first merge, second merge, and so on.

For example, I would like to get the cluster labels at the point where at least the first merge has happened, which would give 3 clusters:

cluster1: p1
cluster2: p3 and p4
cluster3: p2 and p5.

If I instead ask for the clustering where at least the second merge has happened, I would have 2 clusters:

cluster1: p1
cluster2: p3, p4, p2 and p5.

Does scipy have a built-in method to extract this kind of information? If not, is there any way I can extract it from a hierarchical clustering? Any suggestions would be great.
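One possible way to do this by hand is sketched below, under one reading of "merging order": each original point has merge level 0, and a merged cluster has level 1 plus the larger level of its two children. Cutting at level k then keeps the largest clusters whose level is at most k. The helper `labels_at_merge_level` is a hypothetical name, not a scipy function, and the 1-D points are made up so that the tree has the same shape as in the example above; only `scipy.cluster.hierarchy.linkage` and its standard linkage-matrix layout are relied on.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

def labels_at_merge_level(Z, level):
    """Label points by the largest cluster built from at most `level` nested merges.

    Hypothetical helper: Z is a scipy linkage matrix, level is a non-negative int.
    """
    n = Z.shape[0] + 1                      # number of original points
    depth = np.zeros(2 * n - 1, dtype=int)  # merge level of every tree node
    members = [[i] for i in range(n)]       # original points under each node
    labels = np.arange(n)                   # start with every point on its own

    for row, (a, b, _dist, _count) in enumerate(Z):
        a, b = int(a), int(b)
        node = n + row                      # scipy's id for the new cluster
        depth[node] = 1 + max(depth[a], depth[b])
        members.append(members[a] + members[b])
        if depth[node] <= level:
            # Parents always come after their children in Z, so a later
            # assignment overwrites the smaller cluster's label.
            labels[members[node]] = node

    # Renumber the surviving node ids to 1..k.
    return np.unique(labels, return_inverse=True)[1] + 1

# Hypothetical 1-D data for p1..p5 (row 0 is p1, row 4 is p5).
X = np.array([[10.0], [1.0], [0.0], [0.2], [1.3]])
Z = linkage(X, method="single")

print(labels_at_merge_level(Z, 1))  # [1 3 2 2 3]: p1 | p3, p4 | p2, p5
print(labels_at_merge_level(Z, 2))  # [1 2 2 2 2]: p1 | p3, p4, p2, p5
```

With this definition a point that has not merged yet at the chosen level, such as p1 here, stays in a cluster of its own. If "merging order" should instead mean replaying the first k rows of the linkage matrix one at a time, the condition on `depth` would be replaced by `row < k`.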

[Image: hierarchical clustering diagram for points p1 to p5]



from Hierarchical clustering label based on their merging order in python
