How do I add a legend to the plots in my scenario? The text argument is the TF-IDF matrix, text = tfidf.transform(document),
and the clusters argument holds the unsupervised cluster labels, which range from 0 to 19; each cluster has its own bag of words. As it stands, I cannot tell which color corresponds to which cluster.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

def plot_tsne_pca(data, labels):
    max_label = max(labels)
    # Sample 3000 rows so that PCA and t-SNE stay tractable
    max_items = np.random.choice(range(data.shape[0]), size=3000, replace=False)
    # toarray() returns an ndarray; recent scikit-learn rejects the np.matrix from todense()
    pca = PCA(n_components=2).fit_transform(data[max_items,:].toarray())
    tsne = TSNE().fit_transform(PCA(n_components=50).fit_transform(data[max_items,:].toarray()))
    idx = np.random.choice(range(pca.shape[0]), size=3000, replace=False)
    label_subset = labels[max_items]
    # Map each cluster label to a color from the hsv colormap
    label_subset = [cm.hsv(i/max_label) for i in label_subset[idx]]
    f, ax = plt.subplots(1, 2, figsize=(20, 6))
    ax[0].scatter(pca[idx, 0], pca[idx, 1], c=label_subset)
    ax[0].set_title('PCA Cluster Plot')
    ax[1].scatter(tsne[idx, 0], tsne[idx, 1], c=label_subset)
    ax[1].set_title('TSNE Cluster Plot')

plot_tsne_pca(text, clusters)
Here is the full example of the code: https://pastebin.com/3PABg7xh
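One possible approach, offered as a minimal sketch and not as the accepted answer: a single scatter call that receives a list of colors gives Matplotlib nothing to label, so the legend has no entries. Drawing each cluster with its own scatter call and a label argument lets ax.legend() build one entry per cluster. The helper name scatter_with_legend and the synthetic demo data are hypothetical, introduced only for illustration. The sketch also divides by the number of clusters instead of the maximum label, on the assumption that this is desirable because hsv is a cyclic colormap and both ends of its range are red.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for the demo; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np


def scatter_with_legend(ax, points, labels, title):
    """Hypothetical helper: one scatter call per cluster, so each gets a legend entry."""
    labels = np.asarray(labels)
    n_clusters = int(labels.max()) + 1  # assumes labels are integers starting at 0
    cmap = plt.get_cmap("hsv")
    for cluster in np.unique(labels):
        mask = labels == cluster
        ax.scatter(points[mask, 0], points[mask, 1],
                   color=cmap(cluster / n_clusters),  # stays below 1.0, so no wrap back to red
                   label=f"Cluster {cluster}", s=10)
    ax.set_title(title)
    # Place the legend outside the axes so 20 entries do not cover the points
    ax.legend(loc="center left", bbox_to_anchor=(1.0, 0.5), fontsize="small")
    return ax


# Synthetic stand-in for the 2-D PCA / t-SNE output and the 20 cluster labels
rng = np.random.default_rng(0)
demo_points = rng.normal(size=(400, 2))
demo_labels = np.arange(400) % 20
fig, ax = plt.subplots(figsize=(10, 6))
scatter_with_legend(ax, demo_points, demo_labels, "Demo Cluster Plot")
```

Inside plot_tsne_pca, the same helper could presumably be called once per subplot with pca[idx] or tsne[idx] and the integer labels labels[max_items][idx], in place of the two existing scatter calls and the color list.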
from How to add legend to Matplotlib for cluster data?