ClusteringPurity.m 864B

function purity = ClusteringPurity(true_clustering, test_clustering)
%ClusteringPurity Calculate the purity of a clustering.
%   purity = ClusteringPurity(true_clustering, test_clustering) returns the
%   fraction of samples (between 0 and 1) whose true label matches the
%   majority true label of the cluster they were assigned to. Both inputs
%   are numeric label vectors with one entry per sample.
cluster_ids = unique(test_clustering); % labels actually used, so they need not be 1..K
num_correct = 0; % number of correctly assigned samples
for i = 1 : numel(cluster_ids)
    % true labels of all samples assigned to the current cluster
    current_true = true_clustering(test_clustering == cluster_ids(i));
    % the most frequent true label is taken as the cluster's label
    majority = mode(current_true);
    % samples that carry the majority label count as correctly clustered
    num_correct = num_correct + sum(current_true == majority);
end
% purity is the fraction of correctly clustered samples out of all samples;
% numel works for both row and column label vectors
purity = num_correct / numel(true_clustering);
end
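
% Usage sketch (hypothetical labels chosen for illustration, not taken from
% the original file; column vectors are assumed):
%   true_labels = [1; 1; 1; 2; 2; 3];
%   test_labels = [1; 1; 2; 2; 2; 2];
%   p = ClusteringPurity(true_labels, test_labels);
% Cluster 1 holds true labels {1, 1}: majority is 1, so 2 samples are correct.
% Cluster 2 holds true labels {1, 2, 2, 3}: majority is 2, so 2 samples are correct.
% The purity is therefore (2 + 2) / 6, roughly 0.667.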