clusterA = [A, A, A, A, A, A, A, A]
clusterB = [B, B, B, B]
clusterC = [C, C]
interleaved_example = [A, B, A, C, A, A, B, A, B, A, C, A, B, A]
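Distances between consecutive A's: (2, 2, 1, 2, 2, 2, 2), mean of ~1.9
Distances between consecutive B's: (5, 2, 4), mean of ~3.7
Distances between consecutive C's: (7), mean of 7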
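For reference, the gap figures above can be reproduced with a short script. This is only a sketch: `gap_stats` is a name I made up for illustration, and I'm representing the items as single-character strings.

```python
from collections import defaultdict


def gap_stats(sequence):
    """Map each label to (gaps between consecutive occurrences, mean gap).

    Labels that occur only once have no gaps and are omitted.
    """
    positions = defaultdict(list)
    for index, label in enumerate(sequence):
        positions[label].append(index)

    stats = {}
    for label, idx in positions.items():
        # Difference between each occurrence and the next one.
        gaps = [later - earlier for earlier, later in zip(idx, idx[1:])]
        if gaps:
            stats[label] = (gaps, sum(gaps) / len(gaps))
    return stats


# Same ordering as interleaved_example, written as a string for brevity.
interleaved_example = list("ABACAABABACABA")

for label, (gaps, mean_gap) in sorted(gap_stats(interleaved_example).items()):
    print(label, gaps, round(mean_gap, 1))
```

Something like this could also serve as a scoring function when comparing candidate orderings.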
My first attempt at an algorithm is to place the largest cluster in an array,
X, then insert the values of the second-largest cluster into
X at linearly spaced positions (rounding as necessary), and repeat for each successively smaller cluster. That is easily shown to be suboptimal, though it does decently on average.
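A rough sketch of that first attempt, to make the description concrete. The function name `interleave_by_linear_spacing` is hypothetical, and the exact spacing rule (item i of k goes at fraction (i + 1) / (k + 1) of the current length, rounded with Python's built-in `round`) is an assumption on my part; other rounding or endpoint choices fit the same description.

```python
def interleave_by_linear_spacing(clusters):
    """Greedy heuristic: start from the largest cluster, then insert each
    smaller cluster at roughly evenly spaced positions.

    Not optimal; it only illustrates the approach described above.
    """
    ordered = sorted(clusters, key=len, reverse=True)
    if not ordered:
        return []

    result = list(ordered[0])
    for cluster in ordered[1:]:
        n = len(result)
        k = len(cluster)
        # Assumed spacing rule: split the current array into k + 1 roughly
        # equal stretches and insert one item at each interior boundary.
        targets = [round((i + 1) * n / (k + 1)) for i in range(k)]
        # Insert from right to left so that earlier insertions do not
        # shift the target positions that are still pending.
        for item, position in zip(reversed(cluster), reversed(targets)):
            result.insert(position, item)
    return result


clusters = [["A"] * 8, ["B"] * 4, ["C"] * 2]
print("".join(interleave_by_linear_spacing(clusters)))
```

Because each cluster is positioned relative to the array as it stood before that pass, later insertions can push earlier ones together, which is one way the result ends up suboptimal.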
I've struggled to model this as a convex optimization problem in hopes of using
scipy.optimize.minimize on it.
I wonder if there are existing principled algorithms that achieve this.