How to measure variance of a population of permutations?

Posted on 2020-12-22 12:16:18

I need to compute the variance of a population (array) of permutations.

Let's say I have this array of permutations:

import numpy as np
import scipy.stats as stats


a = np.matrix([[1,2,3,4,5,6], [2,3,4,6,1,5], [6,3,1,2,5,4]])

# distance between a[0] and a[1]
distance = stats.kendalltau(a[0], a[1])[0]

So, how can I compute (in Python) the variance of this array, i.e., how can I measure how far these permutations are from each other?

Regards

Aymeric

P.S.: I define the distance between two permutations using the Kendall tau metric.

Questioner
ailauli69
Ivan 2020-12-22 21:00:49

I'm not sure whether this is the mathematical result you are looking for, but you could use stats.kendalltau to compute the distance for every pair of permutations, then take the variance of the resulting vector.

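To get the vector of distances, I loop over the rows of a zipped with a copy of a shifted by one row using np.roll. With the three rows here this covers every pair; with more rows it only pairs each row with its cyclic neighbour:

dist = []
for x1, x2 in zip(a, np.roll(a, shift=1, axis=0)):
    dist.append(stats.kendalltau(x1, x2)[0])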

To take the variance of all distances (np.std would give the standard deviation instead):

np.var(dist)
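
For populations with more than three permutations, a minimal sketch that enumerates every unordered pair might look like the following. It assumes the goal is the spread of the pairwise Kendall tau values; the name pair_taus is chosen here purely for illustration. It also shows that np.var returns the variance while np.std returns its square root.

```python
from itertools import combinations

import numpy as np
from scipy import stats

a = np.array([[1, 2, 3, 4, 5, 6],
              [2, 3, 4, 6, 1, 5],
              [6, 3, 1, 2, 5, 4]])

# Kendall's tau for every unordered pair of rows: n * (n - 1) / 2 values.
pair_taus = [stats.kendalltau(x1, x2)[0] for x1, x2 in combinations(a, 2)]

variance = np.var(pair_taus)  # population variance of the pairwise values
std_dev = np.std(pair_taus)   # standard deviation, i.e. sqrt(variance)
```

Unlike the np.roll approach, combinations does not depend on the number of rows, at the cost of computing a number of pairs that grows quadratically with the population size.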

Or, if you are looking for the variance as defined by the formula [image not available] (discussed in a linked post), then take the norm of the distance vector:

np.linalg.norm(dist)

Note that I'm using a defined with np.array, not np.matrix:

a = np.array([[1,2,3,4,5,6], [2,3,4,6,1,5], [6,3,1,2,5,4]])
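
One caveat worth illustrating: stats.kendalltau returns a correlation coefficient in [-1, 1], where 1 means identical ordering, so it behaves as a similarity rather than a distance. The sketch below, assuming there are no ties (which holds for permutations), converts it to the normalized Kendall tau distance (1 - tau) / 2, i.e. the fraction of discordant pairs, and builds a pairwise distance matrix. The helper kendall_distance and the choice of the mean squared pairwise distance as a dispersion measure are assumptions made for illustration, not part of the original answer.

```python
import numpy as np
from scipy import stats


def kendall_distance(p, q):
    """Normalized Kendall tau distance in [0, 1] (hypothetical helper).

    Assumes no ties: tau = 1 - 2 * D / N, where D is the number of
    discordant pairs and N the number of pairs, so D / N = (1 - tau) / 2.
    """
    tau = stats.kendalltau(p, q)[0]
    return (1.0 - tau) / 2.0


a = np.array([[1, 2, 3, 4, 5, 6],
              [2, 3, 4, 6, 1, 5],
              [6, 3, 1, 2, 5, 4]])

n = len(a)
dist_matrix = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        # The distance is symmetric, so fill both triangles at once.
        dist_matrix[i, j] = dist_matrix[j, i] = kendall_distance(a[i], a[j])

# One possible dispersion measure (an assumption): the mean squared
# distance over all unordered pairs, taken from the upper triangle.
upper = dist_matrix[np.triu_indices(n, k=1)]
dispersion = np.mean(upper ** 2)
```

With this conversion, identical permutations are at distance 0 and fully reversed ones at distance 1, which makes the resulting spread easier to interpret than a variance of raw correlation values.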