Warm tip: This article is reproduced from serverfault.com, please click

Why is random.shuffle so much slower than using sorted function?

发布于 2020-11-28 01:28:46

When using pythons random.shuffle function, I noticed it went significantly faster to use sorted(l, key=lambda _: random.random()) than random.shuffle(l). As far as I understand, both ways produce completely random lists, so why does shuffle take so much longer?

Below are the times using timeit module.

from timeit import timeit
setup = 'import random\nl = list(range(1000))'

# 5.542 seconds
print(timeit('random.shuffle(l)', setup=setup, number=10000))

# 1.878 seconds
print(timeit('sorted(l, key=lambda _: random.random())', setup=setup, number=10000))
Questioner
mazore
Viewed
0
ShadowRanger 2020-11-28 09:33:40

On CPython (the reference interpreter) random.shuffle is implemented in Python (and implemented in terms of _randbelow, itself a Python wrapper around getrandbits, the C level function that ultimately implements it, and which can end up being called nearly twice as often as strictly necessary in an effort to ensure the outputs are unbiased); sorted (and random.random) are implemented in C. The overhead of performing work in Python is higher than performing similar work in C.