Using SeedSequence to convert any random seed into a good seed

In NumPy 1.17, the random module was given an overhaul. Among the new additions was SeedSequence. In the section about parallel random number generation in the docs, SeedSequence is said to

ensure that low-quality seeds are turned into high quality initial states

which supposedly means that one can safely use a seed of e.g. 1, as long as this is processed through a SeedSequence prior to seeding the PRNG. However, the rest of the documentation page goes on to describe how SeedSequence can turn one seed into several independent PRNGs. What if we only want one PRNG, but want to be able to safely make use of small/bad seeds?
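As a rough illustration of what "turned into high quality initial states" looks like in practice, the sketch below asks SeedSequence for the mixed state words it derives from two small, adjacent seeds. The seeds 1 and 2 and the variable names are arbitrary choices for illustration; generate_state is the SeedSequence method that returns the mixed 32-bit words.

```python
import numpy as np

# Two small, nearly identical seeds (arbitrary values chosen for illustration).
ss_a = np.random.SeedSequence(1)
ss_b = np.random.SeedSequence(2)

# generate_state(n) returns n mixed 32-bit words derived from the entropy.
state_a = ss_a.generate_state(4)
state_b = ss_b.generate_state(4)

print(state_a)
print(state_b)

# Count how many of the 4 * 32 = 128 state bits differ between the two seeds.
bits_differing = int(np.unpackbits((state_a ^ state_b).view(np.uint8)).sum())
print(bits_differing)
```

If the mixing works as advertised, the two printed states should look unrelated even though the input seeds differ by a single bit.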

I have written the test code below, which uses the Mersenne Twister MT19937 to draw normally distributed random numbers. The basic seeding (without using SeedSequence) is MT19937(seed), corresponding to f() below. I also try MT19937(SeedSequence(seed)), corresponding to g(), though this results in exactly the same stream. Lastly, I try using the spawn/spawn_key functionality of SeedSequence, which does alter the stream (corresponding to h() and i(), which produce identical streams).

import numpy as np
import matplotlib.pyplot as plt

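def f():
    # Plain integer seed passed directly to the bit generator
    return np.random.Generator(np.random.MT19937(seed))
def g():
    # Same seed, explicitly wrapped in a SeedSequence
    return np.random.Generator(np.random.MT19937(np.random.SeedSequence(seed)))
def h():
    # First child spawned from SeedSequence(seed)
    return np.random.Generator(np.random.MT19937(np.random.SeedSequence(seed).spawn(1)[0]))
def i():
    # Same child as in h(), constructed directly via spawn_key
    return np.random.Generator(np.random.MT19937(np.random.SeedSequence(seed, spawn_key=(0,))))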

seed = 42  # low seed, contains many 0s in binary representation
n = 100
for func, ls in zip([f, g, h, i], ['-', '--', '-', '--']):
    generator = func()
    plt.plot([generator.normal(0, 1) for _ in range(n)], ls)
plt.show()

Question

Are h() and i() really superior to f() and g()? If so, why is it necessary to invoke the spawn (parallel) functionality, just to convert a (possibly bad) seed into a good seed? To me these seem like they ought to be disjoint features.

Solution:

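The reason nothing changed when you used an explicit SeedSequence is that the new randomness APIs already pass seeds through SeedSequence by default, so MT19937(seed) and MT19937(SeedSequence(seed)) are seeded identically. It’s not a sign that something went wrong, or that you need to explicitly call spawn. Calling spawn doesn’t produce better output; it just produces different output. In other words, h() and i() are not superior to f() and g(): all four are seeded from a SeedSequence-mixed state.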
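A minimal sketch to check this without plotting, reusing the four seeding variants from the question. The helper name draw and the sample count of 5 are illustrative choices, not part of NumPy.

```python
import numpy as np

seed = 42

def draw(bit_generator_seed, n=5):
    """Draw n standard-normal samples from an MT19937-backed Generator."""
    generator = np.random.Generator(np.random.MT19937(bit_generator_seed))
    return generator.normal(0, 1, n)

plain = draw(seed)                                          # f() in the question
wrapped = draw(np.random.SeedSequence(seed))                # g()
spawned = draw(np.random.SeedSequence(seed).spawn(1)[0])    # h()
keyed = draw(np.random.SeedSequence(seed, spawn_key=(0,)))  # i()

print(np.array_equal(plain, wrapped))   # True: the integer seed is wrapped for you
print(np.array_equal(spawned, keyed))   # True: spawn(1)[0] has spawn_key=(0,)
print(np.array_equal(plain, spawned))   # False: a different stream, not a better one
```

For a single PRNG, passing the integer seed directly is therefore enough; spawn is only needed when several independent streams are wanted from one seed.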
