Fastest "trivial" way of shuffling a vector

November 15, 2021

I am working on a chess engine for some time now. For improving the engine, I wrote some code which loads chess-positions from memory into some tuner code. I have around 1.85B fens on my machine which adds up to 40Gb (24B per position).

After loading, I end up with a vector of positions:

struct Position{
   std::bitset<8*24> bits{};
}

void main(){
    std::vector<Position> positions{};
   
    // mimic some data loading
    for(int i = 0; i < 1.85e9; i++){
        positions.push_back(Position{})
    }    

    // ...
}

The data is organised in the following way:

The positions are taken from games where the positions are seperated by just a few moves. Usually about 40-50 consecutive moves come the same game / line and are therefor somewhat equal.

Eventually I will read 16384 position within a single batch and ideally none of those positions come from the same game. Therefor I do some initial sorting before using the data.

My current shuffling method is this:

    auto rng = std::default_random_engine {};
    std::shuffle(std::begin(positions), std::end(positions), rng);

Unfortunately this takes quiet some time (about 1-2 minutes). Since I dont require perfect shuffles, I assume that some easier shuffles exist.

My second aproach was:

    for(int i = 0; i < positions.size(); i++){
        std::swap(positions[i], positions[(i*16384) % positions.size()]);
    }

which will ensure that there are not going to be positions coming from the same game within a single batch and are evenly spaces by 16384 entries.

I was wondering if there is some even simpler, faster solution. Especially considering that the modulo-operator requires quiet some clock cycles.

I am happy for any "trivial" solution.

Greetings
Finn

>Solution :

There is a tradeoff to be made: Shuffling a a std::vector<size_t> of indices can be expected to be cheaper than shuffling a std::vector<Position> at the cost of an indirection when accessing the Positions via shuffled indices. Actually the example on cppreference for std::iota is doing something along that line (it uses iterators):

#include <algorithm>
#include <iostream>
#include <list>
#include <numeric>
#include <random>
#include <vector>
 
int main()
{
    std::list<int> l(10);
    std::iota(l.begin(), l.end(), -4);
 
    std::vector<std::list<int>::iterator> v(l.size());
    std::iota(v.begin(), v.end(), l.begin());
 
    std::shuffle(v.begin(), v.end(), std::mt19937{std::random_device{}()});
 
    std::cout << "Contents of the list: ";
    for(auto n: l) std::cout << n << ' ';
    std::cout << '\n';
 
    std::cout << "Contents of the list, shuffled: ";
    for(auto i: v) std::cout << *i << ' ';
    std::cout << '\n';
}

Instead of shuffling the list directly, a vector of iterators (with a std::vector indices woud work as well) is shuffled and std::shuffle only needs to swap iterators (/indices) rather than the more costly actual elements (in the example the "costly to swap" elements are just ints).

For a std::list I don’t expect a big difference between iterating in order or iterating via shuffled iterators. On the other hand, for a std::vector I do expect a significant impact. Hence, I would shuffle indices, then rearrange the vector once, and profile to see which performs better.