Why removing random element from vector and list costs almost the same time?

Advertisements

As cppreference says

Lists are sequence containers that allow constant time insert and erase operations anywhere within the sequence, and iteration in both directions.

Considering the continuous memory used by std::vector where erase should be linear time. So it is reasonable that random erase operations on std::list should be more efficient than std::vector.

But I the program shows differently.

int randi(int min, int max) {
    return rand()%(max-min)+min; // Generate the number, assign to variable.
}

int main() {
    srand(time(NULL)); // Seed the time

    int N = 100000;
    int M = N-2;
    int arr[N];
    for (int i = 0; i < N; i++) {
        arr[i] = i;
    }
    list<int> ls(arr, arr+N);
    vector<int> vec(arr, arr+N);

    std::chrono::time_point<std::chrono::system_clock> start, end;
  
    start = std::chrono::system_clock::now();
    for (int i = 0; i < M; i++) {
        int j = randi(0, N - i);
        ls.erase(next(ls.begin(), j));
    }
    end = std::chrono::system_clock::now();
    std::chrono::duration<double> elapsed_seconds_1 = end - start;
    cout << "list time cost: " << elapsed_seconds_1.count()) << "\n";

    for (int i = 0; i < M; i++) {
        int j = randi(0, N - i);
        vec.erase(vec.begin() + j);
    }
    end = std::chrono::system_clock::now();
    std::chrono::duration<double> elapsed_seconds_2 = end - start;
    cout << "vector time cost: " << elapsed_seconds_2.count()) << "\n";

    return 0;
}
~/cpp_learning/list$ ./demo
list time cost: 8.114993171
vector time cost: 8.306458676

>Solution :

Because it takes a long time to find the element in the list. Insertion or removal from list is O(1) if you already hold an iterator to the desired insertion/deletion location. In this case you don’t, and the std::next(ls.begin(), j) call is doing O(n) work, eliminating all savings from the cheap O(1) erase (frankly, I’m a little surprised it didn’t lose to vector; I’d expect O(n) pointer-chasing operations to cost more than a O(n) contiguous memmove-like operation, but what do I know?) Update: On checking, you forgot to save a new start point before the vector test, and in fact, once you fix that issue, the vector is much faster, so my intuition was correct there: Try it online!

With -std=c++17 -O3, output was:

list time cost: 9.63976
vector time cost: 0.191249

Similarly, the vector is cheap to get to the relevant index (O(1)), but (relatively) expensive to delete it (O(n) copy-down operation after).

When you won’t be iterating it otherwise, list won’t save you anything if you’re performing random access insertions and deletions. Situations like that call for using std::unordered_map and related containers.

Leave a ReplyCancel reply