Quick Sort significantly slower


I am working on my lab assignment, which is all about sorting algorithms: Heap Sort, Merge Sort, and Quick Sort. I have pretty much finished the assignment, but after taking time measurements for each algorithm I got surprising results.

[***** [Merge Sort] *****]

[Original]: [54599, 62697, 92032, 19179, 17296, 27068, 99563, 9829, 89929, 57140]
[Sorted]:   [9829, 17296, 19179, 27068, 54599, 57140, 62697, 89929, 92032, 99563]


[size]:  10     [time]: 2      [µs]
[size]:  100    [time]: 15     [µs]
[size]:  1000   [time]: 170    [µs]
[size]:  10000  [time]: 2122   [µs]
[size]:  100000 [time]: 22946  [µs]

[***** [Quick Sort] *****]

[Original]: [10017, 37607, 51285, 83517, 7500, 81469, 40379, 19721, 48524, 74062]
[Sorted]:   [7500, 10017, 19721, 37607, 40379, 48524, 51285, 74062, 81469, 83517]


[size]:  10     [time]: 24     [µs]
[size]:  100    [time]: 95     [µs]
[size]:  1000   [time]: 1001   [µs]
[size]:  10000  [time]: 9697   [µs]
[size]:  100000 [time]: 107627 [µs]

[***** [Heap Sort] *****]

[Original]: [62697, 92032, 19179, 17296, 27068, 99563, 9829, 89929, 57140, 33429]
[Sorted]:   [9829, 17296, 19179, 27068, 33429, 57140, 62697, 89929, 92032, 99563]


[size]:  10     [time]: 1      [µs]
[size]:  100    [time]: 14     [µs]
[size]:  1000   [time]: 239    [µs]
[size]:  10000  [time]: 3088   [µs]
[size]:  100000 [time]: 39615  [µs]

I know that all of these algorithms are supposed to run in O(n log n) and are considered among the "fastest" sorting algorithms out there, but the time measurements for Quick Sort are very different from those for Heap Sort and Merge Sort.

I am using a random pivot, since I read that Quick Sort is very inefficient if the elements are already sorted or if they are all equal.

Here is my code for QS:
#include <cstdlib>  // rand, srand
#include <ctime>    // time
#include <utility>  // std::swap


/**
 * @brief Generates a random pivot index between low and high (inclusive)
 * @param low Starting index of the array
 * @param high Ending index of the array
 * @return Random pivot index
 */
int random_pivot(int low, int high) {
    srand(static_cast<unsigned int>(time(nullptr)));
    return low + rand() % (high - low + 1);
}

/**
 * @brief Partitions the array and returns the partition index
 * @param arr The array to be partitioned
 * @param low Starting index of the partition
 * @param high Ending index of the partition
 * @return Partition index
 */
int partition(int* arr, int low, int high) {
    int pivotIndex = random_pivot(low, high);
    int pivot = arr[pivotIndex];
    std::swap(arr[pivotIndex], arr[high]);

    int i = low - 1; // Index of the smaller element

    for (int j = low; j <= high - 1; j++) {
        // If current element is smaller than or equal to the pivot
        if (arr[j] <= pivot) {
            i++; // Increment index of smaller element
            std::swap(arr[i], arr[j]); // Swap current element with the smaller element
        }
    }

    std::swap(arr[i + 1], arr[high]); // Swap the pivot with the element at the partition index
    return i + 1; // Return the partition index
}

/**
 * @brief Sorts an array using the QuickSort algorithm
 * @param arr The array to be sorted
 * @param low Starting index of the array
 * @param high Ending index of the array
 */
void quick_sort_helper(int* arr, int low, int high) {
    if (low < high) {
        int partition_index = partition(arr, low, high); // partition the array and get the partition index
        quick_sort_helper(arr, low, partition_index - 1); // recursively sort the left subarray
        quick_sort_helper(arr, partition_index + 1, high); // recursively sort the right subarray
    }
}

/**
 * @brief Sorts an array using the QuickSort algorithm
 * @param arr The array to be sorted
 * @param size The size of the array
 */
void quick_sort(int* arr, int size) {
    quick_sort_helper(arr, 0, size - 1);
}

Code block for taking time measurements:
#include <chrono>   // timing
#include <cstdlib>  // rand
#include <iomanip>  // std::setw
#include <iostream>
#include <string>


/**
 * @brief Measures the execution time of a sorting algorithm on arrays of different sizes.
 * @param sorting_function The sorting function to be measured.
 */
void measure_sort(void (*sorting_function)(int*, int)) {
  int sizes[] = {10, 100, 1000, 10000, 100000}; // sizes of the array
  int const MAX = 100000;
  int const SMALL = 10;

  for (auto i = 0; i < 5; i++) {
      int* arr = new int[sizes[i]];
      for(auto j = 0; j < sizes[i]; j++) { // fill the array with random numbers
        arr[j] = rand() % MAX;
      }

      if (sizes[i] == SMALL) { // print the original array before sorting
        std::cout << "\n[Original]: ";
        print_arr(arr, sizes[i]);
      }

      // Measure execution time
      auto start = std::chrono::high_resolution_clock::now();
      sorting_function(arr, sizes[i]);
      auto end = std::chrono::high_resolution_clock::now();
      auto duration = std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();

      if(sizes[i] == SMALL) {
        std::string const SPACE = "   "; //width const to align output
        std::cout << std::setw(4) << "[Sorted]:" << SPACE;
        print_arr(arr, sizes[i]);
        std::cout << std::endl << std::endl;
      }

      int const SIZE_W = 9;
      int const TIME_W = 8;
      int const W = 6;
      std::cout << std::left << std::setw(SIZE_W) << "[size]: "
                << std::setw(W + 1) << sizes[i]
                << std::left << std::setw(TIME_W) << "[time]: "
                << std::setw(W) << duration << " [µs]" << std::endl;

      // Clean up dynamically allocated memory
      delete[] arr;
  }
}

Could someone explain why Quick Sort takes so much more time to sort a random array than the other algorithms?

I reviewed several related questions, but I still don’t understand what’s going on.

> Solution:

Calling srand and time for every random pivot is almost certainly what is hurting the performance of your quick_sort implementation. Those calls are expensive relative to the other operations you are performing. Worse, time(nullptr) only has one-second resolution, so every call within the same second reseeds the generator with the same value and rand returns the same first draw, making the "random" pivots far less random than intended.
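The usual fix is to seed once and reuse the generator. Here is a minimal sketch (my own code, not from the question) that keeps the random_pivot signature but uses the C++ <random> facilities, with the engine constructed a single time:

```cpp
#include <random>

// Seeding happens exactly once: the static engine is constructed on the
// first call and its state then evolves across calls, so successive
// pivots are cheap to draw and actually independent.
int random_pivot(int low, int high) {
    static std::mt19937 gen{std::random_device{}()};
    std::uniform_int_distribution<int> dist(low, high); // inclusive [low, high]
    return dist(gen);
}
```

With this change the per-pivot cost drops to a single distribution draw, with no srand or time call in the hot path.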

You are correct that a poor choice of pivot can be disastrous for sorted data. However, instead of a random pivot I would suggest the median-of-three strategy for selecting the pivot.

The median-of-three strategy selects the pivot as the median of the first, middle, and last elements of the partition. This handles sorted data well and even improves performance on random data by partitioning it more evenly on average.
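As a sketch of that strategy (my own code, not from the question): order the first, middle, and last elements, then move the median into arr[high] so the existing Lomuto partition can keep treating arr[high] as the pivot:

```cpp
#include <utility>

// Median-of-three pivot selection for the subarray arr[low..high].
// Returns the index where the median pivot now sits (always high here,
// matching how the question's partition uses arr[high]).
int median_of_three(int* arr, int low, int high) {
    int mid = low + (high - low) / 2;
    // Sort the three sentinels so arr[low] <= arr[mid] <= arr[high].
    if (arr[mid] < arr[low])  std::swap(arr[mid], arr[low]);
    if (arr[high] < arr[low]) std::swap(arr[high], arr[low]);
    if (arr[high] < arr[mid]) std::swap(arr[high], arr[mid]);
    // The median is now at mid; move it into the pivot slot.
    std::swap(arr[mid], arr[high]);
    return high;
}
```

In partition you would call median_of_three(arr, low, high) in place of random_pivot(low, high); since the median is already at arr[high], the subsequent swap with arr[high] becomes a no-op.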

Update

You may also want to look at my answer to another sort-related question. The associated code can be found on GitHub.
