next up previous contents index Search
Next: 0.3.2.2 Improvements Up: 0.3.2 The Mergesort Previous: 0.3.2 The Mergesort

0.3.2.1 Analysis

One drawback this sorting method is that in order to combine two lists of n elements we either need storage for 4n data items in memory, or a clever way of merging lists. Using the most straightforward algorithm for merging two sorted lists we will need n storage cells for each of the two lists to be merged, and then another 2n for the aggregate list which we are building from members of the two sorted input lists.

There are some more space-efficient methods of building the aggregate list. One attack on this space problem is to use linked-lists   to represent the data items and then build the aggregate list by changing pointer values. This way no space for the aggregate list need be allocated as it will be built of the memory already used by members of each sorted input list. There are also other ways of building an aggregate list in the ``unused'' (or ``already used'') space of the two input lists but this saved space sometimes comes at the cost of a slight penalty in runtime and an increase in algorithm complexity.

Unlike the Quicksort, the Mergesort does not concern itself with the ordering of its sublists but rather accomplishes the goal of sorting the entire data set by sorting very small subsets and then merging these back together again in order. The Mergesort's performance, therefore, is not greatly affected by the arrangement of values in the input set. Thus, the worst case performance of the Mergesort does not deteriorate for certain ``bad'' data orders.

Mergesorts are often used to sort linked lists both because the algorithm does not require random access to the data set on which it is operating and because, due to the nature of a linked list, the need to allocate extra space for aggregate lists is eliminated (as discussed above). The Mergesort can also be modified slightly to produce an effective external (i.e. on disk) sort.

Here is a more formal analysis of the algorithm. In order to Mergesort two items we need one comparison. Thus:


T(2) = 1

In order to Mergesort n items we will need to split the n items in to two sets of $\frac{n}{2}$ and then combine them. In order to merge two data sets of $\frac{n}{2}$ items we will need to make, at most, (n-1) comparisons. Thus:


\begin{displaymath}T(n) = 2T(\frac{n}{2}) + n - 1
\end{displaymath}

If n is restricted to be 2x then T(n) can be expressed as:


\begin{displaymath}\begin{array}{ll}
T(n) & = 2T(\frac{n}{2}) + n - 1 \\
& = 2...
... & = 8T(\frac{n}{8} + 3n - 4 - 2 - 1 \\
& \vdots
\end{array}\end{displaymath}

Therefore,


\begin{displaymath}\begin{array}{ll}
T(n) & = 2^iT(\frac{n}{2^i}) + in - \sum_{j...
...-1} 2^j \\
& = 2^iT(\frac{n}{2^i}) + in - 2^i + 1
\end{array}\end{displaymath}

Now, since i = log2 n:


\begin{displaymath}\begin{array}{ll}
T(n) & = 2^{log_2 n}T(\frac{n}{2^{log_2 n}}...
...og_2 n - 2^{log_2 n} + 1 \\
& = n log_2 n - n + 1
\end{array}\end{displaymath}

Scott Gasch
1999-07-09