Explanation of Merge Sort for Dummies


Question

I found this code online:

def merge(left, right):
    result = []
    i ,j = 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result += left[i:]
    result += right[j:]
    return result

def mergesort(list):
    if len(list) < 2:
        return list
    middle = len(list) / 2
    left = mergesort(list[:middle])
    right = mergesort(list[middle:])
    return merge(left, right)

It works 100% when I run it. I just do not really get how the merge sort works or how the recursive function is able to properly order both a left and a right.

1
20
10/4/2016 10:12:27 PM

Accepted Answer

I believe that the key to understanding merge sort is understanding the following principle -- I'll call it the merge principle:

Given two separate lists A and B ordered from least to greatest, construct a list C by repeatedly comparing the least value of A to the least value of B, removing the lesser value, and appending it onto C. When one list is exhausted, append the remaining items in the other list onto C in order. The list C is then also a sorted list.

If you work this out by hand a few times, you'll see that it's correct. For example:

A = 1, 3
B = 2, 4
C = 
min(min(A), min(B)) = 1

A = 3
B = 2, 4
C = 1
min(min(A), min(B)) = 2

A = 3
B = 4
C = 1, 2
min(min(A), min(B)) = 3

A = 
B = 4
C = 1, 2, 3

Now A is exhausted, so extend C with the remaining values from B:

C = 1, 2, 3, 4

The merge principle is also easy to prove. The minimum value of A is less than all other values of A, and the minimum value of B is less than all other values of B. If the minimum value of A is less than the minimum value of B, then it must also be less than all values of B. Therefore it is less than all values of A and all values of B.

So as long as you keep appending the value that meets those criteria to C, you get a sorted list. This is what the merge function above does.

Now, given this principle, it's very easy to understand a sorting technique that sorts a list by dividing it up into smaller lists, sorting those lists, and then merging those sorted lists together. The merge_sort function is simply a function that divides a list in half, sorts those two lists, and then merges those two lists together in the manner described above.

The only catch is that because it is recursive, when it sorts the two sub-lists, it does so by passing them to itself! If you're having difficulty understanding the recursion here, I would suggest studying simpler problems first. But if you get the basics of recursion already, then all you have to realize is that a one-item list is already sorted. Merging two one-item lists generates a sorted two-item list; merging two two-item lists generates a sorted four-item list; and so on.

56
6/15/2012 12:54:32 PM

When I'm stumbled into diffuculty to uderstand how the algorithm works, I add debug output to check what really happens inside the algorithm.

Here the code with debug output. Try to uderstand all the steps with recursive calls of mergesort and what merge does with the output:

def merge(left, right):
    result = []
    i ,j = 0, 0
    while i < len(left) and j < len(right):
        print('left[i]: {} right[j]: {}'.format(left[i],right[j]))
        if left[i] <= right[j]:
            print('Appending {} to the result'.format(left[i]))           
            result.append(left[i])
            print('result now is {}'.format(result))
            i += 1
            print('i now is {}'.format(i))
        else:
            print('Appending {} to the result'.format(right[j]))
            result.append(right[j])
            print('result now is {}'.format(result))
            j += 1
            print('j now is {}'.format(j))
    print('One of the list is exhausted. Adding the rest of one of the lists.')
    result += left[i:]
    result += right[j:]
    print('result now is {}'.format(result))
    return result

def mergesort(L):
    print('---')
    print('mergesort on {}'.format(L))
    if len(L) < 2:
        print('length is 1: returning the list withouth changing')
        return L
    middle = len(L) / 2
    print('calling mergesort on {}'.format(L[:middle]))
    left = mergesort(L[:middle])
    print('calling mergesort on {}'.format(L[middle:]))
    right = mergesort(L[middle:])
    print('Merging left: {} and right: {}'.format(left,right))
    out = merge(left, right)
    print('exiting mergesort on {}'.format(L))
    print('#---')
    return out


mergesort([6,5,4,3,2,1])

Output:

---
mergesort on [6, 5, 4, 3, 2, 1]
calling mergesort on [6, 5, 4]
---
mergesort on [6, 5, 4]
calling mergesort on [6]
---
mergesort on [6]
length is 1: returning the list withouth changing
calling mergesort on [5, 4]
---
mergesort on [5, 4]
calling mergesort on [5]
---
mergesort on [5]
length is 1: returning the list withouth changing
calling mergesort on [4]
---
mergesort on [4]
length is 1: returning the list withouth changing
Merging left: [5] and right: [4]
left[i]: 5 right[j]: 4
Appending 4 to the result
result now is [4]
j now is 1
One of the list is exhausted. Adding the rest of one of the lists.
result now is [4, 5]
exiting mergesort on [5, 4]
#---
Merging left: [6] and right: [4, 5]
left[i]: 6 right[j]: 4
Appending 4 to the result
result now is [4]
j now is 1
left[i]: 6 right[j]: 5
Appending 5 to the result
result now is [4, 5]
j now is 2
One of the list is exhausted. Adding the rest of one of the lists.
result now is [4, 5, 6]
exiting mergesort on [6, 5, 4]
#---
calling mergesort on [3, 2, 1]
---
mergesort on [3, 2, 1]
calling mergesort on [3]
---
mergesort on [3]
length is 1: returning the list withouth changing
calling mergesort on [2, 1]
---
mergesort on [2, 1]
calling mergesort on [2]
---
mergesort on [2]
length is 1: returning the list withouth changing
calling mergesort on [1]
---
mergesort on [1]
length is 1: returning the list withouth changing
Merging left: [2] and right: [1]
left[i]: 2 right[j]: 1
Appending 1 to the result
result now is [1]
j now is 1
One of the list is exhausted. Adding the rest of one of the lists.
result now is [1, 2]
exiting mergesort on [2, 1]
#---
Merging left: [3] and right: [1, 2]
left[i]: 3 right[j]: 1
Appending 1 to the result
result now is [1]
j now is 1
left[i]: 3 right[j]: 2
Appending 2 to the result
result now is [1, 2]
j now is 2
One of the list is exhausted. Adding the rest of one of the lists.
result now is [1, 2, 3]
exiting mergesort on [3, 2, 1]
#---
Merging left: [4, 5, 6] and right: [1, 2, 3]
left[i]: 4 right[j]: 1
Appending 1 to the result
result now is [1]
j now is 1
left[i]: 4 right[j]: 2
Appending 2 to the result
result now is [1, 2]
j now is 2
left[i]: 4 right[j]: 3
Appending 3 to the result
result now is [1, 2, 3]
j now is 3
One of the list is exhausted. Adding the rest of one of the lists.
result now is [1, 2, 3, 4, 5, 6]
exiting mergesort on [6, 5, 4, 3, 2, 1]
#---

Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow
Icon