Today I spent a little bit of time researching the "Shell" sort. I wanted to post a few notes about the Princeton Algorithms Course's implementation to help me solidify my understanding.
The problem with insertion sort is that it doesn't perform well when there are smaller items at the end of series being sorted. If the first is last, then it will take N - 1 exchanges to get it in place (for a series of size N). So a bad case for insertion sort would be a series like this:
8, 7, 6, 5, 4, 3, 2, 1
The number of exchanges required to get each element in place increases...proportionally, I think? Anyway, it keeps getting bigger.
Shell sort is one way of dealing with this problem. As the quoted discussion says, the idea is to break the series into a series of interspersed sub-sequences and sort those. To begin with, the elements of the sub-sequences are very far from each other, but as the sorting proceeds, they get closer and closer together, until we reach insertion sort as a sort of limiting case.
Let me be more clear: we don't sort all of these sub-sequences at once (a misunderstanding I had the first time I encountered the shell sort) (or maybe we do, but we don't have to? -- see the implementation below) -- rather, we sort a new sub-sequence each pass, and each pass that sub-sequence comprises a larger and larger portion of the sequence to be sorted. So maybe the first pass we look at elements 8, 16, and 24, putting them in order. Then we look at elements 4, 8, 16, 20, and 24, putting them in order. Then 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, and 24, putting them in order. And then finally the whole series.
But putting it that way you can see how the shell sort is supposed to work. Each run-through is an expansion on the previous run-through. When we move from -- 8, 16, and 24 -- to -- 4, 8, 12, 16, 20, and 24 -- we find that some of the elements we're looking at have already moved. So maybe element 24 still isn't in the right place -- but it's going to be closer to being in the right place. I would guess (though I'm not totally sure) that by the time we get to the last pass, we have to swap each element out at most once -- as opposed to the N - 1 times required for the simple insertion sort.
Now for Algorithms' implementation, along with some commentary:
Think of h as the gap between elements of a sub-sequence, which gets smaller as the sub-sequences include more and more of the whole sequence. To understand the sequence of h values defined here (the "increment sequence" mentioned above), we can think of the case in which N is 14. One third of 14, as far as the computer is concerned, is 4, so h is set there. (From what I can tell, the idea is to get our first h to be close to 1/3 of N).
Then, as the comment says, we look at sub-sequences of length h, starting from h and continuing to the end of the series, putting them in order as we go. So we would traverse and arrange the following elements:
0 and 4
1 and 5
2 and 6
3 and 7
0, 4, and 8
1, 5, and 9
2, 6, and 10
3, 7, and 11
4, 8, and 12
1, 5, 9, and 13
2, 6, 10, and 14
(This is the source of my confusion above. It appears we are sorting all sub-sequences of elements h apart as we go. Notice that the sub-sequences are still being sorted via insertion sort: we look first at the first two elements of each sequence, then at the first three, and so on, in accordance with what I said.)
From there, h becomes 1, and we move directly into insertion sort.
First, a little tidbit. When I first heard about this algorithm, I thought it had something to do with shell games. Turns out a man named Donald Shell discovered this method of sorting, whence the name.
The Algorithms book gives the following explanation (Sedgewick and Wayne, Algorithms, 4th ed., p. 258):
The idea is to rearrange the array to give it the property that taking every hth entryShell sort is a variant of insertion sort, so it helps to start by explaining how that works. For insertion sort, you start at the beginning of the series to be sorted and put each consecutive element into the right order. So start by putting the first element into the right order (automatic), then the first two, then the first three, and so on. You don't have to start from scratch each time, fortunately -- generally, when it comes time to put the first n elements in order, the first n-1 elements have already been sorted. In that case, putting the n-th element in the right place means working backwards until it lands in the right place: switch it with the element to its left so long as that element is greater than it.
(starting anywhere) yields a sorted subsequence. Such an array is said to be h-sorted. Put
another way, an h-sorted array is h independent sorted subsequences, interleaved together. By h-sorting for some large values of h, we can move items in the array long distances and thus make it easier to h-sort for smaller values of h. Using such a procedure for any sequence of values of h that ends in 1 will produce a sorted array: that is shellsort. The implementation in Algorithm 2.3 on the facing page uses the sequence of decreasing values 1⁄2(3^k -1), starting at the largest increment less than N/3 and decreasing to 1. We refer to such a sequence as an increment sequence. Algorithm 2.3 computes its increment sequence; another alternative is to store an increment sequence in an array.
The problem with insertion sort is that it doesn't perform well when there are smaller items at the end of series being sorted. If the first is last, then it will take N - 1 exchanges to get it in place (for a series of size N). So a bad case for insertion sort would be a series like this:
8, 7, 6, 5, 4, 3, 2, 1
The number of exchanges required to get each element in place increases...proportionally, I think? Anyway, it keeps getting bigger.
Shell sort is one way of dealing with this problem. As the quoted discussion says, the idea is to break the series into a series of interspersed sub-sequences and sort those. To begin with, the elements of the sub-sequences are very far from each other, but as the sorting proceeds, they get closer and closer together, until we reach insertion sort as a sort of limiting case.
Let me be more clear: we don't sort all of these sub-sequences at once (a misunderstanding I had the first time I encountered the shell sort) (or maybe we do, but we don't have to? -- see the implementation below) -- rather, we sort a new sub-sequence each pass, and each pass that sub-sequence comprises a larger and larger portion of the sequence to be sorted. So maybe the first pass we look at elements 8, 16, and 24, putting them in order. Then we look at elements 4, 8, 16, 20, and 24, putting them in order. Then 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, and 24, putting them in order. And then finally the whole series.
But putting it that way you can see how the shell sort is supposed to work. Each run-through is an expansion on the previous run-through. When we move from -- 8, 16, and 24 -- to -- 4, 8, 12, 16, 20, and 24 -- we find that some of the elements we're looking at have already moved. So maybe element 24 still isn't in the right place -- but it's going to be closer to being in the right place. I would guess (though I'm not totally sure) that by the time we get to the last pass, we have to swap each element out at most once -- as opposed to the N - 1 times required for the simple insertion sort.
Now for Algorithms' implementation, along with some commentary:
Think of h as the gap between elements of a sub-sequence, which gets smaller as the sub-sequences include more and more of the whole sequence. To understand the sequence of h values defined here (the "increment sequence" mentioned above), we can think of the case in which N is 14. One third of 14, as far as the computer is concerned, is 4, so h is set there. (From what I can tell, the idea is to get our first h to be close to 1/3 of N).
Then, as the comment says, we look at sub-sequences of length h, starting from h and continuing to the end of the series, putting them in order as we go. So we would traverse and arrange the following elements:
0 and 4
1 and 5
2 and 6
3 and 7
0, 4, and 8
1, 5, and 9
2, 6, and 10
3, 7, and 11
4, 8, and 12
1, 5, 9, and 13
2, 6, 10, and 14
(This is the source of my confusion above. It appears we are sorting all sub-sequences of elements h apart as we go. Notice that the sub-sequences are still being sorted via insertion sort: we look first at the first two elements of each sequence, then at the first three, and so on, in accordance with what I said.)
From there, h becomes 1, and we move directly into insertion sort.
Comments
Post a Comment