Skip to main content

Shell Sort

Today I spent a little bit of time researching the "Shell" sort.  I wanted to post a few notes about the Princeton Algorithms Course's implementation to help me solidify my understanding.

First, a little tidbit.  When I first heard about this algorithm, I thought it had something to do with shell games.  Turns out a man named Donald Shell discovered this method of sorting, whence the name.

The Algorithms book gives the following explanation (Sedgewick and Wayne, Algorithms, 4th ed., p. 258):
The idea is to rearrange the array to give it the property that taking every hth entry
(starting anywhere) yields a sorted subsequence. Such an array is said to be h-sorted. Put
another way, an h-sorted array is h independent sorted subsequences, interleaved together. By h-sorting for some large values of h, we can move items in the array long distances and thus make it easier to h-sort for smaller values of h. Using such a procedure for any sequence of values of h that ends in 1 will produce a sorted array: that is shellsort. The implementation in Algorithm 2.3 on the facing page uses the sequence of decreasing values 1⁄2(3^k -1), starting at the largest increment less than N/3 and decreasing to 1. We refer to such a sequence as an increment sequence. Algorithm 2.3 computes its increment sequence; another alternative is to store an increment sequence in an array.
Shell sort is a variant of insertion sort, so it helps to start by explaining how that works.  For insertion sort, you start at the beginning of the series to be sorted and put each consecutive element into the right order.  So start by putting the first element into the right order (automatic), then the first two, then the first three, and so on.  You don't have to start from scratch each time, fortunately -- generally, when it comes time to put the first n elements in order, the first n-1 elements have already been sorted.  In that case, putting the n-th element in the right place means working backwards until it lands in the right place: switch it with the element to its left so long as that element is greater than it.

The problem with insertion sort is that it doesn't perform well when there are smaller items at the end of series being sorted.  If the first is last, then it will take N - 1 exchanges to get it in place (for a series of size N).  So a bad case for insertion sort would be a series like this:

8, 7, 6, 5, 4, 3, 2, 1

The number of exchanges required to get each element in place increases...proportionally, I think?  Anyway, it keeps getting bigger.

Shell sort is one way of dealing with this problem.  As the quoted discussion says, the idea is to break the series into a series of interspersed sub-sequences and sort those.  To begin with, the elements of the sub-sequences are very far from each other, but as the sorting proceeds, they get closer and closer together, until we reach insertion sort as a sort of limiting case.

Let me be more clear: we don't sort all of these sub-sequences at once (a misunderstanding I had the first time I encountered the shell sort) (or maybe we do, but we don't have to? -- see the implementation below) -- rather, we sort a new sub-sequence each pass, and each pass that sub-sequence comprises a larger and larger portion of the sequence to be sorted.  So maybe the first pass we look at elements 8, 16, and 24, putting them in order.  Then we look at elements 4, 8, 16, 20, and 24, putting them in order.  Then 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, and 24, putting them in order.  And then finally the whole series. 

But putting it that way you can see how the shell sort is supposed to work.  Each run-through is an expansion on the previous run-through.  When we move from -- 8, 16, and 24 -- to -- 4, 8, 12, 16, 20, and 24 -- we find that some of the elements we're looking at have already moved.  So maybe element 24 still isn't in the right place -- but it's going to be closer to being in the right place.  I would guess (though I'm not totally sure) that by the time we get to the last pass, we have to swap each element out at most once -- as opposed to the N - 1 times required for the simple insertion sort.

Now for Algorithms' implementation, along with some commentary:


Think of h as the gap between elements of a sub-sequence, which gets smaller as the sub-sequences include more and more of the whole sequence.  To understand the sequence of h values defined here (the "increment sequence" mentioned above), we can think of the case in which N is 14One third of 14, as far as the computer is concerned, is 4, so h is set there.  (From what I can tell, the idea is to get our first h to be close to 1/3 of N).

Then, as the comment says, we look at sub-sequences of length h, starting from h and continuing to the end of the series, putting them in order as we go.  So we would traverse and arrange the following elements:

0 and 4
1 and 5
2 and 6
3 and 7
0, 4, and 8
1, 5, and 9
2, 6, and 10
3, 7, and 11
4, 8, and 12
1, 5, 9, and 13
2, 6, 10, and 14

(This is the source of my confusion above.  It appears we are sorting all sub-sequences of elements h apart as we go.  Notice that the sub-sequences are still being sorted via insertion sort: we look first at the first two elements of each sequence, then at the first three, and so on, in accordance with what I said.)

From there, h becomes 1, and we move directly into insertion sort.

Comments

Popular posts from this blog

Getting Geodata From Google's API

The apps I'm going to be analyzing are part of Dr. Charles Severance's MOOC on Python and Databases and work together according to the following structure (which applies both in this specific case and more generally to any application that creates and interprets a database using online data). The data source, in this case, is Google's Google Maps Geocoding API.  The "package" has two components: geoload.py  and geodump.py .  geoload.py  reads a list of locations from a file -- addresses for which we would like geographical information -- requests information about them from Google, and stores the information on a database ( geodata.db ).  geodump.py  reads and parses data from the database in JSON, then loads that into a javascript file.  The javascript is then used to create a web page on which the data is visualized as a series of points on the world-map.  Dr. Severance's course focuses on Python, so I'm only going to work my way through ...

Compiling and Executing Java Files With -cp

I decided I was going to "man up" and figure out how to compile a java program with an external dependency from the command line instead of relying on an IDE-- the DOS command line, to be more specific. I ran into a few problems: 1.  The external dependency was given to me as a java file.  I experimented compiling it as a .jar, but I wasn't sure how to import a class from a .jar, so I ended up compiling it into a class. 2.  When I tried to run the file, I got an error saying that the class had been compiled with a different version of Java than my JRE.  The Internet told me to check my path variable for Java.  It sure looked like it was pointing to the latest JRE (and the same version of Java as my compiler).  I asked the Internet again and found the following command: for %I in (java.exe) do @echo %~$PATH:I I'm not exactly sure what the syntax of that magic command is (intuitively it's returning the path that executes when I run the "java" com...

Quick Find / Quick Union (Connected Nodes)

Setup This week I learned about the "Quick Find" or "Quick Union" algorithm. Imagine an NxN grid of nodes, some of which are connected by lines. A connection can be interpreted as accessibility: if two nodes are connected, you can get from one to the other. Every node is accessible to itself: to get where you already are, stay there. Also, If you can get from A to B, you can go back from B to A. And if you can get from A to B and from B to C, then you can get from A to C. As a consequence, the connection between nodes divides the grid into regions of mutually accessible nodes. You can travel from any node in a given region to any other node in that region -- but not to any nodes outside that region (exercise to reader -- proof by contradiction). The problem has two parts. First, find a way to represent this grid structure and the accessibility relation; second, use your schema to efficiently calculate whether two given nodes are accessible to each other. ...