Newsletters |

By Bill McKeeman, MathWorks and Loren Shure, MathWorks

While choosing how to upgrade `sort`

for MATLAB 7, we used MATLAB to prototype `sort`

algorithms and to add the new `sort`

direction qualifiers '`ascend`

' and '`descend`

'. We translated the prototype from MATLAB into C++ to add this latest version of `sort`

to Release 14—and found a few surprises in the process.

Algorithms in MathWorks products and user applications use the MATLAB `sort`

function for tasks ranging from alphabetizing lists to choosing significant singular values.

There are 12 types of sortable matrices in MATLAB: the eight integer types `([u]int8/16/32/64);`

plus single, double, char, and logical. Single and double may be real or complex; double and logical can be sparse. Full n-dimensional (ND) arrays can be sorted along any selected dimension, in either ascending or descending order. `sort`

can also return the permutation vector relating the input values to the outputs, with call syntax ranging from the most simple `r = sort(v);`

to the most complete `[r,i] = sort(v, dim, dirflag);`

The most readily available upgrade would use `qsort`

from the C library, but `qsort`

was too specific for the sort tasks that MATLAB requires. The C library `qsort`

has a hard-coded stride of 1 (data is assumed to be arranged in contiguous memory). MATLAB also needs to sort data containing `NaN`

values, which `qsort`

does not accommodate.

The available C `qsort`

implementations share a common characteristic: to achieve type independence, data access is indirect (via void* pointers) and the user supplies a comparison function. As a result, general-purpose sort algorithms written in C are inelegant and somewhat slow.

`sort`

, we expected to require four levels (Bentley, J. L. and M. D. McIlroy, 1993. Engineering a sort function. Software—Practice and Experience 23, 11, 1249--1265.

Knuth, D. E. 1998. The Art of Computer Programming, Volume 3: Sorting and Searching, 2nd ed. Boston: Addison-Wesley.

Cormen et al. 2001. Introduction to Algorithms, 2nd ed. Columbus: McGraw-Hill.

`sort`

AlgorithmsWe explored speed and memory tradeoffs by prototyping the basic algorithms in M. The prototypes work only on vectors. (Full ND arrays are discussed in Extending Vector Sorts to N-Dimensions later in this article.)

` function v = viSort(v) % vector insert`

for i=1:numel(v)-1

[m, k] = min(v(i:end));

v(i+1:i+k-1) = v(i:i+k-2) % make room

v(i) = m; % place min

end

` function v = vqSort(v) % vector quick sort`

if numel(v) > 1

piv = median(v); % divide

lss = v(v < piv); % and

eql = v(v == piv); % conquer

gtr = v(v > piv);

v = [vqSort(lss), eql, vqSort(gtr)];

end

` function v = vbSort(v) % vector bin sort`

n = numel(v);

if n > 1

mn = min(v);

mx = max(v);

binAvg = 200; % experimental

binCt = ceil(n/binAvg);

slope = binCt/(mx-mn);

binNo = floor(slope.*(v-mn))+1; % assign bin

bins = cell(1, binCt+1); % allocate bins

for i=1:binCt+1

bins{i} = vqSort(v(binNo==i));

end

v = [bins{:}]; % reassemble

end

As we planned to have `vqSort`

call `viSort`

when there were few enough data points, we modified the *quick sort* design.

` function v = vqSort(v) % quick sort`

if numel(v) < 32

v = viSort(v); % experimental

else

piv = median(v); % need approximation

lss = v(v < piv);

eql = v(v == piv);

gtr = v(v > piv);

v = [vqSort(lss), eql, vqSort(gtr)];

end

We translated the M-code vector sorts into C in-place sorts so that we could test our four-level design hypothesis. The results surprised us. For vectors of random data on Intel x86 hardware, the most modern *quick sorts* were slower than the already-implemented *quick sort* in MATLAB! Algorithmic changes made to the modern sorts to ensure good behavior on less-than-random input were costly, while the existing MATLAB `sort`

cleverly uses its pivot-picking approximation to do some sorting work.

We finally decided that best performance on random data was more useful than adequate performance on all data, and kept the existing MATLAB *quick sort*. (Performance results depend somewhat on the underlying hardware.)

We took into account that the C implementation of *bin sort* needs three passes: one pass to decide on the number of bins, one to bin the data, and one to call `qsort`

on each bin. The results are left in place, conveniently ordered for output. We can justify this considerable overhead because it results in an order *N*, instead of *N*log*N*, run-time algorithm.

Another surprise! We experimented to find a cutover from *quick sort* to *bin sort*, but never found it. Our best effort at a C-implemented *bin sort* was slower than our C-implemented *quick sort*, even on datasets containing up to 10 million elements. As a result, we abbreviated our development plan to a two-stage sort: *insert/quick*. Both these algorithms already have satisfactory implementations in MATLAB.

`NaN`

and `Inf`

Next, we extended the double sorts to the sortable MATLAB types. We recoded the `insert/quick`

sort pair as C++ template functions parameterized by the underlying MATLAB types.

MATLAB sorts `NaN`

values high, beyond all other single/double values. Because `NaN`

and `Inf`

need to be dealt with only for the real types, we implemented a function (called a jacket) to handle the cases for single and double. The jacket calls the usual sort routine on the finite values and handles the nonfinite values correctly for MATLAB. This jacket’s sorting prepass counts the `NaN`

s and replaces them with `+Inf`

. After the sort, the jacket overwrites an appropriate number of `+Inf`

values with `NaN`

s at the top end of the sorted vector. (Had we still been doing *bin sort*, this jacket would also have had to count and remove `Inf`

values, as the bin selection algorithm depends on finite values.)

` function v = vSort(v) % sort NaN jacket`

if isfloat(v)

nans = isnan(v);

v(nans) = inf;

v = vqSort(v);

v(end-sum(nans)+1:end) = nan;

else

v = vqSort(v);

end

We extended the vector sorts to sort along a user-selected dimension of an ND array. Array data is laid out in memory in column-major order: elements in a column are adjacent in memory. Elements along another dimension are separated by a constant distance (the so-called stride, usually larger than 1). We prototyped in M first

` function v = strideSort(v, d) % stride jacket`

s = size(v);

stride = prod(s(1:d-1));

for i = 1 : stride*s(d) : numel(v) % #of elements

for j = i : i+stride-1

idx = j : stride : j+stride*s(d)-1;

v(idx) = vSort(v(idx));

end

end

We recoded the C++ in-place *insert/quick* sort pair to step by a stride, and then wrote a second level of C++ jacket to call them.

We extended the array sorts to provide the permutation vectors, which required that the sort be stable, (i.e., sorting may not reorder equal values). The algorithms `viSort`

, `vqSort`

, and `vbSort`

are stable, but our C++ implementations were not. Without requesting the permutation vector, you cannot tell whether the sort is unstable.

Rather than recode the *insert/quick* sorts, we chose to make a post-sort pass over the sorted data, detect runs of equal values, and then sort the corresponding subvector in the parallel index vector. Any `NaN`

s presented a special problem because they had been interspersed with the `Inf`

values. We found the original indices of the `NaN`

s and `Inf`

s by making a pass over the original unsorted data.

The MATLAB algorithm for complex values first sorts by absolute value, then sorts by phase angle, and finally sorts to ensure stability for duplicate values. We decided to leave the MATLAB C code for sorting complex double values in place and call it from the jackets described above.

We passed the direction flags '`ascend`

' and '`descend`

' to the jackets. If the flag is '`descend`

', the output is reversed just before being returned. After the reversal, we ran the index vector sorts to restore stability.

We tested for the following criteria:

- The sorts should accept all valid inputs (12 types of data, shapes from empty arrays to large ND ensembles, and valid dimension and direction flags) and otherwise fail with an error message.
- If requested, the sorts should produce the permutation vector.
- The sorts should “run as fast as C.” Suppose T means “type” and S means “shape.” Then we guarantee the following.

Statement | Guarantees |
---|---|

r = sort(v, dim, flags) | T(r)=T(v) _{^} S(r)=S(v) |

[r,i] = sort(v, dim, flags) | T(r)=T(v) _{^} T(i)=double _{^} S(r)=S(i)=S(v) |

We checked that in each case the outputs were sorted and the permutation vectors applied to the inputs, given the outputs. We coded all the test routines in M, using the MATLAB functions `class`

and `size`

.

To test performance, we raced our algorithm against the C standard library `qsort`

and the C++ Standard Template Library `sort`

for simple cases. The MATLAB `sort`

function is about five times faster than C `qsort`

and a little faster than the C++ STL `sort`

.

MATLAB provided a productive environment to explore enhancements to the `sort`

algorithm. Prototyping the possibilities in MATLAB before recoding the algorithms in C++ helped us quickly understand our programming choices. We can still use the prototype M versions of `sort`

as test components when we verify the MATLAB `sort`

function’s behavior. Using MATLAB to upgrade MATLAB enabled us to deliver a higher performance `sort`

, and we can use M-code to explain how we did it.

`find`

`sort`

with options '`ascend`

' and '`descend`

’, the function `find`

has added functionality in MATLAB 7 that lets you find the '`first`

' or '`last`

' elements that meet some criterion. For example, if you want to find the first element in array X less than 42, use`find(X<42,1,'first')`

This operation stops as soon as the first element meeting the condition is encountered, which can result in significant time savings if the array is large and the value you find happens to be far from the end.

If, instead, you want to find the final 10 elements > 17, then use

`find(X>17,10,'last')`

Again, the operation returns as soon as 10 values are found. If fewer than the requested number is found after the entire array is scanned, the return vector will have fewer than the requested number of elements.

Before

`sort`

had the option to return items in descending order, you had to perform the sort and then index from the end. With MATLAB 7, `sort`

has the new '`descend`

' option, and `find`

has the new '`last`

' option to combine the find with the indexing into a single function call.Published 2004