kenics.net

Technical notes on perl, python, php, sql, cgi, c/c++, q/kdb+, unix/shell, revision control tools, data structures & algorithms, and their applications into web services and other various forms of software engineering.

radix sort

########################### 
####    radix sort     #### 
########################### 
 
- another non-comparison based "integer" sort algo that achives O(n) with certain assumptions. 
- (doesnt have to be 10-base integer, it can be other radix. but for an illustrative purpose, we stick to 10-base in below examples. ) 
- a stable sort.(if least-significant-bit method is used, which we will consider in this article) 
- (most-significant-bit is not discussed here.) 
- can be seen as a kind of bucket sort where bucket size = radix value. (10 in case of 10-base integer) 
 
 

# flow 

 
lets say there are n input elemtns to sort. 
 
1) sort all elements based on its least significant digit(LSD). 
2) sort all elements again based on its second LSD. 
3) repeat till the most significant digit of the biggest digit element. 
 
====> now dont use comparison based sort for each iteration, because that defeats the purpose of digit-based(hence the term "radix") sort. 
====> we use counting/bucket sort for each iteration. thus the complexity O(k*n) where k = max digits. 
====> for this to work, we must consider certain assuptions as per the next section. 
 
 
e.g. 
 
input elements = {324,13,84,467,73,9,70,38,14,931,81,20}  // 
 
n = 12  // this is given or you count at your first iteration 
k = 3   // this you have to determine somehow 
 
now bucket sort based on the LSD. 
 
 bucket 
  [0] = 70 - 20    // preserving the order from the input elem is important for this to be a stable sort 
  [1] = 931 - 81 
  [2] = null 
  [3] = 13 - 73 
  [4] = 324 - 84 - 14 
  [5] = null 
  [6] = null 
  [7] = 467 
  [8] = 38 
  [9] = 9 
 
 
now pop each of bucket in sequence. 
 
70 20 931 81 13 73 324 84 14 467 38 9 
 
next iteration based on the second LSD 
 
 bucket 
  [0] = 9            // notice how we interpret 9 = 009 
  [1] = 13 - 14 
  [2] = 20 - 324 
  [3] = 931 - 38 
  [4] = null 
  [5] = null 
  [6] = 467 
  [7] = 70 - 73 
  [8] = 81 - 84 
  [9] = null 
 
now pop each of bucket in sequence. 
 
9 13 14 20 324 931 38 467 70 73 81 84 
 
next iteration based on the third LSD 
 
 bucket 
  [0] = 9 - 13 - 14 - 20 - 38 - 70 - 73 - 81 - 84 
  [1] = null 
  [2] = null 
  [3] = 324 
  [4] = 467 
  [5] = null 
  [6] = null 
  [7] = null 
  [8] = null 
  [9] = 931 
 
 
now pop each of bucket in sequence. 
 
9 13 14 20 38 70 73 81 84 324 467 931    // done 
 
 
 
 
####################### 
###   assumptions   ### 
####################### 
 
as it will be discussed in the complexity analysis section, non-comparison based sort requires certain assumptions or special knowledge about the input data set. 
in case of radix sort, the assumptions are; 
- input elements are integer. (or at least integer representation of data) 
- the size of max(or average depending on your implementation) digits is smaller than N(= number of input elements) 
 
 
(ref) 
http://en.wikipedia.org/wiki/Radix_sort 
 
 
 
################################## 
####    complexity analysis   #### 
################################## 
 
n = number of elements to sort 
k = max(or average depending on your implementation) element length(digits) 
 

# time 

 
best    : O(k*n) 
average : O(k*n) 
worst   : O(k*n) 
 
as discussed in prev sections around the flow and the assumptions about radix sort, we just iterate the whole input for each digit. each iteration you put each element into the corresponding bucket. effectively counting/bucket sort. thus O(k*n). 
as such, for radix sort to be efficient, i.e. O(n), k has to be insignificantly small compared to n. 
 
 

# space 

 
n = number of elements to sort 
m = bucket size = radix value 
 
total     : O(n + auxiliary) = O(n + m+n) = O(m+n) 
auxiliary : O(m+n) 
 
at each iteration, we prepare bucket[m] and each bucket is a LL or queue. thus O(m+n). again, assuming m is insignificantly small to a large N, this is essentiall auxiliary O(n) space complexity. and m can be diff when you use counting sort or bucket sort for the underlying sort at each iteration of digit. 
 
 
 
###################################### 
####    implementation example    #### 
###################################### 
 
lets use queue for each bucket[i] for the simplicity's sake. 
(ref) 
http://www.cplusplus.com/reference/queue/queue/ 
 
assume input elems as 10-base integer. 
 
to determine the number of k, max digit, i used a counter += input[i] / 10^n; 
if, at the end of the iteration, the counter is 0, we know we are at the kth iteration. 
 
 
------------------------------------------- // radix.cpp 
 
#include <iostream> 
#include <queue> 
using namespace std; 
 
void printArr(int* arr, int size) 

  for(int i = 0; i < size ;i++) 
    cout << arr[i] << " "; 
  cout << endl; 

 
int power(int base, int expon)  // can use power() from <cmath> if preferred 

  if(expon == 0) 
    return 1; 
  return base * power(base, expon-1); 

 
void radix_sort(int* arr, int size) 

 
  queue<int> bucket[10];  // 10 is fixed for 10-base int 
  int counter = 0; 
  int k = 0; 
 
  while(true) // loop until we go k+1 iteration 
    {         // that detection is done using counter as below 
 
      int idx = 0; 
      for(int i = 0; i < size ; i++) 
        { 
          int div = arr[i] / power(10,k); 
          idx = div % 10 ; 
          bucket[idx].push(arr[i]); 
 
          if(div > 0) counter++; 
        } 
      if(counter == 0) 
        break; 
      else 
        counter = 0; 
 
      int tmpidx = 0; 
      for(int i = 0; i < 10 ; i++)   // 10 = bucket size 
        while(!bucket[i].empty()) 
          { 
            arr[tmpidx++] = bucket[i].front(); 
            bucket[i].pop(); 
          } 
      k++; 
    } 

 
int main() 

  int inputArr[] = {123,56,7777,291,14,6,9,27,83,70,990,4321}; 
  int arrSize = sizeof(inputArr)/sizeof(inputArr[0]); 
 
  printArr(inputArr,arrSize);     //  123 56 7777 291 14 6 9 27 83 70 990 4321 
  radix_sort(inputArr,arrSize); 
  printArr(inputArr,arrSize);     //  6 9 14 27 56 70 83 123 291 990 4321 7777 
 
  return 0; 

 
------------------------------------------- 
 
 
 
 

  1. 2014-07-07 23:23:21 |
  2. Category : algo
  3. Page View:

Google Ads