Thursday, 2 June 2016

My fav little algorithm

So, just doing a bit more study and i thought i would share my favourite Algorithm so far, count sort.

I like count sort cos it's so easy and it just does it.
Count sort is non-comparison (which means it doesn't compare this number vs this number), and it's stable. This means that things are left in the order in which the appeared in the original list.

So here is how it all goes down, lets work on a list that looks like ..... alist = [2,5,3,0,2,3,0,3]

It has two stages:
Stage 1: Finding the key positions, count the appearance of that number in the list
Stage 2: Using key positions, we place the elements from the input array in the right order in the output array

To find the key positions in our list:
Find the maximum number in our list, in our example this would be.. 5
Now create a list the of [max_num+1] and fill it with zero's... so, in our case that would look like:
index   0, 1, 2, 3, 4, 5, 6
alist = [2, 5 , 3, 0, 2, 3, 0, 3]
kp=    [0, 0, 0, 0, 0, 0, 0] 

So we go thru the list and when we hit a number eg 5 we increment that index value by 1 ie...
index   0, 1, 2, 3, 4, 5, 6, 7
alist = [2, 5 , 3, 0, 2, 3, 0, 3]
kp =    [0, 0, 0, 0, 0, 1, 0] 

So when we have done that, the final array should look like
index   0, 1, 2, 3, 4, 5, 6, 7
alist = [2, 5 , 3, 0, 2, 3, 0, 3]
kp =    [2, 0, 2, 3, 0, 1, 0]

Now we have how often the numbers repeat, we need to use a running sum to organize the array to give us the true key positions we do this by... initializing our sum = 0 and our index = 0
a new c array to keep track of our sums (see what happens with that in a minute)
index    0, 1, 2, 3, 4, 5, 6
kp        [2, 0, 2, 3, 0, 1, 0]
sum = 0
index = 0
idxValue = 2

Put our sum in the array at position index position, each step here we will be incrementing index by 1
how it goes, is we get the value @ index i, switch that value for our current sum, then add the switched value to our sum value. I have tried to show this in one line, reading from left to right
index = 0  idxValue = 2 (kp[index] = 2) sum = 0 kp = [0, 0, 2, 3, 0, 1, 0]  new sum = 0+2
index = 1  idxValue = 0 (kp[index] = 0) sum = 2 kp = [022, 3, 0, 1, 0 new sum = 2+0
index = 2  idxValue = 2 (kp[index] = 2) sum = 2 kp = [022, 3, 0, 1, 0]  new sum = 2+2
index = 3  idxValue = 3 (kp[index] = 3) sum = 4 kp = [0, 2, 2, 4, 0, 1, 0]  new sum = 4+3
index = 4  idxValue = 0 (kp[index] = 0) sum = 7 kp = [0, 2, 2, 4, 0, 1, 0]  new sum = 7+0
index = 5  idxValue = 1 (kp[index] = 1) sum = 7 kp = [2, 0, 2, 4, 7, 7, 0]  new sum = 7+1

We don't go around for the last time in the array

So that gives us the key positions that the numbers should be in

Now we just lookup the value as an index and increment the value in that index
                                                                     index   0, 1, 2, 3, 4, 5, 6, 7
alist[0] = 2, kp[2] = 2 (kp = kp[2] + 1 = 3) output_list = [_, _, 2, _, _, _, _, _ ]
alist[1] = 5, kp[5] = 7 (kp = kp[5] + 1 = 8) output_list = [_, _, _, _, _, _, _, 5 ]
alist[2] = 3, kp[4] = 4 (kp = kp[4] + 1 = 5) output_list = [_, _, _, _, _, 3, _, _ ]
alist[3] = 0, kp[0] = 0 (kp = kp[0] + 1 = 1) output_list = [0, _, _, _, _, _, _, _ ]
alist[4] = 2, kp[2] = 3 (kp = kp[2] + 1 = 4) output_list = [_, _, _, 2, _, _, _, _ ]
alist[5] = 3, kp[3] = 5 (kp = kp[3] + 1 = 6) output_list = [_, _, _, _, _, _, 3, _ ]
alist[6] = 0, kp[0] = 1 (kp = kp[0] + 1 = 2) output_list = [_, 0, _, _, _, _, _, _ ]
alist[7] = 3, kp[3] = 6 (kp = kp[3] + 1 = 7) output_list = [_, _, _, _, 3, _, _, _ ]
                                                             output_list = [0, 0, 2, 2, 3, 3, 3, 5 ]

Notice in this that in red i have highlighted the two indexes that have changed, this is because we already placed a number in the position that was there, so our next number is to the right of that one.

Some code for you to try out..


def key_positions(seq, key):
    array_items = []

    for items in seq:
        array_items.append(key(items))

    # find the max in the array
    max_value = max(array_items)

    # initialise with 0's
    count = [0] * (max_value + 1)

    # count occurances of integers
    for value in array_items:
        count[value] += 1

    # make the initial sum zero
    sum = 0
    print("size of count array = " + str(max_value + 1))
    # iterate over the length of our counting array
    for idx in range(0, len(count)):
        # save the current value we have in the array
        cur_val = count[idx]
        # put the current sum to the index pos in array
        count[idx] = sum
        print("index = {0}: sum = {1} kp = {2} current_value = {3} sum-> {1}+{3}".format(idx, sum, count, cur_val))
        # add the saved value to the sum, and make that the new sum
        sum = sum + cur_val

    print("index = {0}: sum = {1} kp = {2} current_value = {3} sum-> {1}+{3}".format(idx, sum, count, cur_val))

    return count

print(key_positions([2,5,3,0,2,3,0,3], lambda x: x))

2 comments: