## Friday, June 4, 2010

### Algorithm: Determine the element that occurs more than n/2 times in an array of size n

Problem: An array of size n contains integers between 1 to n. You have to determine if there exists an integer which occurs more than n/2 times in the array. Also, find the element. This is also called the majority problem. One more thing, the solution has to be done in linear time and using constant extra space.

Solution: I tried real hard to think it over myself but gave up eventually. Going through the solutions posted in the web, I selected the following ones, which seemed to me more acceptable.

Despite many claims of doing it in under linear time (like logn), it is not possible to do it. I can always find a way to arrange the numbers such that one has to traverse the whole array atleast. Also, sorting is not allowed as it will make it nlogn complexity, greater than linear. O(n) sorting is possible, like radix sort but I think they donot do it in constant extra space. So here are my collected approaches.

Approach-1: A simple solution is to find the median of the integers in the array. Median can be found in O(n) time, called the Selection in Worst case in linear time. Since if an integer exists in majority, it must be the median itself. But median doesn't always guarantee that its the one. Once median is found, we need to run another traversal of array to count the number of times the median element occurs. If its more than n/2, thats the answer else return none.

Before going to next approaches, lets understand the concept of the solution. Since there is only one integer(if any) that can occur more than n/2 times in the array, the trick is every time you encounter 2 different integers adjacent to each other, make arrangements to remove them from the array, or simply pair them together and forget about them. And if adjacent numbers are equal, you continue with next integer in the row but remembering that the integer you encountered just now could be a potential majority element. If we go through this one, we are actually collecting the majority element together, and the in-between different elements are (kind of) removed by pairing them with one of the potential majority element. In other words, we are sacrificing one of the majority element to remove one of the other (different ) element. Eventually, the non-majority elements will vanish but atleast one of majority element will still exist because it is present more than half the times.

Approach-2: Take the first element, consider it as ele, the majority element, and put a mark for it. Traverse through other elements and do the following:
a. If next element in array is equal to ele, increment mark by 1.
b. If next element is not equal to ele, decrement mark by 1.
c. If mark is zero, make mark as 1 and consider the current element as ele.
When traversal is done, take the ele and perform another traversal through the array and count the number of occurrence of ele. If its more than n/2, return it as answer, else return NONE.

pseudocode:
-------------------------------------
m = 0
for (x = i..j)
if (m = 0)
m = 1
ele = A [x]
else if (A [x] = ele)
m = m + 1
else
m = m –1
endif
endfor
Another array traversal and count number of times ele occurs
if count>n/2
return ele;
else
return NULL;
------------------------------------------

Approach 3: This solution is very much similar to the one above, i.e the same concept of taking out pair of different elements out of the array and this one uses a stack to remember the previous encounters.
Create an empty stack. traverse the array and for each new element e,
a. If the stack is empty, push e into the stack.
b. If stack is non-empty, compare e with top of stack e', if both are equal, push e into the stack, else remove e' from top of stack (pop).
c. Once traversal is finished, if stack is non empty, remove the top of stack e' and do another array traversal and count the number of times e' occurs. If its more than n/2, return it as answer else return NONE.
As one can see that the stack performs the work similar to mark in the previous approach.

References:

-- 1. 2. 