Question: There is a size-N array of integers whose values range from 1 to N. Some integers are duplicated. Find the first duplicate and its index without using any extra memory (not even O(1)).
Solution
There are a few keys in the question that we should consider:
- The array size is a finite number N which is also the upper limit for any integer inside the array. Hence, if the array size is 10, then there are ten elements in the array and each of those elements is between 1 and 10.
- Some elements are just duplicates. That means the array may not contain all numbers between 1 and N.
- We need to find only the first duplicate. Thus, as soon as we discover one, we can stop.
- No extra memory is allowed, not even O(0). That means we can’t use any extra storage or declare any new variable inside our algorithm because that is O(1).
To solve this problem, we must keep track of elements in order to figure out which one is duplicate. However, regardless of method, we must not use any extra memory. Here is where we must depend on the information given by the problem.
Method 1 - Using modulo operator
We know that each element’s value is between 1 and N, so if we increase that value by (N + 1), the modulus of that value and N + 1 is still the same. For example, modulus of 2 and (N + 1) is 2 and modulus of 2 + N + 1 is still 2. Thus, we can mark a visited element by adding N + 1 into its value or any element’s value in the array because that doesn’t change its modulus with N + 1 at all!
Furthermore, we know that each element’s value is between 1 and N. Hence, we can use each element’s value as a key and the element at array[key] as the mark. For example, if the current element is 4 then we can check if its value has been visited, and thus a duplicate, by looking at the value of the element at index 4 in the array, if the current element is 2 the we check the value of the element at index 2 in the array, and so on.
- One final point, we need to traverse the array and check each element, so we need a loop that run till we find the duplicate or till we reach the end of the array. The problem with such loop is that it needs additional counter / sentinel value as the loop’s termination condition! That requires memory allocation. Fortunately, we know that the first element is not a duplicate because there is no other element in front of it. Thus, we can use the first element in the array as our counter.
void print\_repeats(unsigned a\[\], unsigned n)
{
unsigned i, \_2n = 2\*n;
for(i = 0; i < n; ++i)
if(a\[a\[i\] % n\] < \_2n) a\[a\[i\] % n\] += n;
for(i = 0; i < n; ++i)
if(a\[i\] >= \_2n) printf("%u ", i);
}
```This algorithm is short but difficult to understand without doing an example, so lets take a look at the algorithm implementation and go straight to an example:
**Dry run**
Input - n be 7 and array be {1, 2, 3, 1, 3, 6, 6}
Output - 1
first foreach -
i = 0, a\[a\[0\]% 7 \] = a\[1\] = 2 < 14 => a\[1\] = 10 and array = {1,9 , 3,1,3,6,6,}
i = 1, a\[a\[1\]%7\] = a\[2\] = 3 <14 => a\[2\] = 10 and array = {1,9,10,1,3,6,6}
i = 2, a\[a\[2\]%7\] = a\[3\] = 1 < 14 => a\[3\] = 8 and array = {1,9,10,8,3,6,6}
i = 3, a\[a\[3\]%7\] = a\[1\] = 9< 14 => a\[1\] = 16 and array = {1,16,10,8,3,6,6}
Ideally we will break here as we need only 1 as answer. We can continue if want the other answers as well, like 3 and 6.
i = 4, a\[a\[4\]%7\] = a\[3\] = 8 <14 => a\[3\] = 15 and array = {1,16,10,15,3,6,6}
i = 5,a\[a\[5\]%7\] = a\[6\] = 6 < 14 => a\[6\] = 13 and array = {1,16,10,15,3,6,13}
i = 6, a\[a\[6\]%7\] = a\[13%7\] = 6 =<14 => a\[6\] = 20 and array becomes {1,16,10,15,3,6,20} and indices where number is more than 2n is our answer.
**Method 2 - Using sign**
traverse the list for i= 0 to n-1 elements
{
check for sign of A[abs(A[i])] ;
if positive then
make it negative by A[abs(A[i])]=-A[abs(A[i])];
else // i.e., A[abs(A[i])] is negative
this element (ith element of list) is a repetition
}
Here is the code:
void printRepeating(int arr[], int size)
{
int i;
printf(“The repeating elements are: \n”);
for (i = 0; i < size; i++)
{
if (arr[abs(arr[i])] >= 0)
arr[abs(arr[i])] = -arr[abs(arr[i])];
else
printf(” %d “, abs(arr[i]));
}
}
Time complexity - O(n), space complexity - O(1), we can make it O(0) for space.
**Reference**
* [http://codesam.blogspot.com/2011/04/find-first-duplicated-integer-in-array.html](http://codesam.blogspot.com/2011/04/find-first-duplicated-integer-in-array.html)
* [http://stackoverflow.com/questions/5739024/finding-duplicates-in-on-time-and-o1-space/5739336#5739336](http://stackoverflow.com/questions/5739024/finding-duplicates-in-on-time-and-o1-space/5739336#5739336)
* [http://www.geeksforgeeks.org/find-duplicates-in-on-time-and-constant-extra-space/](http://www.geeksforgeeks.org/find-duplicates-in-on-time-and-constant-extra-space/)