Find All Duplicates in an Array
Given an array of integers, 1 ≤ a[i] ≤n(n= size of array), some elements appeartwiceand others appearonce.
Find all the elements that appeartwicein this array.
Could you do it without extra space and in O(n) runtime?
Example:
Input: [4,3,2,7,8,2,3,1] Output: [2,3]
Brainstorm
Fundamentally, this problem is asking us to keep track of integers and the number of their occurrences. We can use a hash table to help us achieve this.
Approach #1: Hash Table
We can use the unordered_map data structure from C++'s STL. While iterating through the array, if we find that the integer we are at was already found before (as evidenced by the hash map's value for that integer being 1), then we add it to the array of duplicates. Otherwise, we mark it as found (by setting the hash map's value for that integer to 1), and move on.
vector<int> findDuplicates(vector<int>& nums) {
unordered_map<int, int> myMap;
vector<int> dups;
for (int i = 0; i < nums.size(); i++) {
if (myMap[nums[i]] == 1)
dups.push_back(nums[i]);
myMap[nums[i]] = 1;
}
return dups;
}
Time complexity: $$O(n)$$
Iterating through the array will be O(n), and since looking up a key-value pair in a hash table is constant time, the overall time complexity will be O(n).
Space complexity: $$O(n)$$
Since we use a hash table that keeps track of all key-value pairs in the array, and there will be n pairs (for each integer), the space complexity will be O(n).
Could we optimize this algorithm to get rid of a hash table and save space? Yes!
Approach #2: Use the Array Itself to Help Us Keep Track of Duplicates
The optimization comes from modifying the array itself to keep track of duplicates, thereby eliminating extra space. The approach here is that with each element i of the array, we can mark the element at index nums[i] - 1 to something special that indicates duplicate. If we find that nums[i] - 1 has already been marked, that means the element i has a duplicate. For our solution, we'll mark i-1 as the corresponding negative value of i. This works because we'll never have to go back to view previous elements, so we can safely modify them as we wish.
vector<int> findDuplicates(vector<int>& nums) {
vector<int> dups;
for (int i = 0; i < nums.size(); i++) {
// Every time we find a number i, set the number at position i - 1 to negative
int index = abs(nums[i]) - 1;
// If nums[index] is already negative, index is the number that occurs twice
if (nums[index] < 0)
dups.push_back(abs(index+1));
// Else set it to negative
nums[index] = -nums[index];
}
return dups;
}
Time complexity: $$O(n)$$
Iterating through the array will be O(n), and all our operations inside it are constant time.
Space complexity: $$O(1)$$
Having successfully eliminated the extra space, our space complexity is now constant.