## 217. Contains Duplicate

Given an array of integers, find if the array contains any duplicates. Your function should return true if any value appears at least twice in the array, and it should return false if every element is distinct.

b'
\n\n

\n

\n

## Solution

\n

#### Approach #1 (Naive Linear Search) [Time Limit Exceeded]

\n

Intuition

\n

For an array of integers, there are pairs of integers. Thus, we may check all pairs and see if there is any pair with duplicates.

\n

Algorithm

\n

To apply this idea, we employ the linear search algorithm which is the simplest search algorithm. Linear search is a method of finding if a particular value is in a list by checking each of its elements, one at a time and in sequence until the desired one is found.

\n

For our problem, we loop through all integers. For the th integer nums[i], we search in the previous i-1 integers for the duplicate of nums[i]. If we find one, we return true; if not, we continue. Return false at the end of the program.

\n

To prove the correctness of the algorithm, we define the loop invariant. A loop invariant is a property that holds before (and after) each iteration. Knowing its invariant(s) is essential for understanding the effect of a loop. Here is the loop invariant:

\n
\n

Before the next search, there are no duplicate integers in the searched integers.

\n
\n

The loop invariant holds true before the loop because there is no searched integer.\nEach time through the loop we look for any any possible duplicate of the current element.\nIf we found a duplicate, the function exits by returning true; If not, the invariant still holds true.

\n

Therefore, if the loop finishes, the invariant tells us that there is no duplicate in all integers.

\n

Java

\n
public boolean containsDuplicate(int[] nums) {\n    for (int i = 0; i < nums.length; ++i) {\n        for (int j = 0; j < i; ++j) {\n            if (nums[j] == nums[i]) return true;  \n        }\n    }\n    return false;\n}\n// Time Limit Exceeded\n
\n

Complexity Analysis

\n
\n
• \n

Time complexity : . In the worst case, there are pairs of integers to check. Therefore, the time complexity is .

\n
• \n
• \n

Space complexity : .\nWe only used constant extra space.

\n
• \n
\n

Note

\n

This approach will get Time Limit Exceeded on LeetCode. Usually, if an algorithm is , it can handle up to around . It gets Time Limit Exceeded when .

\n
\n

#### Approach #2 (Sorting) [Accepted]

\n

Intuition

\n

If there are any duplicate integers, they will be consecutive after sorting.

\n

Algorithm

\n

This approach employs sorting algorithm. Since comparison sorting algorithm like heapsort is known to provide worst-case performance, sorting is often a good preprocessing step. After sorting, we can sweep the sorted array to find if there are any two consecutive duplicate elements.

\n

Java

\n
public boolean containsDuplicate(int[] nums) {\n    Arrays.sort(nums);\n    for (int i = 0; i < nums.length - 1; ++i) {\n        if (nums[i] == nums[i + 1]) return true;\n    }\n    return false;\n}\n
\n

Complexity Analysis

\n
\n
• \n

Time complexity : .\nSorting is and the sweeping is . The entire algorithm is dominated by the sorting step, which is .

\n
• \n
• \n

Space complexity : .\nSpace depends on the sorting implementation which, usually, costs auxiliary space if heapsort is used.

\n
• \n
\n

Note

\n

The implementation here modifies the original array by sorting it. In general, it is not a good practice to modify the input unless it is clear to the caller that the input will be modified. One may make a copy of nums and operate on the copy instead.

\n
\n

#### Approach #3 (Hash Table) [Accepted]

\n

Intuition

\n

Utilize a dynamic data structure that supports fast search and insert operations.

\n

Algorithm

\n

From Approach #1 we know that search operations is in an unsorted array and we did so repeatedly. Utilizing a data structure with faster search time will speed up the entire algorithm.

\n

There are many data structures commonly used as dynamic sets such as Binary Search Tree and Hash Table. The operations we need to support here are search() and insert(). For a self-balancing Binary Search Tree (TreeSet or TreeMap in Java), search() and insert() are both time. For a Hash Table (HashSet or HashMap in Java), search() and insert() are both on average. Therefore, by using hash table, we can achieve linear time complexity for finding the duplicate in an unsorted array.

\n

Java

\n
public boolean containsDuplicate(int[] nums) {\n    Set<Integer> set = new HashSet<>(nums.length);\n    for (int x: nums) {\n        if (set.contains(x)) return true;\n        set.add(x);\n    }\n    return false;\n}\n
\n

Complexity Analysis

\n
\n
• \n

Time complexity : .\nWe do search() and insert() for times and each operation takes constant time.

\n
• \n
• \n

Space complexity : .\nThe space used by a hash table is linear with the number of elements in it.

\n
• \n
\n

Note

\n

For certain test cases with not very large , the runtime of this method can be slower than Approach #2. The reason is hash table has some overhead in maintaining its property. One should keep in mind that real world performance can be different from what the Big-O notation says. The Big-O notation only tells us that for sufficiently large input, one will be faster than the other. Therefore, when is not sufficiently large, an algorithm can be slower than an algorithm.

\n