Given a dictionary of word - Group all words into set of anagrams of each other

Given a set of words in a dictionary, write a program to group all words which are anagrams of each other in to sets.

input: “bat”, “tar”, “xyz” , “tab”, “tar”
output:[[“bat”, tab”], [“tar”, rat”],”xyz” ]

Anagrams
Two words are anagrams if one of them has exactly same characters as that of the another word.

Example : Anagram & Nagaram are anagrams (case-insensitive).

Approach
The simpler logic would be to :

Use some sort of a hash structure - key being string, and value may be list of words
Sort all words alphabetically and use the sorted word as the key, the value being the non sorted word. So bat will become abt, and it is the key in the dictionary and values will be bat and tab.
If the key is found, keep on adding the original word to the ‘value’ list of strings

Something like this :
Solution 1 : Using hash table

def compare\_sort(x):  
        sorted\_key\_list = list(x)  
        sorted\_key\_list.sort()  
        sorted\_str = ','.join(str(n) for n in sorted\_key\_list)  
        sorted\_str = sorted\_str.replace(',','')  
        return sorted\_str  
  
source = \['tab','bat','atb','tac','cat','xyz'\]  
  
target\_hash = {}  
  
for x in source:  
        sorted\_key = compare\_sort(x)  
        print sorted\_key  
  
        if target\_hash.has\_key(sorted\_key):  
                existing\_list = target\_hash\[sorted\_key\]  
                existing\_list.append(x)  
                target\_hash\[sorted\_key\] = existing\_list  
        else:  
                target\_list =\[x\]  
                target\_hash\[sorted\_key\] = target\_list  
  
print target\_hash

Complexity: number of words in dictionary
space complexity : hash table entries

Solution 2 - Assigning characters with prime numbers

The obvious solution is to map each character to a prime number and multiply the prime numbers. So if ‘a’’ -> 2 and ‘b’ -> 3, then

‘ab’ -> 6
‘ba’ -> 6
‘bab’ -> 18
‘abba’ -> 36
‘baba’ -> 36

To minimise the chance of overflow, the smallest primes could be assigned to the more frequent letters (e,t,i,a,n). Note: The 26th prime is 101.So, for all the words where the integer comes out to be same, you have an anagram. Here is more you can read on this.

See also