Google Code Jam solution for Alien Language

Christian Harms

Problem

In the 2009 qualification round there was a simple problem with a nice background story:

After years of study, scientists at Google Labs have discovered an alien language transmitted from a faraway planet. The alien language is very unique in that every word consists of exactly L lowercase letters. Also, there are exactly D words in this language.

Once the dictionary of all the words in the alien language was built, the next breakthrough was to discover that the aliens have been transmitting messages to Earth for the past decade. Unfortunately, these signals are weakened due to the distance between our two planets and some of the words may be misinterpreted. In order to help them decipher these messages, the scientists have asked you to devise an algorithm that will determine the number of possible interpretations for a given pattern.

A pattern consists of exactly L tokens. Each token is either a single lowercase letter (the scientists are very sure that this is the letter) or a group of unique lowercase letters surrounded by parenthesis ( and ). For example: (ab)d(dc) means the first letter is either a or b, the second letter is definitely d and the last letter is either d or c. Therefore, the pattern (ab)d(dc) can stand for either one of these 4 possibilities: add, adc, bdd, bdc.
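To make the token rules concrete, here is a minimal sketch that expands a pattern into every word it can stand for. The helper name `expand_pattern` is made up for this illustration and is not part of the solution below; it assumes the pattern is well formed (lowercase letters and balanced parentheses only).

```python
import itertools
import re

def expand_pattern(pattern):
    """Return every word a pattern such as "(ab)d(dc)" can stand for."""
    # Each token is either a parenthesised group or a single letter.
    tokens = re.findall(r"\(([a-z]+)\)|([a-z])", pattern)
    # findall yields (group, letter) pairs; exactly one of the two is non-empty.
    choices = [group or letter for group, letter in tokens]
    # Pick one letter per position in every possible combination.
    return ["".join(letters) for letters in itertools.product(*choices)]

print(expand_pattern("(ab)d(dc)"))  # ['add', 'adc', 'bdd', 'bdc']
```

Expanding a pattern like this only works for tiny inputs, because the number of interpretations grows with every group, so the actual task is to test dictionary words against the pattern instead.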

solution with regexp

We have a set of valid words and a list of patterns to match against them. For each pattern, how many words does it match?

    words = ["abc", "bca", "dac", "dbc", "cba"]

The first pattern “(ab)(bc)(ca)” means that the first character can be “a” or “b”, the second “b” or “c”, and the third “c” or “a”.

The solution should print how many words in the alien language match the pattern. After so many occurrences of “match” and “pattern” you can guess the solution: regular expressions! Each pattern becomes a regular expression once the parentheses are replaced by square brackets, which turns every group into a character class:

    searchStr = line.replace("(", "[").replace(")", "]")
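The conversion can be tried on the sample words before reading any input file. The following is a small sketch in Python 3; the function name `count_matches` is an assumption made for this example, and it uses `fullmatch` rather than `search` so that a word only counts when the pattern covers it completely.

```python
import re

def count_matches(pattern, words):
    """Count the dictionary words that a Code Jam style pattern can stand for."""
    # "(ab)" becomes the character class "[ab]"; single letters stay as they are.
    regex = re.compile(pattern.replace("(", "[").replace(")", "]"))
    # fullmatch requires the whole word to be covered, not just a part of it.
    return sum(1 for word in words if regex.fullmatch(word))

words = ["abc", "bca", "dac", "dbc", "cba"]
print(count_matches("(ab)(bc)(ca)", words))  # 2 ("abc" and "bca")
print(count_matches("abc", words))           # 1
```

Because every word has exactly L letters and every pattern exactly L tokens, `search` and `fullmatch` give the same answer on valid input; `fullmatch` is just the more defensive choice.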

The complete solution uses the filter function again (read more about the functional part of Python) to keep the code short:

    import re
    import sys

    with open(sys.argv[1]) as fp:
        # read parameters: word length, dictionary size, number of patterns
        l, d, n = (int(x) for x in next(fp).split())

        # read the dictionary words, dropping the trailing newline
        words = [next(fp).strip() for _ in range(d)]

        # turn each pattern into a regex and count the words it matches
        for i in range(1, n + 1):
            searchStr = next(fp).strip().replace("(", "[").replace(")", "]")
            searchIt = re.compile(searchStr).search
            print("Case #%d: %d" % (i, len(list(filter(searchIt, words)))))

timing

25 words with 10 characters and 10 patterns to check:

    time python alien.py alien_small.in > alien_small.out

    real    0m0.120s
    user    0m0.108s
    sys     0m0.012s

5000 words with 15 characters and 500 patterns to check:

    time python alien.py alien_large.in > alien_large.out

    real    0m13.398s
    user    0m12.821s
    sys     0m0.316s

other solutions

I found (longer) solutions in other programming languages. Feel free to read and comment on them, or offer an alternative or even better solution, and we will link your article.
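As one possible alternative in Python itself, the pattern can be parsed into a set of allowed letters per position and each word checked letter by letter, without any regular expression. This is only a sketch: the names `parse_pattern` and `count_matches_sets` are invented for this example, well-formed patterns are assumed, and whether it runs faster than the regex version would have to be measured.

```python
def parse_pattern(pattern):
    """Split a pattern into one set of allowed letters per position."""
    allowed = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "(":
            # A group: collect every letter up to the closing parenthesis.
            close = pattern.index(")", i)
            allowed.append(set(pattern[i + 1:close]))
            i = close + 1
        else:
            # A single letter the scientists are sure about.
            allowed.append({pattern[i]})
            i += 1
    return allowed

def count_matches_sets(pattern, words):
    """Count the words whose letters all fall into the allowed sets."""
    allowed = parse_pattern(pattern)
    return sum(
        1
        for word in words
        if len(word) == len(allowed)
        and all(ch in letters for ch, letters in zip(word, allowed))
    )

words = ["abc", "bca", "dac", "dbc", "cba"]
print(count_matches_sets("(ab)(bc)(ca)", words))  # 2
```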

Comments

Anuj Mehta's picture
Nico Heid's picture

good to see a java solution ;-)

Bragaadeesh's picture

Wow.. thats a crisp solution
