Facebook hackercup 2012 - round 1 with merge sort

Christian Harms's picture

The first round of the facebook hackercup 2012 had three problems. All problems have to be solved in 24 hours and coders with three successful solved entries goes to round 2.

The problem "Recover the Sequence" can be solved in two lines of code (+ some lines for input and output).

Recover the Sequence - problem

Merge sort is one of the classic sorting algorithms. Conceptually, a merge sort works as follows. Divide the unsorted list into n sublists (n is 2 in the example), each containing 1 element (a list of 1 element is considered sorted). Repeatedly Merge sublists to produce new sublists until there is only 1 sublist remaining. (This will be the sorted list.)

The problem description starts with the pseudo code - I translated into a python class.

  1. class RecoverProblem(object):
  2.  
  3.     def __init__(self, arr, debug):
  4.         self.arr = arr
  5.         self.debug = debug
  6.  
  7.     def merge_sort(self, arr):
  8.         """normal merge_sort starter."""
  9.         if len(arr)<=1:
  10.             return arr
  11.         mid = len(arr)/2
  12.         left = self.merge_sort(arr[0:mid])
  13.         right = self.merge_sort(arr[mid:])
  14.         return self.merge(left, right)
  15.  
  16.     def cmp_item(self, item1, item2):
  17.         return item1 < item2
  18.  
  19.     def merge(self, arr1, arr2):
  20.         """merge arr1 and arr2 in sorted way."""
  21.         result = []
  22.         while len(arr1)>0 and len(arr2)>0:
  23.             if self.cmp_item(arr1[0], arr2[0]):
  24.                 if self.debug: print "1"
  25.                 result.append(arr1.pop(0))
  26.             else:
  27.                 if self.debug: print "2"
  28.                 result.append(arr2.pop(0))
  29.         result.extend(arr1)
  30.         result.extend(arr2)
  31.         return result
  32.  
  33.     def check_sum(self, arr):
  34.         ret = 1
  35.         for n in arr:
  36.             ret = (31*ret+n) % 1000003
  37.         return ret
  38.  
  39.     def solve(self):
  40.         """sort the array and return the checksum."""
  41.         return self.check_sum(self.merge_sort(self.arr))

To validate the result array an checksum function is defined to comparing your solution with the expected values.

The second part is the problem description: A very important permutation of the integers 1 through N was lost to a hard drive failure. Luckily, that sequence had been sorted by the above algorithm and the debug sequence of 1s and 2s was recorded on a different disk. You will be given the length N of the original sequence, and the debug sequence. Recover the original sequence of integers.

example data

In the first example, N is 2 and the debug sequence is 1. The original sequence was 1 2 or 2 1. The debug sequence tells us that the first number was smaller than the second so we know the sequence was 1 2. The checksum is 994.

In the second example, N is 2 and the debug sequence is 2. This time the original sequence is 2 1.

In the third example, N is 4 and the debug sequence is 12212. The original sequence is 2 4 3 1.

solution

My first try was to reverse this merge-sort algorithm - but this looks impossible! The idea after one night was simple. The sort-sequence (1 and 2 from the logfile) will map a n-length array into a second n-length-array. You have only reverse this mapping.

And this is my short solution. I replaced the compare-function with the log-sequence and run it on a sorted array (line 11). The original data can be recovered by appling the mapping.

  1. class RecoverSolution(RecoverProblem):
  2.     def __init__(self, n, seqStr):
  3.         self.n = n
  4.         self.seq = list(seqStr)
  5.  
  6.     def cmp_item(self, item1, item2):
  7.         return self.seq.pop(0)=="1"
  8.  
  9.     def solve(self):
  10.         """generate the sorted array, rearrange  and return the checksum."""
  11.         sort = self.merge_sort(range(1, self.n+1))
  12.         return self.check_sum([sort.index(x+1)+1 for x in range(self.n)])
  13.  
  14. if __name__=="__main__":
  15.     input = sys.stdin.read().split()
  16.     for case in range(int(input.pop(0))):
  17.         #first line is n, second line is the sort sequence as string
  18.         solver = RecoverSolution(int(input.pop(0)), input.pop(0))
  19.         print "Case #%d: %d" % (case+1, solver.solve())

My solution was correct and I got one point.

Comments

senthil's picture

The idea of mapping the sorted sequence using the modified comparison is just brilliant. I was trying the other way request reversing merge sort, etc etc. You solution is superb. :)