Robert Escriva

Things I wish I learned earlier

Daemon Challenge 3 Dissection

So I guess I am efficient. I've been chosen as the winner of Dustin Kirkland's Daemon Challenge 3 I'm not going to provide a full writeup as to how the challenge can be completed; Dustin did that thoroughly here. Instead I'm going to discuss the reason the approaches outlined by Dustin and Dave Walker are superior to the less efficient methods. In the process you'll probably learn how SHA512, and hashing functions in general, work.

My Approach

This is the main approach I used to tackle this problem.

Cracking the MD5

So as mentioned before the solution was relatively straightforward. For cracking the hash, I took the same approach as Dave and turned to Google. Unfortunately, Google did not know the original hash (whose input included a trailing newline). If Google had not provided me with a solution, I had a contingency plan: Andrew Zonenberg's distributed hash cracker This project has massive potential. It's a distributed hash cracking system that will eventually support pluggable hashes. Using his system on 2 cores, he managed to obtain 12 million hashes per second and was able to find the answer in just 69.15 seconds; this is the worst case scenario as he was scanning the entire 8-bit domain. It took only 2 seconds flat to scan alphanumeric permutations. A big thanks to him for being there as a backup in case Google failed me. (update: my original numbers for Andrew's cracker were way off)

On another side-note it appears that this is a common trend in these contests. So much so that it might be advantageous to add a feature to his cracker that tries all combinations with a newline at the end. During one contest RPISEC was a participant in, we had to reverse a hash of 'potaton' and had trouble brute-forcing it until Google came to the rescue.

Solving the riddle

To solve the riddle inside the ecryptfs, I wrote a Python script that exploited the round structure of SHA512. I also wrote a bash script (for kicks) that has been running since Wednesday and has yet to reach 750,000 lines in the file. I will cede that there are improvements to be made in the bash script, but they still pale in comparison to the Python script. Those interested can grab my script: sha512.sh

My Python script takes a more efficient approach which I'll describe in a following section. This script takes about 4.1 s real time on my Q6600 desktop. Much better than several days worth of time.

Another (non-trivial) savings is that the Python script is a single process that runs until completion. My bash script will fork millions of processes over the course of its lifetime.

The answer to the questions inside the riddle can be found on Dustin's blog.

SHA512 Structure

SHA512 uses what's called a compression function to map an arbitrarily long string to a constant-length (128 chars) string. To do this it operates in chunks of a fixed length, calculating an initialization vector for the next chunk. Python's implementation of SHA512 will not rehash data from previous chunks. Each time the update function is called, it calculates the next initialization vector, thus minimizing the time necessary to perform the steps of the riddle. The bash implementation recalculates these chunks each and every time. That's a lot of excess and wasteful work being done by a CPU assigned to process these tasks.

Time Analysis

In short, the difference between the two algorithms comes down to exploiting the round structure and compression function of SHA512. But just how many resources do we waste? The short answer is it is the difference between O(n) and O(n^2). The long answer is below:

Each line in the file is 129 characters; that is, 128 characters for the hash, and 1 for the newline. The linear time algorithm will hash 999,999 * 129 characters, and it will write 1,000,000 * 129 characters to memory (I'm ignoring the initial calculation of 'Daemonn' as it doesn't matter in the long run).

The shell script, in comparison, will do much worse, even if we assume equal read and write performance compared to the in-memory Python implementation. In short, it reads in the following pattern:

129 * 1 + 129 * 2 + 129 * 3 + 129 * 4 + ... + 129 * 999,997 + 129 * 999,998 + 129 * 999,999

Which can be more concisely expressed as:

/files/2009-01-23-daemon-challenge-3-dissection/ay.gif

Note that these numbers are in bytes which equates to 58 TB of data run through the SHA512 sum program. Compare that to the 123 MB of data processed by the Python script. It is also good to notice that I run wc -l to calculate the number of iterations that remain. The bash script is reading the file multiple times at each iteration, which implies that 58 TB is scaled by a constant factor. Talk about inefficiency.

Conclusion

In conclusion I'd like to express my gratitude to many people:

  • Dustin Kirkland for all the time and effort he put into this contest. It was something unique and gave me that "hacker high" that we all strive to achieve.
  • Daniel Suarez and his publishers. I look forward to enjoying my copy of Daemon when it arrives.
  • Google for being such a quality hash cracking utility.
  • Andrew Zonenberg for his extremely efficient MD5 implementation. If you're looking for a good cracker, I recommend following his work.
  • Yonatan Naamad for the improved latex/gif.

One last thought: Computing time is generally less expensive than a programmer's time; but a poor implementation cannot be improved through more computing power.

Copyright © 2010 Robert Escriva ¦ Powered by Firmant