------------------------------------------------------------------------- ____ ___ ____ __ __ __ |_ /( _ )/ __/ ___ ___ / /_ The Trailing Edge \ \ /_/_ </ _ / _ \_ / _ \/ -_) __/ of Technology /_\_\/____/\___/\___(_)_//_/\__/\__/ ------------------------------------------------------------------------- NEWS About IRC Web Client The first challange... {first 10-digit prime found in consecutive digits of e}.com Ok, e is a nice non-terminating non repeating number. Think pi. A prime number is one that is only evenly disvisible by itself and 1. The range of possible values is 1000000000 to 9999999999. The idea is you take the sequence of e, after working it out yourself to some extreme amount of digits, and start working through it, 10 digits at a time, testing to see if the sequence is prime. Lots of math, my head hurts, whats the easy way? A quick perl script that sequentially tests for the existance of n.com, n being a 10 digit number, showed my 386 capible of resolving 33 hostnames a second when using an external resolver. Switching to a locally running caching name server dragged me down to 13. At peak speed, testing every number, I'm looking at 7 years to go through the entire number space. Ok, thats just a bit silly. We can knock out all numbers that end in 0, or are even, which gets me closer, 3.5 years... but thats still kinda slow. What I need to do is some very quick and dirty tests on each number to see if it's a likely prime, if it is, THEN test for the host name. Hopefully I can do math quicker than I can resolve DNS and cut down the test pool while still being able to do 33 lookups a second. (Realistically, 33 lookups a second is a bit rude to the dns system, so leaning twards more math, less lookups is probalbly a good idea.) Some quick testing shows I can do approximately 5000 divisions a second. To tune the script, I just have to decide how long I'm willing to let it run, and thats how many tests I can do on each number. At the absolute shortest, I'm looking at 20 days if I just do a single division per number. 7 years to 20 days, we're gaining. If I change my loop to incriment by 2 instead of 1, skipping even numbers, we're cutting the work in half, but does it pan out in actual speed? A quick bench shows that my test speed has gone up to 7500 numbers a second. So, I need to write the app so that I'm generating 10 candidates a second to max out dns, or spend more time on math, less on lookups. My peak speed of churning through numbers puts me at 1 week to generate a list of candidates if I do one test per number, weeding out the evens ahead of time. A week minimum, seems pretty long for a 386 to churn on something. I'm used to far snappier performance from this box. There has to be a way to reduce the test set down. If only I could get a list of all prime numbers in the target range, then I wouldn't have to do any math, and could slam through all the possibilities with less wasted cpu cycles. According to the Prime Pages there are roughly 92 million primes between 100,711,433 and 2,000,000,009. So, based on those numbers, roughly 5% of the typical numberspace at this digit count is prime. That means in the range of numbers we're looking at, there are roughly 470 million valid targets. Testing 10 a second, that drags brute force out to 537 days, a year and a half. I can't wait for my schwank new job at google that long. Even with a list of primes thats far too much to dig through. I guess I'm back to figuring out e and testing, the way google wants. Googling for 'digits of e' comes up with a hit that looks promising right off the bat. Its someone bragging they already figured out google's first two challanges. Bah, I'm a hack, not a cheat. Hit number 2, the first 5 million digits of e, in html. WHEEE. Hope that prime is in there. 10 lookups a second puts me at 6 days to test all odd 10 digit numbers in the first 5 million digits of e. So, using the file provided, I've used a perl script that moves a 10 character window through the list, performs a quick battery of tests to determine if the number is a likely prime, and then does a nslookup. Using a very simplistic feedback loop based on the number of nslookups a second, it adjusts how hard it test numbers to see if they are prime. It targets 4 to 5 lookups a second, a rate I'm comfortable with. After settling it estimates 16 hours to churn through the frist 5 million digits of e. If that doesn't generate the number, I'll use the tuning data to run a brute force attack on the whole available number space occupied by all 10 digit numbers. Currently I estimate it'd take 1.5 years given my current speed to do so, but I should gain speed by not having to parse a 5mb text file for data, so that could shorten things. I also need to decide if building a DB of all unique 10 digit numbers in the first 5 million digits of e and using that as a quick exclude list is worth while. After correcting a bug in my script, foolish me for not considering the size of the integers I was dealing with, I was able to get the correct result in 4 minutes. google4.pl_