-------------------------------------------------------------------------
        ____ ___  ____            __ 
 __ __ |_  /( _ )/ __/  ___  ___ / /_ The Trailing Edge
 \ \ /_/_ </ _  / _ \_ / _ \/ -_) __/ of Technology
/_\_\/____/\___/\___(_)_//_/\__/\__/
------------------------------------------------------------------------- 
                 NEWS    About    IRC    Web Client 

The first challange...

{first 10-digit prime found in consecutive digits of e}.com

Ok, e is a nice non-terminating non repeating number.  Think pi.

A prime number is one that is only evenly disvisible by itself and 1.

The range of possible values is 1000000000 to 9999999999.

The idea is you take the sequence of e, after working it out yourself to
some extreme amount of digits, and start working through it, 10 digits
at a time, testing to see if the sequence is prime.  Lots of math,
my head hurts, whats the easy way?

A quick perl script that sequentially tests for the existance of n.com, n
being a 10 digit number, showed my 386 capible of resolving 33 hostnames
a second when using an external resolver.  Switching to a locally running
caching name server dragged me down to 13.  At peak speed, testing every
number, I'm looking at 7 years to go through the entire number space.

Ok, thats just a bit silly.  We can knock out all numbers that end in 0,
or are even, which gets me closer, 3.5 years... but thats still kinda slow.

What I need to do is some very quick and dirty tests on each number to see
if it's a likely prime, if it is, THEN test for the host name.  Hopefully
I can do math quicker than I can resolve DNS and cut down the test pool
while still being able to do 33 lookups a second.

(Realistically, 33 lookups a second is a bit rude to the dns system, so
leaning twards more math, less lookups is probalbly a good idea.)

Some quick testing shows I can do approximately 5000 divisions a second.
To tune the script, I just have to decide how long I'm willing to let it run,
and thats how many tests I can do on each number.

At the absolute shortest, I'm looking at 20 days if I just do a single 
division per number.  7 years to 20 days, we're gaining.  If I change my loop 
to incriment by 2 instead of 1, skipping even numbers, we're cutting the work 
in half, but does it pan out in actual speed? 

A quick bench shows that my test speed has gone up to 7500 numbers a second.

So, I need to write the app so that I'm generating 10 candidates a second to 
max out dns, or spend more time on math, less on lookups.  My peak speed of 
churning through numbers puts me at 1 week to generate a list of candidates 
if I do one test per number, weeding out the evens ahead of time.

A week minimum, seems pretty long for a 386 to churn on something.  I'm used to
far snappier performance from this box.  There has to be a way to reduce the
test set down.  If only I could get a list of all prime numbers in the target
range, then I wouldn't have to do any math, and could slam through all the
possibilities with less wasted cpu cycles.

According to the Prime Pages there are roughly 92 million primes between 
100,711,433 and 2,000,000,009.  So, based on those numbers, roughly 5% of the 
typical numberspace at this digit count is prime.  That means in the range of 
numbers we're looking at, there are roughly 470 million valid targets.  
Testing 10 a second, that drags brute force out to 537 days, a year and a 
half.

I can't wait for my schwank new job at google that long.  Even with a list of
primes thats far too much to dig through.  I guess I'm back to figuring out e
and testing, the way google wants. 

Googling for 'digits of e' comes up with a hit that looks promising right off 
the bat.  Its someone bragging they already figured out google's first two 
challanges.  Bah, I'm a hack, not a cheat.  Hit number 2, the first 5 million 
digits of e, in html.  WHEEE.  Hope that prime is in there.  10 lookups a 
second puts me at 6 days to test all odd 10 digit numbers in the first 5 
million digits of e.

So, using the file provided, I've used a perl script that moves a 10 
character window through the list, performs a quick battery of tests to 
determine if the number is a likely prime, and then does a nslookup.  Using a
very simplistic feedback loop based on the number of nslookups a second, it
adjusts how hard it test numbers to see if they are prime.  It targets 4 to 5
lookups a second, a rate I'm comfortable with.  After settling it estimates
16 hours to churn through the frist 5 million digits of e.

If that doesn't generate the number, I'll use the tuning data to run a brute
force attack on the whole available number space occupied by all 10 digit
numbers.  Currently I estimate it'd take 1.5 years given my current speed to
do so, but I should gain speed by not having to parse a 5mb text file for data,
so that could shorten things.  I also need to decide if building a DB of all
unique 10 digit numbers in the first 5 million digits of e and using that as a
quick exclude list is worth while.


After correcting a bug in my script, foolish me for not 
considering the size of the integers I was dealing with, I was able to get 
the correct result in 4 minutes.

google4.pl