Colin Harrington

Tag: EC2

Amazon EC2 High-CPU instances

Posted on Aug. 16, 2008, under Amazon Web Services, Distributed Computing, EC2

At the end of May (May 29th, 2008), Amazon announced that Amazon Web Services customers can now use "High-CPU Instances" on EC2. According to their specs, there are currently two versions of their "High-CPU Instances", as described below:

High-CPU Instances

    Instances of this family have proportionally more CPU resources than memory (RAM) and are well suited for compute-intensive applications.

    High-CPU Medium Instance

      1.7 GB of memory
      5 EC2 Compute Units (2 virtual cores with 2.5 EC2 Compute Units each)
      350 GB of instance storage
      32-bit platform
      I/O Performance: Moderate
      Price: $0.20 per instance hour

    High-CPU Extra Large Instance

      7 GB of memory
      20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each)
      1690 GB of instance storage
      64-bit platform
      I/O Performance: High
      Price: $0.80 per instance hour

So the High-CPU Extra Large Instance has computing power equivalent to 20 EC2 Compute Units. The standard Extra Large Instance offers 8 EC2 Compute Units at the same $0.80 per hour, so CPU-bound problems get 2.5 times the performance for the same amount of money. In a post from earlier this year, I estimated that it would take 3,100,000 CPU hours to crack a 16384-bit RSA key pair, based on stats I had found elsewhere. This came out to about 38.75 hours (less than a couple of days!!) with 10,000 instances and would cost a maximum of $310k (for an insanely large RSA key pair), i.e. an average of about $155k to locate a specific pair. With the High-CPU instances, it would take approximately 15.5 hours to do the whole computing task from top to bottom. At 15.5 hours, it would cost $124k, or an average of $62k. This definitely puts some CPU-bound computing jobs in closer reach of those who need them.
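The arithmetic above can be re-derived in a few lines. This is only a rough sketch: the function name `fleet_estimate` is made up for illustration, the figure of 8 EC2 Compute Units for a standard Extra Large Instance is an assumption taken from the 2008 price list, and one Compute Unit is treated as one CPU of the original estimate.

```python
def fleet_estimate(cpu_hours, instances, units_per_instance, price_per_hour):
    """Return (wall_clock_hours, max_cost, average_cost) for a CPU-bound job.

    Assumes the work splits perfectly across every compute unit, which is
    an idealisation; real jobs lose time to startup and coordination.
    """
    wall_clock_hours = cpu_hours / (instances * units_per_instance)
    max_cost = wall_clock_hours * instances * price_per_hour
    # On average a brute-force search hits its target halfway through.
    average_cost = max_cost / 2
    return wall_clock_hours, max_cost, average_cost


CPU_HOURS = 3_100_000   # estimate for a 16384-bit RSA key pair (from the post)
INSTANCES = 10_000

# Standard Extra Large: 8 compute units (assumed), $0.80 per instance hour.
standard = fleet_estimate(CPU_HOURS, INSTANCES, 8, 0.80)
# High-CPU Extra Large: 20 compute units, $0.80 per instance hour.
high_cpu = fleet_estimate(CPU_HOURS, INSTANCES, 20, 0.80)

print("standard:", standard)
print("high-cpu:", high_cpu)
print("speedup :", standard[0] / high_cpu[0])
```

Because both instance types cost the same per hour, the speedup and the saving come entirely from the ratio of compute units (20 / 8).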

I can only imagine what this would do for CPU-bound workloads like video encoding/transcoding, weather pattern simulators, or large rendering farms (among many other applications). I’d love the chance to work with a farm of machines again. It’s like having a fleet of robots doing the work in a fraction of the time that a traditional desktop would need. Photogrammetry, hmmm… Videogrammetry…

Does anyone know of some good Linux based/open Photogrammetry software?


MD_Update(&m,buf,j); /* purify complains */

Posted on May 17, 2008, under Amazon Web Services, Distributed Computing, EC2, General

This last week (the 13th of May 2008), Debian announced a jaw-dropping security hole in its OpenSSL package. The bug was introduced on May 2nd, 2006 (released in September?) and fixed on May 13th, 2008.

What was the bug? Basically, the randomness of the key generation process was severely reduced, making it feasible to guess (by brute force) the private keys. Someone commented out a block of code that was necessary to guarantee the randomness of the key being generated.

#ifndef PURIFY
/*
* Don't add uninitialised data.
MD_Update(&m,buf,j); /* purify complains */
*/
#endif

OK, what does that mean? It means that someone could listen in on communications that you thought were secure: sniff passwords, SSH into machines they don’t own, etc.
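To see why a crippled seed makes keys guessable, here is a toy sketch. It is not how OpenSSL derives keys. Python's `random` module stands in for the key generator, and the names `toy_key` and `recover_seed` are hypothetical. The figure of 32,768 seeds is my assumption, based on the widely reported detail (not in this post) that the process ID was effectively the only input that still varied on affected systems.

```python
import random

PID_SPACE = 32_768  # assumption: default Linux PID range, the only varying input


def toy_key(seed, bits=128):
    """Derive a toy 'key' from a seed. NOT the real OpenSSL key generation."""
    return random.Random(seed).getrandbits(bits)


def recover_seed(target_key, bits=128):
    """Brute force: try every possible seed until the generated key matches."""
    for seed in range(PID_SPACE):
        if toy_key(seed, bits) == target_key:
            return seed
    return None


victim_key = toy_key(12_345)       # the victim's supposedly random key
print(recover_seed(victim_key))    # the attacker recovers the seed by enumeration
```

The key is 128 bits long, but only 32,768 distinct keys can ever come out, so the search finishes almost instantly. The key length is irrelevant once the seed space is this small.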

I was happy to get an urgent update from the Ubuntu update manager in such a short amount of time.  I like that I was able to patch my systems so quickly.  I am floored that this bug was allowed to happen for the last 2 years :-(

Many people have explained the fiasco/bug in more depth; here are some of my favorites 

I explained in a previous post on distributed computing that one of my parallel programming courses in college required us to find the seed and depth of a sequence of random numbers (very similar to the generation of rainbow tables or brute-force password/key checking). I’m sure that with a few slight modifications to that code I would have a workable, scalable and efficient brute-force attack. Am I going to do this? No. Can you have the code? Yes… and by yes I mean no. Realistically, anyone skillful enough to capture and stage an attack would have the skills to formulate this on their own.

H D Moore over at Metasploit calculated that it would take his 31 Xeon cores approximately 2 hours to brute force 2048-bit RSA keys, ~100 hours (3,100 CPU hours) to brute force an 8192-bit RSA key pair, and 100,000 hours (3,100,000 CPU hours) to brute force a 16384-bit RSA key, assuming the maximum breadth to find the pair.

With a tool like Amazon’s EC2, you could scale this application as far as your pocketbook would allow :-) Well, there are actual limits, but they could be raised by Amazon to handle your requests.

I’m thinking something along the lines of 10,000 Extra Large instances. Counting each instance’s 8 EC2 Compute Units as cores, that would be 80,000 cores, which would handle the 3,100,000 CPU hours in just 38.75 hours (yeah, I know an EC2 core != a Xeon … it’s just for illustration). 3,100,000 hours of computing could be completed in well under two days!!!! With Amazon’s current pricing model ($0.80 per instance hour), it would end up costing you $8,000 per hour to run those 10,000 Extra Large instances. So the total bill (not including storage or testing time) would be around $310,000 to complete the processing. I guess I have better things to do with $310k. $310k is the most that you would pay; statistically you’d end up paying ~$155k on average, and that’s for a 16384-bit RSA key pair. The most common would be 1024- or 2048-bit RSA keys.
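One thing worth spelling out is that, under per-hour pricing, the fleet size changes how long you wait but not what you pay. A small sketch, assuming 8 cores per Extra Large instance and $0.80 per instance hour, and ignoring the rounding of partial hours, storage and bandwidth (the function name `plan` is just for illustration):

```python
CPU_HOURS = 3_100_000      # H D Moore's 16384-bit estimate quoted above
UNITS_PER_INSTANCE = 8     # assumption: each Extra Large counted as 8 cores
PRICE_PER_HOUR = 0.80      # USD per instance hour, 2008 pricing


def plan(instances):
    """Return (wall_clock_hours, total_cost) for a given fleet size."""
    hours = CPU_HOURS / (instances * UNITS_PER_INSTANCE)
    # Simplification: partial hours are not rounded up to whole billed hours.
    cost = hours * instances * PRICE_PER_HOUR
    return hours, cost


for n in (100, 1_000, 10_000):
    hours, cost = plan(n)
    print(f"{n:>6} instances: {hours:>8.2f} h, ${cost:,.0f}")
```

Every row comes out at roughly the same total cost, so the only reason to rent 10,000 machines instead of 100 is to get the answer in days rather than months.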

For a large organization such as the government, this would be cake money. I’d be willing to bet that they already have much more computing horsepower than Amazon has at the disposal of EC2. I love open source projects, but with so much going on at many levels, open projects can leave themselves open to bugs like this. I guess that’s why many projects go for the benevolent dictator approach. Someone has to understand and coordinate the project as a whole. It will be interesting to see the fallout of this issue.

This definitely got me to further my thoughts on Open Source Software.

What do you think?

 


Distributed Computing

Posted on May 13, 2008, under Amazon Web Services, Distributed Computing, EC2, General

When I was studying at Bethel College (now Bethel University) in Arden Hills, Minnesota, I took a class on Parallel Programming taught by Dr. Brian Turnquist. I have to say that this class was my favorite. I would stay up late just to solve the problems and projects that were presented to us. I loved it!!!

We had a 40-CPU Beowulf cluster that we were able to work with. It was a pretty standard AMD dual-processor configuration on a 10/100 Mbps Ethernet network (which was usually the bottleneck). Several students had the opportunity to help design and set up the cluster. The cluster had its own housing inside one of the Computer Science labs.

We ended up writing C++ programs that used MPI to communicate. We ran calculations, rendered fractals, and simulated breaking passwords in a distributed fashion. Well, maybe not passwords, but we searched for the seed and depth needed to replicate a series of "random" numbers generated by the stock random number generator, and that search could easily be swapped for other code. I won’t get into how important the RNG (Random Number Generator) is to our modern systems (1,2), but it was a fun exercise nonetheless. I ended up using the cluster briefly to render some intensive POV-Ray fractals (see the contest results).
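For the curious, the seed-and-depth exercise looks roughly like this. It is a sketch from memory, not the class code: the original was C++ with MPI, while this version uses Python's built-in generator as a stand-in for the stock RNG and simply runs each worker's share of the seeds one after another. All function names and the search limits are made up for illustration.

```python
import random


def observe(seed, depth, length=4):
    """Return `length` outputs of the generator after skipping `depth` outputs."""
    rng = random.Random(seed)
    for _ in range(depth):
        rng.randrange(1_000_000)        # discard the first `depth` outputs
    return [rng.randrange(1_000_000) for _ in range(length)]


def search_slice(rank, workers, target, max_seed, max_depth):
    """Search the seeds assigned to one worker: rank, rank + workers, ..."""
    length = len(target)
    for seed in range(rank, max_seed, workers):
        rng = random.Random(seed)
        stream = [rng.randrange(1_000_000) for _ in range(max_depth + length)]
        for depth in range(max_depth + 1):
            if stream[depth:depth + length] == target:
                return seed, depth
    return None


def search(target, workers=4, max_seed=2_000, max_depth=50):
    """Run every worker's slice in turn; a cluster would run them in parallel."""
    for rank in range(workers):
        hit = search_slice(rank, workers, target, max_seed, max_depth)
        if hit is not None:
            return hit
    return None


target = observe(seed=1_234, depth=17)
print(search(target))
```

Striding the seeds across workers (rank, rank + workers, ...) means no worker needs to talk to another until one of them finds a match, which suits a cluster whose network is the bottleneck.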

I’ve always loved the concept of distributed computing. I was really excited when I learned of Amazon’s Elastic Compute Cloud (EC2). The concept of pay-as-you-go applied to distributed computing is an interesting one! And having a top-tier datacenter and Simple Storage Service (S3) makes it an attractive solution. The concept of building scalable web applications is one that has caught my eye.

I have some good ideas on how to use this service but haven’t made time to finish the concepts. The Amazon Web Services crew have really started to round out their services with the announcement of Persistent Storage for EC2 and SimpleDB. Persistent Storage is, in my humble opinion, one of the last things that they needed to solve to offer a fully viable, scalable, pay-as-you-go/grow computing platform.
