SharePoint is slowly becoming more cloud-like, which I think is where it belongs. So I am educating myself about the leaders in this area, one of which is Amazon. Their recent outage taught me a lot about their platform. The main issue was that a manual change to the network layer caused a re-mirroring storm at the storage layer (EBS), which jammed up not just that Availability Zone (AZ) but also other zones relying on the same admin/control layer. I wonder whether it was one guy who accidentally switched the routing to the secondary network; if so, I pity him! The main point is that it caused a flood that broke the system at multiple points. At least it showed them where the weak points were.
Full article here:
I also got this general ebook about architecting for the cloud on my Kindle for iPhone app: