Cloud APIs: It’s the Architecture that Matters


Lack of excitement is not something the cloud market can be accused of! Citrix just announced a bold roadmap for its CloudStack platform, coming right on the heels of a Eucalyptus and Amazon announcement to extend API compatibility, and just two weeks before the OpenStack summit. My prediction for 2012: we will exit the year with as many cloud APIs as we entered it with, and it doesn’t really matter!

[Image: railroad gauges & compatibility (source: Wikipedia)]

Citrix’s announcement is interesting because it is the first bold move around CloudStack since the acquisition of Cloud.com by Citrix. Over the past year the CloudStack team and Citrix have worked with the OpenStack project, contributing code and adopting the Swift storage service. I know that the CloudStack team was always quite open to adopting OpenStack technology, but at the same time it always seemed that adopting the compute portion would be a big step backwards. After all, CloudStack has powered some of the most successful large-scale private cloud deployments in production, and it powers a good number of large public cloud service providers too. From that point of view it only makes sense for Citrix to stick to its proven technology and continue to develop the CloudStack codebase, pulling in parts of OpenStack where appropriate, such as the storage service, and using its own technology for other parts. With this move to make CloudStack an Apache Foundation project, Citrix puts a stake in the ground around its commitment to CloudStack as an independent technology and its roadmap to remain competitive on the API front.

Last week Amazon Web Services and Eucalyptus jointly announced a partnership to enhance the compatibility of Eucalyptus with the AWS APIs. This announcement is a great validation of Eucalyptus’ strategy, which has always been to be API compatible with AWS instead of trying to establish yet another API. It’s also interesting from the point of view of how AWS’ strategy around its API has played out over time. When Eucalyptus first gained momentum and Rackspace first worked on its cloud offering, there was a lot of speculation about whether AWS would try to prevent API clones, possibly even by legal means. It was clear at the time that the issue had been discussed internally at Amazon, but AWS never said anything publicly one way or the other. This wait-and-see approach let them observe the market’s evolution and start playing their cards when it suited them (which they have now done). At this point we can only speculate whether an earlier play would have been beneficial, for example to support Mark Shuttleworth’s appeal to the OpenStack community half a year ago to adopt the EC2 API. In any event, now that Citrix has announced that CloudStack will be AWS compatible (they do mean API compatible, even if their press release doesn’t state it verbatim), it’s clear that AWS API compatibility will become increasingly common in the IaaS market.

But what does all this mean? How important is Cloud API compatibility? What should Cloud API compatibility even mean?

The pitfalls of Cloud API compatibility

I’ve said it many times and I’ll repeat it again: it’s the semantics of the resources in the cloud that matter, not the syntax of the API. This means that “API compatibility” has to reach very, very deep to be meaningful. Let me give you a few examples around EC2.

#1: EC2 has a number of discrete instance types, ranging from a fractional core to eight cores, from 512MB to 64GB of memory, from no local disk to four spindles, from shared network bandwidth to dedicated 10 Gbps ports, and so on. While all these instance types can be replicated in a local cloud, it’s not an easy task. This means that moving workloads to local clouds where instance types differ could well require some non-trivial architectural adaptations. And even where it is possible to replicate the AWS configurations exactly, it turns out that doing so may not actually be desirable! Some of our large hybrid (public + private) cloud customers report achieving significant cost savings precisely because they use customized instance configurations specifically tuned for their applications.
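To make the mismatch concrete, here is a minimal sketch of what “mapping” EC2-style instance types onto a private cloud’s custom flavors looks like. The catalogs and flavor names below are illustrative assumptions, not a real AWS or private cloud inventory:

```python
# Hypothetical sketch: matching EC2-style instance types against a
# private cloud's custom flavors. All names and numbers here are
# illustrative, not an authoritative catalog.

# (cores, memory_gb, local_disks) for a few EC2-era instance types
EC2_TYPES = {
    "m1.small":  (1, 1.7, 1),
    "m1.large":  (2, 7.5, 2),
    "m1.xlarge": (4, 15.0, 4),
}

# A private cloud tuned for one application family might expose
# flavors that match no EC2 type exactly.
PRIVATE_FLAVORS = {
    "app-std": (2, 6.0, 1),
    "app-big": (8, 32.0, 2),
}

def closest_flavor(ec2_type):
    """Pick the private flavor nearest to an EC2 type (naive distance).

    The point: even the 'closest' flavor can differ enough that the
    workload needs architectural changes; API compatibility alone
    doesn't bridge that gap.
    """
    cores, mem, disks = EC2_TYPES[ec2_type]
    def distance(flavor):
        fc, fm, fd = PRIVATE_FLAVORS[flavor]
        return abs(fc - cores) + abs(fm - mem) + abs(fd - disks)
    return min(PRIVATE_FLAVORS, key=distance)

print(closest_flavor("m1.large"))  # app-std: nearest, but not identical
```

The interesting case is the one the post describes: a customer who tunes `app-std` for their application may beat the cost of an exact EC2 replica, precisely by *not* being compatible.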

#2: EC2 EBS block storage devices have quite peculiar performance characteristics (that are not universally liked…) both for regular I/O and for snapshots. It would seem rather crazy to try to duplicate those characteristics and forgo the improvements that are possible in a smaller purpose-built private cloud. But by diverging, the operating procedures for deployments change as well, making the notion of “compatibility” questionable. Put differently, should one retain compatibility if compatibility is worse?

#3: The AWS cloud is divided into regions that are independent of one another, and all resources are bound to a region. This means that if you create an image in the us-east region and want a DR set-up in the us-west region, you need to copy the image explicitly to us-west and keep it in sync as you make updates. Some of the more interesting innovations being pioneered by other cloud providers make it possible to publish images, and even volume snapshots, worldwide across regions. Does something like that break the EC2 API? Not necessarily, but it certainly changes tools and the way one would maintain DR set-ups. So API compatibility doesn’t help much here either.
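The bookkeeping this forces on DR tooling can be sketched in a few lines. This is a toy in-memory model, not a real AWS client; the class, method, and image names are all hypothetical:

```python
# Illustrative sketch of region-bound images: because every image lives
# in exactly one region, DR tooling must copy images explicitly and
# track which regions have fallen behind. Not a real cloud API.

class RegionBoundImageStore:
    """In-memory stand-in for per-region image registries."""
    def __init__(self, regions):
        self.images = {region: {} for region in regions}

    def register(self, region, name, version):
        self.images[region][name] = version

    def replicate(self, name, src, dst):
        # The explicit per-region copy step the post describes:
        # nothing in the API does this across regions for you.
        self.images[dst][name] = self.images[src][name]

    def stale_regions(self, name, src):
        """Regions whose copy lags behind the source region's version."""
        latest = self.images[src][name]
        return [r for r, imgs in self.images.items()
                if r != src and imgs.get(name) != latest]

store = RegionBoundImageStore(["us-east", "us-west"])
store.register("us-east", "app-image", version=1)
print(store.stale_regions("app-image", "us-east"))  # ['us-west']
store.replicate("app-image", "us-east", "us-west")
print(store.stale_regions("app-image", "us-east"))  # []
```

A provider that publishes images worldwide would make `replicate` and `stale_regions` unnecessary, which is exactly why its tooling would look so different even behind an EC2-shaped API.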

#4: Along those lines, it’s interesting to see how SoftLayer leverages its worldwide network infrastructure to offer each customer a worldwide private network: the 10.* addresses of all of a customer’s instances are interconnected worldwide (on a customer-by-customer basis). That makes operating servers across multiple regions a lot easier than in EC2. If SoftLayer were to adopt the EC2 API it could offer these benefits without being “API incompatible”, yet global deployment architectures, and the tools to manage them, would be radically different between SoftLayer and AWS.
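The reachability rule of such a per-customer flat network is simple enough to model directly. The sketch below is purely illustrative (a toy predicate, not SoftLayer’s actual networking), but it captures why cross-region architectures look so different on such a network:

```python
# Toy model of a per-customer worldwide private network: two instances
# can reach each other's 10.* address iff they belong to the same
# customer, regardless of region. Purely illustrative.

def reachable(a, b):
    same_customer = a["customer"] == b["customer"]
    private = b["ip"].startswith("10.")
    return same_customer and private

east  = {"customer": "acme", "region": "us-east", "ip": "10.0.1.5"}
west  = {"customer": "acme", "region": "eu-west", "ip": "10.0.2.9"}
other = {"customer": "beta", "region": "us-east", "ip": "10.9.0.1"}

print(reachable(east, west))   # True: same customer, across regions
print(reachable(east, other))  # False: different customers
```

On EC2 at the time, the first call would effectively be False too: instances in different regions had no shared private network, so tools had to route cross-region traffic over public addresses or tunnels.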

The point of all this is that API compatibility is not a panacea for cloud portability or interoperability. For a simple tool that launches a few servers, API compatibility is helpful. But at the same time the amount of work to port that tool to a different API is minimal as well. When it comes to more important aspects of cloud appeal — real elasticity, high performance, cost efficiency, better governance and control, failure isolation & resiliency — then API compatibility is simply not a ticket to portability across clouds.

The bigger compatibility picture

The bigger picture around Cloud API compatibility starts with the observation that there really are many AWS APIs, and the EC2 API is only a relatively small portion of the pie. And even there, I would claim that there are actually two EC2 APIs: the “EC2 classic” API and the “EC2 VPC” API for the virtual private cloud. They are very similar to one another, but they differ in more ways than the VPC feature set alone would suggest. This ranges from VPC security groups having egress rules, which “classic” security groups don’t, all the way down to little trip-me-ups like being able to move an IP address from one instance to another with a single EC2 classic call where EC2 VPC requires two.
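That last trip-up is worth spelling out, because it is the kind of difference that hides behind a “compatible” API. The sketch below uses a toy elastic-IP object, not the real EC2 calls; the behavior it models (classic implicitly detaches the address on re-association, VPC makes the caller disassociate first) is the one-call-versus-two-calls difference described above:

```python
# Toy model of the 'one call vs two calls' trip-up: in classic mode,
# associating an elastic IP with a new instance implicitly detaches it
# from the old one; in this toy VPC mode the caller must disassociate
# first. Method names are illustrative, not the real EC2 API.

class ToyElasticIP:
    def __init__(self, vpc):
        self.vpc = vpc
        self.instance = None

    def associate(self, instance_id):
        if self.vpc and self.instance is not None:
            raise RuntimeError("VPC: disassociate before re-associating")
        # classic mode: implicit disassociation, one call is enough
        self.instance = instance_id

    def disassociate(self):
        self.instance = None

# Classic: moving the IP is a single call.
classic_ip = ToyElasticIP(vpc=False)
classic_ip.associate("i-aaa")
classic_ip.associate("i-bbb")   # works: implicit detach

# VPC: the same move takes two calls.
vpc_ip = ToyElasticIP(vpc=True)
vpc_ip.associate("i-aaa")
vpc_ip.disassociate()           # the required extra step
vpc_ip.associate("i-bbb")
```

Any failover script written against the one-call behavior breaks on the two-call variant, even though both sides would pass a superficial “API compatibility” check.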

But to get back to the bigger picture: to me, “Amazon compatible” really brings all 20 services offered in the AWS cloud into view (give or take a couple, depending on how you count). Many AWS deployments use features that are close to EC2, such as Elastic Load Balancing; features that are further removed, such as the Relational Database Service or the Simple Queue Service; and features that are not very amenable to private clouds, such as the CloudFront content distribution service. For many organizations, if the services they use aren’t included in an “Amazon compatible” cloud, then it’s not really compatible and loses its attractiveness: hence the long-running claim of “lock-in” around this proliferation of services.

My prediction is that cloud users will increasingly look for equivalent portfolios of services in clouds and less for strict API compatibility. For example, Rackspace added load-balancing-as-a-service support to its offering not too long ago. SoftLayer offers a content distribution service. And if one looks at the services around Google App Engine, there are task queues, email services, various SQL and NoSQL storage services, and more. It’s clear that other cloud providers also see the portfolio of cloud services as key. Currently these various services are roughly equivalent to what AWS offers, and for many of them the semantics beyond the API veneer matter even more than they do for EC2. It will be really interesting to see how the cloud compatibility (or should we say cloud “equivalency”?) landscape evolves around these cloud service portfolios.

Summing it up

I have to say that Citrix’s three-fold statement (‘our technology is the best’, ‘we’ll open source it under the Apache Foundation’, and ‘we’ll make it AWS compatible’) is bold. Since I’ve commented mostly on the AWS compatibility aspect, I should add that the increased ‘open sourcing’ of CloudStack does open the door for others to complement anything Citrix does with more AWS-compatible services. It’s still a wide-open game at this level of the cloud market, with, as I said at the beginning, no lack of excitement!

My recommendation to cloud users, however, is twofold: 1) do continue to dive in with these technologies and expand your use of different cloud offerings for ever more sophisticated projects, and 2) don’t get so hung up on the compatibility debate that you fail to see the cloud automation forest for the API trees. The issues of portability and interoperability are being addressed well at higher levels of the stack – that’s what we do at RightScale, after all :-).