The emergence of “cloud computing” has injected a lot of excitement into the IT industry, among users and vendors alike, as it has been shown to significantly reduce cost and increase flexibility/agility. Interestingly, cloud-based services of many varieties – IT or not – have been available for many decades. Are there key ingredients of these well-known services that also apply to modern IT clouds?
One everyday service often mentioned in IT cloud literature is electric utility service – always on, ubiquitous, elastic and priced by usage. It provides AC power to every home and business within the territory served. Any certified electrical appliance can be plugged into the standard 3-pin electric socket. The industry is an ecosystem consisting of regulated utility companies, appliance vendors of all sorts, installers, wire/socket/peripheral makers, etc. Similarly, natural gas and water utilities are other examples of everyday non-IT clouds.
Key characteristics of these “infrastructure” cloud services are summarized in the table below.
Let’s consider a few technology-related residential cloud service examples and their characteristics.
In a competitive marketplace, there may be multiple cloud providers offering the same service – for instance, AT&T and Verizon both provide mobile telephony service. These clouds often interact with each other, as shown in the example below. A landline phone user can call a mobile phone user, or talk to a Skype user on a PC connected to the Internet. Similarly, an Internet user may consume cloud-based web application services, such as webmail.
What are the key ingredients in creating profitable markets for these everyday cloud services? They are open standards, vendor interoperability and certification (and, in some cases, regulations). Standards include physical interface, wire protocol, user-to-network interface (UNI) and network-to-network interface (NNI).
As we move to IT-focused “modern” clouds, similar ingredients are needed. Physical interfaces and wire protocols already exist, thanks to IEEE, IETF and ITU. Others need to be developed and/or widely adopted, including user-to-cloud provisioning, cloud-to-cloud (intercloud) provisioning, state migration of networks/network services/security/segmentation, and virtual machine portability.
For sure, modern IT clouds are at an early stage of development, so it’ll take some time to see light at the end of the cloud tunnel. Nonetheless, the journey promises to be nothing short of exciting…
“To cloud or not to cloud?” is not up for debate – clouds were here before, are here now and will always be here in the future, albeit under different labels to fit the market inflections of each era. The benefits of adopting clouds for enterprise customers (of all sizes) are also well established: lower cost through a pay-as-you-grow pricing model, and higher flexibility/agility. Because these business benefits also applied to traditional hosted datacenter/application providers, it is important to identify the newer elements that make up modern clouds.
As described in A Simplified Framework for Clouds, modern clouds may be characterized in at least two dimensions: infrastructure or application, public or private. All clouds, modern or not, need traditional elements such as scale, resiliency, multi-tenancy and automated provisioning, though modern clouds may impose fresh requirements on them to address today’s business needs. In contrast, the canonical elements discussed below are enablers for modern clouds and are not typical of legacy hosted datacenter/application environments:
- Computing: server virtualization, live application migration (LAM)
- Networking: modern network fabric
- Security: policy-based segmentation
- Application: federated single sign-on (SSO)
1) Server virtualization, a must-have element of infrastructure/compute clouds (public or private) that (a) maximizes the efficiency of server computing by partitioning each physical server into many virtual machines, and (b) decouples the OS – and thereby applications – from physical servers, thus allowing applications to be portable (via application+OS packaged as a machine-executable “file” that can be moved, copied, stored, carried, deleted…). With deployment of dense multi-core blade servers, hundreds of virtual machines can be instantiated per server rack. Such massive server scale, along with application mobility, enables unparalleled compute elasticity that allows rapid scale up/down of the number of virtual machines allocated to a given application workload.
2) Live application migration (LAM), an exciting element of infrastructure/compute (public or private) clouds that enables live (in-service) migration of running application workloads, packaged as virtual machines, from one physical server to another using technologies such as VMware’s vMotion and Citrix XenServer’s XenMotion. Of course, the immediate benefit is that servers/OS can be upgraded/retired without bringing down the application itself. Extending this further, applications can be envisioned to run anywhere in a cloud through policies based on dynamic context, such as temperature (follow-the-mercury), cost (follow-the-price), time (follow-the-sun), availability (follow-the-uptime), capacity (follow-the-GHz)… For instance, an application workload could move from a hotter rack to a cooler rack, or from a higher cost data center to a lower cost one during non-business hours. Ultimately, application workloads could migrate live from one cloud to another, or expand across multiple clouds during peak demand (e.g. from private cloud to private + public clouds). As the migration footprint widens, crossing heterogeneous administrative domains, multiple new business and technology challenges emerge:
- security (should intellectual property data be served from a piracy-prone region?);
- compliance (can personal information move outside jurisdictional boundary?);
- eDiscovery (will a secondary, sub-contracting cloud provider block enterprise’s access to enterprise’s own electronically stored information?);
- cross-cloud interoperability – provisioning, consistent policy for LAM, SLA ownership, trouble-shooting etc. (who is the responsible party?).
Because of the above issues, cross-cloud (inter-cloud) or even cross-datacenter LAM, while hugely exhilarating, will take some time to become practical (for more inter-cloud meandering, see A Hitchhikers Guide to the Intercloud).
3) Modern network fabric, an underlying networking element of all modern clouds for enhancing server virtualization and live application migration, and for collapsing parallel cloud fabrics into a single Ethernet-based unified fabric. In particular, the latest Ethernet incarnation is adopting concepts from IP, Fibre Channel and Infiniband protocols for powering modern cloud environments (for more on Ethernet’s evolution, see Ethernet 4.0). Specifically:
- Server virtualization drives much higher utilization of 1Gb/10Gb Ethernet links, thus requiring a line-rate network infrastructure having symmetrically balanced cross-sectional bandwidths (i.e. ingress-to-egress bandwidth ratio in Ethernet/IP switches trending 1-to-1). Porting virtual machine images (few Gigabytes in size) to servers also needs a very high speed network. Similarly, LAM needs larger Layer-2 domains for broadening the live migration footprint and benefits from lower network latencies. With 10GbE network substrate (that is architecturally ready for 40GbE/100GbE interfaces) and lower latency Ethernet, plus enlarged Layer-2 domains via Ethernet multi-pathing or through virtual private LAN Service (VPLS), modern Ethernet/IP networking meets the needs of virtualized cloud environments.
- Unified fabric, an emerging ingredient for collapsing parallel cloud fabrics into a single transport fabric. In particular, the Ethernet-based data fabric, Fibre Channel-based storage fabric and Infiniband-based cluster computing fabric can be replaced in aggregate by a single unified fabric based on the latest evolution of Ethernet (aka Data Center Ethernet or Converged Enhanced Ethernet). Benefits of lower cost and reduced operational complexity are attained because (a) there is no need to buy and manage disparate devices, (b) the number of host adapters per server drops from up to six to two, (c) cabling is reduced proportionally and (d) the transport topology is simplified.
4) Policy-based segmentation, a necessary ingredient for extending traditional network segmentation concepts (de-militarized zone or DMZ, Extranet, Intranet) to the cloud world. In virtualized, elastic and collaborative cloud environments where network borders are hard to pin down (see also Networks without Borders), segmentation based on VLANs, IP addresses and Layer-4 ports has become insufficient. What is needed are virtual zones (vZones) that (a) leverage policies based on identity, protocol, network, environment and application attributes, (b) apply to user-to-machine as well as machine-to-machine transactions, and (c) are reconfigurable at a moment’s notice based on changing business, regulatory and security environments. In effect, along with traditional network zones created by firewalls’ Layer-4 ACLs (access control lists), vZones established by granular Layer-7 ACLs are necessary for policy-based enforcement without touching servers, OS or applications. With the unprecedented granularity and control provided by policy-based vZones, IT can maintain a consistent security posture and ensure regulatory compliance while enabling the business to reduce cost, improve agility and broaden collaboration.
5) Federated single sign-on (SSO), a necessary ingredient of all application clouds for providing highly convenient and seamless access to application resources. As described in the cloud framework post, enterprises will retain many applications in the private cloud, subscribe to multiple SaaS (public cloud) apps, and utilize public infrastructure clouds for running other internal apps. Despite such a highly heterogeneous application environment, it is imperative that access to cloud applications be seamless, via cloud-agnostic single sign-on (SSO), and leverage one or more enterprise-administered directory stores for ensuring consistent, attribute-based application access. A user having a contractual relationship with the enterprise – as an employee, customer, partner/vendor, contractor, etc. – should need to log in only once to access all allowable cloud applications wherever they may reside.
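To make the context-driven placement policies mentioned under live application migration (follow-the-mercury, follow-the-price, follow-the-GHz) a little more concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `Site` fields, the site names and numbers, and the `choose_site` helper are illustrative assumptions, not any vendor's API. It also shows one way a compliance constraint (allowed jurisdictions) could filter candidates before the policy is applied.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Site:
    # Hypothetical description of a data center or rack location
    name: str
    region: str            # jurisdiction, used for the compliance filter
    temperature_c: float
    cost_per_hour: float
    free_ghz: float

# Each policy maps a site to a score; the lowest score wins.
POLICIES = {
    "follow-the-mercury": lambda s: s.temperature_c,   # prefer cooler sites
    "follow-the-price": lambda s: s.cost_per_hour,     # prefer cheaper sites
    "follow-the-GHz": lambda s: -s.free_ghz,           # prefer most spare capacity
}

def choose_site(sites, policy, allowed_regions, needed_ghz):
    """Pick a migration target, or None if no site satisfies the constraints."""
    candidates = [
        s for s in sites
        if s.region in allowed_regions and s.free_ghz >= needed_ghz
    ]
    if not candidates:
        return None
    return min(candidates, key=POLICIES[policy])

# Made-up example inventory
SITES = [
    Site("dc-east", "US", 27.0, 1.20, 40.0),
    Site("dc-west", "US", 21.0, 1.50, 25.0),
    Site("dc-offshore", "XX", 18.0, 0.60, 80.0),
]
```

With this inventory, a workload restricted to the "US" region would land on dc-west under follow-the-mercury and on dc-east under follow-the-price; the cheaper dc-offshore site is only chosen if its region is explicitly allowed.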
The above five elements do appear to form a rather complete basis for modern clouds. Interestingly, as one would expect, canonical elements are applicable by and large to infrastructure clouds – a center point for most of the current cloud sizzle!
When it comes to Ethernet, it’s a rather joyful and nostalgic walk on the Layer-2 memory lane. This most recent walk has been triggered by a few good articles I read this week:
- Cisco’s Data Center Blog on Converged Enhanced Ethernet (CEE) versus Data Center Ethernet (DCE)
- Network World article on 100 Gigabit and Terabit Ethernet
- Network World slide show on the Evolution of Ethernet
The rise of Ethernet as the king of network connectivity, wired and wireless, has been simply fascinating. Though not necessarily the best technology, its price-performance, ease of use and flexibility to adopt the better traits of others have helped Ethernet get ahead and stay ahead of its Layer-2 compatriots, whether it be FDDI, ATM, Frame Relay or Infiniband (hey Fibre Channel – watch out, Ethernet is coming to town). A clear case of pragmatism winning over perfection! Perhaps the only native Ethernet characteristic that has remained constant is its frame format…
As Ethernet has evolved, the industry has ended up using different modifiers to differentiate Ethernet from its prior avatars, using terms such as shared Ethernet, fast Ethernet, switched Ethernet, carrier/metro Ethernet… These prefix modifiers certainly provide a precise functional description but lack the temporal sense of the way Ethernet has evolved. This post is a curious attempt to chronologically categorize advances in Ethernet using a numerical suffix modifier – in the same spirit as the widely used Web 2.0 categorization.
According to the following, it seems that we are in the era of Ethernet 4.0:
- Ethernet 1.0 (Classic era, pre-1990): shared or classic Ethernet as one of many Layer-2 technologies, proposed by Bob Metcalfe at Xerox PARC in 1973 and standardized by the IEEE in 1985. All users on the network share the total bandwidth, and collisions are detected and resolved by the CSMA/CD algorithm (Carrier Sense Multiple Access with Collision Detection). Interface speed: up to 10Mb/s.
- Ethernet 2.0 (LAN era, 1990 – 2000): This was the “coming out” era for Ethernet, where it became the LAN technology of choice with functionalities such as bridged (switched) Ethernet, spanning tree protocol, VLAN, link aggregation, class of service, Wireless LAN (WLAN), power over Ethernet (PoE)… Ethernet, along with its Layer-3 counterpart Internet Protocol (IP), enabled convergence of parallel data, voice and video networks into one multi-service Ethernet/IP network for data/voice/video. Interface speeds: 100Mb/s and 1Gb/s.
- Ethernet 3.0 (MAN era, 2000 – 2007): A major win for Ethernet as it began to penetrate service provider networks for metro Ethernet services using technologies such as Q-in-Q, Mac-in-Mac and virtual private LAN service (VPLS). Interface speed: 10Gb/s.
- Ethernet 4.0 (Cloud era, 2008 – ?): Ethernet is adopting concepts from IP, Infiniband and Fibre Channel protocols for powering next-generation virtualized, workload agile data centers and modern cloud environments. These advanced characteristics consist of reliability, lower latency, multi-pathing and unified I/O (including Fibre Channel over Ethernet, or FCoE), and are being standardized in IEEE and IETF (see here). Like its 2.0 predecessor, Ethernet 4.0 has the opportunity to collapse parallel networks of data (Ethernet), storage (Fibre Channel) and cluster computing (Infiniband) into a single unified Ethernet/IP cloud network. Interface speeds: 40Gb/s and 100Gb/s (estimated 2010).
Interestingly, the enhanced functionality of Layer-2 Ethernet as well as Ethernomics (10X interface speed increase for 3X price) have been the catalysts for most network equipment churn thus far. MPLS and IPv6 are perhaps the only celebrated enhancements that come to mind at Layers 3 and 4 (IP and TCP/UDP) to which network refresh can be attributed in limited customer segments and/or geographies (service providers, federal/defense, Asia).
Though not a concern any time soon, Ethernet should keep a watchful eye on other popular connectivity protocols, such as Bluetooth and USB. Like Ethernet, both these protocols are easy to use, very cost effective and widely deployed on personal computers, mobile phones, PDAs and electronic gadgets of all types.
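The Ethernomics rule of thumb can be turned into a quick calculation. The sketch below assumes, as a simplification, that the 10X-speed-for-3X-price ratio holds exactly for every generation; the function name is made up for illustration.

```python
def relative_cost_per_bit(generations, speed_factor=10, price_factor=3):
    """Cost per unit of bandwidth relative to the starting generation.

    Assumes each generation multiplies speed by `speed_factor` and
    price by `price_factor` (the 10X-for-3X rule of thumb).
    """
    return (price_factor / speed_factor) ** generations
```

Under that assumption each generation cuts the cost per unit of bandwidth to 30% of the previous one, so three generations (for example 10Mb/s to 10Gb/s) would bring it down to roughly 2.7% of the starting point.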
What will Ethernet be up to in 5.0? 400 Gigabit and Terabit Ethernet? TCP-less data centers (via RDMA over Ethernet)? Massively scalable processor/memory area networks (through disaggregation of servers into mega islands of CPUs and memory communicating over Ethernet interconnect)?
After a long pause, the new world of networking is getting interesting again!
Update (5/21/09): Bob Klessig – who introduced me to the world of Metro Ethernet – indicated that “another key factor to the success of Ethernet is the addressing. MAC addresses are administered in an open way which allowed easy market entry for vendors and a high probability of uniqueness for each manufactured device. The similar nature of IP address administration is why IP was successful and why ATM failed.”
Will we run out of MAC addresses some day? The IETF addressed similar IPv4 addressing concerns with IPv6. Perhaps IEEE 802 will need to deal with the MAC address issue during the Ethernet 5.0 timeframe…
“Cloud” has become the latest love term that promises to provide IT services at much lower cost and with much higher agility. However, there has been much debate regarding the type of services a cloud provides and how it provides them. Traditionally, networking folks have drawn clouds on network diagrams to conceptually represent transport services. Nowadays anyone who touches (directly or indirectly) data center, virtualization, computing, networking, storage, security, provisioning, convergence, scaling, federation, software, hosting, infrastructure, platform, etc. is on the cloud bandwagon. This “everything but the kitchen sink” approach has undoubtedly caused confusion as to what clouds really mean; a case in point is last week’s “What is a cloud?” article by Tim Green of Network World.
For a cloud services framework to be simple yet meaningful, two key categories come to the forefront:
- Cloud service type – infrastructure versus application: An infrastructure cloud service (aka infrastructure-as-a-service or IaaS) is one where some portion of the IT infrastructure, such as compute, storage, programming, security, identity, etc., is offered as a service. This infrastructure service is an enabler for running end user applications. An application cloud service (aka software-as-a-service or SaaS), on the other hand, is a self-sufficient soup-to-nuts application offering to the end user, i.e. there is no additional IT effort and/or dependency that an end user needs to address.
- Cloud service usage – public versus private: A public cloud service can be subscribed to by any end user (the public at large) and is typically accessed through the Internet, e.g. using HTTP/HTTPS web protocols, and mostly via the web browser user interface. A private (or internal) cloud, on the other hand, is owned/controlled by a particular end user and hence its access is restricted – e.g. through a campus network and/or through VPN tunnels.
The above Type x Usage framework forms a nice 2×2 analysis grid for evaluating cloud services. Amazon’s Elastic Compute Cloud (EC2) service and Simple Storage Service (S3), for example, are public infrastructure cloud services, whereas Salesforce.com and Cisco Webex are public application cloud services. Private clouds are not as prevalent today, though they are now being talked about more frequently; see the recent InformationWeek article on “Why ‘Private Cloud’ Computing Is Real – And Worth Considering”. Bechtel, for instance, has been an early adopter of private clouds, even before the term “private cloud” was coined; see “The Google-ization of Bechtel” and “Cloud Computing to the Max at Bechtel”.
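For readers who think in code, the 2×2 grid can be expressed as a tiny data structure. This is only an illustrative sketch; the enum and function names are invented here, and the placements are the ones given in the paragraph above.

```python
from collections import defaultdict
from enum import Enum

class ServiceType(Enum):
    INFRASTRUCTURE = "IaaS"
    APPLICATION = "SaaS"

class Usage(Enum):
    PUBLIC = "public"
    PRIVATE = "private"

# Placements taken from the examples named in the post
EXAMPLES = {
    "Amazon EC2": (ServiceType.INFRASTRUCTURE, Usage.PUBLIC),
    "Amazon S3": (ServiceType.INFRASTRUCTURE, Usage.PUBLIC),
    "Salesforce.com": (ServiceType.APPLICATION, Usage.PUBLIC),
    "Cisco Webex": (ServiceType.APPLICATION, Usage.PUBLIC),
}

def grid(services):
    """Group service names by their (type, usage) cell."""
    cells = defaultdict(list)
    for name, cell in services.items():
        cells[cell].append(name)
    return dict(cells)
```

Calling `grid(EXAMPLES)` fills only the two public cells, which mirrors the observation that private clouds were not yet prevalent.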
Based on discussions with several large enterprise customers and cloud providers, it seems that large enterprises would likely follow Bechtel’s lead:
- build their own private IaaS and SaaS clouds,
- scale private clouds’ reach/capacity by extending to hosted IaaS & SaaS,
- subscribe to public IaaS opportunistically for non-core infrastructure needs,
- convert internal applications to private SaaS,
- outsource certain enterprise applications to public SaaS.
Traditional SMEs, on the other hand, are more likely to gravitate towards public cloud infrastructure for most of their IT needs.
Certainly, it’s natural to expect blending of the above framework components. For instance, start-up SaaS providers often leverage one or more IaaS services (e.g. compute, storage, security). Similarly, a public cloud provider may instantiate their service for private use. In most scenarios, the above framework should be reasonably sufficient for describing cloud services.
Needless to say, the hypothesis will be tested more thoroughly when it is applied more widely in subsequent cloud-related posts.
Look ma – no borders!
Enterprise networks are going through phenomenal transformations, driven by the business’ determination to reduce cost and become highly agile. In the process, both internal and external borders or edges (or perimeters or boundaries) of enterprise networks are dissipating. Traditionally, network edges have been quite critical as many intelligent services are applied to network traffic crossing the edge.
Canonically, network edges can be mapped into three main categories: Campus-facing, External-facing, and Server-facing. In the new world, all three network edges are being re-defined.
1) Campus-facing network edge: In a typical campus environment, end user devices – e.g. desktops, laptops, IP phones – connect to the network through wiring closet switches and wireless access points. With virtual desktop infrastructure (VDI), the PC itself is moving to the data center and hence is no longer connected to the campus edge. End users would connect to their “data center PCs” via smart terminals (e.g. ones that support RDP – the remote desktop protocol). Cost savings are obvious: OS patching, HW/SW upgrades, etc. are now done centrally, and, thanks to server virtualization, server HW can be shared across multiple users. Edge features such as NAC, protocol recognition, … are no longer relevant on networking devices.
2) External-facing network edge: Traditionally, this edge delineated the trusted inside vs the untrusted outside using network firewalls. Firewalls provided controlled access to designated network segments, e.g. demilitarized zone (DMZ), ExtraNet zone. Because inter-enterprise collaboration is rapidly becoming web based and identity driven, network firewalls are no longer effective in providing the necessary controls to HTTP and SSL transactions – these transactions pass through the FW! Controls need to move much closer to servers/applications, taking into account user identity & attributes (not just source IP address), application attributes such as URLs & sub-sites & folders & files (not just destination IP address & port number) and potentially application-specific actions that are exposed in the protocol (e.g. via HTTP query string, header attributes, methods and even payload). This “vanishing perimeter” phenomenon has been widely covered in the industry and vendors are providing appliance-based solutions to re-establish controls through policy-driven virtual zones (vZones).
3) Server-facing network edge: Not too long ago, physical servers connected to a “top of rack” or “end of rack” switch, which formed the server-facing network edge. With the advent of blade servers, this edge moved into the blade servers in the form of a blade switch. Now, with server virtualization coming to the fore, that server-facing network edge has moved further out to the virtual “hypervisor” switch that connects multiple virtual machines within a server (or server blade). Interestingly, these virtual switches have been provided by server virtualization vendors; Cisco is the first traditional networking vendor to announce plans to offer its own virtual switch product, the Nexus 1000v.
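The identity- and URL-aware controls described for the external-facing edge can be sketched in a few lines of Python. This is an illustration only, assuming a simple first-match-wins rule list with default deny; the `Request` and `Rule` types, the group names and the paths are all hypothetical and do not represent any particular vZone product.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Request:
    user_groups: frozenset   # identity attributes, e.g. from a directory
    method: str              # HTTP method
    path: str                # URL path being accessed

@dataclass(frozen=True)
class Rule:
    group: str               # identity attribute the rule applies to
    methods: frozenset       # HTTP methods covered by the rule
    path_prefix: str         # site, folder or file prefix
    allow: bool

def evaluate(rules, request):
    """First matching rule wins; anything unmatched is denied."""
    for rule in rules:
        if (rule.group in request.user_groups
                and request.method in rule.methods
                and request.path.startswith(rule.path_prefix)):
            return rule.allow
    return False

# Hypothetical zone: partners may only read extranet documents,
# employees may read and post anywhere.
VZONE_RULES = [
    Rule("partner", frozenset({"GET"}), "/extranet/docs/", True),
    Rule("employee", frozenset({"GET", "POST"}), "/", True),
]
```

Note that none of these decisions depends on source IP address or destination port, which is the point of moving controls closer to the application.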
Additionally, with the emergence of cloud computing, enterprise network edges are to be extended to the cloud – sometimes deterministically and other times on demand, e.g. on a per application basis or even on a per workload basis. And, as the network edges get re-defined, so must the network design best practices. After a long pause, the new world of networking is getting interesting again!
Update (25 April 2009): A Network World article on “Cloud computing a ‘security nightmare,’ says Cisco CEO” quoted Tom Gillis, vice president of marketing with Cisco’s Security Technology Business Unit: “The move to collaboration, whether it be video or the use of Web 2.0 technologies or mobile devices is really dissolving the corporate perimeter. This notion of security as a line that you draw in the sand… that notion is just gone.”
The Twitter phenomenon, or micro-blogging, has been quite intriguing. Though not yet a regular tweeter myself, I am told that the “aha” moment will come when I start using it actively. So I started tweeting this week on Twitter and Facebook.
As I was warming up, a new tweet popped up in my mind. What are the infrastructure implications of tweeting, in terms of HTTP connection rate, rate of new storage required, etc.? I quickly looked up Twitter stats on tweetstats.com – nearly 2 million tweets per day. What if most of the world starts tweeting using smart phones (very much like SMS today)? To get a better sense of the infrastructure needed for this human urge to tweet, I did a quick back-of-the-envelope calculation.
Average Tweet Size: 100 bytes
# of Tweets: 10 per tweeter per day
# of Tweeters: 1 billion worldwide (think big!)
Tweet Rate: 10 billion tweets per day
Tweet Storage: 100 Gigabytes per day (with 10:1 compression)
Each tweet is essentially an HTTP transaction (request and response). The tweet rate of 10B/day translates to ~115K HTTP transactions/sec for tweets uniformly distributed throughout the day. Assuming that the compute infrastructure (aggregate of web, application, database servers) can process 1000 transactions/sec/server, about 115 servers are needed. If a peak to average ratio of 3:1 is assumed, then about 350 servers are needed.
Storage needs appear to be quite manageable also – 100GB/day means ~37TB/year, which is no sweat in the petabyte world we live in today.
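The arithmetic above is easy to reproduce. The following Python sketch simply encodes the assumptions listed in this post as default parameters (the function and key names are made up for illustration) so that readers can plug in their own numbers.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def tweet_capacity(tweeters=1_000_000_000, tweets_per_day=10, tweet_bytes=100,
                   compression=10, tx_per_server=1000, peak_to_avg=3):
    """Back-of-the-envelope sizing using the assumptions stated in the post."""
    tweets = tweeters * tweets_per_day                     # tweets per day
    avg_tps = tweets / SECONDS_PER_DAY                     # average HTTP transactions/sec
    stored_per_day = tweets * tweet_bytes / compression    # bytes/day after compression
    return {
        "avg_tps": avg_tps,
        "servers_avg": math.ceil(avg_tps / tx_per_server),
        "servers_peak": math.ceil(avg_tps * peak_to_avg / tx_per_server),
        "stored_gb_per_day": stored_per_day / 1e9,
        "stored_tb_per_year": stored_per_day * 365 / 1e12,
    }
```

With the default inputs this gives roughly 115,741 transactions/sec, 116 servers on average and 348 at peak (rounding up to whole servers), plus 100GB/day and 36.5TB/year of storage, in line with the rounded figures above.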
Net-net, setting up a tweeting service does not seem to need an onerous compute/storage infrastructure (even if people double or triple their daily tweetings). Any techie tweeters out there who can validate/correct the above?
An interesting extension of this would be to estimate capacity of handling all new thoughts of every human being on this planet!!!
Building a cloud-centric data center infrastructure demands the following canonical components:
• Connectivity – data networking, storage networking and Layer 4-7 services (e.g. firewalls, load balancers)
• Compute – servers & OS, virtualization software
• Storage – arrays/file shares for structured and unstructured data (CIFS, iSCSI, Fibre Channel based block storage, etc.)
• Provisioning – automated, end-user driven provisioning of cloud infrastructure
Multiple data center vendors are positioning themselves to provide one or more of these components. On Monday, Cisco announced its Unified Computing vision, which unifies connectivity and computing disciplines using a holistic architectural approach. It includes a portfolio of products under a new Unified Computing System (UCS) product line.
GigaOm provided some details on the announced products. How are these products different from what is available today? A few immediate thoughts:
1. A 4/8-slot, 6-RU blade server chassis (UCS 5100 blade chassis and B-Series blades) that can take up to 8 half-size or up to 4 full-size server blades. Key notables:
• Leverages the latest Intel Xeon processor and Nehalem microarchitecture
• Each blade server utilizes a unified I/O network adapter (for Ethernet, Data Center Ethernet and FCoE connectivity). Three different network adapters are available, though it is unclear whether their interface is 10G or 1G or something else
• Ability to do memory expansion to up to 384GB (no details available)
• Up to 2 fabric extenders (see below) for external fabric connectivity (in lieu of traditional blade switches)
• No separate management module!
2. Fabric extenders (UCS 2100), aka FEX, which are inserted in the blade server chassis for network connectivity. According to Nexus 2000 and 5000 literature, FEX is a “remote I/O module” that extends the internal fabric to the external data center/cloud fabric, providing a singly-managed entity with common supervisory functions and inheriting unified fabric switch port characteristics. Though this blade FEX has four 10Gb uplinks, it isn’t clear whether the internal blade chassis fabric is 10Gb or 1Gb (like Nexus 2148T) or something else. Of course, the key theme is operational simplicity.
3. Unified fabric switches (UCS 6100) providing 20 or 40 ports of 10GbE connectivity. Key notables here are that these switches natively support unified I/O (consolidation of Ethernet and Fibre Channel) via Data Center Ethernet (DCE) and FCoE, plus they enable port extension to UCS 2100 FEXes – very much like the Nexus 5000 switch family.
4. UCS Manager that manages the unified computing infrastructure, up to 320 discrete servers. One potential configuration could be: 40 blade chassis, each with 8 half-size blade servers (total 320 servers), connected to one UCS 6100 fabric switch or to a single HA pod of them, utilizing one 10G port per switch per chassis. By addressing operational complexity head on across multiple discrete products, Cisco intends to reduce cost and increase the operational agility of data centers and cloud infrastructure – key end user care-abouts.
5. Because server virtualization is central to the unified computing vision, the virtual “hypervisor” switch – Nexus 1000v – has to be an integral component, as does the ability to expose VMs to blade FEX/fabric switches via VN-link technology. These technologies ensure that consistent network policies can be applied across the VM infrastructure, even during VM migration, and the entire process can be managed centrally. It would be natural to offer this functionality as a pre-configured option for UCS blade servers.
Overall, this is a cool architecture-centric product offering for next generation data center and cloud computing infrastructure, consisting of an end-to-end centrally managed solution. No doubt Cisco has upped the ante in the data center. It’ll be interesting to see how other data center vendors respond – via their own product innovations, M&A activities and/or partnership re-alignments.
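The potential configuration described in item 4 can be sanity-checked with a few lines of Python. This is a rough sketch under the numbers quoted above (40 chassis, 8 half-size blades each, one 10G port per switch per chassis, a 40-port fabric switch, a 320-server management limit); the function name is made up, and the sketch ignores any switch ports that would be needed for uplinks to the rest of the network.

```python
def ucs_pod(chassis=40, blades_per_chassis=8, ports_per_switch_per_chassis=1,
            switch_ports=40, max_managed_servers=320):
    """Check a hypothetical pod layout against port and management limits."""
    servers = chassis * blades_per_chassis
    ports_needed = chassis * ports_per_switch_per_chassis  # per fabric switch
    return {
        "servers": servers,
        "ports_needed_per_switch": ports_needed,
        "fits_switch": ports_needed <= switch_ports,
        "within_manager_limit": servers <= max_managed_servers,
    }
```

With the defaults the pod holds 320 servers and uses all 40 ports on each switch. Switching to full-size blades (`blades_per_chassis=4`) halves the server count to 160, and the 20-port switch model (`switch_ports=20`) could not serve 40 chassis at one port each.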