The Global Environment for Network Innovation (GENI)

Jack Brassil and Rick McGeer
HP Labs
E-mail: patrick.mcgeer@hp.com

Over the past decade, the focus of interest in computer science and computing systems has shifted to very large distributed systems. Specific examples include web services, Grid services, content distribution systems, overlay multicast trees, wide area storage systems, and distributed hash tables. And for good reason: these society-scale systems offer both unique challenges and unique opportunities for our community, and expose fascinating new research questions.

The principal challenges involved in these systems are challenges of scale. A sufficient change in scale is a qualitative change for a system. Emergent behavior at scale has been observed repeatedly in networked systems. To take some common examples, distributed denial-of-service attacks and high volume email abuse (spam) are behaviors that only emerge when a sufficient number of users are connected to a network to permit mutually-anonymous behavior. Other persistent examples of emergent behavior at large scale include flash crowds. There are also examples of applications where stability and/or value emerges only at scale: these include content distribution networks, distributed hash tables as routing and discovery overlays, cooperative domain name service, and file-swarming object transfer systems.

Some computational systems exhibit scale-independent stability. These systems, which we call fractal systems, are self-similar in nature: they scale with the number of systems and users, and their behavior is the same at all scales. One common example of a fractal system is an end-system overlay multicast tree.

Most distributed systems have at least some scale-dependent properties and behavior. In order to understand these systems, it is important to study them at scale. This requires the use of a large-scale testbed, where systems can be run and their performance and behavior studied at scale.

One such large distributed system, of course, is the Internet itself. That the Internet continues to function at all indicates that it is largely fractal in nature; it has, after all, grown by six orders of magnitude in a generation without fundamental change to the basic underlying protocols. Further, the application mix has changed dramatically. In the early 1980s, the main Internet applications were telnet, electronic mail, and file transfer; in the 1990s, the web, instant messaging, and electronic commerce (mostly three-tiered web applications); in this decade, peer-to-peer and file-swarming file transfer, voice over IP, web services, and streaming media. Understanding the behavior of the Internet in response to varying use (and misuse) cases requires a large-scale observatory, with thousands of vantage points on the Internet. A classic case study in the benefits of such a wide-area large-scale observatory was the Netbait system developed by Brent Chun of UC Berkeley and Intel Research Berkeley. Netbait was simply a network of web servers, deployed on PlanetLab, which logged queries as they came in. Through a simple filter on the logfiles, Chun was able to construct the largest honeyfarm then in existence (it has since been superseded by the Paxson/Weaver/Savage Array of Telescopes project at UC San Diego and UC Berkeley). Chun’s Netbait system provided a worldwide trace of worm spread throughout 2002 and 2003, capturing the emergence of a new variant of CodeRed.
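The "simple filter on the logfiles" idea can be illustrated with a short sketch. This is not Netbait's actual code: the log format (Apache-style Common Log Format), the sample addresses, and the function names are assumptions made purely for illustration. The single signature used, a request for /default.ida, is the request path that CodeRed is known to have probed.

```python
import re
from collections import Counter

# Assumed log format: Apache-style Common Log Format.
# The text does not say what format Netbait's servers actually logged.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<request>[^"]*)"'
)

# Hypothetical signature table: label -> substring expected in the request.
SIGNATURES = {"codered": "/default.ida?"}


def classify(line):
    """Return (source_ip, label) if the line matches a signature, else None."""
    match = LOG_LINE.match(line)
    if match is None:
        return None  # not a line in the assumed format
    request = match.group("request")
    for label, needle in SIGNATURES.items():
        if needle in request:
            return match.group("ip"), label
    return None


def count_probes(lines):
    """Count matching probes per (source_ip, label) pair."""
    hits = Counter()
    for line in lines:
        result = classify(line)
        if result is not None:
            hits[result] += 1
    return hits


if __name__ == "__main__":
    # Made-up sample lines using documentation-reserved IP addresses.
    sample = [
        '192.0.2.7 - - [19/Jul/2002:10:00:00 +0000] "GET /default.ida?NNNNNNNN HTTP/1.0" 404 205',
        '198.51.100.3 - - [19/Jul/2002:10:00:05 +0000] "GET /index.html HTTP/1.0" 200 1043',
        '192.0.2.7 - - [19/Jul/2002:10:01:00 +0000] "GET /default.ida?NNNNNNNN HTTP/1.0" 404 205',
    ]
    for (ip, label), n in count_probes(sample).items():
        print(ip, label, n)
```

Run at each of many vantage points and merged by timestamp, per-source counts of this kind are the sort of raw material from which a trace of worm spread could be assembled.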

Network systems are first and foremost communication systems, and therefore their emergent behaviors often appear in human interactions. In some sense, the scale at which behaviors emerge in a network system is measured not only in number of nodes, but also in number of people using the system. To take simple examples, spam and phishing are phenomena that only emerge when a large number of people use a networked system. Similarly, peer-to-peer file transfer, or an online classified or auction system such as Craigslist or eBay, only demonstrates interesting properties when a large number of users become attached to the system.

Studying large-scale systems with large numbers of users requires an experimental field station where long-running services can be deployed and used by real-world users.

The difficulty with such a wide-area testbed, observatory, and field experimental station is that few applications or experiments can justify the expense of such a platform on their own, and thus a shared platform is required. The goal of a number of researchers is to build such a wide-area shared platform, modeled on the highly successful PlanetLab distributed systems platform. This platform is called the Global Environment for Network Innovation, or GENI, and is currently being designed by a team of 70 industrial and academic researchers.

GENI will play three fundamental roles for the networking and distributed systems research community in the United States:

1. As a laboratory: a facility for controlled, repeatable, reproducible experiments under safe conditions. This facility should provide specific, precise, guaranteed conditions for the conduct of experiments, and mutual protective guarantees for facility users and third parties.
2. As an observatory: a facility for precise, non-invasive observations of the behavior of existing networks and distributed systems under current network conditions.
3. As a field experimental station where new systems can be tested under actual network conditions.

These three classes of usage have slightly varying resource requirements, and resource descriptions must fit all of them. Moreover, the resource descriptions must describe items that are realizable with available technology in the GENI timeframe.

There are three systems on whose experiences we draw, as representatives of the GENI frameworks: Emulab/Netbed, PlanetLab, and Tycoon.

Emulab, from the University of Utah, is the premier distributed systems and networking laboratory facility. It offers a controlled testing environment consisting of bare machines, with images loadable from a centralized resource, and private links with a maximum bandwidth of 100 Mb/sec and controllable impairments.

PlanetLab, centered at Princeton, is the leading distributed systems and networking observatory and field experimentation station. It is an overlay network of bare virtualizable IA-32 Linux machines, with virtualization at the syscall level using Vservers, a Linux equivalent of BSD jails. Connectivity is over the open Internet, using physical links and bare IPv4 addresses. Bandwidth is capped to control expenses at the hosting sites. Only Linux executables are loadable.

Tycoon, from HP Labs, is a market-based cluster management system. Tycoon allocates virtual machines based on the Xen virtual computing base, with memory, CPU, and bandwidth set by the end-user/developer. While Tycoon is not specifically a networking testbed, its more general use as a cluster manager makes it an interesting model for experimental resource allocation. Viewed as an abstract problem, network observation and experimentation is simply another cluster application or, more properly, an application for networks of clusters with strong isolation requirements.

The most important common feature of all three environments is that they offer the developer/user bare machines: in the case of PlanetLab and Tycoon, bare virtual machines, and in the case of Emulab, bare physical machines. The choice of bare machines is not accidental: conflicts in software environments are a major deployment barrier, and use of bare machines therefore is a prerequisite for rapid deployment.

The goal of GENI is to extend and deepen its specific antecedents, PlanetLab and Emulab, both broadening the scope of the testbed and providing more advanced services. In particular, PlanetLab’s ability to maintain a worldwide network of machines with a small staff, and its ability to provide users with the illusion of a bare virtual machine, will be retained in GENI. Emulab’s ability to create a virtual network and rapidly populate it, sharing a common file system, will be retained and extended over the wide area. Several deficiencies of the PlanetLab infrastructure will also be addressed in GENI:

* PlanetLab is known to be inappropriate for the execution of high-QoS services such as teleconferencing, since it is unable to guarantee bandwidth and processing power to individual services. GENI will remedy this with advanced resource allocation, derived from market-based techniques such as those used in Tycoon.
* PlanetLab’s experimental facilities were restricted to application-layer and user-level networking experiments and services, since the PlanetLab physical machines were hosted commodity nodes interconnected by the commodity Internet. GENI will offer dedicated subnets to permit experimentation at the internet and potentially physical layers.
* PlanetLab’s interconnect facilities were restricted to wired commodity and research network facilities. GENI will offer optical interconnect on the National Lambda Rail and a variety of wireless testbeds to experiment in these emerging network domains.
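
To make the market-based allocation mentioned in the first item concrete, here is a minimal sketch of one common scheme, bid-proportional sharing, in which each service receives a fraction of a resource equal to its bid divided by the sum of all bids. It illustrates the general idea only: it is not taken from GENI's design or from Tycoon's code, and the function name, service names, bids, and link capacity are all hypothetical.

```python
def proportional_share(bids, capacity):
    """Split `capacity` among bidders in proportion to their bids.

    bids:     dict mapping a (hypothetical) service name to a non-negative bid
    capacity: total amount of the resource, e.g. link bandwidth in Mb/sec
    Returns a dict mapping each service name to its allocated amount.
    """
    if capacity < 0:
        raise ValueError("capacity must be non-negative")
    if any(bid < 0 for bid in bids.values()):
        raise ValueError("bids must be non-negative")
    total = sum(bids.values())
    if total == 0:
        # Nobody bid anything, so nobody is allocated anything.
        return {name: 0.0 for name in bids}
    return {name: capacity * bid / total for name, bid in bids.items()}


if __name__ == "__main__":
    # Made-up example: three services bidding for a 100 Mb/sec link.
    bids = {"teleconference": 6, "crawler": 3, "monitor": 1}
    for name, share in proportional_share(bids, 100).items():
        print(f"{name}: {share} Mb/sec")
```

Under such a scheme, a service that needs a guaranteed share, such as teleconferencing, can raise its bid relative to competing services, which is the kind of control that best-effort sharing does not offer.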

Compelling social needs also motivate GENI. A particularly strong driver is the emergence of networking in the developing world. Researchers at Intel Research Berkeley and UC Berkeley, led by Prof. Eric Brewer, have undertaken a number of ambitious infrastructure projects in the developing world, featuring long-range wireless communication and delay-tolerant networking. Both Prof. Brewer and UC San Diego’s California Institute for Telecommunications and Information Technology have been deeply involved in Mission 2007, an ambitious project to connect 100,000 Indian villages.

Also ambitious is the networking infrastructure envisioned by the One Laptop Per Child (OLPC) project, conceived by Prof. Nicholas Negroponte of MIT. Negroponte’s vision incorporates a mobile wireless ad-hoc network in every village, tied to a school server which connects to an Internet backhaul. At a stroke, this entails problems in wireless mesh management, automated management of a nationwide network of school servers, and content distribution among the schools. Each of these will be the largest of its kind ever attempted, and each cries out for experimentation on a facility such as GENI.