Frank Sommers
Autospaces
Abstract
This article is about two time-tested ideas that rightly refuse to die. The first one is time sharing. The second one is about a way to manage complexity: Because some problems require inherently complex solutions, often the best approach is to move that complexity to areas of a system where complexity can be managed best. Grid computing ties these ideas together, and provides a practical way to solve one inherently complex problem: Managing desktop computers.
Time-sharing, first introduced in 1957, is often associated with logging into remote servers via a character-based terminal to execute jobs on those servers[1] [2] [3] [4]. Although the first time-sharing systems matched that description, time-sharing is a much broader concept, and is an extension of the multitasking capability present in all modern operating systems. While multitasking shares a single processor among multiple independently running OS processes, time-sharing services, in addition, allow those processes to run on behalf of different users.
Recent time-sharing systems are not based on character terminals. For instance, the Java Virtual Machine provides time-sharing services via the Java Authentication and Authorization Service (JAAS) framework. JAAS allows Java threads to execute in the VM on behalf of authenticated users, facilitating a multi-user, multitasking VM environment[5] [6]. Web servers also provide the conceptual equivalent of time-sharing: When you purchase a book from Amazon.com, for instance, your purchase request is represented by a thread of control on a remote Amazon server - a thread executing on your behalf, and sharing the CPU's resources with the requests of many other concurrent users for the duration of processing that request. Although such requests represent very short periods of CPU time, they operate on the same fundamental principle of sharing a centrally managed resource among users.
While a single time-sharing system allows many users to share one server's CPU time and other computing resources, grid computing brings that sharing to the level of a network of servers. In a grid, parts of a computing task are parceled out among several independent servers. For instance, user authentication might be performed by one server, a CPU-intensive calculation may be scheduled on another server's CPU, and data storage may be provided by yet another server's file system.
What sets a grid apart from other distributed systems is that in a grid such resources are by design shared among many users, often belonging to different organizations and administrative domains[7]. A grid computing infrastructure, in turn, mediates access to all those resources, providing basic time-sharing services, such as scheduling, resource allocation, or concurrency control, for a set of shared servers.
The Emancipation of Services
Each grid service may be defined via interfaces or protocols following widely accepted standards. For instance, user authentication may be specified via a Kerberos-based authentication facility, resource metadata may be offered via LDAP[8], and file systems may be provided via NFS[9] or Samba[10]. As long as grid services adhere to agreed-upon standards, implementations of each service become interchangeable. That, in turn, emancipates a service from a server: Since clients rely on logical service interfaces, not on access to specific servers, a client need not know what servers provide the required service. Instead, a client can depend on the grid provisioning software to find the server with the least load, or highest availability, or lowest service latency, based on the grid's policy. The emancipation of services provides advantages in scaling up a grid: To increase storage space, a grid administrator can swap a Samba server with another server providing more available disk space, for instance. Or, to offer more robust user authentication, an administrator may provision several authentication servers on the grid, offering the grid provisioning software a larger pool of available resources to choose from. In addition to improving system throughput, provisioning multiple redundant servers for a service also improves that service's availability[11].
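The policy-based selection described above can be sketched in a few lines. This is a minimal illustration only: the `Server` record, the `POLICIES` table, the `provision` function, and the sample pool are all hypothetical names invented for this example, not part of any grid middleware.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    name: str
    service: str        # logical service, e.g. "auth" or "storage"
    load: float         # fraction of capacity in use, 0.0 to 1.0
    latency_ms: float   # measured service latency
    available: bool = True


# Each (hypothetical) policy maps a server to a sort key; the lowest key wins.
POLICIES = {
    "least_load": lambda s: s.load,
    "lowest_latency": lambda s: s.latency_ms,
}


def provision(servers, service, policy="least_load"):
    """Return the best available server for a logical service."""
    candidates = [s for s in servers if s.service == service and s.available]
    if not candidates:
        raise LookupError(f"no available server provides {service!r}")
    return min(candidates, key=POLICIES[policy])


pool = [
    Server("auth-1", "auth", load=0.80, latency_ms=12.0),
    Server("auth-2", "auth", load=0.35, latency_ms=30.0),
    Server("auth-3", "auth", load=0.10, latency_ms=8.0, available=False),
    Server("smb-1", "storage", load=0.50, latency_ms=20.0),
]

print(provision(pool, "auth").name)                    # auth-2
print(provision(pool, "auth", "lowest_latency").name)  # auth-1
```

The client asks only for the logical service name; adding a server to the pool enlarges the set of candidates without any change on the client side.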
Figure 1: Scaling a grid
Desktop Complexity
Server specialization is the exact opposite of today's desktop computing environments. Instead of specializing in a handful of services, desktop operating systems today compete in offering an increasing number of increasingly complex features. Such features are created in response to a sophisticated and demanding user base: Most desktop users are no longer satisfied with being able to compose simple documents in a text editor, but want their computers to be able to access services on the Web, manage digital photos and music, and, more recently, to serve as home entertainment hubs. Mobile computers impose additional demands on a desktop environment, such as the ability to offer access to wireless hotspots, and to provide security and recovery features.
Those capabilities are provided by cooperating tasks, such as user authentication, file system access, window management, and so forth. Each task contributing to a desktop session, in turn, is often represented by an independent operating system process, such as the window manager process, the file system mount daemon, the user authenticator, and even the processes that forward mouse and keyboard input to the operating system. In a traditional desktop environment, those processes must be installed, configured, and managed on a local computer.
As a result, instead of specialization, today's PCs are the result of integration: of bundling a myriad of services and associated software on each desktop. Such integration efforts have given users richer features, but only at the cost of leaving users with an increasingly complex desktop environment to manage. A recent ZDNet UK article quotes research showing that
a PC can cost up to 25 times its purchasing price over a five-year period, particularly when calls to help desks escalate due to bad desktop management. An average call querying the desktop lasts 17 minutes, of which nine are spent simply identifying hardware and software[12].
The Desktop as a Grid Service
Although the desktop paradigm has come to represent access to a single computer, the processes providing a desktop session's capabilities can be distributed to servers on a grid. For instance, one server may perform user authentication, another may offer the user access to a filesystem, and yet another can provide the window manager. In that manner, a desktop user session lends itself to grid-based distribution. Such distribution pushes the complexity of running and managing the services that make up a desktop session to the network, relieving users of desktop management chores. Hence, a grid-based desktop transforms the problem of software installation and maintenance into that of provisioning networked services.
A key problem of provisioning the services of a grid-enabled desktop is deciding how much complexity to leave on a user's computer and, concomitantly, what responsibilities to move to specialized servers. The assumption is that users, in general, are bad at managing desktop complexity, whereas dedicated servers provisioned via grid middleware can excel at that task.
Distributed desktop platforms in use today can be categorized according to their distribution of computational responsibilities between client and server. The most popular distributed desktop environments today include X Windows[13], Citrix's MetaFrame product[14] and GoToMyPC[15], Microsoft's RDP-based remote desktop[16], Virtual Network Computing (VNC)[17], the research prototype THINC[18], and Sun's Sun Ray product line[19]. Some, like X Windows, require substantial client-side resources and maintain a great deal of computational state at the client. Others, such as VNC, are implemented in software, and run as applications on top of a full-fledged OS. Sun's Sun Ray, the focus of this article, represents another extreme, with no client-side state and very minimal client-side computing.
To appreciate the available distribution choices, it is helpful to illustrate the key components of a non-distributed desktop residing on a user's PC:
Figure 2: Components of a desktop display subsystem
Figure 3: Client-server distribution in the X Windows system
Figure 4: An X-based thin client
Cache Your Pixels
A different approach to distributing a desktop's services on the network starts not with a client-server architecture, but with the well-known technique of data caching used to speed up access to networked information. In many enterprise applications, for instance, clients often maintain a local cache of frequently accessed data in order to avoid the overhead of performing queries each time similar data is requested.
In a graphical desktop, the raster image data rendered by the operating system and graphics chipset is buffered in video memory, or the frame buffer. Instead of sending that data to the display, the buffered raster data can be distributed on the network in a replicated cache, with a master copy on a server, and a secondary copy on a thin client. Updates made to the screen are first copied into the server's cache, and then are immediately replicated to the thin client's cache. That architecture almost completely pushes desktop computation to the network, and leaves to a user's system only the task of keeping the local buffer cache in sync with the server's buffer.
The problem of keeping the distributed pixel caches in sync between the desktop client and a server is orthogonal to the type of windowing system used, and even to the operating system. Such a mechanism can be implemented on any operating system or windowing system by means of a virtual graphics device that, instead of pushing a bitmap raster to the display, pushes those rendered images into the server side of the distributed cache, which then takes care of synchronizing local and remote cache content.
Successful cache replication depends on a high hit-to-miss ratio: The less the screen changes, the less data needs to be transferred from server to thin client. That fits the pattern of most desktop usage well: In a typical desktop session, most desktop UI changes occur in response to user actions, and user actions are relatively sparse. Typing text in a word processor, for instance, causes very few changes on the screen - mostly just in the screen area of the newly typed text. The rest of the screen pixels remain the same, requiring relatively small amounts of data to update the thin client's screen buffer.
The worst-case scenario is a full-screen movie. A typical MPEG-1-encoded movie, once decoded to uncompressed frames at 25 frames per second, in 24-bit color and at a display resolution of 1280x1024 pixels, would require a sustained bandwidth of approximately 750 Mb/s (1280 x 1024 pixels x 24 bits x 25 frames is about 786 million bits per second, or 750 Mb/s when a megabit is counted as 2^20 bits). Compression would reduce that bandwidth requirement.
In between these extremes are user actions such as dragging a window or scrolling inside windows. Even with many screen changes, the server's cache update mechanism can intelligently compute the difference between two consecutive screen updates. For instance, some white pixels may stay fixed between two screen updates even when scrolling text in a window. The server can compute such differences via a bit-by-bit comparison of two rasters, and send only the minimal updates necessary to the client.
Screen updates may be communicated between client and server via a protocol that aims to minimize the amount of data transmitted. Instead of sending arrays of pixels representing screen changes, drawing primitives can effect bulk updates in the client. For instance, if two screen areas contain the same set of pixels, a COPY command can instruct the client to copy one rectangular screen area to another screen location. Or, if a screen area consists of identical pixels, a FILL command could cause the client to fill that area with the specified pixel. Such protocol primitives allow protocol optimizations specific to thin client and server communication, and are independent of the original display protocol, such as X. They also facilitate graceful degradation of display quality in the face of communication obstacles, such as a high packet loss ratio. The server, noticing the dropped packets, could re-transmit only the updates most essential to maintaining usability. In addition, communication between client and server can be compressed to further reduce bandwidth requirements. That leads to the design depicted in Figure 5:
Figure 5: Distributed pixel caching in a thin-client architecture
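The server-side comparison of two consecutive rasters can be illustrated with a small sketch. It assumes a frame is a list of rows of integer pixel values and compares fixed-size tiles; the `diff_tiles` function, the tile size, and the tuple layout of the emitted commands are assumptions made for this example, not a description of any actual encoder.

```python
def diff_tiles(prev, curr, tile=4):
    """Compare two frames tile by tile and emit minimal update commands.

    Unchanged tiles emit nothing. A changed tile that is one uniform
    color emits FILL; any other changed tile emits SET with its pixels.
    """
    height, width = len(curr), len(curr[0])
    commands = []
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            h, w = min(tile, height - y), min(tile, width - x)
            old = [row[x:x + w] for row in prev[y:y + h]]
            new = [row[x:x + w] for row in curr[y:y + h]]
            if old == new:
                continue  # cache hit: nothing to send for this tile
            pixels = {p for row in new for p in row}
            if len(pixels) == 1:
                commands.append(("FILL", x, y, w, h, pixels.pop()))
            else:
                commands.append(("SET", x, y, w, h, new))
    return commands


WHITE, BLACK, RED = 0xFFFFFF, 0x000000, 0xFF0000
prev = [[WHITE] * 8 for _ in range(8)]
curr = [row[:] for row in prev]
for y in range(4):              # paint the top-right tile solid black
    for x in range(4, 8):
        curr[y][x] = BLACK
curr[5][1] = RED                # change one pixel in the bottom-left tile

commands = diff_tiles(prev, curr)
for cmd in commands:
    print(cmd[:5])              # command name and rectangle only
```

Of the four tiles in this 8x8 frame, two are unchanged and produce no traffic, one collapses to a single FILL, and only one needs its literal pixels sent.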
Figure 6: Session mobility by redirection
Thin Client Grid Computing: Test-Driving Sun's Display Grid
The rest of this article focuses on an implementation of the replicated frame buffer cache paradigm, Sun's Sun Ray[20]. The Sun Ray's architecture is novel in that it pushes almost all computation to the network. Indeed, the Sun Ray client, a small piece of hardware that connects to a keyboard, a mouse, a display, and a network cable, does so little computing that it can operate on about 9 W of electricity. At the heart of the Sun Ray architecture is a communication protocol that relays status between server and client, including information about user authentication and a user's desktop session, sends keyboard and mouse state to the server, forwards audio and peripheral I/O between server and client, and transports screen updates from the server to the thin client. The Sun Ray's local screen buffer is used for display updates, but that cache is treated as ephemeral: the server can overwrite it at any time. Thus, the client is stateless. The firmware in the Sun Ray device contains networking code as well as code specific to the Sun Ray protocol.
| Command | Description |
|---|---|
| SET | Set literal pixel values of a rectangular region |
| BITMAP | Expand a bitmap to fill a rectangular region with a (foreground) color where the bitmap contains 1's, and another (background) color where the bitmap contains 0's |
| FILL | Fill a rectangular region with one pixel value |
| COPY | Copy a rectangular region of the frame buffer to another location |
| CSCS | Color-space convert a rectangular region from YUV to RGB with optional bilinear scaling |
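To make the semantics of these commands concrete, the following sketch applies four of them to an in-memory frame buffer. It is an illustration under stated assumptions: the `FrameBuffer` class, its method names, and the argument layouts are hypothetical, the actual wire format is not described here, and CSCS is omitted.

```python
class FrameBuffer:
    """A client-side pixel cache that the server may overwrite at any time."""

    def __init__(self, width, height, pixel=0):
        self.width, self.height = width, height
        self.rows = [[pixel] * width for _ in range(height)]

    def set(self, x, y, pixels):
        """SET: write literal pixel values into a rectangular region."""
        for dy, row in enumerate(pixels):
            self.rows[y + dy][x:x + len(row)] = row

    def bitmap(self, x, y, bits, fg, bg):
        """BITMAP: foreground color where a bit is 1, background where 0."""
        self.set(x, y, [[fg if b else bg for b in row] for row in bits])

    def fill(self, x, y, w, h, pixel):
        """FILL: fill a rectangular region with one pixel value."""
        self.set(x, y, [[pixel] * w for _ in range(h)])

    def copy(self, sx, sy, w, h, dx, dy):
        """COPY: copy a region to another location in the frame buffer."""
        # Snapshot the source first so overlapping regions copy correctly.
        src = [self.rows[sy + i][sx:sx + w] for i in range(h)]
        self.set(dx, dy, src)


fb = FrameBuffer(6, 2)
fb.fill(0, 0, 2, 2, 7)
fb.bitmap(2, 0, [[1, 0], [0, 1]], fg=9, bg=1)
fb.copy(0, 0, 4, 2, 2, 0)       # source and destination overlap
for row in fb.rows:
    print(row)
```

Because every command fully specifies the pixels it writes, the client needs no state beyond the buffer itself, which is consistent with the stateless design described above.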
Bandwidth Versus Latency
In addition to subjective observations, we wanted to measure the Sun Ray's actual bandwidth consumption in the context of using the above applications. To put that consumption in context, we first measured the point-to-point peak available bandwidth between the remote Sun Ray server and the Sun Ray device. A DSL connection presents asymmetrical network bandwidth, with more bandwidth available for download than for upload. That asymmetry matches the bandwidth requirements of a Sun Ray session, since most data is transmitted from the server to the thin client device at the network's edge. Thus, we focused on measuring download bandwidth from the remote server. To eliminate TCP/IP connection startup latency from the measurements, we wrote a simple client-server bulk data transfer application. The server portion of that application ran inside a residential network on a fast laptop running the Fedora Core 4 version of Linux. The client portion, running on the remote Sun Ray server, opened a connection to the server and transferred a specified number of bytes as character data from the Sun Ray server. To collect representative samples, we transferred 1 MB and 10 MB amounts of bulk data, and repeated each transfer three times at various times of the day, with the following results:
| Run | Bytes Sent | Client's Time | Server's Time | Bandwidth on Client | Bandwidth on Server |
|---|---|---|---|---|---|
| 1 | 1 MB | 9,085 ms | 9,201 ms | 901.7 Kbps | 890.34 Kbps |
| 2 | 1 MB | 9,117 ms | 9,191 ms | 898.54 Kbps | 891.3 Kbps |
| 3 | 1 MB | 8,648 ms | 8,865 ms | 947.27 Kbps | 925.23 Kbps |
| 4 | 10 MB | 85,382 ms | 85,403 ms | 959.45 Kbps | 959.22 Kbps |
| 5 | 10 MB | 84,891 ms | 84,984 ms | 965 Kbps | 963.95 Kbps |
| 6 | 10 MB | 89,040 ms | 89,049 ms | 920.03 Kbps | 919.94 Kbps |
| Average bandwidth measured: | | | | 932.00 Kbps | 925.00 Kbps |
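The bandwidth columns follow from the bytes sent and the elapsed time. The sketch below reproduces the calculation for three of the client-side rows; it assumes binary units (1 MB = 1,048,576 bytes and 1 Kb = 1,024 bits), an assumption chosen because it yields the figures in the table. The function and constant names are illustrative.

```python
MB = 1024 * 1024      # bytes; assumed binary megabyte
KBIT = 1024           # bits; assumed binary kilobit


def bandwidth_kbps(num_bytes, elapsed_ms):
    """Average throughput in Kbps for a transfer of num_bytes in elapsed_ms."""
    bits = num_bytes * 8
    seconds = elapsed_ms / 1000
    return bits / seconds / KBIT


# (bytes sent, client-side elapsed time in ms) for runs 1, 2 and 4
runs = [(1 * MB, 9085), (1 * MB, 9117), (10 * MB, 85382)]
for size, ms in runs:
    print(f"{size // MB:>2} MB in {ms:>6} ms -> {bandwidth_kbps(size, ms):.2f} Kbps")
```

With these unit assumptions, the three runs come out at roughly 901.7, 898.5, and 959.5 Kbps, in line with the corresponding table rows.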
| | Idle Time | StarOffice Spreadsheet | Reading Email with Thunderbird | Developing with IntelliJ IDEA | Web Browsing with Firefox |
|---|---|---|---|---|---|
| Total bytes transferred | 321.51 KB | 15,057.7 KB | 22,553 KB | 23,560.08 KB | 43,464.89 KB |
| Avg bandwidth consumed | 1.43 Kbps | 66.92 Kbps | 100.24 Kbps | 104.71 Kbps | 193.18 Kbps |
| Avg of available bandwidth consumed | 0.11% | 7.21% | 10.8% | 11.28% | 20.81% |
| Peak bandwidth consumed | 23.61 Kbps | 428.15 Kbps | 420.06 Kbps | 405.09 Kbps | 894.44 Kbps |
| Max. of available bandwidth consumed | 2.54% | 46.11% | 45.24% | 43.63% | 96.11% |
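The relationship between the rows of this table can be sketched as follows. Two values here are assumptions inferred from the numbers rather than stated in the text: a measurement window of 1,800 seconds per workload, and a baseline of 928.5 Kbps (the mean of the two measured averages of available bandwidth). With those assumptions the sketch reproduces the average-bandwidth and average-share figures for the three workloads shown; the function and constant names are illustrative.

```python
SESSION_SECONDS = 1800                  # assumption: inferred, not stated in the text
BASELINE_KBPS = (932.00 + 925.00) / 2   # assumption: mean of the measured averages


def avg_kbps(total_kb, seconds=SESSION_SECONDS):
    """Average bandwidth in Kbps for total_kb kilobytes sent over the window."""
    return total_kb * 8 / seconds


def share_of_available(kbps, baseline=BASELINE_KBPS):
    """Percentage of the available bandwidth that kbps represents."""
    return 100 * kbps / baseline


workloads = {
    "StarOffice Spreadsheet": 15057.7,
    "Reading Email with Thunderbird": 22553.0,
    "Web Browsing with Firefox": 43464.89,
}
for name, total_kb in workloads.items():
    kbps = avg_kbps(total_kb)
    print(f"{name}: {kbps:.2f} Kbps, {share_of_available(kbps):.2f}% of available")
```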
Future Possibilities
The Sun Ray system currently requires a dedicated thin client device. Since the architecture can be implemented in software as well, should Sun make a Sun Ray client available as a software package, or even open-source that software, virtually any network-connected client could access a remote Sun Ray server. The Sun Ray architecture's relatively low computation and memory requirements on the client make that an especially attractive option for mobile devices[24] [25] [26]. For instance, a cell phone connected to the network via a cellular network or WiFi could run the Sun Ray software, and render the display to a Bluetooth-enabled display. The cell phone could also integrate a Bluetooth-enabled keyboard and mouse to provide a complete portable desktop experience. Recent developments with foldable displays and miniaturized, expandable keyboards, together with a ubiquitous Sun Ray client, could foreshadow a new era in mobile computing[27] [28].
Acknowledgements
The author would like to thank Brian Foley, Bob Gianni, and Ismet Nesicolaci, all with Sun Microsystems, Inc., for assistance in evaluating the Sun prototype display grid.
Resources
http://www.ietf.org/rfc/rfc2251.txt
[9] Sun Microsystems, Inc. NFS: Network File System Protocol Specification. Internet Engineering Task Force RFC 1094. 1989. http://www.faqs.org/rfcs/rfc1094.html
[10] SAMBA. http://us3.samba.org/samba/docs/
[11] I. Foster and C. Kesselman, editors. The Grid 2: Blueprint for a New Computing Infrastructure. Morgan Kaufmann, 2003.
[12] M. Vernon. Save money by taking control of your desktops. TechRepublic, August 23, 2005. http://insight.zdnet.co.uk/business/0,39020481,39214436,00.htm
[13] X Windows. http://www.x.org
[14] Citrix Access Suite. http://www.citrix.com/English/ps2/products/documents.asp?contentid=12752
[15] Citrix GoToMyPC. http://www.citrix.com/English/ps2/products/documents.asp?contentid=13994
[16] Microsoft Remote Desktop Protocol. http://msdn.microsoft.com/library/?url=/library/en-us/termserv/termserv/remote_desktop_protocol.asp
[17] Virtual Network Computer. http://www.realvnc.com
[18] R. Baratto, J. Nieh and L. Kim. THINC: A Remote Display Architecture for Thin-Client Computing. Technical Report CUCS-027-04, Department of Computer Science, Columbia University, July 2004. http://www.ncl.cs.columbia.edu/publications/ucs-027-04.pdf
[19] Sun Ray Thin Client. http://www.sun.com/sunray/whitepapers.html
[20] B. K. Schmidt, M. L. Lam, and J. D. Northcutt. The Interactive Performance of Slim: A Stateless Thin Client Architecture. Operating Systems Review, 34(5): 33-47, 1999. http://research.sun.com/features/tenyears/volcd/papers/nrthcutt.htm
[21] Firefox and Thunderbird. http://www.mozilla.org
[22] OpenOffice. http://www.openoffice.org
[23] IntelliJ IDEA. http://www.intellij.com
[24] A. M. Lai, J. Nieh, B. Bohra, V. Nandikonda, A. P. Surana, and S. Varshneya. Mobility: Improving Web Browsing Performance on Wireless PDAs Using Thin-Client Computing. Proceedings of the 13th ACM International Conference on World Wide Web, 2004. http://citeseer.ist.psu.edu/lai04improving.html
[25] S. J. Yang, J. Nieh, S. Krishnappa, A. Mohla, and M. Sajjadpour. Mobility and Wireless Access: Web Browsing Performance of Wireless Thin-Client Computing. Proceedings of the 12th International Conference on World Wide Web, 2003. http://citeseer.ist.psu.edu/sj03web.html
[26] A. Lai, J. Nieh. Limits of Wide-Area Thin-Client Computing. Proceedings of the 2002 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 2002. http://citeseer.ist.psu.edu/lai02limits.html
[27] P. Sayer. Notebooks to Get Foldable Displays. PC World, September 6, 2000. http://www.pcworld.com/news/article/0,aid,18349,00.asp
[28] L. Valigra. Next Digital Screen Could Fold Like Paper. Christian Science Monitor, January 8, 2004. http://www.csmonitor.com/2004/0108/p14s01-stct.html
About the author
Frank Sommers is editor of IEEE Scalable Systems, the publication of the IEEE Technical Committee on Scalable Computing. He is also founder and president of Autospaces.