From IEEE Spectrum, March '97
Reproduced with permission.

The Rise of Shared Virtual Environments

This intriguing development could weave advances in computers and networking deep into our social fabric.

In the beginning, there was batch processing. When finished writing their codes, programmers submitted them on a deck of Hollerith cards to a computer center, where the routines would line up with other programs, waiting to be processed en masse. Hours--or days--later, the center would present the resulting alphanumeric printout to the programmers, who then examined the reams of paper for problems. If there were problems, another lengthy iteration would occur. Using a computer was then an instance of "action at a distance": many users shared a system to which they had no direct access; and between a program's submission and the completion of its run, there was a long delay essentially unrelated to the actual time it took the computer to perform the task.


The authors of this article, Dick Waters and John Barrus,
were two of the developers of "Diamond Park": a square-mile virtual space
where people could gather to work, learn and play together.

Today, computer use is predominantly single-user, real-time. Someone sits at a personal computer or workstation, makes changes (in, say, a spreadsheet program or a computer-aided design tool), and within a second or less sees the alterations' effects on a computer screen, often in the form of a table, graph, or drawing. Further, the computer now also serves as a medium of information exchange; the user can send and receive information through a modem or network connection. This, too, is a single-user operation; the style of communication with another user is generally disconnected (unlike the telephone, with its hard-wired, real-time channel between the users) and serial (one user sends a message and its recipient sends one back).

Interactions worldwide

In the future, people using a microcomputer-based system may routinely log onto a computing network and find themselves in a highly graphical environment. This computer-generated world would be populated by many other people, all of whom will be able to interact in real time, helped by various computer tools they could pick up and use intuitively, almost as they would a hammer. These shared, computer-resident worlds are called distributed virtual environments (DVEs). Like any truly new generation of computer technology, DVEs could radically alter the way we work, learn, consume, and, of course, play.

To see how DVEs might change the workplace, imagine a group of architects in London, structural engineers in New York City, and a developer in Madrid "walking through" a virtual building they had designed using computer tools. Each would be represented on screen by a three-dimensional icon--a so-called avatar--that the participant could move through the virtual environment and, in the process, open doors and rearrange tables, chairs, windows, walls, and other objects. As they viewed the computer-rendered building from diverse exterior and interior vantage points on their far-flung computer screens, they could talk with each other and experiment with possible changes. The changes would be made in real time, and their effect on the utility and structural integrity of the building be seen immediately.

During the lifetime of the project, the group might meet in this collaborative DVE many times to review progress and plan changes necessitated by altered economic conditions or corporate objectives. What matters is the depth of collaboration that such a working environment would provide by allowing the participants to meet early and often. The DVE's convenience would encourage the kind of repeated interaction that is essential if each member of a group is to become fully aware of other members' concerns, so that the group arrives jointly at a solution--one that is likely to work better than any dominated by a single party.

Similarly, a DVE could reshape the educational process. For teaching a language, for example, imagine a virtual street in Paris. This computer construct might be lined with apartment buildings, shops, restaurants, a museum, a library, a theater, and, at its end, a small park. There could be throngs of residents, shopkeepers, patrons, waiters, pedestrians, and tourists. Language students and their teachers could take on the roles of some of these characters; the rest might be played by native speakers or a computer with artificial intelligence. Upon entering this virtual environment, students would practice a foreign language, speaking French with other people in a nonthreatening environment while learning about modern-day French life.

For this DVE, interactivity is the outstanding advantage. From time to time, tens of thousands of people from around the world could join in. As a result, at practically any time he or she wanted, a visitor might join in and truly interact with other people who were interested in practicing French. (Most current computer-assisted tools for learning languages boil down to rote drills with no interaction with other people.)

DVEs could also recast the way in which people make purchases and purchasing decisions. Imagine planning a trip to New Zealand. Using the environment of a virtual travel agency (which can be open 24 hours a day), would-be tourists could explore sites and talk with people from the government travel bureau. They might also visit hotels and resorts, check out rooms, and discuss needs, wants, and alternatives with hotel staff. Meanwhile, they would encounter others planning trips to New Zealand and swap information with them about what was new and interesting.

Constant updating is this DVE's strong point. The environment can evolve even as it is being used. Virtual models of cultural sites and accommodations could be added and refined continually without the environment ever having to be shut down.

Attributes of DVEs

These examples are only three out of a multitude of possible DVE applications. Still, they serve to illustrate the key DVE features:

Is it the future yet?

Research on DVEs has been under way for over 20 years. But general interest in developing virtual environments has been held back by two factors. Very few people could afford a computer that would generate in real time the visual and audio imagery of a virtual world. Even fewer people had access to wide-area computer networks with the high-bandwidth, low-latency communication needed by DVEs. Now that these obstacles are beginning to disappear, however, multiuser virtual worlds are on the brink of widespread availability.

As for computer capability, the arrival of powerful graphics accelerator cards, combined with the ever-growing power of PC processors, means that DVE computations can often be done by high-end PCs.

The networking needs of some simple DVEs can be met by 28.8-kb/s modems, which became common in 1996. Further, modems almost twice as fast as that are now on the market, and in a short while several developing technologies--asymmetric digital subscriber line (ADSL) and cable modems, to name but two--should be able to provide affordable multi-megabit-per-second network connections to the home. When this occurs, the pipeline to the home or office will offer enough bandwidth for just about any DVE.
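
A rough bandwidth budget makes the modem figure concrete. The sketch below estimates how many state-update messages per second a link can carry. The 40-byte message size, the 1.5-Mb/s broadband figure, and the function name `updates_per_second` are illustrative assumptions rather than figures from any particular DVE, and protocol overhead and compression are ignored.

```python
def updates_per_second(link_bits_per_second: float, message_bytes: int) -> float:
    """Return how many fixed-size messages fit through a link each second.

    Ignores protocol overhead, compression, and latency; message_bytes is
    an assumed size for one avatar position update, not a measured value.
    """
    if message_bytes <= 0:
        raise ValueError("message_bytes must be positive")
    return link_bits_per_second / (message_bytes * 8)


# A 28.8-kb/s modem versus a hypothetical 1.5-Mb/s broadband link,
# both carrying assumed 40-byte update messages.
modem_rate = updates_per_second(28_800, 40)
broadband_rate = updates_per_second(1_500_000, 40)
print(f"modem: {modem_rate:.0f} updates/s, broadband: {broadband_rate:.0f} updates/s")
```

Under these assumptions the modem carries 90 updates per second, a budget that must be shared among every moving object a user can see, which is why only simple DVEs fit on such a link.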

Now that the hardware needed for DVEs exists, the results of research that has been going on for many years will soon appear in the home and office. This research has been conducted in two separate communities: the Internet world, paced by commercial developers, and the Distributed Interactive Simulation (DIS) world, pushed by developers of military simulators. Both efforts have the same long-range goal--a complex yet flexible cyber-environment fully shared by many users. But the Internet and DIS camps have made diametrically opposed decisions as to which path leads there, disagreeing about which DVE features are essential now and which can be deferred. In essence, each group has attacked the above list of attributes from opposite ends, with the Internet camp focusing on affordability while the DIS camp worked on multimillion-dollar environments.

The Internet camp's view

From the Internet group's perspective, a DVE must run on the kinds of computers most people own and over the kind of network connections that most people have. Until not so long ago, this goal meant sacrificing voice communication and the sense of 3-D immersion in a DVE on the altar of mass access.

The stage for Internet DVE work was set in the mid-1970s, with the advent of a new kind of computer game, Adventure, created by Will Crowther and Donald Woods of Xerox Palo Alto Research Center, California, and run on a Digital Equipment Corp. PDP-10 mainframe computer. (The game was also known by the truncated name Advent, since computers of the time did not permit longer file names.)

Adventure was inspired both by Crowther's love of exploring caves (in particular, the Mammoth Cave system in Kentucky) and by Dungeons and Dragons, a fantasy adventure game created by Gary Gygax and Dave Arneson and published by TSR Inc., Lake Geneva, Wis., in 1974. Played with pencil and paper, Dungeons and Dragons was the first role-playing game and was all the rage among teens, preteens, and college students worldwide.

To participate in their beloved Dungeons and Dragons, players gathered to act out roles as knights and sorcerers in a "world" designed and governed by an experienced member of the group, a so-called dungeon master. This person determined how the rooms and other environments in the game would look and be connected, and what objects would be in them. At first, only written descriptions were used, but later a painted metal figurine represented each player amid scale models of the rooms and props.

A player was free to ask the dungeon master questions about the room and, on picking up an object (perhaps a sword), would be told by the dungeon master of its special properties (maybe a magic ability to heal wounds). Movement through the world and the outcome of battles were usually determined by rolling a special die, often one with eight or 12 sides. Play could last for days, ending only when a goal defined by the dungeon master was reached or all the players "died."

Adventure, in contrast, was a single-player game, in which the player explored a computer-based fantasy world consisting of a maze of caverns, fought a dragon, and found treasure. The game was entirely text based--relying upon the player to conjure up mental images from textual descriptions. The descriptions of environments, objects, and possible actions appeared as text on the computer's display. To proceed through the space, players typed simple commands like UP or DOWN on a computer keyboard.
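
The interaction style described above can be sketched in a few lines. The two-room map, the room names, and the `step` function below are invented for illustration and are not taken from Adventure itself; the sketch only shows the general pattern of text descriptions plus typed movement commands.

```python
# Hypothetical two-room map: each room has a description and exits keyed by command.
ROOMS = {
    "cave_mouth": {
        "description": "You are at the mouth of a cave. A shaft leads down.",
        "exits": {"DOWN": "cavern"},
    },
    "cavern": {
        "description": "You are in a dark cavern. A shaft leads up.",
        "exits": {"UP": "cave_mouth"},
    },
}


def step(room: str, command: str) -> tuple[str, str]:
    """Apply one typed command; return the new room and the text to display."""
    destination = ROOMS[room]["exits"].get(command.strip().upper())
    if destination is None:
        # Unknown command or no exit in that direction: stay where we are.
        return room, "You can't go that way."
    return destination, ROOMS[destination]["description"]


room = "cave_mouth"
for typed in ["down", "down", "up"]:
    room, text = step(room, typed)
    print(f"> {typed}\n{text}")
```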

Within a few years, Adventure spawned a whole genre of computer-based fantasy games, which continue to be popular to this day. A pivotal development in computer-based fantasy games, from the DVE perspective, was enabling multiple players to interact. Perhaps the earliest of these multiuser games was Multiuser Dungeon, or simply MUD, a text-based game written by Roy Trubshaw and Richard Bartle of the University of Essex, Colchester, England, in 1980 to run on a Digital Equipment DEC station. Because of the popularity of this game and its successors, the term MUD became a generic description for multiuser computer games.

A later MUD development was a switch in focus from games to interpersonal interaction. This kind of MUD has become a staple of on-line service providers, in the guise of Internet chat rooms. Another development was toward permitting a "dungeon master," that is, the person in charge of running and maintaining the MUD, to change the design of a game setting over time. This kind of MUD relies on object-oriented programming techniques to permit its creator to construct the initial game and later to modify it in a modular fashion. It is often referred to as a MUD, object-oriented (or MOO), and much of its appeal to on-line users lies in the fact that they can work together on extending and evolving it.

Doubtless the developers of the first MUD dreamed of having pictures and audio communication instead of just text. An early move in this direction was LucasFilm Ltd.'s Habitat. Created in 1985, it was the first on-line environment to let users design their own avatars and interact with other people using graphical representations of bodies and objects. The system, which ran on a Commodore 64 (from now-defunct Commodore International), was accessed through Quantum Computer Service (now America Online) and supported only 500 participants. Its successor, WorldsAway, added 3-D graphics but not audio communication. It is now available on CompuServe and is supported by Fujitsu Software Corp., San Jose, Calif.

From the military

The Distributed Interactive Simulation (DIS) world pursues simulations realistic enough for use in training the armed forces. A DIS system is required to support thousands of simultaneous users and to immerse them all in convincing 3-D. Typically, such a system demands expensive custom hardware.

In the '70s, the U.S. Department of Defense already had highly immersive simulators for teaching personnel how to operate military vehicles--such as planes, helicopters, and tanks--in their typical surroundings. Each simulator was a one-of-a-kind virtual-reality system. Trainees sat in a mock-up of the vehicle's cockpit or battle station, surrounded by multimillion-dollar, special-purpose, real-time graphics hardware for generating images of the simulated world around the vehicle.

Starting with Simnet, the simulation network effort of the early '80s, the Defense Department sponsored research on using dedicated high-speed networks to connect simulators. The idea was for trainees, though in separate simulators, to perform group exercises. Early on, it was decided to take existing simulators using incompatible formats (for 3-D models, for instance) and make them work together, rather than develop standards for portable content. This decision led to the IEEE DIS standard, whose sole focus is on how simulators in the course of a training session can communicate dynamic information, such as the position of objects like virtual troops, tanks, and helicopters. DIS has created a prosperous simulation industry and several highly successful military training systems--systems that support thousands of users simultaneously in rich, 3-D visual and auditory worlds.

As yet, DIS systems lack any standards for moving virtual objects and environments from one simulator to another (portable content standards) and so are awkward to extend. For example, to add a new visual object, such as a new tank or aircraft, to a multi-simulator system of this kind, simulator manufacturers must each construct a 3-D model of the new object and link it into their proprietary simulator software.

DIS does a good job of minimizing the constraints on how a simulator is constructed, but it gives little help for reusing simulator software. Major standards efforts are under way to develop a high-level system architecture and a run-time software infrastructure needed to build a common platform for implementing simulators. This approach would reduce system procurement and maintenance costs.

Can the twain meet?

The Internet and DIS worlds have been advancing largely in ignorance of each other's achievements. This is unfortunate, because each has a great deal to learn from the other.

For example, the Internet world wants to be home to DVEs with large numbers of users. But until its developers incorporate what the DIS world has learned about scaling up the number of users, they are unlikely to succeed.

Conversely, various government and subcontractor groups are trying to move DIS beyond the military training realm into a more general commercial arena. But they are unlikely to succeed until they incorporate key features, such as the ability to quickly move content from system to system, pioneered on the Internet.

DVEs with a full range of features will become available much sooner if the Internet and DIS worlds pool their respective strengths, rather than competing and condemning each other's weaknesses.

New on the Net

Now that more capable PCs and networks are proliferating, the Internet world is free to experiment with building virtual environments that provide 3-D immersion and spoken interaction. The effort includes proprietary systems, but there is also a strong movement toward open standards. Central to this movement is the Virtual Reality Modeling Language (VRML) standard, introduced in 1995 by the VRML working group, for portable VR content.

VRML first appeared as a standard for static 3-D models (VRML 1.0). Subsequently, it grew into a standard for content that can interact in real time with a single user (VRML 2.0). Currently, efforts such as the proposed Living Worlds and Open Community standards are attempting to expand VRML into a standard for DVEs.

There has been a rush over the past couple of years toward MUDs that, while still using text for conversations, represent the users with 3-D graphics (avatars) in a 3-D setting. These systems include Black Sun Interactive's CyberGate, Chaco Communications' Pueblo, Sony's CyberPassage, The Palace Inc.'s The Palace, and Worlds Inc.'s AlphaWorld and WorldChat. AlphaWorld is one of the most interesting of these systems because it supports a high degree of run-time extendibility. That is, users can design their own buildings and other models and add them to the environment, where visitors to the site can explore them.

At the time of writing, few Internet MUDs have taken the next step of substituting spoken communications for typed interaction between users. Two that have are Intel Corp.'s Moondo and OnLive! Technologies Inc.'s OnLive! Traveler. These MUDs use Internet phone software to transmit highly compressed sound between users. Unfortunately, until Internet latency shrinks (the round-trip delay now sometimes approaches a second or more), conversation will at best be stilted. Moondo and OnLive! Traveler point toward what will become possible as Internet quality improves.

Going further, CyberCampus, an experimental environment developed by NTT Software Corp., San Francisco, Calif., spices up 3-D graphics and live sound with limited video. In CyberCampus' simple 3-D world, users converse and interact through avatars whose "heads" display two-frame-per-second videos of their users' faces. But even at low frame rates and resolutions, the bandwidth needed dictates that ISDN (integrated-services digital network) lines be used to achieve low latencies.

Multi-player games have been installed for some time in arcades. They use dedicated wires and can cope with only a few users, who must all be at the same site and who can talk to each other only because they are sitting side by side.

Recently, though, Internet games for multiple players in remote locations have begun to appear. Among the most popular are Marathon by Bungie Software Products Inc., Doom and Quake from Id Software Inc., and the Warcraft series and Battle.net by Blizzard Entertainment. These games typically allow a few players, who are connected by 28.8-kb/s modems to the company's central game-software server, to interact over the Internet in the same game space. Sometimes, all the players can do is shoot at or race each other. Some of the games, though, have MUD-like text chat features. Special Internet service providers, such as Mpath Interactive Inc., Cupertino, Calif., are emerging to provide low-latency multiuser interaction and audio communication specifically for use in Internet games.

Enter multiuser VR

Virtual-reality (VR) research for most of its short history has taken aim at single-user applications. A key difference between this research and MUDs is that VR research, like the DIS world, tends to create more immersive DVEs by applying high-end graphics machines and special input devices. Machines of this kind from Silicon Graphics Inc., Mountain View, Calif., were used in the virtual-reality room called the CAVE, contrived by the University of Illinois at Chicago. Input devices include head-tracking devices and gloves that sense hand movements.

Now research into multiuser VR applications is on the rise. For instance, VR toolkits are being extended to let developers design VR environments in which a number of users can interact with objects. The multiuser tools developed by Division Ltd., Bristol, England, and Sense8 Corp., Mill Valley, Calif., are typical. Formerly, they were limited to a few simultaneous users, because they replicated identical, synchronized copies of the database on each user's system. Also, the toolkits' support for live audio has been weak, relying on the use of separate phone connections. But these kits are evolving rapidly to eliminate these shortcomings.
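
Simple counting shows why fully replicated databases limit the number of users. Assume, purely for illustration, that every user sends each change directly to every other user, with no multicast or server relay; the function name and the rate of 10 changes per second are likewise assumptions. The message load then grows roughly with the square of the number of users.

```python
def messages_per_second(users: int, changes_per_user_per_second: float) -> float:
    """Messages sent per second when each change goes to every other user."""
    if users < 1:
        raise ValueError("users must be at least 1")
    # Each of the `users` participants sends every change to the other users - 1.
    return users * (users - 1) * changes_per_user_per_second


for n in (2, 10, 100, 1000):
    print(n, messages_per_second(n, 10))
```

Going from 10 to 100 users multiplies the load by about 110 under this model, which is consistent with such toolkits being limited to a few simultaneous users.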

A notable VR system--built from the ground up to be shared by multiple simultaneous users--is the Distributed Interactive Virtual Environment (DIVE) system devised by the Swedish Institute of Computer Science in Kista, a suburb of Stockholm. DIVE supports 3-D visual and audio worlds. Like VR toolkits, it initially relied on standard replicated-database technology, limiting its scalability. Recent improvements in DIVE's database technology allow dozens of users to join in over a local-area network. But because of the high-bandwidth requirements of transmitting sound, only a few users can be accepted on a wide-area network.

The proposed Open Community standard is notable for supporting all of the key DVE features listed above. It is based on the Scalable Platform for Large Interactive Networked Environments (Spline). Developed by Mitsubishi Electric's MERL research laboratory, Cambridge, Mass., Spline was first used for an application called Diamond Park, and supports the DVE features by merging key features of the Internet and the DIS approaches to virtual environments.

To scale the system to numerous users, for instance, Spline builds on the "approximate database replication" idea on which DIS is founded. At the same time, the platform's design lets a world's content be ported to a wide variety of computer systems, and the world can be changed and added to at run time.
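
One technique the DIS standard uses to keep copies approximately in step is dead reckoning: each copy extrapolates an object's position from its last reported state, and the owner sends a fresh update only when that extrapolation drifts past a tolerance. The sketch below is a one-dimensional illustration of that idea. The class name, the threshold value, and the update rule are hypothetical and are not taken from Spline or from the DIS standard itself.

```python
from dataclasses import dataclass


@dataclass
class ReportedState:
    """Last state the owner broadcast: position, velocity, and timestamp."""
    position: float
    velocity: float
    time: float

    def extrapolate(self, now: float) -> float:
        """Position a remote copy would assume at time `now`."""
        return self.position + self.velocity * (now - self.time)


def simulate(true_positions, dt: float, threshold: float) -> int:
    """Count the updates sent for a sampled 1-D trajectory.

    The first sample is always sent. Afterwards an update is sent only when
    the remote extrapolation differs from the true position by more than
    `threshold`.
    """
    reported = ReportedState(true_positions[0], 0.0, 0.0)
    updates = 1
    previous = true_positions[0]
    for i, actual in enumerate(true_positions[1:], start=1):
        now = i * dt
        velocity = (actual - previous) / dt  # owner's current velocity estimate
        if abs(reported.extrapolate(now) - actual) > threshold:
            reported = ReportedState(actual, velocity, now)
            updates += 1
        previous = actual
    return updates


# An object moving at constant speed for five steps, then stopping.
path = [float(min(i, 5)) for i in range(11)]
print(simulate(path, dt=1.0, threshold=0.5), "updates for", len(path), "samples")
```

For this path only 3 updates are sent for 11 samples: the initial state, the start of motion, and the stop. The remote copies are slightly wrong between updates, which is the price of the reduced traffic.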

Multiuser VR systems are admittedly a little ahead of the market. Relying as they do on high-end graphics hardware and relatively high-speed networks, they are inappropriate for current PCs connected through the Internet. Instead, they anticipate the situation a year or two from now, when PCs with plug-in graphics cards will be as powerful as current high-end graphics hardware, and the Internet will have lower latency and much greater capacity. MERL is currently implementing a revised version of Spline that will run on high-end PCs.

We are nearing the threshold at which full-scale DVEs become practical for general use. Such environments are so new that no one can predict confidently what their most important application will be. However, the successes of both MUDs and military simulators suggest that DVEs will fulfill many needs. Now is the time, as Mao Zedong said, for "letting a hundred flowers bloom and a hundred schools of thought contend...." Each new experiment must exploit to the full the findings of its precursors; the riches of this new medium will not be grasped until many have tried, failed, and tried again.


About the authors

Richard C. Waters is a research fellow at Mitsubishi Electric Corp.'s MERL research laboratory in Cambridge, Mass. Co-architect of the Spline platform for distributed virtual environments, he is a research affiliate of the Massachusetts Institute of Technology (MIT) Artificial Intelligence Laboratory, where he earned his doctorate.

John W. Barrus is a research scientist at MERL who, with Ilene Sterns and Stephan McKeown, led the design and construction of Diamond Park. He is also a lecturer in MIT's mechanical engineering department, where he earned his doctorate.


Spectrum editor: Richard Comerford

(c) Copyright 1997, The Institute of Electrical and Electronics Engineers, Inc.