Communication Institute for Online Scholarship

The Electronic Journal of Communication / La Revue Electronique de Communication

Volume 16 Numbers 1 & 2, 2006


Daphne Economou
University of the Aegean (UoA)

Abstract. The need for a real-world driving problem to guide technology development has long been recognised. However, this does not guarantee the identification of requirements for technology development. This paper argues that a more systematic approach is needed for choosing and making best use of a driving problem for Collaborative Virtual Environments (CVEs) technology.

The approach involves: choosing a problem area and application; identifying application requirements through low-tech prototyping; developing a set of application design guidelines; designing and building an application using CVE technology; and studying the application in use.

The approach is illustrated by considering the development of a particular CVE technology known as Deva. The problem area chosen was the use of CVEs in museum education. An application was developed based on an ancient Egyptian game (Senet) and aimed at children at Key Stage Level 2 of the National Curriculum for education in England.

The research results are based on observations with real users. The first phase formed a 2D single display groupware system where interactions took place face-to-face in the “real-world” (external to the CVE). The second phase was a conventional 2D multi-user groupware environment in which the users were remotely located and interactions were internal to the CVE. Observations of these two phases formed design guidelines that shaped the third phase of the study. This was built on the Deva CVE system, a 3D multi-user groupware environment in which the users are remotely located and interactions are internal to the environment.

1  Introduction

It has long been recognised that meaningful advances in computer technology can only come about by using it to build prototypes against the demands of real applications (Brooks, 1988). This “driving problem” philosophy sees the choice of a good problem and collaborators as essential to advancing the technology.

In the field of Virtual Reality, the considerable demands placed upon the technology in order to generate even the most basic of environments have meant that the choice of driving problem has traditionally been directed mainly by technological concerns. For example, the single-user MAVERIK system (Hubbold et al., 1999) used as one of its driving problems the visualisation of real-world process plants. These CAD models constitute very complex man-made objects, the rendering and management of which does not yield to traditional graphical optimisations or generic virtual environment techniques.

CVE systems such as DIVE (Carlsson & Hagsand, 1993; Benford et al., 1995), MASSIVE (Greenhalgh & Benford, 1995) and Diamond Park (Waters et al., 1997) have been used mainly to investigate technical issues such as the system level mechanisms for networking, event distribution or provision of audio or video links between users (Benford, Greenhalgh & Lloyd, 1997). Even the investigation of virtual human representations has been oriented towards technical issues such as the need to improve rendering techniques in order to maintain performance (Capin et al., 1998).

However, the purpose of a CVE is to support the processes of collaboration between users. There is an existing body of work looking at user needs but this is primarily from the perspective of usability (Kaur 1998; Kaur et al. 2000a, 2000b; Stanney, Mourant & Kennedy, 1998). What is needed is to broaden that perspective to recognise the situated and social nature of the processes in collaboration. It is thus necessary to study a “real world” situation to determine “real” requirements for CVE technology. Problems that determine the success or failure of a system can only arise in such a situation.

This paper argues that the multi-user, social nature of CVE technology means that the choice and use of driving problem must be determined not just by technological concerns but also by the needs of the end user, and that the driving problem domain must be selected with care. This paper begins by describing the development of a CVE system called Deva (Pettifer & West, 1999), and details the various stages and driving factors during its development (Section 2). A method is then presented for selecting and using a driving problem. The method is then illustrated by describing the Senet project, which involves using Deva to create a multi-user board game for use in museum education. The choice of problem area and the Senet application is first described (Section 3). A set of application requirements was identified by the study of “low-tech” prototypes (Section 4). A rigorous methodology for studying social interaction in CVEs is described (Section 5). From these requirements a set of design guidelines was derived (Section 6). By using these design guidelines to implement the Senet application in the Deva CVE, a set of technology requirements can be identified (Section 7). A study of the Deva application in use led to a further set of technology requirements (Section 8).

2  Historical Development of the Deva VE System

In 1991 the Advanced Interfaces Group at the University of Manchester set out to develop a multi-user Virtual Reality system for dealing with large-scale virtual environments. The result was a prototype system called AVIARY (Snowdon & West, 1994) that included a novel framework for the management of multiple environments. At the time, limited support both of hardware and software for networking and rendering meant that most of the development effort had to be aimed at just making the system work, rather than satisfying the extensive real-time constraints of VR.

A prototype application was constructed for the system based on real-world aviation data, and aimed loosely at providing Air Traffic Control (ATC) personnel with a 3D visualization of aircraft flight paths.  With no real hardware support for 3D graphics and limited networking available, it was clear that the gulf between what might in principle be possible in Virtual Reality and what could in fact be supported by existing technology was significant. Though very much a prototype and thus inappropriate for “real-world” use, the ATC application highlighted important issues in the system's architecture. In particular, bottlenecks were identified in the inter-process communications and rendering models used by AVIARY and many other comparable systems.

Design effort shifted from AVIARY to two complementary new systems: MAVERIK, a single-user rendering and spatial management kernel aimed at eliminating the graphical bottleneck; and Deva, a distributed object management layer designed to provide coherent shared virtual environments over wide-area networks.

In spite of the substantial effort required, and in the absence of other systems that could be modified to suit the new architectures, it was decided to develop these systems “from the ground up.” In the case of MAVERIK, the implementation of which is essentially complete, the decision to implement a “pure” system rather than a demonstrator of concept based on existing system software appears to have paid off. MAVERIK has been used successfully to implement single user applications that would be significantly more difficult to implement using other methods (Hubbold et al., 1999).

MAVERIK has the advantage of dealing with relatively “low level” problems, such as rendering and spatial management. MAVERIK provides a novel architectural solution to a problem area that is primarily technological and where the difficulties are well understood (e.g. maintaining an appropriate frame rate for interaction, rendering scenes requiring diverse representations, handling user input appropriately). In this sense the requirements placed upon the system are clearly defined, and its ability to meet these requirements is readily measurable (e.g. take an application that is difficult to render, and measure the resulting frame-rate).

On the other hand Deva aims to address issues that are much less clearly defined, such as the relationships between users in a CVE, or the affordances of the environment itself. Finding a meaningful measure of success or an application that clearly drives the development of the system's support for CVEs has proven to be difficult. The following subsections outline two existing applications that have been constructed using Deva and MAVERIK, highlighting their difficulties as driving applications.

2.1  The Fishcages

This somewhat esoteric “application” developed along with the system, and is no more than an aggregation of the novel features of Deva and MAVERIK. It consists of a number of virtual cages in which virtual fish of different types swim about (see Figure 1). Users (usually two for demonstration purposes) are represented by animated avatars, and may interact in simple ways with the environments that are represented by the cages and their contents, with one another, and with the environment in which they themselves exist.

Figure 1. The Fishcages

The application demonstrates the system’s novel object orientated engine, its distributed nature and its integration with MAVERIK. In spite of its unusual nature, it has proven a useful tool for describing the architecture of the system. However, it in no way drives the development of either system in anything other than the most trivial manner (e.g. “It would be nice to have a different sweater on the second user's avatar”) since it is a demonstration of existing features.

2.2  The Distributed Legible City

The Distributed Legible City (DLC) was the first “real” application built using Deva. It was an attempt to build an engaging environment for multiple users in which to study social interaction within a virtual world. Indeed, other than fulfilling this goal of being an “engaging environment” - that is providing some reason for interaction - there were no other real end user requirements. The decision to base the environment on an evolution of Jeffrey Shaw’s 1990 multimedia installation The Legible City (Shaw, 1998) was purely opportunistic.

The art piece on which it is based consists of a darkened cuboid room in the centre of which is mounted a modified “touring cycle” facing a large back-projected screen. Seated on the bicycle, the visitor to the installation is presented with a three dimensional “street-level” view of one of three cities: Manhattan, Karlsruhe or Amsterdam. A liquid-crystal display, mounted on the handlebars of the bicycle, shows an overview map, including the position of the cyclist in the virtual world, and a single large button transports the user between the three cities. Physically pedalling and steering the cycle causes the viewpoint in the virtual environment to move accordingly. Each city is represented in the virtual world not by buildings and traditional “street furniture”, but rather by solid letters forming sentences from texts appropriate to the location (see Figure 2).

Figure 2. The Distributed Legible City

The original version of The Legible City was a single user virtual environment, installed within a purpose-built room, consisting of a custom built and instrumented tour bicycle, liquid crystal display and high-end graphics engine and projection system. The software responsible for rendering the letter-filled streets of the city was written in “raw” OpenGL.

The Distributed Legible City extends the original work to include multiple participants, and aims to provide an engaging shared virtual environment in which “cyclists” seated on modified exercise cycles and situated at geographically distant sites can tour together around the three virtual cities, communicating with one another via an audio link and headset.

Two versions of the installation were constructed. The first consists of a 21-inch monitor mounted in front of the user's cycle as the main output device. The second adds a head-mounted display for each user and retains the monitor only as a secondary output device for onlookers to watch the goings-on in the environment.

The move from “desktop VR” (using a fixed monitor) to an immersive environment (using a head mounted display) was prompted by results from ethnographic study of the installation in situ (Pettifer et al., 1999). In addition to influencing the development of the piece, the studies revealed that actual user behaviour in the environment diverged radically from initial expectations and from observed behaviour of single users in the original piece (i.e. whereas in the first piece there was nothing but the environment to investigate and engage with, in the distributed version the interaction was dominated by issues involving finding other inhabitants in the virtual world). From a system designer's point of view at least, it became clear that social interaction and user behaviour in such environments is poorly understood and that “reasonable assumptions” made in the design phase were badly founded.

In terms of general principles that guide future development of the system and its applications, however, it has proven difficult to extract clear design issues from such an experience. The difficulty primarily arises from the “undefined” nature of an art piece, what is expected from it, and what is expected from those experiencing it.

For example, the initial design of the Distributed Legible City was considerably more ambitious than the eventual installation, including such notions as deserted areas between the cities where words spoken by the users during conversation influenced the text of the surrounding environment, creating new word-structures based on the interchanges between the inhabitants. Other ideas included clouds of “dust” formed by such utterances that would be thrown out from the back of the cyclist's biker-avatar as if kicked up from the road by the rear wheel. In practice, such extensions to the piece turned out to be impractical to implement during the time available.

However, with goals of making “an environment that will engage the users” and “providing freedom of artistic interpretation”, determining what the end user requirements are is difficult (particularly since it is hard to identify what is actually “engaging” and what might constitute “successful interaction” under such circumstances). In this instance it would seem that such details as the introduction of new buildings based on conversation would have made little improvement to the environment, since other issues, primarily locating one another and achieving a “comfortable” conversational co-orientation, dominated user behaviour. In spite of the difficulties uncovered during the design of such a widely specified virtual environment, the work could be considered successful for this finding on user interaction alone. However, we believe that the design process may be improved by careful selection of the driving problem.

The Legible City project followed an open-ended approach of “build it and see what happens”. In contrast, the Senet project followed a more systematic approach.

3  A Method For Using a Driving Problem

The case study described in the previous section raises several concerns about the use of a driving problem. Firstly, what are the essential properties that make something a good driving problem? Secondly, how can one ensure that the application developed is “true” or “valid” to the end users? Thirdly, how should the application be studied in use so that requirements for the technology can be derived?

3.1  A Stakeholder Approach

To address the first and second concerns it was considered essential to identify from the outset who had a stake in the project and what that stake might be. This stakeholder method of working evolved out of the Soft Systems approach (Checkland & Scholes, 1990) used for studying the application area (museum education) (Mitchell, 1999). This emphasised the importance of identifying stakeholders and the context in which systems will be used. In the Senet project the set of stakeholders consisted of:

  • technologists
    The Manchester University researchers responsible for developing and implementing the application in Deva.
  • designers
    The Manchester Metropolitan University (MMU) researchers responsible for investigating the problem area and designing the application.
  • evaluators
    The MMU researchers responsible for conducting user studies of the application.
  • end users
    Primary school children (~8-12 years) and teachers.

Other sets of stakeholders from the problem area can also be identified, such as the Manchester Museum itself, and educators.

A good problem area can ensure that the application to be developed is one for which end users can see a real need. This will help to ensure that the users will be motivated to use the application and help in obtaining “real” users for evaluation.

3.2  Choice of Problem Area and Application

Problems arising in a real world situation can determine the success or the failure of the system (Gunton, 1993). In order to study an authentic learning activity the research was based around the work of Manchester Museum's Education Service (Mitchell, 1999).

This service caters for school visits to the museum aimed at Key Stage Level 2 (9-11 years old). It provides access to a wide range of Museum artefacts relevant to subjects in the National Curriculum for education. One particular strength of the Museum, which plays a major part in the Education Service's teaching, is its collection of ancient Egyptian artefacts of everyday life from the town of Kahun. The artefact chosen as the basis of the learning activity in this research is Senet - a board game for two players. Players take turns to throw a die. The object of the game is to “bear off” your 10 pieces first. Through the activity and a collaborative process the children become familiar with the artefact and, by using it, learn how it was played.

Developing a CVE based on Senet provides a good testbed for various CVE properties. It allows object manipulation (the board, die and pieces), individual operations, as well as operations in pairs or as larger groups. In terms of collaboration it allows co-operation (to learn the game) as well as competition (to win the game). The game situation allows a range of teaching styles from traditional instructional methods (e.g. explaining the rules in advance) to constructivist methods (learning by playing). Current educational thought recognises the need for sociocultural methods that emphasise the social roles of teachers and learners (Soloway et al., 1996). A more practical impetus for collaborative learning has come from two main sources in the UK. The National Curriculum for education places great emphasis on such learning.

The game supports the needs of experimentation in various ways. It is a fairly well structured task (the players have to follow certain steps to learn the rules and play the game). The length of the time required to play matches well the length of time the children could take part in a task before becoming restless (30-45 minutes). Players’ knowledge assessment can occur in a fairly unobtrusive manner (e.g. by observing if they follow the rules).

3.3  A Phased Approach

To address the third concern, an exploratory approach to user studies was adopted. Steed & Tromp (1998) distinguish evaluation as belonging to either a scientific enquiry framework (concerned with the study of specific phenomena) or a usability engineering framework (concerned with measuring the effectiveness of a system). The studies being carried out are not evaluations but observations of what is going on. The work can thus be seen as belonging to the scientific enquiry framework.

Roussos et al. (1999) support the need for such exploratory work, which involves building novel learning applications and carrying out informal evaluations of them. Past studies have dealt with users with ready access to the technology. However, it is necessary to recognise the situated nature of the processes in collaborative learning. It is necessary to study a “real world” situation to determine the CVE requirements. Problems that determine the success or failure of a system can only arise in such a situation.

One problem faced by CVE research is that the current immaturity of the technology does not allow the full potential of CVEs to be exploited. This means that many of the applications developed so far have been of a prototypical nature. There are two issues in respect of the prototypical nature of applications. First, it is often not feasible to create different conditions for experiments within the time and effort available, so the process of studying specific phenomena is constrained. Second, defects in the prototypical functionality of the application might cause difficulties in conducting studies with real users (Steed & Tromp, 1998). The technology is not mature enough to afford the activities that such complicated environments require.

Another problem faced in CVE research is the vast number of factors involved in the construction of CVEs for learning. Kaur (1998) has identified 46 design properties to be considered when designing VEs for usability. The number of factors increases dramatically when considering communication and collaboration issues in CVEs (Johnson, Stiles & Munro, 1998). This makes it difficult to isolate which design decisions are responsible for the overall effectiveness of the environment. It is also difficult to identify the interplay between various factors (e.g. the effects that usability issues have on pedagogic issues).

To overcome these problems application development was divided into three distinct phases (Economou, Mitchell & Boyle, 2000). The applications across the three phases differ in three main ways:

  • population: the degree to which the environment appears to be populated by other people: semi-populated (the user sees other virtual actors present); fully populated (the user can represent themselves via a virtual actor)
  • 2D/3D: use of a 2-dimensional environment simplifies issues relating to navigation and the way in which objects are manipulated
  • external/internal interaction: whether user interactions take place outside or via the computer (see Figure 3).

Figure 3. (a) Interactions external to the system (C=child, E=expert), (b) interactions internal to the system (C=child, E=expert).

Each phase addressed a subset of the range of factors in CVEs and formed a particular situation to be studied. In the first two phases of the project a “low-tech prototyping” approach was adopted. In the first phase a single display groupware prototype was constructed to study issues relating to playing the Senet game. In the second phase a conventional groupware prototype was used to study issues involving interaction and communication between remotely located users whilst playing the game. The results of these user studies effectively formed a set of application requirements from which design guidelines could be derived. These design guidelines were then used to implement the application using the Deva technology in the third phase of work.

The phased approach provides several benefits, such as managing complexity by dealing with a limited, manageable set of factors in each phase (e.g. 2D/3D and population) and allowing the results of each phase to inform subsequent phases. Thus, requirements can be progressively identified. The use of more robust technologies allows the essential features of the situation (interactivity and social communication) to be studied with real users in a way not possible with more immature and inaccessible CVE technology.
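The way the phases isolate factors can be made concrete with a small sketch. The record layout and the names Prototype, PROTOTYPES and changed_factors are illustrative assumptions, not part of the Senet project; the factor values are taken from the descriptions of the prototypes in this paper, and a factor is left as None where the text does not state it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Prototype:
    """One prototype described by the three factors that vary across phases."""
    name: str
    phase: int
    dimensions: str            # "2D" or "3D"
    interaction: str           # "external" or "internal" to the computer
    population: Optional[str]  # "semi", "full", or None when not stated


# Values follow the text; the labels themselves are hypothetical.
PROTOTYPES = [
    Prototype("single display groupware", 1, "2D", "external", None),
    Prototype("P2.1", 2, "2D", "internal", "semi"),
    Prototype("P2.2", 2, "2D", "internal", "full"),
    Prototype("P2.3", 2, "2D", "internal", "full"),
    Prototype("Deva application", 3, "3D", "internal", None),
]

FACTORS = ("dimensions", "interaction", "population")


def changed_factors(a: Prototype, b: Prototype) -> List[str]:
    """List the factors that differ between two prototypes.

    A factor is skipped when either value is unknown (None), so only
    differences that the text actually states are reported.
    """
    changed = []
    for factor in FACTORS:
        va, vb = getattr(a, factor), getattr(b, factor)
        if va is not None and vb is not None and va != vb:
            changed.append(factor)
    return changed


if __name__ == "__main__":
    for earlier, later in zip(PROTOTYPES, PROTOTYPES[1:]):
        print(earlier.name, "->", later.name, ":", changed_factors(earlier, later))
```

Under these assumptions each step between consecutive prototypes changes at most one of the recorded factors, which is the sense in which complexity is kept manageable. The sketch does not capture the increased number of participants in P2.3.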

3.4 The Method

In summary, the method consists of the following stages:

  • choice of problem area and application
  • identification of application requirements via “low-tech prototyping”
  • development of design guidelines
  • design and implementation of the application using the Deva CVE technology
  • study of the Deva application in use

Technology requirements were identified in three main ways:

  • by identifying changes that needed to be made to Deva in order to implement the application according to the design guidelines
  • by identifying design guidelines that could not be followed when implementing the application
  • by studying the Deva version of the application

To illustrate the method, the following sections look at how it was applied during the Senet project.

4  Identification of application requirements

4.1  Single-display groupware prototype

For the first set of studies, a prototype application was developed that took the form of a single display groupware (Stewart, Bederson & Druin, 1999). Users see the Senet board and pieces and can also access the rules of the game (see Figure 4). Users sit next to each other and view the application on a single, shared display (see Figure 3(a)). The interactions between them were external to the computer. The prototype was constructed using established 2D multimedia technology. This helped simplify issues surrounding navigation of the environment and the ways in which objects could be manipulated.

Figure 4. First phase prototype, single display groupware

The prototypes were observed in use by the general public during an open week at Manchester Museum. Observations of school children were also conducted under more controlled conditions. Owing to the nature of both the activities and the environment in which they took place, note taking was used for data collection.

The studies aimed to understand and identify the factors involved in a real world game playing situation:

  • the types of interactions that occur between the users and the game environment
  • the communication between users (content and modes)
  • the roles that the users adopt in a game playing situation
  • controls over the communication and the game playing activity

The purpose of this study was primarily exploratory in nature. It gathered a rich set of qualitative information and identified usability issues surrounding the prototype that informed the design of environments developed in subsequent studies. Technical issues, such as interface decisions in the prototype, were also evaluated and informed the development of subsequent prototypes. The studies also gathered requirements related to experimental settings and conditions for organising controllable provision for future studies. What stood out was mainly:

  • the rich range of interactivity and social communication that needs to be supported in CVEs for learning
  • the importance of the expert being aware of and able to control even such a seemingly well structured activity as game playing.

4.2  Conventional groupware prototype

The second set of studies was driven by factors outlined in the literature, as well as in the first phase. It focused on understanding the ways these factors changed when the interactions and communication between users were internal to the environment. It also looked for possible new factors arising in remote communication and interaction and the way these affect the users’ behaviour. The focus of the second set of studies was:

  • interaction
    Issues related to users’ interaction with objects in the environment and the environment itself.
  • communication
    The internalisation of communication made turn-taking a key issue for study in the second phase of the project. Issues included: the communication content in different stages of a session; the communication modes involved for delivering certain topics (textual, pictorial, deictic); and the efficiency of the tools the system provided.
  • pedagogy
    The studies determined pedagogical tactics used for delivering various topics and the pedagogical method they fall into. Points of great significance were the expert’s control over other users and over the situation, and the roles that the users adopted under certain states of affairs.
  • appearance
    It dealt with the presence of the virtual actors in the CVE and how the users’ actions were associated to the virtual actor.
  • awareness
    It dealt with the users’ perception of each other’s: status of activity (e.g. users playing, typing a reply, reading); intention of action (e.g. users being ready to take a turn to talk or play); association of actions to users; and the contribution of the users’ representation in the environment to this awareness.

The second set of studies was more focused in comparison with the first phase studies. For the purpose of the studies three prototypes were developed, which took the form of conventional groupware systems (see Figure 6). Participants were remotely located so interactions between them were internal to the computer (see Figure 3(b)). The prototypes were developed using 2D multimedia tools coupled with groupware technology typical of that used in education. The prototypes also introduced the concept of population to the environment. One prototype was semi-populated (the child could see a virtual actor representing the expert) (see Figure 5 (a)) (P2.1) and the other two prototypes (P2.2, P2.3) were fully populated (the child could also see their own virtual actor) (see Figure 5(b,c)). Users communicated by typing text in chat boxes associated with their own actor or using a hand for pointing (see Figure 5(b,c)).

The groupware prototype was observed in use at Knutsford High School over a period of three days. Twenty-two children (11 pairs) participated in the studies. The subjects were twelve-year-old children (year 7). Two rooms were used (see Figure 6). One contained a researcher playing the role of the “expert” and the second contained one or two children working on individual computers accompanied by a second researcher (the helper). In the studies using the first two groupware prototypes (see Figure 5(a,b)) only one child used the environment, the second child accompanying the expert and adopting the role of an observer (see Figure 6(a)). In the studies using the third prototype (see Figure 5(c)) both children used the environment (see Figure 6(b)).


Figure 5 (a) 2D semi-populated, dialogue external to the game environment (P2.1), (b) 2D fully-populated, dialogue internal to the game environment (P2.2), (c) 2D fully-populated, dialogue internal to the game environment, increased population (P2.3).

The session lasted approximately 45 minutes. Basic instructions about the system were given to the children at the start of the session and they were instructed to ask the expert for support. The expert and the child were both video taped. The text typed in the chat boxes was written to a file. Each session was followed up by an interview with the children about their experience, which lasted approximately ten minutes and was tape recorded.


Figure 6. (a) The physical set up of the first two studies in the second phase using P2.1, P2.2. (b) The physical set up of the third study of the second phase using P2.3. H=helper providing technical support. AC=active child. E=researcher playing the role of the expert. OC=observing child.

5  A Rigorous Method for Studying Social Interaction in CVEs

Despite the exploratory nature of the work, the primary purpose of the method to be adopted is a form of requirements gathering that follows rigorous steps to enable the identification of design factors in a way that can directly inform CVE systems design. Candidate methods such as conversation analysis (Atkinson & Heritage, 1984; Boden & Zimmerman, 1991; Silverman, 1997) and discourse analysis (Coulthard, Montgomery, & Brazil, 1981) are narrowly focused on issues surrounding the dialogue itself. Intimate and subjective study of human activities and interaction requires a permanent record of naturally occurring events (e.g. field notes, video and audio) (Luff, Hindmarsh, & Heath, 2000). Ethnographic approaches contribute to understanding the production of social actions and activities and recognise the activities of others. However, when coupled with video, these methods result in a vast amount of rich qualitative data. The complexity of dealing with video data has been recognised by a growing number of researchers (Silverman, 2000). Not only is the data unmanageable, but the moment-to-moment detailed analysis is notoriously time consuming (Allen, 1989; Neal, 1989). The information is interrelated and is difficult to separate and rationalise. Viller & Sommerville (1999) argue that it is difficult to draw design principles and other abstract lessons from a technique that is concerned with the detail of a particular situation. Thus, it is difficult to make generalisations about design factors related to CVEs. The analysis also needs to be practised by a group of analysts to overcome subjectivity.

One method for which video technology is essential is interaction analysis (Jordan & Henderson, 1998). This method has its roots in the social sciences and sees knowledge and action as fundamentally social in origin, organisation and use. It studies human activities, such as talk, non-verbal interaction and the use of artefacts and technologies. It is primarily defined by its “analytic foci” or ways to describe a video tape. Such foci include: structure of events, temporal organisation of activity, turn-taking, trouble and repair, and spatial organisation of activity. Important to interaction analysis is the data analysis by a group of analysts, which goes some way to countering subjectivity of analysis. However, group-based analysis is not always possible (as in the case of the Senet project) because of resource limitations.

The proposed solution that addresses these problems is the creation of an analytic grid that can be used to generate numerical values from the qualitative data. For example, if the factor to be studied is the physical activities that the virtual actors could perform to improve communication and interaction, then the quantitative information derived from the qualitative data should indicate in which circumstances, and for what purpose, certain physical activities have been used. In quantitative form the data is much more manageable and can be linked forward more reliably to the design factors that are developed.

The analytic foci and orientation adopted in the method used to study the Senet project, outlined next, are based on, and add to, the interaction analysis foci. The method follows rigorous steps for organizing experimental settings, collecting and analysing data, and provides the means of managing large amounts of disparate data (video tapes, field notes, text files) (Economou & Pettifer, 2004). It consists of seven main steps, which are carried out sequentially:

  • data collection
  • transcription
  • chunking of the transcription
  • creation of a grid
  • application of the grid
  • analysis at the session level
  • derivation of design guidelines

The data collection step involves keeping a record of all the actions, activities, and dialogue that took place during the study. This record needs to be organised and analysed in order to extract design guidelines.

The transcription step involves creating one account of the session (the game playing activity) by combining the data collected about communication and interactions taking place internally and externally to the prototype.

Each session is divided into three main ethnographic chunks: stages, segments and turns. Stages are defined by changes in the communicated topic (e.g. expert explains the system tools, children set up the board, children play the game). Each stage is subdivided into segments that are marked by the pedagogical tactic adopted for the delivery of a topic. A segment is divided into a set of turns.
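The three levels of chunking can be pictured as a nested data structure. The following is only a minimal sketch under stated assumptions: the class names, the field names, the tactic labels and the `turn_ids` helper are hypothetical illustrations and are not part of the tools used in the Senet study.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Turn:
    speaker: str   # illustrative labels such as "expert" or "child A"
    content: str   # what was typed, said or done in this turn


@dataclass
class Segment:
    tactic: str    # pedagogical tactic that marks the segment (hypothetical labels)
    turns: List[Turn] = field(default_factory=list)


@dataclass
class Stage:
    topic: str     # communicated topic that defines the stage
    segments: List[Segment] = field(default_factory=list)


def turn_ids(stages: List[Stage]) -> List[str]:
    """Label every turn by its location in the session: stage.segment.turn (1-based)."""
    ids = []
    for s, stage in enumerate(stages, start=1):
        for g, segment in enumerate(stage.segments, start=1):
            for t, _turn in enumerate(segment.turns, start=1):
                ids.append(f"{s}.{g}.{t}")
    return ids


# Invented example session, not data from the study.
session = [
    Stage("expert explains the system tools", [
        Segment("demonstration", [
            Turn("expert", "shows the chat box"),
            Turn("child A", "types a greeting"),
        ]),
    ]),
    Stage("children set up the board", [
        Segment("guided practice", [Turn("child B", "moves a piece")]),
    ]),
]

print(turn_ids(session))
```

A location label of this kind corresponds to the "turn chunk" column that is common across the grids described below.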

Central to the method is the use of a grid containing a set of analytic categories that provides a way for studying and managing the rich, qualitative data. The analytical categories identified included:

  • physical activity (physical movements of the user such as: head movement, facial expression, position of the body, movements of the rest of the body)
  • communication activity (the modes of communication used: text, pointing, speech, body language)
  • turn taking (how turn boundaries were marked, interruption mechanisms users employed)
  • external intervention (when complete breakdown occurs and real world intervention is needed)
  • pedagogy (issues related to who adopted the teacher’s role: expert, co-player, helper, topic being covered, pedagogical tactics employed, change in tactic, reason for the tactic being adopted)

The analytic categories were derived from:

  • a framework of design factors based on previous work in the area: virtual actors’ appearance; awareness; object manipulation; communication content; communication modes; turn-taking; the users’ role in the situation (Bigge & Shermis, 1992; Kaur, 1998; and Benford et al., 1995)
  • the outcome of studies using the single display groupware application regarding: the roles the users adopt in a learning situation; communication content and modes; children encountering problems and how this affected the users’ behaviour
  • a preliminary analysis of selective transcriptions of the conventional groupware application (this was to minimise possibility of ignoring factors that have not arisen in the previous situations)

Columns common across the grids identify: the turn chunk (location of the turn within the session); the location of the action (this indicates the relative actions internal versus external to the prototype); and the description of the chunk (this includes the content of the turns).

The grid generates quantitative information out of qualitative data at the turn or segment levels. Subsequent analysis across a whole session, as well as comparisons between sessions, allows identification of patterns of user behaviours.
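How a grid turns coded observations into counts can be sketched as follows. This is an illustrative sketch only: the row layout loosely follows the common columns described above (turn chunk, location of the action) plus an analytic category and a code, but the function names and all of the rows are invented for the example and are not data from the study.

```python
from collections import Counter
from typing import List, Tuple

# One grid row: (turn id, location of action, analytic category, code).
GridRow = Tuple[str, str, str, str]

# Invented rows for illustration only.
rows: List[GridRow] = [
    ("1.1.1", "internal", "communication activity", "text"),
    ("1.1.2", "internal", "communication activity", "pointing"),
    ("1.1.2", "external", "external intervention", "helper assists"),
    ("2.1.1", "internal", "communication activity", "text"),
]


def tally(grid: List[GridRow], category: str) -> Counter:
    """Count how often each code occurs within one analytic category."""
    return Counter(code for _, _, cat, code in grid if cat == category)


def tally_by_location(grid: List[GridRow]) -> Counter:
    """Count actions internal versus external to the prototype."""
    return Counter(location for _, location, _, _ in grid)


print(tally(rows, "communication activity"))
print(tally_by_location(rows))
```

Counts of this kind can then be compared across segments, sessions and prototypes to look for patterns of user behaviour.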

The final stage of the method translates key points deriving for each analytic category of the grid into design guidelines (DGs). The findings from all sessions and for all the analytic categories of the grid are considered. DGs need to be precise. Providing guidelines with extra information and examples reduces the chances of the guideline being too vague or conflicting (Reisner, 1987). The method follows a model of reporting DGs for usability in CVEs that is determined by four parts:

  • design guideline, which reports the DG that needs to be incorporated
  • motivation, which argues the importance of the DG based on the phases’ results
  • benefit, which discusses how applying the DG addresses the issues that prompted its creation (depending on context, some DGs may have a negative effect in the CVE; this can be identified by evaluating the DGs, which may in turn call for the derivation of further DGs to overcome such problems)
  • examples, one or two examples of the practical implementation of the DG

This method is based on Kaur’s method of reporting DGs for usability in VEs (Kaur, 1998).

The 7-step method has been applied to the second phase of the study and yielded a preliminary set of DGs (Economou, 2001; Economou & Pettifer, 2004), which directed the development of the third phase prototype CVE. The method has subsequently been applied to the third phase of the project to evaluate the effectiveness of the implemented preliminary DGs and to investigate new factors arising in a 3D CVE for learning. The method can be repeated according to the analytic categories to be studied.

The following section presents the preliminary set of DGs, which was derived from the second phase of the project.

6  Derivation of a Preliminary Set of Design Guidelines That Led the Design of the Senet Prototype in Deva

Twenty-two preliminary design guidelines (PDGs) were derived by applying the grid to the second phase of this study. The context of their use is related to the following aspects of CVEs for learning:

  • environment, which addresses issues related to general tools that CVEs for learning should provide
  • objects, which addresses issues regarding the features of the objects contained in CVEs for learning
  • virtual actors, which addresses issues regarding the virtual actors’ features in CVEs for learning
  • virtual actor behaviour, which addresses issues related to the behaviours that virtual actors with different roles in CVEs for learning should incorporate

Virtual Actor behaviour includes behaviours for two categories of users:

  • the students, naïve users who did not know the rules of the game and were new to the experience of participating in a CVE application
  • the teacher, the knowledgeable user in the CVE, who did not play, but knew the rules of the game, was aware of the process to be followed, and who assisted the children and provided guidance and support

The following sections present the PDGs and discuss how they led the development of the Senet application using the Deva CVE technology.

In the Deva Senet prototype two children were playing against each other and the expert took the role of the mediator. The users are remotely located, they have individual displays (a monitor, or a head-mounted display) and input devices (e.g. 3 button mouse or a 3D mouse), and the interactions between the users are internal to the environment (see Figure 7).

It was not possible to follow all of the PDGs. If a PDG was followed, then either Deva already supported the implementation or Deva had to be changed. If a PDG was not followed, this was either because Deva could not be changed to support the guideline or because of a design decision. The following sections discuss which PDGs were followed and which were not, given the technological limitations of the Deva CVE technology.

6.1  Environment

PDG1: Simultaneous control

Unlike the NetMeeting prototype there is no shared pointer in Deva. Users can type or move objects independently of each other.

PDG2: History of communication

Displaying the comparatively large amounts of text accumulated during a session is unrealistic in a 3D environment for the reasons given below (see Section 6.2, PDG10). It would thus not be possible to have scrollable chat boxes within the Deva environment. Instead, each user's text message is also written to a transcript window which is external to the CVE. This separation was not seen as a problem, on the basis that when a child requires access to the historical logs to resolve an issue, the engagement with the current activities in the environment is inevitably interrupted to some extent (see Figure 7).

PDG3: History of physical activity

This guideline was not followed.
In Deva all the events that occur in the virtual environment are broadcast from a central server to the client processes that manage interaction and rendering. To provide a basic version of such a tool it would be necessary to log these events in a file and to replay the events from the file rather than from the server. This would not give perfect replay, as events are discrete rather than continuous, but it could provide a good first approximation.
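The log-and-replay idea can be sketched in a few lines. This is a minimal sketch under stated assumptions, not Deva's actual event format or API: the event tuple, the JSON-lines encoding and the names `log_events` and `replay` are all hypothetical, and an in-memory stream stands in for the log file.

```python
import io
import json
from typing import Callable, Iterable, List, TextIO, Tuple

# Hypothetical event: (virtual timestamp, object id, new state of the object).
Event = Tuple[float, str, dict]


def log_events(events: Iterable[Event], stream: TextIO) -> None:
    """Append each broadcast event to the log, one JSON record per line."""
    for timestamp, obj, state in events:
        stream.write(json.dumps({"t": timestamp, "obj": obj, "state": state}) + "\n")


def replay(stream: TextIO, apply: Callable[[str, dict], None]) -> List[float]:
    """Feed logged events back to a client in order; return their timestamps."""
    times = []
    for line in stream:
        record = json.loads(line)
        apply(record["obj"], record["state"])
        times.append(record["t"])
    return times


# Usage with an in-memory "file" and a dictionary standing in for the client.
world = {}
log = io.StringIO()
log_events([(0.0, "piece_1", {"square": 1}), (1.5, "piece_1", {"square": 4})], log)
log.seek(0)
times = replay(log, lambda obj, state: world.update({obj: state}))
print(times, world)
```

Because only discrete events are stored, the replayed client sees each object jump between logged states, which matches the "first approximation" described above.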

PDG4: Permanent information resource present

The rules of the game were displayed on the walls of the environment as texture maps (see Figure 7).

Figure 7

Figure 7. Senet prototype in Deva

6.2  Virtual actors

PDG5: Aesthetically pleasing virtual actors

The default virtual actors currently implemented in Deva lack detail and support only static texture maps as faces, thus limiting the level of detail and possibilities for delivering facial expressions or body language. Attempts to make them more realistic lead to a greater rendering load and slower performance (the “ideal” actor pictured below far exceeds the real time rendering capability of PC graphics accelerators, and challenges even the ability of high-end workstations). The system already includes sophisticated radiosity rendering software, enabling aesthetically pleasing and realistic lighting models to be used, and a current focus is on developing the Deva actors so that they more closely match the actors as envisaged by the designers (see Figure 8).

Figure 8a

Figure 8b



Figure 8. (a) A sample of the level of detail the current version of Deva supports.
(b) Realistic virtual actors created for the Senet Deva CVE.

PDG6: Convey presence and identity
PDG7: Convey role

Each user is represented by their own articulated virtual actor that reveals the user's identity and role (though a user cannot see their own representation). Three models of virtual actors had to be created: a teacher, a girl and a boy (see Figure 8). When users with the same role and gender were present in the environment their representations were distinguished by changing the colour of their clothes.

PDG8: Convey viewpoint

An actor is oriented in the CVE according to the user's viewpoint (see Figure 9(a)).
A particular problem was the expert not being aware of a child's viewpoint. This was addressed by providing the expert with a second monitor displaying each child's viewpoint (see Figure 9(b)). Displaying these viewpoints on the expert's own screen was considered, but this would have used up screen space.


Figure 9a


Figure 9b

Figure 9. (a) The user’s position, orientation and distance from other objects and virtual actors in the CVE indicate the user’s focus of attention. In this figure the expert (the adult figure) observes the children (the child figure) moving a piece. The differently shaded area indicates approximately the expert’s viewpoint. (b) The physical set up of the third phase of the study using the Senet prototype in Deva, where the expert is aware of the children’s viewpoint (i) expert, (ii, iii) active children.

PDG9: Convey actionpoint

A user selects an object that is close to them by positioning the avatar's hand so that it touches the object (if the virtual actor is within arm’s reach of the object then the avatar positions its hand correctly to “touch” it). To allow users to select and move objects that are distant from them, a change had to be made to Deva: a “laser pointer” projects from the virtual actor’s hand, indicating the exact point of selection (see Figure 7).

PDG10: Easily associated with its communication

Limitations of current 3D technology mean that placing large amounts of easily legible text in a virtual environment is difficult. Relatively low resolution displays, limited depth buffering and little support for anti-aliasing on all but the most high-end graphics hardware mean that rendering anything but a few large words in a 3D environment results in badly pixelated, unreadable text.

To overcome this, text was rendered in 2D on the near drawing plane of the virtual environment, positioned so as to appear above the head of the “speaking” virtual actor and merged with the 3D components of the environment (see Figure 7). This technique has the benefit of allowing several sentences of text to be rendered clearly on the screen independent of the actual position of the user's actor in the virtual world. The text moves with the position of the virtual actor, but is not subject to aliasing or depth buffering problems.

PDG11: Show message as it is being composed

Associating a chat box with the speaker's virtual actor while a message is being composed (as outlined above) addressed pedagogical as well as turn-taking issues.

PDG12: Convey the process of an activity

The laser pointer helps to convey when a user is moving an object. Deva was also able to convey when the user is moving around the environment. The virtual actor object consists of a number of AC3D-format body parts (Pettifer, 1999; Colebourne); these are animated based on the forward velocity of the virtual actor to give the impression of walking (or running, with extra bounce when the velocity is high enough) (see Figure 10).


Figure 10a


Figure 10b


Figure 10c

Figure 10. (a) Shows that the girl child was in the process of going to read the rules, (b) shows the girl child on the way back to the board, and (c) shows that the girl child reached the board where she intended to make a move playing the game.

PDG13: Convey the user’s intention to take a turn
PDG14: Convey the user’s offering of a turn

These guidelines were not followed. Instead they were seen as issues to be investigated during the user study of Deva.

PDG15: Identify speaker when out of other users’ viewpoints

To follow this guideline Deva was changed so that when an actor speaks who is out of viewpoint, a text box appears at the left or right edges of the screen, depending on the speaker's location relative to the listener. A prompt is also provided to indicate who is talking (e.g. “the user’s name is talking”) (see Figure 11).

Figure 11

Figure 11. The warning bars on the left and right sides of the middle figure (user A’s viewpoint), indicate that other users are talking.
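The choice between the left and right edge of the listener's screen can be illustrated with a small geometric test. The sketch below is an assumption-laden illustration rather than the Deva implementation: it works in a 2D top-down plane, and the function name `speaker_cue`, the heading convention and the 90-degree field of view are all hypothetical.

```python
import math


def speaker_cue(listener_pos, listener_heading, speaker_pos, fov=math.radians(90)):
    """Return "in view", "left" or "right" for a speaker, seen from above.

    listener_heading is the viewing direction in radians, measured
    counter-clockwise from the +x axis; fov is the horizontal field of view
    (the 90-degree default is an assumption made for this example).
    """
    fx, fy = math.cos(listener_heading), math.sin(listener_heading)
    dx = speaker_pos[0] - listener_pos[0]
    dy = speaker_pos[1] - listener_pos[1]
    # Signed angle from the viewing direction to the speaker:
    # positive is counter-clockwise, i.e. to the listener's left.
    angle = math.atan2(fx * dy - fy * dx, fx * dx + fy * dy)
    if abs(angle) <= fov / 2:
        return "in view"
    return "left" if angle > 0 else "right"


# A listener at the origin looking along +x.
print(speaker_cue((0, 0), 0.0, (5, 1)))    # speaker nearly straight ahead
print(speaker_cue((0, 0), 0.0, (0, 3)))    # speaker off to one side
print(speaker_cue((0, 0), 0.0, (-2, -1)))  # speaker behind and to the other side
```

A speaker directly behind the listener falls on the boundary between the two sides; which edge is used in that case is an arbitrary choice in this sketch.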

PDG16: Convey intention to take a turn even when not being in other users’ viewpoints
PDG17: Convey offering of a turn even when being out of other users’ viewpoints

This could only be done verbally, via the chat boxes.

PDG18: Private communication
PDG19: Show when the user is involved in private communication

6.3  Teacher actor

PDG20: Control over an individual user’s viewpoint

This guideline was not followed. Such a tool is not provided in Deva. A possible solution is to control the user’s viewpoint from the Deva server, as the virtual actors in Deva are objects represented by a single entity whose velocity and position can be controlled.

PDG21: Able to take control of the session

This guideline was not followed. Such a tool is not provided in Deva. However, it could be supported by virtue of Deva's explicit management of time in an environment, which could allow such notions as freezing the virtual time. Before implementing such a tool it was decided to investigate how necessary it was during the user study stage.
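The notion of freezing virtual time can be sketched as a clock that stops advancing while real time continues. This is a hypothetical illustration only: `VirtualClock` and its methods are invented for the example and do not describe how Deva manages time.

```python
class VirtualClock:
    """Virtual time that follows real time unless the session is frozen."""

    def __init__(self) -> None:
        self.time = 0.0
        self.frozen = False

    def freeze(self) -> None:
        """Called by the teacher to pause the environment."""
        self.frozen = True

    def unfreeze(self) -> None:
        """Called by the teacher to resume the environment."""
        self.frozen = False

    def tick(self, real_dt: float) -> float:
        """Advance by the elapsed real time, unless frozen; return virtual time."""
        if not self.frozen:
            self.time += real_dt
        return self.time


clock = VirtualClock()
clock.tick(1.0)      # play proceeds
clock.freeze()       # teacher takes control of the session
clock.tick(2.0)      # real time passes, virtual time does not
clock.unfreeze()
print(clock.tick(0.5))
```

Under this sketch, behaviours driven by virtual time (for example, moving a piece) would simply receive no further updates while the clock is frozen.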

PDG22: Teacher aware of and have control over private communication

These guidelines were not followed. Instead they were seen as issues to be investigated during the user study of Deva.

7  Study of the Senet Deva Application in Use

The study of the Senet prototype application developed in Deva CVE technology allowed issues identified in certain PDGs to be investigated in greater depth. In addition the exploratory nature of the empirical study revealed other issues not identified previously that led to further technology requirements. These are discussed in the following sections.

7.1 Method of study

The third phase of the study was conducted in laboratories of the Advanced Interfaces Group at the University of Manchester. Twelve twelve-year-old children (6 pairs) participated in the studies. Three rooms were used: one for the researcher playing the role of the “expert” (E), and the other two for the “child” actively participating in the activity (AC), accompanied by a second researcher, a “helper” (H) providing technical support (see Figure 9). The children were introduced to the use of the Deva tools (e.g. the mouse controls and communication tools) and afterwards they were asked to carry out various tasks such as: read the rules and set up the board; learn how to play the game; and play the game.

The children were video taped individually. The video cameras were set to capture the users’ interactions, the artefacts and other users in the CVE. The screen of one of the children was also video taped, providing a detailed record of the interactions between users that were internal to the system. The users’ textual communication was saved to a file, providing a permanent record of the users’ dialogue, the sequential organisation of the users’ turns to talk and the exact time of each exchange. To capture the expert’s activities the think-aloud method was used (Monk et al., 1993) and tape-recorded. It is one of the few methods of obtaining a record of the user’s mental activity: users think aloud about their activities in terms of mental reasoning (e.g. the expert described aloud her actions, decision-making and observations while playing with the children). This allowed problems such as the expert’s lack of awareness of the child’s exact situation to be studied closely. Questionnaires filled out by the children before the session provided background information about them.

Each session lasted approximately 45 minutes. Each session was followed up by an interview with the children about their experiences, which lasted approximately ten minutes and was tape-recorded.

7.2  Results

The data gathered is being analysed using the method outlined previously (see Section 5). Some of the major findings are reported below.

7.2.1  Awareness of others. A virtual actor was successful in conveying information about the user’s viewpoint, actionpoint, and the activity they are in the process of doing (e.g. typing, reading the rules, navigating) (PDG8, PDG9, PDG12).

The fact that the application was a 3D environment increased the chances of events occurring outside of a user’s viewpoint (PDG15, PDG16, PDG17). This might be because something was blocking a user’s view (e.g. someone else’s actor), or the user was physically located far away from the event.

When an actor speaks who is out of viewpoint, a text box appears at the left or right edges of the listener’s screen depending on the speaker's location relative to the listener. A prompt is also provided to indicate who is talking (e.g. “the user’s name is talking”). This was very useful as it increased the user’s awareness of the on going dialogue.

This also suggested another design guideline concerning a user being aware of other users’ actions, as well as communication outside their viewpoint. A large number of activities taking place outside the field of view could lead to an overcomplicated display. This raises deeper issues over what “grain” of action would need to generate prompts.

7.2.2  Turn taking. Turn taking, one of the major problems in earlier studies, appeared to be much easier in Deva. One reason was that PDG1 was satisfied (users had simultaneous control and did not have to share a pointer).

Another reason turn taking was smoother was that PDG8, PDG9 and PDG10 were followed. The virtual actor conveyed a lot of information about the user’s viewpoint, actionpoint, and the activity they were in the process of doing (e.g. typing, reading the rules, navigating), which increased others’ awareness of the user’s current activity and communication.

The position of the virtual actor also conveyed information about the user’s intention (see PDG12). For example, a virtual actor moving close to the board usually meant that the user was going to interact with the objects on it. As another example, a child would turn their actor to the expert after completing their turn to seek feedback.

There were no explicit mechanisms built in for turn taking. The only way was to negotiate by typing text messages. The interruption mechanism the expert used was to be within the children’s viewpoint and send a message. Children rarely interrupted. When they did so, it was to remind their co-players to take a turn, or ask the expert to repeat something.

7.2.3  Communication. Satisfying PDG10 (presenting a user’s communication as a text box above their virtual actor) helped in making explicit who the speaker was. One problem occurred when virtual actors were positioned close to each other. In this case one text box would obscure another. In such cases the users resorted to using the transcript window (see PDG2). Another problem was that the text remained in a text box until the user next typed something. This was confusing because it appeared as if the speaker was currently referring to something that was in fact referred to much earlier. This indicates that some means is needed of making a text box disappear after a certain period of time.
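One simple way of making a text box disappear after a certain period is to timestamp each message and hide it once it is older than a fixed lifetime. The sketch below illustrates the idea only: the `TextBox` class and the 10-second default lifetime are assumptions made for the example, not values derived from the study.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TextBox:
    text: str
    shown_at: float  # time at which the message was displayed, in seconds

    def visible_text(self, now: float, lifetime: float = 10.0) -> Optional[str]:
        """Return the message while it is fresh, or None once it has expired."""
        return self.text if now - self.shown_at < lifetime else None


box = TextBox("your turn", shown_at=0.0)
print(box.visible_text(now=5.0))   # still within the lifetime
print(box.visible_text(now=12.0))  # older than the lifetime
```

An appropriate lifetime would need to be established empirically, since a message that vanishes too quickly could be missed by slower readers.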

The text boxes also had a benefit in focusing the viewpoint. To follow the ongoing dialogue, a user would have to turn to see someone else’s text box. They thus also saw the actions that the speaker might be doing while talking (e.g. the teacher demonstrating the information as she talked).

Interviews with the children after the sessions highlighted that although textual communication was sufficient, there were times when it was tiring and audio communication would be preferable. One child’s comment (typical of many) was that: “if the application was not playing this type of a game but shooting, he would not bother talking because by the time he would have to type a message he would be shot”. This shows that while text based communication was suitable for this specific application (for educational and literacy reasons) it is not necessarily so for other types of applications.

In some cases the expert had to spend extensive time explaining things to one of the children. This was frustrating for the other child, as borne out by comments such as “not again” when the expert asked the child to do another action again. This emphasized again the need for private communication channels (PDG18). This will be an issue for further investigation, especially when the population in the environment increases.

7.2.4  Pedagogical issues. Pedagogical issues were divided into three main categories:

  • support for educational resources
  • support for various styles of teaching/learning
  • support for practical issues of keeping order and managing the learning situation

Following PDG4 and providing the rules as a permanent source of information in the CVE was very useful. Children would walk to the wall and read the rules either after being directed by the expert or on their own initiative. However, Deva supported this via a texture map and the resource was static. This suggested a further technology requirement of incorporating multimedia display techniques in the CVE.

The CVE supported a variety of different teaching styles. The instructional style was supported by permanently displaying the rules on the wall. Children could also learn to play the game by observing others playing (Bandura, 1971; Salomon, 1979). In some cases a style based on the cognitive apprenticeship approach (Collins, Brown & Newman, 1989) was used, the expert gradually removing support as the children became familiar with the game.

Practical management was eased by providing support for the expert to be aware of the activities of the children (PDG8, PDG20). At a simple level this was done by displaying the children’s viewpoints on a separate monitor next to the expert (Post-it-notes were used to associate the viewpoints with particular children) (see Figure 9). This was found to be very helpful for the expert in terms of deciding when and how to support the children. For example, when the expert saw that a child had a poor view of the board they were able to ask the child to move closer to the board to see better. The expert made use of this tool when trying to attract a child’s attention. In such a case the expert would position her actor so that it was visible to the child (i.e. within their viewpoint) before speaking to them.

At another level the expert needs to be aware of past actions of the children. This could be for assessment purposes or simply to see what move the child had made last. This suggests that the Deva technology needs to be changed to provide mechanisms for managing “virtual time”, as PDG3 suggests. This would allow past events to be viewed.

There was no explicit mechanism for the expert to take control of the session (PDG21). In general this was not a problem but in some extreme cases children took advantage of this (e.g. kept rolling the dice until getting a suitable score for capturing a piece, or moving pieces off the board without being permitted to do so). This highlighted the need for implementing mechanisms to support PDG21. A virtual time mechanism would perhaps allow the events in a session to be “frozen” in order to restore order.

8  Technology Requirements and Future Work

Deva is a sophisticated programming environment, not an end-user application. The core of Deva provides only the basic services for rendering. The semantics and rules of the environment, together with the objects and mechanisms for communication, are loaded at runtime as “plug ins”. This provides the flexibility to add new functionality to the system. The analysis method outlined above derived a set of DGs concerning the effectiveness of the application. The third phase application is being developed according to these DGs. By analysing the gap between the PDGs and what could actually be implemented in Deva, it has been possible to identify shortcomings in Deva. Some of the technology requirements did not require large changes in Deva (e.g. the provision of a laser pointer). Some requirements were more surprising.

8.1  Text in the CVE

The importance placed on text as a means of communication and its importance in the environment (e.g. for displaying rules) came as a surprise to the technologists. This was mainly a requirement of the application and wider problem area, as literacy is a key skill in education.

Results from the first and second phases have indicated that it is important for the text to be embedded within the context of the virtual environment itself rather than appearing in external windows. User studies showed that the change of context between a “game window” and a “text window” is considerable for children, and disruptive of their engagement with the environment.

Placement of text in a 2D window is a simple enough task; however, limitations of current 3D technology mean that placing large amounts of readily legible text in a virtual environment is difficult. Relatively low resolution displays, limited depth buffering and little support for antialiasing on all but the most high-end graphics hardware mean that rendering anything but a few large words in a 3D environment results in badly pixelated, unreadable text.

Nevertheless, being able to display reasonably large quantities of readable text (i.e. several sentences) within the context of the environment has been a rigid requirement placed on the third phase CVE. For the first trials of the CVE version, text rendered in 2D on the near drawing plane of the virtual environment, positioned so as to appear above the head of the “speaking” avatar and merged with the 3D components of the environment, is to be trialled (see Figure 7). This technique has the benefit of allowing several sentences of text to be rendered clearly on the screen independent of the actual position of the user's avatar in the virtual world. The text moves with the position of the avatar, but is not subject to aliasing or depth buffering problems.

8.2  Virtual actors’ appearance

The major requirements that arose from the study in terms of the virtual actors’ appearance in CVEs are related to:

  • aesthetically pleasing virtual actors
  • believable personalised representation
  • realistic motion

The need for good aesthetics for the actors and the environment in general was not as surprising but in the past was not seen as a major factor in driving the technology development. An aesthetically pleasing environment is important for children. Children come with high expectations from their exposure to the high performance computer graphics available in most computer games. In comparison the default virtual actors provided in Deva are lacking in detail. Attempts to make them more realistic lead to a greater rendering load and slower performance (the ideal actor pictured below far exceeds the realtime rendering capability of PC graphics accelerators, and challenges even the ability of high-end workstations). The system already includes sophisticated radiosity rendering software, enabling aesthetically pleasing and realistic lighting models to be used, and a current focus is on developing the Deva actors so that they more closely match the actors as envisaged by the designers.

Personalisation of virtual actors in order to make users easier to distinguish is important. This requirement arose from the virtual actors’ need to provide a personalised representation to users that convey their user’s identity, role, process of activity and intentions of action. Some possibilities explored in the literature include: virtual actors bring a name tag above their head; personal colours and textures (as in PDG7); a palette of clothes and colours that the users can choose to customise their virtual actor (as in the Mirror project that introduced the idea of fashion in cyberspace (Walker, 1997)); and pasting a static image of the user’s face to the head of the user’s virtual representation. The current CVE technology supports the above stated design solutions.

However, there is a requirement for more intuitive and effective virtual actor representations. This can be achieved by:

  • analysing the visual parameters of lip movement by analysing the textual communication or the audio signal of the speech (Lavagetto, 1995)
  • capturing the user’s face and projecting the image on top of a virtual body (Pandzic et al., 1994; Kshirsagar et al., 1999)
  • capturing users’ actions and expression of emotions, which are key issues conveying the users’ focus of attention and state-of-mind that can make the interaction more intuitive

These solutions require speech and face recognition and real-time video streaming in CVEs, which are not supported by the majority of current CVE systems. DIVE is an exemplar CVE system that supports real-time video streaming.

8.4  Incorporating intelligence in objects’ and virtual actors’ behaviour

Developing intelligent CVEs is a major requirement of most applications, not only pedagogical ones. This means that objects contained in CVEs and virtual actors should incorporate behaviours sensitive to the spatial context of the environment they inhabit.

Specifically, objects contained in the CVE should incorporate behaviours that correspond to:

  • the material they are constructed with (e.g. solid objects collide with each other)
  • their attributes

Intelligent virtual actors need to convey the following characteristics:

  • capability of performing language commands as visual actions
  • sensitivity to the spatial context in which they are situated
  • understanding of communication in the CVE
  • sensitivity to their own role
  • understanding of procedures and the progress of actions
  • ability to keep episodic memory of actions and activities

These last three points may affect the virtual actors’ access control and ownership over other objects in the CVE (Badler et al., 2000; Pettifer & Marsh, 2001) and may affect their reaction towards activities and other virtual actors participating in the same CVE.

Adding intelligence in the CVE aids the functionality of the CVE, the efficiency of the activity and the believability of the experience.

8.5  Previous communication/instruction/activity

Children in the previous trials made considerable use of the “logs” of communication between themselves and the teacher available in the chat boxes. Confusion over the rules of the game was occasionally resolved “locally” or without reference to the teacher by looking through the logs for confirmation of a rule stated previously. It was decided to retain this ability in the CVE.

However, displaying the comparatively large amounts of text accumulated during the exchanges of the game is unrealistic in a 3D environment for the reasons given above. Although a requirement was identified for displaying “speech” text within the environment, it has been considered appropriate to display this historical text in a window external to the 3D view of the CVE, on the basis that when a child requires access to the historical logs to resolve an issue the engagement with the current activities in the environment is inevitably interrupted to some extent. Although this is a hypothesis that remains to be confirmed by the real trials, the use of the previous prototypes lends a degree of confidence to the design decision.
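The kind of searchable communication history described above can be sketched as follows. This is a minimal illustration only and is not taken from the Deva implementation; the names `ChatEntry`, `ChatLog`, `say` and `search`, and the example utterances, are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ChatEntry:
    index: int      # position of the utterance in the session (hypothetical field)
    speaker: str
    text: str


@dataclass
class ChatLog:
    """History of utterances, intended for display in a window outside the 3D view."""
    entries: list = field(default_factory=list)

    def say(self, speaker, text):
        # Append the utterance so that it can be looked up later.
        entry = ChatEntry(len(self.entries), speaker, text)
        self.entries.append(entry)
        return entry

    def search(self, keyword):
        # Case-insensitive lookup, e.g. to confirm a rule stated earlier.
        needle = keyword.lower()
        return [e for e in self.entries if needle in e.text.lower()]


# Invented utterances, for illustration only.
log = ChatLog()
log.say("teacher", "Throw the sticks before you move a piece.")
log.say("child_a", "Is it my turn?")
log.say("teacher", "Only one piece may stand on a square.")
matches = log.search("piece")
```

Keeping the history in a separate structure of this kind leaves the 3D view free to show only the current "speech" text, while the full log remains available on demand.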

8.6  Managing Virtual Time – Tracking, Rewinding and Freezing Virtual Time

The transcript approach raised the issue of time in the CVE. A transcript provides the basis for freezing and rewinding to previous activities in the transcript. Time also came up when considering how transient an actor’s text box should be. Keeping track and being in control of virtual time are technology requirements that have been considered in the literature and various solutions have been proposed:

  • leaving trace pathways
    Keeping track of users’ earlier presence in an environment is beneficial in terms of identifying the sequential steps that have been followed for the fulfilment of a certain task. One solution proposed to address this issue is leaving trails and pathways through the virtual space (Benford et al., 1995). However, this way of visualising past activities does not necessarily provide information regarding the activities the users have been involved in and/or fulfilled.
  • providing transcripts
    A transcript-based approach provides a basis for keeping control over time in the CVE, as it allows the users to follow sequences of past activities and dialogue. However, freezing virtual time to fill in on past activities can result in missing current activities.
    If it were possible to “freeze time” at a specific instant, then the individual would be paying attention and responding to a set of stimuli corresponding to one environment, not paying attention to all the other stimuli, or interpreting stimuli from one environment in the context of the currently present one (for example interpreting a sound from the real world as belonging to the virtual world). (Slater & Steed, 2002).

This calls for deeper investigation of how exactly recording, rewinding and freezing virtual time can be integrated in CVE systems to benefit social interaction.

The need for supporting historical awareness of past presence and activity has already been stated in the literature (Benford et al., 1995). The temporal links technique (PDG3) enables all the actions in a CVE to be recorded in such a way that they can be recreated and re-experienced as a 3D virtual world (Greenhalgh et al., 2000). This means rewinding the virtual time not just to replay the action, but with the ability to manipulate the dynamics of the activity, which extends the means of body beyond its normal use. The pedagogical value of such a tool is momentous. Users can learn from past mistakes and alter their decisions and strategies accordingly. A similar concept has been applied in games like SimCity (a city simulator from Maxis) that allows users to build, manage and follow the city development throughout time. The game supports users going back in time and changing their decisions based on knowledge acquired by seeing the city development. For example, a decision to build a power station close to the city may be retracted after discovering the increased levels of pollution in the surrounding area. Similar “superpowers” have been attempted and proved to be valuable for training team members to make good decisions in stressful situations (e.g. training fire-fighters in VR) (Romano & Brna, 2001).
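The idea of rewinding virtual time and then altering a decision can be illustrated with a simple event log. The sketch below is not the temporal links technique of Greenhalgh et al. (2000); it is a deliberately reduced illustration in which the names `Event`, `Timeline`, `state_at` and `branch_at` are hypothetical, and it assumes that events are recorded in non-decreasing tick order.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    tick: int        # logical time at which the change happened
    key: str         # which part of the world changed
    value: object    # its new value


@dataclass
class Timeline:
    events: list = field(default_factory=list)

    def record(self, tick, key, value):
        # Assumption: callers record events in non-decreasing tick order.
        self.events.append(Event(tick, key, value))

    def state_at(self, tick):
        """Rebuild the world state by replaying events up to and including tick."""
        state = {}
        for event in self.events:
            if event.tick <= tick:
                state[event.key] = event.value
        return state

    def branch_at(self, tick):
        """Copy the history up to tick so that a different decision can be tried."""
        return Timeline([e for e in self.events if e.tick <= tick])


# The power station example from the text, reduced to two events.
original = Timeline()
original.record(1, "power_station", "near city")
original.record(2, "pollution", "high")

# Rewind to before the decision was taken and choose differently.
retry = original.branch_at(0)
retry.record(1, "power_station", "far from city")
```

Because the original timeline is left untouched, the earlier course of events remains available for comparison with the revised one.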

8.7  Participating in Parallel Activities in Different CVEs

Another technology requirement that needs to be addressed is users being able to participate in parallel activities in different CVEs, or different areas of the same CVE that may not necessarily be visible to all users (e.g. in a large scale CVE that contains different areas of activities). This arose from the users’ need to attend private social interactions – communication and physical activities (e.g. for the expert to provide extra support to individual users, or a particular set of users). Earlier research has also addressed the need for people inhabiting different places at the same time, either through multiple direct presence, or through some kind of computer agent acting on their behalf (Benford et al., 1995).

A possible solution discussed in (COVEN D2.6) is the implementation of subjective views. This is based on the logic that the representation of a CVE can be tailored to particular users. For example, in an office CVE, electricity cables would be visible only to electrical engineers. This means that private areas of social interaction could be incorporated within the main CVE, but rendered transparent until some users need to participate in private activities. In this case a private environment would be rendered normally for these users, allowing them to get involved in social activities invisible to other participants.
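The logic of subjective views can be sketched as a per-user filter over the contents of the scene. This is an illustrative assumption about how such filtering might be expressed, not the mechanism described in COVEN D2.6 or implemented in Deva; `SceneObject`, `visible_to` and `subjective_view` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneObject:
    name: str
    # Groups allowed to see the object; an empty set means everyone sees it.
    visible_to: frozenset = frozenset()


def subjective_view(scene, user_groups):
    """Return the names of the objects that should be rendered for one user."""
    groups = set(user_groups)
    return [
        obj.name
        for obj in scene
        if not obj.visible_to or obj.visible_to & groups
    ]


scene = [
    SceneObject("desk"),
    SceneObject("electricity cables", frozenset({"electrical engineer"})),
    SceneObject("private meeting area", frozenset({"tutorial group"})),
]

visitor_view = subjective_view(scene, {"visitor"})
engineer_view = subjective_view(scene, {"electrical engineer"})
tutee_view = subjective_view(scene, {"visitor", "tutorial group"})
```

In this sketch the private area exists in the shared scene throughout and is simply omitted from the view of users outside the relevant group.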

For such solutions to be effective, issues related to greater rendering load and slower performance have to be addressed. Further investigation is also required to address the issue of users being able to follow and control parallel social interactions in a CVE.

9  Conclusion

This paper has outlined how a systematic method for choosing a driving problem and developing an application can derive a set of technology requirements. There are several directions for future work. A particular type of educational situation was chosen for investigation. This meant that certain issues took priority (e.g. the use of text rather than audio communication). The population used was limited to a maximum of three users. Increasing the population will have an effect on viewpoints and the need for private channels of communication.

The phased approach allowed the large number of factors involved to be handled in a manageable way. The first two phases focused on the requirements of the learning application. The third phase focused on particular issues concerning the move to a 3D environment (e.g. the problems of location and viewpoints and its impact on user awareness).

The use of a set of design guidelines ensured that application requirements were captured and expressed in the Deva version of the application. The design guidelines captured in a more formalised way the application requirements obtained in the first two phases. They provided a systematic way of expressing these requirements during the implementation of the CVE version of the application in the third phase. They also provided a way of focusing the studies throughout the project. Technology requirements can be derived by analysing where design guidelines could not be followed.

Paying careful attention to the problem area and application requirements revealed areas of concern for designing "real-world" problems. For example, this included the importance of text rather than audio for communication because of various educational and literacy requirements.

While the educational situation investigated is quite specific in terms of age group and curriculum we believe that the guidelines are generalisable to some degree. What became apparent over the course of the project was that the key pedagogical issue was not that of which particular theory of learning had to be supported. Rather, the emphasis became the practical issues of keeping track of and controlling children in an educational situation. These practical issues underlie most educational situations. From that point of view, we believe that the design guidelines can be generalised not only to other educational situations but also to more general situations where such control is desirable.

10  Acknowledgement

Thanks to the State Scholarships Foundation of Greece for funding Daphne Economou's Ph.D., the MMU Manchester Multimedia Centre for use of their facilities, Claremont Road Primary School and Knutsford High School, and the Advanced Interfaces Group at Manchester University for their co-operation.

11  References

Allen, C. (1989). The use of video in organizational studies. ACM SIGCHI Bulletin: Special edition on video as a research and design tool, 21(2), 115-117.

Atkinson, J.M., & Heritage, J. (eds.) (1984). Structures of social action: Studies in conversation analysis. Cambridge: Cambridge University Press.

Badler, N.I., Bindiganavale, R., Allbeck, J., Schuler, W., Zhao, L. & Palmer, M. (2000). Parameterized action representation for virtual human agents. In J. Cassell, J. Sullivan, S. Prevost & E. Churchill (Eds.), Embodied conversational agents (pp. 256-284), Cambridge: MIT Press.

Bandura, A. (1971). Social learning theory. New York: General Learning Press.

Benford, S. D., Bowers, J. M., Fahlén, L. E., Greenhalgh, C. M. & Snowdon, D. N. (1995). User embodiment in collaborative virtual environments. Proceedings of ACM Conference on Human Factors in Computing Systems (CHI'95), 242-249.

Benford, S. D., Greenhalgh, C. M. & Lloyd, D. (1997). Crowded collaborative virtual environments. Proceedings of ACM Conference on Human Factors in Computing Systems (CHI'97), 58-66.

Bigge, M. L. & Shermis, S. S. (1992). Learning theories for teachers, 5th ed., New York: Harper Collins.

Boden, D. & Zimmerman, D.H. (1991). Talk and social structure: Studies in ethnomethodology and conversation analysis. Cambridge: Polity Press.

Brooks, F. P. (1988). Grasping reality through illusion: interactive graphics serving science. Proceedings of ACM Conference on Human Factors in Computing Systems (CHI’88), 1-11.

Capin, T. K., Pandzic, I. S., Thalmann, N. M. & Thalmann, D. (1998). Realistic avatars and autonomous virtual humans in VLNET networked virtual environments. In R. Earnshaw & J. Vince (Eds.), Virtual worlds on the Internet (pp. 157-174). Los Alamitos, CA: IEEE Computer Society Press.

Carlsson, C. & Hagsand, O. (1993). DIVE - A platform for multiuser virtual environments. Computers & Graphics, 17(6), 663-669.

Checkland P. & Scholes, J. (1990). Soft systems methodology in action, Chichester: John Wiley.

Colebourne A. (AC3D). Retrieved April 21, 2006, from

Collins, A., Brown, J.S., & Newman, S.E. (1989). Cognitive apprenticeship: teaching the crafts of reading, writing and mathematics. In L.B. Resnick (Ed.), Knowing, learning, and instruction: Essays in honour of Robert Glaser (pp. 453-494). Hillsdale, NJ: Lawrence Erlbaum Associates.

Coulthard, M., Montgomery, M. & Brazil, D. (1981). Developing a description of spoken discourse. In M. Coulthard & M. Montgomery (Eds.), Studies in discourse analysis (pp. 1-50). London: Routledge.

COVEN D2.6. (1997, August). Guidelines for building CVE applications. Public Report number D2.6. Van Liempd, G. (Ed.). Retrieved August 31, 2006, from

Economou, D. (2001). The role of virtual actors in collaborative virtual environments for learning, Ph.D. thesis. Department of Computing and Mathematics, Manchester Metropolitan University, Manchester.

Economou, D., Mitchell, W. L. & Boyle, T. (2000). Requirements elicitation for virtual actors in collaborative learning environments. In R. S. Heller & J. Underwood (Eds.), Computers & Education (pp. 225-239), Oxford: Elsevier Science Ltd.

Economou, D. & Pettifer, S. (2004). Towards a user-centred method for studying CVEs for learning. In M.-I. Sánchez-Segura (Ed.), Developing Future Interactive Systems, Hershey, PA: Idea Group Publishing, 269-301.

Greenhalgh, C. & Benford, S. (1995). MASSIVE: A collaborative virtual environment for tele-conferencing. ACM Transactions on Computer Human Interfaces (TOCHI), 2(3), 239-261.

Greenhalgh, C., Purbrick, J., Benford, S., Graven, M., Drozd, A. & Taylor, T. (2000). Temporal links: Recording and replaying virtual environments. In R. Price (Ed.), Proceedings of the ACM Multimedia 2000, 67-74.

Gunton, T. (ed.) (1993). Information systems practice: The complete guide, Manchester: NCC Blackwell.

Hubbold, R., Cook, J., Keates, M., Gibson, S., Howard, T., Murta, A., West, A. & Pettifer, S. (1999). GNU/MAVERIK: A micro-kernel for large-scale virtual environments. Proceedings of ACM Symposium on Virtual Reality Software and Technology (VRST'99), 66-73.

Johnson, W. L., Stiles, R. & Munro, A. (1998). Integrating pedagogical agents into virtual environments. Presence, 7(6), 523-546.

Jordan, B. & Henderson, A. (1998). Interaction analysis: Foundations and practice. The Journal of Learning Sciences, 4(1), 39-103.

Kaur, K. (1998). Designing virtual environments for usability. PhD thesis. Centre for HCI Design, City University, London.

Kaur Deol, K. K., Steed, A., Hand, C., Istance, H. & Tromp, J. (2000a). Usability evaluation for virtual environments: Methods, results and future directions (part 1). Interfaces, 43, 4-8.

Kaur Deol, K. K., Steed, A., Hand, C., Istance, H. & Tromp, J. (2000b). Usability evaluation for virtual environments: Methods, results and future directions (part 2). Interfaces, 44, 4-7.

Kshirsagar, S., Escher, M., Sannier, G. & Thalmann, N.M. (1999). Multimodal animation system based on the MPEG-4 standard. Proceedings of the Multimedia Modelling 99, 215-232.

Lavagetto, F. (1995). Converting speech into lip movements: A multimedia telephone for hard of hearing people. IEEE Transactions on Rehabilitation Engineering, 3(1), 90-102.

Luff, P., Hindmarsh, J. & Heath, C. (eds.) (2000). Workplace studies, recovering work practice and informing system design. New York: Cambridge Press.

Mitchell, W. L. (1999). Moving the museum onto the Internet: the use of virtual environments in education about ancient Egypt. In J.A. Vince & R.A. Earnshaw (Eds.) Virtual Worlds on the Internet (pp. 263-278). Los Alamitos, CA: IEEE Computer Society Press.

Monk, A., Wright, P., Haber, J. & Davenport, L. (1993). Improving your human computer interface: a practical approach. Hemel Hempstead: Prentice Hall International.

Neal, L. (1989). The use of video in empirical research. ACM SIGCHI Bulletin: Special Edition on Video as a Research and Design Tool, 21(2), 100-101.

Pandzic, I., Kalra, P., Magnenat Thalmann, N. & Thalmann, D. (1994). Real time facial interaction. Displays, 15(3), 157-163.

Pettifer, S. (1999). An operating environment for large scale virtual reality. PhD Thesis. The University of Manchester.

Pettifer, S. & West, A. (1999). Deva: An operating environment for large scale virtual reality. (Department of Computer Science Technical Report UMCS-99-10-1). Manchester, UK: University of Manchester.

Pettifer, S., West, A., Crabtree, A. & Murray, C. (1999). Designing shared virtual environments for social interaction. Proceedings of Workshop on User Centered Design and Implementation of Virtual Environments (pp. 45-56) York, UK: York University.

Pettifer, S. & Marsh, J. (2001). A collaborative access model for shared virtual environments. Proceedings of the 10th IEEE International Workshop on Enabling Technologies Infrastructure for Collaborative Enterprises (WetIce2001) (pp. 257-272), Cambridge, MA: IEEE Computer Society.

Reisner, P. (1987). Discussion: HCI, what is it and what research is needed? In J.M. Carroll (Ed.), Interfacing Thought: Cognitive Aspects of Human-Computer Interaction (pp. 337-352). Cambridge, MA: MIT Press.

Romano, D.M. & Brna P. (2001). Presence and reflection in training: Support for learning to improve quality decisionmaking skills under time limitations. CyberPsychology & Behavior, 4(2), 265-277.

Roussos, M., Johnson, A., Moher, T., Leigh, J., Vasilakis, C. & Barnes, C. (1999). Learning and building together in an immersive virtual world. Presence, 8(3), 247-263.

Salomon, G. (1979). Interaction of media, cognition, and learning. San Francisco, CA: Jossey-Bass.

Silverman D. (Ed.) (1997). Qualitative research: Theory, method and practice. London: Sage.

Silverman D. (Ed.) (2000). Doing qualitative research: A practical handbook. London: Sage.

Shaw, J. (1998). The legible city. Presence and representation in multimedia art and electronic landscapes. eSCAPE deliverable 1.1, Esprit Long-term Research Project 25377 (pp. 59-91), ZKM, Karlsruhe, Germany.

Slater, M. & Steed, A. (2002). Meeting people virtually: Experiments in shared virtual environments. In R. Schroeder (Ed.), The social life of avatars (pp. 145-171). London: Springer.

Snowdon, D. & West, A. (1994). AVIARY: Design issues for future large-scale virtual environments. Presence, 3(4), 288-308.

Soloway, E., Jackson, S.L., Klein, J., Quintana, C., Reed, J., Spitulnik, J., Stratford, S.J., Studer, S., Eng, J. & Scala, N. (1996). Learning theory in practice: case studies of learner-centered design. Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI’96), 189-196.

Stanney, K. M., Mourant, R. R. & Kennedy, R. S. (1998). Human factors issues in virtual environments: A review of the literature. Presence, 1(7), 327-351.

Steed, A., & Tromp, J. (1998). Experiences with the evaluation of CVE applications. In D. Snowdon & E. Churchill (Eds.) Proceedings of the Collaborative Virtual Environments (CVE’98), (pp. 123-130). Manchester, UK.

Stewart, J., Bederson, B. B. & Druin, A. (1999, May). Single display groupware: A model for co-operative collaboration. Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI’99), 286-293.

Walker G.R. (1997). The mirror-reflections on inhabited TV. British Telecommunications Engineering Journal, 16(1), 29-38.

Waters, R. C., Anderson, D. B., Barrus, J. W., Brogan, D. C., Casey, M. A., McKeown, S. G., Nitta, T., Sterns, I. B. & Yerazunis, W. S. (1997). Diamond Park and Spine: Social virtual reality with 3D animation, spoken interactions, and runtime extendability. Presence, 6(4), 461-481.

Viller, S. & Sommerville, I. (1999). Social analysis in the requirements engineering process: From ethnography to method. Proceedings of the International Symposium on Requirements Engineering (RE’99), Limerick, Ireland: IEEE Computer Society Press, 6-13.

Copyright 2006 Communication Institute for Online Scholarship, Inc.

This file may not be publicly distributed or reproduced without written permission of the Communication Institute for Online Scholarship,

P.O. Box 57, Rotterdam Jct., NY 12150 USA (phone: 518-887-2443).