Volume 23 Numbers 1 & 2, 2013
Parallel Universes of Teleconferencing
EIC readers are probably quite familiar with the evolution of teleconferencing in industry and distance learning. Teleconferencing technology, however, has been used in different settings and in different ways, in what can be thought of as parallel universes; I come from one of those parallel universes (more about that in a minute). Each of these universes frames what is an appropriate setting and use in very different ways. In fact, at times they use very different metaphors to think about mediated connection. This, then, is an argument to look at these different universes to understand how they relate and what can be learned from them.
History 1 tells the story of industrial forays into teleconferencing. History 2 recounts the story of the research-based parallel universe of “media space” that overlapped temporally but existed independently and in a different world of ideas, usually using a different technological base. Last, History 3 is the new story of home-based mediated connection, relying on more powerful bandwidth and omnipresent technology. These parallel universes have come together in a single technological basis on internet-enabled powerful personal computing devices. However, this coming together represents a particular moment in time when the most convenient enabling technologies are the same. The underlying motivation for each kind of system remains quite different, and the form factors and usage patterns are likely to change as we move forward.
Traditional teleconferencing has a number of roots. For purposes of highlighting the connection between the three different universes, I will begin with the Picturephone.
Starting as early as 1930, Bell Labs began experimenting with ways to transmit the image of speakers as well as their voices (http://www.tvhistory.tv/1930-ATT-BELL.htm). In 1962, AT&T marketed the Picturephone in various desktop designs (http://www.corp.att.com/attlabs/reputation/timeline/70picture.html). The video unit integrated camera, monitor, and speakerphone. Video was black and white and was configured for showing faces. Most actual users were AT&T executives because specialized networking was required. The public got demonstrations at various world’s fairs and at the Disneyland theme park (http://www.beatriceco.com/bti/porticus/bell/telephones-picturephone.html). The few units that were sold went to executives who used them to speak with one another. Thus, it set a pattern of exclusivity and status, face-orientation, and telephonic conventions of use. Most importantly, each session was a “call.”
The first teleconferencing systems used proprietary encodings and transmission protocols. Most ran over specialized circuits, often leased or rented from the telephone company. In about 1980, I remember seeing an installation at the headquarters of the large architectural firm I worked for at the time. That unit, purchased in order to connect with corporate clients, had a dedicated room with a three-beam video projector. A service technician was on standby during regular office hours to set up calls, troubleshoot problems, and align the projector.
By the early 1990’s, standards for compressing and decompressing video, such as H.261 and H.263, began to make teleconferencing more interoperable and possible over IP networks. But teleconferencing was still relegated to expensive custom conference rooms arranged to produce the illusion of being in a shared room. This began to change as codecs went from being big special-purpose boxes to chips to software that could run on general-purpose computers.
Desktop Conferencing using PC’s
By about 2000, various vendors of PC’s and PC peripherals began to offer separate cameras for conferencing. These relied on Skype (http://www.skype.com) or iChat (http://www.apple.com/support/ichat/). The success of those products and the rapid decrease in price and size of cameras quickly led to their integration into regular PC’s. Their use often was outside of the orderly control of institutions’ teleconferencing organizations, mimicking how PC’s snuck up on IT departments ten years before. The connection metaphor continues to be a “call,” even though there is often little or no added bandwidth cost.
Ubiquitous Desktop Conferencing using PC’s, Laptops and iPads
Once cameras started showing up on laptops and other mobile devices, the dam broke. Portability, widely available bandwidth, and efficient codecs combine to make it normal to see sales people sitting in Starbucks holding a sales meeting with others around the country. And following that sales meeting, they use the same video connection technology to say good night to their kids (but more about that later).
Starting in the mid 1980’s, various industrial research and academic research labs began to explore what became known as “computer-mediated communication.” This began with the recognition that bandwidth would soon become very cheap.
The noted MIT-based futurist Ithiel de Sola Pool, at the Optical Society of America in 1981, gave an invited talk “Who Needs All The Bandwidth.” In that talk, he predicted that fiber optical communication would not only enable vast improvements in long distance telephone communication, but would also provide an infrastructure for hundreds of channels of cable television, allow distributed computers to share data nearly instantaneously (or even share active memory!), and finally enable something like the AT&T Picturephone to be a ubiquitous reality.
So what would be the consequences of bandwidth being (nearly) free? Well, for one thing, there would be no need to ever turn off a teleconferencing connection if you expected people to come back in a few minutes, hours, or days. It meant that separated places could be more or less permanently joined. These systems came to be called “media spaces.”
Hole in Space
The first experiment in this direction was actually time-limited rather than always on, but it did have the property that it was not a meeting. In fact, it was more like an encounter. It was called the “Hole In Space.” Over the course of three evenings in November, 1980, artists Kit Galloway and Sherrie Rabinowitz created a “hole” between the sidewalks at Lincoln Center in New York and Century City in Los Angeles (Galloway and Rabinowitz, 1980). This was accomplished by projecting full-size images of the passersby at both sites in black and white in store-front windows, using a rear-projection display of the video, and manually echo-cancelled full-duplex audio. There were no user instructions, no local feedback monitor, no explanatory didactics, just the sounds and image of a place three time zones away.
Crowds gathered quickly once the art work was turned on. People would stop, realize they were hearing what was probably the sound from the remote location and ask the person they were seeing – a total stranger – where they were. People at each end realized that it must be somewhere far away because of the difference in sky color and the way people were dressed. The people had nothing in common except they happened to be in the same real-virtual location at the same time. Yet they took the time to find out where they now “were.” They asked if they were being seen. They asked why they were “there.”
Existential inquiry gave way to spontaneous games like charades. People behaved in ways they would not with strangers on the same physical sidewalk: they lingered instead of moving on; they were engaged with one another. Because of the video mediation, these sidewalks and these people were spontaneously creating an event together.
Media Space Research
In the mid 1980’s, I became a research scientist at a place called Xerox Palo Alto Research Center (“PARC”). PARC was famous in technology circles as the home of personal computing, laser printing, and networked computing. But that great work had already been done when I arrived. In fact, I joined a group that was trying to think of what came next. Since they had invented “personal computing,” they decided to call this next step “interpersonal computing.” The group (called a lab in PARC parlance) pushed this idea by being geographically split between Palo Alto and Portland, Oregon. The method of research at PARC was to use technology under development as part of everyday work. Researchers discovered an “affordable” video codec to be used to hold project meetings (“Widcom,” now long out of business). The split site between Palo Alto and Portland was connected by a variety of high bandwidth lines, so dedicating a line to video conferencing was not a significant additional cost. The line was leased and so always on. Simultaneously, some researchers in Palo Alto strung audio and video lines between offices in order to “share” offices even though separated around the building. These hard-wired, always-on connections were attempts to mimic the physical environment of a design studio.
Both of these media space variants quickly became integral aspects of work life in the lab. Eventually these systems merged with the addition of a video switch that could be controlled from any office computer.
Although such technologies are now commonplace features of computers, in 1986 video-mediated connection was not considered part of computing or networking research, certainly outside of office systems research, and therefore was almost too radical for most people to comprehend. Computing work spaces supported “tasks,” not space that could be social, task-oriented, ambient, or any number of other truly spatial characteristics. In fact, it was difficult for even the innovators of personal computing to envision that someday audio and video would be commonplace elements of a computer – so difficult that purchasing cameras and other gear required extraordinary approval!
Always On (PARC and Beyond)
Since PARC was a major center of computer science and human-computer research, researchers from other influential labs saw the PARC media space as a platform for other research. Soon, media spaces appeared in labs in the UK, Canada, Japan and the US. Some veered towards a telephony model where connections were established in order to hold conversations, but most recognized the space-to-space, always-on quality (Bulick et al., 1989; Buxton & Moran, 1990; Fish, Kraut, and Chalfonte, 1990; Mantei et al., 1991; Gaver et al., 1992; Gaver et al., 1993; Roseman and Greenberg, 1996; Tang, Isaacs, and Rua, 1994).
Tunnels and Peripheral Awareness
Media space explored a variety of mediated interaction models. Sometimes shared common spaces would be linked; others would link offices in semi-permanent dyads; and some more radical arrangements were not symmetric, distributed to multiple locations. Some semi-permanent dyads would try to get perfect symmetry by developing displays with half-silvered mirrors so that eye contact could be simulated and people could reliably know that if they could not see the screen the other end could not see them. In contrast, others comfortably moved their cameras about the room to show work in progress, beautiful sunsets, visitors who might be standing in a doorway, and so forth (Adler and Henderson, 1994).
Awareness of the wider shared environment and the activities within it became important. Famously, while called the first webcam, the coffee pot camera at Cambridge University provided an awareness of the shared coffee pot (http://www.parkerinfo.com/coffee.htm and Stafford-Fraser, 2001). At about the same time, researchers at EuroPARC (the Cambridge branch of PARC) developed “portholes,” digitized low-resolution still images gleaned from media space cameras (Dourish and Bly, 1992). Media space subscribers to the service could keep colleagues aware of their availability. “Is Paul working late again?” “Looks like Scott and Victoria are really working hard on that paper.” People began to have complexly mediated “presence.” They would manage and, as they became more sophisticated in its use, even actively “construct” their presence (Bellotti and Dourish, 1997).
Probably the more subtle, yet significant aspect of awareness, however, is the ability to interrupt appropriately. People would look before interrupting a conversation. Just as they would do if they were co-located, they would look for the appropriate opportunity to join an existing conversation. Instead of feeling that the cameras were intrusive, they came to represent the social grease of more socially appropriate behavior.
Hands and Artifacts
Another direction in which media space research broke from teleconferencing is that hands, artifacts, and contextualizing elements of the environment became as important as faces. Beginning with the original media space with moveable cameras (sometimes more than one at a location), people would point their cameras at desks and notebooks. This led to experiments in what became known as “shared drawing” (Ishii and Kobayashi, 1992; Tang and Minneman, 1990; Harrison, Minneman and Marinacci, 1999).
Even in situations where there was no specific drawing or writing, researchers realized that showing hands made natural conversation clearer since deictic references (e.g. when people say “this” or “that”) could be better understood. One configuration found to be particularly well-suited for this is the over-the-shoulder camera. In this arrangement, the back of the head, the work surface, and the computer on the work surface would be made visible. In fact, the remote party would often be visible as well.
What was Learned
Not all -- or even very much -- of industrial research results in products, features or classes of products. Mostly, research results in findings and implications for design. So what was learned from all this media space research?
In the early 1990’s, one of the land line phone companies (this was still before the cellular explosion) had a television advertisement showing a physically separated family sharing Thanksgiving. The dining room had a wall-sized display that somehow afforded eye contact, even when pressed up against it. Interestingly, in the mid-2000’s, households did begin to share family events over teleconferencing-like systems, but they are a far cry in form from that advertisement.
Ithiel de Sola Pool’s vision of nearly free bandwidth has come to pass (albeit using copper coaxial connection and broadband radio frequency broadcast as well as fiber optics). It is now possible for grandparents to read books to grandchildren, for parents to check in on the state of dorm rooms of their college-age kids, for dog owners at work to watch their pets roam around the house, and for separated lovers to “kiss” each other goodnight from different continents.
From Calls to Babysitting -- What Hath Skype Enabled?
It is not surprising that desktop conferencing systems such as Skype have been appropriated for uses in domestic settings. Ever since laptops and PC’s started coming with built-in cameras and home digital connectivity became good enough to stream HD, people have found ways to employ them to maintain social bonds.
While actual numbers are not known, there is tremendous anecdotal evidence of people using desktop conferencing systems to talk with loved ones (Judge and Neustaedter, 2010). This seems to be most common between grandparents of young grandchildren and the children’s household. In one study, researchers found that grandparents had a standing date to be “present” for their young grandchild’s bath. Mom would haul the laptop to the bathroom and prop it so that every splash and giggle would come through.
In other studies, meal preparation time in two households is shared. Since this is carried out using laptops intended for face-oriented teleconferencing, the wide angle image cannot show the details of food preparation, but the visible motion of people cooking nonetheless is reported to “feel” like being present (Neustaedter and Judge, 2010).
The “helicopter parent” phenomenon – parents and college age children who are constantly electronically chatting and texting – extends to conferencing. Of course, adding a visual channel provides added (perhaps not well-founded) reassurance for parents and a small sense of home for the college student. “Are you well? You don’t look too good.” Or “Are you wearing the same shirt again today? Did you ever do your laundry?”
One-way connection (a euphemism for the “webcam”) leads to a couple of awareness applications. Some day care centers provide webcam access to parents, and pet owners can set up their own webcams to keep an eye on Fido or Kitty. These are fairly removed from the conversational, face-oriented world of teleconferencing. Yet they stem from a similar sense of the social world and the value of visual reassurance.
Separated couples also use this to feel some stronger bond of connection than the disembodied voice of the phone call. Often coming closer to the screen and camera than in a desktop conference, some report that the visual disconnect of broken eye contact, fuzzy imaging, and gestures that appear mis-directed (like kissing the screen which completely misses the camera-as-surrogate for the partner) often are worse than the imagined image in a voice-only good-night call.
Will we see the wall-sized, eye-contact fixing, spatialized-audio Thanksgiving conferencing system? Will we see specialized pillow-talk systems for intimate but separated good-night kisses? Will we really want these things?
The New Weirdnesses
So we have moved from the structured setting of the work meeting, where a conference table can join people in meetings with known expectations for behavior, place them a socially appropriate distance apart, and have them converse, to the variable, sometimes chaotic, and often fraught world of the home.
Mediated connection is not the same as being there. That may change. But we do know from all three universes that teleconferencing, like other media, has effects. There are very different and peculiar behaviors and expectations. And these will probably change again as new forms of the technology emerge.
Cell Phone Behavior and Teleconferencing Expectations
Most of us have come to terms with the shifts in what is acceptable and expected behavior that the cell phone has brought. We at least put up with people who hold intimate phone conversations in the aisles of a supermarket; we pretend not to hear. We may not even notice any more when this happens.
Likewise, we have shifted our expectations of behavior in meetings with the rise of teleconferencing. The “power seat” has shifted from the one opposite the door to the one most central to the transmitted image; even in very good teleconferencing systems (e.g. Cisco’s Telepresence), some people are more charismatic on the screen (telegenic) than their co-present selves; the lack of eye contact in desktop systems can be disconcerting; and – like the cell phone – turn taking in conversation becomes self-conscious and conversations often become loaded with false starts at utterances. But we cope. And we even rely on the systems.
Your Face on their Laps – iPads and More Dis-Locations
One particular shift in expectation is dislocation. We now look at people as though they are crouching with just their head sticking up through our desks. Casual conferencers using iPads will put the image of the other on the table or, even weirder, on their lap. When a media space user would arrange the camera for an over-the-shoulder shot and the other would do a face shot, the logic of the space and its lack of visual reciprocity would make it hard to know how to orient viewing. In modest conferencing systems, we see remote locations placed on the wall of the room we are in, usually not in the same scale or perspective as our room. Endemic to high-end systems (and often in the modest ones as well), there are rarely any windows to the outside, in an attempt to enforce the “one-room” visual metaphor; the result is that the room could be anywhere (or nowhere). And, in the example of the Grandparents observing their grandson’s bath, the Grandparents were in the kitchen and the grandson was in the bathroom: what an odd conjoining of spaces that is!
Ubiquitous Interactive Display
Within the next ten years, almost any surface in the built environment can be an interactive display. There are enormous challenges to be overcome to make the relatively fixed world of conferencing an element in this ubiquitous display world. Let’s say I was walking along a hallway with three friends, two of whom were virtually displayed with me even though one was sitting in her office and the other eating lunch in San Francisco.
These three histories can also be thought of as three genres of teleconferencing. One of the interesting things about the idea of genre is that while different genres might use the same media, the content and the expectations of producers and consumers of a genre change the media, making a better fit with the genre. Take the medium of print: it can be used to make memos, newspapers, and books. But the tools used to make each have evolved to be specific to their aspects. Printing memos is usually done with small office printers while printers for newspapers and books are factory scale; newspaper printers trade off quality for speed, and books do just the opposite. And the printed matter looks different, has different form and structure, and is read differently. So what about teleconferencing?
We have seen how ubiquitous teleconferencing technology has been adapted for these very different settings and uses. Perhaps the market will respond to these different genres with more specific systems. How will the Grandparent teleconferencing system be different from the shared coffee room system? How will these influence conferencing systems? The future looks pretty exciting.
 I saw a video of the talk shortly after coming to PARC, but as of the date of this publication, I have been unable to find anything other than index references to it.
 The scanty documentation of this project at the artists’ website (not currently operational) showed a few still images and a snippet of a longer, now apparently lost documentary. Occasionally, parts of the video appear on Youtube and may be accessible. A longer discussion of Hole in Space can be found at Harrison (2008).
 This story is told in more detail in an article in Communications of the ACM (Bly, Harrison and Irwin, 1993) and a chapter in Video-mediated communication (Harrison et al., 1997). The vision for media space is set out in Media space by Bob Stults (Stults, 1986).
 This idea gets re-invented numerous times for many different purposes. See Buxton’s “Hydra” for another version of this idea but for multiple simultaneous connections (Buxton, Sellen & Sheasby, 1996).
 Microphones were another matter: ensuring privacy was usually done by making sure that mics were turned off.
 Like many media space ideas, shared spaces for hands keeps getting re-invented; see http://www.kickstarter.com/projects/jayne/portals?ref=users
 To read more about media space research, the ACM digital library has quite a number of articles. Also, see The Media Space, 20+ Years of Mediated Life (Harrison, 2009) which I edited and substantially contributed to.
 This matches the experience of camera manufacturers who report that the single largest trigger of camera purchases is not travel, but the birth of the first grandchild.
Adler, A. & Henderson, A. (1994). A room of our own: Experiences from a direct office share. In B. Adelson, S. Dumais, and J. Olson (Eds.), Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '94) (pp. 138-144). New York, NY: ACM. doi: 10.1145/191666.191727
Bellotti, V. & Dourish, P. (1997). Rant and RAVE: Experimental and experiential accounts of a media space. In K. Finn, A. Sellen, & S. Wilbur (Eds.), Video-mediated communication (pp. 245-272). Mahwah, NJ: Lawrence Erlbaum.
Bulick, S., Abel, M., Corey, D., Schmidt, J., & Coffin, S. (1989). The US West Advanced Technologies prototype multimedia communication system. Proceedings of GLOBECOM '89: IEEE Global Telecommunications Conference & Exhibition (pp. 1221-1226). Los Alamitos, CA: IEEE. doi: 10.1109/GLOCOM.1989.64149
Buxton, W. & Moran, T. (1990). EuroPARC's Integrated Interactive Intermedia Facility (IIIF): Early experience. In S. Gibbs & A. A. Verrijn-Stuart (Eds.), Multi-user Interfaces and Applications. Proceedings of the IFIP WG 8.4 Conference on Multi-user Interfaces and Applications (pp. 11-34). Amsterdam, The Netherlands: North-Holland.
Dourish, P., & Bly, S. (1992). Portholes: Supporting awareness in a distributed work group. In P. Bauersfeld, J. Bennett, and G. Lynch (Eds.), Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '92) (pp. 541-547). New York, NY: ACM. doi: 10.1145/142750.142982
Fish, R., Kraut, R., & Chalfonte, B. (1990). The VideoWindow system in informal communication. In Proceedings of the 1990 ACM Conference on Computer-Supported Cooperative Work (CSCW '90) (pp. 1-11). New York, NY: ACM. doi: 10.1145/99332.99335
Galloway, K., & Rabinowitz, S. (1980). Hole in Space. Project displayed at Lincoln Center, NYC and Century City, Los Angeles, November, 1980. Retrieved from http://hdl.handle.net/10020/cat723172
Gaver, W., Moran, T., MacLean, A., Lövstrand, L., Dourish, P., Carter, K., & Buxton, W. (1992). Realizing a video environment: EuroPARC's RAVE system. In P. Bauersfeld, J. Bennett, and G. Lynch (Eds.), Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '92) (pp. 27-35). New York, NY: ACM. doi: 10.1145/142750.142754
Gaver, W., Sellen, A., Heath, C., & Luff, P. (1993). One is not enough: Multiple views in a media space. In Proceedings of the INTERACT '93 and CHI '93 Conference on Human Factors in Computing Systems (CHI '93) (pp. 335-341). New York, NY: ACM. doi: 10.1145/169059.169268
Harrison, S. (2008). Seeing the Hole in Space. In T. Erickson & D. W. McDonald (Eds.), HCI remixed: Reflections on works that have influenced the HCI community (pp. 155-160). Cambridge, MA: MIT Press.
Harrison, S., Minneman, S., & Marinacci, J. (1999). The DrawStream Station or the AVC’s of video cocktail napkins. In IEEE International Conference on Multimedia Computing and Systems, 1999 (pp. 543-549). Los Alamitos, CA: IEEE. doi: 10.1109/MMCS.1999.779259
Ishii, H., & Kobayashi, M. (1992). ClearBoard: A seamless medium for shared drawing and conversation with eye contact. In P. Bauersfeld, J. Bennett, and G. Lynch (Eds.), Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '92) (pp. 525-532). New York, NY: ACM. doi: 10.1145/142750.142977
Judge, T., & Neustaedter, C. (2010). Sharing conversation and sharing life: Video conferencing in the home. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '10) (pp. 655-658). New York, NY: ACM. doi: 10.1145/1753326.1753422
Kraut, R., Rice, R., Cool, C., & Fish, R. (1994). Life and death of new technology: Task, utility and social influences on the use of a communication medium. In Proceedings of the 1994 ACM Conference on Computer Supported Cooperative Work (CSCW '94) (pp. 13-21). New York, NY: ACM. doi: 10.1145/192844.192858
Mantei, M., Baecker, R., Sellen, A., Buxton, W., & Wellman, B. (1991). Experiences in the use of a media space. In S. P. Robertson, G. M. Olson, and J. S. Olson (Eds.), Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '91) (pp. 203-208). New York, NY: ACM. doi: 10.1145/108844.108888
Neustaedter, C., & Judge, T. (2010). Peek-A-Boo: The design of a mobile family media space. In Proceedings of the 12th ACM International Conference Adjunct Papers on Ubiquitous Computing - Adjunct (Ubicomp '10 Adjunct) (pp. 449-450). New York, NY: ACM. doi: 10.1145/1864431.1864482
Roseman, M., & Greenberg, S. (1996). TeamRooms: Network places for collaboration. In M. S. Ackerman (Ed.), Proceedings of the 1996 ACM Conference on Computer Supported Cooperative Work (CSCW '96) (pp. 325-333). New York, NY: ACM. doi: 10.1145/240080.240319
Stafford-Fraser, Q. (2001). The life and times of the first web cam: When convenience was the mother of invention. Communications of the ACM, 44(7), 25-26.
Tang, J., Isaacs, E., & Rua, M. (1994). Supporting distributed groups with a montage of lightweight interactions. In Proceedings of the 1994 ACM Conference on Computer Supported Cooperative Work (CSCW '94) (pp. 23-34). New York, NY: ACM. doi: 10.1145/192844.192861
Tang, J., & Minneman, S. (1990). VideoDraw: A video interface for collaborative drawing. In J. Carrasco Chew and J. Whiteside (Eds.), Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '90) (pp. 313-320). New York, NY: ACM. doi: 10.1145/97243.97302
Copyright 2013 Communication Institute for Online Scholarship, Inc.
This file may not be publicly distributed or reproduced without written permission of