Communication Institute for Online Scholarship

de Nooy 2013: Review of Decrire la conversation en ligne
Electronic Journal of Communication

Volume 23 Numbers 1 & 2, 2013

EJC Book Review

Review of Décrire la conversation en ligne: Le face-à-face distanciel, edited by Christine Develotte, Richard Kern, and Marie-Noëlle Lamy, Lyon: ENS Editions, 2011, 215 pp.

Juliana de Nooy
University of Queensland
Brisbane, Australia


In French, a convenient adjective – présentiel – has come into usage since the mid-1980s to denote a situation of physical co-presence, leaving the term “face-à-face” available to describe all conversations where participants face each other, whether across a café table or via screen interface. In English, however, “face to face” unfortunately remains the term of choice to denote non-technologically-mediated communication and non-distance education, despite the fact that over the last several decades we have been increasingly able to communicate face to face at a distance. “Face to face in distance mode” is the subtitle of a new book analysing online video conversations, and it emphasises the disjunction between visual contact and physical proximity.

Faces in close-up dominate the video chat screen, and so it is unsurprising that faces – gaze, expression and movement – are a prime concern in Décrire la conversation en ligne, in which ten established researchers, writing in French and English, explore a corpus of eight online video exchanges in French, conducted using MSN. These were recorded in 2007, precisely 20 years after the publication of a volume edited by Jacques Cosnier and Catherine Kerbrat-Orecchioni which serves as a reference point for comparison: Décrire la conversation.[1] In the essays of the 1987 volume, seven Lyon-based researchers undertook a multidimensional study of a video-recorded corpus of three face-to-face conversations en présentiel, analysing as exhaustively as possible their linguistic, gestural and interactional aspects.

To reduce the variables and facilitate comparative study, participants in the 2007 conversations were subject to constraints similar to those of the earlier cohort, and again included both genders and a variety of cultural and linguistic backgrounds. Explicit comparison of the two corpora is the focus of the contributions to the 2011 volume by Cosnier and Kerbrat-Orecchioni. Not all variables can, however, be controlled, and individual preferences and contextual factors mean that hypotheses regarding the specificity of videoconferencing as a mode of communication need to be tested against a larger corpus.

The researchers were free to choose their focus within the corpus, and it is interesting to see the range of factors explored (rate of speech, pauses, overlaps, gesture, facial expression, gaze, spatial context, openings, closings, interruptions, register) but also those aspects that did not receive sustained attention (notably gender and cultural differences, as pointed out by Kerbrat-Orecchioni). Guiding discussion throughout the book is the question of affordance, the material possibilities and constraints of the communicative environment: to what extent do the norms of communication in presence mode predominate? What behaviours can be attributed to the videoconferencing arrangement? What communicative opportunities do participants take up? How are technical difficulties circumvented? What techniques evolve to facilitate interaction?

The MSN interface involved two video screens (images of self and of interlocutor filmed via webcam) and a text chat zone. Recording software operated simultaneously on each computer and in some cases additional programs such as Gmail were open. In four of the conversations, an additional video-camera was positioned in the room of one or both participants to provide contextual data not captured by the webcam.

Lack of eye contact is perhaps the most salient difference between the MSN conversations and the conversations in presence mode. In a typical conversation from the corpus, Cosnier and Develotte (Chapter 2) find that speakers tend to watch not the webcam but the image of their interlocutor 70% to 90% of the time. Véronique Traverso (Chapter 6) notes that while the gaze of the speaker may stray from the image of the interlocutor, the listener tends to fix her gaze on the image of the speaker, which Kerbrat-Orecchioni (Chapter 8) compares to the difference in eye contact between co-located speakers and listeners.

Hugues Constantin de Chanay (Chapter 7) explores in detail the effects of the gaze in video exchanges, whereby to give the impression of eye contact, you need to look away from the image of your interlocutor and towards the camera. In fact if both participants look at the camera, visual contact is lost entirely; neither will see the other at all. Furthermore, because the images of self and interlocutor are close together on the screen, it is difficult to know which of the two images your partner is looking at with her downward gaze, so you are never sure of being watched or not. Paradoxically, only when you have the impression of being watched (because your partner glances at the camera) can you be certain that you are unseen.

As a result, listeners cannot use eye contact to show attentiveness and, under the illusion of not being watched, allow their faces to become expressionless (“à l’abandon”). But when a speaker produces “pseudo eye contact” by glancing furtively at the camera, the listener responds with intense facial animation. These reactions are less synchronized than in conversations in presence mode, but are more prolonged, resulting in an impression of intermittent theatricality, a curious alternation between expressive and inexpressive faces.

Cosnier and Develotte (Chapter 2) similarly situate the gaze within the context of a wider gestural repertoire and note exaggerated facial expression. They study the data from the supplementary video cameras, which reveal movements of the forearms and hands that cannot be seen on the interlocutors’ screens. The fact that these gestures continue unseen suggests that their function is enunciative rather than communicative, aiding the speaker rather than the audience. Compensating for the invisibility of hand and body gesture is increased facial activity, with the authors counting four times as many expressive facial movements as in the earlier data set, and hypothesising that facial expression assumes many of the functions of other gestures in video chat.

The seen and the unseen of videoconferencing also preoccupy Michel Marcoccia (Chapter 5) whose focus is the difference in spatial contexts between interlocutors. Marcoccia argues against those who consider computer-mediated communication to be beyond spatial constraints, arguing that such a view is simply the product of researchers confining their analysis to what appears on the screen and ignoring the physical spaces in which the devices are used and their impact on communication. Marcoccia too uses the data from the supplementary cameras to show “site effects”. Spatial context shapes the exchange in the form of a topic of conversation (“where are you?”), a disturbance (a third party becoming a ratified participant with only one partner of the exchange), and a frame (domestic space producing more personal and informal conversations than professional space in the corpus).

The gap between the seen and the unseen also figures when Marie-Noëlle Lamy and Rosie Flewitt (Chapter 4) analyse the interruption of another chat conversation opening on one machine, and the collaborative strategies used to close this episode and resume the initial conversation. They highlight the unequal access to the secondary exchange, the fact that erasures while keying in (typed hesitations) and mouse movements are visible to only one participant, and the restricted access to distractions, noises and body movements occurring at the remote site.

Interruptions, along with openings and closings, are seen by Anthony Liddicoat (Chapter 3) as strategic points at which the interaction moves between written and spoken interaction. For Liddicoat it is important not to neglect the identity and ratification work and the securing and testing of the channel that occur before participants begin to speak or resume following interruption. Bids to attract attention and start a conversation, achieved verbally or non-verbally in presence mode, are accomplished in video chat interactions through automatically generated formulae, making the computer an actor in the multimodal conversation. Visual and spoken interaction do not start simultaneously, and closure of the conversation needs to be achieved in two stages, interactionally and technically, such that periods of silence occur during openings and closings while the participants’ images remain available on screen. The interaction thus needs to be understood as a complex combination of modalities rather than as a conversation merely facilitated by the technology.

Traverso (Chapter 6) studies overlaps between speakers and ways of managing these. She too notes disjunctions in the information available to each participant, but this time it is auditory information: equalization by microphones hinders perception of changes in the spatial orientation of the speaker, and a temporal delay is noticeable in some recordings. These may be responsible for the frequent inter-turn pauses of 0.3-0.4 seconds and the slightly higher frequency of overlaps in the video chat corpus compared to that in presence mode, for 40% of overlaps arise after a pause, when both participants start to speak again at the same time. During overlaps, each speaker watches the image of the other, and in most cases, the overlap is resolved by one speaker rapidly abandoning the turn, most often without reattempting it. In the relatively few cases where neither speaker abandons the turn, this does not constitute a struggle for the floor (as would be the case in presence mode). Rather, Traverso hypothesises, continuing to speak and not reprising abandoned turns appear to be adaptations to the medium, ways of maintaining conversational flow when auditory quality makes turn-taking cues difficult to recognize.

Kerbrat-Orecchioni (Chapter 8) synthesises the findings of the book, compares them to those of the earlier volume, and situates them in relation to broader issues. The conversations are artificially produced in both cases: to what extent can the findings be extrapolated to other videoconferencing situations? If the general impression is one of less interactivity, is this attributable to less engagement (itself possibly due to the artificiality of the situation) or to the plural focus of participants juggling multiple screens and modes of communication? She notes the way in which technical expertise determines hierarchical positioning (rapports de place) and suggests that the differences noted between video conversations and those in presence mode may be less salient in cultures where rapid turn-taking, competitive overlaps and frequent eye contact are not the norm.

Although some of the features identified as distinguishing the video corpus from the corpus en présentiel can be described in negative terms (lack of eye contact, restricted access to spatial context), the authors do not subscribe to the deficit model of computer-mediated communication, for in each case, the constraints of the medium produce new affordances, new communicative possibilities. If the video participants pronounce fewer words per second, their facial movements are not only more numerous but more animated (Chapters 2 and 7). If hands are usually hidden, their rare appearance on screen adds emphasis to illustrative and deictic gestures (Chapter 2). If technical problems interrupt conversations, multimodality provides the tools for dealing with them (Chapter 3). If sound quality results in overlaps, speakers switch strategy to maintain fluidity (Chapter 6). If physical space is shared only partially, participants create a shared transactional space by glancing at the camera and ensuring that their upper body doesn’t move out of field (Chapter 5). The visual channel is both poorer (two dimensional, little background) and richer (one can see both participants) in video chat (Chapter 8). Ultimately, in each chapter, the differences identified become a conversational resource, exploited inventively as participants not only fulfil the brief of providing a research corpus but bend it to their own purposes.

The corpus has been made publicly available from the website of the Université de Lyon 2,[2] which opens to all the possibility of extending this research and in particular of making comparisons across languages, cultures, subcultures and situations.


[1] Cosnier, J., & Kerbrat-Orecchioni, C. (Eds.). (1987). Décrire la conversation. Lyon, France: Presses Universitaires de Lyon.

[2] CLAPI: Corpus de LAngues Parlées en Interaction. Retrieved February 6, 2013.

Copyright 2013 Communication Institute for Online Scholarship, Inc.

This file may not be publicly distributed or reproduced without written permission of
the Communication Institute for Online Scholarship,
P.O. Box 57, Rotterdam Jct., NY 12150 USA (phone: 518-887-2443).