1 Introduction

Anyone who was a regular participant at the research meetings at The Center for Semiotics in the early 2000s will at some point have seen Per Aage Brandt perform an artistic and ritualistic unfolding of an intricate mental space network (Fauconnier and Turner 2002). It was artistic because of Per Aage’s aesthetic drawings on the blackboard, as well as his intense and improvised manner of talking and gesturing his way through elaborate, in-depth analyses – often complete with piano accompaniment. It was ritualistic because the analyses always, always culminated in a triumphant recapitulation of his trademark theme: blended semiosis only makes sense when grounded in the circumstances and relevancies of the particular interlocutors and their engagement in the specific situations of their interaction.

The formal upshot of the “Aarhus school’s” semiotic approach to mental spaces and conceptual blending was a “five (sometimes six) space model” whose main signature included a “Semiotic Base Space” – a more comprehensive version of Langacker’s (2001) notion of “Ground”. In the Aarhus model, networks of mental spaces and blends emerge as meaningful phenomena relative to the concrete world in which the communicative act is situated. Consequently, blends apply “virtual” renderings to situations in accordance with matters at hand represented in the Semiotic Base Space.

Our general goal in this article is to push forward cognitive semiotic theory in relation to the reality of current, daily communication, which to an ever-increasing degree is mediated by digital audio-visual technology (AVT) platforms. We pursue this goal via a small set of case studies that explore how this technology changes and challenges social interaction and how participants exploit and adapt cognitive, embodied, technological, and semiotic resources in creating meaningful, collective, virtual spaces of joint social activity.
We pay homage to Per Aage Brandt by incorporating insights from the Aarhus model presented in Brandt and Brandt (2005). By seeing the construction of virtual meeting spaces as constructions of immersive blends in which social activities unfold, our analyses focus on how the Semiotic Base Space is itself the object of constant reconstruction, organization, and interpretation.

2 The Aarhus model

The classical model of Conceptual Blending Theory (CBT; Fauconnier and Turner 2002) has many virtues, most notably its account of how novel combinations of conceptual structure can help speakers to achieve their communicative goals. A framework of conceptual structure, however, is not sufficient to explain the dynamic facets of usage in particular discourse ecologies, a growing concern of cognitive semiotics. How, then, do we “semiotize” CBT?

An initial step came from Brandt and Brandt (2005), the publication ushering in the Aarhus model of conceptual blending that was subsequently elaborated in various guises by Coulson and Oakley (2005), Hougaard (2008), Oakley (2009, 2020), Oakley and Coulson (2008), and Oakley and Pascual (2017). Perhaps the most conspicuous difference from the mental space networks developed in Fauconnier and Turner (2002, hereafter F&T) is the addition of the Semiotic Base Space (often referred to as Grounding and Grounding Space by Oakley and co-authors). With F&T’s conceptual structure model, social dimensions are routinely treated as exogenous variables. By contrast, Brandt and Brandt (2005) emphasize the endogenous sociality of semiotic expressions.
The addition of the Semiotic Base to the formalism requires the analyst to consider how meaning emerges from these interactional concerns.

As Brandt and Brandt (2005) conceive it, the Semiotic Base Space comprises three determinants, represented diagrammatically below as three concentric spheres (see Figure 1).

Figure 1: Semiotic base space (inspired by Brandt and Brandt 2005: 226).

The innermost sphere designates expressive acts as such, with the default being face-to-face interaction. The innermost sphere may, in fact, be phylogenetically and ontogenetically primordial for establishing primary and secondary intersubjectivity (Trevarthen and Hubley 1978). This sphere encompasses intercorporeality (Meyer et al. 2017) as well as basic sociocognitive operations, such as joint attention, alignment, and expressive imitation. One might regard this inner sphere as comprising the personal and peripersonal space of bodily expression. The sphere directly enveloping it is the sphere of framing, that is, the particular situation of participants, audience, exigence, and constraints (Bitzer 1969). This framing sphere is, in turn, enveloped by a third sphere, the “pheno-world,” or life world of given semantic domains (see Brandt 2004; Oakley 2020, Ch. 9).

In any communicative act, the Semiotic Base Space connects to two “input” spaces, a Presentation Space and a Reference Space, respectively. The Reference Space pertains to the topic at hand, the object signified, while the signifier is introduced in the Presentation Space. The construal in the Presentation Space is integrated with the object in the Reference Space in the Virtual Blend Space. Crucially, the blending in the Virtual Blend Space is accomplished by an Interpretant whose cares and concerns constrain the relevant emergent inferences.
Those emergent inferences in turn impact the Semiotic Base in the circuit of meaning.

Blending takes place in time and space and under specific conditions and constraints that must be “baked into” the semiotic account for it to have sufficient explanatory power. Consequently, the Semiotic Base serves as the launchpad for how mental space networks emerge, generate meaning, and thus influence thought and actions. Importantly, these circuits of meaning develop relevance that is socially shared and framed (Brandt and Brandt 2005). The Aarhus model here bears resemblance to Erving Goffman’s account of “primary frameworks”, “keyings” (1974, chs. 2 and 3), and of “guided doings” (22) that usher in the inherently social and astoundingly complex process of attending to some aspects of the situation while ignoring others. As we elaborate in Section 4, computer-mediated communication presents demands on these processes that differ from those in face-to-face interactions and impact sensemaking in interesting ways.

3 Computer-mediated communication

Research in computer-mediated communication (CMC) has a long-standing interest in the forms of “presence” that technology affords, dating back to its introduction by Short et al. (1976). This interest has not, unfortunately, led to a consensus about what presence is and how it should best be studied. For example, Short et al. (1976) defined “social presence” as “the degree of salience of the other person in the interaction and the consequent salience of the interpersonal relationships” (Short et al. 1976, p. 65). Four years later, Minsky (1980) introduced “telepresence” in connection to remote-controlled robots that would optimally “feel and work so much like our own hands that we won’t notice any significant difference.”

These early works ushered in a series of studies of social presence that led to further, related concepts (e.g. “self presence”, see e.g.
Ratan 2012), and their application to and modification for various emerging technologies and contexts (see e.g. Biocca 1997 on presence in Virtual Reality). In a comprehensive review, Lombard and Ditton (1997) identify six conceptualizations of presence: presence as social richness, realism, transportation, immersion, social actor within medium, and medium as social actor. More recent review articles document the massive growth of social presence research, show the challenging variety of definitions, and testify to the struggle to achieve a standardized terminology (see, e.g., Lombard and Jones 2015; Löwenthal and Snelson 2017; Oh et al. 2018).

The sources cited so far overwhelmingly concern research that can be categorized as “(socio)(media)psychological.” In keeping with trends and interests in this area, the aspiration is to measure the psychological experience of presence and to do so on the basis of valid and reliable definitions and methods. In these schemes, “factors of presence” (aspects of the mediation) typically matter in terms of how they enhance or lower the experience of presence. The assumption is, then, that “the” experience of presence exists as a unified phenomenon that can be defined, operationalized, and measured.

Complicating matters further is a line of work by researchers whose focus is more on the “constructivist” and experiential aspects of presence. This includes presence concepts and models based on embodiment theory and cognitive science (e.g. Biocca 1997; Mennecke et al. 2011; Triberti and Riva 2016), phenomenological description (e.g. Dreyfus 2009; Gleason 2016; Grabarczyk and Pokropski 2016; Hougaard 2021; Klevjer 2012; Zahorik and Jenison 1998), and social interaction analysis (e.g. Schulze and Brooks 2019).
This work seeks to characterize the actions, resources, and conditions for the achievement and experience of presence, and to situate it in the philosophy and science of human experience and sensemaking.

We enter the “constructivist” conversation on presence in technology use with a microanalytic exploration of the cognitive semiotic resources of engagement in social technological spaces. We make no assumptions about particular measurable feelings of presence. Instead, we consider how participants create and manage social, mediated spaces as practical, interactional, cognitive semiotic achievements. In doing so, we contribute an interactional, cognitive semiotic take on the accomplishment of mediation and the construction of virtual spaces.

A similar approach was articulated by Steuer (1992: 77–78) in his discussion of “virtual reality.” Rather than treating mediation as a transfer of information in a process linking sender and receiver through a medium conduit, Steuer focuses attention on media as “environments” that are first created and then experienced. Thus Steuer proposes that, with virtual reality – as with older technologies – users experience a “reality” which derives not from their physical spaces, but, rather, from a shared, mediated space. Here we propose that this virtual environment is accomplished, orchestrated, and maintained via the interactional application of semiotic resources and the socially shared effort of conceptual blending.

The move away from a simple transmission or magic mirror model of mediated communication and towards a virtual environment model also entails a reconceptualization of mediation itself. People, objects, and the physical environment are not “merely” transferred as representations through a medium. Rather, representations are emmediated – managed as in and of the medium – to become part of the shared, virtual environment.
This point can be illustrated by looking at one of many cartoons and memes that have been shared on social media in 2020–2022, depicting recognizable experiences of working from home during COVID-19 lockdowns (Figure 2). Synchronous AVT communication involves participants’ extended efforts to emmediate select features of their physical, corporeal presence for and with others such that they create a socially relevant or sanctioned appearance of themselves as bodies for and of the physically manifested medium.

Figure 2: SoMe meme: working from home.

The cartoon helps illustrate the sense in which computer-mediated communication involves conceptual blending between the speaker’s Reference space and the technologically mediated Presentation space. The Reference space may correspond to F&T’s “focus” or “topic” spaces, and relates to actuality, or human manifest reality. The Presentation space may be regarded as a “predicating space,” and is often a figurative scene or scenario (see Brandt and Brandt 2005: 227–230 for discussion). In the case of a Zoom meeting, however, the Presentation space is the audiovisual representation of the speakers (and listeners). Semiotic structure from these two spaces integrates in a Virtual Blend to yield emergent structure. In Figure 3, the emergent structure in the Virtual Blend is the understanding that the worker is wearing dress pants that match the formality of his shirt. His professional attire feeds into the Semiotic Base, reinforcing the framing of the exchange as a business interaction.

Figure 3: Conceptual blending network for a worker’s outfit on a Zoom call (diagram inspired by Brandt and Brandt 2005: 229, passim).

Notice, too, how the cartoon presents part of the man’s physical environment as a “meeting view zone”. Physically, the man and the white shirt he wears for the meeting are where they are. But phenomenologically, selected parts of the man’s physical environment are in the “zone” of a meeting which exists nowhere physically.
The meeting exists only in the technologically facilitated and represented combination of selected, physical zones and their contents. In other words, the man’s torso, white shirt, and orderly background have been selectively emmediated for the shared virtual environment and comprise the focal elements of a Presentation space.

The cartoon’s communicative intention is to contrast the orderly presentation of “work” with the disordered clutter of “home”. The effect is to highlight the blended reality of the new, “bifurcated” experience of working from home, as the Reference space highlights “home” and “privacy” that collide with the Presentation space of “staged work.” The virtual space “pans out” to reveal the bifurcation of appearance versus reality as the meaning of the cartoon: the appearance is the presentation of the working self; the reality is the extended view of the home beyond the surveillance of the computer camera (see Figure 4).

Figure 4: Cognitive semiotic analysis of the “working from home” cartoon shown in Figure 2.

The cartoon also illustrates how AVT challenges interpersonal communication and thus presents circumstances which are not written into the semiotic account of sensemaking (Brandt and Brandt 2005). The man in the white shirt participates in a meeting, but he is not in a shared, physical space of enunciation. The only relation in the man’s peripersonal space is between himself and an object (the computer). The innermost sphere of expressive acts is disintegrated and yet communication continues to occur. This changes the status of the interpersonal relationships and spreads through the spheres of the Semiotic Base to challenge their alignment. In other words, the semiotic “playing field” has changed, as illustrated in the examples below.

4 Zoom meetings

Because the ground of communicative interaction often goes unnoticed, it is best explored in cases of breakdown.
The case studies below are all excerpts from Zoom meetings recorded during the COVID-19 pandemic and all involve varying degrees of breakdown. All are cases of “organizational communication” and thus allow comparison. However, the meetings also constitute three very different contexts, each with very different participants. We suggest that the conditions of synchronization in AVT are sufficiently different from face-to-face interaction to leave distinct traces on the types of expressions used therein. As will be apparent, the segments under analysis present situations whereby the co-present virtual actors perform, apropos of Goffman, a range of moves: from clumsy conversational moves, to conversational moves made clumsily, to subtle variations on moves from face-to-face conversation.

4.1 I’m not a cat

The first segment involves the beginning of a formal court hearing of the 394th (Federal) District Court of Texas in which an Officer of the Court, attorney Ron Ponton, joins with a Zoom filter turned on, effectively metamorphosing him into a small kitten. The effect was so peculiar and potentially embarrassing for Mr. Ponton that the presiding judge uploaded the video clip to the Court’s website, directing future court plaintiffs, defendants, and officers to check their Zoom settings before “arriving.” The clip was widely shared on social media. We suggest it represents a prototypical clumsy conversational move.

Figure 5 is a still image of the Zoom screen for the clip available for viewing at the URL listed in the caption.
The following discourse occurs in the transcript:

Ponton (lawyer): I’m prepared to go forward with it … I’m here live … I’m not a cat.
Judge (video off): I can see that – I think … if you click the up arrow:

Figure 5: Screenshot of the first segment ‘I’m not a cat’ available for viewing at https://www.youtube.com/watch?v=lGOofzZOyl.

The judge posted this video clip with the statement: “If a child used your computer before you join a virtual hearing, check the Zoom video options to be sure filters are off. This kitten just made a formal announcement on a case in the 394th.” Figure 6 presents a blending analysis of the construal that supports the judge’s reference to Ponton as “this kitten”. The Semiotic Base in Figure 6 refers to the two interlocutors in the exchange, Ron Ponton (the lawyer) and the federal judge. Two others, Jerry L. Phillips and Gibbs Bauer, are participants in the hearing. The purpose of the exchange is a petition for which the 394th Judicial District Court has jurisdiction.

Figure 6: Blending diagram for the ‘I’m not a cat’ example.

Figure 7: Teacher.

Figure 8: Child 4, kicked her out.

Figure 9: Child 7, wide eyes.

In the Virtual Blend Space, the judge’s video is turned off so that he is manifested as a voice, Bauer and Phillips are video-graphically present but not speaking, and Ponton is speaking through a kitten filter. It may be worth noting that the kitten filter does not produce cat-like noises. The lips conform to the words uttered by the real Ron Ponton, and the eye movements of feline Ron Ponton seem to track those of the human Ron Ponton in real time, reflecting his worried attempts to remove the filter.
Clearly, the hybrid kitten/lawyer in the Virtual Blend has a disruptive impact on the proceedings.

In the Virtual Blend Space, the emmediated feline-attorney, referring to himself as “I,” states his intention to “go forward with it [the hearing],” but feels compelled to profess “I’m not a cat.” The relevance of the negated cat frame comes, of course, from the kitten in the Presentation space, and Ponton’s negation is an entreaty to his interlocutors to undo the integration in the Virtual Blend. Although emmediated as a feline, Ponton’s words are to be taken as emanating from the lawyer in the Reference space. The court is to disregard the visage of the speaking agent in the Virtual Blend, as it is not relevant to the legal proceedings. Ponton’s locutions as a feline are to count as illocutions in the virtual hearing space that exists nowhere physically but relies on the cooperation of its participants for its realization as social fact.

The digital platform that imposes the visual appearance of a kitten dominates the situation perceptually in a way that upsets its official frame. In so doing, it demonstrates how AVT challenges default assumptions of the semiotic approach to blended semiosis: participants are not given as directly, corporeally co-present. Prone to technological “invasion” that cannot be predicted from the communication itself, the humanly constituted communicative situation is precarious. The constant threat of presentation breaches may leave otherwise taken-for-granted aspects of semiosis uncertain – that participants will know what the referred situation is, what can be expected of the ground, that relevancies are inferable, and so on. This shifts the focus from how the Semiotic Base generates relevant blends to how the network generates concerns for the Semiotic Base.

Still, while technology constrains emmediation in a Zoom event, it does not determine it. Socially negotiated framing is also relevant.
Hearing the feline Ponton’s declaration that he is not a cat, the judge seems to sanction it in his response, “I can see that.” Despite Ponton’s cat-like appearance, the judge uses the sense perception verb “see” epistemically, to mean “know,” though the addition of the modal “can” suggests it takes some effort to “see” Ponton as a lawyer and not as a cat.

Syntactically, the “that” in the judge’s utterance is a bare complement, a means of indicating “you are not a cat” without uttering the entire complement clause. We suggest the reason the judge can zero out the complement is the ready availability of the Ponton in the linked Reference space. Although the real Ron Ponton is only accessible through the feline Ron Ponton, the judge’s utterance indicates he is suffering no delusion that a lawyer in his court is a cat. Moreover, because the judge’s culturally conferred position gives him the authority to determine social facts in his virtual court room, the kitten is allowed to petition the court.

One meta-reflection that this introduces is whether aspects of appearance on AVT are due to “agentive” features of the technology or due to actions, lack of actions, or skills of the user. Kittens aside, many discussions and simple noticings during an AVT meeting may concern what is possible and what is not possible given the technological framework, what is a result of the technology and what is not, or how the meeting is experienced given its tech-borne nature. Thus happenings during AVT meetings may be framed as of the technology and as less negotiable than conditions that are framed as of the social and competent activity of a meeting. Although negotiation among the participants is an essential component of a meeting, the technology and the way it facilitates communication is also a force to be reckoned with.

4.2 We’re toast

Our second example is a segment of a remote class activity that made its way to broadcast news.
While the previous segment might be considered a “clumsy move,” this one is best classified as including a “move done clumsily” (Figures 7–9).

1 Teacher: Can everyone hear me? [Shares screen]. We’ll try and at least discu[ss]. [Connection interrupted.]
2 Child 1: Guys what happened?
3 Child 2: She left the meeting.
4 Child 3: Yeah, she left the meeting.
5 Child 4: It kicked her out the internet.
6 Child 5: Can you guys see me? [holds an object up to the screen].
7 Child 6: [Starts singing].
8 Child 7: Um all of you guys, I see that there’s a little record sign at the top of the screen. So the teacher’s recording this. She can watch us and see that we’re not behaving.
9 Child 1: Yeah, I know.
10 Child 7: And the principal … [eyes widen]
11 Child 1: Yeah, we know.
12 Child 3: I’m just muting my mic, ‘cause I don’t want to talk.
13 Child 7: So, we should behave.
14 Child 2: Yeah. Yeah, you’re right. [Nodding head].
15 Child 1: Yeah, my mom’s filming that. I’m eating gummies.
16 Child 7: Cause if the principal finds out that the video of … if we were being bad right now, she would see us being very bad and, uh, we would be in big trouble with our parents.
17 Child 6: Yup, we’d be toast
18 Child 4: We’re toast
19 Child 2: Who is talking?
20 Child 6: Toast means we’re in trouble.
21 Child 1: Mm, I want toast
22 Child 8: Guys, she doesn’t have our number. Our phones.
23 Child 7: And um, whoever said, we’ll be toast, whoever said that, I’ll maybe have toast for breakfast tomorrow.
24 Child 6: I said that.
25 Child 1: That sounds funny.
26 Child 8: She doesn’t know our numbers.
27 Child 6: I said that, I said “toast”
28 Child 1: With toast and eggs that sounds really good with lettuce
29 Child 9: Mhmm
30 Child 6: Not eggs!

Available for viewing at: https://www.youtube.com/watch?v=gPr8nComZVg

Figure 10 depicts a blending network for the overall scenario in the ‘We’re toast’ clip. The Semiotic Base of this network consists of an elementary school class; the participants include a teacher and students.
More specifically, this scenario is of a teacher-led discussion, the activities of which are sanctioned by the school calendar, constituting a legal part of the 180-day curriculum. The teacher has shared a text editor on her screen, presumably to write comments as the discussion ensues. The shared screen abruptly disappears, as does the teacher’s box in the Zoom interface, and the lesson devolves into free-play and conversation. One child holds something up to his camera and asks, “Can you guys see me?” (6), while another starts singing.

Figure 10: Blending network for the situation in the “We’re toast” segment.

As noted above, the expressive acts in a Zoom meeting take place in spatial-temporal isolation, as each interlocutor strategically recruits the technology to emmediate their presence in the meeting (Figure 11). This example again highlights how emmediation is governed not just by technological affordances, but involves social work. In this case, the sanctioned emmediation as “pupil” has been led by the teacher. But in her absence there is no activity-sanctioning authority: the joint, timed, and interactionally coordinated effort whereby the participants create and sustain the lesson as a medium-borne synchronous, social space dissolves, and the lesson is no longer a techno-social fact.

Figure 11: Emmediation as class.

As the teacher’s virtual self is disconnected from the meeting, Child 2 and Child 3 both describe the event as agentively initiated by the teacher (e.g., “She left the meeting”). By contrast, with “It kicked her out the internet,” Child 4 recruits the caused-motion construction X Causes Y to Leave Z (Goldberg 1995: 152–179) to activate an embedded blend in which unknown agents “kick her out” of the virtual space, as if she were bodily removed from a physical one.
The ensuing shift from focused discussion to spontaneous expression is analogous to that typical of children in a physical classroom when the teacher steps into the hallway without designating another adult as the new authority figure.

However, the children’s understanding of the AVT leads to utterances and behaviors that would be unlikely to occur in a physical classroom setting under analogous conditions. For example, in 6, Child 5 poses a question regarding his visibility to others (“Can you guys see me?”) as he emmediates a small white object. Even more striking is Child 7’s utterance in 8 with an explicit reference to the AVT and its relevance for the situation: “Um all of you guys, I see that there’s a little record sign at the top of the screen. So the teacher’s recording this. She can watch us and see that we’re not behaving.”

Whereas Child 1 in 2 addresses the group with “Guys,” and Child 5 in 6 does so with “you guys,” Child 7 in 8 uses “all of you guys” – the quantifier presumably signaling that the utterance applies to everyone on the call. Attention-orienting devices that might work well in the physical classroom – shouting and full-body gesticulations – are simply not as effective in the Zoom classroom. Without bodily co-presence, participants must resort to linguistic resources to capture the attention of interlocutors in atomized physical spaces and to direct it within the emmediated social space.

Moreover, to do so, Child 7 appeals to a feature of the Presentation Space (the Recording sign) potentially available to all of the participants. While she reports her own observation, “I see that there’s a little record sign,” its location is described as “at the top of the screen” – not the top of MY screen, or the top of YOUR screen, but the top of “the” screen, consistent with the perspective of a socially shared screen.
The record sign in the Presentation Space is linked to the recording action in the Reference Space as Child 7 says, “So the teacher’s recording this.” While in theory the proximal deictic “this” might refer to Child 7’s utterance, the subsequent context (“She can watch us and see that we’re not behaving”) suggests it extends to the entire episode and its failure to conform to the normative standards of a real lesson.

The others register agreement in 9 as Child 1 responds first for herself with “Yeah, I know”, and then in 11, “Yeah, we know” for the group. In 12, Child 3 signals his intention to behave with an explicit reference to his interaction with the technology, “I’m just muting my mic, ‘cause I don’t want to talk.”

Despite the apparent uptake of her entreaty to the others to behave, Child 7 continues in 16 with a discussion of the implications of the teacher’s use of the recording to punish them for their unruly behavior. Using the present progressive, Child 7 elaborates, “if we’re being bad right now,” thus emphasizing the temporal present by calling attention to the shared here-and-now of the virtual classroom. She continues by asking the rest of the students to imagine a future space wherein the teacher and principal view their behavior. This surveillance scenario looms large in her imagination as her eyes widen in fitful contemplation.

Although less chaotic than the initial part of the clip, the conversation in 16–22 is still a bit disjointed. Child 6 (17 “Yup, we’d be toast”) and Child 4 (18 “We’re toast”) respond rather directly to Child 7, as does Child 8 whose disagreement in 22 (“Guys, she doesn’t have our number.
Our phones.”) is at odds with the on-going shift in topic that begins before 22 in 19–21, continues afterwards in 23–25, and persists in 27–30 – despite Child 8’s repeated objection in 26 (“She doesn’t know our numbers”).

The topic shift concerns Child 6’s completion of the hypothetical scenario raised by Child 7 in 16 (“if we were being bad right now, she would see us being very bad and, uh, we would be in big trouble with our parents”). Maintaining the subjunctive mood of Child 7’s utterance (“we would be in big trouble”), Child 6 in 17 says, “We’d be toast.” In 18, Child 4 employs the indicative “We’re toast,” potentially implying punishment is inevitable, though the interaction shifts to consider the idiom itself.

A full understanding of the idiom “We’re toast” implies a presentation scenario of making toast that – when integrated with the relevant concepts in the Reference space – conjures up a hyperbolic immolation scene in which the students are burned alive. In this scenario, the teacher, principal, and parents are referenced as the authorities who will be surveilling their recorded behavior and meting out punishment (Figure 12). It is unclear, however, whether the children fully consider this scenario, given the exchange that follows in turns 19–29.

Figure 12: We’re toast.

Child 2 in 19 asks, “Who is talking?” – another question that would be out of place in a face-to-face conversation. That is, in face-to-face interaction we usually know who has made a remark and don’t need to ask. The (lack of) affordances on a Zoom call, however, often make it difficult to infer the source of a brief interjection. Given the context of a discussion of potential punishment for misbehavior, it seems possible that this was an attempt to promote silence. Subsequent turns, however, suggest it was interpreted as a request to clarify the meaning of the utterance.
In 20, Child 6 notes that “Toast means we’re in trouble.” Child 1, who had earlier announced that she was eating candy, follows in 21 with “Mm, I want toast.” The Presentation space becomes salient as the discussion topic changes to food, hence the grayed-out spaces in Figure 13.

Figure 13: Toast for breakfast.

The uncertain identity of the speaker who introduced the toast idiom is emphasized by Child 7 in 23, “whoever said, ‘We’ll be toast’, whoever said that,” as she announces she might have toast for breakfast. Child 6 claims credit in 24, “I said that” (although Child 4 also used the idiom), and, after his contribution is lauded by Child 1 in 25, “That sounds funny”, Child 6 again emphasizes “I said that. I said ‘toast’.” All subsequent utterances in the clip recruit the literal meaning of toast initiated by Child 1.

In this way, toast becomes the new Reference space for the children, but the reason the surveilling adults (which includes us) find it funny is precisely the salient discursive presence of the metaphoric and hyperbolic blend of punishment-as-immolation (Figure 12). Here we see a formal lesson slipping into casual conversation in the absence of the proper authority to guide it, all the while being fraught with concern for the future status of the meeting as a real virtual lesson or a behavior eliciting a sanction.

4.3 The floor is yours

The final segment is by far the least conspicuous in its blending operations, but for this reason remains highly instructive for the effective management of computer-mediated communication. Below is a transcript of approximately thirty seconds of a ninety-minute meeting, beginning at the approval of the minutes and the introduction of the first speaker, Jason Dawson (Figure 14).

1 Andrew Brown (committee chair): All in favor say “Aye” - [Some speak, others use hand signals (e.g., thumbs up)].
2 Andrew Brown: Jason. Has Jason joined us?
3 Jason Dawson: [waves his hand next to his face].
4 Andrew Brown: He has … I see him [???]
Hi, Jason, how you doing?5Jason Dawson: Good, thank you.6Andrew Brown: Great. The floor is yours.Figure 14:Gallery view of the Waipo District Council meeting available for viewing at https://www.youtube.com/watch?v=3cZla4RyUUk.Consider in the first instance the screenshot of the meeting in gallery view, and suppose this is a likely mode of engagement for the committee members. As a participant one is confronted with a range of profiles, in either self-directed, other-directed, and sideways gazes. As confirmed by Schilbach et al. (2006), we are quick to assign social intentions and socially relevant behavior to self-directed gazes, intentional behaviors to other-directed gazes, and arbitrary expressive behavior to oblique gazes. Neurophysiologically, each sort of gaze activates a different network of brain regions, suggesting they initiate three distinct processing routines.These findings suggest a disposition to rely on eye gaze as a fast and reliable means of managing face-to-face interaction. In a Zoom meeting, by contrast, an other-directed gaze might correspond to a self-directed communicative intention and vice versa. Further, an askance gaze may be other-directed, while a self-directed gaze may, in fact, covary with unrelated or irrelevant expressions. Thus, gaze dynamics that tend to be immediately and reliably meaningful in person, are wildly ambiguous in this sort of AVT interaction. 
Establishing speaker provenance takes up considerably more bandwidth in Zoom meetings than in person and likely produces a range of linguistic effects, such as the palpable sense that we occupy unshared perceptual spaces, as presented in Figure 15.

Figure 15: Becoming a meeting participant.

As the committee chair moves from the approval of the minutes to the first order of business, he begins by directly addressing the first speaker in 2, “Jason.” When Jason is not immediately forthcoming, he continues with, “Has Jason joined us?” The use of the verb “joined” presupposes a group activity, and the first person plural signals Andrew’s alignment with the group. Indeed, the question itself is a bit odd and may be a by-product of the limitations of the AVT. In face-to-face communication, felicitous inquiries of “has X joined us?” involve big crowds or situations in which the perception of individuals in the group is inhibited in some other way. Given that the committee is not such a big crowd (though big for Zoom!), Andrew’s question in 2 shows orientation towards inhibited mutual perception in the group.

At the same time, Andrew recruits the group as a resource for the perception of its members. By addressing the group and not Jason, the question accommodates the possibility that Jason is not in the meeting and cannot come forth. And, even when Jason does come forth (in 3), he is not treated as fully in the group until his being spotted is confirmed. That is, rather than directly addressing Jason with, “Ah, there you are,” Andrew addresses the group with an answer to his own question (4, “He has … I see him”). The designation of Jason with third person pronouns marks Andrew’s orientation toward the group and emphasizes Jason’s status as an outsider.

It is only when the how-are-you-doing sequence (“Hi, Jason, how you doing?”) is initiated in 4 that Jason “formally” and “ritually” enters the group.
While the greeting is somewhat out of place in the midst of a public meeting (being more appropriate for the beginning of a private exchange), it contributes to the framing in the virtual space of Jason arriving from outside. Jason’s reply in 5 (“Good, thank you.”) is addressed to Andrew and functions as an appropriate response. In 6, with “Great,” Andrew closes the exchange, thus completing a “full sequence” in which Jason is sought, brought forth, and confirmed as a matter of collaboration between the chair and the group.

With “The floor is yours” in the next breath, Andrew invokes a common space with the idiomatic expression. Although there is no physical floor shared by all committee members, this speech act granting “the floor” proceeds without difficulty, as the virtual space accommodates the figurative floor for recognizing that Jason is now a full-fledged participant in the meeting. Committee members attend to Jason’s emmediated utterances as they would utterances made at an in vivo meeting – as contributions for semiosis in their shared social activity.

5 Discussion

In a dialectic process, we take as our point of departure general concepts of cognitive semiotics to confront the new media wild. Mediated communication is no longer at the periphery of everyday communication, and, if cognitive semiotics is to remain at the forefront of the study of human meaning, a theory of new media and mediation is needed. We have taken initial steps to consider how a cognitive semiotics approach to mediated communication in the form of AVT leads to the discovery of new issues and challenges some built-in assumptions in default or “canonical” cognitive semiotics descriptions.
Most relevant to the present discussion, the analyses above suggest that assumptions built into the Aarhus model – that the Semiotic Base Space entails the full presence of participants – do not obtain in the new media wild.

Sensemaking depends on coordinated activities across all three spheres in the Semiotic Base. As is apparent in our analyses, AVT presents some challenges to the smooth coordination of these spheres, most dramatically because the default of live, in-person, face-to-face interaction relies on facets of the perceptual space for delineating who the interlocutors are, when they speak, and the definition of their intended audience. Moreover, alignment of the expressive act, the framing, and the life world relies on physical and temporal constraints that differ in AVT and face-to-face interaction. The challenge faced can be summed up as a series of general transformations that affect how we think about communicative grounding and the role and nature of virtuality in our everyday lives.

Most conspicuous is our sense of space and place in these virtual interactions. The full range of personal, peripersonal, and interpersonal spaces is no longer fully available to specify, disambiguate, or otherwise facilitate joint attention.
Each participant occupies a separate and uncommon place; each individuated place must selectively project partial facets thereof to be integrated into a common, non-spatial emmediated “place”. There, matters of reference, including the use of deictic schemes discussed in the analyses, must be renegotiated to account for the new fact that space is neither canonically here nor there, and that entities such as this and that, these and those, as they pertain to each participant’s immediate communicative space, cannot be held constant across participants as they can in face-to-face interactions.

According to Nadler (2020), the lack of a stable, shared physical space may be a key factor in the recently discovered phenomenon of “Zoom fatigue.” An AVT meeting is a social event where a series of separate, personal spaces are connected. This means that the very existential basis of intercorporeal, mutual activities and direct, joint management of both intentional and unplanned events is absent. At times “turf battles” may thus be experienced as one engages in an institutional activity via AVT while physically embedded in a home setting (Nadler 2020: 9), and unplanned events in separate, personal spaces may interfere with the shared virtual activity.

The fact that these interactions occur over vast distances and different time zones makes palpable the role of time. The common emmediated place is at once synchronous and distributed, which often highlights the differences in our respective time zones. Our own meetings for this project entail, as part of the slowly drifting contextual information, each author’s diurnal frames – morning coffee and cereal for Seana; midday soft drink and sandwich for Todd; and early evening beer and steak for Anders. These diurnal frames can be completely irrelevant or highly focal at any given moment.
We suspect that, as for space and place, such matters may be more or less palpable depending on the demographic.

However, the technology also presents functionalities that affect the experience of time. For example, the role of time becomes quite acute for Child 7 in our second segment, as she imagines the goings-on “now” as becoming part of a “permanent record” of misbehavior in a future “then.” The easy means of recording combined with the relative ease with which each participant can “see” that they are being recorded greatly reduces temporal discounting by intensifying mental time travel (Boyer 2008): you can satisfy your impulse now, but you will definitely pay for it later! Applying blending terminology (F&T), we can say that the technology facilitates a compression of time where current behavior is subjected to future surveillance.

But perhaps more acute than time is the role of timing. The individuated spatial origin of each participant compromises quick access to orientation cues that dominate in-person communication, such as eye-gaze and posture. In the common emmediated place, timing is thrown off, accounting for the fact that it takes considerably more time and effort to orient turn taking. We see this most palpably in the gallery view interface of the Waipa District Council meeting, where it seems that Jason has just joined from “there” despite being present from the very beginning of the meeting.

Besides space, place, time, and timing, AVT impacts the communicating body. For example, one aspect of the way bodies are transformed into technological representations is hyperembodiment, in which users’ body representations are technologically enhanced (Hougaard 2021). Hyperembodiment means that users can, either deliberately or accidentally, create new kinds of hybrid intercorporeal experience.
We see this in “We’re Toast,” when Child 7 rallies the disintegrated group of pupils and warns them of the consequences of not “behaving.” With her face in framed, close-up focus, and her skin artificially lit, she appears with an enhanced embodied expression of fear as she widens her eyes (see Figure 9).

Further, the attorney-kitten in ‘I am not a cat’ highlights the fact that in AVT all bodies are emmediated as avatars to some degree. Filters, backgrounds, green screens, etc. allow for all manner of self-presentation, such that new norms and even rules need to be made explicit, as the simple choice of avatar can amplify the possibility of making clumsy conversational moves. It would seem that the mere fact that the lawyer’s true visual appearance contrasts with his claims about identity creates unease. Perhaps Ponton here violates a new norm of communicative ethics such that a party to a conversation should display as much as possible of their public appearance given practical circumstances. Indeed, such a norm might account for much discomfort in problematic episodes of AVT meetings. Ethics of appearance are at stake, and this can lead to breaching the social trust that sustains a stable communicative ground.

Considering the issue of what counts as “full display” in an AVT meeting, the cases above suggest a clear norm: faces of interactionally available participants and some limited portion of the upper body and physical background. Interestingly, this is reminiscent of the linguistically connected, context-free talking heads in Saussure’s speech cycle. One might infer from this that AVT communication supports the Saussurian perspective that verbal communication is the talk that heads produce.
However, if Saussurian talking heads were sufficient for communication, AVT would proceed effortlessly, with no need to accommodate semiotic resources.

In fact, if anything, AVT communication shows us how the Saussurian model fails to capture authentic, embodied communication, since, given a talking-heads setup, participants manage social interaction in ways that highlight the absence of a shared physically embodied space – the very thing missing from Saussure’s model. It is more likely that the standard appearance reflects the best presentation of an interactionally aligned body given the circumstances and technological affordances.

6 Conclusion

Our analyses are deeply indebted to conceptual blending theory and the Aarhus model. In a mental space of nostalgic dreaming, we imagine ourselves having these discussions with Per Aage Brandt at the blackboard in a smoky room on Finlandsgade. We have no doubt that he would be quick to develop thorough answers, challenge every point made, and come up with a new model that incorporated even more than the three of us together have imagined while writing this paper.

Our ambition here has been to begin to spot the contours of a new arena of analysis. We have zoomed in on the way in which semiotic, interactional, cognitive, and embodied resources are applied and modified to manage, repair, and sustain emmediated, virtual spaces given challenges that follow from the AVT mediatization of communication. We have seen how the technology enters the dynamics of interaction to constitute a factor that requires management (I’m not a cat and The floor is yours) as part of the ongoing social activity, and we have seen how this may upset fundamental aspects of interpersonal communication such as trust (I’m not a cat). We have seen how the transformations of body, time, and space are reflected in language use (We’re toast and The floor is yours).
Finally, we have seen how embodied appearance is a concern, challenge and resource of AVT meetings (all cases).

With this we can begin to envision further research in this arena and ambitions on behalf of cognitive semiotics. A continuation of cross-disciplinary efforts already established around cognitive semiotics seems like a fruitful basis for a “new media branch.” Our analyses show how attention to the details of semiotic adaptation is central and requires an appeal to discursive, semiotic, interactional and intercorporeal approaches to joint sensemaking (e.g. Hougaard 2021; Oakley and Hougaard 2008) and human-computer interaction. Issues related to emmediated bodily appearance and experience also call upon neurocognitive approaches to social cognition and interpersonal communication. Finally, assessments and descriptions of relevancies, the ontology of the communicative ground, and background beliefs call upon phenomenological, ethnographic, and anthropological approaches.

Cognitive semiotics is a resourceful enterprise and Per Aage Brandt’s legacy is rich. Come, join us in the new media wild.
Cognitive Semiotics – de Gruyter
Published: May 1, 2022
Keywords: audio-visual technology; computer mediated communication; conceptual blending; mediation; social presence