21.15 Human-Machine Miscommunication
This section explores how errors in human-machine interaction arise and what they imply for technology design and user experience.
Human-machine miscommunication is a breakdown in the exchange of information between a human user and a computational or mechanical system, in which the meaning, intent, or content of a communicative act is not correctly transmitted, received, or interpreted by one or both parties. The human and the machine fail to share the information the interaction needs to achieve its purpose: the system does not accurately interpret the user's intent, the user does not accurately understand the system's output, or both. Human-machine miscommunication differs from simple system error. The system may be functioning without technical fault while still failing to communicate effectively with its user, because the gap lies in the representation and interpretation of meaning rather than in mechanical or computational failure.
Sources of Miscommunication in Human-Machine Interaction
Human-machine miscommunication arises from several distinct structural features of the interaction:
Model mismatch is the fundamental source: the user's mental model of how the system works, what it can do, and what inputs it expects differs from the actual system model. The user formulates inputs based on their model; the system interprets inputs based on its own design. Where the models diverge, inputs and outputs are subject to systematic misinterpretation. The user who believes the search system understands natural questions will ask a natural question; the system designed around keyword matching will extract keywords from the question while losing the question's pragmatic structure.
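The search example can be made concrete with a minimal sketch. It assumes a hypothetical keyword-matching front end that discards common function words before matching; the `STOPWORDS` set and the `extract_keywords` function are illustrative names, not the API of any real search system.

```python
import re

# Hypothetical stopword list; real systems use their own, usually larger, lists.
STOPWORDS = {"which", "do", "not", "in", "the", "a", "that", "to", "of", "are"}

def extract_keywords(query: str) -> list[str]:
    """Lowercase the query, split it into words, and drop stopwords."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    return [w for w in words if w not in STOPWORDS]

# The user asks a natural question whose meaning depends on the negation.
wanted = extract_keywords("Which flights do not stop in Chicago?")
# A question with the opposite meaning.
opposite = extract_keywords("Which flights stop in Chicago?")

print(wanted)              # ['flights', 'stop', 'chicago']
print(wanted == opposite)  # True: both questions reduce to the same keywords
```

Under these assumptions the two questions become indistinguishable to the system, although the user who typed the first one has no reason to suspect that the negation was dropped.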
Semantic gap is the divergence between the meaning the user intends to convey and the meaning the system assigns to the received input. Words and actions that have one meaning in the user's context may have a different meaning in the system's vocabulary, or may have no meaning at all. The user who asks a voice assistant to "play something relaxing" intends a complex semantic specification; the system's interpretation of "something relaxing" is defined by its category and tagging system, which may or may not align with the user's subjective concept.
Contextual gap is the difference between the contextual information the user assumes is available to the system and the contextual information the system actually has. Human communication is richly contextual — speakers assume shared background knowledge, conversational history, current situational awareness, and mutual knowledge of the communication's purpose. Machine systems typically have only the immediately provided input, without access to the surrounding context the user assumes is available. Inputs that are fully specified from the human communicative perspective may be radically underspecified from the system's processing perspective.
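One way to picture the contextual gap is as a set of slots the system must fill before it can act. The sketch below is a simplified illustration, not a description of any particular assistant: `REQUIRED_SLOTS`, `Request`, and `missing_slots` are hypothetical names, and the slot values are filled in by hand rather than by a real parser.

```python
from dataclasses import dataclass, field

# Hypothetical slots a reminder request needs before it can be executed.
REQUIRED_SLOTS = ("what", "date", "time")

@dataclass
class Request:
    text: str
    slots: dict[str, str] = field(default_factory=dict)

def missing_slots(request: Request) -> list[str]:
    """Return the required slots the system could not fill from the input alone."""
    return [name for name in REQUIRED_SLOTS if name not in request.slots]

# To the user, which meeting and when it starts are obvious from context.
# The system sees only the words, so only "what" can be filled.
req = Request(
    text="Remind me about the meeting before it starts",
    slots={"what": "the meeting"},
)

gaps = missing_slots(req)
if gaps:
    print("I need more detail about: " + ", ".join(gaps))
```

The request feels complete to the person who made it, yet two of the three slots are empty from the system's side. Surfacing the empty slots, rather than silently guessing, is one possible way to make the gap visible.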
Output interpretation failure is miscommunication in the reverse direction: the user misinterprets the system's output. System outputs that are technically accurate but expressed in ways not matched to the user's interpretive frameworks — technical vocabulary, unfamiliar visual representations, ambiguous formatting — can be misread in ways that lead the user to believe they have received information they have not, or to act on a misunderstanding of what the system reported.
Undetected Miscommunication
A particularly dangerous form of human-machine miscommunication is miscommunication that neither party detects. In human-human communication, miscommunication is often revealed through observable incongruity — the partner's response makes no sense given the intent, or the action taken is clearly inappropriate to the request. These signals prompt repair: the participants notice the breakdown and work to restore shared understanding.
In human-machine communication, repair mechanisms are less robust. The machine may process an input correctly according to its model while producing an output that was not what the user wanted — and may report a successful result that confirms the user's belief that their intent was understood. The user, not receiving a clear signal of failure, proceeds with an incorrect understanding of what has been accomplished. Automated processes triggered by the misunderstood input may compound the miscommunication by taking consequential actions based on the incorrectly processed intent before the error is discovered.
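The following sketch illustrates how a success report can either hide or expose the system's interpretation. It is an assumption-laden toy: the `Result` type, the `archive_old` function, and the 30-day default are all hypothetical, and the function only selects items in memory rather than touching real files.

```python
from dataclasses import dataclass

@dataclass
class Result:
    ok: bool
    interpretation: str   # what the system took the request to mean
    affected: list[str]   # which items the operation selected

# Hypothetical default: "old" means untouched for more than 30 days.
DEFAULT_OLD_DAYS = 30

def archive_old(reports: dict[str, int], old_days: int = DEFAULT_OLD_DAYS) -> Result:
    """Select reports whose age in days exceeds the threshold and state the rule used."""
    chosen = sorted(name for name, age in reports.items() if age > old_days)
    return Result(
        ok=True,
        interpretation=f"'old' was read as 'older than {old_days} days'",
        affected=chosen,
    )

# Report names mapped to their age in days.
reports = {"q1.pdf": 400, "q2.pdf": 200, "draft.pdf": 45}
result = archive_old(reports)

# A bare "Done." would hide that draft.pdf was swept up by the default threshold.
print(f"Done: {result.interpretation}; archived {', '.join(result.affected)}")
```

The operation succeeds by the system's own rule in both cases. Only the version of the message that states the rule and lists the affected items gives the user a chance to notice that "old" did not mean what they intended.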
Miscommunication in Natural Language Systems
Natural language human-machine interfaces are particularly susceptible to miscommunication because the richness of natural language creates many opportunities for divergent interpretation. Natural language is ambiguous, context-dependent, and pragmatically complex, so the same utterance may mean one thing to the user who produced it and another to the system that receives it, and current systems handle these properties with varying success.
Specific natural language miscommunication patterns include: literal interpretation of figurative or idiomatic expressions; misresolution of pronouns whose referents are clear to the user from context but ambiguous to the system; application of default interpretations to underspecified requests in ways that differ from what the user meant; and failure to maintain conversational context across turns, so that follow-up questions are interpreted as standalone queries rather than as continuations of an ongoing exchange.
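The last pattern, loss of conversational context, can be sketched in a few lines. The example assumes a deliberately naive, hypothetical weather-query interpreter (`interpret`) that recognizes only a capitalized city name after "in" and the words "today" or "tomorrow"; it is meant to show the effect of carrying context forward, not to model a real dialogue system.

```python
import re

def interpret(utterance: str, context: dict[str, str]) -> dict[str, str]:
    """Interpret a weather query, filling omitted details from earlier turns."""
    query = dict(context)  # start from what earlier turns established
    city = re.search(r"\bin ([A-Z][a-z]+)", utterance)
    if city:
        query["city"] = city.group(1)
    day = re.search(r"\b(today|tomorrow)\b", utterance.lower())
    if day:
        query["day"] = day.group(1)
    return query

turn1 = interpret("What is the weather in Oslo today?", {})
standalone = interpret("And tomorrow?", {})    # context dropped between turns
continued = interpret("And tomorrow?", turn1)  # context carried forward

print(turn1)       # {'city': 'Oslo', 'day': 'today'}
print(standalone)  # {'day': 'tomorrow'}
print(continued)   # {'city': 'Oslo', 'day': 'tomorrow'}
```

Treated as a standalone query, the follow-up has no city at all. Treated as a continuation, it inherits the city from the previous turn, which is what the user assumed would happen.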
Repair and Recovery from Miscommunication
Like human-human communication, human-machine communication benefits from repair mechanisms — processes for detecting and correcting miscommunication after it occurs. Effective repair requires that the system provide signals interpretable by the user as indicating a possible misunderstanding, that the user have mechanisms for correcting the system's interpretation, and that the system be able to update its interpretation based on user correction.
Explicit clarification requests from the system — asking the user to specify which of several possible interpretations was intended — are one repair mechanism. Transparent display of the system's interpretation before executing a consequential action — "I understand you want to delete all files in this folder; is that correct?" — allows the user to catch interpretation errors before they have effects. Undo and reversion capabilities lower the cost of detected miscommunication by allowing consequences to be reversed. And explanations from the system about why it produced a particular output can reveal interpretation mismatches that the user might otherwise not notice.
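Two of these mechanisms, confirmation before a consequential action and undo afterwards, can be combined in a short sketch. Everything here is illustrative: `Folder` is a hypothetical in-memory stand-in that touches no real files, and `confirm` is an assumed callback that would, in a real interface, show the prompt to the user and return their answer.

```python
from typing import Callable

class Folder:
    """In-memory stand-in for a folder; no real files are touched."""

    def __init__(self, files: list[str]) -> None:
        self.files = list(files)
        self._undo_stack: list[list[str]] = []

    def delete_all(self, confirm: Callable[[str], bool]) -> bool:
        """Show the system's interpretation first; act only if the user confirms."""
        prompt = (
            f"I understand you want to delete all {len(self.files)} files "
            "in this folder; is that correct?"
        )
        if not confirm(prompt):
            return False  # interpretation rejected, nothing changes
        self._undo_stack.append(list(self.files))  # snapshot for undo
        self.files.clear()
        return True

    def undo(self) -> bool:
        """Restore the most recent snapshot, if there is one."""
        if not self._undo_stack:
            return False
        self.files = self._undo_stack.pop()
        return True

folder = Folder(["a.txt", "b.txt"])

declined = folder.delete_all(confirm=lambda prompt: False)  # user says no
accepted = folder.delete_all(confirm=lambda prompt: True)   # user says yes
after_delete = list(folder.files)
restored = folder.undo()                                    # user changes their mind

print(declined, accepted, after_delete, restored, folder.files)
```

Confirmation catches a misinterpretation before it has effects, and the snapshot bounds the cost when the misinterpretation is noticed only afterwards. A design that relied on only one of the two would leave the other failure path open.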
Design Implications
Reducing human-machine miscommunication requires design attention to all stages at which it arises. Reducing model mismatch requires interface designs that reveal system capabilities and constraints in ways users can accurately model. Reducing semantic gaps requires aligning system vocabularies with user vocabularies, through user research and iterative design. Reducing contextual gaps requires building systems that can maintain and use more context — conversational history, user state, situational awareness. And reducing output interpretation failure requires presenting system outputs in forms matched to user interpretive frameworks, tested against actual user understanding rather than assumed from within the design team's vocabulary.