Summary of the Purpose and Aims of the Dialogue Matters Network
The DIALOGUE MATTERS network aims to provide a forum where representatives of three sub-disciplines (theoretical linguistics, computational linguistics and psycholinguistics) can collaborate in developing models of dialogue that can bridge the gulf between theoretical understanding, psychological modelling, and practical system development. We aim to demonstrate the benefits of an integrated approach by developing co-ordinated linguistic and psycholinguistic dialogue models supported by psycholinguistic experimentation and prototype implementation (human-human and human-intelligent-software-agent).
Rationale and Background
How does dialogue work? Dialogue, or informal conversation, is arguably the basic form of human communication. Yet it is currently a form of linguistic expression not at all well understood by the scientific communities interested in language research. Theoretical linguists and psychologists have largely ignored dialogue: linguists because they view it as degenerate language, full of fragments that must be completed by information from context; psychologists because they view it as too error-ridden and difficult to study experimentally. Instead, linguistic knowledge has been modelled on the basis of decontextualised complete sentences, and language use as the production and comprehension of such decontextualised complete sentences. As a result, dialogue is made to seem a mysterious, complex activity involving comprehension and production systems modelled as separate from the knowledge underpinning human performance.
This theoretical vacuum has not just restricted our understanding of the nature of human communication; it has also made it far harder to develop naturalistic dialogue systems, i.e., automated systems that use natural language to interact intelligently with people. Though development of dialogue systems is rapidly expanding (e.g. by international companies such as Nuance, ScanSoft, Philips, Telia), these systems are currently limited in scope and naturalness, typically requiring tightly constrained scripts or tasks (e.g. they can only use highly specific prompts to communicate information about very restricted knowledge domains). Moreover, unlike humans, such systems are poor at using context to drive interpretation and production, and at determining what to say and when to say it (dialogue management). Even prototype state-of-the-art systems which do use context to improve understanding (e.g. TRAINS, CommandTalk and the TrindiKit) can only manage this to a limited extent in generating replies, making them awkward conversation partners.
However, gaining recognition for dialogue research is currently not straightforward. Understanding dialogue requires an interdisciplinary approach, yet the disciplines essential to its study do not widely recognise that it must be central to understanding and modelling language at a theoretical level. The researchers who do are relatively isolated, and they have to define appropriate dialogue-oriented models within their individual disciplines against the tide of currently pervasive methodologies. Putting dialogue at the heart of understanding language is a vital step forward for language-related research, but it urgently needs a high international profile to become established as mainstream research. Recently, research of the sort that is needed has begun to emerge at a few sites in the UK, mainland Europe, and the US. Focused on international cooperation, the Dialogue Matters Network aims to bring together representatives of three sub-disciplines (theoretical linguistics, computational linguistics and psycholinguistics) to collaborate in developing models of dialogue that can bridge the gulf between theoretical understanding, psychological modelling, and practical system development.
Objectives
We aim to demonstrate the benefits of an integrated approach by developing co-ordinated linguistic and psycholinguistic dialogue models supported by psycholinguistic experimentation and prototype implementation (human-human and human-intelligent-software-agent). Dialogue systems offer a way of making information accessible to society at large: systems that could engage in human-like dialogue would allow human-computer interaction using normal means of communication. But if natural, flexible and reliable systems are to be built, a principled approach to studying dialogue is urgently needed. We believe that progress in this area requires a two-pronged approach: theoretical investigation of the mechanisms of speaker-hearer interaction and information exchange, and the building of computer-implemented prototypes that reflect these theoretical results. Key questions for which we seek answers are:
- How does incremental interpretation of spoken utterances relate to generation of utterances, and how do we switch so easily between the roles of speaker (producer) and hearer (parser)?
- How do interlocutors’ mental models of situations under discussion become aligned through dialogue, particularly following misunderstandings or otherwise imperfect communication?
- How can these mechanisms be extended to align software agents' and robots' mental models with those of human interlocutors?
- How does the production of speech in dialogue relate to the gestures a speaker makes and the visual attention of listeners?
Programme Description
Through site visits, we will exchange methodological expertise and instrumentation, and develop computational architecture and computational/psycholinguistic tools for implementing and testing dialogue systems (using dialogue corpora at Edinburgh and SUNY). Site visits will be bi-directional to ensure that researchers at all sites can benefit from and contribute to the ongoing research. Annual workshops will report the results of this work and discuss foundational issues and their consequences for specific research methodologies.
Annual Workshops
Workshop 1: Coordinating Parsing and Production
The first workshop addresses incremental interpretation of spoken utterances, how this relates to generation of utterances, and how humans switch so easily between the roles of speaker (producer) and hearer (parser). This is arguably the central foundational issue in understanding dialogue: although speaker/hearer role switching is trivially easy for humans, it is hard to model under standard assumptions. Yet we have:
- grammar formalisms that directly reflect dialogue patterns, including role switches (KCL, Essex);
- experience of the technical issues associated with creating dialogue systems that simultaneously listen and produce speech (Gothenburg, CSLI);
- different psycholinguistic perspectives concerning collaboration between dialogue participants (HCRC, SUNY, CSLI).
Workshop 2: Incrementality, Collaboration and Repair
The second workshop will explore the mechanisms that underlie the alignment of interlocutors’ mental models through dialogue and how these mechanisms can be extended to align software agents’ and robots’ ‘mental models’ with those of human interlocutors. Grammatical modelling of incremental understanding and clarification is well developed at KCL and Essex; contextual update models are central at KCL, Gothenburg and CSLI; psycholinguistic exploration of explicit negotiations has been a focus at SUNY; and HCRC has explored shifts in perspective between dialogue participants which take place without explicit negotiation. These issues have considerable practical consequences (automated question-answering dialogue systems usually use clarification to repair misunderstandings), which Gothenburg, CSLI and KCL are particularly interested in pursuing jointly via computational implementations.
Workshop 3
The third workshop will address the question of how the production of speech in dialogue relates to the gestures a speaker makes and the visual attention of listeners. Gothenburg and CSLI have experience in the design and use of multi-modal computational systems and in their theoretical basis in multi-modal grammars; SUNY has experience in multi-modal generation; and SUNY and Essex have experience in visual and verbal attention. As in previous workshops, there are both foundational and applied issues to explore, in order to set in place appropriate interdisciplinary links for subsequent research. This event will draw together a larger interdisciplinary body to publicise the results of the team over the previous years.
Site-Visits
Cross-site research enterprises will include:
Stanford-Gothenburg-HCRC
- Comparing the sites’ current dialogue technologies and exploring how these can be combined, particularly combining autonomous agents in dynamic environments (CSLI) with multilingual systems (Gothenburg). Later visits will explore precisely the extent to which CSLI methodologies can be carried over to the multilingual tasks modelled at Gothenburg.
- Discussing the theoretical and empirical feasibility of testing the psychological reality of dialogue strategies involving accommodation (as used in Gothenburg’s dialogue-management approach, which employs the notion of "issues under discussion").
- Exploring the implications of HCRC’s experimental psychological results for designing future (multi-lingual) dialogue systems.
London-Gothenburg
- Collaborating in developing a semantic formalism based on type theory and HPSG to capture semantically complex dialogue phenomena (in particular how people clarify their meaning).
- Exploring how such phenomena can be captured within Gothenburg's dialogue-management approach.
- Exploring how the semantic formalism and these phenomena might be transferred to other implemented systems, in particular systems collaboratively developed at KCL/CSLI.
Stanford-HCRC
- Investigating differences between (i) human-human and human-computer interaction, and (ii) multi-party and two-party interaction from psycholinguistic and computational perspectives.
- Investigating how HCRC's psycholinguistic results might be applied in CSLI's dialogue systems to capture how people interpret ambiguous utterances in comprehension and choose sentence forms in production.
Stanford-London
- Exploring ways of scaling up the Dynamic Syntax (DS) formalism and prototype implementation. Early visits will consider general design issues (including using other existing large-scale implementations); later meetings will develop and test particular implementations, using dialogue corpora available from HCRC and SUNY colleagues.
- Integrating KCL's theory of (and prototype implementation of) advanced clarification strategies with CSLI's large-scale dialogue system; extending this to joint-site work on a general treatment of dialogue fragments (including corrections).
- Exploring the application of KCL corpus-based and theoretical work in multi-party dialogue to CSLI's large-scale dialogue systems.
London-HCRC
- Evaluating how closely the DS formalism matches HCRC’s psychological model, and developing appropriate experimental methods to test DS claims that grammatical tools and general-reasoning tools must be able to interact on a word-by-word basis. KCL (Philosophy) currently lacks infrastructure for psychological experiments; early visits would explore how to remedy this.
- Exploring the integration of KCL's grammatical model of clarification and repair with HCRC's psychological model.
- Exploring how HCRC's and DS's independent research concerning language change in a dialogue context can jointly inform experimental tools for investigating language change.
Essex-SUNY
- Developing an empirical methodology that combines behavioural studies and multimodal corpus annotation:
  - to test predictions concerning how people collaborate when referring to objects;
  - to explore the relation between visual and verbal attention, and the effect of visual attention on how people refer to objects.
- Exploring audience design by bilingual speakers for bilingual addressees, both through experiments measuring code-switching between languages and by modelling code-switched utterances incrementally.
HCRC-SUNY
- Developing experiments to investigate how people converge cross-dialectally on particular use of language; and incorporating the results into models of human processing and spoken dialogue systems.
- Testing two psychological models of how language users manipulate context: priming models of language convergence ("alignment") vs. more collaborative "intentional" models that make reference to context from the earliest moments of processing.
- Exchanging methodological expertise in investigating multi-party dialogues and in how eye-trackers can be used to measure what speakers and listeners look at and attend to in dialogue.
Essex-HCRC
- Using analysis of dialogue corpora as input to devising experimental methods for:
  - investigating the differences between "intentional" and "alignment" models in their explanation of how people clarify meaning and complete each other's utterances;
  - exploring how plans for actions become routinised in dialogue.
Dissemination
During the project, we anticipate one guest-edited issue of Research on Language and Computation, as well as publications in conference proceedings and specialist journals. We also plan one edited book of collected papers.