Ron Artstein

Abstracts of papers

Ron Artstein, Jacob Cannon, Sudeep Gandhe, Jillian Gerten, Joe Henderer, Anton Leuski and David Traum. Coherence of off-topic responses for a virtual character. To be presented at the 26th Army Science Conference, Orlando, Florida, December 2008.

Abstract: We demonstrate three classes of off-topic responses which allow a virtual question-answering character to handle cases where it does not understand the user's input: ask for clarification, indicate misunderstanding, and move on with the conversation. While falling short of full dialogue management, a combination of such responses together with prompts to change the topic can improve overall dialogue coherence.

PDF version (460K)
Back to Ron Artstein's home

Sudeep Gandhe, David DeVault, Antonio Roque, Bilyana Martinovski, Ron Artstein, Anton Leuski, Jillian Gerten, and David Traum. From domain specification to virtual humans: An integrated approach to authoring tactical questioning characters. To be presented at Interspeech 2008, Brisbane, Australia, September 2008.

Abstract: We present a new approach for rapidly developing dialogue capabilities for virtual humans. Starting from domain specification, an integrated authoring interface automatically generates dialogue acts with all possible contents. These dialogue acts are linked to example utterances in order to provide training data for natural language understanding and generation. The virtual human dialogue system contains a dialogue manager following the information-state approach, using finite-state machines and SCXML to manage local coherence, as well as explicit modeling of emotions and compliance level and a grounding component based on evidence of understanding. Using the authoring tools, we design and implement a version of the virtual human Hassan and compare to previous architectures for the character.

PDF version (427K)

David DeVault, David Traum and Ron Artstein. Making grammar-based generation easier to deploy in dialogue systems. Proceedings of the 9th SIGdial Workshop on Discourse and Dialogue, pages 198–207. Columbus, Ohio, June 2008.

Abstract: We present a development pipeline and associated algorithms designed to make grammar-based generation easier to deploy in implemented dialogue systems. Our approach realizes a practical trade-off between the capabilities of a system's generation component and the authoring and maintenance burdens imposed on the generation content author for a deployed system. To evaluate our approach, we performed a human rating study with system builders who work on a common large-scale spoken dialogue system. Our results demonstrate the viability of our approach and illustrate authoring/performance trade-offs between hand-authored text, our grammar-based approach, and a competing shallow statistical NLG technique.

PDF version (845K)

David DeVault, David Traum and Ron Artstein. Practical grammar-based NLG from examples. Proceedings of the Fifth International Natural Language Generation Conference, pages 77–85. Salt Fork, Ohio, June 2008.

Abstract: We present a technique that opens up grammar-based generation to a wider range of practical applications by dramatically reducing the development costs and linguistic expertise that are required. Our method infers the grammatical resources needed for generation from a set of declarative examples that link surface expressions directly to the application's available semantic representations. The same examples further serve to optimize a run-time search strategy that generates the best output that can be found within an application-specific time frame. Our method offers substantially lower development costs than hand-crafted grammars for application-specific NLG, while maintaining high output quality and diversity.

PDF version (376K)

Massimo Poesio and Ron Artstein. Anaphoric annotation in the ARRAU corpus. LREC 2008, Marrakech, Morocco, May 2008.

Abstract: Arrau is a new corpus annotated for anaphoric relations, with information about agreement and explicit representation of multiple antecedents for ambiguous anaphoric expressions and discourse antecedents for expressions which refer to abstract entities such as events, actions and plans. The corpus contains texts from different genres: task-oriented dialogues from the Trains-91 and Trains-93 corpus, narratives from the English Pear Stories corpus, newspaper articles from the Wall Street Journal portion of the Penn Treebank, and mixed text from the Gnome corpus.

PDF version (191K)

Ron Artstein, Sudeep Gandhe, Anton Leuski and David Traum. Field testing of an interactive question-answering character. Proceedings of the ELRA workshop on evaluation, pages 36–40. Marrakech, Morocco, May 2008.

Abstract: We tested a life-size embodied question-answering character at a convention where he responded to questions from the audience. The character's responses were then rated for coherence. The ratings, combined with speech transcripts, speech recognition results and the character's responses, allowed us to identify where the character needs to improve, namely in speech recognition and providing off-topic responses.

PDF version (1.2M)

Ron Artstein and Massimo Poesio. Identifying reference to abstract objects in dialogue. brandial 2006 proceedings, Potsdam, Germany, September 2006.

Abstract: In two experiments, many annotators marked antecedents for discourse deixis as unconstrained regions of text. The experiments show that annotators do converge on the identity of these text regions, though much of what they do can be captured by a simple model. Demonstrative pronouns are more likely than definite descriptions to be marked with discourse antecedents. We suggest that our methodology is suitable for the systematic study of discourse deixis.

PDF version (58K)

Massimo Poesio, Patrick Sturt, Ron Artstein, and Ruth Filik. Underspecification and Anaphora: Theoretical Issues and Preliminary Evidence. Discourse Processes 42(2): 157-175, 2006.

Distributed as Technical report CSM-438, University of Essex Department of Computer Science, October 2005.

Abstract: Much experimental work in psycholinguistics suggests that fully specified syntactic and semantic interpretations are obtained incrementally. The finding that interpretation takes place incrementally is very robust and underlies our own view of sentence processing as well; however, most of this work tends to test very simple interpretive judgments, using materials which have very clean-cut interpretations, which makes the view expressed above more questionable when applied to semantic interpretation. This article discusses a class of anaphoric expressions that do not appear to have a clear antecedent, using both corpus analysis and psychological experiments. We argue that these cases of anaphora are similar to cases of lexical polysemy, and propose an explicit semantic representation for such cases.

Published version
Article preprint (PDF, 150K)
Technical report (PDF, 143K)

Ron Artstein and Massimo Poesio. Inter-coder agreement for computational linguistics (survey article). To appear in Computational Linguistics.

Abstract: This article is a survey of methods for measuring agreement among corpus annotators. It exposes the mathematics and underlying assumptions of agreement coefficients, covering Krippendorff’s alpha as well as Scott’s pi and Cohen’s kappa; discusses the use of coefficients in several annotation tasks; and argues that weighted, alpha-like coefficients, traditionally less used than kappa-like measures in Computational Linguistics, may be more appropriate for many corpus annotation tasks – but that their use makes the interpretation of the value of the coefficient even harder.

Journal version (PDF, 304K)
Extended version (PDF, 367K)

Ron Artstein and Nissim Francez. Plurality and temporal modification. Linguistics and Philosophy 29(3): 251-276, 2006.

Abstract: A semantics with plural entities and plural times accounts for cumulative relations between plural arguments and temporal expressions. The semantics equips nominal, verbal and sentential meanings with temporal context variables and treats temporal modifiers as temporal generalized quantifiers; cumulative conjunction, however, takes place at types lower than generalized quantifiers. The mediation of temporal context variables allows cumulative relations to percolate between an argument in a main clause and one in a temporal clause, in apparent violation of locality restrictions. Plural times form a semilattice structure imposed on the set of intervals; no interaction is observed between this and the internal temporal structure of intervals.

Published version
PDF preprint (85K)

Ron Artstein and Massimo Poesio. Bias decreases in proportion to the number of annotators. In Gerhard Jaeger, Paola Monachesi, Gerald Penn, James Rogers, and Shuly Wintner (eds.), Proceedings of FG-MoL 2005, pages 141-150. Edinburgh, August 2005.

Abstract: The effect of the individual biases of corpus annotators on the value of reliability coefficients is inversely proportional to the number of annotators (less one). As the number of annotators increases, the effect of their individual preferences becomes more similar to random noise. This suggests using multiple annotators as a means to control individual biases.

PDF version (110K)

Massimo Poesio and Ron Artstein. Annotating (anaphoric) ambiguity. Corpus linguistics, Birmingham, England, July 2005.

Abstract: We report the results of a preliminary study attempting to identify ambiguous expressions in spoken language dialogues. In this study we developed methods for marking explicit ambiguity, and generalized previous proposals by Passonneau concerning a distance metric for anaphora to be used with the α coefficient to allow for ambiguous annotations.

PDF version (64K)

Massimo Poesio and Ron Artstein. The reliability of anaphoric annotation, reconsidered: Taking ambiguity into account. Proceedings of the Workshop on Frontiers in Corpus Annotation II: Pie in the Sky, pages 76-83. Ann Arbor, June 2005.

Abstract: We report the results of a study of the reliability of anaphoric annotation which (i) involved a substantial number of naive subjects, (ii) used Krippendorff's α instead of κ to measure agreement, as recently proposed by Passonneau, and (iii) allowed annotators to mark anaphoric expressions as ambiguous.

PDF version (71K)

Ron Artstein. Quantificational arguments in temporal adjunct clauses. Linguistics and Philosophy 28(5): 541-597, 2005.

Abstract: Quantificational arguments can take scope outside of temporal adjunct clauses, in an apparent violation of locality restrictions: the sentence "few secretaries cried after each executive resigned" allows the quantificational NP "each executive" to take scope above "few secretaries". I show how this scope relation is the result of local operations: the adjunct clause is a temporal generalized quantifier, which takes scope over the main clause (Pratt and Francez 2001), and within the adjunct clause, the quantificational argument takes scope above the implicit determiner which forms the temporal generalized quantifier. The paper explores various relations among quantificational arguments across clause boundaries, including temporal clauses that are modified internally by a temporal adverbial and temporal clauses with embedded sentential complements.

Published version
PDF preprint (172K)
PostScript preprint (300K)

Ron Artstein. Coordination of parts of words. Lingua 115(4): 359-393, 2005.

Abstract: Coordination of parts of words, as in "ortho and periodontists", has to be interpreted at the level of the word parts because the above NP can felicitously describe a pair of one orthodontist and one periodontist. This paper develops a theory of denotations for arbitrary word parts, in which the coordinate word parts denote their own sound, and the rest of the word is a function from sounds to word meanings. This yields the correct interpretation for number in coordinate constructions. The paper also explores phonological constraints on coordinate structures, and shows how certain ungrammatical structures that can be interpreted by the semantics are ruled out on phonological grounds.

Published version
PDF preprint (150K)
PostScript preprint (238K)

Ron Artstein. Focus below the word level. Natural Language Semantics 12(1): 1-22, 2004.

Abstract: Intonational focus can be observed on parts of words that appear to lack intrinsic meaning, and triggers alternatives that are similar in form. In order to provide a unified treatment of focus above and below the word level (they do, after all, behave the same in most respects), I develop a theory of denotations for arbitrary word parts in which focused word parts denote their own sound and the unfocused parts are functions from sounds to word meanings. This allows focus theories to generalize below the word level; any differences with focus above the word level are located in the semantics of word parts. The paper also explores phonological constraints on focus placement, and shows that the focusability of a word part depends solely on its prosodic status, not on any semantic factors.

Published version
PDF preprint (135K)
PostScript preprint (165K)

Ron Artstein. A focus semantics for echo questions. In Ágnes Bende-Farkas and Arndt Riester (eds.), Workshop on Information Structure in Context, pp. 98-107. IMS, University of Stuttgart, 2002.

Abstract: Echo questions are interpreted through focus semantics. Echo questions must be entailed by previous discourse; focus is therefore not needed to mark givenness, and instead it is used to compute the question denotation: the questioned element, marked with a pitch accent, is a focus constituent, and the alternative set of the echo question is its question denotation, i.e. the set of possible answers. The focus strategy exempts echo questions from locality restrictions ("islands"), allows echo questions on parts of words, and allows second-order echo questions which denote sets of questions.

PDF version (88K)
PostScript version (112K)

Ron Artstein. Person, animacy and null subjects. In Tina Cambier-Langeveld, Anikó Lipták, Michael Redford and Erik Jan van der Torre (eds.), Proceedings of Console VII, pp. 1-15. SOLE, Leiden, 1999.

Abstract: Licensing of null subjects can be contingent on person and animacy specification. For example, Hebrew allows null subjects if they are first or second person, but not if they are third person. This follows from a general typology that is based on the universal person/animacy hierarchy: if a subject of a certain person or animacy specification may be null, then every subject higher on the hierarchy may be null as well. The above typology, in turn, follows from the general way abstract hierarchies interact in the grammar: elements that appear on the high end of one hierarchy and the low end of another give rise to marked configurations. The mechanism of alignment in Optimality Theory gives a formalization of these universal properties of hierarchies.

PDF version (75K)

Ron Artstein. The incompatibility of underspecification and markedness in Optimality Theory. In Ron Artstein and Madeline Holler (eds.), RuLing Papers 1: Working Papers from Rutgers University, pp. 7-13. Rutgers University Department of Linguistics, New Brunswick, NJ, 1998.

Abstract: Underspecification in the underlying representation cannot give rise to marked structure on the surface, because Optimality Theory grammars force an output to be equally or less marked than the input. Underspecification can still account for alternations involving unmarked structure, but it is only useful when such alternations exist along with forms that do not alternate. The evidence for the existence of such grammatical systems is not very convincing, casting doubts about the usefulness of underspecification in general.

PDF version (25K)

Ron Artstein. Group events as means for representing collectivity. In Benjamin Bruening (ed.), MITWPL 31: Proceedings of the Eighth Student Conference in Linguistics, pp. 41-51. MIT Working Papers in Linguistics, Cambridge, MA, 1997.

Abstract: In this paper I argue in favor of the introduction of "group" events into a framework of event semantics; these mirror the "group" individuals introduced by Landman (1989), and give the domain of events a structure similar to that of the domain of individuals. Group events are used in order to capture collectivity effects that cannot be represented through the domain of individuals, as in the case of predicate conjunction. An attempt to extend the notion of group events and to use them for counting with adverbials such as "three times" proves at the very least troublesome.

PDF version (39K)
PostScript version (132K)