Memory and expectations in learning, language, and visual understanding




Publisher: Springer Journals
Subject: Computer Science; Artificial Intelligence; Computer Science, general
ISSN: 0269-2821
eISSN: 1573-7462
DOI: 10.1007/BF00849039

Abstract

Research in vision and language has traditionally remained separate, in part because the classic task of generating a representation of a given image or sentence has resulted in an emphasis on low-level structural aspects of these media. In this paper we argue that image and language understanding should be approached with the intent of facilitating the performance of a task. Under this view, research in image and language understanding must confront common issues that arise as a task is pursued. Language and images are both inputs that can be used to maintain a model of a task. We argue that a model may be maintained by incorporating changes in the scene that can be characterized at a high level of abstraction yet manifest themselves at relatively low levels of analysis. Existing task-relevant models and the associated domain knowledge are used to anticipate specific changes and to disambiguate their interpretation, thereby allowing those changes to modify the existing model. From this perspective, understanding input is largely independent of the modality of the input.

Journal

Artificial Intelligence Review, Springer Journals

Published: Dec 4, 2004
