10 - Reading and ambiguity resolution
"This volume's pages are uncut:
a first obstacle opposing your impatience."1
When a reader comes into contact with a text, she faces an arduous task: decoding it. One of the main obstacles is that texts are full of semantic and syntactic ambiguities.
A "semantic ambiguity" occurs when one word covers many different meanings. Some of these meanings, usually the denotative ones, are partially recorded in dictionaries under the corresponding entry. Other meanings, mainly connotative ones, are traceable to the (environmental) context in which the utterance occurs and to the (verbal) co-text in which the word is placed.
We cannot emphasize enough that connotative meanings are extremely unstable, and that the denotative meanings of a word in a natural code always differ from the meanings of any other word, whether it belongs to the same natural code or to another.
The American researcher John C. Trueswell recently published an essay, based on experiments carried out on readers, in which he tries to explain how syntactic and semantic ambiguities are resolved during the act of reading. Some of the experiments are based on the completion of incomplete utterances.
1) Henry forgot Lila...
a) ... at her office. (direct object interpretation)
b) ... was almost always right. (sentence complement interpretation)2
When faced with ambiguities like the one in this utterance, experiments indicate that readers quickly settle on one interpretation: in the quoted example, most readers opted for interpretation a).
A theory of sentence processing has been developed that emphasizes the integrative nature of interpretation: ambiguities are resolved by weighing a wide range of information sources, on the basis of constraints that rule out competing interpretations.
Just as a polysemic word has some meanings that are dominant over others, i.e. meanings considered more probable a priori, without a context, so ambiguous words can have dominant and/or subordinate syntactic structures. From the experiments carried out by Trueswell and by the authors quoted in his essay, it emerges that whether a structure is dominant changes from one instance to another, from one word to another. And, we add, it probably varies from one culture to another as well, even within the same natural code, and from one speaker to another.
According to this theory, the so-called "lexicalist" theory of sentence processing, the availability of alternative syntactic structures to a reader depends on how often that reader "has encountered the word in each syntactic context. In addition, semantic/contextual information can come into play quite rapidly to help resolve possible ambiguities"3.
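As a rough illustration of this frequency-based idea (not Trueswell's actual model; the verbs, frames, and counts below are invented for the sketch), one can imagine the reader's experience as a table recording how often each verb has been met in each syntactic frame, from which the dominant structure falls out as the most frequent one:

```python
# Hypothetical frequency table: how often a reader has encountered each
# verb in each syntactic frame. All numbers are invented for illustration.
frame_counts = {
    "forgot":   {"direct_object": 70, "sentence_complement": 30},
    "realized": {"direct_object": 5,  "sentence_complement": 95},
}

def frame_preferences(verb):
    """Return each frame's relative frequency for a verb, most probable first."""
    counts = frame_counts[verb]
    total = sum(counts.values())
    prefs = {frame: n / total for frame, n in counts.items()}
    return sorted(prefs.items(), key=lambda kv: kv[1], reverse=True)

# On these toy counts, 'forgot' favours the direct-object frame, which
# matches the readers' preferred a) reading of "Henry forgot Lila ...".
print(frame_preferences("forgot"))
print(frame_preferences("realized"))
```

On this toy data the dominant frame differs from verb to verb, which is exactly the point of the lexicalist account: dominance is a property of the individual word's usage history, not of grammar in general.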
These two kinds of constraint - the frequency of past experience with a syntactic structure and the presence of semantic and co-textual information - do not operate in sequence, but simultaneously, in reciprocal interaction. This was tested on the presupposition that, when one of the limiting factors conflicts with the other, the time required to resolve the ambiguity increases.
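The simultaneous interaction of the two constraints can be sketched very crudely as the product of a frequency-based prior and a contextual fit score (again, not the experimenters' model; all numbers are invented):

```python
# Invented prior from syntactic-frame frequency for "forgot".
frame_prior = {"direct_object": 0.7, "sentence_complement": 0.3}

# Invented scores for how well each reading fits the right co-text
# "... was almost always right."
context_fit = {"direct_object": 0.1, "sentence_complement": 0.9}

# Both constraints act at once: each reading's score reflects them jointly.
scores = {frame: frame_prior[frame] * context_fit[frame]
          for frame in frame_prior}
best = max(scores, key=scores.get)
print(best, scores)
```

Here the two constraints pull in opposite directions: the frequent frame loses to the contextual evidence. It is precisely such conflicts that, in the experiments, show up as longer resolution times.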
To test this hypothesis, the time needed to decode the left co-text of an ambiguous word (i.e. the words that, in scripts read from left to right, are read before the ambiguous word) was measured and compared with the time needed to decode the words forming the right co-text (the following words). Long decoding times correspond to presumed conflicts between the two kinds of constraint (frequency of the syntactic pattern and semantic-co-textual aspects).
To establish the odds that a given syntactic pattern or a given semantic value will be used within a given speech community, textual corpora were used containing millions of 'real' utterances, i.e. utterances pronounced or written by speakers rather than created by researchers. The experiments showed that when readers come across a clue suggesting a very probable structure which then develops in an unexpected way, they take much more time to resolve the ambiguity.
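In miniature, extracting such usage statistics from a corpus amounts to counting what follows a word across many real utterances. The sketch below does this over a toy corpus of four invented sentences; actual corpora contain millions of utterances:

```python
from collections import Counter

# A toy 'corpus' (invented for illustration; real corpora are vastly larger).
corpus = [
    "henry forgot the keys",
    "she forgot her umbrella at the office",
    "he forgot the meeting was cancelled",
    "they forgot the tickets",
]

def continuation_counts(corpus, word):
    """Count which word follows `word` across the corpus: a crude stand-in
    for the usage statistics readers accumulate through experience."""
    counts = Counter()
    for utterance in corpus:
        tokens = utterance.split()
        for i, tok in enumerate(tokens[:-1]):
            if tok == word:
                counts[tokens[i + 1]] += 1
    return counts

# Even in this tiny sample, 'forgot' is most often followed by a noun
# phrase, the pattern behind the direct-object preference.
print(continuation_counts(corpus, "forgot"))
```

Frequencies gathered this way are what give a "dominant" structure its dominance: a priori, before any disambiguating context arrives.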
All these results help explain why machine translation is so unsuccessful. While resolving the ambiguities within an utterance, our brain draws not only on our grammatical knowledge and our lexical knowledge, but also on statistics, undoubtedly unconscious, about how frequently given lexical and grammatical structures have occurred in our experience.
Since, however, large textual corpora store this information in greater quantity and more reliably than our brains, at present the most powerful way to carry out a translation job is to combine flexible human intelligence with the (manual, not automatic) consultation of existing corpora. In the fourth part of this course, textual corpora, together with many other tools translators need, are among the most important subjects discussed.
CALVINO I. If on a Winter's Night a Traveller, translated by William Weaver, London, Vintage, 1998, ISBN 0-7493-9923-6.
TRUESWELL J. C. The organization and use of the lexicon for language comprehension, in Perception, Cognition, and Language. Essays in Honor of Henry and Lila Gleitman. Cambridge (Massachusetts), The MIT Press, 2000. ISBN 0-262-12228-6.
1 Calvino 1998, p. 53.
2 Trueswell 2000, p. 327.
3 Trueswell 2000, pp. 331-332.