10 - Reading and ambiguity resolution
"This volume's pages are uncut:
a first obstacle opposing your impatience."1.
When a reader comes into contact with a text, she faces an arduous task: decoding it. One of the main obstacles is the fact that texts are full of semantic and syntactic ambiguities.
A "semantic ambiguity" occurs when one word covers many different
meanings. Some of these meanings, usually the denotative ones, are partially
indicated in dictionaries under the corresponding word entry. Other meanings,
mainly those of a connotative character, are traceable to the
(environmental) context in which the utterance occurs and to the (verbal)
co-text into which a word is collocated.
We cannot emphasize enough that connotative meanings are extremely unstable, and that the denotative meanings of a word in a natural code are always different from the meanings of any other word, whether that word belongs to the same natural code or to another one.
The American researcher Trueswell recently published an essay, based on experiments carried out on readers, in which he tries to explain how syntactic and semantic ambiguities are resolved during the act of reading. Some of the examples used in the experiments are based on the completion of incomplete utterances:
1) Henry forgot Lila...
a) ... at her office. (direct object interpretation)
b) ... was almost always right. (sentence complement interpretation)2
When faced with ambiguities like the one in the first utterance, experiments indicate that readers tend to settle on a single interpretation. In the quoted example, most readers opted for interpretation a).
A theory of sentence processing has been developed that emphasizes the integrative nature of interpretation: ambiguities are resolved by weighing a wide range of information sources, on the basis of constraints that rule out competing interpretations.
Just as a polysemous word has some meanings that are dominant over others, i.e. meanings considered more probable a priori, out of context, so an ambiguous word can have dominant and/or subordinate syntactic structures. The experiments carried out by Trueswell, and by the authors quoted in his essay, show that whether a structure is dominant or not changes from one instance to another, from one word to another. And, we would add, it probably also varies from one culture to another, even within the same natural code, and from one speaker to another.
According to this theory, the so-called "lexicalist" theory of sentence processing, the availability of alternative syntactic structures to a reader depends on how often that reader "has encountered the word in each syntactic context. In addition, semantic/contextual information can come into play quite rapidly to help resolve possible ambiguities"3.
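To make the interplay of the two constraints just quoted more concrete, here is a minimal sketch in Python. It is our own illustration, not Trueswell's model: the probabilities, the function names and the continuations are all invented for the example "Henry forgot Lila...". A lexical frequency bias for the verb is combined with a contextual plausibility score for the material that follows, and the interpretation with the highest combined weight wins.

```python
# Toy constraint-based resolution of "Henry forgot Lila ..."
# All numbers are invented for illustration only.

# Constraint 1: how often the reader has met "forgot" in each
# syntactic context (direct object vs. sentence complement).
lexical_bias = {"direct_object": 0.7, "sentence_complement": 0.3}

# Constraint 2: how well each structure fits the words that follow
# the ambiguous region (the right co-text).
def contextual_fit(continuation):
    if continuation == "at her office.":
        return {"direct_object": 0.9, "sentence_complement": 0.1}
    if continuation == "was almost always right.":
        return {"direct_object": 0.1, "sentence_complement": 0.9}
    return {"direct_object": 0.5, "sentence_complement": 0.5}

def resolve(continuation):
    fit = contextual_fit(continuation)
    # The constraints act simultaneously: their evidence is combined
    # (here, multiplied and normalized), not applied one after the other.
    scores = {s: lexical_bias[s] * fit[s] for s in lexical_bias}
    total = sum(scores.values())
    return {s: round(v / total, 2) for s, v in scores.items()}

print(resolve("at her office."))            # direct object reading wins
print(resolve("was almost always right."))  # context overrides the lexical bias
```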
These two kinds of constraint - how frequently a given syntactic structure has been experienced and the presence of semantic and co-textual information - do not act in sequence, but simultaneously, in reciprocal interaction. This was tested on the presupposition that, when one of the constraining factors contradicts the other, the time required to resolve the ambiguity increases.
To test this hypothesis, the time taken to decode the left co-text of an ambiguous word (i.e. the words that, in scripts read from left to right, are read before the ambiguous word) was compared with the time taken to decode the words forming its right co-text (the following words). Long decoding times correspond to presumed conflicts between the two kinds of constraint (frequency of the syntactic pattern versus semantic and co-textual information).
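One simple way to make this prediction measurable - a sketch based on our own assumptions, not a formula taken from Trueswell's essay - is to treat decoding difficulty as the negative logarithm of the combined probability of the structure that actually turns out to be correct: when the two constraints agree, that probability is high and the predicted reading time short; when they conflict, the probability is low and the predicted reading time long.

```python
import math

# Predicted decoding difficulty as negative log probability: the less
# expected the structure that actually occurs, the harder (slower) the
# decoding of the right co-text. The probabilities are illustrative only.
def predicted_difficulty(probability_of_actual_structure):
    return -math.log2(probability_of_actual_structure)

# Constraints agree: the structure that occurs was the expected one.
print(predicted_difficulty(0.95))  # about 0.07 -> short decoding time

# Constraints conflict: the structure that occurs was judged unlikely.
print(predicted_difficulty(0.05))  # about 4.32 -> long decoding time
```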
To estimate the odds that a given syntactic pattern or a given semantic value will be used within a speech community, textual corpora containing millions of 'real' utterances were used, i.e. utterances spoken or written by actual speakers rather than created by researchers. The experiments also showed that when readers come across a cue suggesting a highly probable structure which then develops in an unexpected way, they take much longer to resolve the ambiguity.
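As an illustration of how such corpus-based odds can be obtained, here is a minimal sketch; the tiny annotated 'corpus' and its labels are invented for the purpose, whereas real corpora, as noted above, contain millions of utterances. Each occurrence of a verb is labelled with the syntactic frame in which it appears, and the relative frequencies of the frames serve as a priori odds for the competing readings.

```python
from collections import Counter

# A toy stand-in for a corpus: occurrences of a verb annotated with the
# syntactic frame they appear in. The items are invented for illustration.
annotated_occurrences = [
    ("forgot", "direct_object"),
    ("forgot", "direct_object"),
    ("forgot", "sentence_complement"),
    ("forgot", "direct_object"),
    ("forgot", "sentence_complement"),
]

def frame_odds(verb, occurrences):
    # Count how often the verb appears in each frame and normalize.
    counts = Counter(frame for v, frame in occurrences if v == verb)
    total = sum(counts.values())
    return {frame: count / total for frame, count in counts.items()}

# Estimated a priori odds of each structure for "forgot" in this toy corpus.
print(frame_odds("forgot", annotated_occurrences))
# {'direct_object': 0.6, 'sentence_complement': 0.4}
```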
All these results help explain why machine translation is so unsuccessful. While resolving ambiguities within an utterance, our brain draws not only on grammatical knowledge and lexical knowledge, but also on statistics - undoubtedly unconscious ones - about the frequencies with which given lexical and grammatical structures have occurred in our experience.
Since, however, large textual corpora have a greater and more reliable storage capacity than our brains, the greatest translation power currently attainable is obtained by combining flexible human intelligence with the (manual, not automatic) consultation of existing corpora. Textual corpora, along with many other tools necessary to translators, are among the most important subjects discussed in the fourth part of this course.
Bibliographical references
CALVINO I. If on a Winter's Night a Traveller, translated by William Weaver, London, Vintage, 1998, ISBN 0-7493-9923-6.
TRUESWELL J. C. The organization and use of the lexicon for language comprehension, in Perception, Cognition, and Language. Essays in Honor of Henry and Lila Gleitman. Cambridge (Massachusetts), The M.I.T. Press, 2000. ISBN 0-262-12228-6.
1 Calvino 1998, p. 53.
2 Trueswell 2000, p. 327.
3 Trueswell 2000, pp. 331-332.