10 - Reading and ambiguity resolution
"This volume's pages are uncut:
a first obstacle opposing your impatience."1
When a reader comes into contact with a text, she faces an 
arduous task: decoding it. One of the main obstacles is that texts 
are full of semantic and syntactic ambiguities.
  A "semantic ambiguity" occurs when one word covers many different 
meanings. Some of these meanings, usually the denotative ones, are partially 
recorded in dictionaries under the corresponding entry. Other meanings, 
mainly those of a connotative character, are traceable to the 
(environmental) context in which the utterance occurs and to the (verbal) 
co-text in which a word is placed.
  We cannot emphasize enough that connotative meanings are 
extremely unstable, and that the denotative meanings of a word in a natural 
code are always different from the meanings of any other word, whether 
that word belongs to the same natural code or to another one.
  The American researcher Trueswell recently published an 
essay, based on experiments carried out on readers, in which he tries to 
explain how syntactic and semantic ambiguities are resolved during the act 
of reading. Some of the examples used in the experiments are based on the 
completion of incomplete utterances.
1) Henry forgot Lila...
   a) ... at her office. (direct object interpretation)
   b) ... was almost always right. (sentence complement interpretation)2
When faced with ambiguities like the one in the first 
utterance, readers do not suspend judgment: the experiments indicate that 
they commit to a resolution. In the quoted example, most readers opted for 
interpretation a).
  A theory of sentence processing has been developed that 
emphasizes the integrative nature of interpretation: ambiguities are 
resolved by considering a wide range of information sources, on 
the basis of constraints that rule out competing interpretations.
  Just as a polysemous word has some meanings that are 
dominant over others, i.e. meanings considered more probable a priori, 
in the absence of a context, so ambiguous words can have dominant 
and/or subordinate syntactic structures. From the experiments carried out 
by Trueswell and by the authors quoted in his essay, it emerges that 
whether a structure is dominant or not changes from one instance to another, 
from one word to another. And, we may add, it probably also varies from one 
culture to another, even within the same natural code, and from one 
speaker to another.
  According to this theory, the so-called "lexicalist" theory of 
sentence processing, the availability of alternative syntactic structures to a 
reader depends on how often that reader "has encountered the word in each 
syntactic context. In addition, semantic/contextual information can come 
into play quite rapidly to help resolve possible ambiguities"3.
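The lexicalist idea can be sketched in a few lines of code. The frequency counts below are invented for illustration (they come neither from a real corpus nor from Trueswell's data); they stand in for a reader's accumulated experience of the syntactic frames in which a verb has appeared. The a-priori dominant structure is then simply the most frequent frame.

```python
# Hypothetical counts of syntactic frames experienced with "forgot":
# "NP" = direct object reading ("forgot Lila at her office"),
# "S"  = sentence complement reading ("forgot Lila was right").
frame_counts = {"forgot": {"NP": 120, "S": 45}}

def preferred_frame(verb: str) -> str:
    """Return the frame most frequently experienced with this verb,
    i.e. the a-priori dominant syntactic structure."""
    counts = frame_counts[verb]
    return max(counts, key=counts.get)

def frame_probability(verb: str, frame: str) -> float:
    """A-priori probability of a frame, estimated from past frequency."""
    counts = frame_counts[verb]
    return counts[frame] / sum(counts.values())

print(preferred_frame("forgot"))                    # NP (direct object)
print(round(frame_probability("forgot", "NP"), 2))  # 0.73
```

On this sketch, a reader's preference for interpretation a) of "Henry forgot Lila..." falls out of nothing more than frequency of experience; semantic and co-textual information would then act as further constraints on top of this prior.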
  These two kinds of constraint - the frequency of past experience with 
a syntactic structure and the presence of semantic and 
co-textual information - do not operate in sequence, but simultaneously, in 
reciprocal interaction. This was tested on the assumption 
that, when one of the limiting factors contradicts the other, the time 
required to resolve the ambiguity increases.
  To test this hypothesis, the time needed to decode the left co-text of 
an ambiguous word (i.e. the words that, in scripts read from 
left to right, are read before the ambiguous word) was measured and compared 
with the time needed to decode the words forming the right co-text (the 
following words). Long decoding times correspond to presumed conflicts 
between the two kinds of constraint (frequency of the syntactic pattern and 
semantic/co-textual cues).
  To estimate the odds that a given syntactic pattern or a 
given semantic value will be used within a given speakers' community, 
textual corpora were used containing millions and millions of 'real' 
utterances, i.e. utterances spoken or written by speakers, not created by 
researchers. The experiments showed that when readers 
come across a cue that suggests a very probable structure 
which then develops in an unexpected way, they take much more time to 
resolve the ambiguity.
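In the simplest case, estimating such odds from a corpus amounts to counting how often each syntactic frame occurs. The sketch below uses a tiny invented hand-labelled 'corpus' of four utterances, standing in for the millions of real utterances an actual corpus contains:

```python
from collections import Counter

# A toy stand-in for a real corpus: each utterance containing "forgot"
# has been hand-labelled with the frame the verb takes in it.
# (Invented examples; real corpora contain millions of utterances.)
corpus = [
    ("I forgot my keys", "NP"),
    ("She forgot the meeting", "NP"),
    ("He forgot the milk", "NP"),
    ("They forgot the answer was in the book", "S"),
]

counts = Counter(frame for _, frame in corpus)
total = sum(counts.values())

# Relative frequency of each frame = its estimated a-priori odds.
odds = {frame: n / total for frame, n in counts.items()}
print(odds)  # {'NP': 0.75, 'S': 0.25}
```

A real study would of course count over an annotated corpus of natural text rather than a hand-made list, but the principle, relative frequency as an estimate of a-priori probability, is the same.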
  All these results help explain why machine translation is so 
unsuccessful. Our brain, while resolving ambiguities within an utterance, 
draws not only on our grammatical knowledge and our lexical 
knowledge, but also on statistics - of an undoubtedly unconscious 
character - about the frequencies with which given lexical and grammatical 
structures have occurred in our experience.
  Since, however, large textual corpora have a greater and more 
reliable storage capacity than our brains, at present the greatest possible 
translation power is obtained by combining flexible human intelligence 
with (manual, not automatic) consultation of existing corpora. In the 
fourth part of this course, textual corpora, together with many other tools 
necessary to translators, are among the most important subjects discussed.
Bibliographical references
CALVINO I. If on a Winter's Night a Traveller, translated by William Weaver, London, Vintage, 1998, ISBN 0-7493-9923-6.
TRUESWELL J. C. The organization and use of the lexicon for language comprehension, in Perception, Cognition, and Language. Essays in Honor of Henry and Lila Gleitman. Cambridge (Massachusetts), The M.I.T. Press, 2000. ISBN 0-262-12228-6.
1 Calvino 1998, p. 53.
2 Trueswell 2000, p. 327.
3 Trueswell 2000, pp. 331-332.



