Logos Multilingual Portal

5 - Text generation - second part



"[words] cannot be imagined without ornament,
though it is often involuntary; there is ornament in
even the most arid exposition [...]"1.



The human mind has a long-term memory that holds its most stable contents, the nearly permanent material that makes up an individual's cultural heritage. In computer terms, it can be compared to a hard disk: data can be added or deleted but, barring serious accidents, the data are still there after the computer is switched off.
  The mind also has a short-term working memory: a volatile memory that lasts only as long as it is useful and serves a purely organizational function. It resembles a computer's RAM for at least two reasons: first, its capacity determines how many operations can be carried out simultaneously within a given time frame; second, when the computer is turned off, the RAM is cleared. In text production, short-term memory can, for example, arrange ideas while a speech act is being prepared. Because this memory is limited, many conjectures about how text generation works rest precisely on this feature. The limits of short-term memory constrain our ability to plan sentences strategically and, especially in spoken language, cause mistakes or flaws such as disagreement in number or gender, anacolutha, and structural or logical misconnections.
  The passage from mental content to written text can be described in these terms:

  • pinpointing the elements that distinguish the content to be expressed from similar contents;
  • pinpointing redundant elements;
  • choice of words (lexicalization) and attention to their cohesion (inner links);
  • choice of grammatical structure(s);
  • linear order of words;
  • part of speech;
  • sentence complexity;
  • prepositions and other function words;
  • final form2.

Because short-term memory is limited, sentences are built gradually, with execution and planning stages interleaved. While one part of the speech act is actualized (that is, pronounced or written), the next one is planned: "we think while we talk and, while we talk, we think"3. There are many affinities between language and perception: both are compositional devices, and both must satisfy conditions of good form and completeness (in the sense meant by Gestalt psychology: processing works by chunks, by perceiving structures)4.
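The interleaving of planning and execution under a bounded memory can be illustrated with a minimal sketch. It is not taken from Zock's work: the function names, the `capacity` parameter, and the idea of representing a message as a flat list of concept labels are assumptions made only for illustration.

```python
from collections import deque


def take_chunk(pending, capacity):
    """Move at most `capacity` items from the pending queue into a new chunk."""
    chunk = []
    while pending and len(chunk) < capacity:
        chunk.append(pending.popleft())
    return tuple(chunk)


def incremental_generation(concepts, capacity=3):
    """Return a trace of 'plan' and 'utter' steps.

    Assumption: short-term memory holds at most `capacity` concepts, so the
    message is planned chunk by chunk, and the next chunk is planned before
    the current one has been uttered.
    """
    pending = deque(concepts)
    trace = []
    current = take_chunk(pending, capacity)
    if current:
        trace.append(("plan", current))
    while current:
        # Plan the next chunk while the current one is still to be uttered.
        upcoming = take_chunk(pending, capacity)
        if upcoming:
            trace.append(("plan", upcoming))
        trace.append(("utter", current))
        current = upcoming
    return trace


if __name__ == "__main__":
    for step, chunk in incremental_generation(["a", "b", "c", "d", "e"], capacity=2):
        print(step, chunk)
```

In this toy model no chunk ever exceeds the assumed capacity, and every "utter" step except the last is preceded by the planning of the chunk that follows it.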
  A fundamental stage of text generation is the choice of words to express the message. As Michael Zock, a researcher at the French CNRS, writes, to claim that there is a strict chronological order in which thought is formed first and the words adequate to express it are chosen afterwards has two implications: thought rigidly and completely precedes language, and thought is completely coded and specified before lexicalization occurs5. According to Zock's hypothesis, however, linguistic coding also plays an important role in shaping content, i.e. there is a reciprocal interaction between content and language.
  If the text-generation process is regarded as an intersemiotic translation from mental language into verbal language, then, since applying words to mental content itself generates content, the translation process becomes complex, bidirectional, and manifold. If the selection of given words alters the content of the message to be expressed, that selection affects both the structuring of the message, which may diverge onto paths unforeseen before lexicalization began, and all the revisions that precede the final draft (when there is one: the hypertextualization of communication and the very high technical transmissibility of information increasingly limit the use of terms like "final draft", and we see more and more "works in progress" instead).
  As we have seen when studying the interpretant and its subjective component, there is no exact, constant or universal correspondence between words and the subjective aura of meaning in the mind or, as Zock says,

[...] words cannot be directly mapped on their conceptual counterpart, that is, there is no one-to-one correspondence between concepts and words: a given word may express more than a single concept [...]6.

This implies that choosing a word or a combination of words alters not only the way in which content is expressed, but also the content of the speech act itself. Among other things, such a statement strongly supports, from a psychological and psycholinguistic point of view, the point made in literature by the Romantic writers and in criticism first by the Russian Formalists and then by the linguistic Structuralists, who revived interest in text semiotics in the 1960s: form cannot be neatly separated from content, because form is content and content lies in form as well.
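The absence of a one-to-one correspondence between words and concepts can be pictured as a many-to-many mapping. The sketch below is only an illustration: the tiny lexicon, the concept labels, and the function names are hypothetical and do not come from the works cited.

```python
from collections import defaultdict

# Hypothetical toy lexicon: each word lists the concepts it can express.
WORD_TO_CONCEPTS = {
    "bank": {"FINANCIAL_INSTITUTION", "RIVER_EDGE"},
    "shore": {"RIVER_EDGE", "SEA_EDGE"},
    "coast": {"SEA_EDGE"},
}


def invert(word_to_concepts):
    """Build the reverse mapping: for each concept, the words that can express it."""
    concept_to_words = defaultdict(set)
    for word, concepts in word_to_concepts.items():
        for concept in concepts:
            concept_to_words[concept].add(word)
    return dict(concept_to_words)


CONCEPT_TO_WORDS = invert(WORD_TO_CONCEPTS)


def extra_meanings(word, intended):
    """Concepts the chosen word may evoke besides the intended one."""
    return WORD_TO_CONCEPTS[word] - {intended}


if __name__ == "__main__":
    # Two candidate words for one concept, each with a different side meaning.
    for word in sorted(CONCEPT_TO_WORDS["RIVER_EDGE"]):
        print(word, sorted(extra_meanings(word, "RIVER_EDGE")))
```

In this toy lexicon, whichever word is chosen for the river's edge brings a different additional meaning with it, which is one simple way of seeing how lexical choice can alter content and not only expression.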

Natural languages, as opposed to artificial languages, are very flexible. The different components (conceptual, lexical and syntactic) are highly interdependent, each component possibly influencing the others. The advantage of such heterarchical architecture is that it allows for various orders of data processing. For example, lexical choice may precede the choice of syntactic structure and vice versa7.
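The interdependence of lexical and syntactic choice described above can be sketched with a toy example in which either decision may come first and then constrains the other. The verbs, frame names, and templates are assumptions chosen for illustration, not a model proposed in the source.

```python
# Hypothetical toy lexicon: the syntactic frames each verb allows.
FRAMES_BY_VERB = {
    "give": {"double_object", "prepositional"},
    "donate": {"prepositional"},
}

# Hypothetical surface templates for the two frames.
TEMPLATES = {
    "double_object": "{agent} {verb} {recipient} {theme}",
    "prepositional": "{agent} {verb} {theme} to {recipient}",
}


def frames_after_lexical_choice(verb):
    """Lexical choice first: the verb restricts the available frames."""
    return sorted(FRAMES_BY_VERB[verb])


def verbs_after_syntactic_choice(frame):
    """Syntactic choice first: the frame restricts the available verbs."""
    return sorted(v for v, frames in FRAMES_BY_VERB.items() if frame in frames)


def realize(verb, frame, agent, theme, recipient):
    """Produce a sentence, rejecting combinations the toy lexicon does not allow."""
    if frame not in FRAMES_BY_VERB[verb]:
        raise ValueError(f"'{verb}' does not allow the frame '{frame}'")
    return TEMPLATES[frame].format(
        agent=agent, verb=verb, theme=theme, recipient=recipient
    )


if __name__ == "__main__":
    print(frames_after_lexical_choice("donate"))
    print(verbs_after_syntactic_choice("double_object"))
    print(realize("donate", "prepositional", "They", "a painting", "the museum"))
```

Both orders of processing lead to an acceptable sentence; what changes is which decision narrows the options left open to the other.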

  

Bibliographical references

BATEMAN J. & ZOCK M. Natural Language Generation, in R. Mitkov, editor, Handbook of Computational Linguistics, Oxford University Press, 2001, ISBN

MARÍAS J. Negra espalda del tiempo, Punto de lectura, 2000 (original edition 1998), ISBN 84-663-0007-7.

MARÍAS J. Dark Back of Time, New York, New Directions, 2001 (translated by Esther Allen), ISBN 0-8112-1466-4.

ZOCK M. Holmes meets Montgomery: an unusual yet necessary encounter between a detective and a general, or, the need of analytical and strategic skills in outline planning, in VI Simposio Internacional de Comunicacion Social, Santiago de Cuba, 1999, p. 478-483.

ZOCK M. The power of words, in Message Planning, 16th International Conference on Computational Linguistics (COLING), København, 1996, p. 990-5.

ZOCK M. Sentence generation by pattern matching: the problem of syntactic choice, in R. Mitkov & N. Nicolov editors, Recent Advances in Natural Language Processing. Series: Current Issues in Linguistic Theory, Amsterdam, Benjamins, 1997, ISBN p. 317-352.


1 Marías 2001, p. 8. «[...] y además [la palabra] no se concibe sin ornamento, a menudo involuntario, lo hay hasta en la exposición más árida [...]» Marías 1998 (2000), p. 10.
2 Bateman and Zock 2001.
3 Zock 1997, p. 327 note.
4 Zock 1997, p. 328.
5 Zock 1996, p. 990.
6 Zock 1996, p. 990.
7 Zock 1997, p. 326.