Logos Multilingual Portal

5 - From dictionary to corpora

"Since these relationships must be lost by translation into our language, the incomprehensibility of the equivalents in our popular 'dream-books' is hereby explained"1.

There are two approaches to tracing the possible meanings of a word, corresponding to two approaches to the very notion of "language".

In the former case, a natural language is considered very similar to an artificial language, in which technical committees meet to decide which terms to use and which to abolish, which (denotative) meanings to attribute to terms, and how to differentiate meanings according to external, objective differences.

In this view, which is a little naive and best suited to the school context, grammar and syntax handbooks contain rules one should abide by when speaking and writing and, analogously, dictionaries contain the meanings of words. If a given meaning appears in the dictionary, the word can be used to refer to that meaning; if, on the contrary, that meaning does not appear, one should use another word.

Why do I call such a view "naive"? Because, as I wrote for example in unit 16 of the fourth part of this course, natural language doesn't come out of technical or technical-linguistic committees; it originates in the spontaneous interaction of speakers.

Linguistic usage is the basic empirical datum of language science, which seeks constants and regularities of usage more than rules to follow. It is less interesting to normalize and prescribe, i.e. to tell somebody that he cannot say what he is saying because it is "wrong", than to attribute linguistic usage to individual, social, local, and sectorial contexts and to describe such uses. Linguistic competence thus becomes not knowledge of the rules, but knowledge of the constants and the use of different registers, idiolects, sociolects, etc., with full knowledge of the "whys".

One consequence for translation of this view of linguistic varieties is that translating means aiming to reproduce linguistic variety with linguistic variety, not obliterating or simplifying it.

As to tracing the possible meanings of a word, the reversal of perspective is self-evident: it is less interesting to consult repertories in which linguistic experts have already decided what the meanings of a word are than to ascertain, by recording the everyday usage of speakers, the meanings actually present in the living language.

The practical difficulty of this approach consists in finding a "place" in which such linguistic usage can be traced: a source that a linguist or a translator can refer to and that, like a dictionary, gives immediate answers and is readily available. And this is where the trouble lies.

To this end we would need repertories of utterances recorded in real usage (not created by experts), ideally from heterogeneous sources: written texts, spoken texts, texts coming from many social classes and many economic and sectorial contexts, texts from the radio (which are obviously oral, but often have peculiar features, in some cases resembling written registers) and from television, and texts of different eras.

Such repertories exist and are called "corpora", the Latin plural of "corpus", which in Latin means "body". In a modern context, with reference to linguistics and translation, "corpus" has other meanings:

"complete and ordered collection of written texts, by one or more authors, concerning a given subject"

"a representative sample of language, spoken or written, examined in the description of a language or dialect"

The corpora of greatest interest to us are those in electronic format, because consulting them is quicker and more versatile. In this case, the definition could be:

"collection of texts in electronic format that can be consulted and analyzed in many ways".

Investigating the meaning of a word through a corpus, as compared to investigating it through a dictionary, implies a difference similar to the one existing between learning a language through direct contact with speakers and learning it through a course or a handbook.

The former case recalls the infant who finds himself facing adults interacting through a natural language (Quine's home language) and must reconstruct, by a reverse process (abduction) that starts from the result, the values that the exchanged words have. An infant certainly spends much more time learning a home language than he would if he could use a dictionary, but in the end he gets a first-hand idea of what he learns: every single sentence he learns is linked to given affective experiences, so much so that the later confrontation with native language results in a shock. A world of certainties falters, and is then partially confirmed and partially disproved after the radical translation he manages to make.

Of course, to the infant, the home language is at first the only existing language and, when he eventually discovers that there are other variants of it, he still considers it the "right" language. Such a judgment of course has nothing to do with moral reasons: it is psychological. It is the only "right" one because his models - parents - use that one. All ways of speaking that differ from this one are considered imperfect because, projecting his feeling onto the rest of the speakers, the infant assumes that the others have tried to speak like his parents, like his models, but failed.

Searching for a word in a corpus to get its meaning also takes longer than consulting a dictionary. And, above all, in a corpus you don't find any definition: you "only" find the complete utterances that contain the investigated word. For this reason one should behave like the infant confronted with his parents' sentences: raising one's antennae and trying to decode the meaning of what is happening. With one huge difference, however: a translator often knows nearly all the other words in a sentence, so that the meaning of the investigated word stands out much more clearly.
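As a rough illustration of what such a search involves, here is a minimal sketch in Python. It rests on loud assumptions: the four-sentence "corpus" is invented for the example, the function name `concordance` is hypothetical, and real corpus tools work on millions of words and offer far richer queries. The sketch only shows the basic operation described above: retrieving the complete utterances that contain the investigated word, with no definition attached.

```python
import re

def concordance(sentences, word):
    """Return every sentence that contains `word` as a whole word,
    ignoring case. No definition is produced: only the utterances."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]

# Hypothetical miniature corpus, invented purely for illustration.
corpus = [
    "She sat on the bank of the river and watched the water.",
    "The bank refused to lend him any more money.",
    "He deposited his salary in the bank on Friday.",
    "Banking on good weather, they left without umbrellas.",
]

hits = concordance(corpus, "bank")
for sentence in hits:
    print(sentence)
```

The search returns the first three sentences and skips the fourth, where "bank" is only part of a longer word. It is then up to the reader, relying on the other words in each sentence, to tell the riverside sense from the financial one.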

Searching for a word in a corpus means consulting maybe ten or twenty sentences instead of a single dictionary definition, but in the end the meaning one arrives at is much more precise, because it arises from a first-hand interpretation, a direct interpretation based on the context. People who have followed the entire course in order will remember that the importance of the contextual meaning of a word has been stressed since the time of Ogden and Richards (1923), as discussed in unit 7 of the second part of this course. Applying the concept of "corpus" to the investigation of the meaning of a word has revolutionized translators' working methods. This is what we shall see in the next units.


Bibliographical references

FREUD SIGMUND, L'interpretazione dei sogni, in Opere, vol. 3, Torino, Boringhieri, a cura di C. L. Musatti, 1966.

FREUD SIGMUND, The Interpretation Of Dreams, translated by A. A. Brill, London, G. Allen & company, 1913.

1 Freud 1900: 97.