11 - Other corpora
"In the light of the foregoing exposition, we shall translate the insistence with which this dream exhibits its absurdities as a sure sign of a particularly embittered and passionate polemic in the dream-thoughts"1.
We have seen in the previous units how some corpora are used. To get a more complete picture of the corpora and similar resources available on the internet, let us now examine a few more. Some of these are true corpora, on which it is possible to run queries over the whole corpus, as we saw in the case of the Wordtheque. Others are collections of texts made available to the public, but the texts are not connected into a corpus and there is no search engine that links them and makes a query possible. I start with the former.
The first that I’ll examine is the Alex Catalogue of Electronic Texts, found at the address http://www.infomotions.com/alex/. The home page of the site offers several ways to access the collection. It is a database created specifically for a university course, and therefore contains a quantity of English and American fiction and some philosophical texts, all in English. The selection criteria, consequently, cannot be considered the same as those a linguist would have used.
It is possible to browse the content of the database by three parameters: author, date, title. You can also proceed to the "download" section, from which you can download the whole database divided into three files. The whole, in harmony with the democratic sharing spirit of the internet, is completely free.
Let us say we click on "browse by authors". A long list of the authors present in the database appears:
1. Abbott, Edwin A. (1 items); 2. Alcott, Louisa May (2 items); 3. Alger, Horatio, Jr. (14 items); 4. Anderson, Sherwood (2 items); 5. Anonymous (1 items); 6. Aristotle (28 items); 7. Augustine (1 items); 8. Austen, Jane (4 items); 9. Austin, Mary (1 items); 10. Bacon, Francis (2 items); 11. Barrie, James Matthew (1 items); 12. Baum, L. Frank (5 items); 13. Behn, Aphra (2 items); 14. Berkeley, George (2 items); 15. Bierce, Ambrose (3 items); 16. Bronte, Charlotte (1 items); 17. Bronte, Emily (2 items); 18. Browning, Robert (3 items); 19. Buchan, John (14 items); 20. Bulwer-Lytton, Edward George (1 items); 21. Bunyan, John (3 items); 22. Burke, Edmund (1 items); 23. Burnett, Frances Hodgson (7 items); 24. Burroughs, Edgar Rice (21 items); 25. Burton, Richard Francis (1 items); 26. Butler, Samuel (1 items); 27. Byron, George Gordon (1 items); 28. Carroll, Lewis (3 items); 29. Cather, Willa (3 items); 30. Chaucer, Geoffrey (1 items); 31. Cleland, John (1 items); 32. Coleridge, Samuel Taylor (3 items); 33. Conrad, Joseph (3 items); 34. Crane, Stephen (4 items); 35. Dana, Richard Henry (1 items); 36. Defoe, Daniel (1 items); 37. Dell, Thomas (1 items); 38. Descartes, Rene (2 items); 39. Dickens, Charles (20 items); 40. Douglass, Fredrick (2 items); 41. Doyle, Arthur Conan (13 items); 42. Dreiser, Theodore (1 items); 43. Eliot, George (1 items); 44. Emerson, Ralph Waldo (16 items); 45. Epictetus (3 items); 46. Epicurus (2 items); 47. Fielding, Henry (1 items); 48. Franklin, Benjamin (7 items); 49. Freud, Sigmund (1 items); 50. Gay, John (1 items); 51. Gray, Thomas (1 items); 52. Hamilton, John (1 items); 53. Hardy, Thomas (1 items); 54. Hart, Michael S. (2 items); 55. Hawthorne, Nathaniel (25 items); 56. Henry, O. (1 items); 57. Hobbes, Thomas (1 items); 58. Hubbard, Elbert (1 items); 59. Hume, David (13 items); 60. Hylton, Jeremy (1 items); 61. Irving, Washington (34 items); 62. James, Henry (2 items); 63. James, William (2 items); 64. Jefferson, Thomas (13 items); 65. John, King (1 items); 66. Kant, Immanuel (8 items); 67. Keats, John (34 items); 68. Kennedy, John F. (1 items); 69. King, Martin Luther, Jr. (1 items); 70. Kipling, Rudyard (2 items); 71. Lang, Andrew (1 items); 72. Leibniz, Gotfried Wilhelm (1 items); 73. Lewis, Sinclair (1 items); 74. Lincoln, Abraham (2 items); 75. Locke, John (7 items); 76. London, Jack (8 items); 77. Longfellow, Henry Wadsworth (3 items); 78. Lucretius Carus, Titus (1 items); 79. MacDonald, George (1 items); 80. Machiavelli, Nicolo (1 items); 81. Marx, Karl (2 items); 82. Melville, Herman (4 items); 83. Mill, John Stuart (4 items); 84. Millay, Edna St. Vincent (1 items); 85. Milton, John (30 items); 86. Montaigne, Michel (1 items); 87. Moore, Clement Clarke (1 items); 88. More, Thomas (1 items); 89. Morley, Christopher (1 items); 90. Nietzsche, Friedrich (1 items); 91. Norris, Frank (1 items); 92. Orczy, Emmasku Orczy (1 items); 93. Paine, Thomas (4 items); 94. Pascal, Blaise (2 items); 95. Phillips, David Graham (1 items); 96. Plato (24 items); 97. Plotinus (1 items); 98. Poe, Edgar Allen (124 items); 99. Porter, Eleanor H. (1 items); 100. Prescott, William Hickling (1 items); 101. Rinehart, Mary Roberts (1 items); 102. Rousseau, Jean-Jacques (2 items); 103. Saki (6 items); 104. Sandburg, Carl (2 items); 105. Sauer, Geoffrey (1 items); 106. Scott, Walter, Sir (3 items); 107. Shakespeare, William (39 items); 108. Shelley, Mary Wollstonecraft (1 items); 109. Smith, Adam (1 items); 110. Spenser, Edmund (1 items); 111. Spinoza, Benedict De (4 items); 112. Stevenson, Robert Louis (23 items); 113. Stoker, Bram (2 items); 114. Stratton-Porter, Gene (4 items); 115. Swift, Jonathan (2 items); 116. Tennyson, Alfred (1 items); 117. Thoreau, Henry David (5 items); 118. Trollope, Anthony (1 items); 119. Twain, Mark (19 items); 120. Voltaire (1 items); 121. Wallace, Lew (1 items); 122. Washington, Booker T. (1 items); 123. Wells, H. G. (3 items); 124. White, Andrew Dickson (1 items); 125. Whitman, Walt (1 items); 126. Wilde, Oscar (17 items); 127. Williams, Bill (1 items); 128. Wollstonecraft, Mary (2 items); 129. Wright, Harold Bell (1 items).
If I now click on "Berkeley", I get the list of the works by this author that I can find in the database. I choose the Treatise Concerning The Principles Of Human Knowledge and I am given three possibilities: download the work in text format to my computer, download it as an electronic book (PDF format), or use the software that displays concordances.
By "concordances", traditionally, we mean an alphabetic repertoire of all the words used in one or more works by an author, together with a short co-text and an indication of the passages where they appear. In the electronic version, concordances are not necessarily alphabetic lists, but search engines that return a list of the co-texts in which a given string occurs in a text or corpus.
If I click on "use concordance" and then, in the following window, type "sign" in the field labelled "word", I get these results:
...general by being made the sign, not of an abstract general idea, but...
...which taken absolutely is particular, by being a sign is made general.
And as the former owes its generality not to its being the sign of an abstract or general line...
...the relation of cause and effect, but only of a mark or sign with the thing signified.
In like manner the noise that I hear is not the effect of this or that motion or collision of the ambient bodies, but the sign thereof.
I can therefore use a program provided by the site itself to consult concordances and get the results online, without even downloading Berkeley’s text to my computer. This service is, of course, very useful.
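The same kind of search can be reproduced on a text that has been downloaded, when a site offers no online concordance. What follows is only a minimal sketch in Python, not the program used by the Alex Catalogue: the function name concordance, the width parameter and the square brackets around the matched word are assumptions made for illustration. It looks for a word as a whole word, ignoring case, and returns each occurrence with a short co-text on either side.

```python
import re

def concordance(text, word, width=40):
    """Return keyword-in-context lines for every whole-word match of `word`."""
    # Collapse line breaks and repeated spaces so co-texts read as running text.
    flat = " ".join(text.split())
    # \b anchors keep "sign" from matching inside "signified" or "design".
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(flat):
        left = flat[max(0, match.start() - width):match.start()]
        right = flat[match.end():match.end() + width]
        # Square brackets mark the matched word (a hypothetical display choice).
        lines.append(f"...{left}[{match.group()}]{right}...")
    return lines

# Sample sentence taken from the Berkeley results quoted above.
sample = (
    "In like manner the noise that I hear is not the effect of this or that "
    "motion or collision of the ambient bodies, but the sign thereof."
)
for line in concordance(sample, "sign", width=30):
    print(line)
```

To run it on a whole work, the contents of a downloaded text file can be read into a string and passed as the first argument in place of the sample sentence.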
At the address http://www.ruf.rice.edu/~barlow/corpus.html there is a page entitled Corpus Linguistics. It is a metasite, because it contains links to corpora all over the world. From this page, you can get information about corpora on the internet, divided by language. Clicking on "English" you get a list of corpora in English. The first listed is the American National Corpus, which is under construction, is modelled on the British National Corpus, and for the time being has a beta version running (a version that is not yet fully tested or finalized).
One of the most important links concerns Project Gutenberg, found at the address http://www.promo.net/pg/index.html. The home page offers a search form by author and by keyword, in addition to many links to pages listing authors and works. The repertoire contains authors of many nationalities, more than two thousand in all.
If for example you choose "Tolstoy" (of course names must be entered in the English spelling), you get a list of nineteen works by Lev Nikolaevič Tolstoj, all in English translation, freely downloadable. This site has no search engine for looking up concordances online.
In the next unit we’ll go on examining online resources for translators.
Alex Catalogue of Electronic Texts, available on the world wide web at the address http://www.infomotions.com/alex/ consulted 17 April 2004.
Corpus Linguistics, available on the world wide web at the address http://www.ruf.rice.edu/~barlow/corpus.html consulted 17 April 2004.
FREUD SIGMUND L’interpretazione dei sogni, in Opere, vol. 3, a cura di C. L. Musatti, Torino, Boringhieri, 1966.
FREUD SIGMUND The Interpretation Of Dreams, translated by A. A. Brill, London, G. Allen & Company, 1913.
Project Gutenberg, available on the world wide web at the address http://www.promo.net/pg/index.html consulted 17 April 2004.
1 Freud 1900: 377.