ng your intuitions + exploring online resources

What is a corpus?
• The word corpus comes from Latin (“body”); the plural is corpora
• A corpus is a body of naturally occurring language
  – …but rarely a random collection of text
  – Corpora “are generally assembled with particular purposes in mind, and are often assembled to be (informally speaking) representative of some language or text type.” (Leech 1992)
• “A corpus is a collection of (1) machine-readable (2) authentic texts (including transcripts of spoken data) which is (3) sampled to be (4) representative of a particular language or language variety.” (MXT 2021: 5)

What is not a corpus?
• A list of words is not a corpus
  – Words are the building blocks of language
• A text archive is not a corpus
  – It is a random collection of texts
• A collection of citations is not a corpus
  – A citation is a short quotation containing a word or phrase that is the reason for its selection
• A collection of quotations is not a corpus
  – A quotation is a short selection from a text, chosen on internal criteria by human beings
• A text is not a corpus
  – Intended to be read in different ways
• The Web is not a corpus
  – Its dimensions are unknown, it is constantly changing, and it was not designed from a linguistic perspective
Sinclair (2021)

What is a corpus for?
• A corpus is made for the study of language in a broad sense
  – To test existing linguistic theories and hypotheses
  – To generate and verify new linguistic hypotheses
• The purpose is reflected in a well-designed corpus

Why use corpora?
• Even expert speakers have only a partial knowledge of a language
  – A corpus can be more comprehensive and balanced
• Even expert speakers tend to notice the unusual and think of what is possible
  – A corpus can show us what is common and typical
• Even expert speakers cannot quantify their knowledge of language
  – A corpus can readily give us accurate statistics
• Even expert speakers cannot remember everything they know
  – A corpus can store and recall all the information that has been stored in it
• Even expert speakers cannot make up natural examples
  – A corpus can provide us with a vast number of examples in real communication contexts
• Even expert speakers have prejudices and preferences, and every language has cultural connotations and an underlying ideology
  – A corpus can give us more objective evidence
• Even expert speakers are not always available to be consulted
  – A corpus can be made permanently accessible to all
• Even expert speakers cannot keep up with language change
  – A constantly updated corpus can reflect even recent changes in the language
• Even expert speakers lack authority: they can be challenged by other expert speakers
  – A corpus can encompass the actual language use of many expert speakers

Intuitions as an alternative
• Intuitions are always useful in linguistics
  – To invent (grammatical, ungrammatical, or questionable) example sentences for linguistic analysis
  – To make judgments about the acceptability / grammaticality or meaning of an expression
  – To help with categorization
• Intuitions should be applied with caution
  – Possibly biased, as they are likely to be influenced by one's dialect or sociolect
  – Introspective data is artificial and may not represent typical language use, as one is consciously monitoring one's language production
  – Introspective data is decontextualized because it exists in the analyst's