…theory-neutrality but for its relevance to the specific research question
– Hunston (1993) studies how people talk about sameness and difference (“local grammar”)

Annotation styles
• LOB style – going_VVGK
• TEI entity references – going&VVGK;
• WSJ style – going/VVGK
• SGML – <w POS=VVGK>going</w>
• BNC style (simplified SGML) – <w VVGK>going
• XML – <w POS="VVGK">going</w>
• Standalone
  – <s><w id="1">He</w> <w id="2">was</w> <w id="3">going</w> <w id="4">to</w> <w id="5">die</w> <w id="6">.</w></s>
  – <s><word id="1">PPHS1</word> <word id="2">VBDZ</word> <word id="3">VVGK</word> <word id="4">TO</word> <word id="5">VVI</word> <word id="6">.</word></s>

Introducing CLAWS
• CLAWS: some basic facts
  – The Constituent Likelihood Automatic Word-tagging System
  – The best-known POS tagger for English
  – Has been used to tag a number of large corpora, including the 100-million-word British National Corpus (BNC)
  – Has consistently achieved 96-97% accuracy
  – A free online tagging service allows academic users to tag 100,000 words at a time (from an academic website)

CLAWS tagsets
• C7 tagset
  – A detailed tagset of 146 tags
• C5 tagset
  – Less refined, 61 tags (BNC tagset)
• The mapping between C7 and C5 is a many-to-one conversion, and is available in a tab-delimited text file
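A many-to-one conversion between C7 and C5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two-column layout (C7 tag, tab, C5 tag) is assumed, the four sample rows are illustrative and are not copied from the official mapping file, and the helper names `load_mapping` and `convert_tags` are hypothetical.

```python
import csv
import io

# Illustrative excerpt only (assumption): one "C7<TAB>C5" pair per row.
# The real mapping file may use a different layout and contains many more rows.
SAMPLE_MAPPING_TSV = "VVG\tVVG\nVVGK\tVVG\nVBDZ\tVBD\nVBDR\tVBD\n"


def load_mapping(handle):
    """Read tab-delimited rows into a dict from fine-grained to coarse tag."""
    reader = csv.reader(handle, delimiter="\t")
    return {row[0]: row[1] for row in reader if len(row) >= 2}


def convert_tags(tagged_tokens, mapping):
    """Replace each tag with its coarser equivalent.

    Tags missing from the mapping are left unchanged (a design choice made
    for this sketch, not something the source specifies).
    """
    return [(word, mapping.get(tag, tag)) for word, tag in tagged_tokens]


mapping = load_mapping(io.StringIO(SAMPLE_MAPPING_TSV))

# Two tokens from the example sentence "He was going to die."
tokens = [("was", "VBDZ"), ("going", "VVGK")]
converted = convert_tags(tokens, mapping)

# Render the result in LOB style (word_TAG).
print(" ".join(f"{word}_{tag}" for word, tag in converted))
# prints: was_VBD going_VVG
```

Because several fine-grained tags collapse into one coarse tag, the conversion cannot be reversed: a C5 tag alone does not say which C7 tag it came from.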
• C8 tagset is an extension of the C7 tagset that makes further distinctions in the determiner and pronoun categories, as well as for auxiliary verbs

Free CLAWS trial service

CLAWS output formats
• Vertical output format
• Horizontal output format (Use copy & […]

[…] 50 per year…
• Click here to find out more about the UCREL Semantic Annotation System
• Click here to run “tag wizard”
• Click here to see your work area (for data you have already processed)

Amongst other things, the link explains the categorisation scheme utilised …
• Hierarchy of 21 major discourse fields (or domains), which expand into 232 semantic field tags (see the web link)
• semantic field (or domain) = “A named area of meaning in which lexemes interrelate and define each other in specific ways” (Crystal 1995: 157)
• Note that the USAS scheme is derived from McArthur (1981)
• Designed to undertake the automatic semantic analysis of present-day English texts (spoken and written)
• Involving two stages
(i) POS tagging by CLAWS
A POS tag is assigned to ev