Doctoral Thesis
Caracterização de registros orientada para a produção textual no ambiente multilíngue: estudo baseado em corpora comparáveis
Date
2013-08-27
Author
Kelen Cristina Santanna de Lima
Institution
Abstract
This thesis aims to contribute to the development of a model that accounts for text production and variability within a multilingual environment (FIGUEREDO, 2011). It builds on Systemic-Functional Linguistics (HALLIDAY; MATTHIESSEN, 2004; EGGINS, 2004; MATTHIESSEN; TERUYA; WU, 2008; FIGUEREDO, 2011) as a framework that supports an interface between Translation Studies and Corpus Linguistics (MCENERY; XIAO, 2007; GRANGER, 2008) oriented towards the semi-automatic analysis of comparable corpora. It reports a study of patterns of language use (SINCLAIR, 1991; BERBER SARDINHA, 2004) in a comparable corpus of texts on newborn screening for sickle cell disease. The texts belong to three text types, labelled as such by language users: 1) research articles (i.e., specialist-specialist interaction), 2) technical guides (specialist-technician interaction), and 3) pamphlets and patient information leaflets (specialist-layperson interaction). Text sampling followed Biber (1990) as adapted by Neumann (2005). The texts were annotated automatically and manually and queried with the R software to obtain co-occurrence patterns of specific lexical and grammatical items. After these items were classified and counted, the analysis targeted patterns of use that indicate how each text type could be classified according to its socio-semiotic process. The corpora were subsequently POS-tagged using TreeTagger. Chi-square tests, Fisher's exact tests, and z-tests were carried out to identify patterns of word classes that could be taken to differentiate the subcorpora and could support further analyses aimed at characterizing the registers with which the texts in each subcorpus were associated. Excerpts of 1,000 words were selected to represent each text type in English and Portuguese (BIBER, 1990) and classified according to language typology in the context of culture (cf. MATTHIESSEN; TERUYA; WU, 2008).
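The word-class comparison between subcorpora mentioned above can be illustrated with a minimal sketch. It is not the procedure used in the thesis (which relied on R): the counts are hypothetical, the function names `chi_square_2x2` and `two_proportion_z` are invented for illustration, and Fisher's exact test is left out for brevity. The sketch uses only the Python standard library and compares how often one word class occurs in two subcorpora.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the upper-tail p-value equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def two_proportion_z(x1, n1, x2, n2):
    """Two-sided z-test for the difference between two proportions (pooled variance)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts, NOT taken from the thesis:
# nouns vs. all other tokens in two 1,000-word excerpts.
nouns_articles, tokens_articles = 320, 1000
nouns_pamphlets, tokens_pamphlets = 250, 1000

chi2, p_chi2 = chi_square_2x2(
    nouns_articles, tokens_articles - nouns_articles,
    nouns_pamphlets, tokens_pamphlets - nouns_pamphlets,
)
z, p_z = two_proportion_z(
    nouns_articles, tokens_articles, nouns_pamphlets, tokens_pamphlets
)

print(f"chi-square = {chi2:.3f}, p = {p_chi2:.5f}")
print(f"z = {z:.3f}, p = {p_z:.5f}")
```

For a 2x2 table the two tests agree (the squared z statistic equals the chi-square statistic), so in practice one of them is enough for a single word class; a chi-square test over a larger table would be needed to compare all three text types at once.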
These texts were pasted into UAM CorpusTool® for annotation and semi-automatic analysis of choices within the ideational, interpersonal and textual metafunctions (HALLIDAY; MATTHIESSEN, 2004). Frequencies of lexical and grammatical items in each text were computed with a view to proposing a systemic-functional (SFL) description (FIGUEREDO, 2011) of the TRANSITIVITY, MOOD, THEME and MESSAGE systems. The results pointed to registerial differences in lexical variation, lexical density, and the frequency of occurrence of lexical and grammatical items, and provided a word class-based mapping of how these items are distributed in the texts. In the light of Systemic-Functional Linguistics (SFL), between-text differences and similarities were interpreted in terms of the impact of context variables (i.e., field, tenor, and mode) on the lexico-grammar (EGGINS, 2004). Context parameters were used to locate the labels in the text typology (MATTHIESSEN; TERUYA; WU, 2008) and classify them as pertaining to the socio-semiotic processes EXPLORING (research article) and ENABLING (technical guide, and pamphlets and patient information leaflets). This classification shed light on text production within the multilingual environment. The SFL-based description of the metafunctional profile of the texts showed that, ideationally, material and relational processes were the main processes used to construe the real world in all text types in both languages. Interpersonally, i.e., regarding author-reader interaction, the declarative mood, with the semantic function of supplying information, was predominant in all text types in both Portuguese and English, whereas the imperative mood, with the semantic function of demanding goods and services, was found only in the pamphlets and patient information leaflets.
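Two of the measures named above, lexical density and lexical variation, can be sketched as follows. The definitions are assumptions, since the abstract does not state which ones the thesis adopts: lexical density is taken here as the share of lexical (content) tokens among all tokens, and lexical variation as the type-token ratio. The coarse tagset, the `LEXICAL_TAGS` set, and the hand-tagged sample are hypothetical and do not reproduce TreeTagger output.

```python
# Hypothetical coarse tags treated as lexical (content) word classes.
LEXICAL_TAGS = {"NOUN", "VERB", "ADJ", "ADV"}


def lexical_density(tagged_tokens):
    """Share of lexical tokens among all tokens (assumed definition)."""
    lexical = sum(1 for _, tag in tagged_tokens if tag in LEXICAL_TAGS)
    return lexical / len(tagged_tokens)


def type_token_ratio(tagged_tokens):
    """Distinct lower-cased word forms divided by the total number of tokens."""
    words = [word.lower() for word, _ in tagged_tokens]
    return len(set(words)) / len(words)


# Hand-tagged illustrative sample, not drawn from the corpus.
sample = [
    ("The", "DET"), ("test", "NOUN"), ("detects", "VERB"),
    ("sickle", "NOUN"), ("cell", "NOUN"), ("disease", "NOUN"),
    ("in", "ADP"), ("the", "DET"), ("newborn", "NOUN"),
]

print(f"lexical density:  {lexical_density(sample):.3f}")
print(f"type-token ratio: {type_token_ratio(sample):.3f}")
```

Because the type-token ratio depends on text length, comparing excerpts of equal size, such as the 1,000-word samples described in the abstract, keeps the values comparable across text types.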
Textually, all text types are organized, at the semantic level, on the basis of initial messages and of continuity and discontinuity messages (change and shift), which are similar in English and Portuguese, and, at the grammatical level, on the basis of theme types, which differ between English and Portuguese. Initial messages are those that sort out text information; continuity messages add information to the initial messages; and discontinuity:change messages guide the flow of text information on the basis of a particular participant. Textual and default Themes were the most frequent types of Theme in all text types in Portuguese, whereas simple and multiple (textual and topical) Themes were the most frequent in all text types in English; the angle-source Theme was found only in the Portuguese research articles; and the multiple (interpersonal and topical) Theme occurred only in the English pamphlets and patient information leaflets. Building on the prototypical metafunctional profile thus identified, a template was developed to serve as a basis for the production of part of an ENABLING text type based on SFL-informed metafunctional choices. The results reported are the first within a joint project developed by the Laboratory for Experimentation in Translation (LETRA, Faculty of Arts, Federal University of Minas Gerais) and the Center for Newborn Screening and Genetics Diagnosis (NUPAD, School of Medicine, Federal University of Minas Gerais).