Linguistic corpora of understudied languages: do they make sense?

Vinogradov, Igor

dc.creator	Vinogradov, Igor
dc.date	2016-06-01
dc.date.accessioned	2023-09-25T14:11:06Z
dc.date.available	2023-09-25T14:11:06Z
dc.identifier	http://www.scielo.sa.cr/scielo.php?script=sci_arttext&pid=S2215-26362016000100116
dc.identifier.uri	https://repositorioslatinoamericanos.uchile.cl/handle/2250/8814583
dc.description	Abstract: A corpus of an understudied language is usually documentary-linguistic in nature and comprises all the text material available in that language. However, without text selection, it is impossible to obtain a representative and balanced sample of language use. Lacking these two characteristics, such a corpus is almost useless for any kind of quantitative research. Nevertheless, corpora of understudied languages serve a wide range of language documentation objectives. Furthermore, they can provide evidence that particular word forms or grammatical features exist in texts that meet specific search criteria. If such corpora have well-elaborated linguistic annotation, they can complement grammatical descriptions and dictionaries, standing out from ordinary text collections thanks to their digital format. They are especially suitable for typological research, where one has to deal with large amounts of data from different, unrelated languages.
dc.format	text/html
dc.language	en
dc.publisher	Universidad de Costa Rica
dc.relation	10.15517/rk.v40i1.24143
dc.rights	info:eu-repo/semantics/openAccess
dc.source	Káñina v.40 n.1 2016
dc.subject	corpus linguistics
dc.subject	understudied languages, language documentation
dc.subject	quantitative methods
dc.title	Linguistic corpora of understudied languages: do they make sense?
dc.type	info:eu-repo/semantics/article

This item belongs to the following institution

SciELO (Costa Rica)