Stefano Ondelli

University of Trieste, Italy

The Impact of Sociolinguistic and Morphological Factors on Corpus Design

Representativeness is arguably the most important goal when compiling a corpus. Since language as a whole cannot be captured by statistical means, any investigation must work with a specific sample, constructed ad hoc for the objectives of the research at hand. This presentation discusses the main sociolinguistic factors involved in selecting texts to achieve corpus representativeness and balance, and gives an overview of recent research on language for special purposes. In addition, the main issues that emerge when pre-processing language tokens will be illustrated with specific reference to Italian morphology.




