Stefano Ondelli

University of Trieste, Italy

The Impact of Sociolinguistic and Morphological Factors on Corpus Design

Representativeness is probably the most important goal when compiling a corpus. Since language as a whole cannot be captured by statistical means, any investigation necessarily works with a specific sample, constructed ad hoc in view of the objectives of the research at hand. This presentation discusses the main sociolinguistic factors involved in selecting texts to achieve corpus representativeness and balance, and surveys examples of recent research on language for special purposes. In addition, the main issues arising in the pre-processing of language tokens are illustrated with specific reference to Italian morphology.



