Maciej Eder

Polish Academy of Sciences, Kraków

Introduction to distributional semantics: topic modelling and word vector representations

The lecture will address two methods of detecting latent semantic structure in a collection of texts without a priori assumptions about semantic relations between words. The first method, referred to as topic modelling, extracts a probabilistic model consisting of abstract “topics”, i.e. groups of words that tend to co-occur in documents, and allows one to measure the proportion of each “topic” in each document. The second method, known as word embeddings or word vector representations, is usually associated with word2vec, one of its implementations. It maps words, on the basis of the contexts in which they occur, onto vectors in a multidimensional space; the relative distances between these vectors can then be used to represent lexical (semantic) distances between words.
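The idea that words can be represented as vectors derived from their contexts can be illustrated with a minimal sketch. It is not word2vec, which learns dense vectors with a neural model; as a simpler stand-in it builds sparse vectors of raw co-occurrence counts and compares them with cosine similarity. The toy corpus, the window size, and the function names `cooccurrence_vectors` and `cosine` are assumptions made for illustration only.

```python
from collections import Counter, defaultdict
from math import sqrt

# Toy corpus, invented for illustration only.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the cat",
    "stocks fall as markets close",
    "markets rise as stocks rally",
]


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a sparse vector counting its neighbouring words."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


vectors = cooccurrence_vectors(corpus)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_stocks = cosine(vectors["cat"], vectors["stocks"])

print(f"cat ~ dog:    {sim_cat_dog:.3f}")
print(f"cat ~ stocks: {sim_cat_stocks:.3f}")
```

In this toy corpus "cat" and "dog" share context words such as "drinks" and "chases", so their vectors lie closer together than those of "cat" and "stocks", which share no context at all. Real word embeddings rely on far larger corpora and on learned, lower-dimensional vectors, but the underlying distributional intuition is the same.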
