Jacques Savoy

University of Neuchatel, Switzerland

Other Applications of Authorship Attribution Methods

The general authorship attribution question can arise in contexts other than the classical one. In the verification problem, we must determine whether or not a given person wrote a text. The expected answer is binary (yes or no), usually accompanied by a degree of support (or a probability) that the proposed answer is correct. Although this question seems simpler than the classical authorship question, it is not: the training examples are limited to a single person, and generating alternative candidates is always problematic.
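One way to picture a binary answer with a degree of support is the sketch below. It assumes that a distance between the disputed text and the candidate's known writing has already been computed, along with distances to a set of other writers used as comparison candidates. The function name `verify`, the comparison scheme, and the 0.5 cut-off are illustrative assumptions, not the method presented in the talk, and the support value is a simple proportion rather than a calibrated probability.

```python
def verify(dist_to_author, dists_to_others, cutoff=0.5):
    """Hypothetical verification rule.

    dist_to_author: distance between the disputed text and the
        candidate author's known texts.
    dists_to_others: distances between the disputed text and texts
        by other writers (the comparison candidates).
    Returns (decision, support), where support is the share of
    comparison candidates that are farther away than the author.
    """
    if not dists_to_others:
        raise ValueError("at least one comparison candidate is needed")
    farther = sum(1 for d in dists_to_others if dist_to_author < d)
    support = farther / len(dists_to_others)
    return support > cutoff, support


decision, support = verify(0.2, [0.5, 0.6, 0.1, 0.9])
print(decision, support)
```

The quality of such a rule depends heavily on how the comparison candidates are chosen, which is the difficulty the paragraph above points to.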

As a second question, we will present the author clustering problem. In this case, given a set of n texts, we must determine the number k of distinct authors and group together all texts written by the same person. In this context, no training examples are available (because the author names are unknown). To illustrate solutions to this question, corpora drawn from English and French literature (mainly from the 19th century) will be used. Usually, effective solutions must take into account both the style and the topics of the texts.
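As a rough illustration of how k can emerge from the data rather than being fixed in advance, the sketch below links any two texts whose pairwise distance falls under a threshold and reports the resulting groups. The distance matrix, the threshold value, and the function name `cluster_by_threshold` are assumptions made for this example; the talk may rely on a different clustering strategy.

```python
def cluster_by_threshold(dist, threshold):
    """Group texts by single-link merging under a distance threshold.

    dist: symmetric n x n matrix (list of lists) of pairwise distances.
    threshold: hypothetical cut-off; pairs at or below it are assumed
        to share an author.
    Returns a sorted list of clusters, each a list of text indices.
    """
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the cluster representative.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy distances between four texts (invented values).
toy = [
    [0.0, 0.1, 0.8, 0.9],
    [0.1, 0.0, 0.7, 0.8],
    [0.8, 0.7, 0.0, 0.2],
    [0.9, 0.8, 0.2, 0.0],
]
clusters = cluster_by_threshold(toy, threshold=0.3)
print(len(clusters), clusters)
```

Here the number of clusters plays the role of k. The result is sensitive to the threshold, so in practice it would have to be tuned or replaced by a more principled stopping criterion.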

To address these two questions, we will show how to build various text representations and implement different intertextual distance measures. Using these techniques, temporal evolution can also be detected and visualized. For this last task, speeches delivered by US presidents will form our evaluation corpus. Here too, style can partially account for the results, but the topics are also useful to further explain the resulting classification.
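To make the idea of a text representation paired with an intertextual distance concrete, here is a minimal sketch. It assumes each text is represented by the relative frequencies of the m most frequent words of the whole collection and compares two texts with a Manhattan (L1) distance. Both choices, as well as the helper names, are illustrative assumptions; the talk covers several representations and measures that may differ from this one.

```python
import re
from collections import Counter


def tokenize(text):
    # Lowercase word tokens; a deliberately simple tokenizer.
    return re.findall(r"\w+", text.lower())


def most_frequent_words(texts, m):
    # Vocabulary: the m most frequent words over all texts pooled together.
    pooled = Counter()
    for text in texts:
        pooled.update(tokenize(text))
    return [word for word, _ in pooled.most_common(m)]


def profile(text, vocabulary):
    # Relative frequency of each vocabulary word in the text.
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty text
    return [counts[word] / total for word in vocabulary]


def manhattan_distance(p, q):
    # Sum of absolute differences between two frequency profiles.
    return sum(abs(a - b) for a, b in zip(p, q))


texts = [
    "the cat sat on the mat and the dog slept",
    "the dog sat on the rug and the cat slept",
    "we shall defend our nation and we shall prevail",
]
vocab = most_frequent_words(texts, m=10)
profiles = [profile(t, vocab) for t in texts]
print(manhattan_distance(profiles[0], profiles[1]))
print(manhattan_distance(profiles[0], profiles[2]))
```

Restricting the vocabulary to very frequent words tends to emphasize style, while a larger vocabulary lets topical words influence the distance, which is one way to see why both style and topic can shape the results.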





