Dataset

Personae Corpus

The Personae corpus was collected for experiments in authorship attribution and personality prediction. It consists of 145 Dutch-language essays, each written by a different student (BA in Linguistics and Literature at the University of Antwerp, Belgium). Each student also took an online MBTI (Myers-Briggs Type Indicator) personality test, which makes personality prediction experiments possible. The corpus is controlled for topic, register, genre, age, and education level. Three components are available: the original texts, a syntactically annotated version of the texts, and the metadata.
Publication year: 2008
Accessibility: open
Publisher: CLiPS Research Group, University of Antwerp
License: CC-BY-4.0
Format: txt
Keywords: Linguistics