Publication

Wablieft: An Easy-to-Read Newspaper Corpus for Dutch

Book contribution - Book chapter, conference contribution

This paper presents the Wablieft corpus, a two-million-word corpus of a Belgian easy-to-read newspaper written in Dutch. The corpus was automatically annotated with CLARIN tools and is available in several formats for download and online querying through the CLARIN infrastructure. The annotations consist of part-of-speech tagging, chunking, dependency parsing, named entity recognition, morphological analysis and Universal Dependencies. By making this corpus available, we want to stimulate research into text readability and automated text simplification.
Book: Proceedings of CLARIN Annual Conference 2019
Pages: 188 - 191
Number of pages: 4
Year of publication: 2019
Accessibility: Open