Publication

Wablieft: An Easy-to-Read Newspaper Corpus for Dutch

Book Contribution - Book Chapter Conference Contribution

This paper presents the Wablieft corpus, a two-million-word corpus of a Belgian easy-to-read newspaper written in Dutch. The corpus was automatically annotated with CLARIN tools and is available in several formats for download and online querying through the CLARIN infrastructure. The annotations cover part-of-speech tagging, chunking, dependency parsing, named entity recognition, morphological analysis, and Universal Dependencies. By making this corpus available, we want to stimulate research into text readability and automated text simplification.
Book: Proceedings of CLARIN Annual Conference 2019
Pages: 188-191
Number of pages: 4
Publication year: 2019
Accessibility: Open