Publication

Varro: An Algorithm and Toolkit for Regular Structure Discovery in Treebanks

Book contribution - Book chapter, conference contribution

The Varro toolkit is a system for identifying and counting a major class of regularity in treebanks and other annotated natural language data that take the form of tree structures: frequently recurring unordered subtrees. The software is designed for use in linguistics and to be maximally applicable to existing treebanks and other stores of tree-structurable natural language data. It minimizes memory use so that moderately large treebanks are tractable on commonly available computer hardware. This article introduces condensed canonically ordered trees as a data structure for efficiently discovering frequently recurring unordered subtrees.
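
The idea of a canonical ordering for unordered trees can be illustrated with a minimal sketch. This is not Varro's implementation and does not reproduce the paper's condensed representation: it assumes trees are given as hypothetical `(label, children)` pairs, sorts children recursively so that trees differing only in sibling order become identical, and counts only complete subtrees (a node together with all of its descendants), which is a narrower class than the subtrees the abstract refers to. The names `canonical` and `count_subtrees` are invented for this illustration.

```python
from collections import Counter


def canonical(tree):
    """Return a hashable copy of `tree` with children sorted recursively.

    `tree` is an assumed (label, children) pair; two trees that differ only
    in the order of siblings map to the same canonical value.
    """
    label, children = tree
    return (label, tuple(sorted(canonical(child) for child in children)))


def count_subtrees(trees):
    """Count every complete subtree, in canonical form, across `trees`."""
    counts = Counter()

    def visit(node):
        counts[node] += 1
        for child in node[1]:
            visit(child)

    for tree in trees:
        visit(canonical(tree))
    return counts


# Two toy parse trees that differ only in sibling order.
t1 = ("S", [("NP", []), ("VP", [("V", []), ("NP", [])])])
t2 = ("S", [("VP", [("NP", []), ("V", [])]), ("NP", [])])

counts = count_subtrees([t1, t2])
for subtree, n in counts.most_common():
    print(n, subtree)
```

Because both toy trees share one canonical form, the whole sentence structure is counted twice, which is the kind of recurrence an unordered-subtree search is meant to surface.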
Book: Proceedings of the 23rd International Conference on Computational Linguistics (CoLing 2010)
Pages: 810-818
ISBN: 9787302234562
Year of publication: 2010
Accessibility: Open