Publication

MUPPETS: Multipurpose Table Segmentation

Book contribution - Book chapter, conference contribution

We present MUPPETS, a framework for partitioning the cells of a table into segments whose cells fulfil the same semantic role or belong to the same semantic data type, similar to how image segmentation in computer vision groups pixels that represent the same semantic object. Flexible constraints can be imposed on these segmentations for different use cases. MUPPETS uses a hierarchical merge tree algorithm, which allows segmentations that satisfy given constraints to be found efficiently and only requires similarities between neighbouring cells to be computed. Three applications are used to illustrate and evaluate MUPPETS: identifying tables and headers, detecting types, and discovering semantic errors.
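The merge-tree idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `cell_kind`, `similarity`, `neighbour_edges` and `segment` are hypothetical, the similarity measure is a toy (same crude cell kind or not), the merging is plain single-linkage agglomeration, and the only constraint supported is a target number of segments. It does show the two properties the abstract names: similarities are computed only between neighbouring cells, and the sequence of merges forms a hierarchy from which a segmentation can be read off.

```python
from itertools import product


def cell_kind(value):
    """Crude, hypothetical label for a cell: 'empty', 'number' or 'text'."""
    text = str(value).strip()
    if not text:
        return "empty"
    try:
        float(text)
        return "number"
    except ValueError:
        return "text"


def similarity(a, b):
    """Toy similarity (an assumption): 1.0 if both cells have the same kind, else 0.0."""
    return 1.0 if cell_kind(a) == cell_kind(b) else 0.0


def neighbour_edges(table):
    """Yield (similarity, cell, neighbour) for horizontally and vertically adjacent cells."""
    rows, cols = len(table), len(table[0])
    for r, c in product(range(rows), range(cols)):
        for dr, dc in ((0, 1), (1, 0)):
            r2, c2 = r + dr, c + dc
            if r2 < rows and c2 < cols:
                yield similarity(table[r][c], table[r2][c2]), (r, c), (r2, c2)


def segment(table, n_segments):
    """Merge the most similar neighbouring segments until n_segments remain.

    Returns a grid of segment labels and the list of merges (the merge tree).
    """
    rows, cols = len(table), len(table[0])
    parent = {cell: cell for cell in product(range(rows), range(cols))}

    def find(cell):
        # Union-find lookup with path halving.
        while parent[cell] != cell:
            parent[cell] = parent[parent[cell]]
            cell = parent[cell]
        return cell

    merges = []
    count = rows * cols
    # Highest similarity first; the stable sort keeps scan order for ties.
    for sim, a, b in sorted(neighbour_edges(table), key=lambda e: -e[0]):
        if count <= n_segments:
            break
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a
            merges.append((root_a, root_b, sim))
            count -= 1
    labels = [[find((r, c)) for c in range(cols)] for r in range(rows)]
    return labels, merges


table = [["name", "age"], ["ann", "31"], ["bob", "27"]]
labels, merges = segment(table, n_segments=2)
```

With this toy similarity the header cells end up in the same segment as the text column, so a realistic use would need a richer similarity measure and constraints suited to the task at hand.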
Book: Lecture Notes in Computer Science
Pages: 389 - 401
ISBN: 978-3-030-74251-5
Year of publication: 2021
BOF-keylabel: yes
IOF-keylabel: yes
Authors from: Higher Education
Accessibility: Open