Publication

Semantic role labeling of speech transcripts

Book Contribution - Book Chapter Conference Contribution

Speech data has been established as an extremely rich and important source of information. However, we still lack suitable methods for the semantic annotation of speech that has been transcribed by automated speech recognition (ASR) systems. For instance, semantic role labeling (SRL) of ASR data is still an unsolved problem, and the results achieved are significantly lower than on regular text data. SRL for ASR data is difficult because the transcripts lack sentence boundaries and punctuation and contain grammar errors, wrongly transcribed words, and word deletions and insertions. In this paper we propose a novel approach to SRL for ASR data based on the following idea: (1) combine evidence from different segmentations of the ASR data, (2) jointly select a good segmentation, and (3) label it with the semantics of PropBank roles. Experiments with the OntoNotes corpus show improvements over state-of-the-art SRL systems on the ASR data. As an additional contribution, we semi-automatically align the predicates found in the ASR data with the predicates in the gold standard data of OntoNotes. This is a challenging task, but the resulting alignments can serve as a gold standard for future research.
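The idea of selecting a good segmentation of an unpunctuated transcript can be illustrated with a small sketch. This is not the method of the paper, which the abstract does not specify in detail. It is a toy dynamic program, written under stated assumptions, that picks the highest-scoring split of a token stream into contiguous segments. The names `best_segmentation`, `toy_score`, and `TOY_PREDICATES` are hypothetical, and the scoring function is a stand-in for whatever evidence a real SRL system would provide. Role labeling itself (step 3) is not shown.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical predicate lexicon, used only by the toy scorer below.
TOY_PREDICATES = {"said", "left"}


def toy_score(segment: Sequence[str]) -> float:
    """Assumed stand-in for SRL evidence: reward a segment that
    contains exactly one predicate, penalize any other segment."""
    n_predicates = sum(1 for tok in segment if tok in TOY_PREDICATES)
    return 1.0 if n_predicates == 1 else -1.0


def best_segmentation(
    tokens: Sequence[str],
    score: Callable[[Sequence[str]], float],
    max_len: int = 20,
) -> List[Tuple[int, int]]:
    """Return (start, end) spans that cover `tokens` and maximize
    the sum of per-segment scores, using dynamic programming."""
    n = len(tokens)
    best = [float("-inf")] * (n + 1)  # best[i]: top score for tokens[:i]
    back = [0] * (n + 1)              # back[i]: start of the last segment
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            candidate = best[start] + score(tokens[start:end])
            if candidate > best[end]:
                best[end] = candidate
                back[end] = start
    # Walk the back-pointers from the end to recover the spans.
    spans: List[Tuple[int, int]] = []
    end = n
    while end > 0:
        spans.append((back[end], end))
        end = back[end]
    spans.reverse()
    return spans


if __name__ == "__main__":
    words = "john said mary left early".split()
    for start, end in best_segmentation(words, toy_score):
        print(words[start:end])
```

In this sketch every candidate segment contributes evidence through `score`, and the segmentation is chosen globally rather than boundary by boundary. A real system would replace `toy_score` with model confidence over candidate predicate-argument structures.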
Book: Lecture Notes in Computer Science
Pages: 583 - 595
ISBN: 978-3-319-18116-5
Publication year: 2015
BOF-keylabel: yes
IOF-keylabel: yes
Authors from: Higher Education
Accessibility: Open