When we built the unit inventory from the Blizzard corpus, we performed three types of manual work, which together took our labelers about 12 working days. To measure how much this manual work helps, we ran several perceptual experiments comparing speech generated with and without it. The results show that although manual proofreading identified more than 500 word errors, it yielded no observable improvement in our experiment. Both manual checking of segmental boundaries and manual prosody annotation improve the synthesized speech, and the latter brings more benefit. The preference rate between the final version of the synthetic speech, built with limited manual work, and the fully automatically processed version is 68% to 32%.
A Study on How Human Annotations Benefit the TTS Voice
Min Chu, Yining Chen, Yong Zhao, Yusheng Li, F. Soong
Published 2006 in Blizzard Challenge
PUBLICATION RECORD
- Publication year: 2006
- Venue: Blizzard Challenge
- Publication date: 2006-09-16
- Fields of study: Linguistics, Computer Science