Paper
26 March 1998 Continuous speech segmentation determined by blind source separation
Abstract
A residual error rate of about 5 percent in continuous speech recognition is due in part to the difficulty of identifying a mixture of up to two phonemes in close concatenation. For instance, one says 'Let's go' instead of 'Let us go'. There are two kinds of speech segmentation: linguistic and acoustic. Linguistic segmentation, which has been studied previously, relies on a combination of acoustic, lexical, semantic, and statistical knowledge sources. Everyday spoken conversation is usually abbreviated for the speaker's convenience. Acoustic segmentation separates mixed sounds such as /ts/ into /t/ and /s/ so that linguistic units can be found automatically. The adaptive wavelet transform (AWT) developed by Szu is a linear superposition of banks of constant-Q, zero-mean mother wavelets, implemented by an artificial neural network (ANN) called a 'wavenet'. Each neuron is represented by a daughter wavelet, which for a continuous AWT can be an affine scale change of an identical or a different mother wavelet. AWT was designed for the cocktail party effect and for solving the acoustic segmentation of phonemes with a supervised-learning ANN architecture. In this paper, we review AWT from the viewpoint of independent component analysis (ICA) and then apply blind source separation to acoustic de-mixing and segmentation.
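To make the idea of blind source separation concrete, the following is a minimal sketch and not the authors' wavenet method. It assumes two synthetic stand-in signals (a sine and a sawtooth, not real phonemes), a hypothetical instantaneous 2x2 mixture, and a simple kurtosis-based ICA: the mixtures are whitened, then a grid search picks the rotation that makes the outputs most non-Gaussian. All function names and mixing weights are illustrative assumptions.

```python
import math

def center(x):
    """Subtract the mean so the signal is zero-mean."""
    m = sum(x) / len(x)
    return [v - m for v in x]

def whiten(x1, x2):
    """Rotate two zero-mean signals onto their principal axes, then scale to unit variance."""
    n = len(x1)
    a = sum(u * u for u in x1) / n
    b = sum(u * v for u, v in zip(x1, x2)) / n
    d = sum(v * v for v in x2) / n
    phi = 0.5 * math.atan2(2.0 * b, a - d)  # angle that diagonalizes the 2x2 covariance
    c, s = math.cos(phi), math.sin(phi)
    p1 = [c * u + s * v for u, v in zip(x1, x2)]
    p2 = [-s * u + c * v for u, v in zip(x1, x2)]
    sd1 = math.sqrt(sum(u * u for u in p1) / n)
    sd2 = math.sqrt(sum(v * v for v in p2) / n)
    return [u / sd1 for u in p1], [v / sd2 for v in p2]

def excess_kurtosis(y):
    """Excess kurtosis of a zero-mean, unit-variance signal (0 for a Gaussian)."""
    return sum(v ** 4 for v in y) / len(y) - 3.0

def rotate(z1, z2, theta):
    """Apply a plane rotation by theta to the pair of signals."""
    c, s = math.cos(theta), math.sin(theta)
    return ([c * u + s * v for u, v in zip(z1, z2)],
            [-s * u + c * v for u, v in zip(z1, z2)])

def separate(x1, x2, steps=90):
    """Whiten the mixtures, then grid-search the rotation maximizing total |excess kurtosis|."""
    z1, z2 = whiten(center(x1), center(x2))
    angles = (k * (math.pi / 2.0) / steps for k in range(steps))
    best = max(angles, key=lambda th: sum(abs(excess_kurtosis(y))
                                          for y in rotate(z1, z2, th)))
    return rotate(z1, z2, best)

def corr(a, b):
    """Pearson correlation coefficient between two signals."""
    a, b = center(a), center(b)
    num = sum(u * v for u, v in zip(a, b))
    return num / math.sqrt(sum(u * u for u in a) * sum(v * v for v in b))

# Synthetic stand-in sources (assumption: not speech data).
N = 1000
t = [i / N for i in range(N)]
s1 = [math.sin(2.0 * math.pi * 7 * ti) for ti in t]   # sine, 7 cycles
s2 = [2.0 * ((13 * ti) % 1.0) - 1.0 for ti in t]      # sawtooth, 13 cycles

# Hypothetical mixing weights; the separation step never sees them.
x1 = [0.6 * u + 0.4 * v for u, v in zip(s1, s2)]
x2 = [0.3 * u + 0.7 * v for u, v in zip(s1, s2)]

y1, y2 = separate(x1, x2)
```

As with any ICA method, the recovered signals `y1` and `y2` match the sources only up to ordering, sign, and scale, so a comparison such as `corr(y1, s1)` should be read in absolute value and checked against both sources.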
© (1998) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Harold H. Szu, Charles C. Hsu, and Da-Hong Xie "Continuous speech segmentation determined by blind source separation", Proc. SPIE 3391, Wavelet Applications V, (26 March 1998); https://doi.org/10.1117/12.304890
KEYWORDS
Wavelets
Acoustics
Independent component analysis
Superposition
Neural networks
Neurons
Feature extraction
