Paper
15 September 1995 Performance and competence models for audiovisual data fusion
Harouna Kabre
Proceedings Volume 2589, Sensor Fusion and Networked Robotics VIII; (1995) https://doi.org/10.1117/12.220949
Event: Photonics East '95, 1995, Philadelphia, PA, United States
Abstract
We describe two Artificial Neural Network (ANN) models for audio-visual data fusion. In the first model, ANN training starts from a static architecture chosen a priori, together with a set of weighting parameters for the visual and auditory paths. These weighting parameters, called attentional parameters, are tuned to achieve the best performance even when the acoustic environment changes. This model is called the Performance Model (PM). In the second model, the ANN starts with no units in its hidden layer. New units, each partially connected to either the visual path or the auditory path, are then added incrementally, and the procedure is repeated until the global error can no longer be reduced. This model is called the Competence Model (CM). CM and PM are trained and tested on acoustic data and the corresponding visual parameters (the vertical and horizontal lip widths and the lip-opening area) for audio-visual speech recognition of the 10 French vowels in adverse conditions. In both cases, we report the recognition rate and analyze the complementarity between the visual and auditory information in two ways: the number of hidden units connected to the visual or to the auditory inputs as a function of Signal-to-Noise Ratio (SNR), and the tuning of the attentional parameters as a function of SNR.
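The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the abstract does not say how the attentional parameters enter the network, so they are assumed here to be scalar gains (gain_audio, gain_visual) on each path's input. The names make_unit, hidden_activations, grow, and error_fn are hypothetical, weight training is omitted, and the unit cap max_units is an added safeguard rather than part of the described procedure.

```python
import math
import random


def make_unit(path, n_inputs, rng):
    """Create a hidden unit wired to exactly one modality ('audio' or 'visual')."""
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
    return {"path": path, "w": weights, "b": 0.0}


def hidden_activations(units, audio, visual, gain_audio=1.0, gain_visual=1.0):
    """Forward pass through partially connected hidden units.

    gain_audio and gain_visual stand in for the attentional parameters
    (assumption: scalar gains applied to each path's input vector).
    """
    acts = []
    for u in units:
        if u["path"] == "audio":
            x, g = audio, gain_audio
        else:
            x, g = visual, gain_visual
        s = u["b"] + sum(w * g * xi for w, xi in zip(u["w"], x))
        acts.append(math.tanh(s))
    return acts


def grow(units, error_fn, n_audio, n_visual, rng, tol=1e-6, max_units=50):
    """Constructive loop in the spirit of the Competence Model.

    Tries one candidate unit per path, keeps whichever gives the lower error,
    and stops once the error no longer drops by more than tol.
    error_fn is a caller-supplied (hypothetical) function: units -> global error.
    """
    best = error_fn(units)
    while len(units) < max_units:
        candidates = [
            units + [make_unit("audio", n_audio, rng)],
            units + [make_unit("visual", n_visual, rng)],
        ]
        errs = [error_fn(c) for c in candidates]
        i = 0 if errs[0] <= errs[1] else 1
        if best - errs[i] <= tol:
            break  # global error cannot be reduced any further
        units, best = candidates[i], errs[i]
    return units, best


def count_by_path(units):
    """Count hidden units per modality, e.g. to compare across SNR levels."""
    counts = {"audio": 0, "visual": 0}
    for u in units:
        counts[u["path"]] += 1
    return counts
```

Under these assumptions, lowering gain_audio as the SNR degrades shifts the network's reliance toward the visual path, and count_by_path gives the per-modality unit counts that the abstract analyzes against SNR.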
© (1995) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Harouna Kabre "Performance and competence models for audiovisual data fusion", Proc. SPIE 2589, Sensor Fusion and Networked Robotics VIII, (15 September 1995); https://doi.org/10.1117/12.220949
KEYWORDS
Curium
Phase modulation
Visualization
Signal to noise ratio
Acoustics
Data modeling
Performance modeling