AI techniques typically learn a model from a large available data set. The data sets usually come from a single modality (e.g., imagery), so the resulting model is also limited to a single modality. Even when several modalities cover a common scenario (e.g., video and natural language text describing the situation), a separate model is typically built for each. Such single-modality models raise issues of robustness, efficiency, and explainability. A second modality can improve efficiency (e.g., through cueing), robustness (e.g., results are harder to fool, for instance by adversarial systems), and explainability (e.g., by drawing on evidence from different sources). The challenge is how to organize the data needed for joint training and model building. For example, what is needed in terms of (1) a structure for indexing data as an object file, (2) recording of metadata for effective correlation, and (3) supporting models and analysis that make the models interpretable for users? A variety of questions remain to be discussed, explored, and analyzed for fusion-based AI tools.
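As a rough illustration of points (1) and (2), the sketch below shows one possible way to index records from two modalities under a shared object identifier and to use recorded timestamps to correlate them. It is only a sketch under stated assumptions: the `Record` fields, the `build_index` and `correlate` helpers, and the one-second tolerance are all hypothetical and are not taken from any existing tool.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Record:
    """One observation from one modality (all field names are hypothetical)."""
    object_id: str    # identifier shared by records describing the same scene
    modality: str     # e.g., "video" or "text"
    timestamp: float  # capture time in seconds
    uri: str          # location of the raw data
    metadata: dict = field(default_factory=dict)  # sensor, source, and so on


def build_index(records):
    """Group records by object_id so related data can be looked up together."""
    index = defaultdict(list)
    for record in records:
        index[record.object_id].append(record)
    return index


def correlate(index, modality_a, modality_b, tolerance_s=1.0):
    """Pair records of two modalities that share an object_id and whose
    timestamps differ by at most tolerance_s seconds (an assumed rule)."""
    pairs = []
    for group in index.values():
        for a in group:
            if a.modality != modality_a:
                continue
            for b in group:
                close = abs(a.timestamp - b.timestamp) <= tolerance_s
                if b.modality == modality_b and close:
                    pairs.append((a, b))
    return pairs
```

Pairs produced this way could serve as candidate inputs for joint training, while the retained metadata gives users a trail back to the source data behind a given result.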