Presentation + Paper
13 June 2023
Analyzing synthetic datasets through the training and inference domain gap
Vickram Rajendran, Chuck Tang, Frits van Paasschen
Abstract
Synthetic data has been shown to improve the performance of perception algorithms; however, it remains unclear how to choose the right techniques for generating synthetic datasets and for training perception models on them. In this work, we show that creating a digital twin of a real-world dataset allows a more principled evaluation of the synthetic-to-real domain gap for both training and inference, and we use this information to test and evaluate the synthetic datasets themselves, rather than the models trained on them. Furthermore, we show that this framework yields a measure of the inference domain gap, which indicates whether testing a perception model on synthetic data is representative of testing it in the real world. We use these measures to generate a synthetic dataset from the nuScenes autonomous driving dataset, targeted to maximally improve performance on specific rare classes. We optimize the synthetic data generation parameters for this dataset to reduce the inference and training domain gaps, and we show performance improvements of over 18% on the bicycle class. Our training results also provide a way of measuring the training domain gap, which can be used to analyze synthetic datasets.
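As a rough illustration of the two gaps described in the abstract, the sketch below compares per-class scores across train/test pairings of a real dataset and its digital twin. The abstract does not give formulas, so the gap definitions, the `TwinScores` container, the `best_config` helper, and all numbers here are assumptions made for illustration, not the authors' method or results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwinScores:
    """Scores for one class (e.g. average precision in [0, 1]).

    Hypothetical container: the field names are not from the paper.
    """
    real_train_real_test: float  # trained on real data, tested on real data
    real_train_twin_test: float  # same real-trained model, tested on the digital twin
    twin_train_real_test: float  # trained on the digital twin, tested on real data


def inference_gap(s: TwinScores) -> float:
    """Assumed definition: change in a real-trained model's score when the
    real test set is swapped for its digital twin."""
    return abs(s.real_train_real_test - s.real_train_twin_test)


def training_gap(s: TwinScores) -> float:
    """Assumed definition: score lost on real test data when the real
    training set is swapped for its digital twin."""
    return s.real_train_real_test - s.twin_train_real_test


def best_config(candidates: dict[str, TwinScores]) -> str:
    """Pick the generation config with the smallest combined gap."""
    return min(
        candidates,
        key=lambda name: inference_gap(candidates[name]) + training_gap(candidates[name]),
    )


# Made-up numbers for illustration only; not results from the paper.
candidates = {
    "config_a": TwinScores(0.5, 0.25, 0.375),
    "config_b": TwinScores(0.5, 0.4375, 0.4375),
}
```

Under these assumed definitions, a small inference gap suggests that synthetic test results track real-world test results, and a small training gap suggests that the synthetic data is a reasonable stand-in for real training data. Summing the two gaps in `best_config` is one simple choice; a weighted combination would work equally well.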
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Vickram Rajendran, Chuck Tang, and Frits van Paasschen "Analyzing synthetic datasets through the training and inference domain gap", Proc. SPIE 12529, Synthetic Data for Artificial Intelligence and Machine Learning: Tools, Techniques, and Applications, 125290R (13 June 2023); https://doi.org/10.1117/12.2664569
KEYWORDS
Data modeling
Performance modeling
LIDAR
Object detection
Machine learning
Data processing
Distance measurement