Synthetic data has proven highly useful for improving the performance of perception algorithms; however, it remains unclear how to identify the right techniques for generating synthetic datasets and for training perception models on them. In this work, we show how creating a digital twin of a real-world dataset enables a more principled evaluation of the synthetic-to-real domain gaps for both training and inference, and we use this information to test and evaluate the synthetic datasets themselves, rather than the models trained on them. Furthermore, we show how this framework yields a measure of the inference domain gap, which indicates whether testing a perception model on synthetic data is representative of testing in the real world. We use these measures to generate a synthetic dataset from the nuScenes autonomous driving dataset, targeted to maximally improve performance on specific rare classes, and we optimize the synthetic data generation parameters for this dataset in order to reduce the inference and training domain gaps. We show performance improvements of over 18% on the bicycle class. Our training results also provide a way of measuring the training domain gap for analyzing synthetic datasets.
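The inference and training domain gaps described above can be understood as differences in a model's evaluation metric between a real dataset and its synthetic digital twin. The sketch below is a minimal illustration of that idea only; the `evaluate_map` helper, the model objects, and the numbers are hypothetical stand-ins, not the authors' implementation.

```python
# Minimal sketch of inference/training domain-gap measures, assuming a
# hypothetical evaluate_map(model, dataset) -> mean-average-precision helper
# and pre-built validation splits. Illustrative only.

def inference_domain_gap(model, real_val, twin_val, evaluate_map):
    """Gap between evaluating the same model on real validation data and on
    its synthetic digital twin; a small gap suggests that testing on the
    synthetic twin is representative of testing in the real world."""
    return evaluate_map(model, real_val) - evaluate_map(model, twin_val)


def training_domain_gap(model_trained_real, model_trained_synth, real_val,
                        evaluate_map):
    """Gap between a model trained on real data and one trained on synthetic
    data, with both evaluated on the same real validation split."""
    return (evaluate_map(model_trained_real, real_val)
            - evaluate_map(model_trained_synth, real_val))


if __name__ == "__main__":
    # Dummy stand-ins so the sketch runs end to end; real usage would pass a
    # detector and nuScenes-style validation splits.
    fake_eval = lambda model, dataset: dataset["map"]
    real_val = {"map": 0.42}
    twin_val = {"map": 0.37}
    print(inference_domain_gap("detector", real_val, twin_val, fake_eval))  # ~0.05
```

A larger inference gap would indicate that the synthetic twin is not yet a faithful proxy for real-world evaluation, which is the signal the generation parameters are tuned to reduce.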