Automatic building of annotated image datasets for training neural networks

Andrei Zhdanov; Egor Khilik; Dmitry Zhdanov; Igor Potemin; Alexander Belozubov; Yan Wang; Igor Kinev

doi:10.1117/12.2687751

27 November 2023

Proceedings Volume 12767, Optoelectronic Imaging and Multimedia Technology X; 127671K (2023) https://doi.org/10.1117/12.2687751
Event: SPIE/COS Photonics Asia, 2023, Beijing, China

Conference Poster

Abstract

Objects in RGB and RGB-D images are usually recognized and classified by specially trained neural networks. The quality of object recognition depends on the quality of network training. Because a network cannot generalize far beyond the limits of its training set, building datasets and annotating them correctly are problems of particular relevance. These tasks are time-consuming and can be difficult to perform in practice, since it is not always possible to create the required real-world illumination and observation conditions. Synthesized images with a high degree of realism can therefore be used as input data for deep learning. Synthesizing realistic images requires appropriate realistic models of the scene objects, the illumination, and the observation conditions, including those achieved with special optical devices. However, this is not enough to create a dataset, since thousands of images must be generated, which is hardly possible to do manually. Therefore, an automated solution is proposed that processes the scene to observe it from different angles, modifies the scene by adding, deleting, moving, or rotating individual objects, and then automatically annotates the desired scene image. As a result, not only the directly visible images of scene objects but also their reflections can be annotated. In addition to the segmented image, a segmented point cloud and a depth map image (RGB-D) are built, which helps in training neural networks that work with such data. For this purpose, a Python scripting interpreter was built into a realistic rendering system. Scripts can perform any action with the scene that is allowed in the user interface, which makes it possible to control the automatic synthesis and segmentation of images. The paper provides examples of automatic dataset generation and the results of the correspondingly trained neural networks.
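The abstract does not describe the scripting API of the rendering system, so the following is only a minimal sketch of the kind of automation it outlines, written with the Python standard library. All names here (`ViewJob`, `sample_camera_position`, `build_jobs`, `backproject`) are hypothetical, and the pinhole camera model used to turn a depth map plus a segmentation mask into a labelled point cloud is an assumption, not the authors' stated method.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class ViewJob:
    """One image to synthesize (hypothetical structure, not the paper's API)."""
    camera_position: tuple  # (x, y, z); the camera is assumed to look at the origin
    object_edits: dict      # object name -> ("delete",) or ("move_rotate", dx, dy, yaw_deg)


def sample_camera_position(rng, radius):
    """Pick a point on a sphere of the given radius, kept above the floor plane.

    Sampling the height uniformly gives a distribution that is uniform
    over the area of the spherical cap with z >= 0.1 * radius.
    """
    azimuth = rng.uniform(0.0, 2.0 * math.pi)
    z_unit = rng.uniform(0.1, 1.0)
    r_xy = math.sqrt(1.0 - z_unit * z_unit)
    return (radius * r_xy * math.cos(azimuth),
            radius * r_xy * math.sin(azimuth),
            radius * z_unit)


def build_jobs(object_names, n_images, radius=3.0, seed=0):
    """Plan n_images renders, each with a random viewpoint and random scene edits."""
    rng = random.Random(seed)  # fixed seed so a dataset can be regenerated
    jobs = []
    for _ in range(n_images):
        edits = {}
        for name in object_names:
            if rng.random() < 0.2:  # assumed probability of removing an object
                edits[name] = ("delete",)
            else:
                edits[name] = ("move_rotate",
                               rng.uniform(-0.5, 0.5),
                               rng.uniform(-0.5, 0.5),
                               rng.uniform(0.0, 360.0))
        jobs.append(ViewJob(sample_camera_position(rng, radius), edits))
    return jobs


def backproject(depth, labels, fx, fy, cx, cy):
    """Convert a depth map and a per-pixel label map into labelled 3D points.

    depth and labels are equally sized lists of rows; pixels with a
    non-positive depth are treated as empty and skipped.
    """
    points = []
    for v, (depth_row, label_row) in enumerate(zip(depth, labels)):
        for u, (z, label) in enumerate(zip(depth_row, label_row)):
            if z <= 0.0:
                continue
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z, label))
    return points
```

In a real pipeline each `ViewJob` would be handed to the renderer's own scripting calls, which would return the image, the segmentation mask, and the depth map; those calls are deliberately left out here because the abstract does not name them.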

(2023) Published by SPIE. Downloading of the abstract is permitted for personal use only.

Citation

Andrei Zhdanov, Egor Khilik, Dmitry Zhdanov, Igor Potemin, Alexander Belozubov, Yan Wang, and Igor Kinev "Automatic building of annotated image datasets for training neural networks", Proc. SPIE 12767, Optoelectronic Imaging and Multimedia Technology X, 127671K (27 November 2023); https://doi.org/10.1117/12.2687751
