We present a method for monitoring rapidly urbanizing areas with deep learning techniques. The method was developed during our participation in the SpaceNet 7 deep learning challenge and uses a U-Net architecture to semantically label each frame in a time series of monthly images spanning roughly two years. The image sequences were collected over one hundred rapidly urbanizing regions. We discuss our network architecture and the post-processing algorithms that combine multiple semantically labeled frames to provide object-level change detection.
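The object-level change detection step can be illustrated with a minimal sketch: given co-registered per-frame building masks produced by the segmentation network, footprints are matched across frames by intersection-over-union, and unmatched footprints are flagged as new construction. The helper names, the SciPy-based component labeling, and the IoU threshold below are illustrative assumptions, not the paper's exact post-processing.

```python
# Hedged sketch: object-level change detection from per-frame building masks.
# Assumes masks are co-registered HxW boolean arrays, one per monthly frame;
# the 0.25 IoU threshold is an illustrative choice, not from the paper.
import numpy as np
from scipy import ndimage

def footprints(mask):
    """Label connected components; return a list of boolean footprint masks."""
    labeled, n = ndimage.label(mask)
    return [labeled == i for i in range(1, n + 1)]

def iou(a, b):
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

def new_objects(prev_mask, curr_mask, thresh=0.25):
    """Footprints in the current frame with no IoU match in the previous one."""
    prev = footprints(prev_mask)
    return [fp for fp in footprints(curr_mask)
            if all(iou(fp, p) < thresh for p in prev)]
```

Accumulating the `new_objects` output over consecutive monthly frames would yield an object-level change map for the sequence.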
Standard ATR algorithms suffer from a lack of transparency into why the algorithm recognized a particular object as a target. We present an enhanced Explainable ATR (XATR) algorithm that utilizes super-resolution networks to provide increased robustness. XATR is a two-level network: the lower level uses Region-based Convolutional Neural Networks (R-CNNs) to recognize major parts of the target, known as the vocabulary, while the upper level employs Markov Logic Networks (MLNs) and structure learning to learn the geometric and spatial relationships between the vocabulary parts that best describe the objects. Image degradation due to noise, blurring, decimation, etc., can severely impact XATR performance because feature content is irrevocably lost. We address this by introducing a novel super-resolution network with a dynamic U-Net design: a ResNet forms the encoder path, while dynamically linked upsampling heads reconstruct the imagery in the decoder path. The network is trained on pairs of high-resolution and degraded imagery to super-resolve the degraded imagery. The trained dynamic U-Net then super-resolves unseen degraded imagery, recovering XATR performance that would otherwise be lost on the degraded inputs. In this paper, we perform experiments to 1) determine the sensitivity of XATR to image corruption, 2) improve XATR performance with super-resolution, and 3) demonstrate XATR robustness to image degradation and occlusion. Our experiments demonstrate improved recall (+40%) and accuracy (+20%) on degraded images when super-resolution is applied.
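As a rough illustration of the super-resolution component, the sketch below wires a torchvision ResNet-18 encoder into a small U-Net-style decoder with skip-connected upsampling heads, to be trained on (degraded, high-resolution) pairs. The layer selection, channel widths, and loss are assumptions for illustration, not the paper's exact dynamic U-Net.

```python
# Hedged sketch of a U-Net-style super-resolution network: ResNet-18 encoder,
# upsampling decoder heads with skip connections. Requires torchvision >= 0.13.
import torch
import torch.nn as nn
from torchvision.models import resnet18

class SRUNet(nn.Module):
    def __init__(self):
        super().__init__()
        r = resnet18(weights=None)
        self.stem = nn.Sequential(r.conv1, r.bn1, r.relu)  # /2, 64 ch
        self.enc1 = nn.Sequential(r.maxpool, r.layer1)     # /4, 64 ch
        self.enc2 = r.layer2                               # /8, 128 ch
        self.up2 = self._up(128, 64)                       # /8 -> /4
        self.up1 = self._up(128, 64)                       # 64 skip + 64 up
        self.up0 = self._up(128, 32)
        self.head = nn.Conv2d(32, 3, 3, padding=1)

    @staticmethod
    def _up(cin, cout):
        return nn.Sequential(
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True))

    def forward(self, x):
        s0 = self.stem(x)                            # /2
        s1 = self.enc1(s0)                           # /4
        s2 = self.enc2(s1)                           # /8
        d1 = self.up2(s2)                            # /4
        d0 = self.up1(torch.cat([d1, s1], dim=1))    # /2
        out = self.up0(torch.cat([d0, s0], dim=1))   # /1
        return self.head(out)
```

Training would minimize a reconstruction loss on the image pairs, e.g. `torch.nn.functional.l1_loss(model(degraded), clean)`, before feeding the super-resolved output to the downstream recognizer.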
Many problems in defense and automatic target recognition (ATR) require concurrent detection and classification of objects of interest in wide field-of-view overhead imagery. Traditional machine learning approaches are optimized to perform either detection or classification individually; only recently have algorithms expanded to tackle both problems simultaneously. Even high-performing parallel approaches struggle to disambiguate tightly clustered objects, often relying on external techniques such as non-maximum suppression. We have developed a hybrid detection-classification approach that optimizes the segmentation of closely spaced objects regardless of their size, shape, and diversity, improving overall performance on both the detection and classification problems.
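For context, the non-maximum suppression step that detector pipelines commonly lean on to disambiguate overlapping detections looks roughly like the following; the (x1, y1, x2, y2) box format and the IoU threshold are conventional choices, not specifics from this paper.

```python
# Minimal sketch of greedy non-maximum suppression over axis-aligned boxes.
# boxes: (N, 4) array of (x1, y1, x2, y2); scores: (N,) confidence values.
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    order = np.argsort(scores)[::-1]       # highest-scoring box first
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        # Intersection of the kept box with all remaining candidates.
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + areas - inter)
        order = order[1:][iou < iou_thresh]  # drop boxes overlapping the kept one
    return keep
```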
Automatic Target Recognition (ATR) in Synthetic Aperture Radar (SAR) imagery for wide-area search is a difficult problem for both classic techniques and state-of-the-art approaches. Deep Learning (DL) techniques have been shown to be effective at detection and classification; however, they require significant amounts of training data. Sliding-window detectors with Convolutional Neural Network (CNN) backbones for classification typically suffer from localization error and poor compute efficiency, and they must be tuned to the size of the target. Our approach to the wide-area search problem is HySARNet, an architecture that combines classic ATR techniques with a ResNet-18 backbone. The detector is dual-stage, consisting of an optimized Constant False Alarm Rate (CFAR) screener and a Bayesian Neural Network (BNN) detector, which provides a significant speed advantage over standard sliding-window approaches. It also reduces false alarms while maintaining a high detection rate, allowing the classifier to run on fewer detections and improving processing speed. This paper tests the BNN and CNN components of HySARNet through experiments that determine their robustness to variations in graze angle, resolution, and additive noise. We also experiment with synthetic targets for training the CNN; synthetic data has the potential to enable training on hard-to-find targets for which little or no measured data exists. SAR simulation software and 3D CAD models are used to generate the synthetic targets. The experiments use the Moving and Stationary Target Acquisition and Recognition (MSTAR) dataset, the widely used standard dataset for SAR ATR publications.
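A minimal sketch of the first stage's idea, assuming a conventional cell-averaging CFAR over a SAR power image; the window sizes and threshold factor are illustrative, and the paper's optimized screener is not specified here.

```python
# Hedged sketch of a cell-averaging CFAR screener: flag pixels whose power
# exceeds a multiple of the local clutter estimate from a training ring.
import numpy as np
from scipy.ndimage import uniform_filter

def ca_cfar(power, guard=2, train=8, scale=5.0):
    """Return a boolean detection mask over a 2D SAR power image."""
    power = np.asarray(power, dtype=float)
    outer = 2 * (guard + train) + 1          # training + guard + cell window
    inner = 2 * guard + 1                    # guard + cell-under-test window
    sum_outer = uniform_filter(power, outer) * outer**2
    sum_inner = uniform_filter(power, inner) * inner**2
    n_train = outer**2 - inner**2            # training cells per window
    clutter = (sum_outer - sum_inner) / n_train
    return power > scale * clutter
```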
KEYWORDS: Data modeling, Sensors, 3D modeling, Object recognition, LIDAR, Detection and tracking algorithms, Video, Signal to noise ratio, Radar, Electro-optical modeling
Automatic object recognition capabilities are traditionally tuned to the specific sensing modality they were designed to exploit. Their successes (and shortcomings) are tied to object segmentation from the background, they typically require highly skilled personnel to train them, and they become cumbersome with the introduction of new objects. In this paper we describe a sensor-independent algorithm based on the biologically inspired technology of map seeking circuits (MSCs) which overcomes many of these obstacles. In particular, the MSC concept offers transparency in object recognition through a common interface to all sensor types, analogous to a USB device. It also provides a common core framework that is independent of the sensor and expandable to support high-dimensionality decision spaces. Ease of training is assured by using commercially available 3D models from the video game community. The search time remains linear no matter how many objects are introduced, ensuring rapid object recognition. Here, we report results of an MSC algorithm applied to object recognition and pose estimation from high range resolution radar (1D), electro-optical imagery (2D), and LIDAR point clouds (3D) separately. By abstracting the sensor phenomenology from the underlying a priori knowledge base, MSC shows promise as an easily adaptable tool for incorporating additional sensor inputs.
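As a toy of the MSC mechanism, the sketch below lets candidate transformation hypotheses (here, 1D shifts of a template) compete, repeatedly reweighting and culling weak hypotheses until one survives. Real map seeking circuits superpose several transformation layers (translation, rotation, scale, etc.); this single-layer toy and its parameters are illustrative assumptions, not the paper's algorithm.

```python
# Hedged single-layer toy of map-seeking-circuit-style competitive culling:
# each candidate shift of the template is a hypothesis with a gain; gains
# are reweighted by match evidence and weak hypotheses are culled.
import numpy as np

def msc_match(signal, template, shifts, iters=20, cull=0.1):
    """Return the shift whose hypothesis survives reweighting and culling."""
    g = np.ones(len(shifts))                          # gain per hypothesis
    T = np.array([np.roll(template, s) for s in shifts])
    for _ in range(iters):
        match = T @ signal                            # evidence per hypothesis
        g *= np.clip(match, 0, None) / (np.abs(match).max() + 1e-12)
        g[g < cull * g.max()] = 0.0                   # cull weak hypotheses
        if np.count_nonzero(g) <= 1:
            break
    return shifts[int(np.argmax(g))]

# Usage: recover the offset between a pulse and a shifted copy of it.
sig = np.zeros(64); sig[20:24] = 1.0
tpl = np.zeros(64); tpl[4:8] = 1.0
print(msc_match(sig, tpl, shifts=list(range(64))))    # -> 16
```

The linear search-time claim corresponds to the gains being updated in a single pass per iteration, rather than exhaustively scoring every object-transform combination.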