A Scene Understanding Challenge Problem was released by AFRL at this conference in 2015 in response to DARPA’s Mathematics, Sensing, Exploitation, and Execution (MSEE) program. We consider a scene understanding system as a generalization of typical sensor exploitation systems where instead of performing a narrowly defined task (e.g., detect, track, classify, etc.), the system can perform general user-defined tasks specified in a query language. That paper [1] laid out the general challenges and methods for developing scene understanding performance models. This is an enormously challenging problem, so now AFRL is illustrating the methods with a baseline system primarily developed by the University of California, Los Angeles (UCLA) during the MSEE program. This system will be publicly available for others to utilize, compare, and contrast with related methods. This paper will further explain and provide insights into the challenges, illustrating them with examples from a publicly available data set. Our intent is that these tools will remove the need to develop an entire system and enable progress by focusing on individual elements of the system. Finally, we will provide details as to how interested researchers may obtain the system and the data.
Shape- and motion-reconstruction is inherently ill-conditioned such that estimates rapidly degrade in the presence of noise, outliers, and missing data. For moving-target radar imaging applications, methods which infer the underlying geometric invariance within back-scattered data are the only known way to recover completely arbitrary target motion. We previously demonstrated algorithms that recover the target motion and shape, even with very high data drop-out (e.g., greater than 75%), which can happen due to self-shadowing, scintillation, and destructive-interference effects. We did this by combining our previous results, that a set of rigid scattering centers forms an elliptical manifold, with new methods to estimate low-rank subspaces via convex optimization routines. This result is especially significant because it will enable us to utilize more data, ultimately improving the stability of the motion-reconstruction process.
Since then, we have developed a feature-based shape- and motion-estimation scheme based on newly developed object-image relations (OIRs) for moving targets collected in bistatic measurement geometries. In addition to generalizing the previous OIR-based radar imaging techniques from monostatic to bistatic geometries, our formulation allows us to image multiple closely-spaced moving targets, each of which is allowed to exhibit missing data due to target self-shadowing as well as extreme outliers (scattering centers that are inconsistent with the assumed physical or geometric models). The new method is based on exploiting the underlying structure of the model equations, that is, far-field radar data matrices can be decomposed into multiple low-rank subspaces while simultaneously locating sparse outliers.
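As a rough illustration of the low-rank-plus-sparse decomposition idea invoked in these two abstracts, the sketch below applies a standard principal-component-pursuit iteration (singular-value thresholding plus soft thresholding) to a synthetic matrix corrupted by gross outliers. The matrix sizes, parameter choices, solver, and data are illustrative assumptions, not the authors' radar implementation.

```python
import numpy as np

def robust_pca(M, lam=None, mu=None, n_iter=200):
    """Principal Component Pursuit via a simple alternating-shrinkage (ALM) loop."""
    m, n = M.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    mu = mu if mu is not None else M.size / (4.0 * np.abs(M).sum())
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)                       # dual variable
    for _ in range(n_iter):
        # singular-value thresholding -> low-rank part
        U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = U @ np.diag(np.maximum(sig - 1.0 / mu, 0.0)) @ Vt
        # soft thresholding -> sparse outliers
        R = M - L + Y / mu
        S = np.sign(R) * np.maximum(np.abs(R) - lam / mu, 0.0)
        Y = Y + mu * (M - L - S)
    return L, S

# Toy example: a rank-3 "scattering-center" matrix corrupted by 10% gross outliers.
rng = np.random.default_rng(0)
L0 = rng.normal(size=(60, 3)) @ rng.normal(size=(3, 40))
S0 = (rng.random((60, 40)) < 0.1) * rng.normal(scale=10.0, size=(60, 40))
L_hat, S_hat = robust_pca(L0 + S0)
print("relative low-rank recovery error:",
      np.linalg.norm(L_hat - L0) / np.linalg.norm(L0))
```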
Understanding and organizing data is the first step toward exploiting sensor phenomenology for dismount tracking.
What image features are good for distinguishing people and what measurements, or combination of measurements,
can be used to classify the dataset by demographics including gender, age, and race? A particular technique,
Diffusion Maps, has demonstrated the potential to extract features that intuitively make sense [1]. We want to
develop an understanding of this tool by validating existing results on the Civilian American and European Surface
Anthropometry Resource (CAESAR) database. This database, provided by the Air Force Research Laboratory
(AFRL) Human Effectiveness Directorate and SAE International, is a rich dataset that includes 40 traditional
anthropometric measurements of 4400 human subjects. If we can identify the defining features for
classification from this database, the next question is to determine a subset of these features that can
be measured from imagery. This paper briefly describes the Diffusion Map technique, shows potential for dimension
reduction of the CAESAR database, and describes interesting problems to be further explored.
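To make the Diffusion Map technique concrete, the following minimal sketch builds the Gaussian-kernel Markov matrix and returns the leading diffusion coordinates. The kernel bandwidth, the synthetic stand-in data, and the two-coordinate embedding are assumptions for illustration; the CAESAR measurements themselves are not reproduced here.

```python
import numpy as np

def diffusion_map(X, eps, n_coords=2, t=1):
    """Return the leading diffusion coordinates of the rows of X."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)   # pairwise squared distances
    K = np.exp(-d2 / eps)                                        # Gaussian kernel
    P = K / K.sum(axis=1, keepdims=True)                         # Markov transition matrix
    vals, vecs = np.linalg.eig(P)
    order = np.argsort(-vals.real)
    vals, vecs = vals.real[order], vecs.real[:, order]
    # skip the trivial constant eigenvector; scale by eigenvalues^t
    return (vals[1:n_coords + 1] ** t) * vecs[:, 1:n_coords + 1]

# Stand-in for anthropometric data: 200 "subjects" x 40 "measurements".
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 40))
eps = np.median(np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1))
coords = diffusion_map(X, eps)
print(coords.shape)        # (200, 2) low-dimensional embedding of the subjects
```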
This paper addresses several fundamental problems that have hindered the development of model-based recognition
systems: (a) The feature-correspondence problem whose complexity grows exponentially with the number
of image points versus model points, (b) The restriction of matching image data points to a point-based model
(e.g., point-based features), and (c) The local versus global minima issue associated with using an optimization
model.
Using a convex hull representation for the surfaces of an object, common in CAD models, allows generalizing
the point-to-point matching problem to a point-to-surface matching problem. A discretization of the Euclidean
transformation variables and use of the well-known assignment model from Linear Programming lead to
a multilinear programming problem. Using a logarithmic/exponential transformation employed in geometric
programming, this nonconvex optimization problem can be transformed into a difference of convex functions
(DC) optimization problem which can be solved using a DC programming algorithm.
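The sketch below illustrates only the discretize-the-transformation / assignment-model idea, shown for ordinary point-to-point matching in 2D; the convex-hull (point-to-surface) generalization and the DC programming algorithm are not reproduced. The rotation grid, centroid alignment, and toy data are assumptions made for the example.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def best_match(model_pts, image_pts, n_angles=72):
    """Grid over rotations; solve an assignment problem at each grid point."""
    best_cost, best_theta, best_perm = np.inf, None, None
    for theta in np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False):
        R = np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])
        transformed = model_pts @ R.T
        # translation chosen to align centroids for this candidate rotation
        transformed = transformed + (image_pts.mean(0) - transformed.mean(0))
        cost = np.linalg.norm(transformed[:, None, :] - image_pts[None, :, :], axis=-1)
        rows, cols = linear_sum_assignment(cost)          # the assignment model
        total = cost[rows, cols].sum()
        if total < best_cost:
            best_cost, best_theta, best_perm = total, theta, cols
    return best_cost, best_theta, best_perm

rng = np.random.default_rng(2)
model = rng.random((8, 2))
theta0 = 0.7
R0 = np.array([[np.cos(theta0), -np.sin(theta0)], [np.sin(theta0), np.cos(theta0)]])
image = (model @ R0.T + 0.3)[rng.permutation(8)]          # rotated, shifted, relabeled
print(best_match(model, image)[:2])                       # small cost, angle near 0.7
```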
The ability to reconstruct the three dimensional (3D) shape of an object from multiple images of that object is
an important step in certain computer vision and object recognition tasks. The images in question can range
from 2D optical images to 1D radar range profiles. In each case, the goal is to use the information (primarily
invariant geometric information) contained in several images to reconstruct the 3D data. In this paper we apply
a blend of geometric, computational, and statistical techniques to reconstruct the 3D geometry, specifically the
shape, from multiple images of an object. In particular, we deal with a collection of feature points that have been
tracked from image (or range profile) to image (or range profile), and we reconstruct the 3D point cloud up to
certain transformations: affine transformations in the case of our optical sensor and rigid motions (translations
and rotations) in the radar case. Our paper discusses the theory behind the method, outlines the computational
algorithm, and illustrates the reconstruction for some simple examples.
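A hedged sketch of the flavor of such a reconstruction: a factorization-style affine recovery of structure from tracked feature points, using synthetic noise-free data. The camera model and rank-3 factorization follow the "up to affine transformations" statement above; the details of the actual algorithm may differ.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(3, 20))                       # true 3D feature points
views = []
for _ in range(6):                                 # six "images" of the tracked points
    A = rng.normal(size=(2, 3))                    # unknown affine camera
    t = rng.normal(size=(2, 1))                    # unknown translation
    views.append(A @ X + t)
W = np.vstack(views)                               # 12 x 20 measurement matrix

W_c = W - W.mean(axis=1, keepdims=True)            # centering removes the translations
U, s, Vt = np.linalg.svd(W_c, full_matrices=False)
# Rank-3 factorization: W_c ~ (stacked cameras)(structure).  The recovered
# structure agrees with X only up to an unknown 3x3 affine transformation.
structure = np.diag(s[:3]) @ Vt[:3]
print("rank of centered measurement matrix:", np.linalg.matrix_rank(W_c))   # 3
print("recovered structure shape:", structure.shape)                        # (3, 20)
```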
This paper describes the automatic target recognition (ATR) challenge problem which
includes source code for a baseline ATR algorithm, display utilities for the results, and a high
range resolution (HRR) data set consisting of 10 civilian vehicles. The Ku-band data in this data
set has been processed into 1-dimensional range profiles of vehicles in the open, moving in a
straight line. It is being released to the ATR community to facilitate the development of new and
improved HRR identification algorithms which can provide greater confidence and very high
identification performance. The intent of the baseline algorithm included with this challenge
problem is to provide an ATR performance comparison to newly developed algorithms. Single-look
identification performance results using the baseline algorithm and the data set are provided
as a starting point for algorithm developers. Both the algorithm and the data set can support single-look
and multi-look target identification.
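The released baseline is not reproduced here, but the following assumed minimal sketch shows how a single-look, template-matching identifier for 1-D HRR range profiles is typically scored; the class names, template counts, and profile length are invented for illustration.

```python
import numpy as np

def classify_profile(profile, templates):
    """templates: dict mapping class name -> (n_templates, n_bins) array of range profiles."""
    profile = profile / np.linalg.norm(profile)                  # amplitude normalization
    best_cls, best_mse = None, np.inf
    for cls, tmpl in templates.items():
        tmpl = tmpl / np.linalg.norm(tmpl, axis=1, keepdims=True)
        mse = np.mean((tmpl - profile) ** 2, axis=1).min()       # best template in the class
        if mse < best_mse:
            best_cls, best_mse = cls, mse
    return best_cls, best_mse

# Invented stand-in data: 10 vehicle classes, 5 templates each, 128 range bins.
rng = np.random.default_rng(4)
templates = {f"vehicle_{k}": rng.random((5, 128)) for k in range(10)}
test_look = templates["vehicle_3"][0] + 0.05 * rng.normal(size=128)
print(classify_profile(test_look, templates)[0])                 # likely "vehicle_3"
```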
KEYWORDS: Automatic target recognition, Sensors, Data modeling, Algorithm development, Performance modeling, Systems modeling, 3D modeling, Detection and tracking algorithms, Systems engineering, Data centers
The purpose of the Automatic Target Recognition (ATR) Center is to develop an environment conducive to producing
theoretical and practical advances in the field of ATR. This will be accomplished by fostering intellectual growth of
ATR practitioners at all levels. From an initial focus on students and performance modeling, the Center's efforts are
extending to professionals in government, academia, and industry. The ATR Center will advance the state of the art in
ATR through collaboration between these researchers.
To monitor how well the Center is achieving its goals, several tangible products have been identified: graduate student
research, publicly available data and associated challenge problems, a wiki to capture the body of knowledge associated
with ATR, development of stronger relationships with the users of ATR technology, development of a curriculum for
ATR system development, and maintenance of documents that describe the state-of-the-art in ATR.
This presentation and accompanying paper develop the motivation for the ATR Center, provide detail on the Center's
products, describe the Center's business model, and highlight several new data sets and challenge problems. The
"persistent and layered sensing" context and other technical themes in which this research is couched are also presented.
Finally, and most importantly, we will discuss how industry, academia, and government can participate in this alliance
and invite comments on the plans for the third phase of the Center.
An object-image metric is an extension of standard metrics in that it is constructed for matching and comparing
configurations of object features to configurations of image features. For the generalized weak perspective camera,
it is invariant to any affine transformation of the object or the image. Recent research in the exploitation of the
object-image metric suggests new approaches to Automatic Target Recognition (ATR). This paper explores the
object-image metric and its limitations. Through a series of experiments, we specifically seek to understand how
the object-image metric could be applied to the image registration problem, an enabling technology for ATR.
Understanding and organizing data, in particular understanding the key modes of variation in the data, is a first
step toward exploiting and evaluating sensor phenomenology. Spectral theory and manifold learning methods
have recently been shown to offer several powerful tools for many parts of the exploitation problem. We will
describe the method of diffusion maps and give some examples with radar (backhoe data dome) data. The so-called
diffusion coordinates are produced by a kernel-based dimensionality-reduction technique that can, for example, organize
random data and yield explicit insight into the type and relative importance of the data variation. We will
provide sufficient background for others to adopt these tools and apply them to other aspects of exploitation and
evaluation.
KEYWORDS: Time-frequency analysis, Radar, Doppler effect, Transform theory, Fourier transforms, Data modeling, Scattering, Sensors, Iterated function systems, X band
This paper describes work that considered two Joint Time-Frequency
Transforms (JTFTs) for use in a SAR-based (single sensor/platform
Synthetic Aperture Radar) 3D imaging approach. The role of the
JTFT is to distinguish moving point scatterers that may become
collocated during the observation interval. A Frequency Domain
Velocity Filter Bank (FDVFB) was compared against the well-known
Short Time Fourier Transform (STFT) in terms of their maximal
Time-Frequency energy concentrations. The FDVFB and STFT energy
concentrations were compared for a variety of radar scenarios. In
all cases the STFT achieved slightly higher energy concentrations
while simultaneously requiring half the computations needed by the
FDVFB.
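As a simple illustration of measuring time-frequency energy concentration with the STFT (the FDVFB is not reproduced), the sketch below evaluates a standard concentration measure on two crossing chirps for several window lengths. The signal parameters and the particular concentration measure are assumptions made for the example.

```python
import numpy as np
from scipy.signal import stft

fs = 1000.0
t = np.arange(0.0, 1.0, 1.0 / fs)
# two crossing chirps: scatterers whose Doppler histories intersect
x = np.cos(2 * np.pi * (100 * t + 150 * t ** 2)) + np.cos(2 * np.pi * (400 * t - 150 * t ** 2))

for nperseg in (64, 128, 256):
    _, _, S = stft(x, fs=fs, nperseg=nperseg)
    E = np.abs(S) ** 2
    concentration = (E ** 2).sum() / (E.sum() ** 2)   # larger -> more concentrated
    print(f"window {nperseg:3d}: concentration = {concentration:.3e}")
```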
This paper describes the development of an algorithm for detecting
multiple-scattering events in the 3D Geometric Theory of
Diffraction (GTD)-based Jackson-Moses scattering model. This
approach combines microlocal analysis techniques with
geometric-invariant theory to estimate multiple-scattering events.
After multiple-scattering returns were estimated, the algorithm
employed the Generalized Radon Transform to determine the
existence of multiple scattering within the measured data. The
algorithm was tested on an X-band simulation of isotropic point
scatterers undergoing unknown rotational motion.
In this paper we survey some of the mathematical techniques that have led to useful new
results in shape analysis and their application to a variety of object recognition tasks. In particular,
we will show how these techniques allow one to solve a number of fundamental problems
related to object recognition for configurations of point features under a generalized weak perspective
model of image formation. Our approach makes use of progress in shape theory and
includes the development of object-image equations for shape matching and the exploitation
of shape space metrics (especially object-image metrics) to measure matching up to certain
transformations. This theory is built on advanced mathematical techniques from algebraic and
differential geometry which are used to construct generalized shape spaces for various projection
and sensor models. That construction in turn is used to find natural metrics that express the
distance (geometric difference) between two configurations of object features, two configurations
of image features, or an object and an image pair. Such metrics are believed to produce the
most robust tests for object identification, at least as far as the object's geometry is concerned.
Moreover, these metrics provide a basis for efficient hashing schemes to do identification quickly,
and they provide a rigorous foundation for error and statistical analysis in any recognition system.
The most important feature of a shape theoretic approach is that all of the matching tests
and metrics are independent of the choice of coordinates used to express the feature locations
on the object or in the image. In addition, the approach is independent of the camera/sensor
position and any camera/sensor parameters. Finally, the method is also independent of object
pose or image orientation. This is what makes the results so powerful.
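The following simplified sketch conveys the coordinate-independence claim for the affine (generalized weak perspective) case: comparing the column spaces of centered feature configurations yields a match score that is unchanged by any affine transformation of the object or image coordinates. It is a stand-in for, not the authors' construction of, the object-image metrics discussed above.

```python
import numpy as np

def affine_shape_projector(pts):
    """pts: (n_points, dim). Projector onto the column space of the centered configuration."""
    X = pts - pts.mean(axis=0)
    Q, _ = np.linalg.qr(X)
    return Q @ Q.T

rng = np.random.default_rng(5)
obj = rng.normal(size=(10, 3))                 # 3D object feature configuration
A = rng.normal(size=(2, 3))                    # arbitrary affine camera
img = obj @ A.T + rng.normal(size=(1, 2))      # 2D image features of the same object
P_obj = affine_shape_projector(obj)
P_img = affine_shape_projector(img)
# For a true object/image match the image shape subspace lies inside the
# object shape subspace, independent of the coordinates or camera chosen.
print(np.linalg.norm(P_obj @ P_img - P_img))   # ~0 indicates a geometric match
```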
KEYWORDS: Scattering, 3D modeling, Radar, Optical spheres, 3D acquisition, Data modeling, Detection and tracking algorithms, Fourier transforms, Solid modeling, Sensors
This paper details a model-building technique to construct geometric target models from RADAR data collected in a controlled environment. An algorithm to construct three-dimensional target models from a complex RADAR return expressed as discrete sets of scattering center coordinates with associated amplitudes is explained in detail. The model is a three-dimensional extension of proven RADAR scattering models that treat the RADAR return as a sum of complex exponentials. A Fourier Transform converts this to impulses in the frequency domain where the relative phase difference between scattering centers is a wrapped phase term. If the viewing sphere is sampled densely enough, the phase is unambiguously unwrapped. The minimum sampling interval is explicitly determined as a function of the extent of the target in wavelengths. A least-squares solution determines the coordinates of each scattering center. Properties of the collection geometry allow the minimum sampling density of the viewing sphere to be increased, but at the cost of testing competing hypotheses to determine which one best fits the phase data. The complex RADAR return of a random object is created by sampling a one-degree slice of the viewing sphere to validate the model-building algorithm. All coordinates of the random object are extracted perfectly. It is hoped that this algorithm can build three-dimensional scattering center models valid over the entire viewing sphere with each target represented as a discrete set of scattering centers. A rectangular window function associated with each scattering center would model persistence across the viewing sphere.
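A small illustrative simulation of the sum-of-complex-exponentials scattering model mentioned above: a frequency sweep over a few point scatterers is Fourier transformed into impulses whose locations give the down-range coordinates. The band, sweep, and scatterer positions are assumptions; the full 3-D phase-unwrapping and least-squares model builder is not reproduced.

```python
import numpy as np

c = 3.0e8
freqs = np.linspace(9.0e9, 10.0e9, 512)            # assumed X-band frequency sweep
ranges = np.array([1.2, 3.7, 5.1])                 # assumed scattering-center ranges (m)
amps = np.array([1.0, 0.6, 0.8])

# far-field return modeled as a sum of complex exponentials across the sweep
s = (amps * np.exp(-1j * 4 * np.pi * freqs[:, None] * ranges / c)).sum(axis=1)

profile = np.abs(np.fft.ifft(s))                   # Fourier transform -> impulses
df = freqs[1] - freqs[0]
range_axis = np.arange(len(freqs)) * c / (2.0 * len(freqs) * df)
peaks = np.sort(range_axis[np.argsort(profile)[-3:]])
print(peaks)                                       # approximately [1.2, 3.7, 5.1]
```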
Procrustes Analysis (least-squares mapping) is typically used as a method of comparing the shape of two objects. This method relies on matching corresponding points (landmarks) from the data associated with each object. Typically, landmarks are physically meaningful locations (e.g., the end of a nose) whose relationship to the whole object is known. Corresponding landmarks would be the same physical location on the two different individuals, and therefore Procrustes analysis is a reasonable method of measuring relative shape. However, in the application of automatic target recognition, the correspondence of landmarks is unknown. In other words, the description of the shape of an object is dependent upon the labeling of landmarks, an undesirable characteristic. In an attempt to circumvent the labeling problem (without exhaustively computing the factorial number of correspondences), this paper presents a label-invariant method of shape analysis. The label-invariant method presented in this paper uses measurements which are related to the measurements used in Procrustes Analysis. The label-invariant approach to shape measurement yields near-optimal results. A relation exists between Procrustes Analysis and the label-invariant measurements; however, the relationship is not one-to-one. The goal is to further understand the implications of the nearly optimal results, and to refine these intermediate results to form a measure of shape that is efficient and one-to-one with the Procrustes metric.
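The labeling problem can be seen directly in a minimal Procrustes-distance computation, sketched below with assumed toy landmarks: the distance is near zero for a correctly labeled copy of a shape but changes when the landmark labels of one configuration are permuted. This is generic Procrustes analysis, not the paper's label-invariant measurements.

```python
import numpy as np

def procrustes_distance(X, Y):
    """Full Procrustes distance between labeled landmark sets (translation, rotation, scale removed)."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Xc = Xc / np.linalg.norm(Xc)                    # unit-size preshapes
    Yc = Yc / np.linalg.norm(Yc)
    s = np.linalg.svd(Xc.T @ Yc, compute_uv=False)  # optimal rotation via SVD
    return np.sqrt(max(0.0, 1.0 - s.sum() ** 2))

rng = np.random.default_rng(6)
X = rng.normal(size=(6, 2))                         # six labeled landmarks
R = np.array([[0.0, -1.0], [1.0, 0.0]])             # 90-degree rotation
Y = 2.0 * X @ R.T + 5.0                             # same shape: rotated, scaled, shifted
print(procrustes_distance(X, Y))                    # ~0 with correct labels
print(procrustes_distance(X, Y[rng.permutation(6)]))  # generally > 0: labels matter
```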
Equating objects based on shape similarity (for example, under scaled Euclidean transformations) is often desirable to solve the Automatic Target Recognition (ATR) problem. The Procrustes distance is a metric that captures the shape of an object independent of the following transformations: translation, rotation, and scale. The Procrustes metric assumes that all objects can be represented by a set of landmarks (i.e., points), that they have the same number of points, and that the points are ordered (i.e., the exact correspondence between the points is known from one object to the next). Although this correspondence is not known for many ATR problems, computationally feasible methods for examining all possible combinations are being explored. Additionally, most objects can be mapped to a shape space where translation, rotation, and scaling are removed, and distances between object points in this space can then form another useful metric. To establish a decision boundary in any classification problem, it is essential to know the a priori probabilities in the appropriate space. This paper analyzes basic objects (triangles) in two-dimensional space to assess how a known distribution in Euclidean space maps to the shape space. Any triangle whose three coordinate points are uniformly distributed within a two-dimensional box transforms to a bivariate independent normal distribution with mean (0,0) and standard deviations of 2 in Kendall shape space (two points of the triangle are mapped to {-1/2,0} and {1/2,0}). The Central Limit Theorem proves that the limit of sums of finite-variance distributions approaches the normal distribution. This is a reasonable model of the relationship between the three Euclidean coordinates relative to the single Kendall shape space coordinate. This paper establishes the relationship between different objects in the shape space and the Procrustes distance, which is an established shape metric, between these objects. Ignoring reflections (a special case), the Procrustes distance is isometric to the shape space coordinates. This result demonstrates that both Kendall coordinates and Procrustes distance are useful features for ATR.
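The triangle-to-shape-space mapping described above can be simulated directly. The sketch below maps uniformly generated triangles, viewed as complex numbers, through the coordinate that pins the first two vertices at (-1/2, 0) and (1/2, 0), so the resulting empirical distribution can be compared against the Gaussian model in the text. The sample size and unit box are assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100_000
# triangles with vertices uniform in the unit box, encoded as complex numbers
z = rng.uniform(0.0, 1.0, size=(n, 3)) + 1j * rng.uniform(0.0, 1.0, size=(n, 3))
# shape coordinate that sends the first two vertices to (-1/2, 0) and (1/2, 0)
w = (z[:, 2] - 0.5 * (z[:, 0] + z[:, 1])) / (z[:, 1] - z[:, 0])

# summary statistics of the mapped coordinate, for comparison with the
# Gaussian model of the shape-space distribution described in the abstract
print("median:", np.median(w.real), np.median(w.imag))
print("quartiles (real):", np.percentile(w.real, [25, 75]))
print("quartiles (imag):", np.percentile(w.imag, [25, 75]))
```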
Object-image relations (O-IRs) provide a powerful approach to performing detection and recognition with laser radar (LADAR) sensors. This paper presents the basics of O-I relations and shows how they are derived from invariants. It also explains and shows results of a computationally efficient approach applying covariants to 3-D LADAR data. The approach is especially appealing because the detection and segmentation processes are integrated with recognition into a robust algorithm. Finally, the method provides a straightforward approach to handling articulation and multi-scale decomposition.
Synthetic Aperture Radar (SAR) sensors are being developed with better resolution to improve target identification, but this improvement has a significant cost. Furthermore, higher resolution corresponds to more pixels per image and, consequently, more data to process. Here, the effect of resolution on a many-class target identification problem is determined using high resolution SAR data with artificially reduced resolution, a Mean-Squared Error (MSE) criterion, and template matching. It is found that each increase in resolution by a factor of two increases the average MSE between a target and possible confusers by five to ten percent. Interpolating SAR images in the spatial domain to obtain artificially higher resolution images results in an average MSE that is actually much worse than that of the original SAR images. Increasing resolution significantly improves target identification performance, while interpolating low-resolution images degrades target identification performance.
KEYWORDS: Scattering, 3D modeling, Radar, 3D acquisition, 3D image processing, Automatic target recognition, 3D image reconstruction, Reflectors, Sensors, Databases
Automatic Target Recognition (ATR) is difficult in general, but especially with RADAR. However, the problem can be greatly simplified by using the 3-D reconstruction techniques presented at SPIE [Stuff] in the previous two years. Now, instead of matching seemingly random signals in 1-D or 2-D, one must match scattering centers in 3-D. This method tracks scattering centers through an image collection sequence that would typically be used for SAR image formation. A major difference is that this approach naturally allows object motion (in fact, the more the object moves, the better), and the resulting 'image' is a 3-D set of scattering centers. Scattering centers are extracted directly from synthetic data to build a database in anticipation of comparing the relative separability of these reconstructed scattering centers against more traditional approaches for doing ATR.
KEYWORDS: 3D modeling, Radar, Sensors, Image sensors, Scattering, 3D image processing, Motion models, Data modeling, Systems modeling, Synthetic aperture radar
Recent research in invariant theory has determined the fundamental geometric relation between objects and their corresponding 'images.' This relation is independent of the sensor (e.g., RADAR) parameters and the transformations of the object. This relationship can be used to extract 3-D models from image sequences. This capability is extremely useful for target recognition, image sequence compression, understanding, indexing, interpolating, and other applications. Object/image relations have been discovered for different sensors by different researchers. This paper presents an intuitive form of the object/image relations for RADAR systems with the goal of enhancing interpretation. This paper presents a high-level example of how a 3-D model is constructed directly from RADAR (or SAR) sequences (with or without independent motion). The primary focus is to provide a basic understanding of how this result can be exploited to advance research in many applications.
This paper presents a linear system approximation for automated analysis of passive, long-wave infrared (LWIR) imagery. The approach is based on the premise that for a time-varying ambient temperature field, the ratio of object surface temperature to ambient temperature is independent of amplitude and is a function only of frequency. Thus, for any given material, it is possible to compute a complex transfer function in the frequency domain with real and imaginary parts that are indicative of the material type. Transfer functions for a finite set of ordered points on a hypothesized object create an invariant set for that object. This set of variates is then concatenated with another set of variates (obtained either from the same object or a different object) to form two random complex vectors. Statistical tests of affine independence between the two random vectors are facilitated by decomposing the generalized correlation matrix into canonical form and testing the hypothesis that the sample canonical correlations are all zero for a fixed probability of false alarm (PFA). In the case of joint Gaussian distributions, the statistical test is a maximum likelihood test. Results are presented using real images.
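A hedged, real-valued sketch of the statistical machinery described above: canonical correlations between two sets of variates, followed by a standard large-sample (Bartlett) chi-square test that all correlations are zero. The synthetic data, dimensions, and test form are assumptions; the complex LWIR transfer-function variates of the paper are not reproduced.

```python
import numpy as np
from scipy import stats

def canonical_correlations(X, Y):
    """X: (n, p), Y: (n, q), both mean-removed. Returns the sample canonical correlations."""
    n = X.shape[0]
    Sxx, Syy, Sxy = X.T @ X / n, Y.T @ Y / n, X.T @ Y / n
    Wx = np.linalg.inv(np.linalg.cholesky(Sxx))     # whitening transforms
    Wy = np.linalg.inv(np.linalg.cholesky(Syy))
    return np.linalg.svd(Wx @ Sxy @ Wy.T, compute_uv=False)

rng = np.random.default_rng(8)
n, p, q = 500, 4, 3
X = rng.normal(size=(n, p))
Y = rng.normal(size=(n, q))                         # independent of X in this toy case
X = X - X.mean(axis=0)
Y = Y - Y.mean(axis=0)
r = canonical_correlations(X, Y)

# Bartlett's large-sample statistic for "all canonical correlations are zero",
# approximately chi-square with p*q degrees of freedom under independence
stat = -(n - 1 - (p + q + 1) / 2.0) * np.sum(np.log(1.0 - r ** 2))
p_value = 1.0 - stats.chi2.cdf(stat, df=p * q)
print(r, p_value)          # compare the p-value to the desired probability of false alarm
```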
Research on the formulation of invariant features for model-based object recognition has mostly been concerned with geometric constructs either of the object or in the imaging process. We describe a new method that identifies invariant features computed from long wave infrared (LWIR) imagery. These features are called thermophysical invariants and depend primarily on the material composition of the object. Features are defined that are functions of only the thermophysical properties of the imaged materials. A physics-based model is derived from the principle of conservation of energy applied at the surface of the imaged regions. A linear form of the model is used to derive features that remain constant despite changes in scene parameters/driving conditions. Simulated and real imagery, as well as ground truth thermocouple measurements, were used to test the behavior of such features. A method of change detection in outdoor scenes is investigated. The invariants are used to detect when a hypothesized material no longer exists at a given location. For example, one can detect when a patch of clay/gravel has been replaced with concrete at a given site. This formulation yields promising results, but it can produce large values outside a normally small range. Therefore, we adopt a new feature classification algorithm based on the theory of symmetric alpha-stable (SαS) distributions. We show that symmetric alpha-stable distributions model the thermophysical invariant data much better than the Gaussian model and suggest a classifier with superior performance.
Research on the formulation of invariant features for model-based object recognition has mostly been concerned with geometric constructs either of the object or in the imaging process. We describe a new method that identifies invariant features computed from long wave infrared imagery. These features are called thermophysical invariants and depend primarily on the material composition of the object. We use this approach for identifying objects or changes in scenes viewed by downward-looking infrared imagers. Features are defined that are functions of only the thermophysical properties of the imaged materials. A physics-based model is derived from the principle of conservation of energy applied at the surface of the imaged regions. A linear form of the model is used to derive features that remain constant despite changes in scene parameters/driving conditions. Simulated and real imagery, as well as ground truth thermocouple measurements, were used to test the behavior of such features. A method of change detection in outdoor scenes is investigated. The invariants are used to detect when a hypothesized material no longer exists at a given location. For example, one can detect when a patch of clay/gravel has been replaced with concrete at a given site.