Commercial deep learning capabilities are available for many applications, such as computer vision processing and intelligent chatbots. The Google Cloud Platform product Google Dialogflow provides lifelike conversational artificial intelligence (AI), using machine learning (ML) to generate natural conversations between computers and humans. This ML uses natural language understanding (NLU) to recognize a user's intent and extract key information in the form of entities. We developed a user-friendly application by studying the hazardous material database and first aid safety guidelines and by observing how first responders access this information in the field. We created the Trusted and Explainable Artificial Intelligence for Saving Lives (TruePAL) virtual assistant using Dialogflow and TensorFlow paired with EasyOCR. The chatbot supports first responders through voice interaction, which eliminates extra steps such as browsing through multiple categories when searching for information. Guided by feedback from our field interviews, the voice interface was developed to let the first responder stay focused on the immediate emergency. With fewer distractions, the first responder can engage the incident more effectively. The partially hands-free TruePAL chatbot assistant reaches the correct guidance an average of 1.9 seconds faster than the widely used NIH WISER application, which requires full attention to operate. We combined this intelligent chatbot with a separate visual processing capability that analyzes hazardous signage and generates the proper guidance for first responders. As AI tools continue to evolve, virtual assistants will become a significant advancement in first responder technology, benefiting the safety of both first responders and civilians.
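The abstract does not give implementation details; as an illustration only, the sketch below shows how a Dialogflow intent query and EasyOCR signage reading might be wired together in Python. The project ID, session ID, image path, and query wording are placeholders, not values from the paper.

# Minimal sketch, assuming placeholder GCP project/session IDs and image path.
from google.cloud import dialogflow
import easyocr

def read_signage(image_path):
    # OCR the hazardous-material placard text from a photo.
    reader = easyocr.Reader(["en"])
    return " ".join(text for _, text, _ in reader.readtext(image_path))

def ask_truepal(project_id, session_id, utterance):
    # Send the responder's spoken/typed query to a Dialogflow agent.
    client = dialogflow.SessionsClient()
    session = client.session_path(project_id, session_id)
    query_input = dialogflow.QueryInput(
        text=dialogflow.TextInput(text=utterance, language_code="en-US"))
    response = client.detect_intent(
        request={"session": session, "query_input": query_input})
    return response.query_result.fulfillment_text

if __name__ == "__main__":
    placard = read_signage("placard.jpg")                     # hypothetical image
    print(ask_truepal("my-gcp-project", "responder-1",
                      f"Guidance for {placard}"))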
This paper presents the development of an AI assistant, Trusted and Explainable Artificial Intelligence for Saving Lives (TruePAL), to provide first responders with real-time warning of potential crashes. The TruePAL system employs AI and deep learning technology to protect the lives of first responders and roadside crews in and around active traffic. A deep neural network (DNN) and a Non-Axiomatic Reasoning System (NARS) are implemented as the AI system. A mobile app with an AI interface has been developed for verbal communication with the first responders. The TruePAL team has developed an explainable AI approach by opening up the DNN black box to extract the activation filters of various features and parts of the targeted objects. The combination of the DNN and NARS makes the TruePAL system explainable to its users. TruePAL ingests signals from on-board cameras, radar, and other sensors, and analyzes the environment and traffic patterns to generate timely warnings that help drivers and roadside crews avoid crashes. In collaboration with the Miami-Dade Police Department, the TruePAL team has designed five use cases and multiple subscenarios in the CARLA driving simulator to test TruePAL's ability to warn first responder drivers of potential crashes in a timely manner. We have successfully demonstrated this timely-warning capability in over a dozen scenarios based on the use cases. The preliminary simulation results show that TruePAL can provide drivers and crew members advance warning before a crash occurs.
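The abstract mentions opening the DNN black box to extract activation filters; as a hypothetical sketch only (the MobileNetV2 backbone, random weights, layer selection, and input size are assumptions, not the paper's actual detector), intermediate activation maps can be probed from a Keras convolutional model like this:

# Sketch: pull intermediate feature maps from a convolutional backbone so the
# activations driving a detection can be inspected and explained.
import numpy as np
import tensorflow as tf

base = tf.keras.applications.MobileNetV2(weights=None, include_top=False)
conv_layers = [l for l in base.layers
               if isinstance(l, tf.keras.layers.Conv2D)][:4]
probe = tf.keras.Model(inputs=base.input,
                       outputs=[l.output for l in conv_layers])

frame = np.random.rand(1, 224, 224, 3).astype("float32")  # stand-in camera frame
activations = probe(frame)
for layer, act in zip(conv_layers, activations):
    print(layer.name, act.shape)  # per-layer activation maps for inspection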
Deep learning alone has achieved state-of-the-art results in many areas, from complex gameplay to predicting protein structures. In image classification and recognition in particular, deep learning models have achieved much higher accuracy than humans. However, deep learning models can be very difficult to debug when they fail, and they are vulnerable and highly sensitive to changes in data distribution. Here, we combine deep learning-based object recognition and tracking with an adaptive neurosymbolic agent, the non-axiomatic reasoning system, which can adapt to its environment by building concepts from perceptual sequences. We achieved an improved intersection-over-union (IoU) object recognition performance of 0.65 with the adaptively retrained model, compared to an IoU of 0.31 with the model pre-trained on COCO data. We also improved the object detection limits using radar sensors in a simulated environment.
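For reference, the IoU metric quoted above compares a predicted box with a ground-truth box; a minimal sketch, assuming boxes in [x1, y1, x2, y2] corner form:

def iou(box_a, box_b):
    # Boxes as [x1, y1, x2, y2]; returns intersection-over-union in [0, 1].
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

print(iou([0, 0, 10, 10], [5, 5, 15, 15]))  # ~0.143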
Flashover is a dangerous phenomenon caused by the near-simultaneous ignition of exposed materials and is one of the major causes of firefighter fatalities. Research has been conducted using CMOS vision cameras combined with thermal sensors to perform remote detection and dynamic classification of fire and smoke patterns. Tests and experiments on remote fire and smoke detection show that the inexpensive visible and infrared sensors corroborate and closely follow the detailed trends recorded by the more expensive (and less mobile) radiometers and thermocouples. Deep neural networks (DNNs) have been used to detect, classify, and track fire and smoke areas, and real-time segmentation is used to measure fire and smoke boundaries. The segmentations dynamically monitor fluctuations in temperature, fire size, and smoke progression in the monitored areas, and a fire and smoke progression curve is drawn to predict the flashover point. In this paper, data analysis and preliminary results are presented.
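The progression curve described above could, in principle, be tracked from per-frame segmentation output; the sketch below is illustrative only, and the mask source, frame interval, and quadratic trend fit are assumptions rather than the paper's method.

# Sketch: track fire area from binary segmentation masks and fit a simple
# growth trend to extrapolate toward a flashover threshold.
import numpy as np

def fire_area_curve(masks, frame_interval_s=1.0):
    # masks: list of 2-D binary arrays (1 = fire pixel), one per frame.
    times = np.arange(len(masks)) * frame_interval_s
    areas = np.array([m.sum() for m in masks], dtype=float)
    trend = np.poly1d(np.polyfit(times, areas, deg=2))  # quadratic trend
    return times, areas, trend

masks = [np.random.rand(120, 160) > 0.8 for _ in range(30)]  # stand-in masks
t, a, trend = fire_area_curve(masks)
print(trend(t[-1] + 5.0))  # projected fire area 5 s ahead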
Paramedics often have to make lifesaving decisions within a limited time in an ambulance. They sometimes ask the doctor for additional medical instructions, during which valuable time passes for the patient. This study aims to automatically fuse voice and text data to provide tailored situational awareness information to paramedics. To train and test speech recognition models, we built a bidirectional deep recurrent neural network (long short-term memory, LSTM). We then used convolutional neural networks on top of custom-trained word vectors for sentence-level classification. Each sentence is automatically categorized into one of four classes: patient status, medical history, treatment plan, and medication reminder. Incident reports are then generated automatically to extract keywords and assist paramedics and physicians in making decisions. We found that the proposed system can provide timely medication notifications based on unstructured voice and text data, which is not currently possible in paramedic emergencies. In addition, the automatic incident report generation improves the routine but error-prone tasks of paramedics and doctors, helping them focus on patient care.
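A minimal Keras sketch of convolutional sentence-level classification over word vectors into the four classes above; the vocabulary size, embedding dimension, sequence length, and filter settings are assumptions, not the paper's configuration.

import tensorflow as tf

VOCAB, EMBED_DIM, MAX_LEN, NUM_CLASSES = 20000, 100, 50, 4

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(MAX_LEN,), dtype="int32"),   # token IDs
    tf.keras.layers.Embedding(VOCAB, EMBED_DIM),               # word vectors
    tf.keras.layers.Conv1D(128, 5, activation="relu"),         # n-gram filters
    tf.keras.layers.GlobalMaxPooling1D(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),  # 4 classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()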
Acquiring large amounts of data for training and testing Deep Learning (DL) models is time-consuming and costly. This work presents the development of a process to generate synthetic objects and scenes using 3D graphics software. By programming the path and environment in a 3D graphical engine, complex objects and scenes can be generated for training and testing a Deep Neural Network (DNN) model on specific vision tasks. An automatic process has been developed to label and segment objects in the synthetic images and generate the corresponding ground truth files. DNNs trained with synthetic data have been shown to outperform DNNs trained with real data.
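As an illustration of automatic labeling from rendered output (the per-pixel instance-ID mask format is an assumption, not necessarily the engine's actual export), ground-truth bounding boxes can be derived from a synthetic segmentation mask like this:

import numpy as np

def boxes_from_mask(mask):
    # mask: 2-D array of per-pixel instance IDs rendered by the 3D engine
    # (0 = background). Returns {instance_id: [x1, y1, x2, y2]}.
    boxes = {}
    for inst_id in np.unique(mask):
        if inst_id == 0:
            continue
        ys, xs = np.nonzero(mask == inst_id)
        boxes[int(inst_id)] = [int(xs.min()), int(ys.min()),
                               int(xs.max()), int(ys.max())]
    return boxes

demo = np.zeros((100, 100), dtype=int)
demo[20:40, 30:60] = 7                 # stand-in object with instance ID 7
print(boxes_from_mask(demo))           # {7: [30, 20, 59, 39]}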
Finding the target as soon as possible is critical in search and rescue operations. Surveillance camera systems and unmanned aerial vehicles (UAVs) are used to support search and rescue, and automatic object detection is important because a person cannot monitor multiple surveillance screens simultaneously around the clock; the target is also often too small to be recognized by the human eye on a surveillance screen. This study used UAVs around the Port of Houston together with fixed surveillance cameras to build an automatic target detection system that supports the US Coast Guard (USCG) in finding targets such as a person overboard. We combined image segmentation, enhancement, and convolutional neural networks to reduce the time needed to detect small targets. Comparing the autodetection system with the human eye, our system detected the target within 8 seconds, whereas the human eye detected it within 25 seconds. Our system also used synthetic data generation and data augmentation techniques to improve target detection accuracy. This solution may help first responders carry out search and rescue operations in a timely manner.
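One common way to keep small maritime targets detectable is to enhance contrast and tile the high-resolution frame before running the CNN; the sketch below is a generic illustration of that idea (the CLAHE enhancement, tile size, and overlap are assumptions, not the paper's pipeline).

import cv2
import numpy as np

def enhance_and_tile(frame, tile=512, overlap=64):
    # Contrast-enhance the frame (CLAHE on the luminance channel), then split
    # it into overlapping tiles so a small target stays large enough for the CNN.
    lab = cv2.cvtColor(frame, cv2.COLOR_BGR2LAB)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    lab[:, :, 0] = clahe.apply(lab[:, :, 0])
    enhanced = cv2.cvtColor(lab, cv2.COLOR_LAB2BGR)
    h, w = enhanced.shape[:2]
    step = tile - overlap
    tiles = []
    for y in range(0, max(h - overlap, 1), step):
        for x in range(0, max(w - overlap, 1), step):
            tiles.append(((x, y), enhanced[y:y + tile, x:x + tile]))
    return tiles  # detector runs per tile; offsets map boxes back to the frame

frame = np.zeros((1080, 1920, 3), dtype=np.uint8)  # stand-in surveillance frame
print(len(enhance_and_tile(frame)))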
Infrared (IR) images are essential for improving the visibility of dark or camouflaged objects. Neural network-based object recognition and segmentation using IR images provide more accuracy and insight than visible color images. The bottleneck, however, is the amount of relevant IR imagery available for training: it is difficult to collect real-world IR images for special purposes such as space exploration, military, and firefighting applications. To address this problem, we created visible color and IR images using a Unity-based 3D game editor. These synthetically generated visible and IR images were used to train cycle-consistent adversarial networks (CycleGAN) to convert visible images to IR images. CycleGAN has the advantage that it does not require precisely matched visible and IR pairs for transformation training. In this study, we found that additional synthetic data can help improve CycleGAN performance. Training using real data (N = 20) produced more accurate transformations than training using a combination of real (N = 10) and synthetic (N = 10) data, indicating that synthetic data cannot exceed the quality of real data. Training using a combination of real (N = 10) and synthetic (N = 100) data, however, performed almost the same as training using real data (N = 20); at least 10 times more synthetic data than real data is required to achieve the same performance. In summary, CycleGAN can be used with synthetic data to improve visible-to-IR image conversion performance.
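The property that CycleGAN needs no matched visible/IR pairs comes from its cycle-consistency term; a minimal TensorFlow sketch of that term follows, with identity functions standing in for the two generators (the loss weight and image size are assumptions).

import tensorflow as tf

def cycle_consistency_loss(g_vis2ir, g_ir2vis, visible, infrared, lam=10.0):
    # An image translated to the other domain and back should reproduce the
    # original, so unpaired visible and IR sets are enough for training.
    cycled_vis = g_ir2vis(g_vis2ir(visible))
    cycled_ir = g_vis2ir(g_ir2vis(infrared))
    loss = tf.reduce_mean(tf.abs(visible - cycled_vis)) + \
           tf.reduce_mean(tf.abs(infrared - cycled_ir))
    return lam * loss

identity = tf.keras.layers.Lambda(lambda x: x)   # stand-in generators
batch = tf.random.uniform((2, 256, 256, 3))
print(cycle_consistency_loss(identity, identity, batch, batch))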
KEYWORDS: Speech recognition, Data modeling, Neural networks, Data centers, Data communications, Detection and tracking algorithms, Modulation, Speaker recognition, Visual process modeling, Machine vision
Transcribing voice communications in NASA's launch control center is important for information utilization. However, automatic speech recognition in this environment is particularly challenging due to the lack of training data, unfamiliar words and acronyms, multiple speakers and accents, and the conversational nature of the speech. We used bidirectional deep recurrent neural networks to train and test speech recognition performance, and showed that data augmentation and custom language models can improve speech recognition accuracy. Transcribing communications from the launch control center will help machines analyze the information and accelerate knowledge generation.
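Waveform-level augmentation is one common way to stretch scarce speech data; the sketch below is a generic illustration (noise level and speed factor are assumptions, and the paper's actual augmentation scheme may differ).

import numpy as np

def augment_waveform(wave, noise_db=-30.0, speed=1.1, rng=np.random):
    # Two simple augmentations: additive noise at a level below the signal
    # power, then speed perturbation by resampling the waveform.
    signal_power = np.mean(wave ** 2) + 1e-12
    noise_power = signal_power * 10 ** (noise_db / 10)
    noisy = wave + rng.normal(0.0, np.sqrt(noise_power), size=wave.shape)
    idx = np.arange(0, len(noisy), speed)          # resampling index grid
    return np.interp(idx, np.arange(len(noisy)), noisy)

wave = np.sin(2 * np.pi * 440 * np.linspace(0, 1, 16000))  # stand-in audio
print(augment_waveform(wave).shape)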
Firefighters face a variety of life-threatening risks, including line-of-duty deaths, injuries, and exposure to hazardous substances, so support for reducing these risks is important. We built a partially occluded object reconstruction method for augmented reality glasses worn by first responders. We used a deep learning approach based on conditional generative adversarial networks to train associations between various images of flammable and hazardous objects and their partially occluded counterparts. Our system then reconstructs an image of a new flammable object, and the reconstructed image is superimposed on the input image to provide "transparency". The system imitates how humans learn the laws of physics through experience by learning the shapes of flammable objects and the characteristics of flames.
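The "transparency" overlay described above amounts to compositing the reconstructed object back onto the camera frame; the alpha-blending sketch below is an assumption about how that superimposition could be done, not the paper's exact rendering.

import numpy as np

def superimpose(frame, reconstruction, mask, alpha=0.6):
    # Blend the reconstructed (de-occluded) object over the camera frame so
    # the occluded region appears "transparent" in the AR view.
    frame = frame.astype(float)
    reconstruction = reconstruction.astype(float)
    mask3 = mask[..., None].astype(float)   # 1 where the object was occluded
    blended = frame * (1 - alpha * mask3) + reconstruction * (alpha * mask3)
    return blended.astype(np.uint8)

frame = np.full((240, 320, 3), 80, dtype=np.uint8)     # stand-in AR frame
recon = np.full((240, 320, 3), 200, dtype=np.uint8)    # stand-in cGAN output
mask = np.zeros((240, 320)); mask[80:160, 100:220] = 1 # occluded region
print(superimpose(frame, recon, mask).mean())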