Real-time clothing recognition is a useful technique for describing individuals in the video streams captured by surveillance cameras. In most cases, surveillance cameras shoot from a high angle and a long distance, so the clothing objects in the captured images are small and blurred. Traditional object detection models are designed to locate obvious, clear objects and then classify them into different categories; they are not sensitive enough to detect, and may even neglect, small and blurry clothing in surveillance images. To this end, we propose an effective real-time clothing detection model, the Multi-Scale Tiny attention-based network, called MuST. We also collect 5000 surveillance camera images and manually annotate the clothing into 112 categories to build an accurate benchmark dataset for clothing recognition. In particular, we develop a special tiny decoupled prediction head that helps detect small and fuzzy clothing more accurately. Moreover, clothing is often occluded or affected by distracting surroundings, which may degrade classification accuracy. Therefore, we introduce a novel multi-scale concatenation module that integrates and contrastively analyzes the information of the clothing objects and their local environment. As a result, MuST better localizes and classifies small and fuzzy clothing objects. Experimental results on real-world clothing recognition datasets show that MuST achieves the best recognition accuracy among all real-time object recognition models. Moreover, MuST achieves the best inference speed among all real-time and non-real-time object recognition models.
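The multi-scale concatenation idea described above can be illustrated with a minimal sketch: pool features from the tight object box and from an enlarged box that includes local context, then concatenate the two vectors so a classifier can compare object against surroundings. This is an assumption-laden toy version (the function names, pooling scheme, and `context` margin are all hypothetical, not MuST's actual architecture).

```python
import numpy as np

def pooled_crop(feature_map, box, out_size=2):
    """Average-pool a rectangular crop of a 2-D feature map to out_size x out_size."""
    y0, x0, y1, x1 = box
    crop = feature_map[y0:y1, x0:x1]
    h, w = crop.shape
    pooled = np.zeros((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            ys = slice(i * h // out_size, (i + 1) * h // out_size)
            xs = slice(j * w // out_size, (j + 1) * w // out_size)
            pooled[i, j] = crop[ys, xs].mean()
    return pooled

def multi_scale_features(feature_map, box, context=2):
    """Concatenate pooled features from the object box and an enlarged context box.

    The context box is the object box padded by `context` cells on each side
    (clipped to the map), standing in for the object's local environment.
    """
    y0, x0, y1, x1 = box
    h, w = feature_map.shape
    cbox = (max(0, y0 - context), max(0, x0 - context),
            min(h, y1 + context), min(w, x1 + context))
    obj = pooled_crop(feature_map, box)
    ctx = pooled_crop(feature_map, cbox)
    return np.concatenate([obj.ravel(), ctx.ravel()])
```

A downstream head could then contrast the two halves of this vector, e.g. by learning weights over their difference, to suppress background distraction.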
In the interdisciplinary field of finance and computing, scholars have proposed methods that compute investor sentiment from investor comment data in order to predict stock trends; these methods have also been transferred to analyze the emotion of news texts that affect investment decisions. After the great success of textual sentiment analysis, some scholars found that the emotions of news pictures have a similar influence on investor sentiment. However, unlike texts, which reflect the editor's attitude directly and effectively, news pictures are usually published as they looked when taken: most editors do not deliberately attend to a picture's emotion, and few intentionally alter it in post-processing to match their attitude. Thus, to apply a more precise stock-forecasting model in automatic trading software, it is important to determine whether news pictures and texts express similar attitudes, and which kinds of news pictures accurately convey the editor's attitude. To address these two questions, we collected financial news articles from four sources and conducted a correlation analysis between textual sentiments and picture emotions. A modified NLP model provides the textual sentiments, and a two-category deep learning model provides the picture emotions. Overall, this work not only helps improve the accuracy of automatic trading software but also makes automatic control more intelligent.
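The correlation analysis between textual sentiments and picture emotions can be sketched as computing a Pearson correlation coefficient over per-article score pairs. The sample scores below are purely hypothetical placeholders (the abstract does not report its data); text sentiment is assumed in [-1, 1] and picture emotion as a probability of the "positive" class in [0, 1].

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-article scores: one text-sentiment value and one
# picture-emotion value per article.
text_sentiment = [0.8, -0.3, 0.1, 0.6, -0.7]
picture_emotion = [0.9, 0.2, 0.5, 0.7, 0.1]
r = pearson(text_sentiment, picture_emotion)
```

A coefficient near 1 would suggest pictures and texts tend to express similar attitudes; values near 0 would suggest picture emotions carry little of the editor's textual stance.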