Video classification is central to human-machine interfaces because it lets a system analyze the activities it observes. Transfer learning can help make these predictions accurate. The dataset used in this work is a subset of the UCF101 Action Recognition dataset comprising 10 classes, each with more than 120 videos. Each video is converted into a series of frames at a frame rate of 5. Features are extracted from these frames using InceptionV3. The fine-tuned model consists of four dense layers with ReLU activation and 1024, 512, 256, and 128 neurons, respectively, followed by a dense layer with softmax activation and 10 neurons, one for each of the 10 classes. The technique has a wide range of applications in human-machine interfaces, such as classifying activities for visually impaired people.
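The dense head described above can be sketched in a few lines to make its size concrete. This is only an illustrative calculation, not the authors' code: the layer widths and activations come from the abstract, while the 2048-dimensional input (the usual size of InceptionV3's globally pooled features) and the names `HEAD_LAYERS`, `FEATURE_DIM`, and `dense_param_counts` are assumptions introduced here.

```python
# Layer widths and activations as stated in the abstract.
HEAD_LAYERS = [
    (1024, "relu"),
    (512, "relu"),
    (256, "relu"),
    (128, "relu"),
    (10, "softmax"),  # one output per class
]

# Assumption: each frame is represented by InceptionV3's pooled
# 2048-dimensional feature vector; the abstract does not state this.
FEATURE_DIM = 2048


def dense_param_counts(input_dim, layers):
    """Return weights-plus-biases counts for a stack of dense layers."""
    counts = []
    fan_in = input_dim
    for units, _activation in layers:
        counts.append(fan_in * units + units)  # weight matrix + bias vector
        fan_in = units
    return counts


param_counts = dense_param_counts(FEATURE_DIM, HEAD_LAYERS)
total_params = sum(param_counts)

for (units, activation), count in zip(HEAD_LAYERS, param_counts):
    print(f"Dense({units}, {activation}): {count:,} parameters")
print(f"Total trainable parameters in the head: {total_params:,}")
```

Under the 2048-feature assumption, almost all of the head's parameters sit in the first dense layer, so the choice of input representation dominates the size of the model.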