
Cited 22 times in Web of Science; cited 26 times in Scopus

Action Recognition in Videos Using Pre-Trained 2D Convolutional Neural Networks

Full metadata record
DC Field: Value
dc.contributor.author: Kim, Jun-Hwa
dc.contributor.author: Won, Chee Sun
dc.date.accessioned: 2023-04-28T00:41:18Z
dc.date.available: 2023-04-28T00:41:18Z
dc.date.issued: 2020
dc.identifier.issn: 2169-3536
dc.identifier.uri: https://scholarworks.dongguk.edu/handle/sw.dongguk/7167
dc.description.abstract: A pre-trained 2D CNN (Convolutional Neural Network) can be used for the spatial stream in the two-stream CNN structure for videos, treating the representative frame selected from the video as an input. However, the CNN for the temporal stream in the two-stream CNN needs training from scratch using the optical flow frames, which demands expensive computations. In this paper, we propose to adopt a pre-trained 2D CNN for the temporal stream to avoid the optical flow computations. Specifically, three RGB frames selected at three different times in the video sequence are converted into grayscale images and are assigned to the R (red), G (green), and B (blue) channels, respectively, to form a Stacked Grayscale 3-channel Image (SG3I). Then, the pre-trained 2D CNN is fine-tuned on SG3Is for the temporal-stream CNN. Therefore, only pre-trained 2D CNNs are used for both spatial and temporal streams. To learn long-range temporal motions in videos, we can use multiple SG3Is by partitioning the video shot into sub-shots and generating a single SG3I for each sub-shot. Experimental results show that our two-stream CNN with the proposed SG3Is is about 14.6 times faster than the first version of the two-stream CNN with optical flow, and yet achieves a similar recognition accuracy on UCF-101 and a 5.7% better result on HMDB-51.
dc.format.extent: 10
dc.language: English
dc.language.iso: ENG
dc.publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
dc.title: Action Recognition in Videos Using Pre-Trained 2D Convolutional Neural Networks
dc.type: Article
dc.publisher.location: United States
dc.identifier.doi: 10.1109/ACCESS.2020.2983427
dc.identifier.scopusid: 2-s2.0-85083249776
dc.identifier.wosid: 000527413100044
dc.identifier.bibliographicCitation: IEEE ACCESS, v.8, pp. 60179-60188
dc.citation.title: IEEE ACCESS
dc.citation.volume: 8
dc.citation.startPage: 60179
dc.citation.endPage: 60188
dc.type.docType: Article
dc.description.isOpenAccess: Y
dc.description.journalRegisteredClass: scie
dc.description.journalRegisteredClass: scopus
dc.relation.journalResearchArea: Computer Science
dc.relation.journalResearchArea: Engineering
dc.relation.journalResearchArea: Telecommunications
dc.relation.journalWebOfScienceCategory: Computer Science, Information Systems
dc.relation.journalWebOfScienceCategory: Engineering, Electrical & Electronic
dc.relation.journalWebOfScienceCategory: Telecommunications
dc.subject.keywordAuthor: Convolutional neural network (CNN)
dc.subject.keywordAuthor: action recognition
dc.subject.keywordAuthor: video analysis
dc.subject.keywordAuthor: two-stream convolutional neural networks
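The SG3I construction described in the abstract (three RGB frames sampled at different times, converted to grayscale, and stacked into the R, G, and B channels of one image) can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation; the function names and the BT.601 luma weights are assumptions.

```python
import numpy as np

def rgb_to_gray(frame):
    # Assumed grayscale conversion using ITU-R BT.601 luma weights
    return frame @ np.array([0.299, 0.587, 0.114])

def make_sg3i(frames):
    """Stack three grayscale frames into a Stacked Grayscale 3-channel
    Image (SG3I): each of the three H x W x 3 RGB frames, sampled at a
    different time in the (sub-)shot, is converted to grayscale and
    assigned to the R, G, and B channel, respectively."""
    assert len(frames) == 3, "SG3I needs exactly three frames"
    grays = [rgb_to_gray(f) for f in frames]
    return np.stack(grays, axis=-1)  # H x W x 3, ready for a 2D CNN

# Toy example: three uniform 4x4 "frames" at different intensities
frames = [np.full((4, 4, 3), v, dtype=np.float32) for v in (0.1, 0.5, 0.9)]
sg3i = make_sg3i(frames)
print(sg3i.shape)  # (4, 4, 3)
```

Because the SG3I has the same 3-channel shape as an ordinary RGB image, a pre-trained 2D CNN can be fine-tuned on it directly, which is what lets the temporal stream skip optical-flow computation.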
Files in This Item
There are no files associated with this item.
Appears in Collections
College of Engineering > Department of Electronics and Electrical Engineering > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.
