Detailed Information

Cited 19 time in webofscience Cited 26 time in scopus
Metadata Downloads

Object Detection-Based Video Retargeting With Spatial-Temporal Consistency

Authors
Lee, Seung JoonLee, SiyeongCho, Sung InKang, Suk-Ju
Issue Date
Dec-2020
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Keywords
Object detection; Object tracking; Distortion; Indexes; Computational complexity; Image sequences; Optimization; Object detection; object tracking; video retargeting; convolutional neural network
Citation
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, v.30, no.12, pp 4434 - 4439
Pages
6
Indexed
SCIE
SCOPUS
Journal Title
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
Volume
30
Number
12
Start Page
4434
End Page
4439
URI
https://scholarworks.dongguk.edu/handle/sw.dongguk/5850
DOI
10.1109/TCSVT.2020.2981652
ISSN
1051-8215
1558-2205
Abstract
This study proposes a video retargeting method using deep neural network-based object detection. First, the meaningful regions of the input video denoted by bounding boxes of the object detection are extracted. In this case, the area is defined considering the size and number of bounding boxes for objects detected. The bounding boxes of each frame image are considered as regions of interest (RoIs). Second, the Siamese object tracking network is used to address high computational complexity of the object detection network. By dividing the video into scenes, object detection is performed for the first frame image of each scene to obtain the first bounding box. Object tracking is performed for the next sequential frame image until a scene change is detected. Third, the image is resized in the horizontal direction to alter the aspect ratio of the image and obtain the 1D RoIs of the image by projecting bounding boxes in the vertical direction. Then, the proposed method computes the grid map from the 1D RoIs to calculate new coordinates of each column data of the image. Finally, the retargeted video is obtained by rearranging all retargeted frame images. Comparative experiments conducted with various benchmark methods show an average bidirectional similarity score of 1.92, which is higher than other conventional methods. The proposed method was stable and satisfied viewers without causing cognitive discomfort as conventional methods.
Files in This Item
There are no files associated with this item.
Appears in
Collections
College of Advanced Convergence Engineering > Department of Computer Science and Artificial Intelligence > 1. Journal Articles

qrcode

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Altmetrics

Total Views & Downloads

BROWSE