Indoor Scene Reconstruction From Monocular Video Combining Contextual and Geometric Priors
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Wen, Mingyun | - |
| dc.contributor.author | Sheng, Xuanyu | - |
| dc.contributor.author | Cho, Kyungeun | - |
| dc.date.accessioned | 2024-11-04T04:30:14Z | - |
| dc.date.available | 2024-11-04T04:30:14Z | - |
| dc.date.issued | 2024 | - |
| dc.identifier.issn | 2169-3536 | - |
| dc.identifier.uri | https://scholarworks.dongguk.edu/handle/sw.dongguk/56143 | - |
| dc.description.abstract | Recent advancements in three-dimensional (3D) indoor scene reconstruction from monocular videos using deep learning have gained considerable attention. However, existing methods remain inferior to reconstructions based on data obtained from 3D sensors. This is primarily because video data lacks explicit depth information. Depth inference from monocular videos relies on visual cues, such as texture, which can become ambiguous owing to lighting, reflections, and material properties. Most existing methods utilize convolutional neural networks (CNNs) for feature extraction and integrate features from multiple viewpoints to generate 3D features. However, CNNs cannot capture effective features in areas with unclear visual cues owing to the limited receptive fields of their shallow layers. To overcome these issues, this study proposes a keyframe feature-generation module employing a pretrained vision transformer (ViT) that capitalizes on its global perception to infer and synthesize features in areas with ambiguous visual cues. In addition, we employ a pretrained multi-view stereo network to generate a cost volume as a geometric feature. Moreover, the geometric features are further enhanced via the features extracted from the ViT. The effectiveness of the proposed approach is demonstrated through comparisons with existing methods on real-world datasets. | - |
| dc.format.extent | 10 | - |
| dc.language | English | - |
| dc.language.iso | ENG | - |
| dc.publisher | IEEE | - |
| dc.title | Indoor Scene Reconstruction From Monocular Video Combining Contextual and Geometric Priors | - |
| dc.type | Article | - |
| dc.publisher.location | United States | - |
| dc.identifier.doi | 10.1109/ACCESS.2024.3481250 | - |
| dc.identifier.scopusid | 2-s2.0-85206978915 | - |
| dc.identifier.wosid | 001340664500001 | - |
| dc.identifier.bibliographicCitation | IEEE Access, v.12, pp 153360 - 153369 | - |
| dc.citation.title | IEEE Access | - |
| dc.citation.volume | 12 | - |
| dc.citation.startPage | 153360 | - |
| dc.citation.endPage | 153369 | - |
| dc.type.docType | Article | - |
| dc.description.isOpenAccess | Y | - |
| dc.description.journalRegisteredClass | scie | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.relation.journalResearchArea | Computer Science | - |
| dc.relation.journalResearchArea | Engineering | - |
| dc.relation.journalResearchArea | Telecommunications | - |
| dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems | - |
| dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | - |
| dc.relation.journalWebOfScienceCategory | Telecommunications | - |
| dc.subject.keywordAuthor | Feature extraction | - |
| dc.subject.keywordAuthor | Three-dimensional displays | - |
| dc.subject.keywordAuthor | Image reconstruction | - |
| dc.subject.keywordAuthor | Costs | - |
| dc.subject.keywordAuthor | Estimation | - |
| dc.subject.keywordAuthor | Geometry | - |
| dc.subject.keywordAuthor | Computational modeling | - |
| dc.subject.keywordAuthor | Cameras | - |
| dc.subject.keywordAuthor | Transformers | - |
| dc.subject.keywordAuthor | Deep learning | - |
| dc.subject.keywordAuthor | Mesh generation | - |
| dc.subject.keywordAuthor | feature extraction | - |
| dc.subject.keywordAuthor | 3D scene reconstruction | - |
| dc.subject.keywordAuthor | mesh reconstruction | - |
| dc.subject.keywordAuthor | vision transformer | - |
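The abstract describes fusing contextual features from a pretrained ViT with a cost volume produced by a multi-view stereo network. The following is a minimal sketch of such a fusion step, not the paper's actual code: all module names, tensor shapes, and the specific fusion operator (projecting contextual features to per-depth-plane channels and applying a 3D convolution) are illustrative assumptions.

```python
# Hedged sketch (not the paper's implementation): fusing 2D contextual
# features (e.g., from a pretrained ViT, reshaped to a spatial grid) with
# a plane-sweep cost volume (e.g., from a pretrained MVS network).
# Shapes and layer choices are assumptions for illustration only.
import torch
import torch.nn as nn

class FeatureFusion(nn.Module):
    def __init__(self, ctx_dim=256, depth_planes=64, out_dim=64):
        super().__init__()
        # Project contextual feature channels to one channel per depth plane
        # so they can be stacked alongside the cost volume.
        self.ctx_proj = nn.Conv2d(ctx_dim, depth_planes, kernel_size=1)
        # A 3D convolution mixes the two stacked volumes into enhanced
        # geometric features.
        self.fuse = nn.Conv3d(2, out_dim, kernel_size=3, padding=1)

    def forward(self, ctx_feat, cost_volume):
        # ctx_feat:     (B, ctx_dim, H, W) -- contextual features on a grid
        # cost_volume:  (B, D, H, W)       -- matching cost per depth plane
        ctx = self.ctx_proj(ctx_feat)                    # (B, D, H, W)
        stacked = torch.stack([ctx, cost_volume], dim=1) # (B, 2, D, H, W)
        return self.fuse(stacked)                        # (B, out_dim, D, H, W)

# Toy usage with random tensors standing in for real network outputs.
fusion = FeatureFusion()
ctx = torch.randn(1, 256, 32, 32)
cv = torch.randn(1, 64, 32, 32)
out = fusion(ctx, cv)
print(out.shape)  # torch.Size([1, 64, 64, 32, 32])
```

The enhanced volume would then feed a downstream reconstruction head; the actual fusion operator used in the paper may differ.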
