Heterogeneous Feature Fusion Module Based on CNN and Transformer for Multiview Stereo Reconstruction

Rui Gao, Jiajia Xu, Yipeng Chen, Kyungeun Cho

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

Multiview stereo (MVS), which reconstructs 3D models of a scene from photographs, has been a vital area of computer vision research for decades. This study presents an effective MVS network for 3D reconstruction from multiview images. Existing learning-based reconstruction techniques work well; however, because convolutional neural networks (CNNs) can extract only an image’s local features, their results contain many artifacts. Herein, a transformer and a CNN are used to extract the global and local features of the image, respectively. Additionally, hierarchical aggregation and heterogeneous interaction modules are used to improve these features. These modules are based on the transformer and can extract dense features with 3D consistency and global context, which are necessary for accurate matching in MVS.
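The contrast between local (CNN) and global (transformer) features can be illustrated with a toy sketch. This is not the paper's network: it assumes a 1D signal instead of an image, a fixed averaging kernel instead of learned convolution weights, single-head self-attention with identity projections, and fusion by simple pairing. The function names `local_features`, `global_features`, and `fuse` are hypothetical and chosen only for illustration.

```python
import math


def local_features(x, kernel):
    """1D convolution with zero padding: each output sees only a small window."""
    r = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - r
            if 0 <= j < len(x):  # positions outside the signal count as zero
                acc += w * x[j]
        out.append(acc)
    return out


def global_features(x):
    """Toy self-attention (identity projections): each output is a
    softmax-weighted average over ALL positions of the signal."""
    out = []
    for q in x:
        scores = [q * k for k in x]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append(sum(w * v for w, v in zip(weights, x)) / total)
    return out


def fuse(local, global_):
    """Assumed fusion: pair the local and global feature at each position."""
    return list(zip(local, global_))


signal = [0.0, 0.0, 1.0, 0.0, 0.0]
loc = local_features(signal, [1 / 3, 1 / 3, 1 / 3])
glo = global_features(signal)
fused = fuse(loc, glo)
```

At position 0 the local feature is zero because the spike at position 2 lies outside the three-sample window, whereas the global feature is nonzero because attention averages over the whole signal. That difference in receptive field is the motivation the abstract gives for combining the two feature types.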

Original language: English
Article number: 112
Journal: Mathematics
Volume: 11
Issue number: 1
State: Published - Jan 2023

Keywords

  • 3D reconstruction
  • deep learning
  • multi-view stereo
  • transformer
