I. Introduction

Why BEV

BEV (Bird’s-eye-view) Trilogy, Part 2: Methods in Detail

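Challenges of BEV

Taxonomy

Task extensions

Some newer datasets, such as Lyft, nuScenes, and Argoverse, provide

Main approaches to view transformation

II. Camera-only × semantic segmentation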

VPN (IROS 2020)

Cam2BEV (ITSC 2020)

A Sim2Real Deep Learning Approach for the Transformation of Images from Multiple Vehicle-Mounted Cameras to a Semantically Segmented Image in Bird’s Eye View

MonoLayout (WACV 2020)

MonoLayout: Amodal scene layout from a single image

PyrOccNet (CVPR 2020)

Predicting Semantic Map Representations from Images using Pyramid Occupancy Networks

Lift, Splat, Shoot (ECCV 2020, NVIDIA)

Lift, Splat, Shoot: Encoding Images From Arbitrary Camera Rigs by Implicitly Unprojecting to 3D

BEV Feature Stitching (RAL/ICRA 2022)

Understanding Bird’s-Eye View Semantic HD-Maps Using an Onboard Monocular Camera

GKT (arXiv 2022.06)

Efficient and Robust 2D-to-BEV Representation Learning via Geometry-guided Kernel Transformer

III. Camera-only × semantic segmentation (Transformers)

PYVA: Projecting Your View Attentively (CVPR 2021)

Projecting Your View Attentively: Monocular Road Scene Layout Estimation via Cross-view Transformation

BEVFormer (CVPR 2022 Workshop)

BEVFormer: Learning Bird’s-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers

ViT-BEVSeg (arXiv 2022.05)

ViT-BEVSeg: A Hierarchical Transformer Network for Monocular Birds-Eye-View Segmentation

IV. 3D object detection and other tasks

FIERY (ICCV 2021)

Predicting the Future from Monocular Cameras in Bird’s-Eye View

DETR3D (CoRL 2021)

DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries

PersFormer (ECCV 2022 Oral)

PersFormer: a New Baseline for 3D Laneline Detection

Core contribution: the proposed Perspective Transformer.

V. Multi-modal fusion

HDMapNet (ICRA 2022)

HDMapNet: An Online HD Map Construction and Evaluation Framework

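It mainly addresses two problems: vectorizing the road predictions, and transforming the view from the camera front view to the bird's-eye view.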

FUTR3D (arXiv 2022.03)

BEVFusion (arXiv 2022.05)

BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird’s-Eye View Representation
According to its authors, BEVFusion ranked first on nuScenes among all solutions at the time of release.

Core contribution: it accelerates the BEV pooling operation, cutting its latency from 500 ms to 12 ms.
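
To make the BEV pooling operation concrete: it aggregates all features that land in the same BEV grid cell. The following is a deliberately naive pure-Python sketch of that aggregation using sum pooling. It is not BEVFusion's optimized kernel; the function name `bev_pool`, the grid layout, and the sample values are assumptions chosen for illustration.

```python
import math


def bev_pool(points, features, x_range, y_range, cell_size):
    """Sum-pool per-point feature vectors into a 2D BEV grid.

    points:   list of (x, y) ground-plane positions (hypothetical ego frame)
    features: list of equal-length feature vectors, one per point
    Returns grid[iy][ix] -> pooled feature vector for that cell.
    """
    x_min, x_max = x_range
    y_min, y_max = y_range
    nx = int(round((x_max - x_min) / cell_size))
    ny = int(round((y_max - y_min) / cell_size))
    channels = len(features[0]) if features else 0

    # One zero-initialised feature vector per BEV cell.
    grid = [[[0.0] * channels for _ in range(nx)] for _ in range(ny)]

    for (x, y), feat in zip(points, features):
        ix = math.floor((x - x_min) / cell_size)
        iy = math.floor((y - y_min) / cell_size)
        # Points that fall outside the BEV extent are dropped.
        if 0 <= ix < nx and 0 <= iy < ny:
            cell = grid[iy][ix]
            for c, value in enumerate(feat):
                cell[c] += value
    return grid


# Illustrative data: two points share cell (0, 0), one lands in cell
# (ix=1, iy=0), and the last lies outside the 2 x 2 grid.
points = [(0.2, 0.3), (0.8, 0.9), (1.5, 0.1), (9.0, 9.0)]
features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
bev = bev_pool(points, features, x_range=(0.0, 2.0), y_range=(0.0, 2.0), cell_size=1.0)
```

A loop like this touches every point one at a time, which hints at why the operation can become a bottleneck when there are many points and why a faster implementation matters.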

VI. Industry

Tesla’s Approach

References: blog1, blog2, video

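  1. First, the raw images come in.
  2. A rectify layer maps each image into a virtual camera.
  3. A RegNet backbone, which is essentially ResNet-style, produces features at several scales.
  4. A BiFPN (bidirectional FPN) fuses the multi-scale features, passing information top-down and then bottom-up across the scales.
  5. A transformer projects the features into the BEV view, yielding a top-down feature map.
  6. The BEV features are pushed into a feature queue to add temporal information; the video module is in effect a fusion over time. The result is a set of features that combines multiple cameras and temporal context.
  7. Finally, the features are passed to the different detection heads.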
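
The bidirectional fusion in step 4 can be sketched in a few lines. This is a simplified illustration of the top-down then bottom-up idea only: it uses 1D feature rows, nearest-neighbour upsampling, average-pool downsampling, and equal fusion weights, whereas a real BiFPN operates on 2D feature maps with convolutions and learned weights. All function names here are hypothetical.

```python
def upsample2(row):
    """Nearest-neighbour upsampling by 2 along a 1D feature row."""
    return [v for v in row for _ in range(2)]


def downsample2(row):
    """Average-pool by 2 along a 1D feature row (length must be even)."""
    return [(row[i] + row[i + 1]) / 2.0 for i in range(0, len(row), 2)]


def fuse(a, b):
    """Equal-weight fusion of two same-length rows (an assumption;
    BiFPN learns these weights)."""
    return [(u + v) / 2.0 for u, v in zip(a, b)]


def bidirectional_fuse(levels):
    """levels: 1D rows ordered fine -> coarse, each half the previous length."""
    # Top-down pass: push coarse context into the finer levels.
    top_down = list(levels)
    for i in range(len(levels) - 2, -1, -1):
        top_down[i] = fuse(levels[i], upsample2(top_down[i + 1]))

    # Bottom-up pass: push fine detail back into the coarser levels.
    out = list(top_down)
    for i in range(1, len(levels)):
        out[i] = fuse(top_down[i], downsample2(out[i - 1]))
    return out


# Three scales with illustrative values.
levels = [[1.0, 1.0, 1.0, 1.0], [2.0, 2.0], [4.0]]
fused = bidirectional_fuse(levels)
```

Each output level keeps its original resolution but now mixes information from every other scale, which is the property the later BEV projection step relies on.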

Reference
