Question
I am working on a classification problem (object classification for autonomous vehicles). I use the KITTI dataset, which provides LIDAR and camera data, and I want to use both modalities to perform the task.
The 3D LIDAR data is projected onto the coordinate system of the RGB image, resulting in a sparse LIDAR image:
Each pixel encodes depth (the distance to the point, sqrt(x² + y²), scaled between 0 and 255).
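For reference, a minimal sketch of this encoding step in Python (NumPy assumed; `pts_2d`, `pts_3d`, and the scaling convention are illustrative placeholders, not KITTI-specific code):

```python
import numpy as np

def encode_sparse_depth(pts_2d, pts_3d, h, w):
    """Scatter projected LIDAR points into a sparse 8-bit depth image."""
    depth = np.sqrt(pts_3d[:, 0] ** 2 + pts_3d[:, 1] ** 2)   # sqrt(x^2 + y^2)
    scaled = np.clip(depth / depth.max() * 255.0, 1, 255).astype(np.uint8)
    img = np.zeros((h, w), dtype=np.uint8)                   # 0 = no reading
    u = np.round(pts_2d[:, 0]).astype(int)
    v = np.round(pts_2d[:, 1]).astype(int)
    keep = (u >= 0) & (u < w) & (v >= 0) & (v < h)           # inside the image
    img[v[keep], u[keep]] = scaled[keep]
    return img
```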
In order to obtain better results with my CNN, I need a dense LIDAR image. Does anyone know how to do this using Python?
I would like to obtain something like this:
Thanks in advance.
Answer
I've never worked with point-cloud data/LIDAR before, but as nobody has answered yet, I'll give it my best shot. I'm not sure about inpainting approaches per se, though I imagine they might not work very well (except for maybe a variational method, which I presume would be quite slow). But if your goal is to project the 3D LIDAR readings (when accompanied by ring ids and laser intensity readings) into a dense 2D matrix (for use in a CNN), the following reference might prove useful. Additionally, in this paper they reference a previous work (Collar Line Segments for Fast Odometry Estimation from Velodyne Point Clouds) which covers the technique of polar binning in more detail and has C++ code available. Check out the papers, but I'll try to summarize the technique here:
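That said, if you want a quick inpainting baseline to compare against, here is a minimal sketch using OpenCV (it assumes your sparse image is 8-bit and that a value of 0 means "no reading"):

```python
import cv2
import numpy as np

def densify_by_inpainting(sparse_depth):
    """Fill the missing pixels (value 0) of an 8-bit sparse depth image."""
    mask = (sparse_depth == 0).astype(np.uint8)   # 1 where depth is missing
    # cv2.INPAINT_NS (Navier-Stokes) is an alternative flag; radius in pixels.
    return cv2.inpaint(sparse_depth, mask, 5, cv2.INPAINT_TELEA)
```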
CNN for Very Fast Ground Segmentation in Velodyne LiDAR Data - describes its preprocessing technique in section III.A (Encoding Sparse 3D Data Into a Dense 2D Matrix).
- 1) Let P represent the original point cloud and M the multi-channel dense matrix you want as output. The size of M depends on the number of laser beams used in the scan and on the horizontal angular resolution of the scanner.
- 2) Aggregate the point-cloud data into polar bins b(r, c), where r represents the ring id and c = floor((R * atan(x/z) + 180) / 360).
- 3) Map each bin b(r, c) to its corresponding value m(r, c) in the matrix M using the mapping given in the paper, where p^i is the laser intensity reading.
- 4) For empty bins, linearly interpolate the value of m(r, c) from its neighborhood (see the sketch after this list).
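A minimal sketch of this binning scheme in Python (NumPy assumed; averaging depth and intensity per bin is my simplification of the paper's mapping, and the default bin counts are illustrative):

```python
import numpy as np

def polar_bin(points, intensity, ring, n_rings=64, n_cols=360):
    """Aggregate LIDAR points into a dense (n_rings, n_cols, 2) matrix M."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    depth = np.sqrt(x ** 2 + y ** 2)
    # Column index from the horizontal angle (equivalent to the formula
    # above with R = n_cols, the horizontal resolution of the scanner).
    deg = np.degrees(np.arctan2(x, z))
    c = np.floor((deg + 180.0) * n_cols / 360.0).astype(int) % n_cols
    r = ring.astype(int)

    M = np.zeros((n_rings, n_cols, 2))        # channels: mean depth, mean p^i
    counts = np.zeros((n_rings, n_cols))
    np.add.at(M, (r, c, 0), depth)
    np.add.at(M, (r, c, 1), intensity)
    np.add.at(counts, (r, c), 1)
    filled = counts > 0
    M[filled] /= counts[filled][:, None]

    # Empty bins: linearly interpolate along each ring from filled neighbours.
    cols = np.arange(n_cols)
    for i in range(n_rings):
        good = filled[i]
        if good.any() and not good.all():
            for ch in range(2):
                M[i, ~good, ch] = np.interp(cols[~good], cols[good], M[i, good, ch])
    return M
```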
Finally, looking at the following paper, they introduce some techniques for using sparse Velodyne readings in a CNN. Maybe see if any of these improve your performance?
Vehicle Detection from 3D Lidar Using Fully Convolutional Network - describes its preprocessing technique in section III.A (Data Preparation).
Encoding the range data into a 2-channel image
- 1) Initialize a 2-channel matrix I; fill it with zeros.
- 2) Given coordinates (x, y, z), let theta = atan2(y, x) and phi = arcsin(z / sqrt(x^2 + y^2 + z^2)).
- 3) Let delta_theta and delta_phi equal the average horizontal and vertical resolution between consecutive beam emitters, respectively.
- 4) Let r = floor(theta / delta_theta) and c = floor(phi / delta_phi).
- 5) Let d = sqrt(x^2 + y^2).
- 6) Let I(r, c) = (d, z); if two points project to the same position (rare), keep the one closer to the observer (a sketch of these steps follows this list).
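A minimal sketch of these steps in Python (NumPy assumed; the image dimensions and the handling of angle wrap-around are illustrative choices):

```python
import numpy as np

def encode_range_image(points, delta_theta, delta_phi, n_rows, n_cols):
    """Encode points into a 2-channel image I(r, c) = (d, z), nearest wins."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    theta = np.arctan2(y, x)                                # step 2
    phi = np.arcsin(z / np.sqrt(x ** 2 + y ** 2 + z ** 2))
    d = np.sqrt(x ** 2 + y ** 2)                            # step 5

    r = np.floor(theta / delta_theta).astype(int) % n_rows  # step 4
    c = np.floor(phi / delta_phi).astype(int) % n_cols

    I = np.zeros((n_rows, n_cols, 2))                       # step 1
    for i in np.argsort(-d):        # far to near: nearest point wins (step 6)
        I[r[i], c[i], 0] = d[i]
        I[r[i], c[i], 1] = z[i]
    return I
```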
Unequal (up/down) sampling
- In the first convolutional layer, the authors downsample by a factor of 4 in the horizontal direction and by 2 in the vertical direction, because for the Velodyne point map points are denser in the horizontal direction. They upsample by the same factors in their final deconvolution layer (which simultaneously predicts a vehicle's "objectness" and its bounding box); a sketch of such layers follows.
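A minimal sketch of such unequal strides in Python/PyTorch (channel counts and kernel sizes are illustrative, not the paper's exact architecture):

```python
import torch.nn as nn

# First conv layer: stride (vertical, horizontal) = (2, 4) downsamples
# the horizontal direction twice as aggressively as the vertical one.
down = nn.Conv2d(2, 32, kernel_size=(3, 5), stride=(2, 4), padding=(1, 2))

# Final deconv layer: upsample by the same unequal factors.
up = nn.ConvTranspose2d(32, 2, kernel_size=(3, 5), stride=(2, 4),
                        padding=(1, 2), output_padding=(1, 3))
```

For an input of shape (N, 2, 64, 512), `down` produces (N, 32, 32, 128), and `up` maps that back to (N, 2, 64, 512).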
All of these techniques were implemented with respect to the KITTI dataset/Velodyne LIDAR, so I imagine they could work (perhaps with some modification) for your particular use case.