Why can shapely/geos parse this "invalid" Well Known Binary?

Question

I am trying to parse Well Known Binary (WKB), a binary encoding of geometry objects used in Geographic Information Systems (GIS). I am using this spec from ESRI (same results here from esri). My input data comes from Osmosis, a tool to parse OpenStreetMap data, specifically the pgsimp-dump format, which gives the hex representation of the binary.

The ESRI docs say that a Point should take only 21 bytes: 1 byte for the byte order, 4 for the uint32 type ID, 8 for the double x, and 8 for the double y.
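The 21-byte layout described above can be sketched with Python's standard `struct` module. This is a minimal illustration rather than a full WKB reader: the helper name `parse_wkb_point` is hypothetical, and it only handles a plain 2D Point.

```python
import struct


def parse_wkb_point(data: bytes):
    """Parse a plain 21-byte WKB Point and return (x, y)."""
    if len(data) != 21:
        raise ValueError(f"expected 21 bytes, got {len(data)}")
    # Byte 0 is the byte order: 1 = little-endian (NDR), 0 = big-endian (XDR).
    endian = "<" if data[0] == 1 else ">"
    # Bytes 1-4: uint32 geometry type; bytes 5-20: two float64 coordinates.
    geom_type, x, y = struct.unpack(endian + "Idd", data[1:])
    if geom_type != 1:
        raise ValueError(f"not a Point, geometry type is {geom_type}")
    return x, y


# The 21-byte form of the point discussed in this question.
wkb = bytes.fromhex("0101000000DB81DF2B5F7822C0DFBB7262B4744A40")
x, y = parse_wkb_point(wkb)
print(len(wkb), x, y)
```

A strict reader like this one rejects the 25-byte Osmosis string on the length check, which is the mismatch the question is about.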

An example from Osmosis is this hex string: 0101000020E6100000DB81DF2B5F7822C0DFBB7262B4744A40, which is 25 bytes long.

Shapely, a Python library for parsing WKB (etc.) that is based on the popular GEOS library, is able to parse this string:

>>> import shapely.wkb
>>> shapely.wkb.loads("0101000020E6100000DB81DF2B5F7822C0DFBB7262B4744A40", hex=True)
<shapely.geometry.point.Point object at 0x7f221f2581d0>

When I ask Shapely to parse it and then convert it back to WKB, I get 21 bytes.

>>> shapely.wkb.loads("0101000020E6100000DB81DF2B5F7822C0DFBB7262B4744A40", hex=True).wkb.encode("hex").upper()
'0101000000DB81DF2B5F7822C0DFBB7262B4744A40'

The difference is the 4 bytes in the middle, which start 3 bytes into the uint32 for the type ID:

01010000**20E61000**00DB81DF2B5F7822C0DFBB7262B4744A40

Why can shapely/geos parse this WKB when it's invalid WKB? What do these bytes mean?

Recommended answer

GEOS / Shapely use an Extended variant of WKT/WKB called EWKT / EWKB, which is documented by PostGIS. If you have access to PostGIS, you can see what's going on here:

SELECT ST_AsEWKT('0101000020E6100000DB81DF2B5F7822C0DFBB7262B4744A40'::geometry);

This returns the EWKT SRID=4326;POINT(-9.2351011 52.9117549). So the extra data was the spatial reference identifier, or SRID, specifically EPSG:4326 for WGS 84.
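Without PostGIS at hand, the same information can be read straight from the bytes. The sketch below assumes the EWKB convention documented by PostGIS, in which flag bits are ORed into the type word (0x20000000 for "has SRID", 0x80000000 for Z, 0x40000000 for M) and a 4-byte SRID follows the type word when the SRID flag is set. The function name `parse_ewkb_point` is made up for this example, and only 2D Points are handled.

```python
import struct

# Flag bits that EWKB ORs into the uint32 geometry-type word.
EWKB_Z_FLAG = 0x80000000
EWKB_M_FLAG = 0x40000000
EWKB_SRID_FLAG = 0x20000000


def parse_ewkb_point(data: bytes):
    """Return (srid, x, y) for a 2D Point in WKB or EWKB form."""
    endian = "<" if data[0] == 1 else ">"
    (type_word,) = struct.unpack_from(endian + "I", data, 1)
    if type_word & (EWKB_Z_FLAG | EWKB_M_FLAG):
        raise ValueError("Z/M coordinates are not handled in this sketch")
    if type_word & 0x1FFFFFFF != 1:
        raise ValueError("not a Point")
    offset = 5
    srid = None
    if type_word & EWKB_SRID_FLAG:
        # The SRID sits between the type word and the coordinates.
        (srid,) = struct.unpack_from(endian + "i", data, offset)
        offset += 4
    x, y = struct.unpack_from(endian + "dd", data, offset)
    return srid, x, y


ewkb = bytes.fromhex("0101000020E6100000DB81DF2B5F7822C0DFBB7262B4744A40")
srid, x, y = parse_ewkb_point(ewkb)
print(srid, x, y)
```

Read this way, the type word `01000020` is little-endian 0x20000001 (Point plus the SRID flag), and `E6100000` is little-endian 0x10E6, which is 4326.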

Shapely does not support SRIDs; however, there are a few hacks, e.g.:

from shapely import geos
geos.WKBWriter.defaults['include_srid'] = True

This should now make wkb or wkb_hex output EWKB, which includes the SRID. The default is False, which outputs ISO WKB for 2D geometries (but not for 3D).

So it seems your objective is to convert EWKB to ISO WKB, which you can do with GEOS / Shapely for 2D geometries only. If you have 3D (Z or M) or 4D (ZM) geometries, then only PostGIS is able to do this conversion.
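For the simple 2D case, the conversion can also be illustrated at the byte level, without GEOS or PostGIS. This is only a sketch under stated assumptions: `strip_srid` is a hypothetical helper, it assumes the SRID appears once in the outermost header (as in the Point above), and it refuses Z/M geometries because their EWKB and ISO type codes differ.

```python
import struct

EWKB_SRID_FLAG = 0x20000000
EWKB_ZM_FLAGS = 0xC0000000  # Z (0x80000000) and M (0x40000000) bits


def strip_srid(ewkb: bytes) -> bytes:
    """Drop the SRID from a 2D EWKB geometry, leaving plain WKB."""
    endian = "<" if ewkb[0] == 1 else ">"
    (type_word,) = struct.unpack_from(endian + "I", ewkb, 1)
    if type_word & EWKB_ZM_FLAGS:
        raise ValueError("Z/M geometries are not handled in this sketch")
    if not type_word & EWKB_SRID_FLAG:
        return ewkb  # already plain WKB
    # Clear the SRID flag and skip the 4 SRID bytes (offsets 5 to 8).
    new_type = struct.pack(endian + "I", type_word & ~EWKB_SRID_FLAG)
    return ewkb[:1] + new_type + ewkb[9:]


ewkb = bytes.fromhex("0101000020E6100000DB81DF2B5F7822C0DFBB7262B4744A40")
iso_hex = strip_srid(ewkb).hex().upper()
print(iso_hex)
```

For the example string, the result matches the 21-byte value that Shapely produced in the question.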
