Problem description
I have a DataFrame (df) like this:
PointID Time geojson
---- ---- ----
36F 2016-04-01T03:52:30 {'type': 'Point', 'coordinates': [3.961389, 43.123]}
36G 2016-04-01T03:52:50 {'type': 'Point', 'coordinates': [3.543234, 43.789]}
The geojson column contains data in GeoJSON format (essentially, a Python dict).
I want to create a new column in GeoJSON format that includes the time coordinate. In other words, I want to inject the time information into the GeoJSON data.
For a single value, I can do this successfully:
oldjson = df.iloc[0]['geojson']
newjson = [oldjson['coordinates'][0], oldjson['coordinates'][1], df.iloc[0]['Time']]
For a single parameter, I successfully used DataFrame.apply in combination with a lambda (thanks to a related question on SO).
But now I have two parameters, and I want to apply the function to the whole DataFrame. As I am not confident with the .apply syntax and lambdas, I do not know whether this is even possible. I would like to do something like this:
def inject_time(geojson, time):
    """
    Injects Time dimension into geoJSON coordinates. Expects a dict in geojson POINT format.
    """
    geojson['coordinates'] = [geojson['coordinates'][0], geojson['coordinates'][1], time]
    return geojson

df["newcolumn"] = df["geojson"].apply(lambda x: inject_time(x, df['Time']))
...but that does not work, because df['Time'] is the whole Series, so the function would inject the entire Series into every row.
I figured that the format of the timestamped GeoJSON should be something like this:
TimestampedGeoJson({
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {
                "type": "LineString",
                "coordinates": [[-70,-25],[-70,35],[70,35]],
            },
            "properties": {
                "times": [1435708800000, 1435795200000, 1435881600000]
            }
        }
    ]
})
So the time element belongs in the properties element, but this does not change the problem much.
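To make that target format concrete, here is a minimal sketch of how one row could be wrapped into a Feature with its time under properties. The helper name to_feature is hypothetical, the sample DataFrame is rebuilt from the table above, and two assumptions are made: the times values are epoch milliseconds (as the sample numbers suggest), and a timestamp without a timezone is treated as UTC.

```python
import pandas as pd

# Sample data rebuilt from the table in the question
df = pd.DataFrame({
    "PointID": ["36F", "36G"],
    "Time": ["2016-04-01T03:52:30", "2016-04-01T03:52:50"],
    "geojson": [
        {"type": "Point", "coordinates": [3.961389, 43.123]},
        {"type": "Point", "coordinates": [3.543234, 43.789]},
    ],
})


def to_feature(geojson, time):
    """Wrap a GeoJSON geometry in a Feature carrying its time in properties.times.

    Hypothetical helper for illustration only.
    """
    # Timestamp.value is nanoseconds since the epoch; convert to milliseconds.
    # Assumption: a timestamp without timezone information is taken as UTC.
    millis = pd.Timestamp(time).value // 10**6
    return {
        "type": "Feature",
        "geometry": geojson,
        "properties": {"times": [millis]},
    }


# Pair each geometry with the time from the same row
features = [to_feature(g, t) for g, t in zip(df["geojson"], df["Time"])]
feature_collection = {"type": "FeatureCollection", "features": features}
```

The resulting feature_collection is a plain dict with the same shape as the sample above, except that each feature holds a Point with a single entry in times.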
Recommended answer
You need DataFrame.apply with axis=1 to process the DataFrame row by row:
df['new'] = df.apply(lambda x: inject_time(x['geojson'], x['Time']), axis=1)

# temporarily widen the column display so the long strings are not truncated
with pd.option_context('display.max_colwidth', 100):
    print(df['new'])
0 {'type': 'Point', 'coordinates': [3.961389, 43.123, '2016-04-01T03:52:30']}
1 {'type': 'Point', 'coordinates': [3.543234, 43.789, '2016-04-01T03:52:50']}
Name: new, dtype: object
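One caveat: inject_time from the question assigns to geojson['coordinates'] on the dict it receives, so the dicts stored in the geojson column are changed as well. The sketch below is a self-contained variant that avoids this by copying first. The name inject_time_copy and the column name new_zip are hypothetical, and the sample DataFrame is rebuilt from the table in the question.

```python
import copy

import pandas as pd

# Sample data rebuilt from the table in the question
df = pd.DataFrame({
    "PointID": ["36F", "36G"],
    "Time": ["2016-04-01T03:52:30", "2016-04-01T03:52:50"],
    "geojson": [
        {"type": "Point", "coordinates": [3.961389, 43.123]},
        {"type": "Point", "coordinates": [3.543234, 43.789]},
    ],
})


def inject_time_copy(geojson, time):
    """Return a copy of a GeoJSON Point with `time` appended as a third coordinate.

    Hypothetical variant of inject_time that leaves its input untouched.
    """
    result = copy.deepcopy(geojson)  # keep the dict stored in df["geojson"] as it was
    result["coordinates"] = [geojson["coordinates"][0], geojson["coordinates"][1], time]
    return result


# axis=1 passes one row at a time, so row["Time"] is a single value, not the whole Series
df["new"] = df.apply(lambda row: inject_time_copy(row["geojson"], row["Time"]), axis=1)

# Same result without apply: walk the two columns in parallel
df["new_zip"] = [inject_time_copy(g, t) for g, t in zip(df["geojson"], df["Time"])]
```

Both columns hold the same dicts. The zip version skips building a Series for every row, which may matter on large frames, though that is worth measuring rather than assuming.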