A picture is worth a thousand words:
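Observed behavior: as the picture above shows, the country names do not match their actual geometries.
Expected behavior: I would like to join a data frame to its geometries correctly and display the result in ggmap.
I have joined different data frames before, but things go wrong here because ggmap apparently needs a "fortified" data frame (I don't really know what that means) in order to display the results.
This is what I have done so far: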
library(rgdal)
library(dplyr)
library(broom)
library(ggmap)
# Load GeoJSON file with countries.
countries = readOGR(dsn = "https://gist.githubusercontent.com/ccamara/fc26d8bb7e777488b446fbaad1e6ea63/raw/a6f69b6c3b4a75b02858e966b9d36c85982cbd32/countries.geojson")
# Load dataframe.
df = read.csv("https://gist.githubusercontent.com/ccamara/fc26d8bb7e777488b446fbaad1e6ea63/raw/a6f69b6c3b4a75b02858e966b9d36c85982cbd32/sample-dataframe.csv")
# Join geometry with dataframe.
countries$iso_a2 = as.factor(countries$iso_a2)
countries@data = left_join(countries@data, df, by = c('iso_a2' = 'country_code'))
# Convert to dataframe so it can be used by ggmap.
countries.t = tidy(countries)
# Here's where the problem starts, as by doing so, data has been lost!
# Recover attributes' table that was destroyed after using broom::tidy.
countries@data$id = rownames(countries@data) # Adding a new id variable.
countries.t = left_join(countries.t, countries@data, by = "id")
ggplot(data = countries.t,
       aes(long, lat, fill = country_name, group = group)) +
  geom_polygon() +
  geom_path(colour = "black", lwd = 0.05) + # polygon borders
  coord_equal() +
  ggtitle("Data and geometry have been messed!") +
  theme(axis.text = element_blank(),   # change the theme options
        axis.title = element_blank(),  # remove axis titles
        axis.ticks = element_blank())  # remove axis ticks
Best answer
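There is a reason for the confusing behavior. countries starts out as a large SpatialPolygonsDataFrame with 177 elements (and, correspondingly, 177 rows in countries@data). When you perform left_join on countries@data and df, the number of elements in countries is not affected, but the number of rows in countries@data grows to 210.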
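The row growth can be reproduced on a small scale. The sketch below uses made-up data: the data frames `attrs` and `extra` and all their values are hypothetical stand-ins, not taken from the gist. It assumes that, as the jump from 177 to 210 rows suggests, some country codes match more than one row in df.

```r
library(dplyr)

# Hypothetical stand-in for countries@data: one row per polygon.
attrs <- data.frame(iso_a2 = c("ES", "FR", "PT"),
                    name   = c("Spain", "France", "Portugal"),
                    stringsAsFactors = FALSE)

# Hypothetical stand-in for df: the code "ES" appears twice, "PT" not at all.
extra <- data.frame(country_code = c("ES", "ES", "FR"),
                    value        = c(10, 20, 30),
                    stringsAsFactors = FALSE)

joined <- left_join(attrs, extra, by = c("iso_a2" = "country_code"))

nrow(attrs)   # 3 rows before the join
nrow(joined)  # 4 rows after it: "ES" is repeated, "PT" is kept with NA
```

A left join keeps every row of the left table and emits one row per match, so a key that matches twice on the right produces two rows on the left. The geometry list has no way of knowing about those extra rows.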
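Fortifying countries with broom::tidy converts countries, with its 177 elements, into a data frame whose id runs from 0 to 176. (I'm not sure why the index starts at zero, but I usually prefer to specify the regions explicitly anyway.)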
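Adding id to countries@data based on rownames(countries@data), on the other hand, gives id values running from 1 to 210, since that is the number of rows in countries@data after the earlier join with df. As a result, everything is out of sync.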
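To make the mismatch concrete, here is a minimal sketch with hypothetical data. `polygons` mimics the 0-based ids described above, and `attrs` mimics an attribute table in which an earlier join duplicated one row. None of the names or values come from the original files.

```r
library(dplyr)

# Hypothetical fortified output: one id per polygon, starting at 0.
# true_name records which polygon each id really belongs to.
polygons <- data.frame(id        = as.character(0:2),
                       true_name = c("Spain", "France", "Portugal"),
                       stringsAsFactors = FALSE)

# Hypothetical attribute table after a join that duplicated the Spain row.
attrs <- data.frame(name = c("Spain", "Spain", "France", "Portugal"),
                    stringsAsFactors = FALSE)
attrs$id <- rownames(attrs)  # ids "1" to "4": 1-based, one per row

merged <- left_join(polygons, attrs, by = "id")

# Compare the polygon's real name with the name the join attached to it.
merged[, c("id", "true_name", "name")]
```

In this toy case polygon "0" finds no match at all, and polygons "1" and "2" pick up the attributes of a different row, which is the same kind of shuffling seen in the map.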
Try the following:
# (we start out right after loading countries & df)
# no need to join geometry with df first
# convert countries to data frame, specifying the regions explicitly
# (note I'm using the name column rather than the iso_a2 column from countries@data;
# this is because there are some repeat -99 values in iso_a2, and we want
# one-to-one matching.)
countries.t = tidy(countries, region = "name")
# join with the original file's data
countries.t = left_join(countries.t, countries@data, by = c("id" = "name"))
# join with df
countries.t = left_join(countries.t, df, by = c("iso_a2" = "country_code"))
# no change to the plot's code, except for ggtitle
ggplot(data = countries.t,
       aes(long, lat, fill = country_name, group = group)) +
  geom_polygon() +
  geom_path(colour = "black", lwd = 0.05) +
  coord_equal() +
  ggtitle("Data and geometry are fine") +
  theme(axis.text = element_blank(),
        axis.title = element_blank(),
        axis.ticks = element_blank())
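The comment about repeated -99 values suggests a check worth running before choosing a region column: the key should be unique on the geometry side. A minimal sketch in base R, using a hypothetical attribute table (the region names are placeholders, not rows from the GeoJSON file):

```r
# Hypothetical stand-in for countries@data with a repeated placeholder code.
attrs <- data.frame(iso_a2 = c("ES", "-99", "-99"),
                    name   = c("Spain", "Region A", "Region B"),
                    stringsAsFactors = FALSE)

# anyDuplicated() returns 0 when every value is unique,
# otherwise the index of the first repeated value.
iso_has_duplicates  <- anyDuplicated(attrs$iso_a2) > 0
name_has_duplicates <- anyDuplicated(attrs$name) > 0

iso_has_duplicates   # TRUE: iso_a2 is not a safe region key here
name_has_duplicates  # FALSE: name identifies each row uniquely
```

On the real data the same two calls could be run against countries@data$iso_a2 and countries@data$name before calling tidy().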
P.S. You don't actually need the ggmap package, only the ggplot2 package that it loads.
Source: "r - How to correctly join data and geometry using ggmap", Stack Overflow: https://stackoverflow.com/questions/46277574/