R ggplot2: merging shapefile and CSV data to fill polygons

Problem description

We produce daily maps that show a calculated temperature level for 30 distinct areas of our region; each area is filled with a different colour depending on its level. The maps look like this: [map image]

Now I want to switch map generation to R. I've downloaded provincial and municipal boundaries (available for the whole of Spain, or as a subset for my region) and managed to plot them with ggplot2 following Hadley's example.

I can also produce an ASCII file (available for download) that contains two columns: an identifier (CODINE) and the daily level.

This is my first script attempting to plot shapefiles with R and ggplot2, so there may be mistakes and it can surely be improved; suggestions are welcome.
The following code (based on Hadley's example mentioned above) works for me:

    require("rgdal")
    require("maptools")
    require("ggplot2")
    require("plyr")

    # Reading municipal boundaries
    esp = readOGR(dsn=".", layer="lineas_limite_municipales_etrs89")
    muni = subset(esp, esp$PROV1 == "46" | esp$PROV1 == "12" | esp$PROV1 == "3")
    muni@data$id = rownames(muni@data)
    muni.points = fortify(muni, region="id")
    muni.df = join(muni.points, muni@data, by="id")

    # Reading province boundaries
    prov = readOGR(dsn=".", layer="poligonos_provincia_etrs89")
    pr = subset(prov, prov$CODINE == "46" | prov$CODINE == "12" | prov$CODINE == "03")
    pr@data$id = rownames(pr@data)
    pr.points = fortify(pr, region="id")
    pr.df = join(pr.points, pr@data, by="id")

    # Municipal boundaries in blue, province boundaries in red
    ggplot(muni.df) + aes(long, lat, group=group) +
      geom_path(color="blue") +
      coord_equal() +
      geom_path(data=pr.df, aes(x=long, y=lat, group=group),
                color="red", size=0.5)

This code plots a nice map with all the boundaries.

For polygon filling by level, I tried to read the level file and then merge it as suggested in http://tormodboe.wordpress.com/2011/02/22/g%C3%B8y-med-kart-2/ but it gives an error:

    Error in fix.by(by.x, x) : 'by' must specify a uniquely valid column

I am not familiar with shapefiles; maybe I need to learn more about shp data attributes to find the right way to merge both data sets. How can I merge the data so that I can plot the lines (municipal boundaries) and then fill the areas by level?

Solution

[NB: This question was asked over a month ago, so OP has probably found a different way to solve their problem. I stumbled upon it while working on a related question. This answer is included in the hope that it will benefit someone else.]

This appears to be what OP is asking for... [map image]
...and was produced with the following code:

    require("rgdal")
    require("maptools")
    require("ggplot2")
    require("plyr")
    require("stringr")   # provides str_pad()

    # read temperature data
    setwd("<location of your data file>")
    temp.data <- read.csv(file = "levels.dat", header=TRUE, sep=" ", na.string="NA", dec=".", strip.white=TRUE)
    # pad the integer codes to 5 characters so they match CODIGOINE
    temp.data$CODINE <- str_pad(temp.data$CODINE, width = 5, side = 'left', pad = '0')

    # read municipality polygons
    setwd("<location of your shapefile>")
    esp <- readOGR(dsn=".", layer="poligonos_municipio_etrs89")
    muni <- subset(esp, esp$PROVINCIA == "46" | esp$PROVINCIA == "12" | esp$PROVINCIA == "3")

    # fortify and merge: muni.df is used in ggplot
    muni@data$id <- rownames(muni@data)
    muni.df <- fortify(muni)
    muni.df <- join(muni.df, muni@data, by="id")
    muni.df <- merge(muni.df, temp.data, by.x="CODIGOINE", by.y="CODINE", all.x=T, all.y=F)

    # create the map layers
    ggp <- ggplot(data=muni.df, aes(x=long, y=lat, group=group))
    ggp <- ggp + geom_polygon(aes(fill=LEVEL))         # draw polygons
    ggp <- ggp + geom_path(color="grey", linetype=2)   # draw boundaries
    ggp <- ggp + coord_equal()
    ggp <- ggp + scale_fill_gradient(low = "#ffffcc", high = "#ff4444",
                                     space = "Lab", na.value = "grey50",
                                     guide = "colourbar")
    ggp <- ggp + labs(title="Temperature Levels: Comunitat Valenciana")

    # render the map
    print(ggp)

Explanation:

Polygon shapefiles imported into R with readOGR(...) are of type SpatialPolygonsDataFrame and have two main sections: a polygon section, which contains the coordinates of all the points on each polygon, and a data section, which contains information about each polygon (so, one row per polygon). These can be referenced using, e.g., muni@polygons and muni@data. The utility function fortify(...) converts the polygon section to a data frame organized for plotting with ggplot.
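To make the two sections concrete, here is a minimal sketch that builds a one-polygon object by hand with the sp package and inspects both slots. The square, the ID "a", and the NAME column are made up for illustration; they are not taken from the Spanish shapefile.

```r
library(sp)

# A unit square, listed clockwise; the first and last vertices coincide
# so that the ring is closed.
ring <- Polygon(cbind(c(0, 0, 1, 1, 0), c(0, 1, 1, 0, 0)))

# One Polygons object per feature, each carrying a unique ID.
shapes <- SpatialPolygons(list(Polygons(list(ring), ID = "a")))

# Attribute table: one row per polygon, row names matching the polygon IDs.
attrs <- data.frame(NAME = "Toy municipality", row.names = "a")
toy <- SpatialPolygonsDataFrame(shapes, data = attrs)

# Polygon section: the coordinates of every vertex.
coords <- toy@polygons[[1]]@Polygons[[1]]@coords
print(coords)

# Data section: one row per polygon; the polygon ID lives in the row names.
print(toy@data)
poly_ids <- sapply(toy@polygons, function(p) p@ID)
```

Note that the data slot has no ID column of its own: the link between a polygon and its attribute row is the row name.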
So the basic workflow is:

[1] Import temperature data file (temp.data)
[2] Import polygon shapefile of municipalities (muni)
[3] Convert muni polygons to a data frame for plotting (muni.df <- fortify(...))
[4] Join columns from muni@data to muni.df
[5] Join columns from temp.data to muni.df
[6] Make the plot

The joins must be done on common fields, and this is where most of the problems come in. Each polygon in the original shapefile has a unique ID attribute. Running fortify(...) on the shapefile creates a column, id, which is based on this. But there is no ID column in the data section. Instead, the polygon IDs are stored as row names. So first we must add an id column to muni@data as follows:

    muni@data$id <- rownames(muni@data)

Now we have an id field in muni@data and a corresponding id field in muni.df, so we can do the join:

    muni.df <- join(muni.df, muni@data, by="id")

To create the map we need to set fill colors based on temperature level. To do that we need to join the LEVEL column from temp.data to muni.df. In temp.data there is a field CODINE which identifies the municipality. There is now also a corresponding field CODIGOINE in muni.df. But there's a problem: CODIGOINE is char(5) with leading zeros, whereas CODINE is an integer, which means the leading zeros are missing (imported from Excel, perhaps?). So joining on these two fields directly fails to match the codes that have leading zeros. We must first convert CODINE into char(5) with leading zeros:

    temp.data$CODINE <- str_pad(temp.data$CODINE, width = 5, side = 'left', pad = '0')

Now we can join temp.data to muni.df based on the corresponding fields:

    muni.df <- merge(muni.df, temp.data, by.x="CODIGOINE", by.y="CODINE", all.x=T, all.y=F)

We use merge(...) instead of join(...) because the join fields have different names and join(...) requires them to have the same name. (Note, however, that join(...) is faster and should be used if possible.)
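The leading-zero mismatch and the by.x/by.y merge can be reproduced with two toy data frames in base R. This is only a sketch: the codes, names, and levels below are invented, and sprintf("%05d", ...) is used as a base-R stand-in for str_pad(...) so that the example needs no extra packages.

```r
# Hypothetical stand-ins for the shapefile attributes and the level file.
shape.attrs <- data.frame(CODIGOINE = c("03001", "03002", "03003"),
                          NAME = c("A", "B", "C"),
                          stringsAsFactors = FALSE)
temp.data <- data.frame(CODINE = c(3001L, 3003L),
                        LEVEL = c(2L, 5L))

# Merging as-is compares "03001" with "3001", so nothing matches.
naive <- merge(shape.attrs, temp.data, by.x = "CODIGOINE", by.y = "CODINE")
print(nrow(naive))

# Pad the integer codes to five characters with leading zeros.
temp.data$CODINE <- sprintf("%05d", temp.data$CODINE)

# all.x = TRUE keeps every polygon row, even those without a level;
# all.y = FALSE drops levels that have no polygon.
merged <- merge(shape.attrs, temp.data,
                by.x = "CODIGOINE", by.y = "CODINE",
                all.x = TRUE, all.y = FALSE)
print(merged)
```

Keeping all.x = TRUE matters for the map: a municipality with no level still has to be drawn, and its missing LEVEL is what na.value in the fill scale is for.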
So, finally, we have a data frame which contains all the information for plotting the polygons, plus the temperature LEVEL, which can be used to establish the fill color for each polygon.

Some notes on OP's original code:

OP's first map (the green one at the top) identifies "30 distinct areas for our region...". I could find no shapefile identifying those areas. The municipality file identifies 543 municipalities, and I could see no way to group these into 30 areas. In addition, the temperature level file has 542 rows, one for each municipality (more or less).

OP was importing the line shapefile for municipalities to draw the boundaries. You don't need that, because geom_polygon(...) will draw (and fill) the polygons and geom_path(...) will draw the boundaries.
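As a small illustration of that last point, the sketch below draws two made-up squares from a single data frame laid out like fortify(...) output (long, lat, group), using geom_polygon(...) for the fill and geom_path(...) for the outlines. The square() helper, the group labels, and the LEVEL values are hypothetical.

```r
library(ggplot2)

# Hypothetical helper: one closed unit square in long/lat/group layout.
square <- function(x0, id, level) {
  data.frame(long = x0 + c(0, 0, 1, 1, 0),
             lat = c(0, 1, 1, 0, 0),
             group = id,
             LEVEL = level)
}
toy.df <- rbind(square(0, "a.1", 1), square(1, "b.1", 4))

ggp <- ggplot(toy.df, aes(x = long, y = lat, group = group)) +
  geom_polygon(aes(fill = LEVEL)) +            # filled interiors
  geom_path(color = "grey30", linetype = 2) +  # dashed outlines from the same rows
  coord_equal() +
  scale_fill_gradient(low = "#ffffcc", high = "#ff4444", na.value = "grey50")
print(ggp)
```

Both layers read the same rows, so no separate line shapefile is involved.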