This article covers how to order data so that it can be drawn as a barplot in ggplot2.

Problem description

I need to build a barplot of my data showing bacterial relative abundance in different samples (in the complete dataset, each column should sum to 1).

A subset of my data:

    > mydata
    Taxon                                   CD6          CD1          CD12
    Actinomycetaceae;g__Actinomyces         0.031960309  0.066683743  0.045638509
    Coriobacteriaceae;g__Atopobium          0.018691589  0.003244536  0.00447774
    Corynebacteriaceae;g__Corynebacterium   0.001846083  0.006403689  0.000516662
    Micrococcaceae;g__Rothia                0.001730703  0.000426913  0.001894429
    Porphyromonadaceae;g__Porphyromonas     0.073497173  0.065915301  0.175406872

What I'd like to have is one bar for each sample (CD6, CD1, CD12), where the y values are the relative abundances of the bacterial taxa (the Taxon column).

I think (but I'm not sure) that my data format is not right for this plot, since I don't have a variable to group by, unlike the examples I found.

Is there a way to rearrange my data so that it is valid input for the plotting code? Or how can I modify it?

Thanks!

Solution

Do you want something like this?

    # sample data (whitespace-separated; the leading blank line is skipped)
    df <- read.table(header = TRUE, text = "
    Taxon CD6 CD1 CD12
    Actinomycetaceae;g__Actinomyces 0.031960309 0.066683743 0.045638509
    Coriobacteriaceae;g__Atopobium 0.018691589 0.003244536 0.00447774
    Corynebacteriaceae;g__Corynebacterium 0.001846083 0.006403689 0.000516662
    Micrococcaceae;g__Rothia 0.001730703 0.000426913 0.001894429
    Porphyromonadaceae;g__Porphyromonas 0.073497173 0.065915301 0.175406872")

    # convert wide data format to long format: one row per (Taxon, sample) pair
    library(reshape2)
    df.long <- melt(df, id.vars = "Taxon",
                    measure.vars = grep("CD\\d+", names(df), value = TRUE),
                    variable.name = "sample",
                    value.name = "value")

    # calculate proportions so that the values within each sample sum to 1
    library(plyr)
    df.long <- ddply(df.long, .(sample), transform, value = value / sum(value))

    # order samples by numeric id (CD1, CD6, CD12)
    df.long$sample <- reorder(df.long$sample,
                              as.numeric(sub("CD", "", df.long$sample)))

    # plot using ggplot
    library(ggplot2)
    ggplot(df.long, aes(x = sample, y = value, fill = Taxon)) +
      geom_bar(stat = "identity") +
      # add manual colors, one hue per taxon; unique() is used instead of
      # levels() because Taxon is read as character, not factor, in R >= 4.0
      scale_fill_manual(values = scales::hue_pal(h = c(0, 360) + 15,
                                                 c = 100, l = 65,
                                                 h.start = 0,
                                                 direction = 1)(length(unique(df$Taxon))))
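
The reshaping steps in the answer can also be sketched in base R, for readers who would rather not depend on reshape2 and plyr. This is only a minimal sketch and not part of the original answer: the names sample_cols, props and df_long are illustrative, the data frame is retyped from the sample data in the question, and geom_col() is used as the ggplot2 shorthand for geom_bar(stat = "identity").

```r
# Sample data retyped from the question (five taxa, three samples).
df <- data.frame(
  Taxon = c("Actinomycetaceae;g__Actinomyces",
            "Coriobacteriaceae;g__Atopobium",
            "Corynebacteriaceae;g__Corynebacterium",
            "Micrococcaceae;g__Rothia",
            "Porphyromonadaceae;g__Porphyromonas"),
  CD6  = c(0.031960309, 0.018691589, 0.001846083, 0.001730703, 0.073497173),
  CD1  = c(0.066683743, 0.003244536, 0.006403689, 0.000426913, 0.065915301),
  CD12 = c(0.045638509, 0.00447774, 0.000516662, 0.001894429, 0.175406872)
)

# Sample columns are assumed to be named "CD" followed by digits.
sample_cols <- grep("^CD[0-9]+$", names(df), value = TRUE)

# Rescale each sample column so that it sums to 1.
props <- sweep(as.matrix(df[sample_cols]), 2, colSums(df[sample_cols]), "/")

# Stack the columns into long format: one row per (Taxon, sample) pair.
# as.vector() walks the matrix column by column, which matches rep(..., each = ).
df_long <- data.frame(
  Taxon  = rep(df$Taxon, times = length(sample_cols)),
  sample = rep(sample_cols, each = nrow(df)),
  value  = as.vector(props)
)

# Order the sample factor by numeric id: CD1, CD6, CD12.
ids <- as.numeric(sub("CD", "", sample_cols))
df_long$sample <- factor(df_long$sample, levels = sample_cols[order(ids)])

# Build the stacked barplot only if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df_long, aes(x = sample, y = value, fill = Taxon)) +
    geom_col()
}
```

Because the values are rescaled per sample, each bar has a total height of 1 for this five-taxon subset; on the complete dataset, where the columns already sum to 1, the rescaling step leaves the values unchanged.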