Loop to perform calculations across rows on specific columns matching a pattern (in a data frame)?

Problem description
I have a data frame with some boolean values (1/0) as follows (sorry, I couldn't work out how to make this into a smart table):

           Flag1.Sam Flag2.Sam Flag3.Sam Flag1.Ted Flag2.Ted Flag3.Ted
    probe1         0         1         0         1         0         0
    probe2         0         0         0         0         0         0
    probe3         1         0         0         0         0         0
    probe4         0         0         0         0         0         0
    probe5         1         1         0         1         0         0

I have 64 samples (Sam/Ted....etc) which are in a list called files, i.e.:

    files <- c("Sam", "Ted", "Ann", ....)

I would like to create a column summing the flag values for each sample, to produce the following:

                   Sam Ted
    probe1.flagsum   1   1
    probe2.flagsum   0   0
    probe3.flagsum   1   0
    probe4.flagsum   0   0
    probe5.flagsum   2   1

I am fairly new to R and trying to learn on a need-to-know basis, but I have tried the following:

    for(i in files) {
      FLAGS$i <- cbind(sapply(i, function(y) {
        #greping columns to filter for one sample
        filter1 <- grep(names(filters), pattern=y)
        #print out the summed values for those columns
        FLAGS$y <-rowSums(filters[,(filter1)])
      }}

The above code does not work and I am a bit lost as to how to move forward.

Can anyone help me untangle this problem or point me in the right direction of the commands/tools to use?

Thank you.
Solution

This is easily doable with base R's reshape(), though the reshape or reshape2 packages might be more intuitive.

Here's a solution in base R:

    # Here's your data in its current form
    dat = read.table(header=TRUE, text="Flag1.Sam Flag2.Sam Flag3.Sam Flag1.Ted Flag2.Ted Flag3.Ted
    probe1 0 1 0 1 0 0
    probe2 0 0 0 0 0 0
    probe3 1 0 0 0 0 0
    probe4 0 0 0 0 0 0
    probe5 1 1 0 1 0 0")

    # Generate an ID column
    dat$id = row.names(dat)

    # Reshape wide to long
    # (the timevar column, named "probe" here, actually holds the sample name)
    r.dat = reshape(dat, direction="long", timevar="probe", varying=1:6, sep=".")

    # Calculate row sums
    r.dat$sum = rowSums(r.dat[3:5])

    # Reshape back to wide format, dropping what you're not interested in
    reshape(r.dat, direction="wide", idvar="id", timevar="probe", drop=3:5)
    ##                id sum.Sam sum.Ted
    ## probe1.Sam probe1       1       1
    ## probe2.Sam probe2       0       0
    ## probe3.Sam probe3       1       0
    ## probe4.Sam probe4       0       0
    ## probe5.Sam probe5       2       1

More than one way to skin a cat

You can also whip up a function like this one:

    myFun = function(data, varnames) {
      temp = vector("list", length(varnames))
      for (i in seq_along(varnames)) {
        # Sum, row by row, the columns whose names contain this sample name.
        # Uses the `data` argument rather than the global `dat`.
        temp[[i]] = rowSums(data[grep(varnames[i], names(data))])
        names(temp)[[i]] = varnames[i]
      }
      data.frame(temp)
    }

Then, making use of the vector of names that you have:

    files = c("Sam", "Ted")
    myFun(dat, files)
    ##        Sam Ted
    ## probe1   1   1
    ## probe2   0   0
    ## probe3   1   0
    ## probe4   0   0
    ## probe5   2   1

Enjoy!
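As a further variation on the function above, the same grep-then-rowSums idea can be written without an explicit loop. This is only a minimal sketch, not part of the original answer: it assumes every column is named `<flag>.<sample>` as in the example data, and the name `flag_sums` is made up for illustration. The pattern is anchored on the `.` separator and the end of the name so that a sample such as "Sam" would not also pick up columns of a hypothetical sample called "Samantha".

```r
# Example data, same values as in the question
dat <- read.table(header = TRUE, text = "Flag1.Sam Flag2.Sam Flag3.Sam Flag1.Ted Flag2.Ted Flag3.Ted
probe1 0 1 0 1 0 0
probe2 0 0 0 0 0 0
probe3 1 0 0 0 0 0
probe4 0 0 0 0 0 0
probe5 1 1 0 1 0 0")

files <- c("Sam", "Ted")

# For each sample name, find its columns and sum them row by row.
# sapply() binds the per-sample vectors into a matrix, one column per sample.
flag_sums <- sapply(files, function(s) {
  cols <- grep(paste0("\\.", s, "$"), names(dat))  # e.g. matches "Flag1.Sam"
  rowSums(dat[, cols, drop = FALSE])               # drop = FALSE keeps one-column cases safe
})

# Optional: label the rows the way the question asks for
rownames(flag_sums) <- paste0(rownames(flag_sums), ".flagsum")

flag_sums
##                Sam Ted
## probe1.flagsum   1   1
## probe2.flagsum   0   0
## probe3.flagsum   1   0
## probe4.flagsum   0   0
## probe5.flagsum   2   1
```

The result is a matrix; wrap it in as.data.frame() if a data frame is needed. With the full vector of 64 sample names in files, the same call should produce one column per sample, provided the naming convention holds.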
10-20 16:21