For loop: data frames of different sizes in R
Problem description

I'm working on a data frame (tbl1 in the code below) that looks like this:

    shape id  day      hour  week id  footfall  category          area name
    22496     22/3/14  3     12       634       Work cluster      CBD area 1
    22670     22/3/14  3     12       220       Shopping cluster  Orchard Road 1
    23287     22/3/14  3     12       723       Airport           Changi Airport 2
    16430     22/3/14  4     12       947       Work cluster      CBD area 2
    4697      22/3/14  3     12       220       Residential area  Ang Mo Kio 2
    4911      22/3/14  3     12       1001      Shopping cluster  Orchard Rd 3
    11126     22/3/14  3     12       220       Residential area  Ang Mo Kio 2

and so on, for 635 rows.

The second dataset that I want to compare it with (tbl2 in the code below) looks like this:

    category          Foreigners  Locals
    Work cluster      1600000     3623900
    Shopping cluster  1800000     3646666.667
    Airport           15095152    8902705
    Residential area  527700      280000

There is also a third dataset, previousHour, that I want to compare against. The first and second datasets share the attribute category, and the first and third datasets share the attribute hour.

previousHour is based on category. For example, for the work cluster, previousHour should look like this:

    hour
    0
    3
    4
    4
    4
    5

and so on, for 144 rows per category. For the shopping category, previousHour should look like this:

    hour
    0
    3
    3
    4
    4
    5

and so on, for 144 rows. The airport and residential categories also have 144 rows each.

SumHour dataset:

       category          sumHour
    1  Airport           2208
    2  Residential area  1656
    3  Shopping cluster  1656
    4  Work cluster      1656

Here is what I ideally want to compute in R:

    # for n in 1:number of rows {
    #   calculate sumHours (in SumHours dataset) - previousHour = newHourSum and store it as newHourSum
    #   calculate hour / (newHourSum - previousHour) * Foreigners and store it as footfallHour
    #   add to the empty data frame
    # }

I'm not sure how to do that. Here is what I tried:

    mergetbl <- function(tbl1, tbl2) {
      newtbl = data.frame(hour = numeric(), forgHour = numeric())
      ntbl1rows <- nrow(tbl1)  # get the number of rows
      for (n in 1:ntbl1rows) {
        # for n in 1:number of rows {
        #   check the previous hour from IDA dataset !!!!
        #   calculate sumDate - previousHour = newHourSum and store it as newHourSum
        #   calculate hour / (newHourSum - previousHour) * Foreigners and store it as footfallHour
        #   add to the empty data frame
        # }
        newHourSum <- 3588 - tbl1
        footfallHour <- (tbl1$hour / (newHourSum - previousHour)) * tbl2$Foreigners
        newtbl <- rbind(newtbl, footfallHour)
      }
    }

But nothing happened to newtbl. Here is what newtbl should ideally look like:

    hour  forgHour
    0     1337.79   (the function should calculate this)
    3     ...
    3     ...
    3     ...
    4     ...
    3     ...

and so on.
Solution

Thinking in terms of vectors gives the following. Try this:

    ### Expand Foreigners/Locals so that they have the same length as tbl1
    Foreigners = ifelse(tbl1$category == "Work cluster", tbl2$Foreigners[1],
                 ifelse(tbl1$category == "Shopping cluster", tbl2$Foreigners[2],
                 ifelse(tbl1$category == "Airport", tbl2$Foreigners[3],
                        tbl2$Foreigners[4])))
    Locals = ifelse(tbl1$category == "Work cluster", tbl2$Locals[1],
             ifelse(tbl1$category == "Shopping cluster", tbl2$Locals[2],
             ifelse(tbl1$category == "Airport", tbl2$Locals[3],
                    tbl2$Locals[4])))

And now, the function:

    resultHour = function(tbl1, tbl2, ForeOrLoca) {
      # previousHour[i] is the hour of the preceding row; the first row gets 0
      previousHour = rep(0, nrow(tbl1))
      for (i in 2:nrow(tbl1)) {
        previousHour[i] = tbl1$hour[i - 1]
      }
      ### The conditional sum matching the category from tbl1
      ### (R is case-sensitive, so the name must match its use below)
      newHourSum = ifelse(tbl1$category == "Work cluster",
                          sum(with(tbl1, hour * I(category == "Work cluster"))),
                   ifelse(tbl1$category == "Shopping cluster",
                          sum(with(tbl1, hour * I(category == "Shopping cluster"))),
                   ifelse(tbl1$category == "Airport",
                          sum(with(tbl1, hour * I(category == "Airport"))),
                          sum(with(tbl1, hour * I(category == "Residential area"))))))
      ## and finally, this
      hour = as.vector(tbl1$hour)
      footfallHour <- (hour / (newHourSum - previousHour)) * ForeOrLoca
      newtbl <- cbind(hour, footfallHour)
      return(newtbl)
    }

This is the output I get:

    > head(newtbl)
         hour footfallHour
    [1,]    3    1337.7926
    [2,]    3    1506.2762
    [3,]    3   12631.9264
    [4,]    4    1785.2162
    [5,]    3     441.7132
    [6,]    3    1506.2762

Using the function (pass Foreigners or Locals as the third argument):

    TheResultIWant = resultHour(tbl1, tbl2, Foreigners)
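The nested ifelse() calls in the answer depend on the row order of tbl2. One possible alternative is to look each category up with match(), which avoids both the nesting and the explicit loop. The following is only a minimal sketch, not the answerer's code: footfallByHour, sumHourTbl, and the column argument are hypothetical names, and it assumes the per-category totals are read from the SumHour table shown in the question rather than recomputed from tbl1. The sample rows are the seven rows shown in the question.

```r
# Sample rows copied from the question (the real tbl1 has 635 rows).
tbl1 <- data.frame(
  hour = c(3, 3, 3, 4, 3, 3, 3),
  category = c("Work cluster", "Shopping cluster", "Airport", "Work cluster",
               "Residential area", "Shopping cluster", "Residential area"),
  stringsAsFactors = FALSE
)
tbl2 <- data.frame(
  category = c("Work cluster", "Shopping cluster", "Airport", "Residential area"),
  Foreigners = c(1600000, 1800000, 15095152, 527700),
  Locals = c(3623900, 3646666.667, 8902705, 280000),
  stringsAsFactors = FALSE
)
# Assumption: per-category totals come from the SumHour table in the question.
sumHourTbl <- data.frame(
  category = c("Airport", "Residential area", "Shopping cluster", "Work cluster"),
  sumHour = c(2208, 1656, 1656, 1656),
  stringsAsFactors = FALSE
)

# Hypothetical helper; column selects "Foreigners" or "Locals" from tbl2.
footfallByHour <- function(tbl1, tbl2, sumHourTbl, column = "Foreigners") {
  # match() returns, for each row of tbl1, the row index of its category in the
  # lookup table, so the lookup does not depend on the order of the rows.
  weight <- tbl2[[column]][match(tbl1$category, tbl2$category)]
  hourSum <- sumHourTbl$sumHour[match(tbl1$category, sumHourTbl$category)]

  # Hour of the preceding row, with 0 for the first row (same rule as the loop
  # in the answer above).
  previousHour <- c(0, head(tbl1$hour, -1))

  data.frame(hour = tbl1$hour,
             footfallHour = tbl1$hour / (hourSum - previousHour) * weight)
}

result <- footfallByHour(tbl1, tbl2, sumHourTbl)
print(head(result))
```

Returning a data frame from the function and assigning it at the call site matters here: objects created inside a function are local to it, which is why newtbl appeared unchanged in the original attempt. Because the totals in this sketch come from the SumHour table, its numbers will differ from the output quoted in the answer.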