transforms and transformFunc produce different results in rxDataStep

Question

I would like to add a new column to an xdf file. I tested both transforms and transformFunc in rxDataStep.

This line of code works fine for me:

rxDataStep(nyc_jan_xdf,transforms = list(newCol5=ifelse(payment_type==1,10,20)))

But if I use transformFunc:

CashVsCard<-function(x)
{
  if(x$payment_type==1){
    x$newCol13=10
  } else {
    x$newCol13=20
  }
  return(x)
}
rxDataStep(nyc_jan_xdf,transformFunc = CashVsCard)

it doesn't work and returns this error:

Error in doTryCatch(return(expr), name, parentenv, handler) : 
  The variable 'newCol13' has a different number of rows than other columns in the data: 1 vs. 10
In addition: Warning message:
In if (x$payment_type == 1) { :
  the condition has length > 1 and only the first element will be used

Why doesn't transformFunc work?

A sample of my data:

structure(list(VendorID = c(2L, 2L, 2L, 1L, 1L, 1L), tpep_pickup_datetime = c("2016-01-01 00:00:00", 
"2016-01-01 00:00:00", "2016-01-01 00:00:03", "2016-01-01 00:00:04", 
"2016-01-01 00:00:05", "2016-01-01 00:00:06"), tpep_dropoff_datetime = c("2016-01-01 00:00:00", 
"2016-01-01 00:00:00", "2016-01-01 00:15:49", "2016-01-01 00:14:32", 
"2016-01-01 00:14:27", "2016-01-01 00:04:44"), passenger_count = c(5L, 
1L, 6L, 1L, 2L, 1L), trip_distance = c(4.90000009536743, 10.539999961853, 
2.4300000667572, 3.70000004768372, 2.20000004768372, 1.70000004768372
), pickup_longitude = c(-73.9807815551758, -73.9845504760742, 
-73.9693298339844, -74.0043029785156, -73.9919967651367, -73.9821014404297
), pickup_latitude = c(40.7299118041992, 40.6795654296875, 40.7635383605957, 
40.7422409057617, 40.718578338623, 40.7746963500977), RatecodeID = c(1L, 
1L, 1L, 1L, 1L, 1L), store_and_fwd_flag = c("N", "N", "N", "N", 
"N", "Y"), dropoff_longitude = c(-73.9444732666016, -73.9502716064453, 
-73.9956893920898, -74.0073623657227, -74.0051345825195, -73.9709396362305
), dropoff_latitude = c(40.7166786193848, 40.7889251708984, 40.7442512512207, 
40.7069358825684, 40.7399444580078, 40.7967071533203), payment_type = c(1L, 
1L, 1L, 1L, 1L, 1L), fare_amount = c(18, 33, 12, 14, 11, 7), 
    extra = c(0.5, 0.5, 0.5, 0.5, 0.5, 0.5), mta_tax = c(0.5, 
    0.5, 0.5, 0.5, 0.5, 0.5), tip_amount = c(0, 0, 3.99000000953674, 
    3.04999995231628, 1.5, 1.64999997615814), tolls_amount = c(0, 
    0, 0, 0, 0, 0), improvement_surcharge = c(0.300000011920929, 
    0.300000011920929, 0.300000011920929, 0.300000011920929, 
    0.300000011920929, 0.300000011920929), total_amount = c(19.2999992370605, 
    34.2999992370605, 17.2900009155273, 18.3500003814697, 13.8000001907349, 
    9.94999980926514)), .Names = c("VendorID", "tpep_pickup_datetime", 
"tpep_dropoff_datetime", "passenger_count", "trip_distance", 
"pickup_longitude", "pickup_latitude", "RatecodeID", "store_and_fwd_flag", 
"dropoff_longitude", "dropoff_latitude", "payment_type", "fare_amount", 
"extra", "mta_tax", "tip_amount", "tolls_amount", "improvement_surcharge", 
"total_amount"), row.names = c(NA, 6L), class = "data.frame")

Recommended answer

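I have found it. It is not the best solution, but it works. As the warning in the question shows, x$payment_type holds more than one value when the function is called, so a plain if only looks at the first element and the new column ends up with a single value instead of one per row. I only needed to change the function so that it handles the rows one at a time: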

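CashVsCard <- function(x)
{
  # x$payment_type is a vector with one value per row
  p <- length(x$payment_type)
  # preallocate the new column so it has one value per row
  x$cash_vs_Card4 <- character(p)
  for (i in seq_len(p))
  {
    # test one row at a time, so the if condition has length 1
    if (x$payment_type[i] == 1)
    {
      x$cash_vs_Card4[i] <- "Card"
    } else {
      x$cash_vs_Card4[i] <- "Others"
    }
  }
  return(x)
}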
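
A vectorized version would avoid the explicit loop, in the same way the working transforms call in the question uses ifelse. The following is only a minimal sketch and has not been run against an actual xdf file. It assumes that transformFunc receives each chunk as a list of equal-length column vectors, which is what the warning and the "1 vs. 10" error in the question suggest. The names CashVsCardVec and chunk are hypothetical and exist only for illustration; the chunk is built by hand so the sketch runs in plain R without RevoScaleR.

```r
# Vectorized variant of the transform function. ifelse() evaluates the
# condition for every row and returns a vector of the same length,
# so the new column gets one value per row.
CashVsCardVec <- function(x) {
  # Assumption: x is a list of equal-length column vectors (one chunk of rows)
  x$cash_vs_Card4 <- ifelse(x$payment_type == 1, "Card", "Others")
  x
}

# Hand-built stand-in for one chunk (hypothetical data, not from the xdf file)
chunk <- list(payment_type = c(1L, 2L, 1L, 3L))
result <- CashVsCardVec(chunk)
print(result$cash_vs_Card4)

# With RevoScaleR loaded, the call would presumably mirror the one in the question:
# rxDataStep(nyc_jan_xdf, transformFunc = CashVsCardVec)
```

The design point is that the function must return a column with as many values as the chunk has rows; ifelse does that in one step, while if/else only ever produces a single value.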


09-18 05:36