Creating and merging zoo time series objects from CSV files in R

Question

I have a large set of CSV files in a single directory. Each file contains two columns, Date and Price, and is named <filename>.csv, where <filename> is the unique identifier of the data series. I understand that missing values in merged data series can be handled when the time series are zoo objects. I also understand that, by using na.locf(merge()), I can fill in the missing values with the most recent observations.

I want to automate the following process:

  1. loading the Date and Price columns of each *.csv file into R data frames.
  2. establishing each distinct time series within the merged zoo "portfolio of time series" object under an identity equal to its <filename>.
  3. merging these zoo time series using MergedData <- na.locf(merge( )).

The ultimate goal, of course, is to use the fPortfolio package.

I've used the following statement to create a list of data frames of Date, Price pairs. The problem with this approach is that I lose the <filename> identifier of each time series.

  result <- lapply(files, read.csv)

I understand that I can write code to generate the R statements required to do all these steps instance by instance. I'm wondering if there is some approach that wouldn't require me to do that. It's hard for me to believe that others haven't wanted to perform this same task.

Recommended answer

You can get a more convenient result using sapply, which keeps the file names. Here I will stick with lapply.
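As a minimal sketch of the sapply variant mentioned above: the demo directory, the file names zoo_AAA.csv and zoo_BBB.csv, and the variable names are all made up for illustration, and read.csv stands in for read.zoo so the snippet needs only base R. Passing simplify = FALSE is an assumption on my part; it keeps the result a plain named list instead of letting sapply try to collapse it.

```r
# Hypothetical demo directory with two tiny CSV files (names are illustrative)
demo_dir <- file.path(tempdir(), "sapply_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(Date = c("2020-01-01", "2020-01-02"), Price = c(10, 11)),
          file.path(demo_dir, "zoo_AAA.csv"), row.names = FALSE)
write.csv(data.frame(Date = c("2020-01-02", "2020-01-03"), Price = c(20, 21)),
          file.path(demo_dir, "zoo_BBB.csv"), row.names = FALSE)

files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Name the input vector by series identifier (file name without extension);
# sapply carries those names over to the elements of its result.
names(files) <- tools::file_path_sans_ext(basename(files))
frames <- sapply(files, read.csv, simplify = FALSE)

names(frames)
```

Each element can then be looked up by its identifier, for example frames[["zoo_AAA"]].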

  1. Assuming that all your files are in the same directory, you can use list.files, which is very handy for this kind of workflow.
  2. I would use read.zoo to get zoo objects directly (this avoids coercing later).

For example:

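library(zoo)

## MY_FILES_DIRECTORY is a placeholder for the path to your data directory
zoo.objs <- lapply(list.files(path = MY_FILES_DIRECTORY,
                              pattern = '^zoo_.*\\.csv$', ## csv files whose names start with zoo_
                              full.names = TRUE),         ## return path + filename
                   read.zoo, header = TRUE, sep = ',')    ## files have a header row and comma separators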

I now use list.files again, this time without full.names, to name the elements of the result:

names(zoo.objs) <- list.files(path = MY_FILES_DIRECTORY,
                              pattern = '^zoo_.*\\.csv$')
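
The answer stops before the merge step described in the question. A minimal sketch of how the named list might then be merged and forward-filled is below. The temporary directory, the sample file names and prices, the explicit date format, and the use of do.call and na.rm = FALSE are all assumptions made for illustration, not part of the original answer.

```r
library(zoo)

# Hypothetical demo data: two small CSV files in a temporary directory
demo_dir <- file.path(tempdir(), "zoo_merge_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(Date = c("2020-01-01", "2020-01-02", "2020-01-04"),
                     Price = c(10, 11, 12)),
          file.path(demo_dir, "zoo_AAA.csv"), row.names = FALSE)
write.csv(data.frame(Date = c("2020-01-02", "2020-01-03"),
                     Price = c(20, 21)),
          file.path(demo_dir, "zoo_BBB.csv"), row.names = FALSE)

files <- list.files(demo_dir, pattern = "^zoo_.*\\.csv$", full.names = TRUE)

# Read each file as a zoo series indexed by its Date column
zoo.objs <- lapply(files, read.zoo, header = TRUE, sep = ",",
                   format = "%Y-%m-%d")

# Use the file name without its extension as the series identifier
names(zoo.objs) <- tools::file_path_sans_ext(basename(files))

# merge() expects the series as separate arguments, so pass the named list
# through do.call(); the list names become the column names
merged <- do.call(merge, zoo.objs)

# Carry the last observation forward; na.rm = FALSE keeps leading NAs
# instead of dropping the rows that contain them
MergedData <- na.locf(merged, na.rm = FALSE)
```

Stripping the extension is a design choice: it makes each column name equal to the <filename> identifier the question asks for, rather than to the full file name.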
