Removing NA values from a data frame or time-series object in R

Question

I read in some data via:

it.data <- read.csv("inputData/rstar.data.it.csv", header = T, sep = ",")

The second and fourth columns are inflation and interest, respectively:

inflation.it <- it.data[2]

interest.it <- it.data[4]

However, the trouble starts when I try to turn the data into a time-series object, because the columns contain leading and trailing NA values. I have tried na.omit(), it.data[complete.cases(it.data), ], and na.contiguous(), without luck. When I try to convert the data into a ts object,

inflation.ts.it <- ts(inflation.it, frequency = 4, interest.start)

I get very strange values that do not match the original data.

Thanks.

PS. The data (I did not post everything; this is just to give an idea):

     gdp.log       inflation  inflation.expectations     interest
1          .    2.4361259655                       .            .
2          .    2.9997029997                       .            .
3          .    1.5169194865                       .            .
4          .    1.5059368664        2.11467132957868            .
5          .    2.0591647331        2.02043102148892            .
6          .    1.9896193771        1.76791011585382            .
7          .    2.6436781609        2.04959978443843            .
8          .    3.3951497860        2.52190301432020            .
9          .    4.5467462347        3.14379838970698            .
10         .    5.0890585241        3.91865817645959            .
11         .    5.7110862262        4.68551019278066            .
12         .    7.7262693156        5.76829007519398            .
13         .    7.5292198967        6.51390849069030            .
14         .    6.9679849340        6.98364009316870            .
15         .    7.6006355932        7.45602743492283            .
16         .    5.6352459016        6.93327158141434            .
17         .    5.4853387259        6.42230128873304            .
18         .    6.6649899396        6.34655254012084            .
19         .    5.8577405857        5.91082878825926            .
20         .    5.5528612997        5.89023263777669            .
21         .    4.9125329499        5.74703119375926            .
22         .    4.2442820089        5.14185421108985            .

Recommended answer

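Assuming the dots are in the original CSV, you can fix this by passing "." in the na.strings argument when reading the file. Without it, any column that contains a dot is read as text (a factor in R versions before 4.0.0) rather than as numbers, and converting such a column to a time series most likely yields integer codes instead of the original values, which would explain the strange output.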

read.csv(text=
"gdp.log,inflation,inflation.expectations,interest
.,2.4361259655377,.,.
.,2.99970299970301,.,.
.,1.5169194865811,.,.
.,1.50593686649291,2.11467132957868,.
.,2.05916473317866,2.02043102148892,.
.,1.9896193771626,1.76791011585382,.
.,2.64367816091953,2.04959978443843,.",
header=TRUE, na.strings=c(".", "NA"))
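A quick way to confirm the effect is to compare the column classes with and without na.strings. The following is a minimal sketch on a made-up two-column sample (the names csv_text, raw, and fixed are illustrative and not part of the original question):

```r
# Hypothetical two-column sample; "." marks a missing value.
csv_text <- "inflation,interest
2.4361259655,.
2.9997029997,1.5
1.5169194865,."

# Default read: the dots force the interest column to be parsed as text.
raw <- read.csv(text = csv_text, header = TRUE)

# With na.strings: the dots become NA and the column stays numeric.
fixed <- read.csv(text = csv_text, header = TRUE, na.strings = c(".", "NA"))

print(sapply(raw, class))     # class of each column in the default read
print(sapply(fixed, class))   # class of each column in the fixed read
print(colSums(is.na(fixed)))  # number of missing values per column
```

If a column that should be numeric still shows up as character or factor, some other missing-value code is probably present in the file.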

na.strings can be a vector of character strings, in case several codes are used for missing values.
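Once the columns are numeric, the leading and trailing NA values can be dropped after building the time series, so that the start date is adjusted automatically. The sketch below is one possible way to do this with base R only. The inline sample data and the start date of 1970 Q1 are assumptions made for illustration, since the question does not state when the series begins.

```r
# Hypothetical inline sample standing in for the real CSV file.
csv_text <- "gdp.log,inflation,inflation.expectations,interest
.,.,.,.
.,2.4361259655,.,.
.,2.9997029997,.,.
.,1.5169194865,2.11467132957868,.
.,1.5059368664,2.02043102148892,.
.,.,.,."

# Treat both "." and "NA" as missing so the columns are parsed as numeric.
it.data <- read.csv(text = csv_text, header = TRUE, na.strings = c(".", "NA"))

# Build a quarterly series over the full column, NAs included.
# The start date c(1970, 1) is an assumption, not taken from the question.
inflation.full <- ts(it.data$inflation, start = c(1970, 1), frequency = 4)

# Keep the longest run of consecutive non-missing values.
# On a ts object, na.contiguous() also updates the start and end times.
inflation.ts.it <- na.contiguous(inflation.full)

print(inflation.ts.it)
print(start(inflation.ts.it))
```

Note that na.contiguous() keeps only the longest stretch without gaps, so if a series also has NA values in the middle, the shorter stretches are discarded along with the leading and trailing NAs.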
