This article covers what to do when read.csv in R does not read all the rows of a file.

Problem description
This is the input file: http://www.yourfilelink.com/get.php?fid=841283

I executed:

    options(stringsAsFactors = FALSE)
    x <- read.csv("test1.csv", header = FALSE, sep = "'")

The result is here: http://www.yourfilelink.com/get.php?fid=841284

Instead of 135 rows, I get only 7. The number of columns, 13, is correct. x[6,10] also holds the content of the rows that follow it, separated by \n within the string. Please help me with this. I am stuck on this problem! :/

Solution

The described symptom, an extremely long item containing multiple "\n"s, suggests that you are probably dealing with unmatched quotes. If a name or address entry contains a quote mark, the parser waits for the next quote mark before it considers the entry complete. Try:

    x <- read.csv("test1.csv", header = FALSE, sep = "'", quote = "")

That didn't actually work on the file I downloaded. (Note that read.csv is read.table with different defaults, including sep = "," and fill = TRUE, so the sep argument given here is passed through.) I needed to first use count.fields with that separator and then read.table with fill = TRUE. The results were still a bit messed up, with several columns populated by commas, but at least there is something to work with:

    table(count.fields("~/Downloads/test1.txt", sep = "'", quote = ""))

     10  13
      5 130

    x <- read.table("~/Downloads/test1.txt", header = FALSE, sep = "'", quote = "", stringsAsFactors = FALSE, skip = 5)
    # Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
    #   line 6 did not have 13 elements

    x <- read.table("~/Downloads/test1.txt", header = FALSE, sep = "'", quote = "", stringsAsFactors = FALSE, fill = TRUE)
    str(x)
    #########################################################
    'data.frame': 135 obs. of 13 variables:
     $ V1 : chr "INSERT INTO message VALUES (52," "INSERT INTO message VALUES (53," "INSERT INTO message VALUES (54," "INSERT INTO message VALUES (55," ...
     $ V2 : chr "[email protected]" "[email protected]" "[email protected]" "[email protected]" ...
     $ V3 : chr "," "," "," "," ...
     $ V4 : chr "2000-01-21 04:51:00" "2000-01-24 01:37:00" "2000-01-24 02:06:00" "2000-02-02 10:21:00" ...
     $ V5 : chr "," "," "," "," ...
     $ V6 : chr "<12435833.1075863606729.JavaMail.evans@thyme>" "<29664079.1075863606676.JavaMail.evans@thyme>" "<15300605.1075863606629.JavaMail.evans@thyme>" "<10522232.1075863606538.JavaMail.evans@thyme>" ...
     $ V7 : chr "," "," "," "," ...
     $ V8 : chr "ENRON HOSTS ANNUAL ANALYST CONFERENCE PROVIDES BUSINESS OVERVIEW AND GOALS FOR 2000" "Over $50 -- You made it happen!" "Over $50 -- You made it happen!" "ROAD-SHOW.COM Q4i.COM CHOOSE ENRON TO DELIVER FINANCIAL WEB CONTENT" ...
     $ V9 : chr "," "," "," "," ...
     $ V10: chr "HOUSTON - Enron Corp. hosted its annual equity analyst conference today in==20Houston. Ken Lay, Enron chairman and chief execu"| __truncated__ "On Wall Street, people are talking about Enron. At Enron, we re talking=20about people...our people. You are the driving forc"| __truncated__ "On Wall Street, people are talking about Enron. At Enron, we re talking=20about people...our people. You are the driving forc"| __truncated__ "HOUSTON =01) Enron Broadband Services (EBS), a wholly owned subsidiary of E=nron=20Corp. and a leader in the delivery of high-b"| __truncated__ ...
     $ V11: chr "" "," "," "," ...
     $ V12: chr "" "Robert_Badeer_Aug2000Notes FoldersPress releases" "Robert_Badeer_Aug2000Notes FoldersPress releases" "Robert_Badeer_Aug2000Notes FoldersPress releases" ...
     $ V13: chr "" ");" ");" ");" ...

I got better results with a comma as the separator and only the single quote as the quote character, rather than the default of both single and double quotes that read.table uses:

    x2 <- read.table("~/Downloads/test1.txt", header = FALSE, sep = ",", quote = "'", stringsAsFactors = FALSE, fill = TRUE)
    str(x2)
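The unmatched-quote symptom and the count.fields diagnosis can be reproduced on a small scale. The sketch below is only an illustration: the five-line sample file, its contents, and the variable names path, bad, field_counts and good are made up here and are not taken from test1.csv. It assumes a comma-separated file in which two lines each contain a stray double quote and one line is short.

```r
# Hypothetical sample data (not the original test1.csv): five comma-separated
# records; lines 2 and 4 each contain a stray double quote, line 5 is short.
path <- tempfile(fileext = ".csv")
writeLines(c(
  "1,alice,first message",
  "2,bob,said \"hello",
  "3,carol,fine",
  "4,dave,bye\" now",
  "5,erin"
), path)

# Default quoting: the quote opened on line 2 is only closed on line 4,
# so the lines in between are absorbed into a single field.
bad <- read.csv(path, header = FALSE, stringsAsFactors = FALSE)

# Diagnose: count the fields on each line with quoting switched off.
field_counts <- count.fields(path, sep = ",", quote = "")
print(table(field_counts))

# Read with quoting disabled; fill = TRUE pads the short line.
good <- read.table(path, header = FALSE, sep = ",", quote = "",
                   stringsAsFactors = FALSE, fill = TRUE)

cat("rows with default quoting:", nrow(bad), "\n")
cat("rows with quote = \"\":", nrow(good), "\n")
```

With quoting disabled, every physical line becomes one row, which is why tabulating count.fields first is a cheap way to see whether the field counts are uneven before choosing between skip and fill = TRUE.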
10-27 01:18