Pandas read_csv "expected columns" error with a ragged CSV file


Problem description

I have a CSV file with a few hundred rows and 26 columns, but the last few columns have a value in only a few rows, and those rows are towards the middle or end of the file. When I try to read it with read_csv(), I get the following error:
ValueError: Expecting 23 columns, got 26 in row 64

I can't see where to state the number of columns in the file explicitly, or how read_csv determines how many columns it thinks the file should have. The dump is below.

In [3]:

infile =open(easygui.fileopenbox(),"r")
pledge = read_csv(infile,parse_dates='true')


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3-b35e7a16b389> in <module>()
      1 infile =open(easygui.fileopenbox(),"r")
      2
----> 3 pledge = read_csv(infile,parse_dates='true')


C:\Python27\lib\site-packages\pandas-0.8.1-py2.7-win32.egg\pandas\io\parsers.pyc in read_csv(filepath_or_buffer, sep, dialect, header, index_col, names, skiprows, na_values, thousands, comment, parse_dates, keep_date_col, dayfirst, date_parser, nrows, iterator, chunksize, skip_footer, converters, verbose, delimiter, encoding, squeeze)
    234         kwds['delimiter'] = sep
    235
--> 236     return _read(TextParser, filepath_or_buffer, kwds)
    237
    238 @Appender(_read_table_doc)

C:\Python27\lib\site-packages\pandas-0.8.1-py2.7-win32.egg\pandas\io\parsers.pyc in _read(cls, filepath_or_buffer, kwds)
    189         return parser
    190
--> 191     return parser.get_chunk()
    192
    193 @Appender(_read_csv_doc)

C:\Python27\lib\site-packages\pandas-0.8.1-py2.7-win32.egg\pandas\io\parsers.pyc in get_chunk(self, rows)
    779             msg = ('Expecting %d columns, got %d in row %d' %
    780                    (col_len, zip_len, row_num))
--> 781             raise ValueError(msg)
    782
    783         data = dict((k, v) for k, v in izip(self.columns, zipped_content))

ValueError: Expecting 23 columns, got 26 in row 64

Recommended answer

You can use the names parameter. For example, suppose you have a CSV file like this:

1,2,1
2,3,4,2,3
1,2,3,3
1,2,3,4,5,6

If you try to read it, you'll receive an error:

>>> pd.read_csv(r'D:/Temp/tt.csv')
Traceback (most recent call last):
...
Expected 5 fields in line 4, saw 6

But if you pass the names parameter, you'll get a result:

>>> pd.read_csv(r'D:/Temp/tt.csv', names=list('abcdef'))
   a  b  c   d   e   f
0  1  2  1 NaN NaN NaN
1  2  3  4   2   3 NaN
2  1  2  3   3 NaN NaN
3  1  2  3   4   5   6
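
If you don't know the widest row in advance, one option is to scan the file once and build the names list from the result. The following is only a minimal sketch under some assumptions: the sample data is the same four rows as above held in memory, the helper read_ragged_csv and the col0, col1, ... names are hypothetical placeholders, and the file has no header row.

```python
import csv
import io

import pandas as pd

# Same ragged sample as above, kept in memory so the sketch is self-contained.
RAGGED = "1,2,1\n2,3,4,2,3\n1,2,3,3\n1,2,3,4,5,6\n"


def read_ragged_csv(text):
    """Read CSV text whose rows have differing numbers of fields.

    Hypothetical helper for illustration; not part of pandas.
    """
    # First pass: find the widest row using the standard csv module.
    width = max(len(row) for row in csv.reader(io.StringIO(text)))
    # Placeholder column names (an assumption); use real names if you have them.
    names = ["col%d" % i for i in range(width)]
    # Second pass: with enough names supplied, short rows are padded with NaN.
    return pd.read_csv(io.StringIO(text), header=None, names=names)


df = read_ragged_csv(RAGGED)
print(df)
```

For a file on disk, the same two passes can be made by opening the path twice. Columns that end up containing missing values will generally be returned as floating point rather than integer.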

Hope it helps.
