pandas.read_csv cannot import a file whose path contains accented characters

Problem description

I am developing an application with Python and a Qt GUI. I need to import a file into a DataFrame. I use QFileDialog.getOpenFileName to get the path and filename, then open the file with pandas.read_csv. Everything works until the path contains a special character such as "ó"; then pandas.read_csv fails and crashes the app.

I tried to reproduce the error in the console and got the following:

In[2]: import pandas as pd
Backend Qt5Agg is interactive backend. Turning interactive mode on.

In[3]: path1 = 'F:/Software_Proyects/Python/Proyectos/test_read_csv/FlowData.txt'
In[4]: df1 = pd.read_csv(path1, delim_whitespace=True, dtype=object)

In[5]: path2 = 'F:/Software_Proyects/Python/Proyectos/test_read_csv_with_ó/FlowData.txt'
In[6]: df2 = pd.read_csv(path2, delim_whitespace=True, dtype=object)
Traceback (most recent call last):
  File "C:\Program Files (x86)\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2881, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-6-feba8e024d43>", line 1, in <module>
    df2 = pd.read_csv(path2, delim_whitespace=True, dtype=object)
  File "C:\Program Files (x86)\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 646, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "C:\Program Files (x86)\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 389, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "C:\Program Files (x86)\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 730, in __init__
    self._make_engine(self.engine)
  File "C:\Program Files (x86)\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 923, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "C:\Program Files (x86)\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 1390, in __init__
    self._reader = _parser.TextReader(src, **kwds)
  File "pandas\parser.pyx", line 373, in pandas.parser.TextReader.__cinit__ (pandas\parser.c:4184)
  File "pandas\parser.pyx", line 669, in pandas.parser.TextReader._setup_parser_source (pandas\parser.c:8471)
OSError: Initializing from file failed

The output of show_versions() is:

In[7]: pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.0.final.0
python-bits: 32
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.19.2
nose: 1.3.7
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.11.3
scipy: 0.18.1
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: 1.5.1
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: 3.2.2
numexpr: 2.6.1
matplotlib: 2.0.0
openpyxl: 2.4.1
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.7.2
bs4: 4.5.3
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.1.5
pymysql: None
psycopg2: None
jinja2: 2.9.4
boto: 2.45.0
pandas_datareader: None

According to the post "Encoding with pandas.read_csv when file name has accents", this problem was fixed in pandas 0.14.0.

Any recommendations for solving this problem?

Recommended answer

Looking deeper, this behavior occurs only on Windows, with the combination of Python 3.6 and pandas.read_csv.

Python 3.6 changed the Windows filesystem encoding from "mbcs" to "UTF-8"; see PEP 529. Use sys.getfilesystemencoding() to check the current filesystem encoding.
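
As a quick check of which encoding the running interpreter uses, here is a minimal sketch. The helper name describe_fs_encoding is hypothetical and only for illustration; the one real API it relies on is sys.getfilesystemencoding().

```python
import sys


def describe_fs_encoding():
    """Return the filesystem encoding and whether it is the legacy 'mbcs' codec.

    Hypothetical helper for illustration; not part of pandas or the original answer.
    """
    encoding = sys.getfilesystemencoding()
    return encoding, encoding.lower() == "mbcs"


encoding, is_legacy = describe_fs_encoding()
print(f"filesystem encoding: {encoding} (legacy mbcs: {is_legacy})")
```

On Windows with Python 3.6 or later this would normally report UTF-8 unless the legacy mode has been enabled.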

I found a couple of workarounds:

1.- Use this code to make the whole app use the filesystem encoding of Python <= 3.5 ("mbcs"):

import sys
sys._enablelegacywindowsfsencoding()
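
If the same code also has to run on other platforms, the call can be guarded. This is only a sketch: the wrapper name enable_legacy_fs_encoding is hypothetical, and it assumes the switch should be skipped silently wherever sys._enablelegacywindowsfsencoding() is not available.

```python
import sys


def enable_legacy_fs_encoding():
    """Switch to the legacy 'mbcs' filesystem encoding where that is possible.

    Hypothetical wrapper for illustration.
    Returns True if the switch was made, False otherwise.
    """
    # sys._enablelegacywindowsfsencoding() is only present on Windows builds
    # of Python 3.6 and later, so look it up defensively.
    switch = getattr(sys, "_enablelegacywindowsfsencoding", None)
    if sys.platform != "win32" or switch is None:
        return False
    switch()
    return True


# Call this early, before any path is handed to pandas.
changed = enable_legacy_fs_encoding()
print("legacy filesystem encoding enabled:", changed)
```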

2.- Pass a file object to pandas.read_csv instead of a path:

# Python opens the file itself, so pandas never has to resolve the path
with open(path2, 'r') as fp:
    df2 = pd.read_csv(fp, delim_whitespace=True, dtype=object)
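
The second workaround can be wrapped in a small helper and tried against a throwaway directory whose name contains "ó". This is a sketch under assumptions, not the original author's code: the function name read_whitespace_table, the utf-8 default for the file contents, and the demo data are all made up for illustration. It uses sep=r"\s+" in place of delim_whitespace=True, since the two split on the same whitespace and newer pandas releases deprecate the latter.

```python
import os
import tempfile

import pandas as pd


def read_whitespace_table(path, encoding="utf-8"):
    """Read a whitespace-delimited text file by giving pandas an open file object.

    Hypothetical helper; the encoding default is an assumption about the file contents.
    """
    # Python opens the file, so pandas never has to resolve the path itself.
    with open(path, "r", encoding=encoding) as fp:
        return pd.read_csv(fp, sep=r"\s+", dtype=object)


# Demo: write a small made-up file inside a directory whose name contains "ó".
with tempfile.TemporaryDirectory(suffix="_with_ó") as tmp_dir:
    demo_path = os.path.join(tmp_dir, "FlowData.txt")
    with open(demo_path, "w", encoding="utf-8") as out:
        out.write("time flow\n0 1.5\n1 2.5\n")
    df = read_whitespace_table(demo_path)

print(df)
```

Because dtype=object is passed, every value comes back as a string, as in the original calls.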
