UnicodeDecodeError Sentiment140 Kaggle

Problem description

I am trying to read the Sentiment140.csv available on Kaggle: https://www.kaggle.com/kazanova/sentiment140

My code is this:

import pandas as pd
import os

cols = ['sentiment', 'id', 'date', 'query_string', 'user', 'text']
BASE_DIR = ''
df = pd.read_csv(os.path.join(BASE_DIR, 'Sentiment140.csv'), header=None, names=cols)

It gives me a UnicodeDecodeError.

What I would like to know:

1) How do I solve this issue?

2) Based on the error, where can I see which encoding I should use instead of "utf-8"?

3) Will using a different encoding cause me other issues later on?

Thanks in advance.

P.S. I am using Python 3 on a Mac.

Recommended answer
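One way to approach all three questions is to inspect the raw bytes directly. The sketch below is an illustration, not a definitive fix. It assumes the file is simply not valid UTF-8 and that one of a few common single-byte encodings fits. The helper names `first_decode_error` and `pick_encoding`, the candidate list, and the sample bytes are all hypothetical. A `UnicodeDecodeError` carries the offending offset in its `start` attribute, which answers "where can I see it", and trying candidates in order gives a first guess at a replacement encoding.

```python
from typing import Optional, Sequence


def first_decode_error(data: bytes, encoding: str = "utf-8") -> Optional[UnicodeDecodeError]:
    """Return the UnicodeDecodeError raised while decoding, or None if decoding succeeds."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError as exc:
        return exc
    return None


def pick_encoding(data: bytes,
                  candidates: Sequence[str] = ("utf-8", "cp1252", "latin-1")) -> str:
    """Return the first candidate encoding that decodes the data without error."""
    for name in candidates:
        if first_decode_error(data, name) is None:
            return name
    raise ValueError("none of the candidate encodings could decode the data")


# Hypothetical sample standing in for the CSV contents:
# 'café' encoded as latin-1, which is not valid UTF-8.
sample = "café,0\n".encode("latin-1")

err = first_decode_error(sample)
if err is not None:
    # err.object is the input bytes, err.start the offset of the bad byte
    print(f"byte {err.object[err.start]:#04x} at offset {err.start}: {err.reason}")

print(pick_encoding(sample))
```

For the real file, the bytes can be read with `open(path, 'rb').read()`, and the chosen name passed to `pd.read_csv` through its `encoding` parameter, for example `encoding='latin-1'`. On the third question: latin-1 maps every possible byte to some character, so it never raises, but if the file was actually written in another encoding the non-ASCII characters will come out wrong. It is worth spot-checking a few rows that contain accented characters before relying on the result.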