Problem description
I have imported a csv file using pandas.
My dataframe has multiple columns titled "Farm", "Total Apples" and "Good Apples".
The numerical data imported for "Total Apples" and "Good Apples" contains commas to indicate thousands, e.g. 1,200. I want to remove the commas so the data looks like 1200.
The dtype of the "Total Apples" and "Good Apples" columns comes up as object.
I tried using df.str.replace and df.strip, but have not been successful.
I also tried to change the variable type from object to string and from object to integer, but couldn't make it work.
Any help would be greatly appreciated.
****EDIT****
Excerpt of data from the csv file imported using pd.read_csv:
Farm_Name  Total Apples  Good Apples
EM         18,327        14,176
EE         18,785        14,146
IW         635           486
L          33,929        24,586
NE         12,497        9,609
NW         30,756        23,765
SC         8,515         6,438
SE         22,896        17,914
SW         11,972        9,114
WM         27,251        20,931
Y          21,495        16,662
Recommended answer
I think you can add the parameter thousands to read_csv; the values in the columns Total Apples and Good Apples are then converted to integers:
Maybe your separator is different, so don't forget to change it. If the separator is whitespace, change it to sep=r'\s+' (the raw string avoids Python's invalid escape sequence warning).
import pandas as pd
import io
# sample data standing in for the csv file
temp = """Farm_Name;Total Apples;Good Apples
EM;18,327;14,176
EE;18,785;14,146
IW;635;486
L;33,929;24,586
NE;12,497;9,609
NW;30,756;23,765
SC;8,515;6,438
SE;22,896;17,914
SW;11,972;9,114
WM;27,251;20,931
Y;21,495;16,662"""
# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp), sep=";", thousands=",")
print(df)
    Farm_Name  Total Apples  Good Apples
0          EM         18327        14176
1          EE         18785        14146
2          IW           635          486
3           L         33929        24586
4          NE         12497         9609
5          NW         30756        23765
6          SC          8515         6438
7          SE         22896        17914
8          SW         11972         9114
9          WM         27251        20931
10          Y         21495        16662
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 11 entries, 0 to 10
Data columns (total 3 columns):
Farm_Name       11 non-null object
Total Apples    11 non-null int64
Good Apples     11 non-null int64
dtypes: int64(2), object(1)
memory usage: 336.0+ bytes
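If the data is already loaded and re-reading the file is not convenient, a possible alternative (not the approach used above) is to strip the commas column by column. The attempt with df.str.replace in the question most likely failed because .str is an accessor on a Series, not on the whole DataFrame. The sketch below assumes every value in the two columns is a string that becomes a valid integer once the commas are removed; the small frame is made up for illustration and only mimics the excerpt.

```python
import pandas as pd

# Hypothetical frame mimicking a csv that was read without thousands=","
df = pd.DataFrame({
    "Farm_Name": ["EM", "IW", "L"],
    "Total Apples": ["18,327", "635", "33,929"],
    "Good Apples": ["14,176", "486", "24,586"],
})

for col in ["Total Apples", "Good Apples"]:
    # .str works on one column (Series) at a time, not on df itself;
    # regex=False treats "," as a literal character
    df[col] = df[col].str.replace(",", "", regex=False).astype("int64")

print(df.dtypes)
```

If some cells are missing or contain non-numeric text, astype will raise an error, so those values would need to be handled first.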