在数据框中添加列后,没有得到正确的输出文件。这是我的输入文件
Security Wise Delivery Position - Compulsory Rolling Settlement
10,MTO,01022018,592287763,0001583
Trade Date <01-FEB-2018>,Settlement Type <N>,Settlement No <2018023>,Settlement Date <05-FEB-2018>
Record Type,Sr No,Name of Security,Quantity Traded,Deliverable Quantity(gross across client level),% of Deliverable Quantity to Traded Quantity
20,1,20MICRONS,EQ,53466,27284,51.03
20,2,3IINFOTECH,EQ,7116046,3351489,47.10
20,3,3MINDIA,EQ,2613,1826,69.88
20,4,5PAISA,EQ,8463,5230,61.80
20,5,63MOONS,EQ,324922,131478,40.46
预期产量
20,1,20MICRONS,EQ,53466,27284,51.03,01022018
20,2,3IINFOTECH,EQ,7116046,3351489,47.10,01022018
20,3,3MINDIA,EQ,2613,1826,69.88,01022018
20,4,5PAISA,EQ,8463,5230,61.80,01022018
20,5,63MOONS,EQ,324922,131478,40.46,01022018
我的密码
import pandas as pd
df = pd.read_csv('C:/Working/dalal/MTO_11052018.DAT', sep='\t',skiprows=1)
df=df.iloc[1]
l1=list(str(df).split(","))
l2=l1[2]
df2=pd.read_csv('C:/Working/dalal/MTO_11052018.DAT',sep='\t',skiprows=3)
df2['Trans_dt']=df2.apply(lambda row:[l2],axis=1)
df2.to_csv('C:/Working/dalal/deldata/MTO_11052018.OUT',sep=',')
没被期待。请帮忙
最佳答案
我认为第二列需要header=1
,无行需要nrows=0
,第三列仅需要usecols=[2]
:
import pandas as pd
temp=u"""Security Wise Delivery Position - Compulsory Rolling Settlement
10,MTO,01022018,592287763,0001583
Trade Date <01-FEB-2018>,Settlement Type <N>,Settlement No <2018023>,Settlement Date <05-FEB-2018>
Record Type,Sr No,Name of Security,Quantity Traded,Deliverable Quantity(gross across client level),% of Deliverable Quantity to Traded Quantity
20,1,20MICRONS,EQ,53466,27284,51.03
20,2,3IINFOTECH,EQ,7116046,3351489,47.10
20,3,3MINDIA,EQ,2613,1826,69.88
20,4,5PAISA,EQ,8463,5230,61.80
20,5,63MOONS,EQ,324922,131478,40.46"""
#after testing replace 'pd.compat.StringIO(temp)' to 'C:/Working/dalal/MTO_11052018.DAT'
a = pd.read_csv(pd.compat.StringIO(temp), nrows=0, header=1, usecols=[2]).columns
print (a)
Index(['01022018'], dtype='object')
然后读取所有必需的数据并
assign
新列:#after testing replace 'pd.compat.StringIO(temp)' to 'C:/Working/dalal/MTO_11052018.DAT'
df = pd.read_csv(pd.compat.StringIO(temp), skiprows=3).assign(Trans_dt=a[0])
print (df)
Record Type ... Trans_dt
20 1 ... 1022018
20 2 ... 1022018
20 3 ... 1022018
20 4 ... 1022018
20 5 ... 1022018
[5 rows x 7 columns]
df2.to_csv('C:/Working/dalal/deldata/MTO_11052018.OUT')
#if columns names is necessary remove
df2.to_csv('C:/Working/dalal/deldata/MTO_11052018.OUT', header=None)
或类似,如果需要默认的rangeindex:
#after testing replace 'pd.compat.StringIO(temp)' to 'C:/Working/dalal/MTO_11052018.DAT'
df = pd.read_csv(pd.compat.StringIO(temp), skiprows=3).rename_axis('val').reset_index().assign(Trans_dt=a[0])
print (df)
val ... Trans_dt
0 20 ... 1022018
1 20 ... 1022018
2 20 ... 1022018
3 20 ... 1022018
4 20 ... 1022018
[5 rows x 8 columns]
如果列名不重要:
#after testing replace 'pd.compat.StringIO(temp)' to 'C:/Working/dalal/MTO_11052018.DAT'
df = pd.read_csv(pd.compat.StringIO(temp), skiprows=4, header=None).assign(Trans_dt=a[0])
print (df)
0 1 2 3 4 5 6 Trans_dt
0 20 1 20MICRONS EQ 53466 27284 51.03 1022018
1 20 2 3IINFOTECH EQ 7116046 3351489 47.10 1022018
2 20 3 3MINDIA EQ 2613 1826 69.88 1022018
3 20 4 5PAISA EQ 8463 5230 61.80 1022018
4 20 5 63MOONS EQ 324922 131478 40.46 1022018
最后:
df2.to_csv('C:/Working/dalal/deldata/MTO_11052018.OUT', index=False)
#if columns names is necessary remove
df2.to_csv('C:/Working/dalal/deldata/MTO_11052018.OUT', index=False, header=None)