问题描述
我运行下面的代码并得到:
I run below code and get:
raise ValueError('Length of values does not match length of ' 'index')
ValueError: 值的长度与索引的长度不匹配
ValueError: Length of values does not match length of index
用于:
print(df3)
zz = df1.WE=np.where(df1.EW.isin(df3.AL),df1.WE,np.nan)
df3['O2'] = zz #Error Line
我已经将它用于具有相同列结构的 csv 中的数据,因为这是抓取的数据,而另一个 csv 是手动创建的.
I have got this to work for data in a csv that has identical column structure save for this is scraped data where as the other csv was created manually.
df1
EW WE \
0 can v can 1.900
1 Lanus U20 v Argentinos Jrs U20 2.100
DA
0 https://www.bet365.com.au/#/AC/B1/C1/D13/E40/F...
1 https://www.bet365.com.au/#/AC/B1/C1/D13/E40/F...
df2
AA AB AC AD \
0 Today 19:00 Indonesia U23 Kyrgyzstan U23 650.00
1 Today 23:00 Brunei Darussalam U23 Mongolia U23 3.30
2 Today 19:30 can can 110.00
AE
0 https://www.betfair.com.au/exchange/plus/footb...
1 https://www.betfair.com.au/exchange/plus/footb...
2 https://www.betfair.com.au/exchange/plus/footb...
df3
df3 = pd.DataFrame(columns=['DAT', 'G', 'TN', 'O1', 'L1', 'TN2', 'O2', 'L2', 'D', 'AJ', 'AK', 'AL'])
期望:
df3:
DAT G \
0 Today 19:00 can v can
TN O1 \
0 can v can 1.900
L1 \
0 https://www.bet365.com.au/#/AC/B1/C1/D13/E40/F...
TN2 O2 \
0 can v can 110.00
L2 D
0 https://www.betfair.com.au/exchange/plus/footb... 108.1
代码:
import pandas as pd
import time
import numpy as np
import pandas as pd
print('####CURRIES###')
df1 = pd.read_csv('C:\\O\\df1.csv',
index_col=False,
usecols=[0,1,2],
names=["EW", "WE", "DA"],
header=None)
print(df1)
print('####Mangoes###')
df2 = pd.read_csv('C:\\O\\df2.csv',
index_col=False,
usecols=[0,1,2, 3, 4],
names=["AA", "AB", "AC", "AD", "AE"],
header=None)
print(df2)
df3 = pd.DataFrame(columns=['DAT', 'G', 'TN', 'O1', 'L1', 'TN2', 'O2', 'L2', 'D', 'AJ', 'AK', 'AL'])
df3['DAT'] = df2['AA']
df3['G'] = df2['AB'] + ' v ' + df2['AC']
df3['AK'] = df2['AB'] + ' v ' + df2['AC']
df3['AL'] = df2['AB'] + ' v ' + df2['AC']
print(df3)
#TN
print('Vlookup')
aa = df3.AK=np.where(df3.AK.isin(df1.EW),df3.AK,np.nan)
print(aa)
df3['TN'] = aa
print('Vlookup output')
print(df3)
## Not working
#TN
print('Vlookup')
aa = df3.AK=np.where(df3.AK.isin(df1.EW),df3.AK,np.nan)
print(aa)
df3['TN2'] = aa
## Not working
#L
df3['L1'] = df2['AE']
df3['L2'] = df1['DA']
print('dataframe complete')
print(df3)
##O <-- Get value beside T
print('done')
#O1 Mangoes
aa = df2.AD=np.where(df3.AL.isin(df1.EW),df2.AD,np.nan)
df3['O1'] = aa
#O2 Curries
#zz = df1.WE=np.where(df3.AL.isin(df1.EW),df1.WE,np.nan)
#zz = df1.WE=np.where(df1.EW.isin(df3.AL),df1.WE,np.nan)
###############ERROR B# ###############ERROR B#
print(df3)
zz = df1.WE=np.where(df1.EW.isin(df3.AL),df1.WE,np.nan)
df3['O2'] = zz
###############ERROR ABOVE#
print('output')
df = df3[pd.notnull(df3['TN'])]
#df3['D'] = df3['O2'] - df3['O1']
print('tt')
#df3.dropna(how='all')
#df3.columns = df3.columns.droplevel(3)
print(df3)
#df3['AA'] = df1[zz]
print('zza')
df3.pop('AK')
df3.pop('AL')
df3['D'] = df3['O2'].sub(df3['O1'], fill_value=0)
df3.pop('AJ')
print(df3.dtypes)
NaNs = df3 = df3.dropna(subset=['O1','O2'])
print(NaNs)
import pandas as pd
print('test')
print(df3)
writer = pd.ExcelWriter('C:\\O\\dx.xlsx')
df3.to_excel(writer, startcol=0, index = False)
writer.save()
print(df3)
#df3.pop('AK')
#df3['T1'] = df1[zz]
#Delete Columns
#put O in a column value
#
抓取代码以创建数据帧此处.
Scraping code to create dataframes here .
然后我写can v can in df1 column A row 1.
I then write can v can in df1 column A row 1.
然后我在df2列B第3行写can
I then write can in df2 Column B row 3
然后我在df2列C第3行写can
I then write can in df2 column C row 3
或者只是这里是我正在使用的确切文件(已删除).
Or simply here are exact files (scraped) I am using.
上面代码下面的这些文件工作得很好.我不明白为什么它不适用于抓取的数据.
These files below the code above works perfectly. I don't understand why it is not working for scraped data.
工作 代码
推荐答案
您需要 merge
与 inner join
:
print('####CURRIES###')
df1 = pd.read_csv('C:\\O\\df1.csv',
index_col=False,
usecols=[0,1,2],
names=["EW", "WE", "DA"],
header=None)
print(df1.head())
####CURRIES###
EW WE \
0 can v can 1.90
1 Lanus U20 v Argentinos Jrs U20 2.10
2 Botafogo RJ U20 v Toluca U20 1.83
3 Atletico Mineiro U20 v Bahia U20 2.10
4 FC Porto v Monaco 1.36
DA
0 https://www.bet365.com.au/#/AC/B1/C1/D13/E40/F...
1 https://www.bet365.com.au/#/AC/B1/C1/D13/E40/F...
2 https://www.bet365.com.au/#/AC/B1/C1/D13/E40/F...
3 https://www.bet365.com.au/#/AC/B1/C1/D13/E40/F...
4 https://www.bet365.com.au/#/AC/B1/C1/D13/E40/F...
print('####Mangoes###')
df2 = pd.read_csv('C:\\O\\df2.csv',
index_col=False,
usecols=[0,1,2, 3, 4],
names=["AA", "AB", "AC", "AD", "AE"],
header=None)
print(df2.head())
####Mangoes###
AA AB AC AD \
0 Today 19:00 Indonesia U23 Kyrgyzstan U23 650.00
1 Today 23:00 Brunei Darussalam U23 Mongolia U23 3.30
2 Today 19:30 can can 110.00
3 Today 20:00 FC Zbrojovka Brno U19 Sparta Prague U19 1.97
4 Today 20:30 Rahmatgonj MFS Sheikh Jamal Dhanmondi 1.51
AE
0 https://www.betfair.com.au/exchange/plus/footb...
1 https://www.betfair.com.au/exchange/plus/footb...
2 https://www.betfair.com.au/exchange/plus/footb...
3 https://www.betfair.com.au/exchange/plus/footb...
4 https://www.betfair.com.au/exchange/plus/footb...
#crewte new column called EW in df2
df2['EW'] = df2['AB'] + ' v ' + df2['AC']
#merge on columns EW
df3 = pd.merge(df2, df1, on='EW')
#change columns names
df3 = df3.rename(columns={'WE':'O1','AD':'O2'})
#subtract O columns
df3['D'] = df3['O2'].sub(df3['O1'], fill_value=0)
print (df3)
AA AB AC O2 \
0 Today 19:00 Indonesia U23 Kyrgyzstan U23 650.0
1 Today 19:30 can can 110.0
AE \
0 https://www.betfair.com.au/exchange/plus/footb...
1 https://www.betfair.com.au/exchange/plus/footb...
EW O1 \
0 Indonesia U23 v Kyrgyzstan U23 2.0
1 can v can 1.9
DA D
0 https://www.bet365.com.au/#/AC/B1/C1/D13/E40/F... 648.0
1 https://www.bet365.com.au/#/AC/B1/C1/D13/E40/F... 108.1
这篇关于我得到的抓取的 CSV pandas 数据框:ValueError('值的长度与''索引'的长度不匹配)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!