My DataFrame looks like this:

indeed.fr
11.41%
career2.successfactors.eu
8.53%
37.16%
pracuj.pl
7.40%
80.42%
corporate.danone.com.br
6.64%
indeed.com.br
4.68%
61.73%


So I only want to keep the first % value after each name, like this:

indeed.fr
11.41%
career2.successfactors.eu
8.53%
pracuj.pl
7.40%
corporate.danone.com.br
6.64%
indeed.com.br
4.68%


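All rows are strings, and I don't know whether it is possible to drop a row based on a condition such as "the previous row contains %".

Any ideas?

Thanks for your time!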

import pandas as pd

mydata = ['indeed.fr','11.41%','career2.successfactors.eu','8.53%','37.16%','pracuj.pl','7.40%','80.42%','corporate.danone.com.br','6.64%','indeed.com.br','4.68%','61.73%']
df = pd.DataFrame(mydata)


In the end, I want something like this:

Best answer

import pandas as pd

mydata = ['indeed.fr','11.41%','career2.successfactors.eu','8.53%','37.16%','pracuj.pl','7.40%','80.42%','corporate.danone.com.br','6.64%','indeed.com.br','4.68%','61.73%']

df = pd.DataFrame(mydata)


This is the sample you created.

The solution is as follows:

rowList = []
row = []

# Counts how many percentage values have been seen since the last source name
percentVal = 0

for i in df.index:
    value = df.at[i, 0]

    if not value.endswith('%'):
        # A source name starts a new row and resets the counter
        row.append(value)
        percentVal = 0
    else:
        percentVal += 1

        if percentVal == 1:
            # First percentage after a name: complete the row and store it
            row.append(value)
            rowList.append(row)
            row = []
        else:
            # Any further percentage for the same name is skipped
            print("Skipping {}".format(value))
            row = []

yourSol = pd.DataFrame(rowList)
yourSol.columns = ['Incoming Referal Sources', 'Value (%)']

print(yourSol)
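
As an alternative to the explicit loop, the condition from the question ("drop a row if the previous row contains %") can probably be expressed directly with a boolean mask. The following is only a sketch under a few assumptions: every percentage string ends with '%', every source name is followed by at least one percentage, and the names filtered, result, source and value are made up for illustration.

```python
import pandas as pd

mydata = ['indeed.fr', '11.41%', 'career2.successfactors.eu', '8.53%', '37.16%',
          'pracuj.pl', '7.40%', '80.42%', 'corporate.danone.com.br', '6.64%',
          'indeed.com.br', '4.68%', '61.73%']
df = pd.DataFrame(mydata)

# True for rows whose string ends with '%'
is_pct = df[0].str.endswith('%')

# True for rows whose previous row is a percentage (the first row has no previous row)
prev_is_pct = is_pct.shift(1, fill_value=False)

# Drop a row only when it is a percentage AND the row before it is one too
filtered = df[~(is_pct & prev_is_pct)].reset_index(drop=True)

# Assumption: what remains strictly alternates name, percentage,
# so it can be folded into two columns (hypothetical column names)
result = pd.DataFrame(filtered[0].to_numpy().reshape(-1, 2),
                      columns=['source', 'value'])

print(filtered)
print(result)
```

Here filtered keeps the single-column layout asked for in the question, while result holds one row per source, similar to yourSol above. The reshape step will fail if a name has no percentage after it, so the loop version may be the safer choice for messy data.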
