I have CSV data in the following format:
ab aback abandon abate Class
ab NaN abandon NaN A
NaN aback NaN NaN A
NaN aback abandon NaN B
ab NaN NaN abate C
NaN NaN abandon abate C
I want to remove the NaN cells and rearrange the data as:
ab abandon A
aback A
aback abandon B
ab abate C
abandon abate C
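The processed sheet does not need a header. I have gone through many threads, such as "Remove NaN from pandas series", "Missing Data In Pandas Dataframes", and "How can I remove Nan from list Python/NumPy", but they all offer column-wise solutions.
Here is the sample file.
It has empty cells, and when I display it as a DataFrame, every empty cell shows up as NaN.
Here is the code: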
import pandas as pd
df = pd.read_csv('C:/Users/ABRAR/Google Drive/Tourism Project/Small_sample.csv', low_memory=False)
print(df)
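To reproduce the NaN behaviour without the original file, a minimal sketch is shown here. It assumes the file is comma-separated and contains only the five rows shown in the question; `SAMPLE_CSV` is a hypothetical in-memory stand-in for Small_sample.csv, not the real file.

```python
import io

import pandas as pd

# Hypothetical stand-in for Small_sample.csv, built from the five rows
# shown in the question (assumed to be comma-separated).
SAMPLE_CSV = """ab,aback,abandon,abate,Class
ab,,abandon,,A
,aback,,,A
,aback,abandon,,B
ab,,,abate,C
,,abandon,abate,C
"""

# Empty fields are parsed as NaN by default.
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Count the NaN cells in each column to confirm how the blanks were read.
nan_counts = df.isna().sum()
print(df)
print(nan_counts)
```

If the blanks should stay as empty strings instead, `pd.read_csv(..., keep_default_na=False)` is one option, though the NaN form is usually easier to filter.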
Best answer
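# Sort each row's values as strings; 'nan' sorts after the sample's words,
# so real values shift left. result_type='broadcast' keeps the DataFrame shape.
df = df.apply(lambda x: sorted(x.values.astype(str)), axis=1, result_type='broadcast')\
       .replace('nan', '')
df = df.drop(df.index[df.eq('').all(axis=1)])  # drop rows that are entirely empty
df = df.drop(df.columns[df.eq('').all()], axis=1)  # drop columns that are entirely empty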
print(df.head())
Output:
ab aback
14 access
18 accept
23 access
24 able accept
47 accepted
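Note that sorting the values as strings depends on 'nan' sorting after every real value, and it moves an uppercase label such as the Class column to the front of the row. As an alternative, a row-wise sketch that keeps the original column order (so Class stays last) and produces header-less lines could look like the following. It is not the answer's method; `SAMPLE_CSV` is a hypothetical comma-separated stand-in for the sample file, built from the rows in the question.

```python
import io

import pandas as pd

# Hypothetical stand-in for Small_sample.csv (assumed comma-separated).
SAMPLE_CSV = """ab,aback,abandon,abate,Class
ab,,abandon,,A
,aback,,,A
,aback,abandon,,B
ab,,,abate,C
,,abandon,abate,C
"""

df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# For each row, drop the NaN cells and keep the remaining values
# in their original left-to-right column order.
rows = [row.dropna().tolist() for _, row in df.iterrows()]

# Join each row into one space-separated line; no header is emitted.
lines = [" ".join(str(value) for value in row) for row in rows]
print("\n".join(lines))
```

Because the rows end up with different lengths, a list of lists (or plain text lines) is a more natural container here than a rectangular DataFrame.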