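I need to create three pandas.Series (x, y, z). The data for them is formatted in various ways: some values are separated by \n and ;, others only by spaces. I want a general way to extract this data into lists. The data looks like this: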

x is "\n -10.03 -7.02 -0.05 9.96 20 40"
y is "\n 0.70;\n 0.79;\n 0.90;\n 1.00"
z is "\n 100.00 100.00 100.00 100.00 100.00 100.00;\.." (24 times)

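Best answer

This can be done with a regular expression and a list comprehension: split each string on runs of whitespace and semicolons, then convert the non-empty pieces to float.

Code: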

import re

# Split on any run of newlines, spaces, tabs, or semicolons.
split_pattern = re.compile(r'[\n \t;]+')

x = '\n -10.03 -7.02 -0.05 9.96 20 40'
y = '\n 0.70;\n 0.79;\n 0.90;\n 1.00'
z = '\n 100.00 100.00 100.00 100.00 100.00 100.00;'

for data in (x, y, z):
    # Splitting leaves empty strings at the ends; drop them before converting.
    data_list = [float(d) for d in split_pattern.split(data) if d != ""]
    print(data_list)


Result:

[-10.03, -7.02, -0.05, 9.96, 20.0, 40.0]
[0.7, 0.79, 0.9, 1.0]
[100.0, 100.0, 100.0, 100.0, 100.0, 100.0]
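
The same idea can be wrapped in a reusable helper. The following is only a sketch: the function names parse_numbers and parse_rows are made up for illustration, and the two-row z string is a shortened stand-in for the real data, which the question says repeats 24 times. It assumes that ";" ends a row in z, which the question does not state explicitly.

```python
import re

# Same separator pattern as in the answer: runs of newlines, spaces, tabs, or semicolons.
SPLIT_PATTERN = re.compile(r'[\n \t;]+')


def parse_numbers(text):
    """Return every number in `text` as one flat list of floats (hypothetical helper)."""
    return [float(token) for token in SPLIT_PATTERN.split(text) if token]


def parse_rows(text):
    """Return one list of floats per ';'-terminated row, skipping empty rows.

    Assumes ';' marks the end of a row, which is a guess about the z data.
    """
    rows = (parse_numbers(row) for row in text.split(';'))
    return [row for row in rows if row]


x = '\n -10.03 -7.02 -0.05 9.96 20 40'
y = '\n 0.70;\n 0.79;\n 0.90;\n 1.00'
# Shortened, hypothetical stand-in for z; the real string has 24 rows.
z = '\n 100.00 100.00 100.00;\n 100.00 100.00 100.00;'

print(parse_numbers(x))
print(parse_numbers(y))
print(parse_rows(z))
```

Each flat list can then be handed to the Series constructor, for example pandas.Series(parse_numbers(x)), provided pandas is installed. Whether z should stay flat or be kept row by row depends on how the Series is meant to be used.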
