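python - How to correctly extract a price from a string

Setup

I am scraping product prices from the web using Selenium and Python 3.x.

I have a list of strings, each containing the price of one product.

For prices below €1000, the string looks like '€ 505.93 net' (i.e. 505.93).
For prices of €1000 and above, the string looks like '€ 1 505.93 net' (i.e. 1505.93).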

Problem

I am not sure how to cleanly handle the space and the decimal point in prices of one thousand and above.

So, with product_price = '€ 1 505.93 net',

[int(s) for s in product_price if s.isdigit()]

gives

[1, 5, 0, 5, 9, 3]

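The same procedure on product_price = '€ 505.93 net' gives [5, 0, 5, 9, 3].

Question

How do I adjust my code so that I get 1505.93 and 505.93?

Best answer

Here is one approach. We can match the following regular expression pattern, which treats a space as the thousands separator: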

€\s*(\d{1,3}(?: \d{3})*(?:\.\d+)?)

The first capture group should then contain the matched euro amount.

import re

text = '€ 1 505.93 net and here is another price € 505.93'
# Same pattern as above, with the decimal part optional
result = re.findall(r'€\s*(\d{1,3}(?: \d{3})*(?:\.\d+)?)', text)
print(result)

['1 505.93', '505.93']
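
The matches above are still strings that contain the thousands separator, while the question asks for 1505.93 and 505.93. A minimal sketch of the remaining step follows, assuming the separator is a plain ASCII space as in the question's examples (scraped text may use a non-breaking space instead, which this pattern would not match). The names PRICE_RE and parse_prices are hypothetical helpers, not part of the original answer.

```python
import re
from decimal import Decimal

# Pattern from the answer: a space is the thousands separator.
PRICE_RE = re.compile(r'€\s*(\d{1,3}(?: \d{3})*(?:\.\d+)?)')

def parse_prices(text):
    """Return every euro amount found in text as a Decimal."""
    # Remove the thousands-separator spaces before converting.
    return [Decimal(m.replace(' ', '')) for m in PRICE_RE.findall(text)]

prices = parse_prices('€ 1 505.93 net and here is another price € 505.93')
print(prices)                      # [Decimal('1505.93'), Decimal('505.93')]
print([float(p) for p in prices])  # [1505.93, 505.93]
```

Decimal is used here because it represents monetary amounts exactly; float(p) is enough if approximate values are acceptable.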

Explanation of the regex:

€                  a Euro sign
\s*                followed by optional whitespace
(                  (capture what follows)
    \d{1,3}        one to three digits
    (?: \d{3})*    followed by zero or more thousands groups
    (?:\.\d+)?     an optional decimal component
)                  (close capture group)
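
The annotated breakdown above can also be kept inside the code itself by compiling the pattern with re.VERBOSE. This is only a sketch; the name PRICE_VERBOSE is hypothetical. Note that in verbose mode unescaped whitespace in the pattern is ignored, so the literal separator space has to be written as [ ].

```python
import re

PRICE_VERBOSE = re.compile(r"""
    €                  # a Euro sign
    \s*                # followed by optional whitespace
    (                  # capture what follows
        \d{1,3}        # one to three digits
        (?:[ ]\d{3})*  # zero or more thousands groups
        (?:\.\d+)?     # an optional decimal component
    )                  # close capture group
""", re.VERBOSE)

matches = PRICE_VERBOSE.findall('€ 1 505.93 net and € 505.93')
print(matches)  # ['1 505.93', '505.93']
```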

Regarding "python - How to correctly extract a price from a string", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55690393/