Extracting data with Python regular expressions


Problem description

I am having some trouble wrapping my head around Python regular expressions and coming up with one that extracts specific values.

The page I am trying to parse contains a number of productIds, which appear in the following format:

\"productId\":\"111111\"

I need to extract all of the values, 111111 in this case.

Recommended answer

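import re

t = "\"productId\":\"111111\""
# re.match only matches at the start of the string; the raw string keeps \W, \D and \d intact
m = re.match(r"\W*productId[^:]*:\D*(\d+)", t)
if m:
    print(m.group(1))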

This means: match any non-word characters (\W*), then productId followed by non-colon characters ([^:]*) and a :. Then match any non-digits (\D*), and match and capture the digits that follow ((\d+)).

Output

111111
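Because the question asks for all of the values on a page, and re.match only looks at the start of a string, a variation using re.findall may be closer to what is needed. The following is a minimal sketch, not the answer author's code: the page string, the PRODUCT_ID pattern and the extract_product_ids helper are all hypothetical names made up for illustration, and the pattern assumes the IDs are purely numeric.

```python
import re

# Hypothetical page fragment; the backslash-escaped quotes mirror the format in the question.
page = r'{\"productId\":\"111111\",\"name\":\"a\"},{\"productId\":\"222222\"}'

# \W* absorbs the backslashes, quotes and colon after the key; (\d+) captures the ID.
# Assumption: every productId value consists of digits only.
PRODUCT_ID = re.compile(r'productId\W*(\d+)')


def extract_product_ids(text):
    """Return every productId value found in text, in order of appearance."""
    return PRODUCT_ID.findall(text)


print(extract_product_ids(page))
```

With a single capture group, findall returns the captured strings rather than the full matches, so the result is a plain list of ID strings. Using \W* instead of \D* keeps the match from skipping over unrelated text to reach digits that belong to some other field.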
