How to combine multiple regular expressions into one line?

Question

My script works fine doing this:

import re
# Raw strings pass \S and similar escapes to the regex engine unchanged.
images = re.findall(r'src."(\S*?media.tumblr\S*?tumblr_\S*?jpg)', doc)
videos = re.findall(r'\S*?(http\S*?video_file\S*?tumblr_[a-zA-Z0-9]*)', doc)

However, I believe it is inefficient to search through the whole document twice.

Here's a sample document if it helps: http://pastebin.com/5kRZXjij

I would expect the following output from the above:

images = http://37.media.tumblr.com/tumblr_lnmh4tD3sM1qi02clo1_500.jpg
videos = http://bassrx.tumblr.com/video_file/86319903607/tumblr_lo8i76CWSP1qi02cl

Instead, it would be better to do something like:

image_and_video_links = re.findall(" <match-image-links-or-video links> ", doc)

How can I combine the two re.findall lines into one?

I have tried using the | character, but the combined pattern never matches anything, so I am clearly confused about how to use it properly.

Recommended answer

As mentioned in the comments, a pipe (|) should do the trick.

The regex

(src.\"(\S*?media.tumblr\S*?tumblr_\S*?jpg))|(\S*?(http\S*?video_file\S*?tumblr_[a-zA-Z0-9]*))

catches either of the two patterns.
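As a rough sketch of how the combined pattern might be used from Python: `re.findall` with several capture groups returns one tuple per match, with empty strings for the groups that did not take part, so it can be easier to give each branch a named group and iterate with `re.finditer`. The group names `image` and `video` and the sample `doc` string below are assumptions made for illustration (the sample is modelled on the expected output in the question, not copied from the pastebin document), and the redundant leading `\S*?` of the video branch is dropped.

```python
import re

# Hypothetical sample markup built from the URLs in the question's expected
# output; the real document is the one behind the pastebin link.
doc = (
    '<img src="http://37.media.tumblr.com/tumblr_lnmh4tD3sM1qi02clo1_500.jpg"/> '
    '<source src="http://bassrx.tumblr.com/video_file/86319903607/'
    'tumblr_lo8i76CWSP1qi02cl" type="video/mp4">'
)

# One alternation with a named group per branch (names are our own choice).
pattern = re.compile(
    r'src."(?P<image>\S*?media.tumblr\S*?tumblr_\S*?jpg)'
    r'|(?P<video>http\S*?video_file\S*?tumblr_[a-zA-Z0-9]*)'
)

images, videos = [], []
# A single pass over the document; each match belongs to exactly one branch.
for m in pattern.finditer(doc):
    if m.group("image"):
        images.append(m.group("image"))
    else:
        videos.append(m.group("video"))

print("images =", images)
print("videos =", videos)
```

Whether a single pass is measurably faster than two `re.findall` calls depends on the document, so it is worth timing both on real input before settling on one.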

