Regular expression to pick out artist name and song title, with a lazy-matching problem

Question

I'm trying to build a flexible regular expression to pick out the artist name and song title of a media file. I'd like it to be flexible and support all of the following:

01 Example Artist - Example Song.mp3

01 Example Song.mp3 (in this example there is no artist, so that group should be null)

Example Artist - Example Song.mp3

Example Song.mp3 (again, no artist)

I've come up with the following (in .NET syntax, particularly for named capture groups):

\d{0,2}\s*(?<artist>[^-]*)?[\s-]*(?<songname>.*)(\.mp3|\.m4a)

This works well, but fails for this input: 01 Example Song.mp3
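To make the failure concrete, here is a minimal sketch in Python rather than .NET. It assumes Python's `re` module, so the named groups are written `(?P<name>...)` instead of `(?<name>...)`; the helper name `parse_greedy` is made up for illustration.

```python
import re

# Port of the pattern above to Python's re module. Only the named-group
# syntax changes: (?P<name>...) instead of .NET's (?<name>...).
GREEDY = re.compile(
    r"\d{0,2}\s*(?P<artist>[^-]*)?[\s-]*(?P<songname>.*)(\.mp3|\.m4a)"
)


def parse_greedy(filename):
    """Return (artist, songname) as captured by the greedy pattern."""
    m = GREEDY.match(filename)
    if m is None:
        return None
    return m.group("artist"), m.group("songname")


if __name__ == "__main__":
    for name in ("01 Example Artist - Example Song.mp3", "01 Example Song.mp3"):
        print(name, "->", parse_greedy(name))
```

When the filename contains no hyphen, `[^-]*` first runs to the end of the string and then backtracks only far enough to leave the extension, so the whole song name ends up in the artist group and the songname group is left empty.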

It swallows the song name as the artist, I believe because of greedy matching. So, I tried modifying the expression so that the artist part would be lazy matching:

\d{0,2}\s*(?<artist>[^-]*)*?[\s-]*(?<songname>.*)(\.mp3|\.m4a)

That is, I changed:

(?<artist>[^-]*)?

to:

(?<artist>[^-]*)*?

This does indeed fix the above problem. But now, it fails for this input:

01 Example Artist - Example Song.mp3

Now, it's too lazy in that it captures "Example Artist - Example Song" as the songname and captures nothing for the artist name.
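The same behaviour can be reproduced with a small Python sketch of the lazy variant. As before, this assumes Python's `re` module with `(?P<name>...)` named groups, and `parse_lazy` is a hypothetical helper. One difference from .NET: a group that did not participate in the match is reported as `None` in Python.

```python
import re

# Lazy variant of the pattern, ported to Python's named-group syntax.
LAZY = re.compile(
    r"\d{0,2}\s*(?P<artist>[^-]*)*?[\s-]*(?P<songname>.*)(\.mp3|\.m4a)"
)


def parse_lazy(filename):
    """Return (artist, songname) as captured by the lazy pattern."""
    m = LAZY.match(filename)
    if m is None:
        return None
    return m.group("artist"), m.group("songname")


if __name__ == "__main__":
    for name in ("01 Example Song.mp3", "01 Example Artist - Example Song.mp3"):
        print(name, "->", parse_lazy(name))
```

The lazy `*?` first tries zero repetitions of the artist group. Because `.*` in the songname group can absorb everything up to the extension, that first attempt always succeeds, so the artist group never gets a chance to capture anything.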

Does anyone have a suggestion?

Recommended answer

You can't achieve this with greediness alone; you need to describe the structure more explicitly using groups (optional or not). An example:

(?x) # switch on comment mode
^    # start of the string
(?: (?<track>\d{1,3}) \s*[\s-]\s* )? # the track is optional ( including separators) 
(?: (?<artist>.+?) \s*-\s* )? # the same with the artist name
(?<title> .+ )
(?<ext> \.m(?:p3|4a) )
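
As a rough check, the pattern above can be tried against the four filenames from the question. The sketch below assumes Python's `re` module instead of .NET, so the named groups use `(?P<name>...)` and verbose mode is enabled with the `re.VERBOSE` flag rather than an inline `(?x)`; the function name `parse` is hypothetical.

```python
import re

# The answer's pattern, ported to Python. re.VERBOSE plays the role of (?x):
# whitespace in the pattern is ignored and '#' starts a comment.
PATTERN = re.compile(
    r"""
    ^                                       # start of the string
    (?: (?P<track>\d{1,3}) \s*[\s-]\s* )?   # optional track number and separator
    (?: (?P<artist>.+?) \s*-\s* )?          # optional artist, ended by a hyphen
    (?P<title> .+ )
    (?P<ext> \.m(?:p3|4a) )
    """,
    re.VERBOSE,
)


def parse(filename):
    """Return a dict of the named groups, or None if nothing matches."""
    m = PATTERN.match(filename)
    return m.groupdict() if m else None


if __name__ == "__main__":
    for name in (
        "01 Example Artist - Example Song.mp3",
        "01 Example Song.mp3",
        "Example Artist - Example Song.mp3",
        "Example Song.mp3",
    ):
        print(name, "->", parse(name))
```

The artist group works here because the hyphen separator sits inside the same optional group: the lazy `.+?` only succeeds when a hyphen actually follows, and otherwise the whole group is skipped and reported as `None`.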

As an aside, audio filenames can be very weird; even with the best pattern in the world, I doubt you can handle every case.

You can be a little more flexible and more efficient if you replace .+ with something more explicit:

^(?x)
(?: (?<track>\d{1,3}) \s*[\s-]\s* )?
(?: (?<artist> \S+ (?>[ .-][^\s.-]*)*? ) \s*-\s*)?
(?<title> [^.\n]+ (?>\.[^.\n]*)*? )
(?<ext> \.m(?:p3|4a) )

(The \n in the character classes is only there for testing purposes; you can remove it when you apply the pattern to one filename at a time.)
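
The more explicit pattern can be sketched in Python as well. This version rests on two assumptions that are not part of the original answer: named groups are written in Python's `(?P<name>...)` form, and the atomic groups `(?>...)` are swapped for plain non-capturing groups `(?:...)`, because Python's `re` only supports atomic groups from version 3.11. That swap keeps the matches the same for the inputs below but gives up the backtracking savings that atomic groups provide. The name `parse_explicit` is hypothetical.

```python
import re

# Explicit variant of the answer's pattern, ported to Python.
# Assumption: (?>...) is replaced by (?:...) so it runs on Python < 3.11.
EXPLICIT = re.compile(
    r"""
    ^
    (?: (?P<track>\d{1,3}) \s*[\s-]\s* )?
    (?: (?P<artist> \S+ (?:[ .-][^\s.-]*)*? ) \s*-\s* )?
    (?P<title> [^.\n]+ (?:\.[^.\n]*)*? )
    (?P<ext> \.m(?:p3|4a) )
    """,
    re.VERBOSE,
)


def parse_explicit(filename):
    """Return a dict of the named groups, or None if nothing matches."""
    m = EXPLICIT.match(filename)
    return m.groupdict() if m else None


if __name__ == "__main__":
    for name in (
        "01 Example Artist - Example Song.mp3",
        "01 Example Song.mp3",
        "01 Example Artist - Mr. Example.mp3",
    ):
        print(name, "->", parse_explicit(name))
```

The third filename is an invented example showing that a title containing a dot is still captured whole, since the title group extends one dotted segment at a time until the extension matches.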
