Problem description
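str.format() is the new standard for formatting strings in Python 2.6 and Python 3. I've run into an issue when using str.format() with regular expressions.
I've written a regular expression to return all domains that are a single level below a specified domain, or any domains that are two levels below the specified domain if the second level below is www...
Assuming the specified domain is delivery.com, my regex should return a.delivery.com, b.delivery.com, www.c.delivery.com ... but it should not return x.a.delivery.com.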
import re

str1 = 'www.pizza.delivery.com'
str2 = 'w.pizza.delivery.com'
str3 = 'pizza.delivery.com'

# Optional "www." prefix, then exactly one label, then the domain.
if re.match(r'^(w{3}\.)?([0-9A-Za-z-]+\.){1}delivery.com$', str1): print('String 1 matches!')
if re.match(r'^(w{3}\.)?([0-9A-Za-z-]+\.){1}delivery.com$', str2): print('String 2 matches!')
if re.match(r'^(w{3}\.)?([0-9A-Za-z-]+\.){1}delivery.com$', str3): print('String 3 matches!')
Running this should give the result:
String 1 matches!
String 3 matches!
Now, the problem is when I try to replace delivery.com dynamically using str.format():
if re.match(r'^(w{3}\.)?([0-9A-Za-z-]+\.){1}{domainName}$'.format(domainName='delivery.com'), str1): print('String 1 matches!')
This seems to fail because str.format() expects the {3} and {1} to be parameters to the function (I'm assuming).
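The failure described above can be reproduced in isolation. The following is a minimal sketch, not part of the original question: it shows that str.format() reads {3} and {1} as positional replacement fields. The positional values 'a' through 'd' are placeholders invented purely for illustration.

```python
template = r'^(w{3}\.)?([0-9A-Za-z-]+\.){1}{domainName}$'

# With no positional arguments, the {3} field cannot be resolved.
try:
    template.format(domainName='delivery.com')
    error = None
except IndexError as exc:
    error = exc

# With enough positional arguments, the quantifiers are silently replaced.
# The values 'a'..'d' are hypothetical and exist only for this demonstration.
mangled = template.format('a', 'b', 'c', 'd', domainName='delivery.com')

print(repr(error))
print(mangled)
```

In the second case no exception is raised, but the resulting pattern no longer contains the {3} and {1} quantifiers, which is arguably the more dangerous outcome.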
I could concatenate the string using the + operator:
r'^(w{3}\.)?([0-9A-Za-z-]+\.){1}' + domainName + '$'
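As a rough sketch of the concatenation approach, the snippet below builds the pattern and checks it against a few sample strings. Passing the domain through re.escape() is an addition that the original question does not make; it is assumed here so that the dot in the domain matches only a literal dot. The candidate list, including 'pizza.deliveryxcom', is made up for the example.

```python
import re

domain_name = 'delivery.com'

# Concatenation leaves the {3} and {1} quantifiers untouched.
# re.escape() is an assumption added here, not part of the original pattern.
pattern = r'^(w{3}\.)?([0-9A-Za-z-]+\.){1}' + re.escape(domain_name) + '$'

candidates = [
    'www.pizza.delivery.com',
    'w.pizza.delivery.com',
    'pizza.delivery.com',
    'pizza.deliveryxcom',  # hypothetical near miss; rejected once the dot is escaped
]
matches = [s for s in candidates if re.match(pattern, s)]
print(matches)
```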
The question comes down to this: is it possible to use str.format() when the string (usually a regex) has "{n}" within it?
You first need to format the string and then use the regex. It really isn't worth putting everything on a single line. Escaping is done by doubling the curly braces:
>>> pat = r'^(w{{3}}\.)?([0-9A-Za-z-]+\.){{1}}{domainName}$'.format(domainName='delivery.com')
>>> pat
'^(w{3}\\.)?([0-9A-Za-z-]+\\.){1}delivery.com$'
>>> re.match(pat, str1)
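To check the doubled-brace escaping end to end, here is a small sketch that formats the pattern once and then runs it over the three test strings from the question. The variable names results and test_strings are chosen for this example only.

```python
import re

# {{ and }} become literal braces; {domainName} is the only replacement field.
template = r'^(w{{3}}\.)?([0-9A-Za-z-]+\.){{1}}{domainName}$'
pat = template.format(domainName='delivery.com')

test_strings = ['www.pizza.delivery.com', 'w.pizza.delivery.com', 'pizza.delivery.com']
results = [bool(re.match(pat, s)) for s in test_strings]

print(pat)
print(results)
```

Formatting first and matching second keeps each step easy to inspect, which is the point the answer makes about not putting everything on one line.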
Also, note that re.match matches at the beginning of the string, so you don't have to put ^ in the pattern if you use re.match; you do need ^ if you're using re.search, however. Please also note that {1} in a regex is rather redundant.
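Both closing remarks can be illustrated with a short sketch. The patterns below are variations written for this example (with the dot in the domain escaped, which the original pattern does not do), and 'x.a.delivery.com' is the string the question says must not match.

```python
import re

anchored = r'^(w{3}\.)?([0-9A-Za-z-]+\.){1}delivery\.com$'
# Same pattern with the leading ^ and the redundant {1} dropped.
unanchored = r'(w{3}\.)?([0-9A-Za-z-]+\.)delivery\.com$'

text = 'x.a.delivery.com'

# re.match only tries position 0, so the missing ^ makes no difference.
match_result = re.match(unanchored, text)
# re.search scans forward and finds 'a.delivery.com' unless ^ is present.
search_unanchored = re.search(unanchored, text)
search_anchored = re.search(anchored, text)

# Dropping {1} does not change what is accepted.
same_without_quantifier = bool(re.match(unanchored, 'pizza.delivery.com'))

print(match_result, search_unanchored is not None, search_anchored, same_without_quantifier)
```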