Adding text to a string with a regular expression (preg_replace in PHP) while skipping restricted words

Problem description

I have a boolean search string for a third-party index search service: Germany or (Indian, Tech*)

After processing, I want the result to be: Germany[45] or (Indian[45], Tech*[45]). Here 45 is the weight required by the search service.

After a lot of searching I managed to get this far: Germany[45] or (Indian[45], Tech[45]*). As you can see, the * ends up after [45], which is not what I want.

The output should be: Germany[45] or (Indian[45], Tech*[45]), with the * before [45].

Code:

preg_replace('/([a-z0-9\*\.])+(\b(?<!or|and|not))/i', '$0'."[45]", $term);

So the idea is simply to apply a weight to each search word, but not to the boolean operators or/and/not. Please help me fine-tune the regex, or suggest a new one that gives the required result.

Recommended answer

The problem is that your pattern only accepts matches that end at \b, a word boundary. An asterisk is a non-word character, so in Tech*) there is no word boundary after the *. The engine backtracks and ends the match before the asterisk, which leaves the * outside the match. The solution is to let the match end at either an asterisk or a word boundary (\*|\b):

preg_replace('/([a-z0-9.]+)((\*|\b)(?<!or|and|not))/i', '$0'."[45]", $term);
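To make the difference concrete, here is a minimal sketch that runs the question's pattern and the adjusted pattern side by side. The variable names and the sample input are only illustrative; the patterns are the ones quoted above.

```php
<?php
// Sample input taken from the question; variable names are illustrative only.
$term = 'Germany or (Indian, Tech*)';

// Pattern from the question: the match has to end at a word boundary,
// so it stops before the "*" in "Tech*".
$original = preg_replace('/([a-z0-9\*\.])+(\b(?<!or|and|not))/i', '$0[45]', $term);

// Adjusted pattern: the match may end at an asterisk or at a word boundary.
$adjusted = preg_replace('/([a-z0-9.]+)((\*|\b)(?<!or|and|not))/i', '$0[45]', $term);

echo $original, PHP_EOL; // Germany[45] or (Indian[45], Tech[45]*)
echo $adjusted, PHP_EOL; // Germany[45] or (Indian[45], Tech*[45])
```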

However, it's simpler to do it with a negative lookahead:

preg_replace('/\b(?!or|and|not)([a-z0-9*.]+)/i', '$0'."[45]", $term);
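One caveat worth checking against your own data: as written, the lookahead rejects any word that merely starts with or, and, or not (Orange, Android), so those terms would get no weight. The sketch below wraps the pattern in a hypothetical helper, addWeights, and adds a \b inside the lookahead so that only whole-word operators are skipped. That tightening and the helper name are assumptions, not part of the original answer.

```php
<?php
// Hypothetical helper, not from the original answer.
function addWeights(string $term, int $weight = 45): string
{
    // Assumption: the \b after the alternation limits the exclusion to the
    // whole words or/and/not, so "Orange" or "Android" are still weighted.
    $pattern = '/\b(?!(?:or|and|not)\b)([a-z0-9*.]+)/i';

    return preg_replace($pattern, '$0[' . $weight . ']', $term);
}

// Pattern exactly as given in the answer, for comparison.
$asGiven = preg_replace('/\b(?!or|and|not)([a-z0-9*.]+)/i', '$0[45]', 'Orange or Android');

echo addWeights('Germany or (Indian, Tech*)'), PHP_EOL; // Germany[45] or (Indian[45], Tech*[45])
echo addWeights('Orange or Android'), PHP_EOL;          // Orange[45] or Android[45]
echo $asGiven, PHP_EOL;                                 // Orange or Android
```

If your search terms never begin with those letters, the shorter pattern from the answer behaves the same.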

Note: Within a character class, asterisks and periods are not metacharacters, so they don't need to be escaped as they were in your original expression: [a-z0-9\*\.]+.
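A quick way to convince yourself is to compare the escaped and unescaped classes on the same input. This is a small sketch with an arbitrary sample string; the variable names are illustrative.

```php
<?php
// Escaped and unescaped versions of the same character class.
$escaped   = '/[a-z0-9\*\.]+/i';
$unescaped = '/[a-z0-9*.]+/i';

// Arbitrary sample containing "*", "." and characters outside the class.
$sample = 'Tech*.v2 (a+b)';

preg_match_all($escaped, $sample, $withEscapes);
preg_match_all($unescaped, $sample, $withoutEscapes);

// Both classes produce the same list of matches.
var_dump($withEscapes[0] === $withoutEscapes[0]); // bool(true)

// Inside a class, "." is a literal dot and does not match other characters.
var_dump(preg_match('/^[.]$/', 'a')); // int(0)
```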

