The text is as follows:
<td><a href="/AS714/17.100.0.0/16" title="Apple Inc.">17.100.0.0/16</a> </td>
<td><a href="/AS714/17.102.0.0/15" title="Apple Inc.">17.102.0.0/15</a> </td>
<td><a href="/AS714/17.104.0.0/15" title="Apple Inc.">17.104.0.0/15</a> </td>
<td><a href="/AS714/17.106.0.0/15" title="Apple Inc.">17.106.0.0/15</a> </td>
<td><a href="/AS714/17.108.0.0/15" title="Apple Inc.">17.108.0.0/15</a> </td>
<td><a href="/AS714/17.110.0.0/15" title="Apple Inc.">17.110.0.0/15</a> </td>
<td><a href="/AS714/17.110.64.0/18" title="Apple Inc.">17.110.64.0/18</a> </td>
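My regular expression is: (?<=.">)(.+)(?=</a)
In Notepad++ it correctly extracts the text in between, such as 17.100.0.0/16.
The data is in a file named apple.txt. How can a shell script write these strings to another txt file? Thanks!
I tried grep -E and awk -F, but both report a script error.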
wget xx.com/apple.txt
cat apple.txt
grep -E '(?<=.">)(.+)(?=</a)'
??
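A likely cause of the error: grep -E uses POSIX extended regular expressions, which do not support lookbehind or lookahead, and curly quotes are not shell quotes. A minimal sketch, assuming GNU grep built with PCRE support (the -P option); the two sample lines and the output name out.txt are illustrative assumptions:

```shell
# Sample input standing in for apple.txt (two lines copied from the thread).
cat > apple.txt <<'EOF'
<td><a href="/AS714/17.100.0.0/16" title="Apple Inc.">17.100.0.0/16</a> </td>
<td><a href="/AS714/2620:149:5e1::/48" title="Apple Inc.">2620:149:5e1::/48</a> </td>
EOF

# -P: Perl-compatible regex, which allows (?<=...) and (?=...).
# -o: print only the matched text, one match per line.
# [^<]+ stops at the first "<", so the match cannot run past </a>.
grep -oP '(?<=">)[^<]+(?=</a>)' apple.txt > out.txt

cat out.txt
```

The original pattern (?<=.">)(.+)(?=</a) should also work with -P on this input; [^<]+ is only a more defensive choice in case a line ever holds more than one link.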
User from Taiwan:
grep -o -E '[[:digit:]]+\.[[:digit:]]+\.[[:digit:]]+\.[[:digit:]]+/[[:digit:]]+' test.txt > out.text
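User from Sichuan:
Thanks. I see that you want to match on the IP address itself, but there are IPv6 addresses further down. Could you write it with my approach of matching the surrounding characters instead? Thanks!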
<td><a href="/AS714/2620:149:5e1::/48" title="Apple Inc.">2620:149:5e1::/48</a> </td>
<td><a href="/AS714/2620:149:a01::/48" title="Apple Inc.">2620:149:a01::/48</a> </td>
<td><a href="/AS714/2620:149:a07::/48" title="Apple Inc.">2620:149:a07::/48</a> </td>
<td><a href="/AS714/2620:149:a0b::/48" title="Apple Inc.">2620:149:a0b::/48</a> </td>
<td><a href="/AS714/2620:149:a11::/48" title="Apple Inc.">2620:149:a11::/48</a> </td>
<td><a href="/AS714/2620:149:a13::/48" title="Apple Inc.">2620:149:a13::/48</a> </td>
<td><a href="/AS714/2620:149:a15::/48" title="Apple Inc.">2620:149:a15::/48</a> </td>
<td><a href="/AS714/2620:149:a17::/48" title="Apple Inc.">2620:149:a17::/48</a> </td>
<td><a href="/AS714/2620:149:a19::/48" title="Apple Inc.">2620:149:a19::/48</a> </td>
<td><a href="/AS714/2620:149:a1b::/48" title="Apple Inc.">2620:149:a1b::/48</a> </td>
A regex for IPv4 addresses cannot match these.
User from Shaanxi:
grep -o -E '([0-9]{1,3}\.)+[0-9]+/[0-9]+' test.txt | uniq
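The pattern above requires dotted decimal groups, so it still skips the IPv6 lines. One possible variant that stays within grep -E is to anchor on the > and < around the link text and accept hex digits, colons and dots. This is a sketch, not a validating IP regex; the sample lines and the name out.txt are assumptions:

```shell
# Sample input standing in for test.txt (two lines copied from the thread).
cat > test.txt <<'EOF'
<td><a href="/AS714/17.110.64.0/18" title="Apple Inc.">17.110.64.0/18</a> </td>
<td><a href="/AS714/2620:149:a01::/48" title="Apple Inc.">2620:149:a01::/48</a> </td>
EOF

# Match ">prefix/length<" so that only the link text is hit, not the href copy.
# tr then deletes the two delimiter characters from each match.
grep -oE '>[0-9A-Fa-f:.]+/[0-9]+<' test.txt | tr -d '<>' > out.txt

cat out.txt
```

Because the href copy of each prefix is never matched, every prefix appears once per line and no uniq step is needed.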
User from Shandong: I tried sed -e 's/.">\(.*\)</a>/\1/'
It reports: bad option in substitution expression
Are my ."> and </a> clashing with regex metacharacters?
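User from Shaanxi: I was rushing off to eat, so I just threw something together.
User from Qinghai: I don't know regex, so try a different approach: take whatever sits between Apple Inc."> and </a>.
Read a line, run awk on it once, then run awk a second time, echo the result, and move on to the next line.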
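The two awk passes described above might look like the following sketch. It uses a pipeline instead of an explicit read loop, since awk already processes its input line by line; the sample lines and the name out.txt are assumptions:

```shell
# Sample input standing in for apple.txt (two lines copied from the thread).
cat > apple.txt <<'EOF'
<td><a href="/AS714/17.104.0.0/15" title="Apple Inc.">17.104.0.0/15</a> </td>
<td><a href="/AS714/2620:149:a07::/48" title="Apple Inc.">2620:149:a07::/48</a> </td>
EOF

# Pass 1: split each line on "> and keep what follows it.
#         NF > 1 skips lines that do not contain the separator.
# Pass 2: split on </a> and keep what precedes it.
awk -F'">' 'NF > 1 { print $2 }' apple.txt |
  awk -F'</a>' '{ print $1 }' > out.txt

cat out.txt
```

This relies on "> appearing only once per line, right before the link text, which holds for the sample data in this thread.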
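User from Xinjiang: Scraping bgp.he.net
User from Liaoning: The '/' delimiter after s clashes with the '/' inside the expression. Switch to a different delimiter and it works: sed 's|.">\(.*\)</a>|\1|'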
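Changing the delimiter removes the error, but that substitution replaces only the matched part, so the rest of each line (the leading tag and the trailing </td>) is still printed. A possible variant that keeps just the captured text, assuming one link per line; the sample lines and the name out.txt are assumptions:

```shell
# Sample input standing in for apple.txt (two lines copied from the thread).
cat > apple.txt <<'EOF'
<td><a href="/AS714/17.108.0.0/15" title="Apple Inc.">17.108.0.0/15</a> </td>
<td><a href="/AS714/2620:149:a1b::/48" title="Apple Inc.">2620:149:a1b::/48</a> </td>
EOF

# -n suppresses default printing; the trailing p prints only substituted lines.
# The leading .* and trailing .* swallow everything outside the capture group,
# so each whole line is replaced by the link text alone.
sed -n 's|.*">\(.*\)</a>.*|\1|p' apple.txt > out.txt

cat out.txt
```

Lines without a matching link are dropped rather than passed through, which is usually what is wanted when the input is a full HTML page.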
User from Shanxi: Could you post it? Thanks a lot!
User from Guizhou: Could you post it? Thanks a lot!