Jsoup: Get all heading tags

Problem description

I'm trying to parse an HTML document with Jsoup to get all heading tags. In addition, I need to group the heading tags by level: [h1], [h2], and so on.

     hh = doc.select("h[0-6]");

But this gives me an empty array.

Recommended answer

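In a Jsoup selector, h[0-6] means an h tag that has an attribute named "0-6". The square brackets are attribute syntax, not a regex character class, so nothing matches. You can combine multiple selectors with commas instead: hh = doc.select("h1, h2, h3, h4, h5, h6");. HTML headings run from h1 to h6, so those six tags cover every heading.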
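As a minimal sketch of the comma-separated selector in a complete program: the class name AllHeadings, the helper allHeadings, and the sample HTML string are made up for illustration, and Jsoup is assumed to be on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class AllHeadings {
    // Hypothetical helper: returns every h1..h6 element found in the HTML.
    static Elements allHeadings(String html) {
        Document doc = Jsoup.parse(html);
        // A comma means "or": an element matches if it matches any listed tag.
        return doc.select("h1, h2, h3, h4, h5, h6");
    }

    public static void main(String[] args) {
        // Sample input, invented for this example.
        String html = "<h1>Title</h1><p>Intro</p><h2>Part A</h2><h3>Detail</h3><h2>Part B</h2>";
        for (Element h : allHeadings(html)) {
            // tagName() gives the tag (e.g. "h2"), text() the heading text.
            System.out.println(h.tagName() + ": " + h.text());
        }
    }
}
```

The paragraph in the sample is skipped because p is not one of the listed tags.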

Grouping: do you need one group with all heading tags plus a group for each of h1, h2, ..., or only the per-tag groups?

Here's an example of how you can do this:

// Group of all h-Tags
Elements hTags = doc.select("h1, h2, h3, h4, h5, h6");

// Group of all h1-Tags
Elements h1Tags = hTags.select("h1");
// Group of all h2-Tags
Elements h2Tags = hTags.select("h2");
// ... etc.
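If writing one select call per level feels repetitive, an alternative (not part of the original answer) is to walk the combined result once and bucket each element by its tag name. This is only a sketch: the class HeadingGroups, the helper groupByTag, and the sample HTML are hypothetical names chosen for this example.

```java
import java.util.Map;
import java.util.TreeMap;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class HeadingGroups {
    // Hypothetical helper: maps "h1".."h6" to the headings of that level.
    // Levels that do not occur in the document get no entry.
    static Map<String, Elements> groupByTag(String html) {
        Elements hTags = Jsoup.parse(html).select("h1, h2, h3, h4, h5, h6");
        Map<String, Elements> groups = new TreeMap<>(); // sorted keys: h1, h2, ...
        for (Element h : hTags) {
            groups.computeIfAbsent(h.tagName(), k -> new Elements()).add(h);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Sample input, invented for this example.
        String html = "<h1>Title</h1><h2>Part A</h2><h3>Detail</h3><h2>Part B</h2>";
        groupByTag(html).forEach((tag, elements) ->
                System.out.println("[" + tag + "] " + elements.size() + " heading(s)"));
    }
}
```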

If you only want a group for each of h1, h2, ..., you can drop the first selector and replace hTags with doc in the others.
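The per-tag variant, selecting straight from doc, could be written as a loop over the six levels. Again a sketch under assumptions: HeadingsPerLevel and headingsPerLevel are illustrative names, and the returned list uses index 0 for h1 through index 5 for h6.

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class HeadingsPerLevel {
    // Hypothetical helper: one Elements group per heading level, queried from doc.
    static List<Elements> headingsPerLevel(Document doc) {
        List<Elements> levels = new ArrayList<>();
        for (int level = 1; level <= 6; level++) {
            levels.add(doc.select("h" + level)); // "h1", "h2", ..., "h6"
        }
        return levels;
    }

    public static void main(String[] args) {
        // Sample input, invented for this example.
        Document doc = Jsoup.parse("<h1>Title</h1><h2>Part A</h2><h2>Part B</h2>");
        List<Elements> levels = headingsPerLevel(doc);
        for (int i = 0; i < levels.size(); i++) {
            System.out.println("h" + (i + 1) + ": " + levels.get(i).size());
        }
    }
}
```

Unlike a map keyed by tag name, this always yields six groups, with empty ones for levels the document does not use.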


08-31 09:00