Problem description
I am using BeautifulSoup to parse some HTML pages.
I extract a particular piece of data from each page (in a for loop) and append it to a list.
The problem is that some of the pages have a different format and don't contain the data I want.
So I was trying to use exception handling and add the string null to the list instead (I need to do this because the order of the data matters).
For instance, I have code like this:
soup = BeautifulSoup(links)
dlist = soup.findAll('dd', 'title')
# Find the content between <dd class='title'> and </dd>
gotdata = dlist[1]
# What I want is the second of those matches
newlist.append(gotdata)
# Then I add it to newlist
Some of the links don't have any <dd class='title'>, so what I want to do is add the string null to the list instead.
The code raises this error:
list index out of range.
What I tried was adding some lines like this:
if not dlist[1]:
    newlist.append('null')
    continue
But it doesn't work. It still shows the error:
list index out of range.
What should I do about this? Should I use exception handling, or is there an easier way?
Any suggestions? Any help would be really great!
Recommended answer
Handling the exception is the way to go:
try:
    gotdata = dlist[1]
except IndexError:
    gotdata = 'null'
Of course you could also check the len() of dlist, but handling the exception is more intuitive.
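Both options can be sketched side by side. This is a minimal illustration, not the asker's actual code: the helper names second_or_null and second_or_null_len are hypothetical, and plain Python lists stand in for the dlist that soup.findAll would return for each page.

```python
def second_or_null(dlist):
    """Return the second item of dlist, or the string 'null' if it is missing."""
    try:
        return dlist[1]
    except IndexError:
        # Raised when dlist has fewer than two items.
        return 'null'


def second_or_null_len(dlist):
    """Same result, but checking the length up front instead of catching the error."""
    return dlist[1] if len(dlist) > 1 else 'null'


# Hypothetical stand-ins for the dlist found on three different pages.
pages = [['a0', 'a1', 'a2'], [], ['c0']]

# One entry is appended per page, so the order of the data is preserved.
newlist = [second_or_null(dlist) for dlist in pages]
print(newlist)  # ['a1', 'null', 'null']
```

The attempt with if not dlist[1] fails because Python evaluates dlist[1] before it can test the result, so the IndexError is raised by the indexing itself. Either catching the exception or comparing len(dlist) first avoids touching the missing element.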