Problem description
I want to get all the iframes from a webpage.
Code:
import urllib2
from pprint import pprint
from bs4 import BeautifulSoup

site = "http://" + url
f = urllib2.urlopen(site)
web_content = f.read()
soup = BeautifulSoup(web_content)
info = {}
content = []
for iframe in soup.find_all('iframe'):
    info['src'] = iframe.get('src')
    info['height'] = iframe.get('height')
    info['width'] = iframe.get('width')
    content.append(info)
    print(info)
pprint(content)
Result of print(info):
{'src': u'abc.com', 'width': u'0', 'height': u'0'}
{'src': u'xyz.com', 'width': u'0', 'height': u'0'}
{'src': u'http://www.detik.com', 'width': u'1000', 'height': u'600'}
Result of pprint(content):
[{'height': u'600', 'src': u'http://www.detik.com', 'width': u'1000'},
{'height': u'600', 'src': u'http://www.detik.com', 'width': u'1000'},
{'height': u'600', 'src': u'http://www.detik.com', 'width': u'1000'}]
Why are the values in content wrong? They are supposed to be the same as the values I see when I print(info).
Recommended answer
You are not creating a separate dictionary for each iframe. You keep modifying the same dictionary over and over, and each pass through the loop appends another reference to that one dictionary to your list.
Remember, when you do something like content.append(info), you aren't making a copy of the data; you are simply appending a reference to the data. That is why every entry in content shows the values from the last iframe.
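The reference behaviour can be seen without any web scraping at all. The following is a minimal sketch, not the asker's code: the names shared, aliased, and separate are made up for illustration, and the two strings stand in for iframe src values.

```python
# Buggy pattern: one dict is created once, then mutated and appended repeatedly.
shared = {}
aliased = []
for src in ["abc.com", "xyz.com"]:
    shared["src"] = src
    aliased.append(shared)  # appends a reference to the same dict, not a copy

# Every list element is the very same object, so all of them show the last value.
same_object = all(item is shared for item in aliased)
print(aliased)
print(same_object)

# Fixed pattern: a fresh dict is built on each iteration.
separate = []
for src in ["abc.com", "xyz.com"]:
    separate.append({"src": src})
print(separate)
```

If the loop really must reuse one dictionary, appending a copy such as dict(shared) would also break the aliasing, though creating a new dictionary per iteration is simpler.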
You need to create a new dictionary for each iframe.
for iframe in soup.find_all('iframe'):
    info = {}
    ...
Even better, you don't need to create an empty dictionary first. Just create it all at once:
for iframe in soup.find_all('iframe'):
    info = {
        "src": iframe.get('src'),
        "height": iframe.get('height'),
        "width": iframe.get('width'),
    }
    content.append(info)
There are other ways to accomplish this, such as iterating over a list of attributes, or using list or dictionary comprehensions, but it's hard to improve upon the clarity of the above code.
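For comparison, here is one possible shape of the comprehension-based alternative mentioned above. It is only a sketch: collect_attributes and ATTRIBUTES are hypothetical names, and plain dictionaries stand in for the tags returned by soup.find_all('iframe'), since both offer a get method that returns None for a missing key.

```python
# Attribute names to pull from each element (assumed for this example).
ATTRIBUTES = ("src", "height", "width")


def collect_attributes(elements, attributes=ATTRIBUTES):
    """Return a list with one newly created dict per element."""
    return [
        # The dict comprehension builds a new dict on every iteration,
        # so no two list entries share the same object.
        {name: element.get(name) for name in attributes}
        for element in elements
    ]


# Stand-in data: plain dicts play the role of parsed iframe tags.
fake_iframes = [
    {"src": "abc.com", "width": "0", "height": "0"},
    {"src": "http://www.detik.com", "width": "1000", "height": "600"},
]
content = collect_attributes(fake_iframes)
print(content)
```

With Beautiful Soup the call would presumably be collect_attributes(soup.find_all('iframe')). Whether this reads better than the explicit loop is a matter of taste.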