Problem description
I'm trying to use PhantomJS/CasperJS to scrape a webpage. I've spent the last few days reading the docs and searching online, but I'm stuck.
The page I'm scraping shows three levels of links: years, months, and days. When you select a year, month, and day, a count appears in the #count div. Also, the months are actually inputs that change an image in the #imageLoad div (which I don't need).
<div id="years">
    <span class="year">2010</span>
    <span class="year">2011</span>
    <span class="year">2012</span>
    etc...
</div>
<div id="months">
    <input type="image" class="month" src="jan_image.png" onclick="changepic('jan')" />
    <input type="image" class="month" src="feb_image.png" onclick="changepic('feb')" />
    <input type="image" class="month" src="mar_image.png" onclick="changepic('mar')" />
    etc...
</div>
<div id="days">
    <span class="day">1</span>
    <span class="day">2</span>
    <span class="day">3</span>
    etc...
</div>
<div id="imageLoad">
</div>
<div id="count">
</div>
I'm trying to loop through the links and record the count that appears for each combination of years, months, and days. As you can see, the months are inputs that change the picture.
I tried a number of things. The main thing I want to do is a nested loop that loops through each set of links, clicking them as I go. Here is the code (I'm using jQuery):
casper.start(link);
casper.then(function() {
    pageInfo = this.evaluate(function(){
        values = [];
        for(var y = 0; y < $('#years').length; y++){
            year = $('#years span').get(y);
            $(year).click();
            for(var m = 0; m < $('#months').length; m++){
                month = $('#months input').get(m);
                $(month).click();
                for(var d = 0; d < $('#days').length; d++){
                    day = $('#days span').get(d);
                    $(day).click();
                    values.push($('#count').text());
                }
            }
        }
        return values;
    });
});
I thought this would loop through each set of links in order, and that I would get the values for every combination of year, month, and day.
However, when I click on the month inputs in my script, the script breaks and goes to the next casper.then statement. Is there a better way for me to do this?
I have the feeling that I'm going about this the wrong way, but nothing else I've tried has been fruitful either. It always seems like once it breaks to the next "then" there's no coming back to my loop.
I've tried looping with casper.each, but I don't know beforehand how many elements there will be.
Thanks in advance.
Recommended answer
Just for the record, this example shows the right way to perform operations in a nested loop using CasperJS:
https://github.com/n1k0/casperjs/blob/master/samples/dynamic.js
It shouldn't take much work to adapt it to what you need.
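For illustration, here is a minimal sketch of one way the nested loop could be expressed as queued CasperJS steps instead of clicks inside a single evaluate() call. It is not the code from the linked sample. The helper names queueCombination and queueAllCombinations are hypothetical, and the sketch assumes that the clicks do not navigate away from the page, that the spans and inputs are the only element children of their containers (so :nth-child lines up with them), and that the number of days is the same for every month.

```javascript
// Sketch only, not the code from dynamic.js. `casper` is expected to be a
// CasperJS instance; the two helper names below are hypothetical.

// Queue the steps for one (year, month, day) combination.
// Indices are 1-based because they are fed to :nth-child().
function queueCombination(casper, y, m, d, results) {
    casper.thenClick('#years span:nth-child(' + y + ')');
    casper.thenClick('#months input:nth-child(' + m + ')');
    casper.thenClick('#days span:nth-child(' + d + ')');
    casper.then(function () {
        // Runs as its own step, after the three click steps above.
        results.push({ year: y, month: m, day: d, count: this.fetchText('#count') });
    });
}

// Queue every combination. `counts` looks like { years: 3, months: 12, days: 31 }.
function queueAllCombinations(casper, counts, results) {
    var y, m, d;
    for (y = 1; y <= counts.years; y++) {
        for (m = 1; m <= counts.months; m++) {
            for (d = 1; d <= counts.days; d++) {
                queueCombination(casper, y, m, d, results);
            }
        }
    }
    return results;
}

// Possible wiring inside a CasperJS script (left commented out here):
// var casper = require('casper').create();
// var results = [];
// casper.start(link);
// casper.then(function () {
//     // Count the elements at run time, since the totals are not known up front.
//     var counts = this.evaluate(function () {
//         return {
//             years: document.querySelectorAll('#years span').length,
//             months: document.querySelectorAll('#months input').length,
//             days: document.querySelectorAll('#days span').length
//         };
//     });
//     queueAllCombinations(this, counts, results);
// });
// casper.run(function () {
//     this.echo(JSON.stringify(results));
//     this.exit();
// });
```

Because the year, month, and day indices are passed as function arguments, each queued step keeps its own values rather than sharing the loop variables. If the count is filled in asynchronously after a click, a waitFor-style step before reading #count would probably be needed as well.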