I want to scrape some web data using CasperJS. The data is in a table, and each row contains a link to a page with more detail. My script loops over all the table rows. For each row I want Casper to click the link, collect the data on the sub-page, and go one step back in history to process the next row. The problem is that click() doesn't work and I don't know why. Is there any way to fix this? (Note: the link's href invokes a JavaScript function, viewContact.)
var employee = {
    last_name: "",
    first_name: "",
    position: "",
    department: "",
    location: "",
    email: "",
    phone: "",
    twitter: ""
};
var employees = [];
var result_number = 50;
var start_url = 'https://www.jigsaw.com/SearchContact.xhtml?companyId=489781&orderby=0&order=0&opCode=paging&mode=0&estimatedCount=126&dead=false&rpage=1&rowsPerPage=200';
var casper = require('casper').create({
    javascriptEnabled: true
});
casper.start(start_url, function() {
    var js = this.evaluate(function() {
        return document;
    });
});
for (var i = 1; i <= result_number; i++) {
    // j stands for three neighbour td columns containing:
    // position, name+link, location
    employee.position = this.getHTML('#sortableTable tr:nth-child(' + i + ') td:nth-child(3) span');
    // click link and get other data
    this.click('#sortableTable tr:nth-child(' + i + ') td:nth-child(4) span a');
    employee.first_name = this.getHTML('#sortableTable tr:nth-child(' + i + ') td:nth-child(4) span a');
    //collect data
    this.waitForSelector('#firstname', function() {
        employee.first_name = this.getHTML('#firstname');
    });
    this.waitForSelector('#lastname', function() {
        employee.last_name = this.getHTML('#lastname');
    });
    this.waitForSelector('#state', function() {
        employee.department = this.getHTML('#state');
    });
    this.waitForSelector('#email', function() {
        employee.email = this.getHTML('#email');
    });
    this.waitForSelector('#phone', function() {
        employee.phone = this.getHTML('#phone');
    });
    //get back to previous page
    employee.location = this.getHTML('#sortableTable tr:nth-child(' + i + ') td:nth-child(5) span');
    this.echo('\n\n Employee number: ' + i + " :\n");
    this.echo('first name : ' + employee.first_name);
    this.echo('last name : ' + employee.last_name);
    this.echo('position : ' + employee.position);
    this.echo('department : ' + employee.department);
    this.echo('location : ' + employee.location);
    this.echo('email : ' + employee.email);
    this.echo('phone : ' + employee.phone);
}
I see two things here that need to be corrected. First, the for loop in your code doesn't appear to be inside the scope of any CasperJS method:
for (var i = 1; i <= result_number; i++)
It should be inside casper.then().
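As a rough sketch of that first point, the loop can be moved inside a casper.then() callback so that every row queues its own steps. The selectors come from the question; rowCell and scheduleRows are hypothetical names used only for illustration, the casper argument is assumed to be an instance from require('casper').create(), and whether click() really triggers viewContact on that site is untested here.

```javascript
// Hypothetical helper: selector for the <span> in a given row and column.
function rowCell(row, col) {
    return '#sortableTable tr:nth-child(' + row + ') td:nth-child(' + col + ') span';
}

// Hypothetical helper: queues click / collect / back steps for each row.
// `casper` is assumed to be a CasperJS instance; `employees` is a plain array.
function scheduleRows(casper, rowCount, employees) {
    casper.then(function() {
        for (var i = 1; i <= rowCount; i++) {
            // The wrapper function gives each queued step its own `row` value.
            (function(row) {
                var employee = {};
                casper.then(function() {
                    employee.position = this.getHTML(rowCell(row, 3));
                    employee.location = this.getHTML(rowCell(row, 5));
                    this.click(rowCell(row, 4) + ' a');
                });
                casper.waitForSelector('#firstname', function() {
                    employee.first_name = this.getHTML('#firstname');
                    employee.last_name = this.getHTML('#lastname');
                    employees.push(employee);
                });
                casper.then(function() {
                    this.back(); // return to the table before the next row
                });
            })(i);
        }
    });
}
```

In a real script this would be called after casper.start(start_url), for example scheduleRows(casper, result_number, employees), followed by casper.run().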
Secondly, and most importantly, the tr:nth-child(' + i + ') selector you'd like to interact with won't work this way. I don't know why, but it isn't as straightforward as it looks. I tried to do the same thing, and my solution was to first convert i to a string instead of a number, like so:
pageturn = pageturn + 1;
// Collect <td> contents on each page.
var pageturnString = pageturn.toString();
var linknum = 'a.SomeLinkClass:nth-child('+pageturnString+')';
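For what it's worth, JavaScript's + operator already turns a number into a string when the other operand is a string, so the explicit toString() should not change the selector text. A small check, using a hypothetical linkSelector helper that is not part of the original code:

```javascript
// Hypothetical helper: builds the selector for one link by position.
function linkSelector(n) {
    return 'a.SomeLinkClass:nth-child(' + n + ')';
}

var fromNumber = linkSelector(3);              // number concatenated directly
var fromString = linkSelector((3).toString()); // converted to a string first

console.log(fromNumber);                // a.SomeLinkClass:nth-child(3)
console.log(fromNumber === fromString); // true
```

If the conversion seemed to help, the difference may lie in when the selector is built, not in its type.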
In my case I use this to click through to the next page. Either way, you must build the CSS selector inside a this.then() callback within the first method, and then let a second, nested this.then() do the rest of the for loop:
casper.each(pagecount, function() {
    this.then(function() {
        pageturn = pageturn + 1;
        // Collect <td> contents on each page.
        var pageturnString = pageturn.toString();
        var linknum = 'a.SomeLinkClass:nth-child(' + pageturnString + ')';
        this.then(function() {
            // Now run for loop here.
        });
    });
});
If you don't wrap the CSS selector construction in a this.then() callback before it is used in the next step, it won't work. I don't know why, but that's how it behaves. In my code, pagecount could possibly be used in place of your for loop, but I'll leave that up to you.
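One plausible explanation, offered here as an assumption and not something confirmed above, is timing: CasperJS queues steps and runs them later, after the surrounding loop has already finished, so a callback that reads the loop variable directly sees its final value. The sketch below imitates that with a plain array of functions; the then function and the steps queue are stand-ins written for this example, not CasperJS itself.

```javascript
// Minimal stand-in for a step queue; not CasperJS itself.
var steps = [];
function then(fn) { steps.push(fn); }

var seenShared = [];
var seenCaptured = [];

for (var i = 1; i <= 3; i++) {
    // Reads the shared `i` only when the step finally runs.
    then(function() { seenShared.push(i); });

    // Copies the current value into `row` at scheduling time.
    (function(row) {
        then(function() { seenCaptured.push(row); });
    })(i);
}

// The queue runs only after the loop has finished.
steps.forEach(function(fn) { fn(); });

console.log(seenShared);   // [ 4, 4, 4 ]
console.log(seenCaptured); // [ 1, 2, 3 ]
```

Building the selector inside its own step, or inside a wrapper function as here, fixes the value for that iteration, which would be consistent with the behaviour described.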