Node.js - Using a callback function with Cheerio

Question

I'm building a scraper in Node, which uses request and cheerio to load in pages and parse them.

It's important that my callback runs only AFTER request and cheerio have finished loading the page. I'm trying to use the async extension, but I'm not entirely sure where to put the callback.

request(url, function (err, resp, body) {
    var $;
    if (err) {
        console.log("Error!: " + err + " using " + url);
    } else {
        async.series([
            function (callback) {
                $ = cheerio.load(body);
                callback();
            },
            function (callback) {
               // do stuff with the `$` content here
            }
        ]);
    }
});

I've been reading the cheerio documentation and can't find any examples of callbacks for when the content has been loaded in.

What's the best way to do it? When I throw 50 URLs at the script it starts moving on too early before cheerio has properly loaded in content, and I'm trying to curb any errors with async loading.

I'm totally new to asynchronous programming and callbacks in general so if I'm missing something simple here please let me know.

Recommended answer

Yes, cheerio.load is synchronous and you don't need any callbacks for it.

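var request = require('request');
var cheerio = require('cheerio');

// `url` is assumed to be defined by the surrounding code.
request(url, function (err, resp, body) {
  if (err) {
    return console.log("Error!: " + err + " using " + url);
  }
  // cheerio.load parses the HTML synchronously and returns immediately.
  var $ = cheerio.load(body);
  // do stuff with the `$` content here
});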
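
For the 50-URL case mentioned in the question, the only asynchronous step is the HTTP request, so one option is to cap how many requests are in flight at once and do the parsing inline in each response callback. The following is a minimal sketch of that idea, not part of request, cheerio, or async: `scrapeAll`, `fetchPage`, and `handlePage` are hypothetical names introduced here. It assumes `fetchPage(url, cb)` is any Node-style fetcher that calls `cb(err, body)` (for example a thin wrapper around `request`), and that `handlePage(body)` is synchronous (for example a function that calls `cheerio.load(body)` and extracts data).

```javascript
// Hypothetical helper: process `urls` with at most `limit` fetches in flight.
// fetchPage(url, cb)  - assumed Node-style async fetcher, calls cb(err, body)
// handlePage(body)    - assumed synchronous parser, returns the extracted value
// done(results)       - called exactly once, after every URL has been handled
function scrapeAll(urls, limit, fetchPage, handlePage, done) {
  var results = new Array(urls.length);
  var next = 0;      // index of the next URL to start
  var finished = 0;  // number of URLs fully handled

  if (urls.length === 0) {
    return done(results);
  }

  function startNext() {
    if (next >= urls.length) {
      return; // nothing left to start
    }
    var i = next++;
    fetchPage(urls[i], function (err, body) {
      // Parsing is synchronous, so once handlePage returns this page is done.
      results[i] = err
        ? { url: urls[i], error: err }
        : { url: urls[i], value: handlePage(body) };
      finished++;
      if (finished === urls.length) {
        return done(results);
      }
      startNext(); // a slot freed up, start the next URL
    });
  }

  // Fill the initial slots; each completed fetch then starts the next one.
  for (var k = 0; k < limit && k < urls.length; k++) {
    startNext();
  }
}
```

Results are stored by index, so they come back in the same order as `urls` regardless of which request finishes first. The async library offers comparable ready-made helpers if you would rather not hand-roll this.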

10-10 05:59