Problem description
I'm using rust_bert to summarise text. I need to set up a model with rust_bert::pipelines::summarization::SummarizationModel::new, which fetches the model from the internet. It does this asynchronously using tokio, and the issue that (I think) I'm running into is that I am running a Tokio runtime within another Tokio runtime, as indicated by the error message:
Downloading https://cdn.huggingface.co/facebook/bart-large-cnn/config.json to "/home/(censored)/.cache/.rustbert/bart-cnn/config.json"
thread 'main' panicked at 'Cannot start a runtime from within a runtime. This happens because a function (like `block_on`) attempted to block the current thread while the thread is being used to drive asynchronous tasks.', /home/(censored)/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.21/src/runtime/enter.rs:38:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
I've tried running the model fetching synchronously with tokio::task::spawn_blocking and tokio::task::block_in_place, but neither of them works for me. block_in_place gives the same error as if it weren't there, and spawn_blocking doesn't really seem to be of use to me. I've also tried making summarize_text async, but that didn't help much. GitHub issue tokio-rs/tokio#2194 and the Reddit post "'Cannot start a runtime from within a runtime.' with Actix-Web And Postgresql" seem similar (same-ish error message), but they weren't of much help in finding a solution.
The code I've got issues with is as follows:
use egg_mode::tweet;
use rust_bert::pipelines::summarization::SummarizationModel;

fn summarize_text(model: SummarizationModel, text: &str) -> String {
    let output = model.summarize(&[text]);
    // @TODO: output summarization
    match output.is_empty() {
        false => "FALSE".to_string(),
        true => "TRUE".to_string(),
    }
}

#[tokio::main]
async fn main() {
    let model = SummarizationModel::new(Default::default()).unwrap();
    let token = egg_mode::auth::Token::Bearer("obviously not my token".to_string());
    let tweet_id = 1221552460768202756; // example tweet
    println!("Loading tweet [{id}]", id = tweet_id);
    let status = tweet::show(tweet_id, &token).await;
    match status {
        Err(err) => println!("Failed to fetch tweet: {}", err),
        Ok(tweet) => {
            println!(
                "Original tweet:\n{orig}\n\nSummarized tweet:\n{sum}",
                orig = tweet.text,
                sum = summarize_text(model, &tweet.text)
            );
        }
    }
}
Recommended answer
Solving the problem
This is a reduced example:
use tokio; // 1.0.2

#[tokio::main]
async fn inner_example() {}

#[tokio::main]
async fn main() {
    inner_example();
}
thread 'main' panicked at 'Cannot start a runtime from within a runtime. This happens because a function (like `block_on`) attempted to block the current thread while the thread is being used to drive asynchronous tasks.', /playground/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.0.2/src/runtime/enter.rs:39:9
To avoid this, you need to run the code that creates the second Tokio runtime on a completely independent thread. The easiest way to do this is to use std::thread::spawn:
use std::thread;

#[tokio::main]
async fn inner_example() {}

#[tokio::main]
async fn main() {
    thread::spawn(|| {
        inner_example();
    }).join().expect("Thread panicked")
}
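To see why a separate thread helps, here is a toy model that uses only the standard library. It assumes, roughly in the spirit of the check that produces the panic above, that a runtime records per thread whether it is already being driven. The names ENTERED, EnterGuard, try_enter and toy_block_on are hypothetical and are not Tokio's API; this is a sketch of the idea, not Tokio's implementation.

```rust
use std::cell::Cell;
use std::thread;

thread_local! {
    // Hypothetical per-thread flag: "a toy runtime is running on this thread".
    static ENTERED: Cell<bool> = Cell::new(false);
}

/// Clears the flag again when the toy runtime stops.
struct EnterGuard;

impl Drop for EnterGuard {
    fn drop(&mut self) {
        ENTERED.with(|e| e.set(false));
    }
}

/// Marks the current thread as entered, or fails if it already is.
fn try_enter() -> Result<EnterGuard, &'static str> {
    ENTERED.with(|e| {
        if e.get() {
            Err("Cannot start a runtime from within a runtime")
        } else {
            e.set(true);
            Ok(EnterGuard)
        }
    })
}

/// Toy `block_on`: runs `work` while the current thread is marked as entered.
fn toy_block_on<T>(work: impl FnOnce() -> T) -> Result<T, &'static str> {
    let _guard = try_enter()?;
    Ok(work())
}

/// Nesting on the same thread: the inner call sees the flag and fails.
fn nested_on_same_thread() -> Result<u32, &'static str> {
    toy_block_on(|| toy_block_on(|| 42))?
}

/// Nesting via a fresh thread: the new thread has its own, unset flag.
fn nested_on_new_thread() -> Result<u32, &'static str> {
    toy_block_on(|| {
        thread::spawn(|| toy_block_on(|| 42))
            .join()
            .expect("Thread panicked")
    })?
}

fn main() {
    println!("same thread: {:?}", nested_on_same_thread());
    println!("new thread:  {:?}", nested_on_new_thread());
}
```

Because the flag is thread-local, the freshly spawned thread starts with it unset, which is the property the std::thread::spawn workaround relies on.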
For improved performance, you may wish to use a threadpool instead of creating a new thread each time. Conveniently, Tokio itself provides such a threadpool via spawn_blocking:
#[tokio::main]
async fn inner_example() {}

#[tokio::main]
async fn main() {
    tokio::task::spawn_blocking(|| {
        inner_example();
    }).await.expect("Task panicked")
}
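To illustrate what reusing a thread buys over spawning one per call, here is a minimal standard-library sketch of a single long-lived worker that runs submitted jobs. The Worker type and its run method are hypothetical names made up for this example. This is not how Tokio implements spawn_blocking, and unlike spawn_blocking the run method blocks the caller instead of returning something you can await.

```rust
use std::sync::mpsc;
use std::thread;

type Job = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical one-thread "pool": a single long-lived thread runs every job.
struct Worker {
    sender: Option<mpsc::Sender<Job>>,
    handle: Option<thread::JoinHandle<()>>,
}

impl Worker {
    fn new() -> Self {
        let (sender, receiver) = mpsc::channel::<Job>();
        let handle = thread::spawn(move || {
            // Runs until every sender has been dropped.
            for job in receiver {
                job();
            }
        });
        Worker { sender: Some(sender), handle: Some(handle) }
    }

    /// Runs `work` on the worker thread and blocks until its result arrives.
    fn run<T, F>(&self, work: F) -> T
    where
        F: FnOnce() -> T + Send + 'static,
        T: Send + 'static,
    {
        let (result_tx, result_rx) = mpsc::channel();
        let job: Job = Box::new(move || {
            // Ignore the error if the caller stopped waiting.
            let _ = result_tx.send(work());
        });
        self.sender
            .as_ref()
            .expect("worker already shut down")
            .send(job)
            .expect("worker thread exited");
        result_rx.recv().expect("job panicked before sending a result")
    }
}

impl Drop for Worker {
    fn drop(&mut self) {
        // Closing the channel ends the worker's loop; then wait for it to finish.
        drop(self.sender.take());
        if let Some(handle) = self.handle.take() {
            let _ = handle.join();
        }
    }
}

fn main() {
    let worker = Worker::new();
    let first = worker.run(|| thread::current().id());
    let second = worker.run(|| thread::current().id());
    // Both jobs ran on the same reused thread, which is not the caller's thread.
    assert_eq!(first, second);
    assert_ne!(first, thread::current().id());
}
```

In real async code you would stick with spawn_blocking as shown above; the sketch only shows that the jobs share one thread that is distinct from the caller's.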
In some cases you don't need to actually create a second Tokio runtime and can instead reuse the parent runtime. To do so, you pass in a Handle to the outer runtime. You can optionally use a lightweight executor like futures::executor to block on the result, if you need to wait for the work to finish:
use tokio::runtime::Handle; // 1.0.2

fn inner_example(handle: Handle) {
    futures::executor::block_on(async {
        handle
            .spawn(async {
                // Do work here
            })
            .await
            .expect("Task spawned in Tokio executor panicked")
    })
}

#[tokio::main]
async fn main() {
    let handle = Handle::current();
    tokio::task::spawn_blocking(|| {
        inner_example(handle);
    })
    .await
    .expect("Blocking task panicked")
}
A better path is to avoid creating nested Tokio runtimes in the first place. Ideally, if a library uses an asynchronous executor, it would also offer the direct asynchronous function so you could use your own executor.
It's worth looking at the API to see if there is a non-blocking alternative, and if not, raising an issue on the project's repository.
You may also be able to reorganize your code so that the Tokio runtimes are not nested but are instead sequential:
struct Data;

#[tokio::main]
async fn inner_example() -> Data {
    Data
}

#[tokio::main]
async fn core(_: Data) {}

fn main() {
    let data = inner_example();
    core(data);
}