Problem description
We have a (long-running) Windows service that, among other things, periodically communicates with an FTP server embedded on a third-party device using FtpWebRequest. This works great most of the time, but sometimes our service stops communicating with the device; as soon as we restart our service, everything starts working again.
I've spent some time debugging this with an MCVE (included below) and discovered via Wireshark that once communication starts failing, there is no network traffic going to the external FTP server (no packets at all show up going to this IP in Wireshark). If I try to connect to the same FTP server from another application on the same machine, such as Windows Explorer, everything works fine.
Looking at the packets just before everything stops working, I see packets with the reset (RST) flag set coming from the device, so I suspect this may be the issue. My theory is that once some part of the network stack on the computer our service is running on receives the reset packet, it does what's described in the TCP resets section of this article and blocks all further communication from our process to the device.
As far as I can tell, there's nothing wrong with the way we're communicating with the device, and most of the time the exact same code works just fine. The easiest way to reproduce the issue (see MCVE below) seems to be to make a lot of separate connections to the FTP server at the same time, so I suspect the issue may occur when a lot of connections (not all of them ours) are being made to the FTP server at the same time.
The thing is that if we do restart our process everything works fine, and we do need to re-establish communication with the device. Is there a way to re-establish communication (after a suitable amount of time has passed) without having to restart the entire process?
Unfortunately, the FTP server is running embedded on a fairly old third-party device that's not likely to be updated to address this issue, and even if it were, we'd still want/need to communicate with all the ones already out in the field without requiring our customers to update them, if possible.
Options we are aware of:
- Using a command line FTP client such as the one built into Windows.
  - One downside to this is that we need to list all the files in a directory and then download only some of them, so we'd have to write logic to parse the response to this.
  - We'd also have to download the files to a temp file instead of to a stream like we do now.
- Creating another application that handles the FTP communication part that we tear down after each request completes.
  - The main downside here is that inter-process communication is a bit of a pain.
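For the second option, the inter-process part can be kept fairly small if the helper simply writes the file bytes to its standard output. The following is only a rough sketch under stated assumptions: the helper executable (passed in as `helperPath`) is hypothetical and is assumed to take the FTP URI as its only argument, write the file contents to stdout, and exit with a non-zero code on failure. None of that is part of the original code.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Threading.Tasks;

public static class FtpHelperProcess {
    // helperPath is a hypothetical executable (an assumption, not from the question):
    // it takes the FTP URI as its only argument, writes the file bytes to stdout,
    // and exits with a non-zero code on failure.
    public static async Task<byte[]> DownloadViaHelperAsync(
            string helperPath, Uri fileUri, TimeSpan timeout) {
        var startInfo = new ProcessStartInfo {
            FileName = helperPath,
            Arguments = "\"" + fileUri.AbsoluteUri + "\"",
            UseShellExecute = false,        // required for stream redirection
            RedirectStandardOutput = true,  // read the file bytes from stdout
            CreateNoWindow = true
        };

        using (var process = Process.Start(startInfo))
        using (var buffer = new MemoryStream()) {
            // Copy the raw stdout bytes into memory, so no temp file is needed.
            Task copyTask = process.StandardOutput.BaseStream.CopyToAsync(buffer);

            if (await Task.WhenAny(copyTask, Task.Delay(timeout)) != copyTask) {
                // Timed out: kill the helper, discarding whatever state it was in.
                try { process.Kill(); } catch (InvalidOperationException) { /* already exited */ }
                throw new TimeoutException(
                    "Helper did not finish within " + timeout.TotalSeconds + " seconds.");
            }

            await copyTask;          // rethrow any read error
            process.WaitForExit();   // make sure ExitCode is available
            if (process.ExitCode != 0) {
                throw new IOException("Helper exited with code " + process.ExitCode + ".");
            }
            return buffer.ToArray();
        }
    }
}
```

Because every request runs in a fresh process, any per-process connection state is thrown away when the helper exits, which is the property that restarting the service currently provides.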
MCVE
This runs in LINQPad and reproduces the issue fairly reliably. Typically the first several tasks succeed and then the issue occurs, and after that all tasks start timing out. In Wireshark I can see that no communication between my computer and the device is happening.
If I run the script again then all tasks fail until I restart LINQPad or do "Cancel All Threads and Reset" which restarts the process LINQPad uses to run the query. If I do either of those things then we're back to the first several tasks succeeding.
    // LINQPad "C# Program" query. Namespaces used: System.Collections.Immutable,
    // System.IO, System.Net, System.Threading, System.Threading.Tasks
    // (add any that are missing via Query Properties).
    async Task Main() {
        var tasks = new List<Task>();
        var numberOfBatches = 3;
        var numberOfTasksPerBatch = 10;
        foreach (var batchNumber in Enumerable.Range(1, numberOfBatches)) {
            $"Starting tasks in batch {batchNumber}".Dump();
            // Start a whole batch of downloads at once to stress the device.
            tasks.AddRange(Enumerable.Range(1, numberOfTasksPerBatch).Select(taskNumber => Connect(batchNumber, taskNumber)));
            await Task.Delay(TimeSpan.FromSeconds(5));
        }
        await Task.WhenAll(tasks);
    }

    async Task Connect(int batchNumber, int taskNumber) {
        try {
            var client = new FtpClient();
            var result = await client.GetFileAsync(new Uri("ftp://192.168.0.191/logging/20140620.csv"), TimeSpan.FromSeconds(10));
            result.Count.Dump($"Task {taskNumber} in batch {batchNumber} succeeded");
        } catch (Exception e) {
            e.Dump($"Task {taskNumber} in batch {batchNumber} failed");
        }
    }

    public class FtpClient {
        public virtual async Task<ImmutableList<Byte>> GetFileAsync(Uri fileUri, TimeSpan timeout) {
            if (fileUri == null) {
                throw new ArgumentNullException(nameof(fileUri));
            }
            FtpWebRequest ftpWebRequest = (FtpWebRequest)WebRequest.Create(fileUri);
            ftpWebRequest.Method = WebRequestMethods.Ftp.DownloadFile;
            ftpWebRequest.UseBinary = true;
            ftpWebRequest.KeepAlive = false;
            // The token source cancels itself once the timeout elapses.
            using (var source = new CancellationTokenSource(timeout)) {
                try {
                    using (var response = (FtpWebResponse)await ftpWebRequest.GetResponseAsync()
                            .WithWaitCancellation(source.Token)) {
                        using (Stream ftpStream = response.GetResponseStream()) {
                            if (ftpStream == null) {
                                throw new InvalidOperationException("No response stream");
                            }
                            using (var dataStream = new MemoryStream()) {
                                await ftpStream.CopyToAsync(dataStream, 4096, source.Token)
                                    .WithWaitCancellation(source.Token);
                                return dataStream.ToArray().ToImmutableList();
                            }
                        }
                    }
                } catch (OperationCanceledException) {
                    // Surface the timeout the same way a WebRequest would.
                    throw new WebException(
                        String.Format("Operation timed out after {0} seconds.", timeout.TotalSeconds),
                        WebExceptionStatus.Timeout);
                } finally {
                    ftpWebRequest.Abort();
                }
            }
        }
    }

    public static class TaskCancellationExtensions {
        /// http://stackoverflow.com/a/14524565/1512
        public static async Task<T> WithWaitCancellation<T>(
                this Task<T> task,
                CancellationToken cancellationToken) {
            // The task completion source, completed when the token is cancelled.
            var tcs = new TaskCompletionSource<Boolean>();
            // Register with the cancellation token.
            using (cancellationToken.Register(
                    s => ((TaskCompletionSource<Boolean>)s).TrySetResult(true),
                    tcs)) {
                // If the cancellation token fired first, stop waiting.
                // Note that the underlying operation itself is not cancelled.
                if (task != await Task.WhenAny(task, tcs.Task)) {
                    throw new OperationCanceledException(cancellationToken);
                }
            }
            // The task finished first; await it to get its result or rethrow its exception.
            return await task;
        }

        /// http://stackoverflow.com/a/14524565/1512
        public static async Task WithWaitCancellation(
                this Task task,
                CancellationToken cancellationToken) {
            // The task completion source, completed when the token is cancelled.
            var tcs = new TaskCompletionSource<Boolean>();
            // Register with the cancellation token.
            using (cancellationToken.Register(
                    s => ((TaskCompletionSource<Boolean>)s).TrySetResult(true),
                    tcs)) {
                // If the cancellation token fired first, stop waiting.
                // Note that the underlying operation itself is not cancelled.
                if (task != await Task.WhenAny(task, tcs.Task)) {
                    throw new OperationCanceledException(cancellationToken);
                }
            }
            // The task finished first; await it to rethrow its exception, if any.
            await task;
        }
    }
This reminds me of the old(?) IE behaviour of not reloading pages even when the network came back after N unsuccessful tries.
You should try setting the FtpWebRequest's cache policy to BypassCache, after setting KeepAlive:

    HttpRequestCachePolicy bypassPolicy = new HttpRequestCachePolicy(
        HttpRequestCacheLevel.BypassCache
    );
    ftpWebRequest.CachePolicy = bypassPolicy;
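To show where this suggestion would sit relative to the request setup in the question's GetFileAsync, here is a minimal sketch. The `FtpRequestFactory` class and `CreateDownloadRequest` method are hypothetical names introduced only for illustration, and whether bypassing the cache actually resolves the stalled connections is untested here.

```csharp
using System;
using System.Net;
using System.Net.Cache;

public static class FtpRequestFactory {
    // Hypothetical helper (not in the original code): builds the download request
    // the same way GetFileAsync does, then applies the suggested cache policy.
    public static FtpWebRequest CreateDownloadRequest(Uri fileUri) {
        if (fileUri == null) {
            throw new ArgumentNullException(nameof(fileUri));
        }
        var request = (FtpWebRequest)WebRequest.Create(fileUri);
        request.Method = WebRequestMethods.Ftp.DownloadFile;
        request.UseBinary = true;
        request.KeepAlive = false;
        // Suggested change: bypass the request cache, set after KeepAlive.
        request.CachePolicy = new HttpRequestCachePolicy(HttpRequestCacheLevel.BypassCache);
        return request;
    }
}
```

GetFileAsync could then obtain its request from this method and leave the rest of the download logic unchanged.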