Problem description
I have a node.js service and an Angular client that use socket.io to transport messages during a long-running HTTP request.
Service:
export const socketArray: SocketIO.Socket[] = [];
export let socketMapping: {[socketId: string]: number} = {};
const socketRegister: hapi.Plugin<any> = {
    register: (server) => {
        const io: SocketIO.Server = socket(server.listener);
        // Whenever a session connected to socket, create a socket object and add it to socket array
        io.on("connection", (socket) => {
            console.log(`socket ${socket.id} connected`);
            logger.info(`socket ${socket.id} connected`);
            // Only put socket object into array if init message received
            socket.on("init", msg => {
                logger.info(`socket ${socket.id} initialized`);
                socketArray.push(socket);
                socketMapping[socket.id] = msg;
            });
            // Remove socket object from socket array when disconnected
            socket.on("disconnect", (reason) => {
                console.log(`socket ${socket.id} disconnected because: ${reason}`)
                logger.info(`socket ${socket.id} disconnected because: ${reason}`);
                for(let i = 0; i < socketArray.length; i ++) {
                    if(socketArray[i] === socket) {
                        socketArray.splice(i, 1);
                        return;
                    }
                }
            });
        });
    },
    name: "socketRegister",
    version: "1.0"
}
export const socketSender = async (socketId: string, channel: string, content: SocketMessage) => {
    try {
        // Add message to db here
        // await storeMessage(socketMapping[socketId], content);
        // Find corresponding socket and send message
        logger.info(`trying sending message to ${socketId}`);
        for (let i = 0; i < socketArray.length; i ++) {
            if (socketArray[i].id === socketId) {
                socketArray[i].emit(channel, JSON.stringify(content));
                logger.info(`socket ${socketId} send message to ${channel}`);
                if (content.isFinal == true) {
                    // TODO: delete all messages of the process if isFinal is true
                    await deleteProcess(content.processId);
                }
                return;
            }
        }
    } catch (err) {
        logger.error("Socket sender error: ", err.message);
    }
};
Client:
connectSocket() {
    if (!this.socket) {
        try {
            this.socket = io(socketUrl);
            this.socket.emit('init', 'some-data');
        } catch (err) {
            console.log(err);
        }
    } else if (this.socket.disconnected) {
        this.socket.connect();
        this.socket.emit('init', 'some-data');
    }
    this.socket.on('some-channel', (data) => {
        // Do something
    });
    this.socket.on('disconnect', (data) => {
        console.log(data);
    });
}
They usually work fine, but disconnections occur at random. My log file shows the following:
2018-07-21T00:20:28.209Z[x]INFO: socket 8jBh7YC4A1btDTo_AAAN connected
2018-07-21T00:20:28.324Z[x]INFO: socket 8jBh7YC4A1btDTo_AAAN initialized
2018-07-21T00:21:48.314Z[x]INFO: socket 8jBh7YC4A1btDTo_AAAN disconnected because: ping timeout
2018-07-21T00:21:50.849Z[x]INFO: socket C6O7Vq38ygNiwGHcAAAO connected
2018-07-21T00:23:09.345Z[x]INFO: trying sending message to C6O7Vq38ygNiwGHcAAAO
At the same moment as the disconnect message, the front end also observed a disconnect event with the reason "transport close".
From the log, the workflow appears to be:
- The front end opened a socket connection and sent an init message to the back end. It also saved the socket.
- The back end detected the connection and received the init message.
- The back end put the socket into the array so that it could be used anytime, anywhere.
- The first socket was disconnected unexpectedly, and another connection was established without the front end being aware of it, so the front end never sent a message to initialize it.
- Since the front end's saved socket had not changed, it used the old socket id when it made the HTTP request. As a result, the back end tried to send the message to the old socket, which had already been removed from the socket array.
This doesn't happen frequently. Does anyone know what could cause the disconnect and the unexpected new connection?
Recommended answer
It really depends on what the "long time http request" is doing. node.js runs your JavaScript on a single thread, so it can literally do only one thing at a time. But since much of what servers do is I/O-related (reading from a database, getting data from a file, getting data from another server, etc.) and node.js uses event-driven asynchronous I/O, it can often have many balls in the air at once, so it appears to be working on lots of requests simultaneously.
But if your complex HTTP request is CPU-intensive, it hogs the single JavaScript thread and nothing else can get done in the meantime. All incoming HTTP or socket.io requests then have to wait in a queue until the one node.js JavaScript thread is free to grab the next event from the event queue and start processing it.
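To make the effect concrete, here is a minimal sketch (not taken from the asker's code) that measures how late a pending timer fires while synchronous work occupies the thread. The function name `measureTimerDelay` and the 10 ms timer are assumptions chosen for illustration; the timer simply stands in for any queued event, such as a socket.io heartbeat. If heartbeats are held up for long enough, a ping timeout like the one in the log is one plausible outcome.

```typescript
// Measures how late a timer fires when synchronous work hogs the event loop.
// The 10 ms timer stands in for any pending event, such as a socket.io heartbeat.
function measureTimerDelay(blockMs: number): Promise<number> {
  return new Promise((resolve) => {
    const scheduledFor = Date.now() + 10;
    setTimeout(() => resolve(Date.now() - scheduledFor), 10);

    // Busy-wait: nothing else, including the timer above, can run until this loop ends.
    const end = Date.now() + blockMs;
    while (Date.now() < end) {
      // simulate CPU-intensive request handling
    }
  });
}

measureTimerDelay(200).then((lateBy) => {
  console.log(`timer fired about ${lateBy} ms late`);
});
```

The timer was due after 10 ms, but it cannot run until the busy loop returns, so the reported lateness tracks the length of the blocking work.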
We could only help you more specifically if we could see the code for this "very complex http request".
The usual way around CPU-hogging code in node.js is to offload the CPU-intensive work to other processes. If it's mostly just this one piece of code that causes the problem, you can spin up several child processes (perhaps as many as the number of CPUs in your server), feed them the CPU-intensive work, and leave your main node.js process free to handle incoming (non-CPU-intensive) requests with very low latency.
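As a rough sketch of that idea, the snippet below runs a CPU-heavy calculation in a separate node process using `child_process.execFile`. The function `fibInChild` and the naive Fibonacci job are hypothetical stand-ins for whatever the real request computes; a production setup would more likely keep long-lived workers around rather than spawn one process per job.

```typescript
import { execFile } from "child_process";

// Hypothetical CPU-heavy job: a deliberately naive Fibonacci, run in a separate node process.
function fibInChild(n: number): Promise<number> {
  if (!Number.isInteger(n) || n < 0) {
    return Promise.reject(new Error("n must be a non-negative integer"));
  }
  // Inline script executed by the child via `node -e`; n is a validated integer.
  const script = `
    const fib = (k) => (k < 2 ? k : fib(k - 1) + fib(k - 2));
    process.stdout.write(String(fib(${n})));
  `;
  return new Promise((resolve, reject) => {
    execFile(process.execPath, ["-e", script], (err, stdout) => {
      if (err) {
        reject(err);
      } else {
        resolve(Number(stdout));
      }
    });
  });
}

// The parent only waits on I/O here, so its event loop stays free for other requests.
fibInChild(30).then((value) => console.log(`fib(30) = ${value}`));
```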
If you have multiple operations that might hog the CPU, then you either have to farm them all out to child processes (probably via some sort of work queue) or deploy clustering. The challenge with clustering is that a given socket.io connection is tied to one particular process in your cluster, and if that process happens to be executing a CPU-hogging operation, all the socket.io connections assigned to it will suffer bad latency. So regular clustering is probably not a good fit for this type of issue. A work queue with multiple specialized child processes for the CPU-intensive work is probably better, because those processes are not responsible for any outside socket.io connections.
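One possible shape for such a work queue is sketched below. `WorkQueue` and `simulatedJob` are illustrative names, not part of any library; the jobs here are simulated with timers, and in a real deployment each task would hand its work to a child process instead.

```typescript
type Task<T> = () => Promise<T>;

// A small FIFO work queue that runs at most `concurrency` tasks at a time.
class WorkQueue {
  private running = 0;
  private readonly pending: Array<() => void> = [];
  private readonly concurrency: number;

  constructor(concurrency: number) {
    this.concurrency = concurrency;
  }

  // Queue a task; the returned promise settles with the task's result.
  add<T>(task: Task<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      const done = () => {
        this.running--;
        this.next();
      };
      const start = () => {
        this.running++;
        Promise.resolve()
          .then(task)
          .then(
            (value) => { done(); resolve(value); },
            (err) => { done(); reject(err); }
          );
      };
      this.pending.push(start);
      this.next();
    });
  }

  // Start queued tasks while a slot is free.
  private next(): void {
    while (this.running < this.concurrency && this.pending.length > 0) {
      const start = this.pending.shift()!;
      start();
    }
  }
}

// Simulated jobs that record how many run at once.
const queue = new WorkQueue(2);
let active = 0;
let peak = 0;
const simulatedJob = (id: number): Promise<number> =>
  new Promise((resolve) => {
    active++;
    peak = Math.max(peak, active);
    setTimeout(() => { active--; resolve(id * 2); }, 20);
  });

const results = Promise.all([1, 2, 3, 4, 5].map((id) => queue.add(() => simulatedJob(id))));
results.then((values) => console.log(values, `peak concurrency: ${peak}`));
```

Setting the concurrency to roughly the number of CPUs matches the sizing suggested above for child processes.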
Also, you should know that synchronous file I/O blocks the entire node.js JavaScript thread. node.js cannot run any other JavaScript during a synchronous file I/O operation. node.js gets its scalability, and its ability to have many operations in flight at the same time, from its asynchronous I/O model. If you use synchronous I/O, you completely break that model and ruin scalability and responsiveness.
Synchronous file I/O belongs only in server startup code or in a single-purpose script (not a server). It should never be used while processing a request in a server.
Two ways to make asynchronous file I/O a little more tolerable are streams and async/await with promisified fs methods.
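Both approaches can be sketched briefly. The example below uses the built-in `fs.promises` API as one way of getting promisified fs methods, and `fs.createReadStream` for the streaming variant; the helper names (`readTextFile`, `countBytes`, `demo`) and the temporary sample file are assumptions made for the sake of a self-contained example.

```typescript
import { createReadStream, promises as fsp } from "fs";
import * as os from "os";
import * as path from "path";

// Read a whole file with async/await; the event loop stays free while the read is pending.
async function readTextFile(filePath: string): Promise<string> {
  return fsp.readFile(filePath, "utf8");
}

// Count bytes with a stream, so the file never has to fit in memory at once.
function countBytes(filePath: string): Promise<number> {
  return new Promise((resolve, reject) => {
    let total = 0;
    createReadStream(filePath)
      .on("data", (chunk) => { total += chunk.length; })
      .on("error", reject)
      .on("end", () => resolve(total));
  });
}

// Write a small sample file to a temporary directory, read it both ways, then clean up.
async function demo(): Promise<{ text: string; bytes: number }> {
  const dir = await fsp.mkdtemp(path.join(os.tmpdir(), "async-io-demo-"));
  const file = path.join(dir, "sample.txt");
  await fsp.writeFile(file, "hello socket.io\n", "utf8");
  const text = await readTextFile(file);
  const bytes = await countBytes(file);
  await fsp.unlink(file);
  await fsp.rmdir(dir);
  return { text, bytes };
}

demo().then((result) => console.log(result));
```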