Let's continue with how path and association failure and recovery are managed.
II. Manage transport and association
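Multi-homing management for an association works mainly at the transport level, but when enough transports/paths fail, the association goes down as well. Tracking path updates, failures, and recovery is therefore inseparable from managing association failure and recovery.
Every transmission failure on a path (that is, no SACK is received) increments the association's error counter in addition to the path's own. Except while the primary transport (primary path) is carrying DATA, link state is generally probed by sending HEARTBEATs: on the primary transport when it is idle, and on the alternate transports (backup paths).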
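The error counters for a path and an association are, respectively:
transport->error_count and asoc->overall_error_count
The states of a path and an association are as follows:
path state: 0-Inactive (SCTP_INACTIVE), 1-Active (SCTP_ACTIVE), 2-Unconfirmed (SCTP_UNCONFIRMED).
assoc state: 0 to max, where 4 means established. The "ST" column in /proc/net/sctp/assocs shows the association state.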
1. When a path is updated
Function: sctp_assoc_control_transport  # net/sctp/associola.c
Targets: asoc->peer.primary_path
         asoc->peer.active_path
         asoc->peer.retran_path
Event types: up / down
(1). SCTP_TRANSPORT_UP call sites: sctp_check_transmitted, sctp_cmd_transport_on
(2). SCTP_TRANSPORT_DOWN call site: sctp_do_8_2_transport_strike  # may update active_path!
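The remark that a DOWN event may update active_path can be illustrated with a small user-space model. This is only a sketch: `struct path`, `struct assoc`, `reselect_active_path`, `control_path`, and the `last_time_heard` tick are hypothetical names made up for illustration, and the selection policy (keep the primary path if it is active, otherwise take the most recently heard active path) is an assumed simplification, not a copy of sctp_assoc_control_transport.

```c
#include <assert.h>
#include <stddef.h>

/* Path states, numbered as in the list above (illustrative, not the kernel enum). */
enum path_state { PATH_INACTIVE = 0, PATH_ACTIVE = 1, PATH_UNCONFIRMED = 2 };

struct path {
    enum path_state state;
    unsigned long   last_time_heard;  /* hypothetical "last heard from peer" tick */
};

struct assoc {
    struct path *paths;         /* all paths of this association */
    size_t       npaths;
    struct path *primary_path;
    struct path *active_path;
};

/* Re-pick active_path: keep the primary if it is ACTIVE, otherwise take the
 * most recently heard ACTIVE path, falling back to the primary if none is. */
void reselect_active_path(struct assoc *a)
{
    struct path *best = NULL;

    assert(a != NULL && a->primary_path != NULL);
    if (a->primary_path->state == PATH_ACTIVE) {
        a->active_path = a->primary_path;
        return;
    }
    for (size_t i = 0; i < a->npaths; i++) {
        struct path *p = &a->paths[i];

        if (p->state != PATH_ACTIVE)
            continue;
        if (best == NULL || p->last_time_heard > best->last_time_heard)
            best = p;
    }
    a->active_path = best ? best : a->primary_path;
}

/* Apply an up/down event to one path, then update the association. */
void control_path(struct assoc *a, struct path *p, int up)
{
    assert(a != NULL && p != NULL);
    p->state = up ? PATH_ACTIVE : PATH_INACTIVE;
    reselect_active_path(a);
}
```

In this model, taking the primary path down moves active_path to an alternate path, and bringing the primary back up moves it back again.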
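2. DOWN: when a path is taken down
When a path's error count exceeds the maximum (configurable via /proc/sys/net/sctp/path_max_retrans), the path is marked as down.
Function: sctp_do_8_2_transport_strike. The relevant source is shown below: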
/* The check for association's overall error counter exceeding the
 * threshold is done in the state function.
 */
/* We are here due to a timer expiration. If the timer was
 * not a HEARTBEAT, then normal error tracking is done.
 * If the timer was a heartbeat, we only increment error counts
 * when we already have an outstanding HEARTBEAT that has not
 * been acknowledged.
 * Additionally, some transport states inhibit error increments.
 */
if (!is_hb) {
	asoc->overall_error_count++;
	if (transport->state != SCTP_INACTIVE)
		transport->error_count++;	/* count the transmission failure; same below */
} else if (transport->hb_sent) {
	if (transport->state != SCTP_UNCONFIRMED)
		asoc->overall_error_count++;
	if (transport->state != SCTP_INACTIVE)
		transport->error_count++;
}
/* ... (SCTP_PF state handling omitted) */
if (transport->state != SCTP_INACTIVE &&
    (transport->error_count > transport->pathmaxrxt)) {	/* compare against the path failure threshold */
	SCTP_DEBUG_PRINTK_IPADDR("transport_strike:association %p",
				 " transport IP: port:%d failed.\n",
				 asoc,
				 (&transport->ipaddr),
				 ntohs(transport->ipaddr.v4.sin_port));
	sctp_assoc_control_transport(asoc, transport,
				     SCTP_TRANSPORT_DOWN,	/* take the path down */
				     SCTP_FAILED_THRESHOLD);
}
When sctp_do_8_2_transport_strike is called (always from within sctp_cmd_interpreter):
(1). SCTP_CMD_STRIKE -> sctp_do_8_2_transport_strike
Triggered by: sctp_sf_do_6_3_3_rtx, sctp_sf_t2_timer_expire, sctp_sf_t4_timer_expire
(2). SCTP_CMD_TRANSPORT_RESET -> sctp_cmd_transport_reset -> sctp_do_8_2_transport_strike
Triggered by: sctp_sf_sendbeat_8_3, sctp_sf_do_prm_requestheartbeat
3. UP: when transport->error_count is cleared, indicating that the path has recovered
(1). sctp_cmd_interpreter(SCTP_CMD_UPDATE_ASSOC) -> sctp_assoc_update -> sctp_transport_reset
(2). sctp_cmd_interpreter(SCTP_CMD_PROCESS_SACK) -> sctp_cmd_process_sack -> sctp_outq_sack -> sctp_check_transmitted // SACK received
(3). sctp_cmd_interpreter(SCTP_CMD_TRANSPORT_ON) -> sctp_cmd_transport_on
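4. When the association is torn down
When the association's error count reaches the maximum (configurable via /proc/sys/net/sctp/association_max_retrans), the association is torn down.
Functions: sctp_sf_do_6_3_3_rtx, sctp_sf_sendbeat_8_3, sctp_sf_t4_timer_expire, and others.
Taking sctp_sf_do_6_3_3_rtx as an example: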
if (asoc->overall_error_count >= asoc->max_retrans) {	/* check the association failure count */
	if (asoc->state == SCTP_STATE_SHUTDOWN_PENDING) {
		/*
		 * We are here likely because the receiver had its rwnd
		 * closed for a while and we have not been able to
		 * transmit the locally queued data within the maximum
		 * retransmission attempts limit. Start the T5
		 * shutdown guard timer to give the receiver one last
		 * chance and some additional time to recover before
		 * aborting.
		 */
		sctp_add_cmd_sf(commands, SCTP_CMD_TIMER_START_ONCE,
				SCTP_TO(SCTP_EVENT_TIMEOUT_T5_SHUTDOWN_GUARD));
	} else {
		sctp_add_cmd_sf(commands, SCTP_CMD_SET_SK_ERR,
				SCTP_ERROR(ETIMEDOUT));
		/* CMD_ASSOC_FAILED calls CMD_DELETE_TCB. */
		sctp_add_cmd_sf(commands, SCTP_CMD_ASSOC_FAILED,	/* tear down the association */
				SCTP_PERR(SCTP_ERROR_NO_ERROR));
		SCTP_INC_STATS(net, SCTP_MIB_ABORTEDS);
		SCTP_DEC_STATS(net, SCTP_MIB_CURRESTAB);
		return SCTP_DISPOSITION_DELETE_TCB;
	}
}
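The decision made in the excerpt above can be reduced to a three-way verdict. The sketch below is a model only: `rtx_assoc_state`, `rtx_verdict`, and `check_assoc_threshold` are hypothetical names, and only the two association states relevant to this branch are modelled.

```c
#include <assert.h>

/* Only the two cases the excerpt distinguishes (hypothetical names). */
enum rtx_assoc_state { RTX_STATE_OTHER, RTX_STATE_SHUTDOWN_PENDING };

enum rtx_verdict {
    RTX_CONTINUE,        /* below the threshold: keep retransmitting */
    RTX_START_T5_GUARD,  /* at the threshold while shutting down: start T5, do not abort yet */
    RTX_ABORT_ASSOC      /* at the threshold otherwise: fail the association (ETIMEDOUT) */
};

/* Decide what a retransmission timeout should do to the association. */
enum rtx_verdict check_assoc_threshold(int overall_error_count,
                                       int max_retrans,
                                       enum rtx_assoc_state state)
{
    assert(overall_error_count >= 0 && max_retrans >= 0);

    /* Note the >= comparison: reaching the limit is already enough. */
    if (overall_error_count < max_retrans)
        return RTX_CONTINUE;
    if (state == RTX_STATE_SHUTDOWN_PENDING)
        return RTX_START_T5_GUARD;
    return RTX_ABORT_ASSOC;
}
```

Unlike the per-path check, which uses a strict greater-than comparison, this one triggers as soon as the counter equals max_retrans.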
5. When asoc->overall_error_count is cleared, indicating that the association has recovered
(1). sctp_cmd_interpreter(SCTP_CMD_UPDATE_ASSOC) -> sctp_assoc_update
(2). sctp_cmd_interpreter(SCTP_CMD_PROCESS_SACK) -> sctp_cmd_process_sack -> sctp_outq_sack -> sctp_check_transmitted // SACK received
(3). sctp_cmd_interpreter(SCTP_CMD_TRANSPORT_ON) -> sctp_cmd_transport_on
(4). sctp_cmd_interpreter(SCTP_CMD_GEN_SHUTDOWN)
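Both thresholds discussed above are exposed as files under /proc/sys/net/sctp, so their current values can be checked from user space. The helper below is a minimal sketch: `read_int_file` and `print_sctp_thresholds` are hypothetical names, and the files are assumed to exist only when SCTP support is loaded on the machine.

```c
#include <assert.h>
#include <stdio.h>

/* Read one integer from a sysctl-style file.
 * Returns -1 if the file cannot be opened or does not start with a number. */
int read_int_file(const char *path)
{
    FILE *f;
    int value;

    assert(path != NULL);

    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Print both thresholds; -1 means the value could not be read. */
void print_sctp_thresholds(void)
{
    printf("path_max_retrans        = %d\n",
           read_int_file("/proc/sys/net/sctp/path_max_retrans"));
    printf("association_max_retrans = %d\n",
           read_int_file("/proc/sys/net/sctp/association_max_retrans"));
}
```

Calling print_sctp_thresholds() from a small main() is enough to see which limits apply to the per-path and per-association checks on a given system.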
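PS: Since the SCTP code is relatively stable, the kernel version analyzed is 2.6.21 unless stated otherwise.
One more gripe: it would be nice if the CU blog editor supported pasting directly from Word, which would save redoing the formatting every time. Do I really have to resort to screenshots...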