WO2010022635A1 - Method for preventing thread hanging in a multi-thread communication program - Google Patents

Method for preventing thread hanging in a multi-thread communication program

Info

Publication number
WO2010022635A1
Authority
WO
WIPO (PCT)
Prior art keywords
thread
tcp connection
sub
link information
child
Prior art date
Application number
PCT/CN2009/073424
Other languages
English (en)
French (fr)
Inventor
李冰
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Priority to EP09809220.8A (EP2323344A4)
Priority to BRPI0917076A (BRPI0917076A2)
Publication of WO2010022635A1 (zh)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/16 Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/485 Task life-cycle, e.g. stopping, restarting, resuming execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/16 Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
    • H04L69/165 Combined use of TCP and UDP protocols; selection criteria therefor

Definitions

  • The present invention relates to the field of multi-thread control technologies, and in particular to a method for preventing thread hanging in a multi-thread communication program.
  • Background: With the development of intelligent networks, the communication program of an intelligent network must link to a growing number of nodes, the volume of communication data keeps increasing, and the efficiency required of the communication program rises accordingly.
  • The master thread creates a pair of child threads for the node on each TCP connection, namely a receiving child thread and a sending child thread, which receive and send messages on that TCP connection respectively.
  • The master thread also records information such as the node module number, the office number, the IP address, and the thread ID (identification) numbers of the sending and receiving child threads in a global link information table.
  • When a link is established or broken, the communication program needs to create or close the sending and receiving child threads and update the corresponding information in the link information table.
  • The link information table may be updated by any of the master thread, the receiving child thread, or the sending child thread, so a mutex is generally used to ensure that only one thread operates on the table at a time.
  • The master thread, the receiving child thread, and the sending child thread run concurrently. When a receiving or sending child thread finds the link abnormal, it updates the link information table after acquiring the mutex. Meanwhile the master thread, having received no heartbeat message, also detects the abnormality; it closes the link and reclaims the thread resources of that link, generally by killing the threads. If a child thread is killed while it holds the mutex, the mutex becomes unavailable, and the master thread waits indefinitely the next time it tries to acquire it, so the thread hangs.
  • The technical problem to be solved by the present invention is to provide a method for preventing thread hanging in a multi-thread communication program, so that threads in the communication program do not hang and the communication program keeps operating normally.
  • According to one aspect, the communication program comprises: a master thread for initiating and/or processing link establishment requests; at least one pair of child threads, each pair corresponding to one TCP connection and consisting of a receiving child thread that receives messages on the connection and a sending child thread that sends messages on it; and a link information table storing the link information of each TCP connection. The method comprises: when a first child thread detects that its first TCP connection is abnormal, it sets an exit flag in the link information of the first TCP connection and exits; the master thread periodically polls the link information table and, on detecting the exit flag in the link information of the first TCP connection, closes the first TCP connection, kills all remaining child threads corresponding to it, and clears its link information.
  • The first child thread detects whether the first TCP connection is abnormal according to whether a send error or a receive error occurs on the first TCP connection.
  • When the first child thread fails to exit, it enters a dormant state and sets, in the link information of the first TCP connection, status information identifying that it is dormant; after detecting the status information, the master thread kills the dormant first child thread.
  • The master thread closes the first TCP connection by closing the socket descriptor of the first TCP connection.
  • The master thread kills all remaining child threads corresponding to the first TCP connection according to the child thread ID numbers saved in the link information of the first TCP connection.
  • According to another aspect, the communication program has the same master thread, child thread pairs, and link information table, and the method comprises: when the master thread detects that the first TCP connection is abnormal, it closes the first TCP connection; when a child thread corresponding to the first TCP connection detects that the connection is abnormal, it sets the exit flag in the link information of the first TCP connection and exits.
  • The master thread periodically polls the link information table and, on detecting the exit flag in the link information of the first TCP connection, kills all remaining child threads corresponding to the first TCP connection and clears the link information of the first TCP connection.
  • The child thread corresponding to the first TCP connection detects whether the connection is abnormal according to whether a send error or a receive error occurs on the first TCP connection.
  • When the child thread corresponding to the first TCP connection fails to exit, it enters a dormant state and sets, in the link information of the first TCP connection, status information indicating that it is dormant; after detecting the status information, the master thread kills the dormant child thread.
  • The master thread closes the first TCP connection by closing the socket descriptor of the first TCP connection.
  • The master thread kills all remaining child threads corresponding to the first TCP connection according to the child thread ID numbers saved in the link information of the first TCP connection.
  • When a communication child thread detects a connection abnormality, it indicates the abnormality by setting the exit flag. The receiving child thread and the sending child thread can each set their own exit flag, and the master thread kills the corresponding child threads according to the exit flag. This ensures that the master thread and the receiving/sending child threads never operate on the same content of the link information table at the same time.
  • The specific flag and procedure avoid the use of a mutex, which fundamentally prevents thread hangs caused by an unavailable mutex and keeps the communication program operating normally.
  • After detecting that a TCP connection is abnormal, the master thread does not directly kill the receiving/sending child threads of that connection. It first closes the TCP connection and waits for those child threads to exit on their own, and kills a child thread only after that thread fails to exit. This avoids the unpredictable effects on the communication program that a failed data reception or transmission could have if the child threads were killed directly.
  • The present invention can be implemented in standard C, with good portability.
  • FIG. 1 is a structural diagram of an intelligent network multi-SCP system to which an embodiment of the present invention applies.
  • FIG. 2 is a system architecture diagram of an embodiment of the present invention.
  • FIG. 3 is a flowchart of the method for preventing thread hanging according to an embodiment of the present invention.
  • FIG. 4 is a flowchart of the method for preventing thread hanging according to another embodiment of the present invention.
  • According to the method of the embodiments, when a link becomes abnormal, a specific processing order and processing manner ensure that the master thread and the receiving/sending child threads never update the same information in the link information table at the same time. Each thread can therefore update the table without acquiring a mutex, which avoids thread hangs in the communication program and keeps services running normally.
  • As shown in FIG. 1, the system has multiple Service Control Points (SCPs), each of which runs its own services. The SCPs are managed by one Service Management Point (SMP). Long-lived TCP connections are established between the SCPs and between each SCP and the SMP.
  • As shown in FIG. 2, the communication program on each SCP and SMP creates at startup a master thread for initiating and/or processing link establishment requests, then establishes long-lived TCP connections with the other SCPs and SMPs. For each established connection it creates a pair of child threads (a sending child thread and a receiving child thread) that send and receive messages respectively. It also creates a global link information table and records the link information of each TCP connection in it.
  • When a TCP connection becomes abnormal, both the communication child threads (the receiving/sending child threads) and the master thread may detect the abnormality. The two cases are described separately below.
  • Case 1: a communication child thread detects the TCP connection abnormality.
  • As shown in FIG. 3, the method of this embodiment includes the following steps. Step 31: The communication program starts the master thread, establishes long-lived TCP connections with the other SCPs and SMPs, creates a pair of child threads for each established connection, and records the link information of each TCP connection in the link information table.
  • Step 32: When a communication child thread encounters an error while receiving or sending a message on its TCP connection (hereinafter the first TCP connection), the first TCP connection is judged to be abnormal. The child thread sets an exit flag (EXITFLAG) in the link information of the first TCP connection, which indicates whether the receiving/sending child thread of that connection needs to exit because of an abnormality. The child thread then tries to exit on its own; if the exit fails, it goes to sleep.
  • The communication child thread here may be either the receiving child thread or the sending child thread of the first TCP connection.
  • Step 33: After starting, the master thread periodically polls the link information table to determine whether EXITFLAG is set in the link information of any TCP connection. When it polls the first TCP connection, it detects that EXITFLAG is set.
  • The master thread then closes the socket descriptor of the first TCP connection, which closes the connection. Using the thread identifier (ID) numbers saved in the link information of the first TCP connection, it kills the communication child threads of that connection, and it updates the link information table by clearing the link information of the first TCP connection.
  • Because one of the communication child threads of the first TCP connection may already have exited successfully on its own, the master thread only needs to kill the remaining child threads of that connection. The master thread then continues to poll the next TCP connection and performs the same check.
  • In step 32, when a communication child thread fails to exit and goes to sleep, it may also set status information in the link information marking itself as dormant, so that in step 33 the master thread can kill the dormant child thread after detecting that status information.
  • In this embodiment, a communication child thread that detects a connection abnormality indicates it by setting the exit flag. The receiving child thread and the sending child thread can each set their own exit flag, and the master thread kills the corresponding child threads according to the exit flag, which ensures that the master thread and the receiving/sending child threads never operate on the same content of the link information table at the same time.
  • The specific flag and procedure avoid the use of a mutex, so thread hangs caused by an unavailable mutex are avoided and the communication program keeps operating normally.
  • Case 2: the master thread detects the TCP connection abnormality.
  • As shown in FIG. 4, the method of this embodiment includes the following steps. Step 41: The communication program starts the master thread, establishes long-lived TCP connections with the other SCPs and SMPs, creates a pair of child threads for each established connection, and records the link information of each TCP connection in the link information table.
  • Step 42: After starting, the master thread checks at regular intervals whether a heartbeat message has been received from the peer of each TCP connection. If no heartbeat arrives within a specified time (for example, 20 seconds), the connection is considered abnormal, and the master thread closes it by closing its socket descriptor (this connection is assumed here to be the first TCP connection).
  • Step 43: Because the first TCP connection has been closed, the communication child threads of that connection encounter a receive or send error. After detecting the error, the communication child thread sets EXITFLAG in the link information of the first TCP connection and tries to exit on its own; if the exit fails, it goes to sleep.
  • Step 44: After starting, the master thread periodically polls the link information table to determine whether EXITFLAG is set in the link information of any TCP connection.
  • When the master thread polls the first TCP connection, it detects that EXITFLAG is set. Using the thread identifier (ID) numbers saved in the link information of the first TCP connection, it kills the communication child threads of that connection. After the thread resources (such as the stack) have been reclaimed, it updates the link information table by clearing the link information of the first TCP connection.
  • Because one of the communication child threads of the first TCP connection may already have exited successfully on its own, the master thread only needs to kill the remaining child threads of that connection. The master thread then continues to poll the next TCP connection and performs the same check.
  • In this embodiment, after detecting that the first TCP connection is abnormal, the master thread does not directly kill the communication child threads of that connection. It first closes the first TCP connection and waits for the communication child threads to exit on their own, and kills a communication child thread only after that thread fails to exit.
  • The reason is that when the first TCP connection becomes abnormal, its receiving/sending child threads may still be receiving or sending data. Killing them directly could make that reception or transmission fail, with unpredictable effects on the communication program, so this embodiment lets the child threads exit on their own.
  • In summary, by setting the exit flag and following the specific procedure, the embodiments effectively prevent threads in the communication program from hanging and avoid long communication interruptions between the SCPs and the SMP.
  • The embodiments use the communication program between the SCPs and the SMP as a specific implementation, but the present invention is not limited to it. Those skilled in the art will understand that obvious modifications of form and detail can be made without departing from the spirit and scope of the invention. The embodiments described above are therefore illustrative rather than restrictive, and all such changes and modifications fall within the scope of protection of the invention.
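The hazard that motivates the method, a thread dying while it holds the mutex that guards the link information table, can be illustrated with a short sketch. This is not code from the patent: `table_lock` and both function names are hypothetical, and because Python cannot kill a thread, the kill is modelled by a thread that takes the lock and ends without releasing it.

```python
import threading

# Hypothetical stand-in for the mutex guarding the link information table.
table_lock = threading.Lock()


def child_killed_while_holding_lock():
    """Model a child thread that is killed right after taking the mutex.

    Python cannot kill a thread, so the kill is modelled by returning
    without ever releasing the lock.
    """
    table_lock.acquire()
    # The thread is "killed" here, before it can call table_lock.release().


def master_try_lock(timeout_s=0.2):
    """Master thread's next attempt to lock the table.

    The timeout exists only so the demonstration terminates; a plain
    blocking acquire() would wait forever, which is the hang.
    """
    acquired = table_lock.acquire(timeout=timeout_s)
    if acquired:
        table_lock.release()
    return acquired


child = threading.Thread(target=child_killed_while_holding_lock)
child.start()
child.join()  # the child is gone, but the lock it took is still held

master_got_lock = master_try_lock()
print("master acquired lock:", master_got_lock)
```

The master's attempt times out because nothing is left that could release the lock. The method described in this document sidesteps the situation by not using a lock on the table at all.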

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Communication Control (AREA)

Description

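Method for Preventing Thread Hanging in a Multi-Thread Communication Program

Technical Field

The present invention relates to the field of multi-thread control technologies, and in particular to a method for preventing thread hanging in a multi-thread communication program.

Background

With the development of intelligent networks, the communication program of an intelligent network must link to a growing number of nodes, the volume of communication data keeps increasing, and the efficiency required of the communication program rises accordingly. Most intelligent network communication programs in use today adopt a multi-thread architecture: the program has a master thread that processes link establishment (TCP connection establishment) requests or initiates them. For the node on each TCP connection, the master thread creates a pair of child threads, a receiving child thread and a sending child thread, which receive and send messages on that TCP connection respectively. The master thread also records information such as the node module number, the office number, the IP address, and the thread ID (identification) numbers of the sending and receiving child threads in a global link information table.

When a link is established or broken, the communication program needs to create or close the sending and receiving child threads and update the corresponding information in the link information table. Because any of the master thread, the receiving child thread, or the sending child thread may update the link information table, a mutex is generally used to ensure that only one thread operates on the table at a time. The master thread, the receiving child thread, and the sending child thread run concurrently. When a receiving or sending child thread finds the link abnormal, it updates the link information table after acquiring the mutex. Meanwhile the master thread, having received no heartbeat message, also detects the abnormality; it closes the link and reclaims the thread resources of that link, generally by killing the threads. If a receiving or sending child thread is killed by the master thread while it holds the mutex, the mutex becomes unavailable, and the next time the master thread tries to acquire the mutex it waits indefinitely, so the thread hangs.

It can be seen that in the prior art, when multi-thread mode is used for machine-to-machine communication in an intelligent network, the master thread may cause a communication thread to hang when it closes the threads of a link.

Summary of the Invention

The technical problem to be solved by the present invention is to provide a method for preventing thread hanging in a multi-thread communication program, so that threads in the communication program do not hang and the communication program keeps operating normally.

To solve this technical problem, the present invention is implemented through the following solutions.

According to one aspect of the present invention, a method for preventing thread hanging in a multi-thread communication program is provided. The communication program includes: a master thread for initiating and/or processing link establishment requests; at least one pair of child threads, where each pair corresponds to one TCP connection and consists of a receiving child thread that receives messages on that connection and a sending child thread that sends messages on it; and a link information table for storing the link information of each TCP connection. The method includes: when a first child thread detects that the first TCP connection corresponding to it is abnormal, it sets an exit flag in the link information of the first TCP connection and exits; the master thread periodically polls the link information table and, on detecting the exit flag in the link information of the first TCP connection, closes the first TCP connection, kills all remaining child threads corresponding to the first TCP connection, and clears the link information of the first TCP connection.

Preferably, in the above method, the first child thread detects whether the first TCP connection is abnormal according to whether a send error or a receive error occurs on the first TCP connection.

Preferably, in the above method, when the first child thread fails to exit, it enters a dormant state and sets, in the link information of the first TCP connection, status information identifying that the first child thread is dormant; after detecting the status information, the master thread kills the dormant first child thread.

Preferably, in the above method, the master thread closes the first TCP connection by closing the socket descriptor of the first TCP connection.

Preferably, in the above method, the master thread kills all remaining child threads corresponding to the first TCP connection according to the child thread ID numbers saved in the link information of the first TCP connection.

According to another aspect of the present invention, a further method for preventing thread hanging in a multi-thread communication program is provided. The communication program includes the same master thread, child thread pairs, and link information table as above. The method includes: when the master thread detects that the first TCP connection is abnormal, it closes the first TCP connection; when a child thread corresponding to the first TCP connection detects that the connection is abnormal, it sets an exit flag in the link information of the first TCP connection and exits; the master thread periodically polls the link information table and, on detecting the exit flag in the link information of the first TCP connection, kills all remaining child threads corresponding to the first TCP connection and clears the link information of the first TCP connection.

Preferably, in the above method, the child thread corresponding to the first TCP connection detects whether the connection is abnormal according to whether a send error or a receive error occurs on the first TCP connection.

Preferably, in the above method, when the child thread corresponding to the first TCP connection fails to exit, it enters a dormant state and sets, in the link information of the first TCP connection, status information identifying that it is dormant; after detecting the status information, the master thread kills the dormant child thread.

Preferably, in the above method, the master thread closes the first TCP connection by closing the socket descriptor of the first TCP connection.

Preferably, in the above method, the master thread kills all remaining child threads corresponding to the first TCP connection according to the child thread ID numbers saved in the link information of the first TCP connection.

As described above, in the method provided by the present invention, a communication child thread that detects a connection abnormality indicates it by setting the exit flag. The receiving child thread and the sending child thread can each set their own exit flag, and the master thread kills the corresponding child threads according to the exit flag. This ensures that the master thread and the receiving/sending child threads never operate on the same content of the link information table at the same time. The specific flag and procedure thus avoid the use of a mutex, which fundamentally prevents thread hangs caused by an unavailable mutex and keeps the communication program operating normally. In addition, after detecting that a TCP connection is abnormal, the master thread does not directly kill the receiving/sending child threads of that connection. It first closes the TCP connection and waits for those child threads to exit on their own, and kills a child thread only after that thread fails to exit. This avoids the unpredictable effects on the communication program that a failed data reception or transmission could have if the child threads were killed directly. Finally, the present invention can be implemented in standard C, with good portability.

Other features and advantages of the present invention will be set forth in the description that follows, and in part will become apparent from the description or be understood by practicing the invention. The objectives and other advantages of the invention can be realized and obtained through the structures particularly pointed out in the written description, the claims, and the drawings.

Brief Description of the Drawings

The drawings provide a further understanding of the present invention and form part of the specification. Together with the embodiments they serve to explain the invention and do not limit it. In the drawings:

FIG. 1 is a structural diagram of an intelligent network multi-SCP system to which an embodiment of the present invention applies;
FIG. 2 is a system architecture diagram of an embodiment of the present invention;
FIG. 3 is a flowchart of the method for preventing thread hanging according to an embodiment of the present invention;
FIG. 4 is a flowchart of the method for preventing thread hanging according to another embodiment of the present invention.

Detailed Description

Functional Overview

According to the method of the embodiments, when a link becomes abnormal, a specific processing order and processing manner ensure that the master thread and the receiving/sending child threads never update the same information in the link information table at the same time. Each thread can therefore update the table without acquiring a mutex, which avoids thread hangs in the communication program and keeps services running normally.

The present invention is described in detail below through specific embodiments with reference to the drawings.

The following embodiments take an intelligent network multi-SCP system as an example. As shown in FIG. 1, such a system has multiple Service Control Points (SCPs), each of which runs its own services. The SCPs are managed by one Service Management Point (SMP). Long-lived TCP connections are established between the SCPs and between each SCP and the SMP.

FIG. 2 is a system architecture diagram of an embodiment of the present invention. As shown in FIG. 2, the communication program on each SCP and SMP creates at startup a master thread for initiating and/or processing link establishment requests, then establishes long-lived TCP connections with the other SCPs and SMPs. For each established connection it creates a pair of child threads (a sending child thread and a receiving child thread) that send and receive messages respectively. It also creates a global link information table and records the link information of each TCP connection in it.

When a TCP connection becomes abnormal, both the communication child threads (the receiving/sending child threads) and the master thread may detect the abnormality. The two cases are described separately below.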
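One possible shape for the link information table is sketched below. The description names the recorded items (module number, office number, IP address, child thread IDs) and states that the receiving and sending child threads can each set an exit flag, but it gives no layout, so every field and function name here is hypothetical. The single-writer-per-field arrangement is one reading of "never update the same information at the same time", not something the text prescribes.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class LinkInfo:
    """One row of the link information table (field names are hypothetical)."""
    module_no: int                         # node module number
    office_no: int                         # office number
    ip_address: str
    sock_fd: int                           # socket descriptor (assumed to be stored here)
    recv_thread_id: Optional[int] = None   # written by the master thread only
    send_thread_id: Optional[int] = None   # written by the master thread only
    recv_exit_flag: bool = False           # written by the receiving child only
    send_exit_flag: bool = False           # written by the sending child only

    def exit_requested(self) -> bool:
        """True once either child thread has set its exit flag."""
        return self.recv_exit_flag or self.send_exit_flag


LinkTable = Dict[str, LinkInfo]


def record_link(table: LinkTable, key: str, info: LinkInfo) -> None:
    """Master thread records a newly established connection."""
    table[key] = info


table: LinkTable = {}
record_link(table, "scp-2",
            LinkInfo(2, 1, "10.0.0.2", sock_fd=7,
                     recv_thread_id=101, send_thread_id=102))

# The receiving child reports an abnormality by touching only its own field.
table["scp-2"].recv_exit_flag = True
print(table["scp-2"].exit_requested())
```

Because each field has exactly one writer, the master thread can read the flags during polling without any thread having to take a lock on the table.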
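1. A communication child thread detects the TCP connection abnormality

As shown in FIG. 3, the method of this embodiment for preventing thread hanging in a multi-thread communication program includes the following steps.

Step 31: The communication program starts the master thread, establishes long-lived TCP connections with the other SCPs and SMPs, creates a pair of child threads for each established connection, and records the link information of each TCP connection in the link information table.

Step 32: When a communication child thread encounters an error while receiving or sending a message on its TCP connection (for convenience, hereinafter the first TCP connection), the first TCP connection is judged to be abnormal. The communication child thread sets an exit flag (EXITFLAG) in the link information of the first TCP connection, which indicates whether the receiving/sending child thread of that connection needs to exit because of an abnormality. The communication child thread then tries to exit on its own; if the exit fails, it goes to sleep. The communication child thread here may be either the receiving child thread or the sending child thread of the first TCP connection.

Step 33: After starting, the master thread periodically polls the link information table to determine whether EXITFLAG is set in the link information of any TCP connection. Because EXITFLAG has been set in the link information of the first TCP connection, the master thread detects it when it polls that connection. The master thread then closes the socket descriptor of the first TCP connection, which closes the connection. Using the thread identifier (ID) numbers saved in the link information of the first TCP connection, it kills the communication child threads of that connection, and it updates the link information table by clearing the link information of the first TCP connection. Because one of the communication child threads of the first TCP connection may already have exited successfully on its own, the master thread only needs to kill the remaining child threads of that connection. The master thread then continues to poll the next TCP connection and performs the same check.

In step 32, when a communication child thread fails to exit and goes to sleep, it may also set status information in the link information marking itself as dormant, so that in step 33 the master thread can kill the dormant child thread after detecting that status information.

As can be seen from the above, in this embodiment a communication child thread that detects a connection abnormality indicates it by setting the exit flag. The receiving child thread and the sending child thread can each set their own exit flag, and the master thread kills the corresponding child threads according to the exit flag. This ensures that the master thread and the receiving/sending child threads never operate on the same content of the link information table at the same time. The specific flag and procedure thus avoid the use of a mutex, which fundamentally prevents thread hangs caused by an unavailable mutex and keeps the communication program operating normally.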
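Steps 32 and 33 can be walked through with a small single-threaded simulation. It is a sketch under stated assumptions rather than the patented implementation: real threads, sockets, and the kill call are replaced by a set of live thread IDs (`alive`) and an action log, and all names (`LinkInfo`, `child_on_io_error`, `master_poll_once`) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LinkInfo:
    sock_fd: int
    thread_ids: dict                               # role -> thread ID
    exit_flag: dict = field(default_factory=dict)  # role -> True once set
    sleeping: dict = field(default_factory=dict)   # role -> True if dormant


def child_on_io_error(link, role, alive, exit_succeeds=True):
    """Step 32: the child sets its own exit flag, then tries to exit."""
    link.exit_flag[role] = True
    if exit_succeeds:
        alive.discard(link.thread_ids[role])   # voluntary exit
    else:
        link.sleeping[role] = True             # exit failed: go dormant


def master_poll_once(table, alive, log):
    """Step 33: one polling pass of the master thread over the table."""
    for key in list(table):
        link = table[key]
        if not any(link.exit_flag.values()):
            continue                            # connection is healthy
        log.append(("close", link.sock_fd))     # close the socket descriptor
        for tid in link.thread_ids.values():
            if tid in alive:                    # kill only the remaining threads
                alive.discard(tid)
                log.append(("kill", tid))
        del table[key]                          # clear the link information


alive = {101, 102, 201, 202, 301, 302}
table = {
    "scp-2": LinkInfo(7, {"recv": 101, "send": 102}),
    "scp-3": LinkInfo(8, {"recv": 201, "send": 202}),
    "smp": LinkInfo(9, {"recv": 301, "send": 302}),
}
log = []
child_on_io_error(table["scp-2"], "recv", alive)                       # exits cleanly
child_on_io_error(table["scp-3"], "send", alive, exit_succeeds=False)  # ends up dormant
master_poll_once(table, alive, log)
print(log)
```

On `scp-2` the receiving child has already left, so only the sending child is killed. On `scp-3` the dormant sending child is still alive, so both threads are killed. The healthy `smp` row is left untouched.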
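2. The master thread detects the TCP connection abnormality

As shown in FIG. 4, the method of this embodiment for preventing thread hanging in a multi-thread communication program includes the following steps.

Step 41: The communication program starts the master thread, establishes long-lived TCP connections with the other SCPs and SMPs, creates a pair of child threads for each established connection, and records the link information of each TCP connection in the link information table.

Step 42: After starting, the master thread checks at regular intervals whether the heartbeat of each TCP connection is normal, that is, whether a heartbeat message has been received from the peer of the connection. If no heartbeat message arrives from the peer within a specified time (for example, 20 seconds), the connection is considered abnormal, and the master thread closes it by closing the socket descriptor of the connection (this connection is assumed here to be the first TCP connection).

Step 43: Because the first TCP connection has been closed, the communication child threads of that connection encounter a receive or send error. After detecting the error, the communication child thread sets EXITFLAG in the link information of the first TCP connection and tries to exit on its own; if the exit fails, it goes to sleep.

Step 44: After starting, the master thread periodically polls the link information table to determine whether EXITFLAG is set in the link information of any TCP connection. Because EXITFLAG has been set in the link information of the first TCP connection, the master thread detects it when it polls that connection. Using the thread identifier (ID) numbers saved in the link information of the first TCP connection, the master thread kills the communication child threads of that connection. After the thread resources (such as the stack) have been reclaimed, it updates the link information table by clearing the link information of the first TCP connection. Because one of the communication child threads of the first TCP connection may already have exited successfully on its own, the master thread only needs to kill the remaining child threads of that connection. The master thread then continues to poll the next TCP connection and performs the same check.

In this embodiment, after detecting that the first TCP connection is abnormal, the master thread does not directly kill the communication child threads of that connection. It first closes the first TCP connection and waits for the communication child threads to exit on their own, and kills a communication child thread only after that thread fails to exit. The reason is that when the first TCP connection becomes abnormal, its receiving/sending child threads may still be receiving or sending data. Killing them directly could make that reception or transmission fail, with unpredictable effects on the communication program, so this embodiment lets the child threads exit on their own.

In summary, by setting the exit flag and following the specific procedure, the embodiments of the present invention effectively prevent threads in the communication program from hanging and avoid long communication interruptions between the SCPs and the SMP.

The embodiments use the communication program between the SCPs and the SMP as a specific implementation, but the present invention is not limited to it. Those skilled in the art will understand that obvious modifications of form and detail can be made without departing from the spirit and scope of the invention. The embodiments described above are therefore illustrative rather than restrictive, and all changes and modifications made without departing from the spirit and scope of the invention fall within its scope of protection.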
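The second case (steps 42 to 44) can be simulated in the same spirit. Again this is only a sketch: the 20-second timeout is the example value from the description, while the dictionary layout, the `closed` field that stands in for the closed socket, and every function name are assumptions made for illustration.

```python
HEARTBEAT_TIMEOUT_S = 20.0  # the description cites 20 seconds as an example


def new_link(sock_fd, recv_tid, send_tid, last_heartbeat):
    """Build one hypothetical row of the link information table."""
    return {
        "sock_fd": sock_fd,
        "thread_ids": {"recv": recv_tid, "send": send_tid},
        "exit_flag": {"recv": False, "send": False},
        "last_heartbeat": last_heartbeat,
        "closed": False,  # stands in for the state of the socket descriptor
    }


def master_check_heartbeats(table, now, log):
    """Step 42: close connections whose peer heartbeat is overdue."""
    for link in table.values():
        overdue = now - link["last_heartbeat"] > HEARTBEAT_TIMEOUT_S
        if overdue and not link["closed"]:
            link["closed"] = True
            log.append(("close", link["sock_fd"]))


def child_io(link, role, alive):
    """Step 43: I/O on a closed connection fails; the child flags and exits."""
    if not link["closed"]:
        return True
    link["exit_flag"][role] = True
    alive.discard(link["thread_ids"][role])  # voluntary exit
    return False


def master_poll_once(table, alive, log):
    """Step 44: kill the remaining threads of flagged links, clear the rows."""
    for key in list(table):
        link = table[key]
        if not any(link["exit_flag"].values()):
            continue
        for tid in link["thread_ids"].values():
            if tid in alive:
                alive.discard(tid)
                log.append(("kill", tid))
        del table[key]


alive = {101, 102, 201, 202}
table = {
    "scp-2": new_link(7, 101, 102, last_heartbeat=0.0),
    "smp": new_link(8, 201, 202, last_heartbeat=15.0),
}
log = []
master_check_heartbeats(table, now=25.0, log=log)  # scp-2 has been silent for 25 s
master_poll_once(table, alive, log)                # no flag yet, so nothing is killed
kills_before_flag = [entry for entry in log if entry[0] == "kill"]
child_io(table["scp-2"], "recv", alive)            # the receiving child hits the error
master_poll_once(table, alive, log)
print(log)
```

The first polling pass kills nothing, which mirrors the design choice explained above: the master thread closes the connection and then waits for a child thread to raise the flag before it reclaims whatever threads remain.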

Claims

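1. A method for preventing thread hanging in a multi-thread communication program, the communication program comprising: a master thread for initiating and/or processing link establishment requests; at least one pair of child threads, wherein each pair of child threads corresponds to one TCP connection and comprises a receiving child thread that receives messages on the TCP connection and a sending child thread that sends messages on it;
and a link information table for storing the link information of each TCP connection; characterized in that the method comprises: when a first child thread detects that a first TCP connection corresponding to the first child thread is abnormal, setting an exit flag in the link information of the first TCP connection and exiting the first child thread;
periodically polling, by the master thread, the link information table and, upon detecting the exit flag in the link information of the first TCP connection, closing the first TCP connection, killing all remaining child threads corresponding to the first TCP connection, and clearing the link information of the first TCP connection.
2. The method according to claim 1, characterized in that the first child thread further detects whether the first TCP connection is abnormal according to whether a send error or a receive error occurs on the first TCP connection.
3. The method according to claim 1, characterized in that
when the first child thread fails to exit, it further enters a dormant state and sets, in the link information of the first TCP connection, status information identifying that the first child thread is dormant;
after detecting the status information, the master thread kills the dormant first child thread.
4. The method according to claim 1, characterized in that the master thread further closes the first TCP connection by closing the socket descriptor of the first TCP connection.
5. The method according to claim 1, characterized in that the master thread further kills all remaining child threads corresponding to the first TCP connection according to the child thread ID numbers saved in the link information of the first TCP connection.
6. A method for preventing thread hanging in a multi-thread communication program, the communication program comprising: a master thread for initiating and/or processing link establishment requests; at least one pair of child threads, wherein each pair of child threads corresponds to one TCP connection and comprises a receiving child thread that receives messages on the TCP connection and a sending child thread that sends messages on it;
and a link information table for storing the link information of each TCP connection; characterized in that the method comprises:
when the master thread detects that a first TCP connection is abnormal, closing the first TCP connection; when a child thread corresponding to the first TCP connection detects that the first TCP connection is abnormal, setting an exit flag in the link information of the first TCP connection and exiting that child thread; periodically polling, by the master thread, the link information table and, upon detecting the exit flag in the link information of the first TCP connection, killing all remaining child threads corresponding to the first TCP connection and clearing the link information of the first TCP connection.
7. The method according to claim 6, characterized in that the child thread corresponding to the first TCP connection further detects whether the first TCP connection is abnormal according to whether a send error or a receive error occurs on the first TCP connection.
8. The method according to claim 6, characterized in that
when the child thread corresponding to the first TCP connection fails to exit, it further enters a dormant state and sets, in the link information of the first TCP connection, status information identifying that the child thread is dormant;
after detecting the status information, the master thread kills the dormant child thread.
9. The method according to claim 6, characterized in that the master thread further closes the first TCP connection by closing the socket descriptor of the first TCP connection.
10. The method according to claim 6, characterized in that the master thread further kills all remaining child threads corresponding to the first TCP connection according to the child thread ID numbers saved in the link information of the first TCP connection.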
PCT/CN2009/073424 2008-09-01 2009-08-21 Method for preventing thread hanging in a multi-thread communication program WO2010022635A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP09809220.8A EP2323344A4 (en) 2008-09-01 2009-08-21 Method for preventing thread hanging in a multi-thread communication program
BRPI0917076A BRPI0917076A2 (pt) 2008-09-01 2009-08-21 Method for preventing task termination in a multitask communication program

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN200810119263.9 2008-09-01
CN2008101192639A CN101355577B (zh) 2008-09-01 2008-09-01 Method for preventing thread hanging in a multi-thread communication program

Publications (1)

Publication Number Publication Date
WO2010022635A1 (zh) 2010-03-04

Family

ID=40308165

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2009/073424 WO2010022635A1 (zh) 2008-09-01 2009-08-21 Method for preventing thread hanging in a multi-thread communication program

Country Status (4)

Country Link
EP (1) EP2323344A4 (zh)
CN (1) CN101355577B (zh)
BR (1) BRPI0917076A2 (zh)
WO (1) WO2010022635A1 (zh)

Also Published As

Publication number Publication date
EP2323344A4 (en) 2018-01-10
BRPI0917076A2 (pt) 2016-07-05
CN101355577B (zh) 2011-04-20
CN101355577A (zh) 2009-01-28
EP2323344A1 (en) 2011-05-18

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09809220

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2009809220

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE

REG Reference to national code

Ref country code: BR

Ref legal event code: B01E

Ref document number: PI0917076

Country of ref document: BR

Free format text: Regularization is requested of the abstract page filed in petition 015110000426, as it does not present a title.

ENP Entry into the national phase

Ref document number: PI0917076

Country of ref document: BR

Kind code of ref document: A2

Effective date: 20110228