CN113672398A - Memory optimization method and device of full-flow backtracking analysis system


Info

Publication number: CN113672398A (application CN202111242869.3A; granted publication CN113672398B)
Authority: CN (China)
Prior art keywords: queue, message storage unit, thread, message
Legal status: Granted; Active
Other languages: Chinese (zh)
Inventor: 曲武
Current Assignee: Jinjing Yunhua Shenyang Technology Co ltd; Beijing Jinjingyunhua Technology Co ltd
Application filed by Jinjing Yunhua Shenyang Technology Co ltd and Beijing Jinjingyunhua Technology Co ltd
Priority to CN202111242869.3A

Classifications

    • G06F9/5016: Allocation of resources to service a request, the resource being the memory
    • G06F9/505: Allocation of resources to service a request, considering the load of the machine
    • G06F9/5061: Partitioning or combining of resources
    • H04L49/9068: Intermediate storage in different physical parts of a node or terminal, in the network interface card
    • G06F2209/5011: Indexing scheme relating to resource allocation: pool
    • G06F2209/5018: Indexing scheme relating to resource allocation: thread allocation
    • G06F2209/5022: Indexing scheme relating to resource allocation: workload threshold
    • G06F2209/508: Indexing scheme relating to resource allocation: monitor


Abstract

An embodiment of the invention provides a memory optimization method and device for a full-traffic backtracking analysis system. In response to a system start instruction, a memory management thread applies for message storage units for each CPU, and the units are initialized and distributed into memory pools. When a capture thread receives a message and needs to apply for a message storage unit, it applies to its own memory pool for the address of a unit according to the source of the message. When a capture thread needs to release a message storage unit, the unit is returned to the capture thread's memory pool if it belongs there; otherwise, once the number of units awaiting release reaches a number threshold, they are sent to the memory management thread, which releases them into the memory pool corresponding to a high-load CPU. In this way, cached messages are used efficiently while high robustness and high concurrency are ensured.

Description

Memory optimization method and device of full-flow backtracking analysis system
Technical Field
The present invention relates generally to the field of network monitoring, and more particularly to a memory optimization method and device for a full-traffic backtracking analysis system.
Background
IT technology changes rapidly and the pace of technical innovation keeps accelerating; message traffic on a network device interface now reaches 10 Gbps, 40 Gbps, or even more. A modern full-traffic backtracking analysis system must not only capture messages but also store them, and must additionally perform traffic analysis and threat detection. This poses enormous challenges to traditional transmission and security designs, and the performance of a full-traffic backtracking analysis system has become a key concern for every vendor.
A full-traffic backtracking analysis system captures, analyzes and stores messages. The most basic element in the whole system is the message storage unit mbuf (memory buffer), and because analysis and storage are involved, message resources that are not released promptly easily exhaust resources. If messages are allocated dynamically on demand and released after use, the cost of per-message allocation/release makes overall processing very slow. If a memory pool mechanism is used instead, a fixed number of units must be applied for in advance. Since every full-traffic storage backtracking system also provides basic traffic analysis and threat detection, message analysis involves a flow-caching stage; especially under attacks based mainly on resource consumption, cached messages that are not released in time easily exhaust the memory pool, while requesting too many units up front wastes memory. The quality of this part of the design directly determines the robustness and high performance of the whole system.
A message capture and storage system collects messages by port mirroring or optical splitting. Because a network card does not necessarily deliver the messages of one session to the same thread, the common industry design on a multi-core system dedicates N threads to capture, M threads to analysis, and finally X threads to message storage; this design incurs heavy cache misses during message processing and hurts performance. The best-performing arrangement for a full-traffic backtracking analysis system is the worker mode, in which capture and analysis run in the same thread, reducing the performance overhead of cache misses and thread switching. Since an analysis thread usually works per flow, the best scenario in a multi-core system further requires that both directions of the same flow be received by the same capture thread; multi-core processing can then run with high concurrency and without contention for flow state, improving performance. In such a design the hardware is usually configured so that the two directions of a flow arrive at the same thread; the disadvantage is that load imbalance frequently occurs, i.e. different capture threads handle unequal message traffic.
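As an illustration of this constraint (not taken from the patent), a symmetric hash over the flow 5-tuple is one common way to make both directions of a flow land on the same capture thread: sorting the endpoint pair before hashing guarantees hash(a->b) == hash(b->a). A minimal sketch in C, with illustrative names and multiplier constants:

#include <stdint.h>

/* Minimal sketch, assuming IPv4: a symmetric 5-tuple hash so that both
 * directions of one flow map to the same capture thread. */
static uint32_t flow_to_thread(uint32_t sip, uint32_t dip,
                               uint16_t sport, uint16_t dport,
                               uint8_t proto, uint32_t nthreads)
{
    /* order the endpoints so (sip,sport)<->(dip,dport) hash identically */
    uint32_t ip_lo = sip < dip ? sip : dip;
    uint32_t ip_hi = sip < dip ? dip : sip;
    uint16_t p_lo  = sport < dport ? sport : dport;
    uint16_t p_hi  = sport < dport ? dport : sport;

    uint64_t h = (uint64_t)ip_lo * 2654435761u
               ^ (uint64_t)ip_hi * 2246822519u
               ^ ((uint64_t)p_lo << 16) ^ p_hi ^ proto;
    return (uint32_t)(h % nthreads);
}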
In the worker mode, the message memory pool can be made a per-core resource. The current common design gives each core a memory pool of the same size, but when traffic load becomes imbalanced, this design easily wastes message storage units.
Disclosure of Invention
According to embodiments of the invention, a memory optimization scheme for a full-traffic backtracking analysis system is provided. The scheme makes reasonable use of cached messages while ensuring high robustness and high concurrency.
In a first aspect of the present invention, a memory optimization method for a full-traffic backtracking analysis system is provided. The method comprises the following steps:
in response to a system start instruction, a memory management thread applies for message storage units for each CPU, and the units are initialized and distributed into the corresponding memory pool by the capture thread corresponding to that CPU;
when a capture thread receives a message and needs to apply for a message storage unit, it marks the affiliation of the unit and, according to the source of the message, applies to its corresponding memory pool for the address of a message storage unit to use;
when a capture thread needs to release a message storage unit: if the unit belongs to the capture thread's own memory pool, it is returned to that pool; otherwise, when the number of units awaiting release reaches a preset first number threshold, the units are sent to the memory management thread, which releases them into the memory pool corresponding to a high-load CPU.
Furthermore, each CPU corresponds to one capture thread and is provided with one memory pool; each memory pool consists of a first queue, a second queue and a third queue, each with a set maximum length.
Further, the memory management thread applying for message storage units for each CPU includes:
applying, for the first queue in each CPU's memory pool, for a number of units equal to the first queue's maximum length; and
applying, for the second queue in each CPU's memory pool, for a number of units equal to half the second queue's maximum length.
Further, applying to the memory pool corresponding to the capture thread for message storage unit addresses according to the source of the message includes:
when the source of the message is packet reception from the network card driver, judging whether the capture thread performs load balancing;
if the capture thread performs load balancing, judging whether the load of its corresponding CPU reaches a preset load threshold; if so, sending the message to a capture thread whose load has not reached the threshold; if not, requesting the corresponding number of message storage units from the capture thread's memory pool in the order of the first, second and third queues;
if the capture thread does not perform load balancing, requesting the corresponding number of message storage units from the capture thread's memory pool in the order of the first, second and third queues, and marking the units with the capture thread's cache mark;
when the source of the message is not packet reception from the network card driver, requesting the corresponding number of message storage units from the capture thread's memory pool in the order of the second and third queues.
Further, requesting the corresponding number of message storage units from the capture thread's memory pool in the order of the first, second and third queues includes:
applying for the corresponding number of message storage unit addresses from the first queue of the pool;
if no unit can be obtained from the first queue, applying for the remaining units from the second queue of the pool; and
if no unit can be obtained from the second queue, applying for the remaining units from the third queue of the pool.
Further, requesting the corresponding number of message storage units from the capture thread's memory pool in the order of the second and third queues includes:
applying for the corresponding number of message storage unit addresses from the second queue of the pool; and
if no unit can be obtained from the second queue, applying for the remaining units from the third queue of the pool.
Further, returning the message storage unit to the memory pool when it belongs to the capture thread's own pool includes:
when the unit's type is network-card packet reception: if the processing duration does not exceed a preset time threshold, placing the unit's address into the first queue of the capture thread's memory pool, and, if the first queue is full and unplaced units remain, placing their addresses into the third queue; if the processing duration exceeds the time threshold, placing the unit's address into the second queue, and, if the second queue is full and unplaced units remain, placing their addresses into the third queue;
when the unit's type is self-use, placing the unit's address into the second queue of the capture thread's memory pool, and, if the second queue is full and unplaced units remain, placing their addresses into the third queue.
Further, release by the memory management thread into the memory pool corresponding to the high-load CPU includes:
sorting the CPUs by load from high to low; and
the memory management thread releasing the message storage units awaiting release into the memory pool of the CPU with the highest load.
Further, the method further comprises:
the memory management thread periodically traversing the first queue in each capture thread's memory pool; if the number of message storage units in a first queue is smaller than a preset second number threshold, applying for a number of units equal to the third queue's maximum length and sending half of their addresses to the capture thread, which places the received addresses into the third queue of its memory pool.
In a second aspect of the invention, an electronic device is provided. The electronic device comprises at least one processor and a memory communicatively coupled to the at least one processor; the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of the first aspect of the invention.
It should be understood that this summary is not intended to identify key or essential features of the embodiments of the invention, nor to limit its scope. Other features of the present invention will become apparent from the following description.
Drawings
The above and other features, advantages and aspects of various embodiments of the present invention will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. In the drawings, like or similar reference characters designate like or similar elements, and wherein:
Fig. 1 shows a flowchart of a memory optimization method of a full-traffic backtracking analysis system according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the message storage unit address application flow according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of message storage unit allocation when the unit's type is network-card packet reception, according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of message storage unit allocation when the unit's type is self-use, according to an embodiment of the present invention;
Fig. 5 shows a block diagram of an exemplary electronic device capable of implementing embodiments of the present invention;
here, 500 is an electronic device, 501 is a computing unit, 502 is a ROM, 503 is a RAM, 504 is a bus, 505 is an I/O interface, 506 is an input unit, 507 is an output unit, 508 is a storage unit, and 509 is a communication unit.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
In addition, the term "and/or" herein merely describes an association between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A alone, both A and B, or B alone. The character "/" herein generally indicates an "or" relationship between the objects before and after it.
The invention improves the memory management design of the message memory pool in a full-traffic storage backtracking system: on the premise that cached messages can be used reasonably, it fully considers cache-friendly design for the basic message unit, ensuring high robustness and high concurrency.
Fig. 1 shows a flowchart of a memory optimization method of a full-traffic backtracking analysis system according to an embodiment of the present invention.
The method comprises the following steps:
s101, responding to a system starting instruction, applying for a message storage unit for each CPU by a memory management thread, and initializing and distributing the message storage unit to a corresponding memory pool through a capturing thread corresponding to the CPU.
The memory management thread is a thread separate from the decode, worker and storage threads, and is dedicated to memory allocation.
As an embodiment of the present invention, one capture thread is arranged for each CPU; with N CPUs, N capture threads perform capture concurrently. Since each CPU is provided with one memory pool, N memory pools are initialized: mempool1, mempool2, ..., mempoolN, where N is the number of processing cores and likewise the number of memory pools.
As an embodiment of the present invention, each memory pool consists of three queues: a first queue, a second queue and a third queue, each with a set maximum length. For example, the maximum length of the first queue is X, that of the second queue is 2Y, and that of the third queue is Z.
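A minimal sketch of this layout in C (the type and field names are assumptions for illustration, not from the patent; the capacities correspond to the X, 2Y and Z above):

#include <stdint.h>

/* Fixed-capacity ring of message storage unit addresses; a stand-in for
 * whatever lock-free queue a real implementation would use. */
struct pkt_ring {
    void   **slots;     /* array of mbuf addresses, length cap */
    uint32_t head, tail;
    uint32_t cap;       /* maximum length, set at initialization */
};

/* One message storage unit; the owner mark records its affiliation. */
struct mbuf {
    uint32_t owner_cpu; /* the CPU whose pool this unit came from */
    /* ... packet data, timestamps, etc. ... */
};

/* One pool per CPU/capture thread: q1 and q2 raise the cache hit rate,
 * q3 is the backstop against memory shortage (caps X, 2Y, Z). */
struct mbuf_pool {
    struct pkt_ring q1, q2, q3;
    uint32_t        cpu_id;
};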
In response to the system start instruction, the memory management thread applies for message storage units for each CPU as follows:
for the first queue in each CPU's memory pool, it applies for a number of units equal to the first queue's maximum length, namely X units; and
for the second queue in each CPU's memory pool, it applies for a number of units equal to half the second queue's maximum length, namely Y units.
It can be seen that during initialization the memory management thread applies for the sum of the first queue's maximum length and half the second queue's maximum length, i.e. (X + Y) in the above example, and applies for no units for the third queue.
As an embodiment of the present invention, each capture thread is responsible for the initial enqueueing of the first queue in its own memory pool, i.e. filling all the first queue's address slots with the X units applied for.
As an embodiment of the present invention, each capture thread is responsible for the initial enqueueing of the second queue in its own memory pool, i.e. filling half of the second queue's address slots with the Y units applied for; since the second queue has 2Y slots, initialization amounts to half-filling it.
As an embodiment of the present invention, each capture thread does not touch the third queue at initialization, i.e. the third queue is left unfilled; it receives message storage units from the memory management thread only during load balancing.
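Continuing the sketch above, the start-up sequence might look as follows; alloc_mbuf() and ring_push() are assumed helpers standing in for the real allocator and lock-free enqueue:

#include <stdbool.h>

struct mbuf *alloc_mbuf(void);                       /* assumed allocator */
bool ring_push(struct pkt_ring *r, struct mbuf *m);  /* assumed enqueue */

/* Initialization: q1 filled to its maximum length X, q2 filled to half of
 * its maximum length 2Y (i.e. Y units), q3 deliberately left empty. */
static void pool_init(struct mbuf_pool *p, uint32_t cpu)
{
    p->cpu_id = cpu;
    for (uint32_t i = 0; i < p->q1.cap; i++) {        /* X units */
        struct mbuf *m = alloc_mbuf();
        m->owner_cpu = cpu;                           /* mark affiliation */
        ring_push(&p->q1, m);
    }
    for (uint32_t i = 0; i < p->q2.cap / 2; i++) {    /* Y units */
        struct mbuf *m = alloc_mbuf();
        m->owner_cpu = cpu;
        ring_push(&p->q2, m);
    }
    /* q3 receives units from the memory management thread only under
     * load balancing, so nothing is enqueued here at start-up. */
}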
The queues of each memory pool are thus divided into three: the first and second queues serve to raise the cache hit rate, while the third guards against memory shortage. Each queue's maximum length is fixed at initialization, and enqueue and dequeue operations respect it, which guarantees robustness and high performance.
S102, when a capture thread receives a message and needs to apply for a message storage unit, it marks the affiliation of the unit and applies to its corresponding memory pool for the address of a message storage unit according to the source of the message.
As an embodiment of the present invention, the affiliation of the message storage unit is marked as follows: the unit is tagged with its original CPU, for example CPU A, establishing the relationship that the current message storage unit belongs to CPU A.
Next, applying to the memory pool corresponding to the capture thread for message storage unit addresses according to the source of the message specifically includes the following.
In this embodiment, as shown in Fig. 2, when the source of the message is packet reception from the network card driver, it is judged whether capture thread A performs load balancing.
Specifically, if capture thread A performs load balancing, it is judged whether the load of capture thread A reaches a preset load threshold; if so, the message is sent to a capture thread whose load has not reached the threshold, for example capture thread B; if not, the corresponding number of message storage units is requested from capture thread A's memory pool in the order of the first, second and third queues.
Obtaining the units in the order of the first, second and third queues means: apply for the unit addresses from the first queue of the pool; if no unit can be obtained from the first queue, apply for the remaining units from the second queue; and if no unit can be obtained from the second queue, apply for the remainder from the third queue.
Further, if the number of units in the first queue is smaller than the number required, the shortfall (the first difference) is requested from the second queue; and if the second queue in turn holds fewer units than the first difference, the remaining shortfall (the second difference) is requested from the third queue.
If the capture thread does not perform load balancing, the corresponding number of message storage units is requested from capture thread A's memory pool in the same first, second and third queue order described above, and the units are marked with the capture thread's cache mark.
In this embodiment, as shown in Fig. 2, when the source of the message is not packet reception from the network card driver, the corresponding number of message storage units is requested from the capture thread's memory pool in the order of the second and third queues: apply for the unit addresses from the second queue of the pool; if the second queue holds fewer units than required, the shortfall (the third difference) is requested from the third queue.
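Both application paths reduce to one cascade that differs only in whether the first queue participates. A sketch under the same assumed helpers as above, where ring_pop_bulk() dequeues up to n addresses and returns how many it actually obtained:

#include <stddef.h>

size_t ring_pop_bulk(struct pkt_ring *r, struct mbuf **out, size_t n); /* assumed */

/* Request `need` message storage units. skip_q1 = false gives the
 * NIC-driver path (first, second, third queue); skip_q1 = true gives the
 * non-driver path (second then third queue only). */
static size_t pool_alloc_burst(struct mbuf_pool *p, struct mbuf **out,
                               size_t need, bool skip_q1)
{
    size_t got = 0;
    if (!skip_q1)
        got += ring_pop_bulk(&p->q1, out + got, need - got);
    if (got < need)   /* first difference: shortfall after q1 */
        got += ring_pop_bulk(&p->q2, out + got, need - got);
    if (got < need)   /* second (or third) difference: shortfall after q2 */
        got += ring_pop_bulk(&p->q3, out + got, need - got);
    return got;       /* caller treats got < need as pool pressure */
}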
It can be seen that, for load-balanced messages, after entering load-balancing mode the messages are distributed to CPUs based on the CPU cache affiliation of their message storage units, ensuring that the cached state of a message is still handed to the designated CPU, so that even a load-balanced CPU processes the message from within its cache. For example, a message may be sent from CPU A to CPU B for processing while its cache lines still reside on CPU A; if the message later returns to CPU A, it need not be loaded from memory into the cache again, which improves performance. Under load imbalance, the memory management thread likewise distributes messages to the appropriate low-load CPU based on CPU cache affiliation; the cache is used reasonably, so the thread receiving the message enjoys a higher cache hit rate when processing it, improving performance.
Under load imbalance, the memory management thread enqueues message storage unit addresses into the queue of the heavily loaded CPU according to the load situation, which spares the forwarding core from having to identify a low-load CPU during packet reception, reducing performance overhead and improving performance.
S103, when a capture thread needs to release a message storage unit: if the unit belongs to the capture thread's own memory pool, it is returned to that pool; otherwise, when the number of units awaiting release reaches a preset first number threshold, the units are sent to the memory management thread, which releases them into the memory pool corresponding to a high-load CPU.
As an embodiment of the present invention, returning the message storage unit to the memory pool when it belongs to the capture thread's own pool includes:
As shown in Fig. 3, when the unit's type is network-card packet reception: if the processing duration does not exceed the preset time threshold T, the unit's address is placed into the first queue of the capture thread's memory pool; if the first queue is full and unplaced units remain, their addresses are placed into the third queue. If the processing duration exceeds the time threshold, as with an IPv4 fragment attack message, the unit's address is placed into the second queue; if the second queue is full and unplaced units remain, their addresses are placed into the third queue.
As shown in Fig. 4, when the unit's type is self-use, its address is placed into the second queue of the capture thread's memory pool; if the second queue is full and unplaced units remain, their addresses are placed into the third queue.
In this embodiment, when the number of message storage units awaiting release reaches the preset first number threshold MAX_BURST, the accumulated MAX_BURST units are sent to the memory management thread, which releases them into the memory pool corresponding to a high-load CPU. Self-use units are not received from the network card: they are typically applied for when the system itself needs to send or store a packet, and such messages are comparatively few. Units for messages received by the network card are typically applied for by the network card driver, in very large batches.
In this embodiment, units are handed over only when their count reaches the first number threshold MAX_BURST; no periodic flush is performed. For each capture thread, foreign-core units numbering fewer than MAX_BURST are simply held rather than returned to their original pool immediately, and this causes no memory leak. Avoiding timed operations reduces overhead and improves overall processing performance, while the bound on accumulation prevents units from being withheld for too long, improving the robustness of the storage system.
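A sketch of this release path, continuing the assumed types above (MAX_BURST and the time threshold are the patent's preset thresholds; the concrete values here are placeholders, and __thread is a GCC/Clang thread-local extension):

#define MAX_BURST 64            /* preset first number threshold (placeholder) */
#define T_THRESHOLD_NS 1000000  /* preset time threshold T (placeholder) */

void send_to_manager(struct mbuf **batch, size_t n);  /* assumed hand-off */

/* Per-capture-thread release. from_nic distinguishes NIC-received units
 * from self-use units; held_ns is how long the unit was being processed. */
static void pool_free(struct mbuf_pool *mine, struct mbuf *m,
                      bool from_nic, uint64_t held_ns)
{
    static __thread struct mbuf *pending[MAX_BURST];
    static __thread size_t npending;

    if (m->owner_cpu == mine->cpu_id) {
        /* own unit: short-lived NIC packets refill q1, everything else q2,
         * with q3 as the overflow when the chosen queue is full */
        struct pkt_ring *q =
            (from_nic && held_ns <= T_THRESHOLD_NS) ? &mine->q1 : &mine->q2;
        if (!ring_push(q, m))
            ring_push(&mine->q3, m);
    } else {
        /* foreign unit: accumulate, hand off only at MAX_BURST; no timer */
        pending[npending++] = m;
        if (npending == MAX_BURST) {
            send_to_manager(pending, npending);
            npending = 0;
        }
    }
}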
In this embodiment, release by the memory management thread into the memory pool corresponding to the high-load CPU includes: sorting the CPUs by load from high to low, and releasing the units awaiting release into the memory pool of the CPU with the highest load.
In non-load-balancing mode, each message storage unit is applied for and released by the same capture thread, so the problem of cache misses does not arise.
In load-balancing mode, the memory management thread receives the units to be released from each capture thread. Based on each CPU's load and the preset first number threshold MAX_BURST, the units cached for the corresponding CPU are enqueued into a highly loaded CPU's pool. For example, if the capture thread of the high-load CPU is thread E, the units are enqueued into thread E's first queue, falling back to the third queue if that enqueue fails. Under load imbalance, the memory management thread enqueues the unit addresses into the first queue of the high-load CPU based on the load situation, which spares the forwarding core from identifying a low-load CPU during packet reception, reducing overhead and improving performance.
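The manager-side counterpart might look as follows; cpu_load() and the pools array are assumptions standing in for however load is measured:

uint32_t cpu_load(uint32_t cpu);  /* assumed load metric, e.g. queue depth */

/* Release a received batch into the pool of the most loaded CPU: its first
 * queue when possible, its third queue when the first is full. */
static void manager_release(struct mbuf_pool *pools, uint32_t ncpu,
                            struct mbuf **batch, size_t n)
{
    uint32_t busiest = 0;
    for (uint32_t c = 1; c < ncpu; c++)
        if (cpu_load(c) > cpu_load(busiest))
            busiest = c;
    for (size_t i = 0; i < n; i++)
        if (!ring_push(&pools[busiest].q1, batch[i]))
            ring_push(&pools[busiest].q3, batch[i]);
}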
In some embodiments, the memory management thread also handles the possibility that any one CPU's memory pool becomes exhausted.
The memory management thread periodically traverses the first queue in each capture thread's memory pool. If the number of units in a first queue falls below a preset second number threshold, the thread applies for a number of units equal to the third queue's maximum length and sends half of their addresses to that capture thread; the capture thread places the received addresses into the third queue of its pool, filling half of the third queue's space.
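A sketch of this exhaustion guard under the same assumptions (ring_count(), the bulk allocator and the cross-thread hand-off are assumed helpers; LOW_WATER stands in for the second number threshold):

#define LOW_WATER 128   /* preset second number threshold (placeholder) */

uint32_t ring_count(const struct pkt_ring *r);                       /* assumed */
struct mbuf **alloc_mbuf_bulk(size_t n);                             /* assumed */
void send_addrs_to_thread(uint32_t cpu, struct mbuf **m, size_t n);  /* assumed */

/* Periodic pass by the memory management thread: if a first queue runs low,
 * apply for q3.cap fresh units and send half of their addresses to that
 * capture thread, which enqueues them into its third queue. (The patent
 * sends half of the applied addresses; handling of the remainder is not
 * specified in this sketch.) */
static void manager_poll(struct mbuf_pool *pools, uint32_t ncpu)
{
    for (uint32_t c = 0; c < ncpu; c++) {
        if (ring_count(&pools[c].q1) >= LOW_WATER)
            continue;
        size_t n = pools[c].q3.cap;
        struct mbuf **fresh = alloc_mbuf_bulk(n);
        for (size_t i = 0; i < n; i++)
            fresh[i]->owner_cpu = c;        /* mark affiliation */
        send_addrs_to_thread(c, fresh, n / 2);
    }
}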
Allocating the pool's message storage units in batches and decoupling allocation from address enqueueing leaves the processing cores responsible only for enqueue and dequeue, greatly reducing their overhead and improving performance. At the same time, the unit application and release mechanism keeps every core fully lock-free; this contention-free design lets multi-core performance scale linearly, greatly improving overall performance.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are exemplary embodiments and that the acts and modules illustrated are not necessarily required to practice the invention.
The above is a description of method embodiments, and the embodiments of the present invention are further described below by way of apparatus embodiments.
In the technical solution of the invention, any acquisition, storage or use of users' personal information complies with the relevant laws and regulations and does not violate public order and good morals.
Embodiments of the invention also provide an electronic device and a readable storage medium.
Fig. 5 shows a schematic block diagram of an electronic device 500 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only and are not meant to limit implementations of the invention described and/or claimed herein.
The device 500 comprises a computing unit 501, which may perform various suitable actions and processes in accordance with a computer program stored in a read-only memory (ROM) 502 or a computer program loaded from a storage unit 508 into a random access memory (RAM) 503. The RAM 503 may also store the various programs and data required for the operation of the device 500. The computing unit 501, the ROM 502 and the RAM 503 are connected to one another by a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
A number of components in the device 500 are connected to the I/O interface 505, including: an input unit 506 such as a keyboard, a mouse, or the like; an output unit 507 such as various types of displays, speakers, and the like; a storage unit 508, such as a magnetic disk, optical disk, or the like; and a communication unit 509 such as a network card, modem, wireless communication transceiver, etc. The communication unit 509 allows the device 500 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.
The computing unit 501 may be any of various general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 501 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, or microcontroller. The computing unit 501 performs the various methods and processes described above, such as steps S101-S103. For example, in some embodiments, steps S101-S103 may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 508. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 500 via the ROM 502 and/or the communication unit 509. When the computer program is loaded into the RAM 503 and executed by the computing unit 501, one or more steps of S101-S103 described above may be performed. Alternatively, in other embodiments, the computing unit 501 may be configured to perform steps S101-S103 by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present invention may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server combined with a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present invention may be executed in parallel, sequentially, or in different orders, and are not limited herein as long as the desired results of the technical solution of the present invention can be achieved.
The above-described embodiments should not be construed as limiting the scope of the invention. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A memory optimization method for a full-traffic backtracking analysis system, characterized by comprising:
in response to a system start instruction, a memory management thread applying for message storage units for each CPU, the units being initialized and distributed into the corresponding memory pool by the capture thread corresponding to that CPU;
when a capture thread receives a message and needs to apply for a message storage unit, marking the affiliation of the unit, and applying to the memory pool corresponding to the capture thread for the address of a message storage unit according to the source of the message;
when a capture thread needs to release a message storage unit, returning the unit to the capture thread's own memory pool if it belongs there; otherwise, when the number of units awaiting release reaches a preset first number threshold, sending them to the memory management thread, which releases them into the memory pool corresponding to a high-load CPU.
2. The method of claim 1, characterized in that each CPU corresponds to one capture thread and is provided with one memory pool; each memory pool consists of a first queue, a second queue and a third queue, each with a set maximum length.
3. The method of claim 2, characterized in that the memory management thread applying for message storage units for each CPU comprises:
applying, for the first queue in each CPU's memory pool, for a number of message storage units equal to the first queue's maximum length; and
applying, for the second queue in each CPU's memory pool, for a number of message storage units equal to half the second queue's maximum length.
4. The method according to claim 2, characterized in that applying to the memory pool corresponding to the capture thread for message storage unit addresses according to the source of the message comprises:
when the source of the message is packet reception from the network card driver, judging whether the capture thread performs load balancing;
if the capture thread performs load balancing, judging whether the load of its corresponding CPU reaches a preset load threshold; if so, sending the message to a capture thread whose load has not reached the threshold; if not, requesting the corresponding number of message storage units from the capture thread's memory pool in the order of the first, second and third queues;
if the capture thread does not perform load balancing, requesting the corresponding number of message storage units from the capture thread's memory pool in the order of the first, second and third queues, and marking the units with the capture thread's cache mark;
when the source of the message is not packet reception from the network card driver, requesting the corresponding number of message storage units from the capture thread's memory pool in the order of the second and third queues.
5. The method according to claim 4, characterized in that requesting the corresponding number of message storage units from the capture thread's memory pool in the order of the first, second and third queues comprises:
applying for the corresponding number of message storage unit addresses from the first queue of the pool;
if no unit can be obtained from the first queue, applying for the remaining units from the second queue of the pool; and
if no unit can be obtained from the second queue, applying for the remaining units from the third queue of the pool.
6. The method according to claim 4, characterized in that requesting the corresponding number of message storage units from the capture thread's memory pool in the order of the second and third queues comprises:
applying for the corresponding number of message storage unit addresses from the second queue of the pool; and
if no unit can be obtained from the second queue, applying for the remaining units from the third queue of the pool.
7. The method of claim 2, characterized in that returning the message storage unit to the memory pool when it belongs to the capture thread's own pool comprises:
when the unit's type is network-card packet reception: if the processing duration does not exceed a preset time threshold, placing the unit's address into the first queue of the capture thread's memory pool, and, if the first queue is full and unplaced units remain, placing their addresses into the third queue; if the processing duration exceeds the time threshold, placing the unit's address into the second queue, and, if the second queue is full and unplaced units remain, placing their addresses into the third queue;
when the unit's type is self-use, placing the unit's address into the second queue of the capture thread's memory pool, and, if the second queue is full and unplaced units remain, placing their addresses into the third queue.
8. The method of claim 2, characterized in that release by the memory management thread into the memory pool corresponding to the high-load CPU comprises:
sorting the CPUs by load from high to low; and
the memory management thread releasing the message storage units awaiting release into the memory pool of the CPU with the highest load.
9. The method of claim 2, characterized by further comprising:
the memory management thread periodically traversing the first queue in each capture thread's memory pool; if the number of message storage units in a first queue is smaller than a preset second number threshold, applying for a number of units equal to the third queue's maximum length and sending half of their addresses to the capture thread; and the capture thread placing the received addresses into the third queue of its memory pool.
10. An electronic device, comprising at least one processor; and
a memory communicatively coupled to the at least one processor; characterized in that
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-9.
CN202111242869.3A 2021-10-25 2021-10-25 Memory optimization method and device of full-flow backtracking analysis system Active CN113672398B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111242869.3A CN113672398B (en) 2021-10-25 2021-10-25 Memory optimization method and device of full-flow backtracking analysis system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111242869.3A CN113672398B (en) 2021-10-25 2021-10-25 Memory optimization method and device of full-flow backtracking analysis system

Publications (2)

Publication Number Publication Date
CN113672398A (en) 2021-11-19
CN113672398B (en) 2022-02-18

Family

ID=78551011

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111242869.3A Active CN113672398B (en) 2021-10-25 2021-10-25 Memory optimization method and device of full-flow backtracking analysis system

Country Status (1)

Country Link
CN (1) CN113672398B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105634958A (en) * 2015-12-24 2016-06-01 东软集团股份有限公司 Packet forwarding method and device based on multi-core system
CN106534345A (en) * 2016-12-07 2017-03-22 东软集团股份有限公司 Message forwarding method and device
CN108132889A (en) * 2017-12-20 2018-06-08 东软集团股份有限公司 EMS memory management process, device, computer readable storage medium and electronic equipment
CN109617832A (en) * 2019-01-31 2019-04-12 新华三技术有限公司合肥分公司 Message caching method and device
WO2019212182A1 (en) * 2018-05-04 2019-11-07 Samsung Electronics Co., Ltd. Apparatus and method for managing a shareable resource in a multi-core processor


Also Published As

Publication number Publication date
CN113672398B (en) 2022-02-18


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant