CN112714077A - Message duplicate removal method and device, convergence and distribution equipment and storage medium - Google Patents

Publication number
CN112714077A
Authority
CN
China
Prior art keywords
message
filtered
preset
value
byte
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110329957.0A
Other languages
Chinese (zh)
Other versions
CN112714077B (en)
Inventor
王佳
张卫国
姜代伟
王志鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Jinling Sci&tech Group Co ltd
Original Assignee
Jiangsu Jinling Sci&tech Group Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Jinling Sci&tech Group Co ltd filed Critical Jiangsu Jinling Sci&tech Group Co ltd
Priority to CN202110329957.0A priority Critical patent/CN112714077B/en
Publication of CN112714077A publication Critical patent/CN112714077A/en
Application granted granted Critical
Publication of CN112714077B publication Critical patent/CN112714077B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/32 Flow control; Congestion control by discarding or delaying data units, e.g. packets or frames
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L1/004 Arrangements for detecting or preventing errors in the information received by using forward error control
    • H04L1/0056 Systems characterized by the type of code used
    • H04L1/0061 Error detection codes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/04 Processing captured monitoring data, e.g. for logfile generation
    • H04L43/045 Processing captured monitoring data, e.g. for logfile generation for graphical visualisation of monitoring data

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Mining & Analysis (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to the field of network data processing, and in particular to a message duplicate removal method, a message duplicate removal device, a convergence and diversion device and a storage medium. The message duplicate removal method comprises: obtaining a message to be filtered and storing it in a cache; calculating a hash value of a plurality of leading bytes of the message to be filtered, and calculating a cyclic redundancy check value of the cache; acquiring the data corresponding to the byte values at preset positions of the message to be filtered; and comparing a check code to be identified, which is formed from the hash value, the cyclic redundancy check value and the data corresponding to the byte values at the preset positions, with the check codes in a preset deduplication table, wherein if the check code to be identified exists in the preset deduplication table, the message to be filtered is a duplicate message and is discarded. Because the check code to be identified combines several separately computed values, comparing it with the preset deduplication table improves the accuracy of duplicate detection and prevents a message that is not a duplicate from being treated as one.

Description

Message duplicate removal method and device, convergence and distribution equipment and storage medium
Technical Field
The invention relates to the field of network data transmission, in particular to a message duplicate removal method and device, a convergence and diversion device and a storage medium.
Background
Ever since networks came into being, there has been a need to monitor and maintain them. Visual analysis of network traffic is now the trend in network maintenance and assurance. From an application standpoint, related products fall roughly into two types: network performance analysis (NPM) and service performance analysis (APM). NPM performs statistical analysis and fault location on network performance indicators such as bandwidth, delay, jitter, packet loss, retransmission, congestion and network attacks, whereas APM mainly analyzes service quality according to the specific characteristics of the service carried by the data packets.
The bottom layer of network traffic visualization analysis collects and captures the data messages transmitted in real time on a network line, then performs statistical analysis by parsing the relevant information in the original data packets. It completes message identification, flow management, flow statistics, rule matching, packet sampling, marking and similar processing on the data packets, and supports orthogonal-connection, high-traffic switched output.
In the data transmission process, a large number of repeated messages exist, and the processing and transmission speeds of the network processor are seriously affected by the repeated messages, so that the processing performance of the convergence and shunt equipment is low. The existing message duplication elimination method has low identification accuracy rate on the duplicate messages, and a large number of normal messages can be treated as the duplicate messages.
Disclosure of Invention
Therefore, the technical problem to be solved by the present invention is to overcome the defect of low accuracy rate of identifying the repeated message in the message duplication removal method in the prior art, so as to provide a message duplication removal method, which comprises the following steps:
acquiring a message to be filtered, and storing the message to be filtered into a cache;
calculating hash values of a plurality of bytes in the message to be filtered, and calculating a cyclic redundancy check value of the cache;
acquiring data corresponding to byte values of preset positions of the messages to be filtered;
and comparing a check code to be identified, which is formed by the data corresponding to the hash value, the cyclic redundancy check value and the byte value at the preset position, with a check code in a preset deduplication table, wherein if the check code to be identified exists in the preset deduplication table, the message to be filtered is a repeated message, and the repeated message is discarded or forwarded.
Preferably, the method further comprises the following steps:
if the check code to be identified does not exist in the preset duplication elimination table, the message to be filtered is not a duplicate message;
and adding the check code to be identified to the preset duplication eliminating table.
Preferably, the calculating hash values of a plurality of preceding bytes in the message to be filtered includes:
acquiring a duplicate removal mode of the message to be filtered, wherein the duplicate removal mode is full packet duplicate removal or load duplicate removal;
if the deduplication mode is full packet deduplication, calculating the hash value of a plurality of leading bytes in the message to be filtered, starting from the local area network (MAC) address;
and if the deduplication mode is load deduplication, calculating the hash value of a plurality of bytes in the message to be filtered from the load.
Preferably, the calculating the cached cyclic redundancy check value includes:
judging the number of caches occupied by the message to be filtered;
if the number of occupied caches is one, calculating the cyclic redundancy check value of the caches;
and if the number of occupied caches is multiple, after the cyclic redundancy check value of the first cache is calculated, continuously and sequentially calculating the cyclic redundancy check values of the rest caches.
Preferably, the obtaining of the data corresponding to the byte values of the preset positions of the messages to be filtered includes:
judging the byte length of the last cache for storing the message to be filtered;
if the byte length of the last cache is one, taking the data corresponding to the byte value of the last cache and the data corresponding to the last byte value of the previous cache of the last cache;
and if the byte length of the last cache is not one, taking data corresponding to the last two byte values of the last cache.
Preferably, after the obtaining of the message to be filtered, the method further includes:
carrying out global configuration on the message to be filtered;
carrying out protocol analysis on the message to be filtered so as to identify the message to be filtered;
if the message to be filtered is not identified successfully, discarding the message to be filtered which is not identified successfully, or forwarding the message to be filtered which is not identified successfully to a preset sub-stream group.
Preferably, the method further comprises the following steps:
acquiring five-tuple information of an outer layer or an innermost layer according to the global configuration, wherein the five-tuple information comprises a source IP address, a destination IP address, a source port number, a destination port number and a protocol;
if the message to be filtered is successfully identified, searching, in sequence and using the five-tuple information, for the source IP address, the destination IP address, the five-tuple, source port number + protocol, destination port number + protocol, source IP address + destination port number, destination IP address + source port number, source IP address + protocol, and destination IP address + protocol; wherein the lookup includes an exact five-tuple lookup and a masked five-tuple lookup.
The invention also provides a message duplication remover, which comprises:
the first acquisition unit is used for acquiring a message to be filtered and storing the message to be filtered into a cache;
the calculation unit is used for calculating hash values of a plurality of bytes in the message to be filtered and calculating a cached cyclic redundancy check value;
the second obtaining unit is used for obtaining data corresponding to the byte value of the preset position of the message to be filtered;
and the judging unit is used for comparing a check code to be identified, which is formed by the data corresponding to the hash value, the cyclic redundancy check value and the byte value at the preset position, with a check code in a preset duplication elimination table, if the check code to be identified exists in the preset duplication elimination table, the message to be filtered is a repeated message, and the repeated message is discarded or forwarded.
The invention also provides a convergence and shunt device, comprising:
the network processor module is used for executing the message duplication eliminating method;
and the optical interface module is connected with the network processor module.
The present invention also provides a computer-readable storage medium storing computer instructions for causing a computer to execute the above-mentioned message deduplication method.
The technical scheme of the invention has the following advantages:
1. In the message deduplication method provided by the invention, the obtained message to be filtered is stored in a cache; the hash value of a plurality of leading bytes of the message to be filtered and the cyclic redundancy check value of the cache are calculated, and the data corresponding to the byte values at the preset positions are acquired; the check code to be identified, which is formed from the hash value, the cyclic redundancy check value and the data corresponding to the byte values at the preset positions, is then compared with the check codes in the preset deduplication table. If the check code to be identified exists in the preset deduplication table, the message to be filtered is a duplicate message, and the duplicate message can be discarded or forwarded. Because the check code to be identified combines the hash value, the cyclic redundancy check value and the data corresponding to the byte values at the preset positions, comparing it with the preset deduplication table checks several values at once, which improves the accuracy of duplicate detection and avoids treating a message that is not a duplicate as a duplicate.
2. In the message deduplication device provided by the invention, a first acquisition unit stores an acquired message to be filtered into a cache, a calculation unit calculates and obtains a hash value of a plurality of bytes in the message to be filtered and a cached cyclic redundancy check value, a second acquisition unit acquires data corresponding to a byte value at a preset position, a judgment unit compares a check code to be identified, which is composed of the hash value, the cyclic redundancy check value and the data corresponding to the byte value at the preset position, with a check code in a preset deduplication table, if the check code to be identified exists in the preset deduplication table, the message to be filtered is a duplicate message, and the duplicate message can be discarded or forwarded. The device adopts the check code to be identified which is composed of data corresponding to a hash value, a cyclic redundancy check value and a byte value at a preset position, and compares the check code to be identified with a preset duplication elimination table so as to detect whether the message to be filtered is a repeated message or not, and improves the accuracy of the detection of the repeated message through multiple comparison so as to avoid processing the message to be filtered which is not the repeated message as the repeated message.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
Fig. 1 is a flowchart of a message deduplication method in embodiment 1 of the present invention;
fig. 2 is an overall flowchart of a message deduplication method in embodiment 1 of the present invention;
fig. 3 is a block diagram of a message deduplication apparatus according to embodiment 2 of the present invention;
fig. 4 is a block diagram of a convergence and shunt device in embodiment 3 of the present invention;
fig. 5 is a block diagram of a network processor module according to embodiment 3 of the present invention;
fig. 6 is a block diagram of a switching module according to embodiment 3 of the present invention;
fig. 7 is a schematic block diagram of a signal relay module in embodiment 3 of the present invention;
fig. 8 is a block diagram of a signal conversion module of the optical interface module in embodiment 3 of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it should be noted that the terms "first", "second", and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. In addition, the technical features involved in the different embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
When the network processor processes the data message, if a large amount of repeated messages exist in the data message, the processing speed of the network processor is seriously affected, and the processing performance of the data processing equipment is greatly reduced.
However, current methods for detecting duplicate messages have low accuracy, so that when the network processor processes data messages, many normal messages are treated as duplicates. Improving the detection accuracy of duplicate messages is therefore very important for improving the processing performance of the network processor. The message deduplication method provided by the embodiment of the present invention is described below, taking the convergence and shunt device as an example.
Example 1
Fig. 1 is a flowchart illustrating discarding or forwarding a duplicate packet by calculating a hash value, a cyclic redundancy check value, and obtaining data corresponding to a byte value at a preset position according to some embodiments of the present invention. Although the processes described below include operations that occur in a particular order, it should be clearly understood that the processes may include more or fewer operations that are performed sequentially or in parallel (e.g., using parallel processors or a multi-threaded environment).
The embodiment provides a message deduplication method, which is used for deduplication of a data message in a network processor to improve the detection accuracy of duplicate messages, as shown in fig. 1, and includes the following steps:
S101, obtaining a message to be filtered, and storing the message to be filtered into a cache.
In the above implementation steps, the network processor obtains the messages to be filtered, the number of the obtained messages to be filtered is usually huge, and in order to reduce the processing pressure of the network processor and increase the processing speed of the network processor, repeated messages in the obtained messages to be filtered can be discarded, so that the number of the messages to be processed is reduced.
After the network processor acquires the message to be filtered, the acquired message to be filtered is stored in a cache so as to be convenient for processing the message to be filtered subsequently.
S102, calculating hash values of a plurality of bytes in the message to be filtered, and calculating a cyclic redundancy check value of the cache.
In the above implementation steps, the deduplication mode of the message to be filtered may be full packet deduplication or load deduplication. If full packet deduplication is selected, the hash value of a number of leading bytes of the message to be filtered is calculated starting from the local area network address (MAC); if load deduplication is selected, the hash value of a number of leading bytes is calculated starting from the load (payload).
In this embodiment, the hash value of the first sixty-four bytes in the message to be filtered may be calculated. In some embodiments, the hash value of the first thirty-two bytes or the first forty-eight bytes in the message to be filtered may also be calculated, and those skilled in the art may make reasonable selections according to the actual situation, which is not limited herein.
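The mode-dependent hash described above can be sketched as follows. This is only an illustration under stated assumptions: the patent does not name the hash function, so `zlib.adler32` is a stand-in, and the names `head_hash`, `payload_offset` and `HASH_LEN` are hypothetical. The payload offset is assumed to be known from protocol analysis.

```python
import zlib

HASH_LEN = 64  # first sixty-four bytes in this embodiment; 32 or 48 are also mentioned


def head_hash(frame: bytes, mode: str, payload_offset: int = 0,
              length: int = HASH_LEN) -> int:
    """Hash the leading bytes of a frame.

    mode == "full":    start at byte 0, i.e. at the MAC header.
    mode == "payload": start at payload_offset (assumed known from parsing).
    adler32 is a placeholder; the source does not specify the hash function.
    """
    if mode == "full":
        start = 0
    elif mode == "payload":
        start = payload_offset
    else:
        raise ValueError(f"unknown deduplication mode: {mode!r}")
    # Slicing clamps at the end of the frame, so a short frame is hashed
    # over only the bytes it actually contains.
    return zlib.adler32(frame[start:start + length])
```

Passing `length=32` or `length=48` would model the alternative lengths mentioned in the text.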
Depending on the size of the message data, a message to be filtered may occupy a different number of temporary buffers (buf) when it is stored in the cache.
When calculating the cyclic redundancy check (CRC32) value of the cache, the number of caches occupied by the message to be filtered is determined first. If the message occupies one cache, the cyclic redundancy check value of that cache, i.e. of the whole cache, is calculated; if it occupies multiple caches, the cyclic redundancy check value of the first cache is calculated, and then the cyclic redundancy check values of the remaining caches are calculated in sequence.
For example, if the number of occupied caches is three when the obtained message to be filtered is stored in the caches, after the cyclic redundancy check value of the first cache is calculated, the cyclic redundancy check value of the second cache is calculated, and finally the cyclic redundancy check value of the third cache is calculated.
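A minimal sketch of the multi-cache CRC calculation follows. It assumes that "continuously and sequentially" means a single running CRC-32 carried from one cache to the next; the text does not say whether the per-cache values are chained or kept separate. The function name `crc32_over_buffers` is hypothetical.

```python
import zlib
from typing import Sequence


def crc32_over_buffers(buffers: Sequence[bytes]) -> int:
    """Running CRC-32 over one or more buffers, in storage order.

    Assumption: the CRC of each later buffer continues from the value
    produced by the previous one, so the result equals the CRC-32 of
    the whole message.
    """
    if not buffers:
        raise ValueError("a message occupies at least one buffer")
    crc = zlib.crc32(buffers[0])        # first (possibly only) buffer
    for buf in buffers[1:]:             # remaining buffers, in order
        crc = zlib.crc32(buf, crc)      # continue from the previous value
    return crc
```

With chaining, the result does not depend on how the message happens to be split across caches, which is what a deduplication key needs.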
S103, acquiring data corresponding to the byte value of the preset position of the message to be filtered.
In the foregoing implementation steps, the preset position of the message to be filtered may be the last two byte values in the message to be filtered, and in some embodiments, the preset position of the message to be filtered may also be the last three byte values or the first two byte values in the message to be filtered.
Because the number of the occupied caches when the message to be filtered is stored in the caches and the byte length in each cache may be different, when data corresponding to the byte values at the preset positions of the message to be filtered are obtained, for example, data corresponding to the last two byte values in the message to be filtered are obtained in the following manner:
judging the byte length of the last cache for storing the message to be filtered, and if the byte length of the last cache is one, taking the data corresponding to the byte value of the last cache and the data corresponding to the last byte value of the previous cache of the last cache; and if the byte length of the last cache is not one, taking the data corresponding to the last two byte values of the last cache.
When the message to be filtered is stored in the cache, it may occupy only one cache. That cache is then also the last cache, and because of the nature of a message its byte length is not one, so the data corresponding to the last two byte values of that cache are taken as the data corresponding to the last two byte values of the message to be filtered.
In some embodiments, when the preset position of the message to be filtered is the last three byte values of the message, the byte length of the last cache storing the message is obtained. If the byte length of the last cache is not less than three, the data corresponding to the last three byte values of the last cache are acquired; if it is less than three, the data corresponding to all byte values of the last cache are acquired, and the data corresponding to the last one or two byte values of the cache preceding the last cache are acquired to make up the required number of bytes. The more bytes are taken from the message to be filtered, the lower the processing efficiency of the network processor, so in this embodiment the preset position is preferably the last two byte values of the message to be filtered.
In some embodiments, when the preset position of the message to be filtered is the first two byte values in the message to be filtered, the data corresponding to the first two byte values of the first cache occupied by the message to be filtered may be acquired.
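The tail-byte rule above (take the bytes from the last cache, and borrow from the preceding cache when the last one is too short) can be sketched as below. The function name `tail_bytes` and the representation of caches as a sequence of `bytes` objects are assumptions made for illustration.

```python
from typing import Sequence


def tail_bytes(buffers: Sequence[bytes], n: int = 2) -> bytes:
    """Return the last n bytes of a message stored across buffers.

    Walks backwards from the last buffer and borrows from earlier
    buffers only while fewer than n bytes have been collected.
    """
    collected = b""
    for buf in reversed(buffers):
        need = n - len(collected)
        if need <= 0:
            break
        # buf[-need:] yields the whole buffer when it is shorter than need.
        collected = buf[-need:] + collected
    return collected


# Last cache holds one byte: one byte is borrowed from the previous cache.
print(tail_bytes([b"abcd", b"e"]))   # b'de'
```

The first-two-bytes variant mentioned in the text needs no such handling; it would simply read the start of the first cache.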
S104, comparing a check code to be identified, which is formed by the hash value, the cyclic redundancy check value and the data corresponding to the byte value of the preset position, with a check code in a preset duplication elimination table, wherein if the check code to be identified exists in the preset duplication elimination table, the message to be filtered is a repeated message, and the repeated message is discarded or forwarded.
In the above implementation step, the preset deduplication table stores check codes (keys). A stored check code may be a check code to be identified that was obtained when the network processor processed earlier messages of the same batch, each such code being composed of the hash value, the cyclic redundancy check value and the data corresponding to the byte values at the preset positions. Before deduplication starts, the preset deduplication table may or may not already contain check codes.
And comparing the check code to be identified, which is formed by the data corresponding to the hash value, the cyclic redundancy check value and the byte value at the preset position, with the check code in the preset duplication elimination table, wherein if the check code to be identified exists in the preset duplication elimination table, the message to be filtered is a repeated message, and the message to be filtered (the repeated message) can be discarded or forwarded.
If the check code to be identified does not exist in the preset duplication elimination table, the message to be filtered is not a duplicate message, and the message to be filtered, which is not a duplicate message, can be output to subsequent equipment for processing after being forwarded to a corresponding output port (e.g., an optical interface). In order to prevent the message which is repeated with the message to be filtered from appearing in the subsequent processing, the acquired check code to be identified can be added into a preset duplication removing table through self-learning so as to facilitate the query of the subsequent message to be filtered.
In some embodiments, the preset deduplication table may have an aging function: an aging time is set, for example one second or two seconds, and if a check code has been in the preset deduplication table for longer than the set aging time, it is aged out and removed, after which a new check code can be learned again.
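The lookup, self-learning and aging behaviour of the deduplication table can be sketched as follows. Everything here is illustrative: `DedupTable`, `make_key` and the injectable `clock` are hypothetical names, the hash is again an `adler32` stand-in, the message is assumed to sit in a single cache, and aging is done lazily at lookup time rather than by a background sweep.

```python
import time
import zlib
from typing import Callable, Dict, Tuple

Key = Tuple[int, int, bytes]  # (head hash, CRC-32, last two bytes)


def make_key(frame: bytes) -> Key:
    """Build the check code for a frame held in one buffer (simplification)."""
    return (zlib.adler32(frame[:64]), zlib.crc32(frame), frame[-2:])


class DedupTable:
    """Check-code table with self-learning and time-based aging."""

    def __init__(self, aging_seconds: float = 1.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.aging_seconds = aging_seconds
        self.clock = clock
        self._added_at: Dict[Key, float] = {}

    def is_duplicate(self, key: Key) -> bool:
        now = self.clock()
        added = self._added_at.get(key)
        if added is not None and now - added <= self.aging_seconds:
            return True             # present and not aged out: duplicate
        self._added_at[key] = now   # unseen or aged out: (re)learn the key
        return False
```

A caller would discard or forward the message when `is_duplicate` returns `True` and pass it on to the output port otherwise. A production table would also need to purge aged entries that are never looked up again, which this sketch omits.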
In the above embodiment, the obtained message to be filtered is stored in the cache, and then the hash values of a plurality of bytes in the message to be filtered, the cached cyclic redundancy check value, and the data corresponding to the byte value at the preset position are obtained through calculation, and the check code to be identified, which is composed of the hash values, the cyclic redundancy check value, and the data corresponding to the byte value at the preset position, is compared with the check code in the preset deduplication table, if the check code to be identified exists in the preset deduplication table, the message to be filtered is a duplicate message, and the duplicate message can be discarded or forwarded. The method adopts a check code to be identified which is composed of data corresponding to a hash value, a cyclic redundancy check value and a byte value at a preset position, compares the check code to be identified with a preset duplication elimination table so as to detect whether a message to be filtered is a repeated message, and improves the accuracy of detecting the repeated message through multiple comparison so as to avoid processing the message to be filtered which is not the repeated message as the repeated message.
As an optional implementation, after obtaining the message to be filtered, global configuration may be performed on the message to be filtered, a Logic ID (logical ID) of an input Port of the message to be filtered is searched and read, a source of the message to be filtered may be determined according to a Logic ID range, and if the message enters from a Port (e.g., an optical interface), a Port ID (Port ID) may be obtained by mapping a Slot ID (Slot ID) and the Logic ID in the global configuration, so that a Port from which the message to be filtered specifically enters may be determined.
As an optional implementation manner, after determining from which port the message to be filtered specifically enters, the method may further include performing protocol analysis on the message to be filtered to identify the message to be filtered, and for the message to be filtered that is not identified successfully, may read an unidentified message action configured globally, discard the message to be filtered that is not identified successfully, or forward the message to be filtered that is not identified successfully to a preset packet group.
Five-tuple information of the outer layer or the innermost layer, including a source IP address (SIP), a destination IP address (DIP), a source port number (SP), a destination port number (DP) and a protocol, as well as the L3/L4 offsets and the like, may be acquired according to the global configuration.
For a successfully identified message to be filtered, the five-tuple information can be used to search, in sequence, for SIP, DIP, the five-tuple (SIP, DIP, SP, DP and protocol), SP + protocol, DP + protocol, SIP + DP, DIP + SP, SIP + protocol and DIP + protocol. That is, nine exact five-tuple lookups and masked five-tuple lookups are carried out using the five-tuple information carried by the message to be filtered, so that the messages in the traffic are filtered, forwarded, discarded, and so on. If any one of the nine lookups hits, the subsequent lookups need not be performed, which reduces the processing workload of the network processor; if all nine lookups are performed in sequence and none hits, the message to be filtered is discarded.
For example, when the nine lookups are performed on a successfully identified message to be filtered, the SIP information is first extracted from the parsed and identified message and the SIP rule lookup is performed; if a rule is found, the message to be filtered is forwarded or discarded accordingly and the subsequent lookups are terminated. If no rule is found, the DIP rule lookup is performed; if a rule is found, the message is forwarded or discarded and the subsequent lookups are terminated. The lookups continue in this way until all nine have been performed. If none of them hits, the message to be filtered is discarded; if any of them hits, the other flow operations, such as forwarding or discarding, continue.
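The first-hit-wins sequence of nine lookups can be sketched as below. Only exact matching is modeled; masked lookups are left out. The rule-table layout (`Rules`), the action strings and the names `FiveTuple`, `LOOKUP_ORDER` and `classify` are assumptions made for this illustration and do not come from the patent.

```python
from typing import Dict, NamedTuple, Tuple


class FiveTuple(NamedTuple):
    sip: str
    dip: str
    sp: int
    dp: int
    proto: str


# Lookup order as listed in the text; each entry names the fields of the rule key.
LOOKUP_ORDER: Tuple[Tuple[str, ...], ...] = (
    ("sip",),
    ("dip",),
    ("sip", "dip", "sp", "dp", "proto"),
    ("sp", "proto"),
    ("dp", "proto"),
    ("sip", "dp"),
    ("dip", "sp"),
    ("sip", "proto"),
    ("dip", "proto"),
)

# Hypothetical layout: field combination -> {field values -> action}.
Rules = Dict[Tuple[str, ...], Dict[tuple, str]]


def classify(pkt: FiveTuple, rules: Rules) -> str:
    """Run the lookups in order and return the action of the first hit."""
    for fields in LOOKUP_ORDER:
        key = tuple(getattr(pkt, f) for f in fields)
        action = rules.get(fields, {}).get(key)
        if action is not None:
            return action   # first hit wins; later lookups are skipped
    return "drop"           # none of the nine lookups matched
```

Stopping at the first hit is what keeps the per-message cost low: a message matching an SIP rule costs one lookup rather than nine.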
The following describes in detail the steps of the message deduplication method provided in this embodiment with reference to fig. 2 by taking the aggregation and offloading device as an example:
the message to be filtered flows into the convergence and shunt device from the optical interface of the convergence and shunt device, and the convergence and shunt device executes the steps at least including the following steps after receiving the message to be filtered:
S201, message parsing.
After receiving the message to be filtered, the convergence and shunt device parses the message to be filtered, that is, identifies it.
S202, judging whether to start message deduplication.
After the message parsing step is executed, the convergence and shunt device judges whether message deduplication is enabled;
if the judgment result in step S202 is "no", step S218 is executed directly to perform the normal message processing flow; if the judgment result of step S202 is "yes", that is, message deduplication is enabled, step S203 is executed to further judge the deduplication mode. The deduplication mode can be selected by a user, or the convergence and shunt device can use a default mode.
If the judgment result in step S203 is "full packet deduplication", step S204 is executed, and the hash value of the first sixty-four bytes is calculated starting from the MAC address; if the judgment result in step S203 is "load deduplication", step S205 is executed, and the hash value of the first sixty-four bytes is calculated starting from the load (payload). The effective length of the message is used when the hash value is calculated, that is, if the message to be filtered is shorter than sixty-four bytes, the convergence and shunt device calculates the hash value over only the bytes actually contained in the message to be filtered.
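The two hashing modes and the short-message rule can be sketched as below. The description does not name the hash function, so zlib.adler32 is used purely as a stand-in; the function name head_hash, the mode strings and the payload_offset parameter (assumed to be supplied by the parsing step) are likewise illustrative assumptions.

```python
import zlib

HASH_WINDOW = 64  # number of leading bytes hashed, per the description


def head_hash(frame: bytes, mode: str, payload_offset: int = 0) -> int:
    """Hash up to the first 64 bytes, starting at the MAC header or at the payload."""
    if mode == "full":
        start = 0  # full packet deduplication: start from the MAC address
    elif mode == "payload":
        start = payload_offset  # load deduplication: start from the payload
    else:
        raise ValueError(f"unknown deduplication mode: {mode}")
    # Slicing stops at the end of the frame, so a message shorter than
    # 64 bytes is hashed over only the bytes it actually contains.
    window = frame[start:start + HASH_WINDOW]
    return zlib.adler32(window)  # stand-in hash; the real function is unspecified
```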
S207, judging whether the buf (cache) number occupied by the message to be filtered is one.
After the sixty-four byte hash value is calculated in step S206, step S207 is executed to determine whether the number of the buffers occupied by the packet to be filtered is one.
If the judgment result in step S207 is "no", step S208 is executed to calculate the cyclic redundancy check value of the first cache, and then step S210 is executed to calculate the cyclic redundancy check values of the remaining caches in sequence; if the judgment result of step S207 is "yes", step S209 is executed to calculate the cyclic redundancy check (CRC32) value of the first cache (which in this case is the only cache).
S211, judging whether the byte length of the last cache is one (Last-buf-len == 1).
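The single-cache and multi-cache branches can be sketched with one loop. This sketch assumes that the per-cache values are chained into a single running CRC32; the description does not state how the values of the individual caches are combined, and the function name crc32_over_buffers is hypothetical.

```python
import zlib
from typing import Sequence


def crc32_over_buffers(buffers: Sequence[bytes]) -> int:
    """Compute a CRC32 over a message that is spread across one or more caches."""
    crc = 0
    for buf in buffers:
        # zlib.crc32 accepts the previous value as a starting point, so the
        # first cache is processed first and the remaining caches follow in order.
        crc = zlib.crc32(buf, crc)
    return crc
```

With chaining, the result does not depend on where the message happens to be split across caches, so the same message always yields the same value.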
If the judgment result in the step S211 is "no", then step S213 is executed to directly obtain data corresponding to the last two byte values of the message to be filtered; if the determination result in the step S211 is "yes", after the step S212 is executed, and data corresponding to the last byte value of the last cache of the message to be filtered is fetched, the step S214 is executed, and data corresponding to the last byte value of the previous cache of the last cache is fetched.
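The boundary case handled by steps S211 to S214 can be sketched as follows. The function name last_two_bytes is hypothetical, the caches are assumed to be non-empty byte strings, and the two bytes are returned in message order, which is an assumption about how the device stores them.

```python
from typing import Sequence


def last_two_bytes(buffers: Sequence[bytes]) -> bytes:
    """Return the final two bytes of a message stored across one or more caches."""
    last = buffers[-1]
    if len(last) == 1 and len(buffers) >= 2:
        # The last cache holds a single byte, so the second-to-last byte of
        # the message is the final byte of the previous cache.
        return buffers[-2][-1:] + last
    # Otherwise both bytes sit in the last cache.
    return last[-2:]
```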
S215, searching a preset duplication elimination table of the message.
If the calculated hash value, the cyclic redundancy check value and the data corresponding to the last two bytes are hit in the preset deduplication table, step S217 is executed and the message to be filtered (a duplicate message) is discarded; if they are not hit, the message to be filtered is not a duplicate message, and step S216 is executed to add the hash value, the cyclic redundancy check value and the data corresponding to the last two bytes to the preset deduplication table, that is, the deduplication table learns the entry, so as to facilitate deduplication of subsequent messages to be filtered. Step S218 is then performed on the message to be filtered to carry out the normal message processing flow.
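The lookup-then-learn behaviour of steps S215 to S217 can be sketched with an in-memory set. The class name DedupTable and the tuple layout of the check code are assumptions made for illustration; entry aging and table capacity, which a real device would need, are not described in this section and are not modelled.

```python
from typing import Set, Tuple

# Check code: (hash of leading bytes, CRC32 value, last two bytes).
CheckCode = Tuple[int, int, bytes]


class DedupTable:
    """Minimal stand-in for the preset deduplication table."""

    def __init__(self) -> None:
        self._seen: Set[CheckCode] = set()

    def is_duplicate(self, code: CheckCode) -> bool:
        """Return True on a hit; on a miss, learn the code and return False."""
        if code in self._seen:
            return True       # hit: the message is a duplicate
        self._seen.add(code)  # miss: learn the entry for later messages
        return False
```

All three components must match for a hit, so two messages that collide on the hash alone are still told apart by the CRC32 value or by the trailing bytes.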
Example 2
This embodiment provides a packet deduplication device, configured to deduplicate a packet to be filtered, as shown in fig. 3, including:
the first obtaining unit 301 is configured to obtain a message to be filtered, and store the message to be filtered in a cache. For details, please refer to the related description of step S101 in embodiment 1, which is not repeated herein.
A calculating unit 302, configured to calculate hash values of a plurality of preceding bytes in the message to be filtered, and calculate a cyclic redundancy check value of the cache. For details, please refer to the related description of step S102 in embodiment 1, which is not repeated herein.
A second obtaining unit 303, configured to obtain data corresponding to a byte value of a preset position of the packet to be filtered. For details, please refer to the related description of step S103 in embodiment 1, which is not repeated herein.
A determining unit 304, configured to compare a check code to be identified, which is formed by the data corresponding to the hash value, the cyclic redundancy check value, and the byte value at the preset position, with a check code in a preset deduplication table, if the check code to be identified exists in the preset deduplication table, the packet to be filtered is a duplicate packet, and the duplicate packet is discarded or forwarded. For details, please refer to the related description of step S104 in embodiment 1, which is not repeated herein.
In the above implementation steps, the first obtaining unit 301 stores the obtained message to be filtered in a cache, the calculating unit 302 calculates hash values of a plurality of bytes in the message to be filtered and a cyclic redundancy check value of the cache, the second obtaining unit 303 obtains data corresponding to a byte value at a preset position, the determining unit 304 compares a check code to be identified, which is composed of the hash values, the cyclic redundancy check value and the data corresponding to the byte value at the preset position, with a check code in a preset deduplication table, and if the check code to be identified exists in the preset deduplication table, the message to be filtered is a duplicate message, and the duplicate message can be discarded or forwarded. The device adopts the check code to be identified which is composed of data corresponding to a hash value, a cyclic redundancy check value and a byte value at a preset position, and compares the check code to be identified with a preset duplication elimination table so as to detect whether the message to be filtered is a repeated message or not, and improves the accuracy of the detection of the repeated message through multiple comparison so as to avoid processing the message to be filtered which is not the repeated message as the repeated message.
As an optional implementation manner, the packet deduplication device may further include a packet parsing unit, configured to perform protocol parsing, including tunnel parsing, on the packet to be filtered through a protocol parsing module. The packet parsing unit mainly implements message identification, key information extraction and lookup key extraction, and also collects statistics on various input-related parameters.
Example 3
The embodiment provides a convergence and offloading device, which is capable of completing processing such as message identification, stream management, stream statistics, rule matching, and packet sampling on a message to be filtered, as shown in fig. 4 and 5, and includes a network processor module 404 and an optical interface module. The network processor module 404 stores and executes program instructions/modules (such as the first acquiring unit 301, the calculating unit 302, the second acquiring unit 303, and the determining unit 304 shown in fig. 3) corresponding to the message deduplication method of embodiment 1.
The optical interface module is connected to the network processor module 404, the packet to be filtered can flow into the network processor module 404 from the optical interface module, and the network processor module 404 executes the packet deduplication method as described in embodiment 1 on the packet to be filtered, so that the detection accuracy of the packet to be filtered is improved, and the normal packet is prevented from being processed as a duplicate packet. Repeated messages to be filtered are effectively filtered, and the processing performance of the convergence and diversion equipment can be improved.
As shown in fig. 4, the optical interface module may include a first optical interface module 4061 and a second optical interface module 4062, the network processor module may include a first network processor 4041 and a second network processor 4042, the first optical interface module 4061 is directly connected to the network processor module and is referred to as a direct connection port, the second optical interface module 4062 is connected to the switching module 403 and then is connected to the network processor module and is referred to as a switching port, and both the direct connection port and the switching port may be used as input and output ports of the packet to be filtered.
The switching module 403 may be connected to the first network processor 4041 and the second network processor 4042, respectively, to implement line-speed processing of the to-be-filtered packet, the processed to-be-filtered packet may be returned to the switching module 403 by the network processor module 404, and then the switching module 403 implements cross-board forwarding processing through the orthogonal connector 401, or directly implements local board data outflow through the second optical interface module 4062.
As shown in fig. 5, the network processor module 404 is the core data processing unit of the convergence and offloading device, and the network processor module 404 may cooperate with the NL12K chip 412 to implement the data processing flow of the convergence and offloading device; the network processor module 404 may also cooperate with a TCAM (ternary content-addressable memory) chip to implement data mask rule matching, and the board network management module 405 is connected with the network processor module 404 through a PCI-E interface to implement configuration and management. The network processor module 404 is also connected to the memory chip 413.
The first network processor 4041 and the second network processor 4042 in the network processor module 404 may each adopt an NPS400 network processor chip, which can implement data input and output at 200Gbps with the optical interface module and data input and output at 800Gbps with the switch module 403. The SerDes (serializer/deserializer) ports of the NPS400 chip are divided into two groups, EAST and WEST, each group has 50 ports, and the ports can be configured in a 12 lane mode or a 10 lane mode; in this embodiment, both groups of ports adopt the 12 lane mode. However, since only the PCIE interface of the WEST group can access the configuration registers inside the NPS400, the ports of the WEST group are mainly connected to the switch module 403, and the ports of the EAST group are mainly connected to the optical interface module and the TCAM chip.
As shown in fig. 1 and fig. 6, the switching module 403 undertakes the forwarding functions of single-board service data and the control plane. In this embodiment, factors such as function, performance, cost and technology continuity are considered comprehensively, and the switching chip 4031 of the Fabric channel of the convergence and offloading device is a BCM56960, which can be used for switching and forwarding the processed data. The chip is based on a full-hardware forwarding technology, adopts a distributed and scalable design, supports a maximum switching capacity of 3.2 Tbps, and can fully meet the research and development requirements of next-generation devices. The switching chip 4032 of the Base channel of the convergence and shunt device is a BCM5389, which can meet the service requirements of the control plane.
In this embodiment, the interface characteristics of the NPS daughter card, the cavum daughter card, the I/O daughter card and the BCM56960 chip, together with the backplane switching capacity and input/output requirements, are comprehensively considered so as to maximize interface utilization. The convergence and shunt device may be provided with an extended optical interface module (not shown; the extended optical interface module and the optical interface module are each provided with a plurality of optical interfaces), and the extended optical interface module is connected with the convergence and shunt device through the orthogonal connector 401.
Since the routing distance from the switch module 403 to the expansion optical interface module is long, the signal relay module 402 needs to be added to enhance the integrity of the signal. As shown in fig. 6-8, the signal relay module 402 may use a chip with a model of BCM82381 to perform signal relay amplification, where the chip supports three speeds configured as 10G/40G/100G, and can meet application requirements of different interfaces.
In this embodiment, as shown in fig. 6 and fig. 7, the switch module 403 may output 16 × 100G to the optical interface 40611 of the expansion optical interface module, and a total of eight BCM82381 chips are required to complete the signal relay amplifying function. In some embodiments, on the line side, the BCM82381 chip may be interconnected with the extended optical interface module through an 8x25G (which may also be configured as 8x 10G) interface. On the system side, the BCM82381 chip may be interconnected with the switch module 403 via an 8x25G (which may also be configured as 8x 10G) interface.
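The count of eight relay chips is consistent with the lane figures given above, under the assumption that each BCM82381 chip uses all eight of its 25G lanes on one side to carry 100G ports (four lanes per port); the description itself does not spell out this mapping.

```latex
\[
\underbrace{8 \times 25\,\mathrm{G}}_{\text{lanes per chip}} = 200\,\mathrm{G} = 2 \times 100\,\mathrm{G}
\qquad\Longrightarrow\qquad
\frac{16 \times 100\,\mathrm{G}}{2 \times 100\,\mathrm{G}\ \text{per chip}} = 8\ \text{chips}
\]
```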
In this embodiment, as shown in fig. 5, 6, and 8, the optical interface module mainly implements twenty-path input and output of 100G optical signal data, the optical interface may adopt a standard interface of a model QSFP28, and the 100G optical signal may adopt a chip of a model BCM82792 (e.g., the signal conversion chip 411 shown in fig. 5 and 8) to be converted into a CAUI-10 signal and then interface with the network processor module 404. When a message to be filtered is input into the network processor module 404, signal conversion needs to be completed, a QSFP28 optical interface can complete conversion from a 100G optical signal to an electrical signal, an electrical interface can be a 4 × 25G interface, and a management interface can be an I2C interface, where the QSFP28 optical interface can be mainly divided into a management and indication interface, an I2C interface, and a serial data interface.
As shown in fig. 4, the convergence and offloading device may further include an IPMC (intelligent platform management controller) module 407, a logic management module 408, a clock module 409, a power supply module 410, and a board network management module 405.
The IPMC module 407 is the bearer of the ATCA (Advanced Telecommunications Computing Architecture) intelligent platform management functions and is physically located mainly on each FRU in the ATCA rack. The IPMC module 407 is configured to record basic information of the board and implement bottom-layer management of the board (e.g., hot plug management, power management, etc.). The IPMC module 407 may communicate with the ShMC over two dual-redundant IPMB (Intelligent Platform Management Bus) buses (IPMI interfaces). The ShMC is responsible for managing the FRUs (such as single boards, power supplies, fans, etc.) in the ATCA system.
The board network management module 405 is the control and management unit of the whole board, and is mainly responsible for the management, configuration, control and communication functions of the board, specifically including the following:
1) chip configuration of the single board, namely, completing parameter configuration of each main device of the single board;
2) system ACL rule management, namely, completing management of large-capacity ACL rules in cooperation with the main control board;
3) communication and control, namely, realizing communication interaction between the board card and the single board network management modules through the backplane BASE channel, and completing management and control of the single board system under the unified management of the master control CPU.
The logic management module 408 mainly implements functions such as board reset control, LPC bus timing sequence conversion, MDC/MDIO bus control, optical interface LED control, and board logic control. The reset control of the single board is mainly triggered by 706 reset and IPMC reset, and the COME (modular computer) can manage the single board in a mode of reading and writing a register by LPC.
The power supply module 410 may adopt a discrete modular design, and in combination with power network requirement analysis and the actual situation of the current mainstream power supply module, a BMR480 power supply module, a BMR467 power supply module, a PVX012A power supply module, and the like may be selected.
The clock module 409 may adopt a passive crystal oscillator or a differential clock. A passive crystal oscillator combined with a clock buffer can be used to meet the multi-channel clock requirement, where the passive crystal oscillator may have a specification of 50 MHz, 100 MHz, 125 MHz, 156.25 MHz or the like, and the clock buffer may be a four-channel MC100LVEP14DT and a twelve-channel CDCLVP1212 RHA.
The present embodiment also provides a non-transitory computer-readable storage medium, where the computer-readable storage medium stores computer-executable instructions, where the computer-executable instructions may execute the message deduplication method in any of the above method embodiments. The storage medium may be a magnetic Disk, an optical Disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a Flash Memory (Flash Memory), a Hard Disk (Hard Disk Drive, abbreviated as HDD), a Solid State Drive (SSD), or the like; the storage medium may also comprise a combination of memories of the kind described above.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
It should be understood that the above examples are given only for clarity of illustration and are not intended to limit the embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. It is neither necessary nor possible to enumerate all embodiments here. Obvious variations or modifications derived therefrom remain within the scope of the invention.

Claims (10)

1. A message duplication eliminating method is characterized by comprising the following steps:
acquiring a message to be filtered, and storing the message to be filtered into a cache;
calculating hash values of a plurality of bytes in the message to be filtered, and calculating a cyclic redundancy check value of the cache;
acquiring data corresponding to byte values of preset positions of the messages to be filtered;
and comparing a check code to be identified, which is formed by the data corresponding to the hash value, the cyclic redundancy check value and the byte value at the preset position, with a check code in a preset deduplication table, wherein if the check code to be identified exists in the preset deduplication table, the message to be filtered is a repeated message, and the repeated message is discarded or forwarded.
2. The message deduplication method of claim 1, wherein the method further comprises:
if the check code to be identified does not exist in the preset duplication elimination table, the message to be filtered is not a duplicate message;
and adding the check code to be identified to the preset duplication eliminating table.
3. The message deduplication method according to claim 1 or 2, wherein the calculating the hash value of a plurality of preceding bytes in the message to be filtered includes:
acquiring a duplicate removal mode of the message to be filtered, wherein the duplicate removal mode is full packet duplicate removal or load duplicate removal;
if the duplication elimination mode is full packet duplication elimination, calculating the hash values of a plurality of bytes in the message to be filtered from the local area network address;
and if the deduplication mode is load deduplication, calculating the hash value of a plurality of bytes in the message to be filtered from the load.
4. The message deduplication method of claim 1 or 2, wherein the calculating the cached cyclic redundancy check value comprises:
judging the number of caches occupied by the message to be filtered;
if the number of occupied caches is one, calculating the cyclic redundancy check value of the caches;
and if the number of occupied caches is multiple, after the cyclic redundancy check value of the first cache is calculated, continuously and sequentially calculating the cyclic redundancy check values of the rest caches.
5. The method according to claim 1 or 2, wherein the preset positions of the messages to be filtered are the last two byte values of the messages to be filtered, and the obtaining of the data corresponding to the byte values of the preset positions of the messages to be filtered includes:
judging the byte length of the last cache for storing the message to be filtered;
if the byte length of the last cache is one, taking the data corresponding to the byte value of the last cache and the data corresponding to the last byte value of the previous cache of the last cache;
and if the byte length of the last cache is not one, taking data corresponding to the last two byte values of the last cache.
6. The method according to claim 1 or 2, wherein after the obtaining the message to be filtered, the method further comprises:
carrying out global configuration on the message to be filtered;
carrying out protocol analysis on the message to be filtered so as to identify the message to be filtered;
if the message to be filtered is not identified successfully, discarding the message to be filtered which is not identified successfully, or forwarding the message to be filtered which is not identified successfully to a preset sub-stream group.
7. The message deduplication method of claim 6, further comprising:
acquiring five-tuple information of an outer layer or an innermost layer according to the global configuration, wherein the five-tuple information comprises a source IP address, a destination IP address, a source port number, a destination port number and a protocol;
if the message to be filtered is successfully identified, searching by source IP address, destination IP address, five-tuple, source port number + protocol, destination port number + protocol, source IP address + destination port number, destination IP address + source port number, source IP address + protocol and destination IP address + protocol in sequence by utilizing the five-tuple information; wherein the lookup comprises an exact five-tuple lookup and a masked five-tuple lookup.
8. A message deduplication apparatus, comprising:
the first acquisition unit is used for acquiring a message to be filtered and storing the message to be filtered into a cache;
the calculation unit is used for calculating hash values of a plurality of bytes in the message to be filtered and calculating a cached cyclic redundancy check value;
the second obtaining unit is used for obtaining data corresponding to the byte value of the preset position of the message to be filtered;
and the judging unit is used for comparing a check code to be identified, which is formed by the data corresponding to the hash value, the cyclic redundancy check value and the byte value at the preset position, with a check code in a preset duplication elimination table, if the check code to be identified exists in the preset duplication elimination table, the message to be filtered is a repeated message, and the repeated message is discarded or forwarded.
9. A convergence shunting device, comprising:
a network processor module for performing the message deduplication method of any of claims 1-7;
and the optical interface module is connected with the network processor module.
10. A computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the message deduplication method of any one of claims 1-7.
CN202110329957.0A 2021-03-29 2021-03-29 Message duplicate removal method and device, convergence and distribution equipment and storage medium Active CN112714077B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110329957.0A CN112714077B (en) 2021-03-29 2021-03-29 Message duplicate removal method and device, convergence and distribution equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110329957.0A CN112714077B (en) 2021-03-29 2021-03-29 Message duplicate removal method and device, convergence and distribution equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112714077A true CN112714077A (en) 2021-04-27
CN112714077B CN112714077B (en) 2021-07-20

Family

ID=75550397

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110329957.0A Active CN112714077B (en) 2021-03-29 2021-03-29 Message duplicate removal method and device, convergence and distribution equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112714077B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108737287A (en) * 2018-05-22 2018-11-02 北京中创腾锐技术有限公司 Repeated packets recognition methods, device and convergence shunting device
US10892990B1 (en) * 2018-11-10 2021-01-12 Veritas Technologies Llc Systems and methods for transmitting data to a remote storage device
CN111770023A (en) * 2020-06-28 2020-10-13 湖南有马信息技术有限公司 Message duplicate removal method and device based on FPGA and FPGA chip
CN112152937A (en) * 2020-09-29 2020-12-29 锐捷网络股份有限公司 Message duplicate removal method and device, electronic equipment and storage medium
CN112491745A (en) * 2020-11-17 2021-03-12 广州西麦科技股份有限公司 Flow duplicate removal method and device

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111131479A (en) * 2019-12-27 2020-05-08 迈普通信技术股份有限公司 Flow processing method and device and flow divider
CN111131479B (en) * 2019-12-27 2022-04-05 迈普通信技术股份有限公司 Flow processing method and device and flow divider
CN117240799A (en) * 2023-11-16 2023-12-15 北京中科网芯科技有限公司 Message de-duplication method and system for convergence and distribution equipment
CN117240799B (en) * 2023-11-16 2024-02-02 北京中科网芯科技有限公司 Message de-duplication method and system for convergence and distribution equipment

Also Published As

Publication number Publication date
CN112714077B (en) 2021-07-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant