CN116436987B

CN116436987B - IO-Link master station data message transmission processing method and system

Info

Publication number: CN116436987B
Application number: CN202310684543.9A
Authority: CN
Inventors: 赵家茂; 程超
Original assignee: Shenzhen Shunchang Automation Control Technology Co ltd
Current assignee: Shenzhen Shunchang Automation Control Technology Co ltd
Priority date: 2023-06-12
Filing date: 2023-06-12
Publication date: 2023-08-22
Anticipated expiration: 2043-06-12
Also published as: CN116436987A

Abstract

The application provides a method and a system for transmitting and processing data messages of an IO-Link master station, and relates to the technical field of data processing. The method comprises the following steps: collecting message data; segmenting the message data according to data attributes to obtain a plurality of segment data; processing each segment data according to character frequency to obtain corresponding character groups; calculating the centrality of each character in each character group to obtain group edge characters; calculating the connectivity of each group of edge characters through the annular characters to obtain linking characters; obtaining a character sequence from the linking characters and the characters in the character groups, and performing BWT coding on the segment data according to that character sequence; and compressing the encoded data of each segment data by LZ77 to obtain and transmit segment compressed data. By defining the character sequence from the characteristics of the data rather than using a fixed one, the application gives the subsequent compression more favorable input, which improves the compression effect and allows message data to be transmitted efficiently.

Description

IO-Link master station data message transmission processing method and system

Technical Field

The application relates to the technical field of data processing, in particular to a method and a system for transmitting and processing data messages of an IO-Link master station.

Background

Existing approaches to compressing message data mainly use BWT coding to strengthen the correlation between neighboring data and then compress the result with LZ77. However, message data contains many types of characters, and existing BWT coding usually sorts with a fixed character sequence, such as dictionary order, so the coded result often cannot make full use of the characteristics of the data.

Disclosure of Invention

The application aims to provide a method and a system for transmitting and processing IO-Link master station data messages, so as to alleviate the technical problems in the prior art.

In a first aspect, an embodiment of the present application provides a method for transmitting and processing an IO-Link master station data packet, where the method includes:

collecting message data;

segmenting the message data according to the data attribute to obtain a plurality of segment data;

processing each segment of data according to the frequency to obtain a corresponding character group;

calculating the centrality of each character in each character group to obtain group edge characters;

calculating the connectivity of each group of edge characters through the annular characters to obtain the linking characters;

obtaining a character sequence according to the linking characters and the characters in the character group, and performing BWT coding on the segment data according to the character sequence;

the encoded data for each segment data is compressed by LZ77 to obtain and transmit segment compressed data.

In an alternative embodiment, the step of segmenting the message data according to the data attribute to obtain a plurality of segment data includes:

searching keywords in the message data;

and putting together data with the same meaning according to the keywords to obtain segment data.

In an alternative embodiment, the step of putting together data with the same meaning according to the keywords to obtain segment data includes:

classifying the message data according to data prefixes, and taking message data with the same prefix as one class of data, recorded as one segment data.

In an alternative embodiment, the step of processing each segment data according to the frequency to obtain a corresponding character group includes:

counting the frequency value of each character in each segment data;

performing ascending order sorting according to the frequency value of each character to obtain an ascending order frequency sequence;

dividing the ascending frequency sequence through Otsu multi-threshold segmentation to obtain segmentation points;

the frequency value between adjacent division points is used as a class of frequency value;

obtaining all characters corresponding to each category according to one character corresponding to each frequency value in each category;

and all the characters with similar frequencies corresponding to each category form the character group.

In an alternative embodiment, the step of calculating the centrality of each character in each character group to obtain a group edge character includes:

applying a KM maximum matching principle to each character group, and finding out a character corresponding to the maximum edge value from a left node in KM matching as an initial node;

obtaining a matching character of the initial node on the right side according to the matching relation;

finding out the homonymous node of the matching character on the right side on the left side, and obtaining the matching character of the homonymous node on the left side on the right side according to the matching relation;

iterating for a plurality of times until all characters in the character group are traversed to obtain a matching chain;

and taking the two characters with the greatest centrality in the matching chain as the group of edge characters.

In an alternative embodiment, the step of calculating character centrality comprises:

obtaining the edge value of the edge corresponding to the character;

obtaining the ratio of the numbers of nodes on the two sides of the character node;

obtaining the centrality of the character from the edge value and the ratio according to a first preset formula (the formula and its symbols are not reproduced in this text).

In an alternative embodiment, the step of calculating the connectivity of each group of edge characters through the annular characters to obtain the linking characters includes:

according to the order of the original character string, connecting all the characters in sequence in the counterclockwise direction to construct annular characters;

for each edge character of each character group in the annular characters, calculating the centrality of the corresponding last edge character when that edge character is used as the first character of the character group, the connectivity being represented by the centrality of the last edge character;

and taking the first character of the character group corresponding to the last edge character with the greatest centrality as the actual initial edge character of the character group, and taking the actual initial edge character and its corresponding last edge character as the linking characters.

In an alternative embodiment, the step of calculating the centrality of the last edge character in the annular characters includes:

obtaining all neighborhood characters corresponding to the edge characters from the annular characters;

taking the centrality of each neighborhood character as a basic value, taking the occurrence frequency of each neighborhood character as a weight value, and obtaining the centrality of the tail edge character corresponding to the edge character by weighted summation.

In an alternative embodiment, the method further comprises:

when the data is decompressed, the compressed data of the segment is decompressed through LZ77 to obtain decompressed data;

performing inverse BWT coding on each segment of the decompressed data to obtain the original segment data;

and extracting preset data from the original segment data to obtain original message data.

In a second aspect, the embodiment of the application also provides a system for transmitting and processing the data message of the IO-Link master station. The system comprises:

the data acquisition module is used for acquiring message data;

the sequence calculation module is used for segmenting the message data according to the data attribute to obtain a plurality of segment data;

processing each segment of data according to the frequency to obtain a corresponding character group;

calculating the centrality of each character in each character group to obtain group edge characters;

calculating the connectivity of each group of edge characters through the annular characters to obtain the linking characters; and

obtaining a character sequence according to the linking characters and the characters in the character group, and performing BWT coding on the segment data according to the character sequence;

and the compression transmission module is used for compressing the coded data of each segment data through LZ77 to obtain and transmit the segment compressed data.

The application forms the character group through character frequency processing, improves the probability of forming the combined character by the characters, and further is beneficial to the subsequent improvement of compression performance; the centrality of the characters in each character group is obtained through calculation, edge characters are obtained, and then the position of each edge character is obtained through calculation of annular characters, so that the edge characters used as the connection characters of different frequency groups cannot cause great influence on character compression between adjacent frequencies, and the compression effect is improved to the greatest extent; the character sequence obtained through calculation is used as the character sequence in BWT to compress data, so that the data characteristic is fully utilized, the compression effect can be greatly improved, and the efficient transmission of data is realized.

Drawings

To describe the embodiments of the application or the prior art more clearly, the drawings needed for that description are briefly introduced below. The drawings described below show some embodiments of the application, and a person skilled in the art can obtain other drawings from them without inventive effort.

FIG. 1 is a schematic flow chart of a method for transmitting and processing data messages of an IO-Link master station according to an embodiment of the present application;

FIG. 2 is a schematic diagram of an HTTP request message according to an embodiment of the present application;

FIG. 3 is a schematic diagram of a matching chain obtained according to a matching relationship according to an embodiment of the present application;

FIG. 4 is a schematic diagram of annular characters according to an embodiment of the present application;

FIG. 5 is a schematic diagram of a transmission process of an IO-Link master station data packet according to an embodiment of the present application;

FIG. 6 is a schematic structural diagram of an IO-Link master station data packet transmission processing system according to an embodiment of the present application.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. The components of the embodiments of the present application generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.

Thus, the following detailed description of the embodiments of the application, as presented in the figures, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.

It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.

Some embodiments of the present application are described in detail below with reference to the accompanying drawings. The following embodiments and features of the embodiments may be combined with each other without conflict.

BWT coding is a permutation-based data rearrangement method that places similar characters together. Similar characters here are character sequences that "overlap match", i.e., characters or character strings that repeat many times. After sorting, these characters or strings typically appear in adjacent positions, which lets the subsequent compression algorithm better exploit the locality and redundancy of the data and compress more efficiently. However, the compression effect of BWT coding is not controllable.

Based on the above, the application provides a method and a system for transmitting and processing data messages of an IO-Link master station.

As shown in fig. 1, the embodiment of the application provides a method for transmitting and processing data messages of an IO-Link master station, which comprises the following steps:

step 102, collecting message data;

step 104, segmenting the message data according to the data attribute to obtain a plurality of segment data;

step 106, processing each segment of data according to the frequency to obtain a corresponding character group;

step 108, calculating the centrality of each character in each character group to obtain group edge characters;

step 110, calculating the connectivity of each group of edge characters through the annular characters to obtain the linking characters;

step 112, obtaining the character sequence according to the linking characters and the characters in the character group, and performing BWT coding on the segment data according to the character sequence;

step 114, compressing the encoded data of each segment data by LZ77 to obtain and transmit segment compressed data.

In the embodiment of the application, the character frequency processing is used for forming the character group, so that the probability of forming the combined characters by the characters is improved, and the subsequent compression performance is improved; the centrality of the characters in each character group is obtained through calculation, edge characters are obtained, and then the position of each edge character is obtained through calculation of annular characters, so that the edge characters used as the connection characters of different frequency groups cannot cause great influence on character compression between adjacent frequencies, and the compression effect is improved to the greatest extent; the character sequence obtained through calculation is used as the character sequence in BWT to compress data, so that the data characteristic is fully utilized, the compression effect can be greatly improved, and the efficient transmission of data is realized.

In an alternative embodiment of the present application, the step of segmenting the message data according to the data attribute to obtain a plurality of segment data includes: searching keywords in the message data; and putting together data with the same meaning according to the keywords to obtain one piece of data.

In this embodiment, segmentation is performed according to the data attribute of the message data, so that data compression is facilitated.

In an alternative embodiment of the present application, the step of obtaining a piece of data includes: the message data is classified according to the data prefixes, and the message data with the same prefix is used as one type of data and marked as one segment of data.

FIG. 2 shows an example of an HTTP request message. The message data contains characters with multiple attributes, including special characters, uppercase and lowercase English letters, and digits. In this embodiment, similar data are put together by request header; for example, the characters following the GET parameter in the message data are put together, and the characters following the Host parameter are put together. Specifically, data with the same meaning are put together according to the keywords in the message data; that is, the data are classified by prefix, and data with the same prefix form one class. For example, the data following GET in message data 1 and the data following GET in message data 2 are put together as one class of data, which is called segment data. Characters following the same parameter are similar, which facilitates compression.
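The prefix-based segmentation described above can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the function name `segment_by_prefix`, the two sample messages, and the rule that the first space-delimited token is the prefix are all assumptions made for the example.

```python
from collections import defaultdict

def segment_by_prefix(messages):
    """Group the data that follows each keyword across several messages."""
    segments = defaultdict(list)
    for message in messages:
        for line in message.splitlines():
            if not line.strip():
                continue
            # Assumption: the first space-delimited token ("GET", "Host:")
            # is the prefix and the rest of the line is the data.
            prefix, _, rest = line.partition(" ")
            segments[prefix.rstrip(":")].append(rest)
    return dict(segments)

# Hypothetical sample messages, standing in for message data 1 and 2.
msg1 = "GET /index.html HTTP/1.1\nHost: example.com"
msg2 = "GET /status HTTP/1.1\nHost: example.org"
segments = segment_by_prefix([msg1, msg2])
```

Each value in `segments` plays the role of one segment data: everything that followed the same keyword, collected across messages.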

In an alternative embodiment of the present application, the step of processing each segment data according to the frequency to obtain a corresponding character group includes: counting the frequency value of each character in each segment data; performing ascending order sorting according to the frequency value of each character to obtain an ascending order frequency sequence; dividing the ascending frequency sequence through otsu multi-threshold segmentation to obtain segmentation points; the frequency value between adjacent division points is used as a class of frequency value; obtaining all characters corresponding to each category according to one character corresponding to each frequency value in each category; all the characters with similar frequencies corresponding to each category form a character group.

In this embodiment, for each segment data, the occurrences of each character in the segment data are first counted to obtain the frequency value of each character. The frequency values are sorted in ascending order to obtain the ascending frequency sequence, which is divided by Otsu multi-threshold segmentation to obtain segmentation points. The frequency values between adjacent segmentation points form one class, so that frequency values within a class are similar and frequency values in different classes differ considerably. Each frequency value in a class corresponds to a character, which gives all the characters of that class; these characters have similar frequencies and are called a character group. For example, the calculation may yield two character groups, the first being [A B C D] and the second being [E F G].
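A minimal sketch of the frequency grouping, under stated assumptions: the number of groups is passed in as `num_groups` (the text does not say how it is chosen), and the Otsu criterion is applied by exhaustive search over split points of the sorted frequency values, which is only practical for small alphabets. The function names and the sample string are hypothetical.

```python
from collections import Counter
from itertools import combinations

def within_class_sse(values):
    """Sum of squared deviations of `values` from their mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def group_by_frequency(segment, num_groups=2):
    """Split the characters of `segment` into groups of similar frequency."""
    counts = Counter(segment)
    if not counts:
        return []
    # Ascending frequency sequence; ties are broken by character.
    ordered = sorted(counts, key=lambda ch: (counts[ch], ch))
    freqs = [counts[ch] for ch in ordered]
    num_groups = min(num_groups, len(ordered))
    best_cuts, best_cost = (), float("inf")
    # Minimising the within-class squared error is equivalent to the Otsu
    # criterion of maximising the between-class variance.
    for cuts in combinations(range(1, len(ordered)), num_groups - 1):
        bounds = (0, *cuts, len(ordered))
        cost = sum(within_class_sse(freqs[a:b])
                   for a, b in zip(bounds, bounds[1:]))
        if cost < best_cost:
            best_cuts, best_cost = cuts, cost
    bounds = (0, *best_cuts, len(ordered))
    return [ordered[a:b] for a, b in zip(bounds, bounds[1:])]

groups = group_by_frequency("AAAABBBBCCCCDDDDEFG", num_groups=2)
```

With this sample string, the rare characters E, F, G land in one group and the frequent characters A, B, C, D in the other, mirroring the two character groups in the example above.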

BWT puts similar characters together, that is, it places character combinations that occur frequently in the original string in adjacent positions. Consequently, the less often a character occurs next to its left and right neighbors in the original character string, the smaller the influence of using that character as an edge character on the compression of characters with the same frequency. This embodiment therefore gives characters with similar frequencies adjacent positions in the character sequence, which tends to form runs of character combinations and improves the compression effect.

In an alternative embodiment of the present application, the step of calculating the centrality of each character in each character group to obtain group edge characters includes: applying a KM maximum matching principle to each character group, and finding out a character corresponding to the maximum edge value from a left node in KM matching as an initial node; obtaining a matching character of the initial node on the right side according to the matching relation; finding the homonymous node of the right matching character on the left, and obtaining the matching character of the homonymous node on the left on the right according to the matching relation; iterating for a plurality of times until all characters in the character group are traversed to obtain a matching chain; the two characters with the greatest centrality in the matching chain are taken as group edge characters.

In this embodiment, the edge characters of each character group are obtained through KM matching. The left nodes are all the characters in the character group, and the right nodes are likewise all the characters in the character group. Every left node is connected to every right node by an edge, whose value is the number of occurrences of the corresponding character pair in the original character string. The KM maximum matching principle then yields a one-to-one matching relation. First, the character corresponding to the maximum edge value is found among the left nodes, and its matching character on the right is obtained from the matching relation; the node with the same name as that matching character is then found on the left, and its own matching character is obtained. Repeating this yields a matching chain, and the characters with the greatest centrality in the matching chain are used as edge characters. As shown in FIG. 3, the edge AB has the largest value of all edges. Starting from point A among the left nodes, A is matched to B; the left point B is matched to D, the left point D is matched to C, and the left point C is matched to A, which forms the matching chain A-B-D-C-A. Centrality is defined so that the closer the ratio of the numbers of nodes on the left and right sides of a point is to 1, the greater the centrality, and the smaller the corresponding edge value, the greater the centrality; the greater the centrality, the smaller the impact on the compression effect within the character group when the corresponding node is used as an edge character. Edge characters chosen in this way have little influence on the compression effect, so the compression effect is improved.
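The chain construction can be sketched as follows. Note the assumptions: the maximum-weight one-to-one matching is found by brute force over permutations as a stand-in for the KM algorithm (workable only for small groups), edge values are counts of adjacent ordered pairs, and the sample string is made up so that the chain of FIG. 3 results. If the matching splits into several cycles, this sketch follows only the cycle containing the start node.

```python
from collections import Counter
from itertools import permutations

def pair_counts(text):
    """Edge values: how often each ordered character pair is adjacent."""
    return Counter(zip(text, text[1:]))

def best_matching(group, edges):
    """Brute-force stand-in for KM: the left-to-right assignment with the
    largest total edge value."""
    best, best_total = None, -1
    for perm in permutations(group):
        total = sum(edges[(left, right)] for left, right in zip(group, perm))
        if total > best_total:
            best, best_total = dict(zip(group, perm)), total
    return best

def matching_chain(group, edges):
    """Follow the matching from the node with the largest matched edge."""
    match = best_matching(group, edges)
    start = max(group, key=lambda ch: edges[(ch, match[ch])])
    chain, node = [start], match[start]
    while node != start:
        chain.append(node)
        node = match[node]
    chain.append(start)  # close the chain, as in A-B-D-C-A
    return chain

# Hypothetical string in which AB occurs 3 times and BD, DC, CA twice each.
chain = matching_chain("ABCD", pair_counts("ABDCABDCAB"))
```

For this input the matching is A to B, B to D, D to C, C to A, so `chain` follows the same path as the FIG. 3 example.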

In an alternative embodiment of the present application, the step of calculating character centrality includes: obtaining the edge value of the edge corresponding to the character; obtaining the ratio of the numbers of nodes on the two sides of the character node; and obtaining the centrality of the character from these two quantities according to a first preset formula. (The formula and the symbols denoting the edge value, the ratio, and the centrality are not reproduced in this text.)

In this embodiment, the ratio of the numbers of nodes on the two sides of the character node is the ratio of the smaller value to the larger value. The smaller the edge value, the smaller the influence on the compression effect within the character group when the corresponding node is used as an edge character. The greater the ratio, the weaker the association between the corresponding node and the higher-frequency nodes, and the smaller the influence on the compression effect within the character group when it is used as an edge character. In each character group, the two characters with the greatest centrality are used as edge characters, which reduces the influence on the compression effect.
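The two trends stated above can be illustrated with a short sketch. The first preset formula is not available in this text, so `centrality` below is explicitly not that formula: it is a hypothetical expression, ratio / (1 + edge value), chosen only because it grows with the ratio and shrinks with the edge value. The `side_ratio` helper follows the small-to-large ratio described above, applied to a chain written without its closing repeat.

```python
def side_ratio(chain, index):
    """Ratio (smaller / larger) of the node counts on the two sides of
    chain[index]; defined as 1.0 for a single-node chain."""
    left, right = index, len(chain) - 1 - index
    larger = max(left, right)
    return min(left, right) / larger if larger else 1.0

def centrality(edge_value, ratio):
    # HYPOTHETICAL stand-in, not the patent's first preset formula:
    # larger ratio -> larger centrality, larger edge value -> smaller.
    return ratio / (1.0 + edge_value)

chain = ["A", "B", "D", "C"]
ratio_b = side_ratio(chain, 1)   # one node to the left, two to the right
```

Any other expression with the same monotonic behavior would serve equally well for illustration; only the direction of the two dependencies comes from the text.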

In an alternative embodiment of the present application, the step of calculating the connectivity of each group of edge characters through the annular characters to obtain the linking characters includes: according to the order of the original character string, connecting all the characters in sequence in the counterclockwise direction to construct annular characters; for each edge character of each character group in the annular characters, calculating the centrality of the corresponding last edge character when that edge character is used as the first character of the character group, the connectivity being represented by the centrality of the last edge character; and taking the first character of the character group corresponding to the last edge character with the greatest centrality as the actual initial edge character of the character group, and taking the actual initial edge character and its corresponding last edge character as the linking characters.

In this embodiment, all the characters are connected in sequence on a circle according to the order of the original character string. FIG. 4 shows the annular characters of the character string babcaea; the gray circle indicates that the corresponding node b is the starting point of the character string. Rotating in the counterclockwise direction yields all the cyclic-shift character strings. After the annular characters are constructed, the characters on them are classified; as shown in FIG. 4, the frequencies of e, a and b are similar, and the frequencies of c and d are similar. Since BWT takes the last character of each cyclic shift, and for each starting character the last character is its adjacent character in the annular characters, the centrality of the adjacent characters corresponding to each edge character in each character group is calculated, so that the chosen edge character has a smaller influence on the compression effect within the character groups on both sides, which further reduces the influence on the compression effect.
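The relation between the ring and the cyclic shifts can be made concrete with a small sketch. It assumes the ring is stored simply as the original string with wrap-around indexing; the function names are hypothetical.

```python
from collections import Counter, defaultdict

def rotations(text):
    """All cyclic shifts of `text`, one per starting position on the ring."""
    return [text[i:] + text[:i] for i in range(len(text))]

def ring_predecessors(text):
    """For each character, count the characters immediately before it on
    the ring. These are the possible last characters of the cyclic shifts
    that start with that character."""
    neighbours = defaultdict(Counter)
    for i, ch in enumerate(text):
        neighbours[ch][text[i - 1]] += 1  # index -1 wraps around the ring
    return neighbours

shifts = rotations("babcaea")
neighbours = ring_predecessors("babcaea")
```

For babcaea, every shift that starts with b ends with a, while shifts that start with a end with b, c, or e, which is why a character that occurs more than once can have several neighborhood characters.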

In an alternative embodiment of the present application, the step of calculating the centrality of the last edge character in the annular characters includes: obtaining all neighborhood characters corresponding to the edge characters from the annular characters; taking the centrality of each neighborhood character as a basic value, taking the occurrence frequency of each neighborhood character as a weight value, and obtaining the centrality of the last edge character corresponding to the edge character by weighted summation.

In this embodiment, starting from the edge characters of the character group with the largest average probability value, the centrality of the last character corresponding to each edge character, when that edge character is used as the actual first character of its character group, is calculated. Specifically, the neighborhood characters of the edge character are first obtained from the annular characters. Because a character does not necessarily appear only once in the annular characters, each edge character may have several neighborhood characters, and a weighted summation is used: the centrality of each neighborhood character is taken as the base value, its occurrence frequency is taken as the weight, and the centrality corresponding to the edge character is calculated. The edge character corresponding to the minimum centrality is taken as the first character of the group. For example, if the calculation gives e as the initial edge character of the character group [e, a, b], the character sequence within that group is e-a-b. The edge characters of every character group are calculated in the same way, which gives the complete character sequence. For example, the character sequence of the character group [d e] is e-d, and the complete character sequence obtained is e-a-b-d-e, which further improves the compression effect.
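The weighted summation can be written in a few lines. In this sketch the weights are raw occurrence counts (the text says "occurrence frequency" without specifying normalization), and the centrality values are made up purely for illustration.

```python
def tail_centrality(neighbour_counts, centrality_of):
    """Weighted sum: centrality of each neighborhood character (base value)
    times its occurrence count (weight)."""
    return sum(centrality_of[ch] * count
               for ch, count in neighbour_counts.items())

# Neighborhood characters of 'a' on the ring of "babcaea".
neighbours_of_a = {"b": 1, "c": 1, "e": 1}
# Made-up centrality values, for illustration only.
centrality_of = {"b": 0.5, "c": 0.25, "e": 0.125}
score = tail_centrality(neighbours_of_a, centrality_of)
```

Computing this score for each candidate edge character gives the values that are compared when choosing the first character of a group.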

The steps described above are illustrated in FIG. 5. Rather than using dictionary order, the present application defines the character sequence from the characteristics of the data: the character sequence of each segment is calculated, and the characters of the segment are BWT coded according to that character sequence to obtain the coded data of each segment. The linking characters obtained between different frequency groups do not destroy the correlation between characters of similar frequency, so the compression of the characters within each group is not greatly affected and a reduction in compression efficiency is avoided.
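BWT coding with a user-defined character sequence differs from the usual form only in the comparison used to sort the cyclic shifts. The sketch below assumes the coder outputs the last column together with the row index of the original string (the text does not say how the decoder locates the original row), and the order "eabc" is a hypothetical character sequence chosen for the example.

```python
def bwt_encode(text, order):
    """BWT of `text`, sorting cyclic shifts by the custom alphabet `order`.

    Returns the last column and the row index of the unshifted string.
    """
    rank = {ch: i for i, ch in enumerate(order)}
    n = len(text)
    # Sort shift start positions by the rank sequence of each shift.
    starts = sorted(
        range(n),
        key=lambda i: [rank[c] for c in text[i:] + text[:i]],
    )
    # The last character of the shift starting at i is text[i - 1].
    last = "".join(text[i - 1] for i in starts)
    return last, starts.index(0)

last, index = bwt_encode("babcaea", order="eabc")
dict_last, dict_index = bwt_encode("babcaea", order="abce")
```

Passing a dictionary-ordered alphabet such as "abce" gives the conventional BWT, so the two results can be compared directly to see how the character sequence changes the output.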

In an alternative embodiment of the application, the method further comprises: when decompressing data, decompressing the segment compressed data through LZ77 to obtain decompressed data; performing inverse BWT coding on each segment of the decompressed data to obtain the original segment data; and extracting preset data from the original segment data to obtain the original message data.

In this embodiment, the coded data of each segment is compressed by LZ77 to obtain the compressed data of each segment. During decompression, the data is first decompressed through LZ77, then inverse BWT (Burrows-Wheeler transform) coding is applied to each segment to obtain the original segment data, and the relevant data is extracted from each segment data to obtain the original data. Compression and decompression of the IO-Link master station data message are thereby realized, achieving high-speed data transmission.
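The compression and decompression path can be sketched end to end. The LZ77 variant below emits (offset, length, next character) triples with a simple linear search, and the inverse BWT assumes the decoder knows both the character sequence and the row index of the original string; neither detail is specified in the text. The sample last column "acebaab" with index 4 is the BWT of babcaea under the hypothetical order "eabc".

```python
def lz77_compress(data, window=255):
    """Encode `data` as (offset, length, next_char) triples."""
    out, i = [], 0
    while i < len(data):
        best_off, best_len = 0, 0
        for j in range(max(0, i - window), i):
            k = 0
            # A match may overlap position i but must leave one literal.
            while i + k < len(data) - 1 and data[j + k] == data[i + k]:
                k += 1
            if k > best_len:
                best_off, best_len = i - j, k
        out.append((best_off, best_len, data[i + best_len]))
        i += best_len + 1
    return out

def lz77_decompress(tokens):
    """Rebuild the string from (offset, length, next_char) triples."""
    buf = []
    for off, length, ch in tokens:
        for _ in range(length):
            buf.append(buf[-off])  # one at a time, so overlaps work
        buf.append(ch)
    return "".join(buf)

def bwt_decode(last, index, order):
    """Invert a BWT whose shifts were sorted by the alphabet `order`."""
    rank = {ch: i for i, ch in enumerate(order)}
    # A stable sort of positions by rank links each row to its successor.
    perm = sorted(range(len(last)), key=lambda i: rank[last[i]])
    out, row = [], index
    for _ in range(len(last)):
        row = perm[row]
        out.append(last[row])
    return "".join(out)

tokens = lz77_compress("acebaab")
restored = bwt_decode(lz77_decompress(tokens), 4, "eabc")
```

The round trip recovers babcaea, matching the order of operations described above: LZ77 decompression first, then inverse BWT coding per segment.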

FIG. 6 is a schematic structural diagram of an IO-Link master station data packet transmission processing system 60 according to an embodiment of the present application. As shown in FIG. 6, the IO-Link master station data packet high-speed transmission processing system 60 includes:

the data acquisition module 602 is configured to acquire message data;

the sequence calculating module 604 is configured to segment the message data according to the data attribute to obtain a plurality of segment data; process each segment data according to the frequency to obtain a corresponding character group; calculate the centrality of each character in each character group to obtain group edge characters; calculate the connectivity of each group of edge characters through the annular characters to obtain the linking characters; and obtain a character sequence according to the linking characters and the characters in the character group, and perform BWT coding on the segment data according to the character sequence;

the compression transmission module 606 is configured to compress the encoded data of each segment data by LZ77, and obtain and transmit the segment compressed data.

The IO-Link master station data message high-speed transmission processing system 60 provided by the embodiment of the application forms character groups through character frequency processing, so that the probability of forming combined characters by the characters is improved, and further the subsequent compression performance is improved; the centrality of the characters in each character group is obtained through calculation, edge characters are obtained, and then the position of each edge character is obtained through calculation of annular characters, so that the edge characters used as the connection characters of different frequency groups cannot cause great influence on character compression between adjacent frequencies, and the compression effect is improved to the greatest extent; the character sequence obtained through calculation is used as the character sequence in BWT to compress data, so that the data characteristic is fully utilized, the compression effect can be greatly improved, and the efficient transmission of data is realized.

For system embodiments, reference is made to the description of method embodiments for the relevant points, since they essentially correspond to the method embodiments. The system embodiments described above are merely illustrative, in which elements illustrated as separate elements may or may not be physically separate, and elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present application without undue burden.

Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the application.

Claims

1. The IO-Link master station data message high-speed transmission processing method is characterized by comprising the following steps:

collecting message data;

compressing the coded data of each segment data through LZ77 to obtain and transmit segment compressed data;

the step of processing each segment data according to the frequency to obtain a corresponding character group comprises the following steps:

counting the frequency value of each character in each segment data;

all the characters with similar frequencies corresponding to each category form the character group;

the step of calculating character centrality includes:

wherein the result of the first preset formula (not reproduced in this text) represents the centrality of the character;

the step of calculating the centrality of each character in each character group to obtain group edge characters comprises the following steps:

taking the two characters with the greatest centrality in the matching chain as the group of edge characters;

the step of calculating the connectivity of each group of edge characters through the annular characters to obtain the linking characters comprises the following steps:

2. The method of claim 1, wherein the step of segmenting the message data according to data attributes to obtain a plurality of segments of data comprises:

searching keywords in the message data;

3. The method of claim 2, wherein the step of bringing together data of the same meaning based on the keywords to obtain a piece of data comprises:

4. The method of claim 1, wherein the step of calculating the centrality of the last edge character in the annular characters comprises:

5. The method according to any one of claims 1 to 4, further comprising:

6. An IO-Link master station data message high-speed transmission processing system, characterized in that the system comprises:

the data acquisition module is used for acquiring message data;

the compression transmission module is used for compressing the coded data of each segment of data through LZ77 to obtain and transmit the compressed data of the segment;

counting the frequency value of each character in each segment data;

the step of calculating character centrality includes:

wherein the result of the first preset formula (not reproduced in this text) represents the centrality of the character;