CN115334169A - Communication protocol coding method for saving network bandwidth

Info

Publication number: CN115334169A
Application number: CN202210461679.9A
Authority: CN
Inventors: 蒋春风; 肖成虎; 何发
Original assignee: Shenzhen Securities Communication Co ltd
Current assignee: Shenzhen Securities Communication Co ltd
Priority date: 2022-04-28
Filing date: 2022-04-28
Publication date: 2022-11-11
Anticipated expiration: 2042-04-28
Also published as: CN115334169B

Abstract

The invention discloses a communication protocol coding method for saving network bandwidth, comprising an encoding process and a decoding process. The encoding process comprises: encoding data content into a data packet; grouping the fields to be encoded; preprocessing each field according to its data type; arranging the encoded groups of fields into a communication packet; and performing conventional processing. The decoding process comprises: parsing the data packet; parsing the grouped fields; and performing conventional processing. The invention has the following advantages: (1) It saves about 35% of network traffic. (2) It handles the whole data range flexibly, avoiding changes to the communication protocol caused by a small amount of out-of-range data and improving the adaptability of the communication protocol and software. (3) Compared with compression coding, it adds no significant CPU overhead and is therefore more practical.

Description

Communication protocol coding method for saving network bandwidth

Technical Field

The invention relates to a communication protocol coding method for saving network bandwidth, belonging to the technical field of software design and development in the field of network communication.

Background

In network communication, data must be encoded in a certain format before it is sent over the network; after the receiving side receives the data, it decodes the data by reversing the encoding, thereby recovering the sender's original data.

When data encoding is performed, each field needs to be arranged one by one, and common encoding methods for communication protocols include:

(1) Encoding in a fixed-length text mode;

(2) Variable length text mode encoding with separators;

(3) Binary coding;

(4) XML format, etc.;

(5) And transmitting after compression.

Different encoding methods are generally chosen for different situations, and each has its own advantages and disadvantages. For example, binary encoding saves network bandwidth compared with text encoding but is difficult for humans to read; variable-length encoding saves bandwidth compared with fixed-length encoding but is more complex to process in a program. XML is highly human-readable but is almost the most expensive in terms of network bandwidth, and parsing it typically relies on third-party libraries.

To save network bandwidth, data is sometimes compressed before transmission. Tests have shown that this approach is not feasible for real-time communication: compressing the data consumes excessive CPU resources, so the cost outweighs the benefit. Compression is therefore almost never used in real-time communication applications and is used only in non-real-time communication.

Disclosure of Invention

The present invention is directed to a communication protocol encoding method for saving network bandwidth, so as to solve the problems in the background art mentioned above.

In order to achieve the purpose, the invention provides the following technical scheme: a communication protocol encoding method for saving network bandwidth comprises an encoding process and a decoding process, wherein the encoding process comprises steps 1 to 5, and the decoding process comprises steps 6 to 8:

step 1, encoding data content into a data packet;

step 2, grouping the fields to be encoded;

step 3, preprocessing each field according to its data type;

step 4, arranging the encoded groups of fields into a communication packet;

step 5, performing conventional processing;

step 6, parsing the data packet;

step 7, parsing the grouped fields;

step 8, performing conventional processing.

Preferably, step 2 groups the fields to be encoded four at a time; if fewer than four fields remain at the end, they form the final group.

Preferably, the preprocessing of step 3 includes, but is not limited to, preprocessing of signed integer, unsigned integer, string, and floating-point types.

Preferably, the conventional processing of step 5 includes, but is not limited to, adding a check field at the packet tail.

Preferably, the conventional processing of step 8 includes, but is not limited to, parsing the check field.

Compared with the prior art, the invention has the beneficial effects that:

(1) Saving network traffic by about 35%.

Suppose that 70% of the content of a packet consists of integers and the rest consists of floating-point numbers or strings. In most cases, especially in the financial industry, the integers are small and the strings are short: an integer can usually be represented in 1 or 2 bytes, whereas programs commonly use 4 or 8 bytes, so more than half of the bytes are wasted. Assuming that the integers waste 50% of their space (the present invention does add one byte before each group, but this byte is negligible compared with the large number of integers that each waste more than half their length), the overall waste rate is 70% × 50% = 35%. That is, in most cases the present invention can reduce the packet size, and hence the network traffic or bandwidth consumption, by about 35%.

For example, market prices and declared quantities in the financial industry generally do not exceed 65535 once the decimal point is removed, but the possibility of exceeding 65535 cannot be ruled out entirely. For instance, 95% of stock prices do not exceed 655 yuan, but Guizhou Maotai exceeds 1000 yuan. If every stock price is represented by 4 bytes in network transmission, a great deal of space is wasted, and the coding method of this communication protocol can save a large amount of bandwidth.
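As a rough illustration of the estimate above, the following Python sketch compares a fixed 4-byte layout with a grouped variable-width layout for a list of unsigned values. The function names and the sample values are hypothetical and chosen only for illustration; the actual saving depends entirely on the data being sent.

```python
def min_width(value: int) -> int:
    """Smallest of 1, 2, 4 or 8 bytes that can hold an unsigned value."""
    if value < 0:
        raise ValueError("this sketch handles unsigned values only")
    for width in (1, 2, 4, 8):
        if value < (1 << (8 * width)):
            return width
    raise ValueError("value does not fit in 8 bytes")


def fixed_size(values, width=4):
    """Bytes needed when every field uses the same fixed width."""
    return len(values) * width


def grouped_size(values):
    """Bytes needed with one information byte per group of four fields."""
    groups = (len(values) + 3) // 4
    return groups + sum(min_width(v) for v in values)


# Hypothetical prices with the decimal point removed:
# most fit in 2 bytes, one (180000) needs 4 bytes.
prices = [1234, 560, 42, 9999, 25000, 180000, 77, 3050]
print(fixed_size(prices), grouped_size(prices))  # prints "32 18"
```

In this made-up sample the out-of-range value simply takes a wider slot, while the remaining fields stay at 1 or 2 bytes.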

(2) Compared with fixed-length field encoding, the coding method handles the integer data range flexibly, avoids changes to the communication protocol caused by a small amount of out-of-range data, and improves the adaptability of the communication protocol and software.

A traditional fixed-length field cannot extend the length of its data. If a value ever exceeds the expected data range, the communication protocol becomes invalid: the protocol must be redesigned and both communicating parties must redevelop their software, at considerable cost. The present coding method treats integer data uniformly regardless of how many bytes it occupies, which avoids the protocol redesign and software rewriting otherwise caused by an expanding data range.
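The difference can be sketched in Python as follows. This is only an illustration under assumed names (`pack_fixed_u16`, `pack_adaptive` and `fits_fixed_u16` are hypothetical): a fixed two-byte field rejects a value above 65535, whereas an adaptive width simply grows. In the actual method the chosen width would also have to be recorded for the receiver, which is the role of the per-group information byte described later.

```python
import struct


def pack_fixed_u16(value: int) -> bytes:
    """Fixed-length layout: always two bytes, big-endian."""
    return struct.pack(">H", value)


def pack_adaptive(value: int) -> bytes:
    """Adaptive layout: narrowest of 1, 2, 4 or 8 bytes, big-endian."""
    for width in (1, 2, 4, 8):
        if 0 <= value < (1 << (8 * width)):
            return value.to_bytes(width, "big")
    raise ValueError("value is negative or does not fit in 8 bytes")


def fits_fixed_u16(value: int) -> bool:
    """Report whether the fixed two-byte field can carry the value."""
    try:
        pack_fixed_u16(value)
        return True
    except struct.error:
        return False
```

With these definitions, `fits_fixed_u16(70000)` is false, while `pack_adaptive(70000)` yields a four-byte encoding without any change to the field definition.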

(3) Compared with compression coding, the coding method adds no significant CPU overhead and is therefore more practical. Compression coding is essentially impractical in real-time communication.

Drawings

Fig. 1 is a schematic diagram of packet information according to the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are clearly and completely described below, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The invention provides the following technical scheme: encoding is performed in binary. The communication protocol header is not considered here; it is generally short and can simply be encoded in a conventional manner. Only the encoding of the individual fields is considered.

1. The specific encoding process is as follows:

Encoding means packing data into a data stream before it is sent, and finally sending the data stream over the network.

Step 1: contents such as the protocol header are encoded into the data packet.

Step 2: the fields to be encoded are grouped four at a time; if fewer than four fields remain at the end, they form the final group.

Step 3: each field is preprocessed according to its data type as follows:

(1) Signed integer (whether a 1-, 2-, 4- or 8-byte integer):

(1) if the value is in the range [-128, 127], i.e. it can be represented in 1 byte, it is cast to a one-byte signed integer;

(2) otherwise, if the value is in the range [-32768, 32767], i.e. it can be represented in 2 bytes, it is cast to a two-byte signed integer;

(3) otherwise, if the value is in the range [-2147483648, 2147483647], i.e. it can be represented in 4 bytes, it is cast to a four-byte signed integer;

(4) in all other cases (an int64 outside the above ranges), no conversion is performed.

(2) Unsigned integer (whether a 1-, 2-, 4- or 8-byte integer):

(1) if the value is in the range [0, 0xFF], i.e. it can be represented in 1 byte, it is cast to a one-byte unsigned integer;

(2) if the value is in the range [0x100, 0xFFFF], i.e. it can be represented in 2 bytes, it is cast to a two-byte unsigned integer;

(3) if the value is in the range [0x10000, 0xFFFFFFFF], i.e. it can be represented in 4 bytes, it is cast to a four-byte unsigned integer;

(4) in all other cases (a uint64 greater than 0xFFFFFFFF), no conversion is performed.

(3) String type: a string is laid out as length + string content. The length is stored in the same way as the unsigned integers above.

(4) Floating-point type (both single-precision float and double-precision double): the binary data is used directly without conversion.
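The four preprocessing rules of this step can be sketched in Python as below. This is a minimal sketch rather than the patented implementation: the function names are hypothetical, the big-endian byte order is taken from the field-writing rules of the following step, and the UTF-8 encoding and byte-count length for strings are assumptions, since the text does not fix a character encoding.

```python
import struct


def narrow_signed(value: int) -> bytes:
    """Cast a signed integer to the narrowest of 1, 2, 4 or 8 bytes."""
    if -128 <= value <= 127:
        width = 1
    elif -32768 <= value <= 32767:
        width = 2
    elif -2147483648 <= value <= 2147483647:
        width = 4
    else:
        width = 8  # int64 outside the ranges above: kept at 8 bytes
    return value.to_bytes(width, "big", signed=True)


def narrow_unsigned(value: int) -> bytes:
    """Cast an unsigned integer to the narrowest of 1, 2, 4 or 8 bytes."""
    if value <= 0xFF:
        width = 1
    elif value <= 0xFFFF:
        width = 2
    elif value <= 0xFFFFFFFF:
        width = 4
    else:
        width = 8  # uint64 greater than 0xFFFFFFFF: kept at 8 bytes
    # to_bytes raises OverflowError for negative or oversized values.
    return value.to_bytes(width, "big")


def prepare_string(text: str, encoding: str = "utf-8") -> bytes:
    """Length (narrowed like an unsigned integer) followed by the content."""
    content = text.encode(encoding)  # assumed encoding, see lead-in
    return narrow_unsigned(len(content)) + content


def prepare_float(value: float, double: bool = False) -> bytes:
    """Floating-point values keep their 4- or 8-byte binary form."""
    return struct.pack(">d" if double else ">f", value)
```

For instance, `narrow_signed(128)` needs two bytes because 128 falls outside [-128, 127], and `prepare_string("abc")` produces a one-byte length followed by three content bytes.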

Step 4: the fields, in groups of four, are encoded and arranged into a communication packet. The specific flow is as follows:

(1) First, a one-byte binary integer, called the packet information, is written, as shown in Fig. 1. (A one-byte integer is identical in big-endian and little-endian order, so no byte-order conversion is needed.) The 8 bits of this byte are divided equally into four parts, and each 2 bits represent the length of the value in the corresponding field of the group that follows. That is:

Bits 1 to 2 indicate the length of the value in the 1st field of the following group.

Bits 3 to 4 indicate the length of the value in the 2nd field of the following group.

Bits 5 to 6 indicate the length of the value in the 3rd field of the following group.

Bits 7 to 8 indicate the length of the value in the 4th field of the following group.

Each 2-bit part can take only the four values 0, 1, 2 and 3, with the following meanings:

0: the value in the corresponding field is 1 byte long.

1: the value in the corresponding field is 2 bytes long.

2: the value in the corresponding field is 4 bytes long.

3: the value in the corresponding field is 8 bytes long.

For the last group, if there are fewer than 4 fields, the unused positions may be filled with 0.
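Building the packet information byte can be sketched as follows. The text does not say whether "bits 1 to 2" are the least or the most significant bits, and Fig. 1 is not reproduced here, so this sketch assumes that the 1st field occupies the two least significant bits. The names `WIDTH_TO_CODE` and `make_group_info` are hypothetical.

```python
# Map a value length in bytes to its 2-bit code (0, 1, 2, 3).
WIDTH_TO_CODE = {1: 0, 2: 1, 4: 2, 8: 3}


def make_group_info(widths):
    """Pack the lengths of up to four fields into one information byte.

    Assumption: field 1 uses the two least significant bits.
    Positions of missing fields in the last group are left as 0.
    """
    if not 1 <= len(widths) <= 4:
        raise ValueError("a group holds between one and four fields")
    info = 0
    for index, width in enumerate(widths):
        if width not in WIDTH_TO_CODE:
            raise ValueError("field length must be 1, 2, 4 or 8 bytes")
        info |= WIDTH_TO_CODE[width] << (2 * index)
    return info
```

Under this bit-order assumption, a group whose fields are 1, 2, 4 and 8 bytes long yields the byte 0xE4.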

(2) Next, the four fields of the group are written in sequence. Each field has the following format:

(1) If it is a string type, the length preprocessed as described above (in big-endian representation) and the string content are written in sequence.

(2) If it is a floating-point type (single-precision float or double-precision double), its binary content is written directly in big-endian order.

(3) If it is an integer type, signed or unsigned, the integer preprocessed as described above is written in big-endian order.
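Putting steps 2 to 4 together, a group encoder might look like the sketch below. Several points are assumptions rather than statements of the patent: fields are passed as hypothetical `(kind, value)` pairs, the 1st field of a group uses the two least significant bits of the information byte, strings are encoded as UTF-8 with the 2-bit code recording the width of the length prefix, and floating-point fields record their natural 4- or 8-byte width.

```python
import struct

WIDTH_TO_CODE = {1: 0, 2: 1, 4: 2, 8: 3}


def _narrow(value: int, signed: bool) -> bytes:
    """Serialize an integer big-endian in the narrowest of 1, 2, 4, 8 bytes."""
    for width in (1, 2, 4, 8):
        try:
            return value.to_bytes(width, "big", signed=signed)
        except OverflowError:
            continue
    raise ValueError("integer does not fit in 8 bytes")


def encode_field(kind: str, value):
    """Return (2-bit length code, payload bytes) for one field."""
    if kind == "int":
        payload = _narrow(value, signed=True)
        return WIDTH_TO_CODE[len(payload)], payload
    if kind == "uint":
        payload = _narrow(value, signed=False)
        return WIDTH_TO_CODE[len(payload)], payload
    if kind == "str":
        content = value.encode("utf-8")  # assumed character encoding
        prefix = _narrow(len(content), signed=False)
        return WIDTH_TO_CODE[len(prefix)], prefix + content
    if kind == "f32":
        return WIDTH_TO_CODE[4], struct.pack(">f", value)
    if kind == "f64":
        return WIDTH_TO_CODE[8], struct.pack(">d", value)
    raise ValueError(f"unknown field kind: {kind}")


def encode_fields(fields) -> bytes:
    """Encode (kind, value) pairs in groups of four, one info byte per group."""
    out = bytearray()
    for start in range(0, len(fields), 4):
        info = 0
        body = bytearray()
        for index, (kind, value) in enumerate(fields[start:start + 4]):
            code, payload = encode_field(kind, value)
            info |= code << (2 * index)  # unused positions stay 0
            body += payload
        out.append(info)
        out += body
    return bytes(out)


fields = [("uint", 5), ("int", -300), ("str", "ab"), ("f32", 1.0), ("uint", 70000)]
packet = encode_fields(fields)
print(packet.hex())  # prints "8405fed40261623f8000000200011170"
```

The five sample fields occupy 16 bytes here: two information bytes plus 14 bytes of field data. The protocol header and the trailing check field are left out, since the text treats them as conventional processing.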

Step 5: other conventional processing is performed, such as adding a check field at the end of the packet. Because this is conventional processing and outside the scope of the present invention, it is not described in detail.

2. The specific decoding process is as follows:

Decoding means unpacking the data received from the network to recover the original information.

Step 1: the packet header and other such contents are parsed. Since they are not part of the present invention, they are not described in detail here.

Step 2: each four-field group is parsed in a loop, as follows:

(1) First, the one-byte packet information is parsed:

Each 2-bit part can take only the four values 0, 1, 2 and 3, with the following meanings:

0: the value in the corresponding field is 1 byte long.

1: the value in the corresponding field is 2 bytes long.

2: the value in the corresponding field is 4 bytes long.

3: the value in the corresponding field is 8 bytes long.

For the last group, if there are fewer than 4 fields, the remainder can be ignored.
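Parsing the packet information byte can be sketched as follows. As on the encoding side, the position of the 1st field within the byte is an assumption (the two least significant bits), and the names `CODE_TO_WIDTH` and `parse_group_info` are hypothetical.

```python
# Index is the 2-bit code, value is the field length in bytes.
CODE_TO_WIDTH = (1, 2, 4, 8)


def parse_group_info(info: int, field_count: int = 4):
    """Return the value lengths of the fields in one group.

    Assumption: field 1 uses the two least significant bits.
    For a short last group, pass the real field_count so that the
    unused positions are ignored.
    """
    if not 0 <= info <= 0xFF:
        raise ValueError("packet information must be a single byte")
    if not 1 <= field_count <= 4:
        raise ValueError("a group holds between one and four fields")
    return [CODE_TO_WIDTH[(info >> (2 * i)) & 0b11] for i in range(field_count)]
```

For example, under this assumption the byte 0xE4 describes four fields of 1, 2, 4 and 8 bytes.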

(2) The 1st, 2nd, 3rd and 4th fields of the group are parsed in sequence.

The data type of each field is determined by prior agreement between the two parties. Care should be taken to convert values to the native byte order. The local integer's storage length must be greater than or equal to the received length; otherwise an error will occur.

If the last group has fewer than four fields, the remainder is ignored.
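The decoding loop of step 2 can be sketched in Python as below. The list of field kinds stands in for the agreement between the two parties and is a hypothetical representation. The sketch further assumes that the 1st field of a group uses the two least significant bits of the information byte, that strings are UTF-8, and that for a string the 2-bit code gives the width of its length prefix. Python integers are unbounded, so the requirement that local storage be at least as wide as the received value is met automatically here.

```python
import struct

CODE_TO_WIDTH = (1, 2, 4, 8)


def decode_fields(data: bytes, kinds):
    """Decode grouped fields; returns (values, number of bytes consumed)."""
    values = []
    pos = 0
    for start in range(0, len(kinds), 4):
        info = data[pos]  # one information byte per group
        pos += 1
        for index, kind in enumerate(kinds[start:start + 4]):
            width = CODE_TO_WIDTH[(info >> (2 * index)) & 0b11]
            raw = data[pos:pos + width]
            if len(raw) != width:
                raise ValueError("truncated packet")
            pos += width
            if kind == "int":
                values.append(int.from_bytes(raw, "big", signed=True))
            elif kind == "uint":
                values.append(int.from_bytes(raw, "big", signed=False))
            elif kind == "str":
                length = int.from_bytes(raw, "big")
                content = data[pos:pos + length]
                if len(content) != length:
                    raise ValueError("truncated string")
                values.append(content.decode("utf-8"))
                pos += length
            elif kind == "f32" and width == 4:
                values.append(struct.unpack(">f", raw)[0])
            elif kind == "f64" and width == 8:
                values.append(struct.unpack(">d", raw)[0])
            else:
                raise ValueError(f"unexpected kind or length: {kind}, {width}")
    return values, pos


data = bytes.fromhex("8405fed40261623f8000000200011170")
kinds = ["uint", "int", "str", "f32", "uint"]
values, used = decode_fields(data, kinds)
print(values, used)
```

Because the loop iterates over the agreed field kinds, the unused positions in the information byte of a short last group are never read.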

Step 3: other conventional processing is performed, such as parsing the check field at the packet tail.

The invention has practical value in real-time communication applications: it reduces network traffic, saving about 35% of network bandwidth compared with ordinary binary encoding without any loss of information, and at the same time improves the adaptability of the protocol to various data ranges.

Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that various changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims

1. A communication protocol encoding method for saving network bandwidth is characterized in that the method comprises an encoding process and a decoding process, wherein the encoding process comprises steps 1 to 5, and the decoding process comprises steps 6 to 8:

step 1, encoding data content into a data packet;

step 2, grouping the fields to be encoded;

step 3, preprocessing each field according to its data type;

step 4, arranging the encoded groups of fields into a communication packet;

step 5, performing conventional processing;

step 6, parsing the data packet;

step 7, parsing the grouped fields;

step 8, performing conventional processing.

2. The network bandwidth saving communication protocol encoding method of claim 1, wherein: step 2 groups the fields to be encoded four at a time, and if fewer than four fields remain at the end, they form the final group.

3. The network bandwidth saving communication protocol encoding method of claim 1, wherein: the preprocessing of step 3 includes, but is not limited to, preprocessing of signed integer, unsigned integer, string, and floating-point types.

4. The network bandwidth saving communication protocol encoding method of claim 1, wherein: the conventional processing of step 5 includes, but is not limited to, adding a check field of the packet tail.

5. The network bandwidth saving communication protocol encoding method of claim 1, wherein: the conventional processing of step 8 includes, but is not limited to, parsing the check field.