CN117060930B - Data intelligent communication system for docking station - Google Patents

Data intelligent communication system for docking station

Info

Publication number
CN117060930B
CN117060930B (application CN202311315925.0A)
Authority
CN
China
Prior art keywords
character
data
characters
compressed
probability
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311315925.0A
Other languages
Chinese (zh)
Other versions
CN117060930A (en)
Inventor
肖杰
罗勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Zhiying Technology Co ltd
Original Assignee
Guangdong Zhiying Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Zhiying Technology Co ltd filed Critical Guangdong Zhiying Technology Co ltd
Priority to CN202311315925.0A priority Critical patent/CN117060930B/en
Publication of CN117060930A publication Critical patent/CN117060930A/en
Application granted
Publication of CN117060930B publication Critical patent/CN117060930B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H03ELECTRONIC CIRCUITRY
    • H03MCODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/40Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00Computing arrangements based on specific mathematical models
    • G06N7/01Probabilistic graphical models, e.g. probabilistic networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00Reducing energy consumption in communication networks
    • Y02D30/70Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Optimization (AREA)
  • Computational Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Algebra (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention relates to the field of data transmission, and in particular to a data intelligent communication system for a docking station. The system comprises a data preprocessing module, a data mining module, a data compression module and a data transmission communication module. Character pairs are constructed from the characters appearing in the data to be compressed; a plurality of intervals is set and the probability of each character pair at each interval is obtained; a canonical Huffman tree is constructed and the characters of the data to be compressed are encoded in turn. After each character is encoded, several reference characters are obtained, the next-position probability of every character is predicted from the interval probabilities of the character pairs to which the reference characters belong, and the canonical Huffman tree is updated according to these probabilities. A compression result is obtained and transmitted. The invention achieves high data compression efficiency and ensures the real-time performance of docking-station data communication.

Description

Data intelligent communication system for docking station
Technical Field
The invention relates to the field of data transmission, in particular to a data intelligent communication system for a docking station.
Background
A docking station is an external device whose working principle is to convert a computer's interface into interfaces capable of connecting more devices, providing additional ports and functions for the user. Devices connected to a docking station must share limited bandwidth and system resources, which can affect the speed and stability of data transmission when many devices are connected to the docking station.
To guarantee the transmission speed, the docking station needs to compress data before transmitting it. Traditional Huffman coding is limited by the frequency of each character in the data, so its compression efficiency is limited and it is difficult to guarantee the docking station's data transmission speed.
Disclosure of Invention
In order to solve the above problems, the present invention provides a data intelligent communication system for a docking station, the system comprising:
the data preprocessing module is used for acquiring data to be compressed;
the data mining module, which constructs character pairs from the characters appearing in the data to be compressed, sets a plurality of intervals, and obtains the probability of each character pair at each interval from the pair's occurrence counts in the data to be compressed;
the data compression module, which constructs a canonical Huffman tree from the frequency of each character in the data to be compressed and encodes the characters of the data to be compressed in turn according to the canonical Huffman tree, updating the tree after each character is encoded, as follows: several reference characters are obtained from the current encoded character and the previously encoded characters; the next-position probability of each character is predicted from the interval probabilities of the character pairs to which the reference characters belong; and the canonical Huffman tree is updated according to the next-position probability of each character;
obtaining a compression result according to the coding results of all characters in the data to be compressed;
and the data transmission communication module is used for transmitting the compression result.
Preferably, the step of constructing a character pair according to each character appearing in the data to be compressed includes the steps of:
and counting the types of characters in the data to be compressed, forming character pairs by any two characters, and obtaining all the character pairs.
Preferably, the step of setting a plurality of intervals includes:
the interval range is preset, and each integer in the interval range is taken as an interval.
Preferably, obtaining the probability of each character pair at different intervals from the occurrence counts in the data to be compressed comprises the steps of:

$$P_{i,d} = \frac{N_{i,d}}{f_{i,1}} \cdot \frac{f_{i,2}}{L}$$

wherein \(P_{i,d}\) denotes the probability of the \(i\)-th character pair at the \(d\)-th interval; \(f_{i,1}\) and \(f_{i,2}\) denote the frequencies of the first and second characters of the \(i\)-th character pair in the data to be compressed; \(N_{i,d}\) denotes the number of times, in the data to be compressed, that the first character of the \(i\)-th pair is followed at interval \(d\) by the second character; and \(L\) is the length of the data to be compressed.
Preferably, obtaining a plurality of reference characters from the current encoded character and the previously encoded characters comprises the steps of:
taking the most recently encoded character as the current encoded character; obtaining the encoded sequence from the current encoded character; and taking the last \(D+1\) characters of the encoded sequence as the reference characters, where \(D\) is the upper limit of the interval range.
Preferably, the step of obtaining the coded sequence according to the current coding character includes the steps of:
all characters preceding the current encoded character and the current encoded character are formed into a coded sequence.
Preferably, predicting the next-position probability of each character from the interval probabilities of the character pairs to which the reference characters belong comprises the steps of:

$$q_j = \operatorname{softmax}\!\left( \frac{f_j - g_j}{L} \sum_{d=0}^{D} P_{(r_d,\,j),\,d} \right)$$

wherein \(q_j\) is the next-position probability of the \(j\)-th character; \(f_j\) is the frequency of the \(j\)-th character in the data to be compressed; \(g_j\) is the number of occurrences of the \(j\)-th character in the encoded sequence; \(L\) is the length of the data to be compressed; \(D\) is the upper limit of the interval range; \(r_d\) is the reference character located \(d+1\) positions from the end of the encoded sequence, so that the next position to be encoded lies at interval \(d\) from \(r_d\); \(P_{(r_d,j),d}\) is the probability, at the \(d\)-th interval, of the character pair whose first character is \(r_d\) and whose second character is the \(j\)-th character; and \(\operatorname{softmax}\) is the exponential normalization function taken over all characters.
Preferably, updating the canonical Huffman tree according to the next-position probability of each character comprises the steps of:
assigning the characters, in descending order of next-position probability, to the leaf nodes of the current canonical Huffman tree, thereby updating the canonical Huffman tree.
Preferably, obtaining the compression result from the encoding results of all characters in the data to be compressed comprises the steps of:
splicing the codewords of all characters in the data to be compressed, in order, into a binary sequence, and taking the binary sequence as the compression result.
Preferably, the character pairs may comprise two identical characters.
The invention has the following beneficial effects: character pairs are constructed from the characters appearing in the data to be compressed; a plurality of intervals is set and the probability of each character pair at each interval is obtained; a canonical Huffman tree is constructed and the characters of the data to be compressed are encoded in turn. After each character is encoded, several reference characters are obtained, the next-position probability of every character is predicted from the interval probabilities of the character pairs to which the reference characters belong, and the canonical Huffman tree is updated accordingly. In this way a character that is likely to appear next is no longer limited by its overall frequency and can be encoded with a codeword that is as short as possible, which improves the compression efficiency of the data and ensures the real-time performance of docking-station data communication.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions and advantages of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are only some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
Fig. 1 is a system block diagram of a data intelligent communication system for a docking station according to an embodiment of the present invention.
Detailed Description
To further describe the technical means adopted by the invention and the effects achieved, the following is a detailed description of the specific implementation, structure, features and effects of the data intelligent communication system for a docking station according to the invention, with reference to the accompanying drawings and the preferred embodiments. In the following description, different instances of "one embodiment" or "another embodiment" do not necessarily refer to the same embodiment. Furthermore, the particular features, structures or characteristics of one or more embodiments may be combined in any suitable manner.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The following specifically describes a specific scheme of the data intelligent communication system for the docking station provided by the invention with reference to the accompanying drawings.
Referring to fig. 1, an intelligent data communication system for a docking station according to an embodiment of the present invention is shown, and the system includes the following modules:
the data preprocessing module 101 is configured to obtain data to be compressed.
When the computer transmits data to the equipment connected to the docking station through the docking station, the docking station compresses the data, and the data needing to be compressed are recorded as data to be compressed.
So far, the data to be compressed is obtained.
The data mining module 102 is configured to construct character pairs and obtain probabilities of the character pairs at different intervals.
Huffman coding compresses data according to character frequency: characters with high frequency are encoded with shorter codewords and characters with low frequency with longer codewords. However, during ordinary Huffman coding the frequency of each character remains fixed, so the compression efficiency on the data to be compressed is limited and it is difficult to guarantee the docking station's data transmission speed. The embodiment of the invention therefore predicts, during encoding, the probability of each character appearing next from the characters already encoded, and updates the Huffman tree according to this probability, so that even characters with low overall frequency can be compressed with shorter codewords. To predict the probability of each character appearing next from the encoded characters, the occurrence patterns of characters in the data to be compressed must first be mined.
In the embodiment of the invention, an interval range \([0, D]\) is preset; the upper limit \(D\) can be set by the operator according to the actual implementation and is not limited here. Each integer in \([0, D]\) is taken as an interval, where interval 0 means that no other character lies between the two characters, i.e. the two characters are adjacent.
The kinds of characters in the data to be compressed are counted and the number of occurrences of each character is obtained. Every ordered pair of characters is formed into a character pair, giving all character pairs. The two characters of a pair may be identical; for example, the character pairs corresponding to the characters A and B are (A, A), (A, B), (B, A) and (B, B).
For each character pair, the probabilities of the pair at the different intervals are obtained; these represent the probability that the two characters of the pair co-occur at each interval:

$$P_{i,d} = \frac{N_{i,d}}{f_{i,1}} \cdot \frac{f_{i,2}}{L}$$

wherein \(P_{i,d}\) denotes the probability of the \(i\)-th character pair at the \(d\)-th interval; \(f_{i,1}\) and \(f_{i,2}\) denote the frequencies of the first and second characters of the \(i\)-th character pair; \(N_{i,d}\) denotes the number of times, in the data to be compressed, that the first character of the \(i\)-th pair is followed at interval \(d\) by the second character; and \(L\) is the length of the data to be compressed. The term \(N_{i,d}/f_{i,1}\) is the conditional frequency with which the second character of the pair appears at interval \(d\) after the first character: the larger it is, the more often the second character follows the first at that interval. Weighting it by the marginal frequency \(f_{i,2}/L\) of the second character yields the probability that the two characters of the \(i\)-th pair co-occur at interval \(d\): the larger both factors are, the greater this probability.
So far, the probability of each character pair at different intervals is obtained.
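As an illustrative sketch only (not part of the claimed subject matter), the pair-interval statistics described above can be computed as follows. The function name `pair_interval_probabilities` and the closed form \(P_{i,d}=(N_{i,d}/f_{i,1})\cdot(f_{i,2}/L)\) are assumptions reconstructed from the description, not a verbatim transcription of the original formula:

```python
from collections import Counter
from itertools import product

def pair_interval_probabilities(data: str, D: int):
    """Probability of each ordered character pair at each interval 0..D.

    Interval 0 means the two characters are adjacent (d characters lie
    between the pair at interval d).  The probability used here,
    (N / f1) * (f2 / L), is a reconstruction of the patent's formula.
    """
    freq = Counter(data)              # f: occurrences of each character
    L = len(data)
    # N[((a, b), d)]: times character b occurs at interval d after a
    N = Counter()
    for i, a in enumerate(data):
        for d in range(D + 1):
            j = i + d + 1             # interval d -> offset d + 1
            if j < L:
                N[((a, data[j]), d)] += 1
    # Ordered pairs; a pair may consist of two identical characters
    P = {}
    for a, b in product(freq, repeat=2):
        for d in range(D + 1):
            P[((a, b), d)] = N[((a, b), d)] / freq[a] * (freq[b] / L)
    return freq, P
```

For example, for the data "ababa" with interval upper limit D = 1, the pair (a, b) at interval 0 receives probability (2/3)·(2/5) = 4/15.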
The data compression module 103 compresses data to be compressed according to probabilities of each character pair at different intervals.
In the embodiment of the invention, a canonical Huffman tree is constructed from the frequency of each character in the data to be compressed.
Each character in the data to be compressed is encoded according to the canonical Huffman tree, and the tree is updated after each character is encoded, as follows:
the last encoded character is taken as the current encoded character. All characters preceding the current encoded character and the current encoded character are formed into a coded sequence.
The last \(D+1\) characters of the encoded sequence are taken as the reference characters, and the next-position probability of each character, i.e. the probability that it appears immediately after the current encoded character, is predicted from the interval probabilities of the character pairs to which the reference characters belong:

$$q_j = \operatorname{softmax}\!\left( \frac{f_j - g_j}{L} \sum_{d=0}^{D} P_{(r_d,\,j),\,d} \right)$$

wherein \(q_j\) is the next-position probability of the \(j\)-th character; \(f_j\) is the frequency of the \(j\)-th character in the data to be compressed; \(g_j\) is the number of occurrences of the \(j\)-th character in the encoded sequence; \(L\) is the length of the data to be compressed; \(D\) is the upper limit of the interval range; \(r_d\) is the reference character located \(d+1\) positions from the end of the encoded sequence, so that the next position to be encoded lies at interval \(d\) from \(r_d\); \(P_{(r_d,j),d}\) is the probability, at the \(d\)-th interval, of the character pair whose first character is \(r_d\) and whose second character is the \(j\)-th character; and \(\operatorname{softmax}\) is the exponential normalization function taken over all characters. The difference \(f_j-g_j\) is the number of occurrences of the \(j\)-th character that have not yet been encoded: the larger the probabilities of the \(j\)-th character co-occurring with the reference characters at the corresponding intervals, and the more unencoded occurrences of the \(j\)-th character remain, the more likely the next character is the \(j\)-th character and the greater its next-position probability.
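A minimal sketch of this prediction step, under the same reconstruction assumptions (the score \((f_j-g_j)/L\cdot\sum_d P_{(r_d,j),d}\) followed by exponential normalization, and all function and parameter names, are assumptions, not the patent's literal method):

```python
import math

def next_position_probabilities(encoded, freq, counted, P, L, D, alphabet):
    """Predict, for every character j, the probability that it appears
    immediately after the current encoded character.

    encoded  : sequence of already-encoded characters
    freq     : total occurrences of each character in the data
    counted  : occurrences of each character already encoded
    P        : pair-interval probabilities, keyed ((first, second), d)
    """
    # refs[d] is the reference character at interval d from the next position
    refs = encoded[-(D + 1):][::-1]
    scores = {}
    for j in alphabet:
        s = sum(P.get(((r, j), d), 0.0) for d, r in enumerate(refs))
        scores[j] = (freq[j] - counted.get(j, 0)) / L * s
    # exponential normalization (softmax), shifted for numerical stability
    m = max(scores.values())
    exp = {j: math.exp(v - m) for j, v in scores.items()}
    total = sum(exp.values())
    return {j: e / total for j, e in exp.items()}
```

The softmax keeps every next-position probability strictly positive, so even a character that never co-occurs with the reference characters retains a (short) place in the updated tree.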
The characters are assigned, in descending order of next-position probability, to the leaf nodes of the current canonical Huffman tree, thereby updating the canonical Huffman tree.
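The update step above keeps the set of codewords of the canonical Huffman tree fixed and only reassigns characters to its leaves. A sketch under stated assumptions (the helper names are illustrative; the canonical-code construction is the standard one, shortest lengths first with ties broken in a fixed character order):

```python
import heapq
from itertools import count

def huffman_code_lengths(freq):
    """Code length of each character under ordinary Huffman coding."""
    tie = count()                         # tie-breaker so the heap never compares lists
    heap = [(f, next(tie), [c]) for c, f in freq.items()]
    heapq.heapify(heap)
    depth = dict.fromkeys(freq, 0)
    if len(heap) == 1:                    # degenerate single-symbol case
        return {c: 1 for c in freq}
    while len(heap) > 1:
        f1, _, g1 = heapq.heappop(heap)
        f2, _, g2 = heapq.heappop(heap)
        for c in g1 + g2:                 # every merged leaf sinks one level
            depth[c] += 1
        heapq.heappush(heap, (f1 + f2, next(tie), g1 + g2))
    return depth

def canonical_codes(lengths):
    """Canonical Huffman codewords from the code lengths."""
    codes, code, prev = {}, 0, 0
    for c, n in sorted(lengths.items(), key=lambda kv: (kv[1], kv[0])):
        code <<= n - prev                 # widen to the next length
        codes[c] = format(code, "0%db" % n)
        code, prev = code + 1, n
    return codes

def reassign_by_probability(codes, probs):
    """Tree update: shortest codewords go to the characters with the
    highest predicted next-position probability."""
    words = sorted(codes.values(), key=len)
    ranked = sorted(probs, key=probs.get, reverse=True)
    return dict(zip(ranked, words))
```

Because only the character-to-leaf mapping changes, the decoder can mirror the same update after each decoded character and stay in sync with the encoder.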
The next character in the data to be compressed is then encoded with the updated canonical Huffman tree, after which the tree is updated again. This process is repeated until every character in the data to be compressed has been encoded, and the codewords of all characters are spliced, in order, into a binary sequence that is taken as the compression result.
Thus, the compression of the data to be compressed is realized, and the compression result is obtained.
The data transmission communication module 104 transmits the compression result, realizing intelligent data communication.
The docking station transmits the compression result to the receiving device. The receiving device decompresses it; each time a character is decompressed, the canonical Huffman tree is updated once using the method of the data compression module 103, until the whole compression result has been traversed. All decompressed characters, in order, form a one-dimensional sequence that is taken as the decompression result, i.e. the data transmitted by the computer to the receiving device through the docking station.
In summary, the system of the present invention comprises a data preprocessing module, a data mining module, a data compression module and a data transmission communication module. The embodiment of the invention constructs character pairs from the characters appearing in the data to be compressed, sets a plurality of intervals, obtains the probability of each character pair at each interval, constructs a canonical Huffman tree, and encodes the characters of the data to be compressed in turn. After each character is encoded, several reference characters are obtained, the next-position probability of every character is predicted from the interval probabilities of the character pairs to which the reference characters belong, and the canonical Huffman tree is updated accordingly. A character that is likely to appear next is therefore no longer limited by its overall frequency and can be encoded with a codeword that is as short as possible, which improves the compression efficiency of the data and ensures the real-time performance of docking-station data communication.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the invention, but any modifications, equivalent substitutions, improvements, etc. within the principles of the present invention should be included in the scope of the present invention.

Claims (9)

1. A data intelligent communication system for a docking station, the system comprising:
the data preprocessing module is used for acquiring data to be compressed;
the data mining module, which constructs character pairs from the characters appearing in the data to be compressed, sets a plurality of intervals, and obtains the probability of each character pair at each interval from the pair's occurrence counts in the data to be compressed;
the data compression module, which constructs a canonical Huffman tree from the frequency of each character in the data to be compressed and encodes the characters of the data to be compressed in turn according to the canonical Huffman tree, updating the tree after each character is encoded, as follows: several reference characters are obtained from the current encoded character and the previously encoded characters; the next-position probability of each character is predicted from the interval probabilities of the character pairs to which the reference characters belong; and the canonical Huffman tree is updated according to the next-position probability of each character;
obtaining a compression result according to the coding results of all characters in the data to be compressed;
the data transmission communication module is used for transmitting the compression result;
the method for obtaining the probability of the character pairs under different intervals according to the occurrence frequency of each character in the data to be compressed comprises the following steps:
wherein,indicate->The number of character pairs is at->Interval->Probability of down; />Indicate->Frequency of the first character in the pair of characters; />Indicate->The frequency of the first character in the pair of characters; />Is represented in the data to be compressed +.>The first character is separated from the second character in the pair +.>Number of occurrences.
2. The intelligent data communication system for a docking station according to claim 1, wherein the constructing a character pair according to each character appearing in the data to be compressed comprises the steps of:
and counting the types of characters in the data to be compressed, forming character pairs by any two characters, and obtaining all the character pairs.
3. The intelligent data communication system for a docking station according to claim 1, wherein the step of setting a plurality of intervals comprises the steps of:
the interval range is preset, and each integer in the interval range is taken as an interval.
4. The intelligent data communication system for a docking station according to claim 3, wherein acquiring a plurality of reference characters from the current encoded character and the previously encoded characters comprises the steps of:
taking the most recently encoded character as the current encoded character; obtaining the encoded sequence from the current encoded character; and taking the last \(D+1\) characters of the encoded sequence as the reference characters, where \(D\) is the upper limit of the interval range.
5. The intelligent data communication system for a docking station according to claim 4, wherein the step of acquiring the encoded sequence according to the current encoded character comprises the steps of:
all characters preceding the current encoded character and the current encoded character are formed into a coded sequence.
6. The intelligent data communication system for a docking station according to claim 3, wherein predicting the next-position probability of each character from the interval probabilities of the character pairs to which the reference characters belong comprises the steps of:

$$q_j = \operatorname{softmax}\!\left( \frac{f_j - g_j}{L} \sum_{d=0}^{D} P_{(r_d,\,j),\,d} \right)$$

wherein \(q_j\) is the next-position probability of the \(j\)-th character; \(f_j\) is the frequency of the \(j\)-th character in the data to be compressed; \(g_j\) is the number of occurrences of the \(j\)-th character in the encoded sequence; \(L\) is the length of the data to be compressed; \(D\) is the upper limit of the interval range; \(r_d\) is the reference character located \(d+1\) positions from the end of the encoded sequence, so that the next position to be encoded lies at interval \(d\) from \(r_d\); \(P_{(r_d,j),d}\) is the probability, at the \(d\)-th interval, of the character pair whose first character is \(r_d\) and whose second character is the \(j\)-th character; and \(\operatorname{softmax}\) is the exponential normalization function taken over all characters.
7. The intelligent data communication system for a docking station according to claim 1, wherein updating the canonical Huffman tree according to the next-position probability of each character comprises the steps of:
assigning the characters, in descending order of next-position probability, to the leaf nodes of the current canonical Huffman tree, thereby updating the canonical Huffman tree.
8. The intelligent data communication system for a docking station according to claim 1, wherein obtaining the compression result from the encoding results of all characters in the data to be compressed comprises the steps of:
splicing the codewords of all characters in the data to be compressed, in order, into a binary sequence, and taking the binary sequence as the compression result.
9. The intelligent data communication system for a docking station of claim 2, wherein the character pairs comprise two identical characters.
CN202311315925.0A 2023-10-12 2023-10-12 Data intelligent communication system for docking station Active CN117060930B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311315925.0A CN117060930B (en) 2023-10-12 2023-10-12 Data intelligent communication system for docking station


Publications (2)

Publication Number Publication Date
CN117060930A CN117060930A (en) 2023-11-14
CN117060930B true CN117060930B (en) 2024-02-06

Family

ID=88659484

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311315925.0A Active CN117060930B (en) 2023-10-12 2023-10-12 Data intelligent communication system for docking station

Country Status (1)

Country Link
CN (1) CN117060930B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001136075A (en) * 1999-11-05 2001-05-18 Fujitsu Ltd Data compression/decompression device and storage medium with data compression/decompression program recorded thereon
CN106357275A (en) * 2016-08-30 2017-01-25 国网冀北电力有限公司信息通信分公司 Huffman compression method and device
CN116073838A (en) * 2023-03-28 2023-05-05 江苏桐方消防科技集团股份有限公司 Intelligent data management system for fire rescue equipment
CN116681036A (en) * 2023-08-02 2023-09-01 天津轻工职业技术学院 Industrial data storage method based on digital twinning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20190114932A (en) * 2019-09-19 2019-10-10 엘지전자 주식회사 Device and method for suggesting action item

Also Published As

Publication number Publication date
CN117060930A (en) 2023-11-14

Similar Documents

Publication Publication Date Title
US9094038B2 (en) Compressing and decompressing signal data
US10924210B2 (en) Method, apparatus, and device for determining polar code encoding and decoding
Spiegel et al. A comparative experimental study of lossless compression algorithms for enhancing energy efficiency in smart meters
CN111865493A (en) Data processing method and related equipment
CN101534124B (en) Compression algorithm for short natural language
Zeinali et al. Impact of compression and aggregation in wireless networks on smart meter data
EP1266455A1 (en) Method and apparatus for optimized lossless compression using a plurality of coders
CN117060930B (en) Data intelligent communication system for docking station
CN113220651B (en) Method, device, terminal equipment and storage medium for compressing operation data
CN116723251B (en) Intelligent boiler automatic monitoring system based on sensor network
Grinberg et al. Splay trees for data compression
Parekar et al. Lossless data compression algorithm–a review
CN116863949A (en) Communication receiving method and device thereof
WO2022057582A1 (en) Coding method and device
CN109347758A (en) A kind of method of message compression, equipment, system and medium
Baer Optimal prefix codes for infinite alphabets with nonlinear costs
KR101471833B1 (en) Binary data compression and decompression method and apparatus
CN112118074B (en) Communication method and device
CN104113394A (en) Communication modulating signal compressing and decompressing method
CN102841356A (en) Multi-model compressing method for transmitting general aircraft longitude and latitude position data by beidou equipment
KR101549740B1 (en) Binary data compression and decompression method and apparatus
Chow Variable-length redundancy removal coders for differentially coded video telephone signals
Ambadekar et al. Advanced data compression using J-bit Algorithm
Ginzburg et al. Short Message Compression Scheme for Wireless Sensor Networks
CN110611936A (en) Base station, user equipment and method for retransmitting downlink control information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant