CN117459187B - High-speed data transmission method based on optical fiber network - Google Patents

Info

Publication number
CN117459187B
Authority
CN
China
Prior art keywords
data
current data
dictionary
main frequency
length
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311785079.9A
Other languages
Chinese (zh)
Other versions
CN117459187A (en)
Inventor
魏凤
龚任荣
刘晓宇
朱婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Maiwei Digital Tv Equipment Co ltd
Original Assignee
Shenzhen Maiwei Digital Tv Equipment Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Maiwei Digital Tv Equipment Co ltd
Priority to CN202311785079.9A
Publication of CN117459187A
Application granted
Publication of CN117459187B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L1/0001 Systems modifying transmission characteristics according to link quality, e.g. power backoff
    • H04L1/0006 Systems modifying transmission characteristics according to link quality, e.g. power backoff by adapting the transmission format
    • H04L1/0007 Systems modifying transmission characteristics according to link quality, e.g. power backoff by adapting the transmission format by modifying the frame length
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/04 Protocols for data compression, e.g. ROHC
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/50 Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Quality & Reliability (AREA)
  • Computer Security & Cryptography (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention relates to the technical field of data compression, in particular to a high-speed data transmission method based on an optical fiber network, which comprises the following steps: collecting current data and historical data; acquiring the preferred dictionary length of the current data by adopting a permutation entropy algorithm; obtaining the repetition degree of the current data according to the similarity among the data segments of each group of the current data; obtaining a first preferred reference dictionary length according to the difference between the main frequency component curves in the frequency domain diagram of the current data; clustering the current data and each historical data to obtain clusters, and constructing a second preferred reference dictionary length according to the difference between the main frequency components of each historical data in the cluster where the current data is located and the main frequency components of the current data; and obtaining a corrected dictionary length according to the first and second preferred reference dictionary lengths, the preferred dictionary length and the repetition degree, and using it to optimize the compression process of the LZ77 algorithm. The invention increases the data compression rate, reduces the data volume in the transmission process and improves the data transmission speed.

Description

High-speed data transmission method based on optical fiber network
Technical Field
The application relates to the technical field of data compression, in particular to a high-speed data transmission method based on an optical fiber network.
Background
In high-speed data transmission over an optical fiber network, effective data compression directly affects the transmission speed and the bandwidth utilization. A higher data compression rate reduces the volume of data, so that more data can be transmitted within a limited bandwidth. High-quality data compression can therefore improve the data transmission speed.
In the existing LZ77 algorithm, the compression efficiency varies with the dictionary size used in different scenarios, so the optimal dictionary size needs to be obtained adaptively according to the data acquired.
Disclosure of Invention
In order to solve the technical problems, the invention provides a high-speed data transmission method based on an optical fiber network so as to solve the existing problems.
The high-speed data transmission method based on the optical fiber network adopts the following technical scheme:
one embodiment of the present invention provides a high-speed data transmission method based on an optical fiber network, the method comprising the steps of:
collecting current data to be transmitted and historical data in a transmission system;
acquiring the preferred dictionary length of the current data by adopting a permutation entropy algorithm; constructing a time data curve of the current data ordered by time sequence, and dividing the time data curve at different starting points according to the preferred dictionary length to obtain each group of data segments at each starting point; obtaining the repetition degree of the current data according to the similarity between each group of data segments divided from the time data curve at each starting point;
obtaining a frequency domain diagram of the time data curve of the current data by adopting Fourier transformation, classifying curves with different amplitudes in the frequency domain diagram by using the Otsu threshold method, and taking the curves in the class with the largest amplitude sum after classification as main frequency component curves; obtaining a first preferred reference dictionary length of the current data according to the difference between the main frequency value and the period of each main frequency component curve of the current data; clustering the main frequency component curves of the current data and the historical data to obtain clusters, and constructing a second preferred reference dictionary length of the current data according to the difference between the main frequency components of each historical data in the cluster where the current data is located and the main frequency components of the current data;
obtaining a reference factor of the current data to the historical data according to the first preferred reference dictionary length and the preferred dictionary length of each historical data in the cluster where the current data is located; obtaining a third preferred reference dictionary length of the current data according to the reference factor of the current data to the historical data and the first and second preferred reference dictionary lengths; correcting the preferred dictionary length according to the repetition degree of the current data and the third preferred reference dictionary length to obtain a corrected dictionary length; and compressing the current data by taking the corrected dictionary length of the current data as the dictionary length of the LZ77 algorithm.
Preferably, the obtaining the preferred dictionary length of the current data by using the permutation entropy algorithm includes:
setting a preset range for the embedding dimension in the permutation entropy, and calculating the corresponding permutation entropy of the current data under each embedding dimension in the preset range; and taking the embedding dimension corresponding to the minimum permutation entropy as the preferred dictionary length.
Preferably, the dividing the time data curve at different starting points according to the preferred dictionary length to obtain each group of data segments at each starting point includes:
taking the preferred dictionary length as the maximum number of distinct starting points for dividing data segments on the time data curve, and dividing the time data curve from each starting position, from 1 up to that maximum, to obtain the groups of data segments under each starting point.
Preferably, the obtaining the repetition degree of the current data according to the similarity between the groups of data segments divided by the time data curves at each starting point includes:
for each group of data segments divided by the time data curve at each starting point, calculating the DTW distance between the data segment and the adjacent next data segment, and calculating the normalized value of the sum of the DTW distances of all groups of data segments divided by all starting points;
taking the difference result of subtracting the normalized value from 1 as the repetition degree of the current data.
Preferably, the obtaining the first preferred reference dictionary length of the current data according to the difference between the dominant frequency value and the period of each dominant frequency component curve of the current data includes:
for each main frequency component curve of the current data, taking the ratio of the main frequency value of the main frequency component curve to the average value of the main frequency values of all the main frequency component curves as the weight of the main frequency component curve;
and acquiring the period of the main frequency component curve, and taking the sum value of the products of the weights and the period of all the main frequency component curves of the current data as the first optimal reference dictionary length of the current data.
Preferably, the constructing a second preferred reference dictionary length of the current data according to the difference between the dominant frequency component of each history data and the dominant frequency component of the current data in the cluster where the current data is located includes:
for each historical data in the cluster where the current data is located, obtaining similarity according to the difference between the main frequency components of the current data and the historical data;
acquiring a preferred dictionary length of the historical data; calculating the sum value of the similarity of all the historical data in the cluster where the current data is located, and calculating the ratio of the similarity of the historical data to the sum value;
and taking the sum value of the products of the ratio of all the historical data in the cluster where the current data is located and the preferred dictionary length as a second preferred reference dictionary length of the current data.
Preferably, the obtaining the similarity according to the difference between the dominant frequency components of the current data and the historical data includes:
acquiring the minimum minr of the number of main frequency components in the current data and the historical data;
for the front minr main frequency components of the current data and the historical data, respectively taking any one main frequency component to form a main frequency component pair, taking the sum of the normalized Euclidean distances of the amplitude, the frequency and the phase between the main frequency component pair as the difference between the main frequency component pairs, and forming the difference of each main frequency component pair into a distance matrix of the current data and the historical data;
obtaining minr main frequency component matching pairs by adopting a Hungary algorithm on the distance matrix, and calculating the difference average value of the minr main frequency component matching pairs;
and taking the difference result obtained by subtracting the normalized value of the difference mean value from 1 as the similarity of the current data and the historical data.
Preferably, the obtaining the reference factor of the current data to the historical data according to the first preferred reference dictionary length and the preferred dictionary length of each historical data in the cluster where the current data is located includes:
for each historical data in a cluster where the current data is located, calculating the absolute value of the difference value between the first preferential reference dictionary length and the preferential dictionary length of the historical data;
and calculating the normalized value of the sum of the absolute values of the differences of all the historical data, and taking the difference result of subtracting the normalized value from 1 as a reference factor of the current data to the historical data.
Preferably, the obtaining the third preferred reference dictionary length of the current data according to the reference factor of the current data to the historical data and the first and second preferred reference dictionary lengths includes:
calculating the product of the reference factor and the first preferred reference dictionary length as a first product, and calculating the product of the difference result of 1 minus the reference factor and the second preferred reference dictionary length as a second product;
and taking the sum value of the first product and the second product as a third preferred reference dictionary length of the current data.
Preferably, the correcting the preferred dictionary length according to the repetition degree of the current data and the third preferred reference dictionary length to obtain a corrected dictionary length includes:
calculating the product of the repetition degree and the preferred dictionary length as a third product, and calculating the product of the difference result of subtracting the repetition degree from 1 and the third preferred reference dictionary length as a fourth product;
and taking the sum value of the third product and the fourth product as the corrected dictionary length of the current data.
The invention has at least the following beneficial effects:
according to the invention, the preferred dictionary length used by the LZ77 algorithm for data compression is first obtained with the permutation entropy algorithm, which provides the basis for the subsequent optimization based on the current data and the historical data;
then converting the corresponding current data into a data curve, dividing the data curve to obtain the similarity between the data segments on the basis of the optimal dictionary length, and further obtaining the repetition degree of the current data, wherein the similarity is used for analyzing the basic data structure of the current data so as to influence the data compression optimization method, so that the optimal optimization effect is achieved;
the method comprises the steps of obtaining a first optimal reference dictionary length of current data in a frequency domain layer by analyzing a main frequency component in a frequency domain diagram of the current data, analyzing a corresponding dictionary length of historical data compression, obtaining a second optimal reference dictionary length of the current data in a historical compression experience layer, and evaluating the effect degree of an optimization method of the current data from the frequency domain layer and the historical compression experience layer respectively, so that the selection of the optimal dictionary length of the current data is facilitated;
a reference factor is obtained from the historical data and used as a weight that determines how strongly the first and the second preferred reference dictionary lengths each influence the result, which yields the third preferred reference dictionary length of the current data;
finally, by combining the obtained compression optimization method, the optimal dictionary length of the current data obtained by the method is analyzed to obtain the corrected dictionary length, so that the optimal dictionary length of the current data is optimized by combining the third optimal reference dictionary length, the data compression in the LZ77 algorithm is realized, the compression rate of the data compression is greatly increased, the data volume in the data transmission process is reduced, and the data transmission speed is greatly improved.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions and advantages of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are only some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a high-speed data transmission method based on an optical fiber network provided by the invention;
FIG. 2 is a flow chart of the corrected dictionary length acquisition of the current data.
Detailed Description
In order to further describe the technical means and effects adopted by the present invention to achieve the preset purpose, the following detailed description refers to the specific implementation, structure, characteristics and effects of the high-speed data transmission method based on the optical fiber network according to the present invention with reference to the accompanying drawings and the preferred embodiments. In the following description, different "one embodiment" or "another embodiment" means that the embodiments are not necessarily the same. Furthermore, the particular features, structures, or characteristics of one or more embodiments may be combined in any suitable manner.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The following specifically describes a specific scheme of the high-speed data transmission method based on the optical fiber network provided by the invention with reference to the accompanying drawings.
The embodiment of the invention provides a high-speed data transmission method based on an optical fiber network.
Specifically, the following high-speed data transmission method based on the optical fiber network is provided, referring to fig. 1, the method includes the following steps:
step S001, collecting current data to be transmitted and historical data in a transmission system.
The purpose of this embodiment is to compress data to be transmitted at high speed, so that more data can be transmitted under the same conditions, thereby indirectly improving the network data transmission speed.
In order to facilitate high-speed transmission of the current data, this embodiment performs compression optimization on the current data. First, the current data must be acquired; in addition, to guide the compression optimization method of this embodiment, some historical data with good compression results must be collected and analyzed. The current data are the data to be transmitted over the optical fiber network, and each historical data is obtained from the transmission system.
Step S002, analyzing the data distribution of the current data and of each historical data, so as to correct the dictionary window of the current data and optimize the compression process.
The present embodiment uses the existing LZ77 data compression algorithm to compress the obtained data. Because the dictionary length in the existing LZ77 algorithm is fixed, its compression efficiency differs from one data set to another. To improve the compression efficiency for different data, the present embodiment proposes a data compression algorithm that adapts the dictionary length of LZ77.
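To make the role of the dictionary length concrete, the following is a minimal sketch of a textbook LZ77 coder in which the dictionary (search window) length is a parameter. The triple format `(offset, length, next_byte)`, the `lookahead` limit and the function names are assumptions for illustration; the patent does not specify an implementation.

```python
def lz77_compress(data: bytes, window: int, lookahead: int = 15):
    """Greedy LZ77: emit (offset, length, next_byte) triples.

    `window` is the dictionary length, i.e. how far back matches may start.
    """
    out = []
    pos = 0
    while pos < len(data):
        start = max(0, pos - window)
        best_len, best_off = 0, 0
        for cand in range(start, pos):
            length = 0
            # Always leave one byte after the match to serve as next_byte.
            while (length < lookahead
                   and pos + length < len(data) - 1
                   and data[cand + length] == data[pos + length]):
                length += 1
            if length > best_len:
                best_len, best_off = length, pos - cand
        out.append((best_off, best_len, data[pos + best_len]))
        pos += best_len + 1
    return out


def lz77_decompress(triples) -> bytes:
    """Invert lz77_compress by replaying the copies byte by byte."""
    buf = bytearray()
    for off, length, nxt in triples:
        for _ in range(length):
            buf.append(buf[-off])  # overlapping copies are handled naturally
        buf.append(nxt)
    return bytes(buf)
```

With the input `b"abcdabcd"`, a window of 8 can reach back to the first `abcd` and needs fewer triples than a window of 2, which cannot; this is the sensitivity to dictionary length that the embodiment sets out to adapt.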
Since the algorithm encodes data based on context information, that is, it forms the dictionary from the preceding context, the preferred dictionary length of the current data can be obtained by analyzing the contextual relation of the current data. The contextual relation is calculated as follows:
In this embodiment, the obtained data is analyzed using permutation entropy, and the degree of contextual association of the current data is obtained from the magnitude of the permutation entropy: the smaller the permutation entropy, the more likely it is that the content of the current data sequence is repeated. Permutation entropy is a known technique and is not described further in this embodiment.
Meanwhile, the embedding dimension in the permutation entropy determines the length of the sequence whose repetition is measured, so the data length at which the sequence repeats most can be determined by computing the permutation entropy of the current data sequence under different embedding dimensions. The other parameter of the permutation entropy, the time delay t, is set to 1 in this embodiment and can be set by the implementer.
The data are analyzed with embedding dimensions of different sizes, the preset range of the embedding dimension being [5,15]. A permutation entropy is obtained for each embedding dimension, and the embedding dimension m corresponding to the minimum permutation entropy is taken; it indicates that the contextual correlation of the current data is strongest when the data length is m. The preferred dictionary length m of the current data is thus obtained.
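The selection of the preferred dictionary length can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, ties inside a window are broken by position, and the entropy is divided by log(m!) so that values for different embedding dimensions are comparable, a normalization the text does not specify.

```python
import math
from collections import Counter


def permutation_entropy(series, m, delay=1):
    """Permutation entropy of `series` for embedding dimension m.

    Assumption: normalized by log(m!) so results for different m
    lie in [0, 1] and can be compared.
    """
    n = len(series) - (m - 1) * delay
    if n <= 0:
        raise ValueError("series too short for this embedding dimension")
    patterns = Counter()
    for i in range(n):
        window = [series[i + k * delay] for k in range(m)]
        # Ordinal pattern: positions sorted by value, ties by position.
        pattern = tuple(sorted(range(m), key=lambda k: (window[k], k)))
        patterns[pattern] += 1
    entropy = -sum((c / n) * math.log(c / n) for c in patterns.values())
    return entropy / math.log(math.factorial(m))


def preferred_dictionary_length(series, dims=range(5, 16)):
    """Embedding dimension in [5, 15] with the smallest permutation entropy."""
    return min(dims, key=lambda m: permutation_entropy(series, m))
```

For example, `preferred_dictionary_length(values)` could be called on the list of hexadecimal values of the current data, with the delay left at 1 as in this embodiment.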
The current data is divided according to the length m by analyzing the current data, and the repetition degree of the adjacent follow-up data and the content thereof after the division is calculated, wherein the calculation method of the repetition degree is as follows:
Because the LZ77 algorithm compresses data sequentially, the current data can be converted into time-series data and a time data curve of the current data constructed. The abscissa of the curve represents the order of the data to be compressed, and the ordinate represents the corresponding value. In this embodiment each data item of the current data is expressed in hexadecimal, that is, all data operations in this embodiment, including the division by embedding dimension above, are performed on hexadecimal values; an embedding dimension of 5 means that the permutation entropy is calculated with 5 hexadecimal values as a group. The values on the ordinate are therefore hexadecimal values, and the points on the curve are the data points to be compressed.
The current data curve can be analyzed to divide the curve, and each m data are divided into a group according to the time sequence length in sequence, so that n groups of data segments are obtained.
Since the window of the data dictionary is sliding, the dividing method is not fixed, so m dividing methods can be obtained in total for dividing different starting points of the current data, thereby analyzing the repetition degree of the corresponding current data, and the specific calculating method is as follows:
$$R = 1 - \mathrm{Norm}\left(\sum_{c=1}^{m}\sum_{i=1}^{n_c-1}\mathrm{DTW}_{c,i}\right)$$
wherein, $R$ represents the degree of repetition of the current data, $\mathrm{Norm}(\cdot)$ represents a normalization function, $m$ represents the preferred dictionary length of the current data, i.e. the number of divisions from different starting points, $n_c$ represents the number of data segments divided from the starting point $c$, and $\mathrm{DTW}_{c,i}$ represents the DTW distance between the $i$-th group of data segments and the $(i+1)$-th group of data segments after division from the starting point $c$.
I.e. the smaller the DTW distance between the adjacent data segments is calculated, the more similar the data between the adjacent data segments is, indicating a greater degree of repetition of the adjacent data segments.
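The repetition degree described above (1 minus the normalized sum of DTW distances between adjacent segments over all starting points) can be sketched as below. The normalization `x / (1 + x)` is an assumption chosen only so the result lies in (0, 1]; the text does not name the normalization function. Incomplete trailing segments are dropped, which is also an assumption.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference cost."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[len(a)][len(b)]


def split_from(series, start, m):
    """Full segments of length m beginning at 0-based offset `start`."""
    return [series[k:k + m] for k in range(start, len(series) - m + 1, m)]


def repetition_degree(series, m, norm=lambda x: x / (1.0 + x)):
    """1 - Norm(sum of DTW distances between adjacent segments),
    accumulated over the m possible starting points."""
    total = 0.0
    for start in range(m):
        segments = split_from(series, start, m)
        for seg, nxt in zip(segments, segments[1:]):
            total += dtw_distance(seg, nxt)
    return 1.0 - norm(total)
```

A series that repeats exactly with period m gives a total distance of 0 and therefore a repetition degree of 1, while a steadily increasing series gives a value below 1.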
Meanwhile, a Fourier transform is applied to the time data curve of the current data. Because the data sequence may be irregular, the resulting frequency domain diagram may contain multiple sinusoidal components of different amplitudes. The amplitudes are classified using the Otsu threshold method, and the sinusoidal components with larger amplitudes in the frequency domain diagram are defined as main frequency component curves, since a larger amplitude represents a larger contribution to the formation of the original data curve. The Fourier transform and the Otsu threshold method are both known techniques and are not described further in this embodiment.
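A minimal sketch of this step, assuming a plain discrete Fourier transform and an exhaustive one-dimensional Otsu split over the amplitude values (rather than a histogram-based variant), might look as follows. The function names and the choice to skip the DC term are assumptions for illustration.

```python
import cmath
import math


def dft_amplitudes(series):
    """Amplitude of each non-DC frequency bin k = 1 .. n//2 (naive DFT)."""
    n = len(series)
    amps = []
    for k in range(1, n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(series))
        amps.append((k, abs(s) * 2 / n))
    return amps


def otsu_threshold(values):
    """Threshold that maximizes the between-class variance of two classes."""
    vals = sorted(values)
    best_t, best_var = vals[0], -1.0
    for i in range(1, len(vals)):
        low, high = vals[:i], vals[i:]
        w0, w1 = len(low) / len(vals), len(high) / len(vals)
        mu0, mu1 = sum(low) / len(low), sum(high) / len(high)
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, (vals[i - 1] + vals[i]) / 2
    return best_t


def main_frequency_components(series):
    """Bins whose amplitude falls in the larger-amplitude Otsu class."""
    amps = dft_amplitudes(series)
    threshold = otsu_threshold([a for _, a in amps])
    return [(k, a) for k, a in amps if a > threshold]
```

For a component found at bin `k` of a series of length `n`, the period in samples is `n / k`, which is the quantity the following paragraphs use.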
For each main frequency component curve of the current data, acquiring a corresponding period according to the frequency spectrum of the corresponding main frequency component curve, so as to analyze according to the obtained period, and further acquire a first optimal reference dictionary length of the current data:
$$w_j = \frac{f_j}{\frac{1}{k}\sum_{j'=1}^{k} f_{j'}}, \qquad L_1 = \sum_{j=1}^{k} w_j T_j$$
wherein, $w_j$ represents the weight of the $j$-th main frequency component curve, $f_j$ represents the main frequency value corresponding to the $j$-th main frequency component curve, $k$ represents the number of main frequency component curves with larger amplitude obtained by the Otsu threshold division, $L_1$ represents the first preferred reference dictionary length of the current data, and $T_j$ represents the period of the $j$-th main frequency component curve.
When the weight of a main frequency component curve is larger, that is, the main frequency value of the curve selected according to amplitude is higher, and the period of that curve is longer, a longer dictionary window is needed to compress the data it contains. Taking all the main frequency component curves of the current data together, the larger the first preferred reference dictionary length, the longer the dictionary window needed to compress the current data as a whole.
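As a worked illustration, the first preferred reference dictionary length can be computed from a list of (main frequency value, period) pairs: each weight is the main frequency value divided by the mean of all main frequency values, and the result is the sum of weight times period. The function name and the input layout are assumptions.

```python
def first_reference_length(components):
    """components: list of (main_frequency_value, period) pairs.

    Weight of each curve = its main frequency value / mean of all
    main frequency values; result = sum of weight * period.
    """
    if not components:
        raise ValueError("at least one main frequency component is required")
    mean_f = sum(f for f, _ in components) / len(components)
    return sum((f / mean_f) * period for f, period in components)
```

With a single main frequency component the weight is 1 and the result is simply that component's period.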
Meanwhile, some historical data with better compression effect are analyzed by the same method, a Fourier frequency domain diagram of each historical data is obtained, all main frequency components in the corresponding Fourier frequency domain diagram are obtained, the obtained main frequency component curves are clustered, the clustering distance in the embodiment is the Euclidean distance between the average frequency value and the average amplitude corresponding to all main frequency component curves of the current data and all the historical data, the similarity condition between the current data and each historical data can be obtained according to the clustering result, and then the compression window correction coefficient corresponding to the current data is obtained according to the obtained similarity. The clustering algorithm uses the existing DBSCAN clustering algorithm, which is a known technology, and this embodiment will not be described in detail.
According to the obtained clustering result, further obtaining a second optimal reference dictionary length of the current data:
$$S_l = 1 - \mathrm{Norm}\left(\frac{1}{\mathit{minr}}\sum_{h=1}^{\mathit{minr}} d_{l,h}\right)$$
wherein, $S_l$ represents the similarity between the current data and the $l$-th historical data in the cluster where the current data is located, $\mathrm{Norm}(\cdot)$ represents a normalization function, $\mathit{minr}$ represents the minimum of the number of main frequency components of the current data and the number of main frequency components of the $l$-th historical data, and $d_{l,h}$ represents the difference between the main frequency components of the $h$-th matched pair of the current data and the $l$-th historical data. The matched pairs are obtained by the Hungarian algorithm, which is a known technique and is not described in detail in this embodiment.
The difference $d_{l,h}$ between the main frequency components of the $h$-th matched pair of the current data and the $l$-th historical data is calculated as follows: the sum of the normalized Euclidean distances of the amplitude, frequency and phase of the two main frequency components is taken as the difference between them. The differences between all main frequency component pairs of the current data and the $l$-th historical data are calculated and assembled into a distance matrix of the current data and the $l$-th historical data, and the Hungarian algorithm is applied to this matrix to obtain $\mathit{minr}$ matched pairs of main frequency components. The smaller the difference between the matched main frequency components, the more similar the two components and their corresponding periods are, and the larger the resulting similarity, i.e. the closer it is to 1.
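The matching and similarity computation can be sketched as follows. To keep the example dependency-free, the optimal assignment is found by brute force over permutations rather than by the Hungarian algorithm; both return a minimum-cost matching, but brute force is only practical for a handful of components. Each component is assumed to be an (amplitude, frequency, phase) triple already scaled to [0, 1], and `x / (1 + x)` is an assumed normalization.

```python
from itertools import permutations


def pair_difference(p, q):
    """Sum of per-attribute distances between two (amp, freq, phase) triples."""
    return sum(abs(a - b) for a, b in zip(p, q))


def similarity(current, history, norm=lambda x: x / (1.0 + x)):
    """1 - Norm(mean difference of the minr optimally matched pairs)."""
    minr = min(len(current), len(history))
    if minr == 0:
        raise ValueError("both inputs need at least one component")
    cur, his = current[:minr], history[:minr]  # first minr components
    dist = [[pair_difference(p, q) for q in his] for p in cur]
    # Brute-force stand-in for the Hungarian algorithm (assumption).
    best_total = min(
        sum(dist[i][perm[i]] for i in range(minr))
        for perm in permutations(range(minr))
    )
    return 1.0 - norm(best_total / minr)
```

Two data sets with the same components listed in a different order are matched with zero total difference, so their similarity is 1.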
$$L_2 = \sum_{l=1}^{N} \frac{S_l}{\sum_{l'=1}^{N} S_{l'}}\, m_l$$
wherein, $L_2$ represents the second preferred reference dictionary length of the current data, $N$ represents the number of elements in the cluster where the current data is located, $S_l$ represents the similarity between the current data and the $l$-th historical data in the cluster where the current data is located, and $m_l$ represents the preferred dictionary length of the $l$-th historical data in the cluster where the current data is located.
That is, the more similar the main frequency components of the current data are to those of a historical data item, i.e. the larger the obtained similarity and the closer it is to 1, the more useful the preferred dictionary length of that historical data is as a reference and the less it needs to be adjusted. The second preferred reference dictionary length of the current data can therefore be used when the first preferred reference dictionary length of the current data is not a reliable reference.
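The second preferred reference dictionary length is a similarity-weighted average of the preferred dictionary lengths of the historical data in the same cluster. A small sketch, with hypothetical names and inputs:

```python
def second_reference_length(similarities, history_lengths):
    """Sum over historical data of (similarity / sum of similarities)
    times that item's preferred dictionary length."""
    total = sum(similarities)
    if total == 0:
        raise ValueError("similarities must not all be zero")
    return sum((s / total) * m
               for s, m in zip(similarities, history_lengths))
```

For instance, two equally similar historical items with preferred lengths 8 and 12 give a second preferred reference dictionary length of 10.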
Using the same method by which the first preferred reference dictionary length of the current data was obtained, a first preferred reference dictionary length is obtained for each historical data item. The reference factor of the current data to the historical data is then obtained from the difference between the first preferred reference dictionary length and the preferred dictionary length of each historical data item:
$$\gamma = 1 - \operatorname{norm}\!\left(\sum_{i=1}^{n} \left| L_{1,i} - m_i \right|\right)$$

where $\gamma$ is the reference factor of the current data to the historical data, $\operatorname{norm}(\cdot)$ is a normalization function, $n$ is the number of elements in the cluster where the current data is located, $L_{1,i}$ is the first preferred reference dictionary length of the $i$-th historical data in the cluster where the current data is located, and $m_i$ is the preferred dictionary length of the $i$-th historical data in the cluster where the current data is located.
That is, the smaller the difference between the first preferred reference dictionary length obtained for the historical data by the above method and the corresponding preferred dictionary length, the stronger the referenceability of a first preferred reference dictionary length obtained by this method, and the greater the influence of the first preferred reference dictionary length of the current data on the dictionary length of the current data.
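A minimal sketch of the reference factor follows. The text only says that the sum of absolute differences is passed through a normalization function; the mapping x / (1 + x) used here is an assumption made for illustration, as are the function and parameter names.

```python
def reference_factor(first_ref_lengths, preferred_lengths):
    """1 minus the normalized total gap between the two lengths.

    first_ref_lengths: first preferred reference dictionary length of each
    historical item; preferred_lengths: its preferred dictionary length.
    """
    if len(first_ref_lengths) != len(preferred_lengths):
        raise ValueError("inputs must have the same length")
    total = sum(abs(a - b) for a, b in zip(first_ref_lengths, preferred_lengths))
    # Assumed normalization into [0, 1): x / (1 + x).
    return 1.0 - total / (1.0 + total)
```

When the frequency-domain estimate reproduces every historical preferred length exactly, the factor is 1; it falls toward 0 as the gaps grow.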
Then, based on the above analysis of the sinusoidal components obtained by Fourier transform of the current data and of each historical data item, the third preferred reference dictionary length of the current data is obtained:
$$L_3 = \gamma L_1 + (1 - \gamma) L_2$$

where $L_3$ is the third preferred reference dictionary length of the current data, $\gamma$ is the reference factor of the current data to the historical data, $L_2$ is the second preferred reference dictionary length of the current data, and $L_1$ is the first preferred reference dictionary length of the current data; $\gamma L_1$ is the first product of the current data and $(1 - \gamma) L_2$ is the second product of the current data.
When the reference factor of the current data to the historical data is larger, the first preferred reference dictionary length agrees better with the dictionary length of the current data, and the resulting third preferred reference dictionary length of the current data is closer to the first preferred reference dictionary length. When the reference factor is smaller, the first preferred reference dictionary length does not agree with the dictionary length of the current data, and historical data similar to the current data must be used as the reference instead; that is, the second preferred reference dictionary length serves as the reference for the third preferred reference dictionary length of the current data.
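As a worked illustration with made-up values (not taken from the patent), write $\gamma$ for the reference factor, $L_1$ and $L_2$ for the first and second preferred reference dictionary lengths, and $L_3$ for the third; these symbol names are assumptions chosen for this example. The weighted combination described above then evaluates as:

```latex
\begin{aligned}
\gamma &= 0.25,\quad L_1 = 16,\quad L_2 = 8\\
L_3 &= \gamma L_1 + (1-\gamma) L_2 = 0.25\cdot 16 + 0.75\cdot 8 = 4 + 6 = 10
\end{aligned}
```

With a small reference factor the result sits closer to $L_2 = 8$ than to $L_1 = 16$, which matches the behaviour described in the text.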
The third preferred reference dictionary length for the current data is thus obtained.
The third preferred reference dictionary length and the repetition degree of the current data obtained by the method are used for correcting the preferred dictionary length m of the current data:
$$m' = R\,m + (1 - R)\,L_3$$

where $m'$ is the corrected dictionary length of the current data, $R$ is the repetition degree of the current data, $m$ is the preferred dictionary length of the current data, and $L_3$ is the third preferred reference dictionary length of the current data. The flowchart for obtaining the corrected dictionary length of the current data is shown in Fig. 2.
That is, the larger the repetition degree of the data segments obtained with the preferred dictionary length m, the higher the corresponding data compression efficiency and the lower the need for correction; the weight given to the correction becomes smaller, and the corrected dictionary length of the current data relies less on the third preferred reference dictionary length. This completes the correction of the initial dictionary length m of the current data.
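The correction step can be sketched as below. The blend itself follows the description above; the helper that rounds the result to a positive integer is an assumption added for this illustration, since a compressor needs an integer length and the text does not say how one is derived. All names are hypothetical.

```python
def corrected_dictionary_length(repetition, preferred_len, third_ref_len):
    """Blend the preferred length and the third reference length.

    repetition: repetition degree of the current data, expected in [0, 1].
    """
    if not 0.0 <= repetition <= 1.0:
        raise ValueError("repetition degree must lie in [0, 1]")
    return repetition * preferred_len + (1.0 - repetition) * third_ref_len


def as_integer_length(value):
    """Assumed post-processing: round to an integer of at least 1."""
    return max(1, round(value))
```

A repetition degree of 1 leaves the preferred length untouched, and a repetition degree of 0 replaces it with the third preferred reference dictionary length.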
Step S003, data transmission.
The corrected dictionary length obtained by the above method is used as the dictionary length for compressing the current data, and the compressed data is transmitted. This reduces the amount of data transmitted and thereby achieves high-speed data transmission.
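To show where the dictionary length enters the compressor, here is a minimal LZ77 sketch. It assumes that "dictionary length" means the size of the sliding search window, and it emits (offset, length, next byte) triples; both the interpretation and the token format are assumptions made for this illustration and are not taken from the patent.

```python
def lz77_compress(data: bytes, window: int, max_len: int = 255):
    """Greedy LZ77 encoder returning (offset, length, next_byte) triples."""
    if window < 1:
        raise ValueError("window must be at least 1")
    tokens = []
    i, n = 0, len(data)
    while i < n:
        best_off, best_len = 0, 0
        # Only the last `window` bytes are searched for a match.
        for j in range(max(0, i - window), i):
            length = 0
            # Stop one byte early so a literal always follows the match;
            # matches may overlap the current position, as in classic LZ77.
            while (length < max_len and i + length < n - 1
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_off, best_len = i - j, length
        tokens.append((best_off, best_len, data[i + best_len]))
        i += best_len + 1
    return tokens


def lz77_decompress(tokens) -> bytes:
    """Inverse of lz77_compress."""
    out = bytearray()
    for offset, length, next_byte in tokens:
        for _ in range(length):
            out.append(out[-offset])  # byte-wise copy handles overlaps
        out.append(next_byte)
    return bytes(out)
```

On a period-3 input such as `b"abcabcabcabcabcx"`, a window of 1 finds no matches, while a window of 3 covers one full period and encodes the repeats in a single token, which is the effect the dictionary length selection aims for.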
In the embodiment of the invention, a permutation entropy algorithm is first used to obtain a preliminary preferred dictionary length for compressing the data with the LZ77 algorithm, which provides a baseline against which the optimization for the current data and the historical data can be judged;
the current data is then converted into a data curve, which is divided on the basis of the preferred dictionary length to obtain the similarity between data segments and, from it, the repetition degree of the current data; the repetition degree characterizes the basic structure of the current data and determines how strongly the compression optimization of the embodiment is applied;
a first preferred reference dictionary length of the current data is obtained at the frequency-domain level by analyzing the main frequency components in the frequency domain diagram of the current data, and a second preferred reference dictionary length of the current data is obtained at the level of historical compression experience by analyzing the dictionary lengths used to compress the historical data, so that the optimization for the current data is evaluated from both levels, which helps in selecting the preferred dictionary length of the current data;
the embodiment of the invention additionally analyzes the difference between the preferred dictionary length of each historical data item in the cluster where the current data is located and the first preferred reference dictionary length of that historical data item to obtain a weight, so that the compression results of the historical data indirectly influence the compression optimization of the current data and determine how much the third preferred reference dictionary length relies on each of the two reference lengths;
finally, the preferred dictionary length of the current data is corrected by combining it with the third preferred reference dictionary length to obtain the corrected dictionary length, which is used for data compression with the LZ77 algorithm; this increases the compression ratio, reduces the amount of data in transmission, and improves the data transmission speed.
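The first step of this summary, choosing a dictionary length by permutation entropy, can be sketched as follows. Two points are assumptions made for this illustration: the entropy is normalized by log(dim!) so that values for different embedding dimensions are comparable, and the embedding delay defaults to 1. The text specifies neither, and the function names are hypothetical.

```python
import math
from collections import Counter


def permutation_entropy(series, dim, delay=1):
    """Normalized permutation entropy of `series` for embedding dimension `dim`."""
    n = len(series) - (dim - 1) * delay
    if dim < 2 or n < 1:
        raise ValueError("series too short or dim < 2")
    # Count the ordinal pattern (argsort) of every embedded window.
    patterns = Counter(
        tuple(sorted(range(dim), key=lambda k: series[i + k * delay]))
        for i in range(n)
    )
    h = -sum(c / n * math.log(c / n) for c in patterns.values())
    # Assumed normalization so that different dims share the range [0, 1].
    return h / math.log(math.factorial(dim))


def preferred_dictionary_length(series, dims):
    """Embedding dimension in `dims` with the minimum permutation entropy."""
    return min(dims, key=lambda d: permutation_entropy(series, d))
```

A strictly increasing series has a single ordinal pattern and therefore zero entropy for every dimension; `min` then returns the first candidate in `dims`.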
It should be noted that the order of the embodiments of the present invention is for description only and does not indicate their relative merits. The foregoing description has been directed to specific embodiments of this specification. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or a sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
In this specification, the embodiments are described in a progressive manner; for the same or similar parts, the embodiments may be referred to one another, and each embodiment mainly describes its differences from the other embodiments.
The above embodiments are only intended to illustrate the technical solutions of the present application and are not limiting. Any modification of the technical solutions described in the foregoing embodiments, or equivalent replacement of some of their technical features, that does not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application shall fall within the protection scope of the present application.

Claims (10)

1. The high-speed data transmission method based on the optical fiber network is characterized by comprising the following steps of:
collecting current data to be transmitted and historical data in a transmission system;
acquiring the preferred dictionary length of the current data by adopting a permutation entropy algorithm; constructing a time data curve of the current data ordered by time, and dividing the time data curve at different starting points according to the preferred dictionary length to obtain each group of data segments at each starting point; obtaining the repetition degree of the current data according to the similarity between each group of data segments divided by the time data curve at each starting point;
obtaining a frequency domain diagram of the time data curve of the current data by adopting Fourier transform, classifying the curves with different amplitudes in the frequency domain diagram by using the Otsu threshold method, and taking the curves in the class with the largest amplitude sum after classification as main frequency component curves; obtaining a first preferred reference dictionary length of the current data according to the difference between the main frequency value and the period of each main frequency component curve of the current data; clustering the main frequency component curves of the current data and the historical data to obtain clusters, and constructing a second preferred reference dictionary length of the current data according to the difference between the main frequency components of each historical data and the main frequency components of the current data in the cluster where the current data is located;
obtaining a reference factor of the current data to the historical data according to the first preferred reference dictionary length and the preferred dictionary length of each historical data in the cluster where the current data is located; obtaining a third preferred reference dictionary length of the current data according to the reference factor of the current data to the historical data and the first and second preferred reference dictionary lengths; correcting the preferred dictionary length according to the repetition degree of the current data and the third preferred reference dictionary length to obtain a corrected dictionary length; and compressing the current data by taking the corrected dictionary length of the current data as the dictionary length of the LZ77 algorithm.
2. The method for high-speed data transmission based on an optical fiber network according to claim 1, wherein the obtaining the preferred dictionary length of the current data using the permutation entropy algorithm comprises:
setting a preset range for the embedding dimension in the permutation entropy, and calculating the corresponding permutation entropy of the current data for each embedding dimension in the preset range; and taking the embedding dimension corresponding to the minimum permutation entropy as the preferred dictionary length.
3. The method for high-speed data transmission based on optical fiber network according to claim 2, wherein the dividing the time data curve at different starting points according to the preferred dictionary length to obtain each group of data segments at each starting point comprises:
taking the preferred dictionary length as the maximum number of distinct starting points for dividing data segments on the time data curve, and, for each starting position from 1 to that maximum, dividing the time data curve from that position to obtain the group of data segments at each starting point.
4. The high-speed data transmission method based on the optical fiber network as claimed in claim 3, wherein the obtaining the repetition degree of the current data according to the similarity between the groups of data segments divided at each start point by the time data curve comprises:
for each group of data segments divided by the time data curve at each starting point, calculating the DTW distance between the data segment and the adjacent next data segment, and calculating the normalized value of the sum of the DTW distances of all groups of data segments divided by all starting points;
taking the difference result of subtracting the normalized value from 1 as the repetition degree of the current data.
5. The method for high-speed data transmission based on an optical fiber network according to claim 1, wherein the obtaining the first preferred reference dictionary length of the current data according to the difference between the main frequency value and the period of each main frequency component curve of the current data comprises:
for each main frequency component curve of the current data, taking the ratio of the main frequency value of the main frequency component curve to the average value of the main frequency values of all the main frequency component curves as the weight of the main frequency component curve;
and acquiring the period of the main frequency component curve, and taking the sum value of the products of the weights and the period of all the main frequency component curves of the current data as the first optimal reference dictionary length of the current data.
6. The method for high-speed data transmission based on a fiber network according to claim 2, wherein the constructing the second preferred reference dictionary length of the current data based on the difference between the dominant frequency component of each history data and the dominant frequency component of the current data in the cluster in which the current data is located comprises:
for each historical data in the cluster where the current data is located, obtaining similarity according to the difference between the main frequency components of the current data and the historical data;
acquiring a preferred dictionary length of the historical data; calculating the sum value of the similarity of all the historical data in the cluster where the current data is located, and calculating the ratio of the similarity of the historical data to the sum value;
and taking the sum value of the products of the ratio of all the historical data in the cluster where the current data is located and the preferred dictionary length as a second preferred reference dictionary length of the current data.
7. The method for high-speed data transmission based on an optical fiber network according to claim 6, wherein the obtaining the similarity according to the difference between the dominant frequency components of the current data and the history data comprises:
acquiring the minimum minr of the number of main frequency components in the current data and the historical data;
for the first minr main frequency components of the current data and of the historical data, respectively taking any one main frequency component from each to form a main frequency component pair, taking the sum of the normalized Euclidean distances of the amplitude, the frequency and the phase between the main frequency component pair as the difference of the main frequency component pair, and forming the differences of all main frequency component pairs into a distance matrix of the current data and the historical data;
obtaining minr main frequency component matching pairs by applying the Hungarian algorithm to the distance matrix, and calculating the mean difference of the minr main frequency component matching pairs;
and taking the difference result obtained by subtracting the normalized value of the difference mean value from 1 as the similarity of the current data and the historical data.
8. The method for high-speed data transmission based on a fiber network according to claim 1, wherein the obtaining the reference factor of the current data to the history data according to the first preferred reference dictionary length and the preferred dictionary length of each history data in the cluster in which the current data is located comprises:
for each historical data in the cluster where the current data is located, calculating the absolute value of the difference between the first preferred reference dictionary length and the preferred dictionary length of the historical data;
and calculating the normalized value of the sum of the absolute values of the differences of all the historical data, and taking the difference result of subtracting the normalized value from 1 as a reference factor of the current data to the historical data.
9. The method for high-speed data transmission over a fiber optic network according to claim 1, wherein the obtaining a third preferred reference dictionary length of the current data based on the reference factor of the current data to the history data and the first and second preferred reference dictionary lengths comprises:
calculating the product of the reference factor and the first preferred reference dictionary length as a first product, and calculating the product of the difference result of 1 minus the reference factor and the second preferred reference dictionary length as a second product;
and taking the sum value of the first product and the second product as a third preferred reference dictionary length of the current data.
10. The method for high-speed data transmission over a fiber network according to claim 1, wherein correcting the preferred dictionary length according to the repetition degree of the current data and the third preferred reference dictionary length to obtain the corrected dictionary length comprises:
calculating the product of the repetition degree and the preferred dictionary length as a third product, and calculating the product of the difference result of subtracting the repetition degree from 1 and the third preferred reference dictionary length as a fourth product;
and taking the sum value of the third product and the fourth product as the corrected dictionary length of the current data.
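The repetition degree defined in claims 3 and 4 can be sketched as follows. Several details are assumptions made for this illustration: each data segment is taken to have the preferred dictionary length, trailing samples that do not fill a segment are dropped, the local DTW cost is the absolute difference, and the sum of DTW distances is normalized with x / (1 + x). None of these is stated in the claims, and the function names are hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference cost."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for x in a:
        cur = [inf]
        for j, y in enumerate(b, start=1):
            cost = abs(x - y)
            cur.append(cost + min(prev[j], prev[j - 1], cur[j - 1]))
        prev = cur
    return prev[-1]


def segment_groups(series, m):
    """One group of length-m segments per starting point (0-based here)."""
    return [
        [series[k:k + m] for k in range(start, len(series) - m + 1, m)]
        for start in range(m)
    ]


def repetition_degree(series, m):
    """1 minus the normalized sum of DTW distances between adjacent segments."""
    total = sum(
        dtw_distance(group[k], group[k + 1])
        for group in segment_groups(series, m)
        for k in range(len(group) - 1)
    )
    # Assumed normalization into [0, 1): x / (1 + x).
    return 1.0 - total / (1.0 + total)
```

For a series with period 3, dividing with m = 3 makes every pair of adjacent segments identical at every starting point, so the repetition degree is 1; dividing the same series with m = 2 gives a lower value.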
CN202311785079.9A 2023-12-25 2023-12-25 High-speed data transmission method based on optical fiber network Active CN117459187B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311785079.9A CN117459187B (en) 2023-12-25 2023-12-25 High-speed data transmission method based on optical fiber network

Publications (2)

Publication Number Publication Date
CN117459187A CN117459187A (en) 2024-01-26
CN117459187B true CN117459187B (en) 2024-03-12

Family

ID=89591433

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311785079.9A Active CN117459187B (en) 2023-12-25 2023-12-25 High-speed data transmission method based on optical fiber network

Country Status (1)

Country Link
CN (1) CN117459187B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103345920A (en) * 2013-05-29 2013-10-09 河海大学常州校区 Self-adaptation interpolation weighted spectrum model voice conversion and reconstructing method based on Mel-KSVD sparse representation
CN107576943A (en) * 2017-08-07 2018-01-12 西安电子科技大学 Adaptive Time and Frequency Synchronization compression method based on Rayleigh entropy
CN115630271A (en) * 2022-09-28 2023-01-20 中车工业研究院有限公司 Signal frequency estimation method, device, equipment and storage medium
CN116226484A (en) * 2023-05-05 2023-06-06 北京视酷科技有限公司 Ultrafiltration water treatment device monitoring data management system

Also Published As

Publication number Publication date
CN117459187A (en) 2024-01-26

Similar Documents

Publication Publication Date Title
CN116828070B (en) Intelligent power grid data optimization transmission method
CN109525369B (en) Channel coding type blind identification method based on recurrent neural network
CN115459782A (en) Industrial Internet of things high-frequency data compression method based on time sequence segmentation and clustering
CN111489188B (en) Resident adjustable load potential mining method and system
CN117411947B (en) Cloud edge cooperation-based water service data rapid transmission method
CN115333735A (en) Safe data transmission method
CN116320043B (en) Method and system for improving transmission efficiency of multi-carrier communication system
CN117316301B (en) Intelligent compression processing method for gene detection data
CN116582133B (en) Intelligent management system for data in transformer production process
CN115834642B (en) Intelligent silkworm co-rearing room data transmission method based on Internet of things technology
CN116700630B (en) Organic-inorganic compound fertilizer production data optimized storage method based on Internet of things
CN115514376A (en) High-frequency time sequence data compression method and device based on improved symbol aggregation approximation
CN114416707A (en) Method and device for automated feature engineering of industrial time series data
CN114626487B (en) Linear transformation relation checking method based on random forest classification algorithm
CN116567269A (en) Spectrum monitoring data compression method based on signal-to-noise separation
CN115987294A (en) Multidimensional data processing method of Internet of things
CN115857823A (en) Distributed compression storage method based on data sharing
CN109067678A (en) Based on Higher Order Cumulants WFRFT signal cascade Modulation Identification method, wireless communication system
CN115987296A (en) Traffic energy data compression transmission method based on Huffman coding
CN117459187B (en) High-speed data transmission method based on optical fiber network
CN106656201B (en) Compression method based on amplitude-frequency characteristics of sampled data
CN105469601B (en) A kind of road traffic spatial data compression method based on LZW codings
CN115270895B (en) Fault detection method for diesel engine
CN116155297A (en) Data compression method, device, equipment and storage medium
CN118018033B (en) Intelligent compression transmission method for motor performance data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant