CN115269940A - Data compression method of ERP management system - Google Patents
- Publication number
- CN115269940A (application CN202211206424.4A)
- Authority
- CN
- China
- Prior art keywords
- data
- compression
- interval
- entropy
- reducible
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
- G06F16/90335—Query processing
- G06F16/90348—Query processing by searching ordered data, e.g. alpha-numerically ordered data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
- G06F16/9035—Filtering based on additional data, e.g. user or group profiles
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
The invention relates to the technical field of digital data processing, and in particular to a data compression method of an ERP management system. The method comprises: collecting data to be compressed of the ERP management system; acquiring a plurality of compression intervals of the data to be compressed and the corresponding average repeatability; re-partitioning all data to obtain a plurality of data intervals; screening out reducible entropy data and establishing a corresponding ideal compression model; acquiring corresponding position adjustment parameters and direction adjustment parameters based on the difference between the arrangement positions of the reducible entropy data and the ideal compression model; adjusting the reducible entropy data in different arrangement sequences using the corresponding position adjustment parameters and direction adjustment parameters, and selecting a sequence adjustment parameter; and performing distribution adjustment on the reducible entropy data using the position adjustment parameter, the direction adjustment parameter and the sequence adjustment parameter to obtain an adjusted entropy reduction model, and compressing the entropy reduction model. The invention can improve compression efficiency and achieve efficient compression of ERP management system data.
Description
Technical Field
The invention relates to the technical field of digital data processing, in particular to a data compression method of an ERP management system.
Background
When an ERP management system manages information, it relies on a large amount of data. To reduce data transmission and to meet the large compression volume that ERP requires, the data is compressed by conventional methods. The most common data compression algorithm in the prior art is GZIP: LZ77 coding performs a first compression pass on the data, and Huffman coding then performs a second compression pass on the LZ77 output. In essence, this approach exploits the repeatability of the data to encode and compress it toward the information entropy limit.
Because an ERP management system covers many aspects of a business, its data contains little structured data and has low repeatability; that is, the information entropy of the data as a whole is too large. When LZ77 coding is used to compress ERP management system data, repeated data often does not fall within the same compression dictionary, or falls within the same compression window but at a relatively large distance. As a result, the information entropy of the data as a whole remains too large, the compression efficiency is too low, and a good compression effect cannot be achieved.
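The effect described above can be observed with a small experiment. The following is a minimal sketch, not part of the claimed method: it uses Python's zlib module (DEFLATE, i.e. LZ77 plus Huffman coding, with a 32 KiB window) and pseudo-random bytes standing in for low-repeatability data. The block and filler sizes are arbitrary assumptions chosen so that the second copy of the block lies either inside or outside the window.

```python
import random
import zlib


def random_bytes(rng: random.Random, n: int) -> bytes:
    """Return n pseudo-random bytes (stand-in for data with little repetition)."""
    return bytes(rng.getrandbits(8) for _ in range(n))


rng = random.Random(0)
block = random_bytes(rng, 4000)    # the repeated data
filler = random_bytes(rng, 40000)  # unrelated data, longer than the 32 KiB window

# Same bytes overall, different arrangement of the two copies of `block`.
near = block + block + filler      # second copy is 4000 bytes after the first
far = block + filler + block       # second copy is more than 32 KiB after the first

near_size = len(zlib.compress(near, 9))
far_size = len(zlib.compress(far, 9))

print("copies close together:", near_size, "bytes")
print("copies far apart:     ", far_size, "bytes")
```

When the two copies are close together, the second copy can be encoded as back-references to the first; when they are separated by more than the window, it cannot, which is the situation the rearrangement in this method is intended to avoid.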
Disclosure of Invention
To solve the problem that the compression effect is unsatisfactory because the information entropy of ERP management system data is too large, the invention provides a data compression method of an ERP management system. The technical solution adopted is as follows:
An embodiment of the present invention provides a data compression method of an ERP management system, including the following steps:
collecting data to be compressed of an ERP management system;
performing interval division on the data to be compressed to obtain a plurality of compression intervals, and obtaining the corresponding average repeatability according to the information entropy of repeated data of different lengths in each compression interval; re-partitioning all data based on the average repeatability of all data in a plurality of consecutive compression intervals to obtain a plurality of data intervals;
acquiring the distribution characteristics of the repeated data in each data interval, taking the average value of all the distribution characteristics as a screening threshold, and screening out the repeated data whose distribution characteristics are larger than the screening threshold as the reducible entropy data; establishing an ideal compression model of the reducible entropy data;
acquiring corresponding position adjustment parameters based on the difference between the arrangement positions of the reducible entropy data and the ideal compression model; acquiring corresponding direction adjustment parameters based on the sign of the accumulated value of the differences; adjusting the reducible entropy data in different arrangement sequences using the corresponding position adjustment parameters and direction adjustment parameters, and selecting the arrangement sequence that is most similar to the ideal compression model after adjustment as the corresponding sequence adjustment parameter;
and performing distribution adjustment on the reducible entropy data using the position adjustment parameter, the direction adjustment parameter and the sequence adjustment parameter to obtain an adjusted entropy reduction model, and compressing the entropy reduction model.
Preferably, the interval division of the data to be compressed includes:
dividing all data to be compressed into a plurality of compression intervals, taking the length of the compression window in LZ77 encoding compression as the interval division unit.
Preferably, the average repeatability is obtained as follows:
calculating the information entropy of the repeated data of each length in the compression interval, and obtaining the average repeatability of the corresponding compression interval based on the information entropy corresponding to all the different lengths and the length of the compression interval.
Preferably, the repartitioning of all data based on the average repeatability of all data of a plurality of consecutive compression intervals includes:
the average repeatability of the first compression interval is denoted $C_1$; the average repeatability of all data in the first and second compression intervals is obtained and denoted $C_2$; if $C_2 \ge C_1$, the average repeatability $C_3$ of all data in the first, second and third compression intervals is calculated, and so on, until, on reaching the $(j+1)$-th consecutive compression interval, $C_{j+1} < C_j$; all data of the first $j$ consecutive compression intervals are then taken as the first data interval;
the average repeatability is then calculated afresh starting from the $(j+1)$-th compression interval, until all data intervals of the data to be compressed are obtained.
Preferably, the method for acquiring the distribution characteristics includes:
for any data interval, calculating the distance between successive occurrences of any repeated data in the data interval, calculating the information entropy of the distances, and obtaining the summation result of the corresponding information entropies over all repeated occurrences of the repeated data; calculating the proportion of the data length of the repeated data in the total length of the data interval as the weight of the summation result, taking the resulting product as a characteristic index, using the characteristic index as the exponent of a preset value, and taking the resulting exponential function value as the distribution characteristic.
Preferably, the process of establishing the ideal compression model of the reducible entropy data is as follows:
performing simulated compression on all data in the data interval in which each piece of reducible entropy data is located, taking the length of a compression interval as the length of the sliding compression window; during the simulated compression, when reducible entropy data that cannot be compressed is encountered, adjusting the position of the next occurrence of the reducible entropy data so that it can just be compressed; and traversing the whole data interval to obtain the corresponding ideal compression model.
The embodiment of the invention at least has the following beneficial effects:
firstly, the average repeatability of the data to be compressed is calculated, and all the data are partitioned again by utilizing the average repeatability, so that more repeated data are partitioned in the same data interval as much as possible, and the subsequent compression efficiency is improved; then screening entropy-reducible data in the data intervals, judging whether the data in each data interval has the need of entropy reduction, and screening the data needing entropy reduction; the corresponding position adjustment parameters, direction adjustment parameters and sequence adjustment parameters are obtained by establishing an ideal compression model of the entropy-reducible data, so that the entropy-reducible data is subjected to distribution adjustment to obtain an adjusted entropy-reducible model, repeated data is in the same compression window as much as possible during compression, the entropy-reducible model is compressed, the compression efficiency is improved, the retrieval time is shortened, and the efficient compression of the ERP management system data is realized.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions and advantages of the prior art, the drawings used in the embodiments or the description of the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
Fig. 1 is a flowchart illustrating steps of a data compression method of an ERP management system according to an embodiment of the present invention.
Detailed Description
To further illustrate the technical means and effects of the present invention for achieving the predetermined objects, the following detailed description of a data compression method for an ERP management system according to the present invention, its specific implementation, structure, features and effects will be given with reference to the accompanying drawings and preferred embodiments. In the following description, different "one embodiment" or "another embodiment" refers to not necessarily the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The following describes a specific scheme of the data compression method of the ERP management system provided by the present invention in detail with reference to the accompanying drawings.
Referring to fig. 1, a flowchart illustrating steps of a data compression method of an ERP management system according to an embodiment of the present invention is shown, where the method includes the following steps:
Step S001, collecting data to be compressed of the ERP management system.
The data to be compressed are screened and collected by utilizing a database of the ERP management system, and the data to be compressed can be collected through manual selection or automatic system selection.
Step S002, performing interval division on the data to be compressed to obtain a plurality of compression intervals, and obtaining the corresponding average repeatability according to the information entropy of repeated data of different lengths in each compression interval; and re-partitioning all the data based on the average repeatability of all the data of a plurality of consecutive compression intervals to obtain a plurality of data intervals.
The method comprises the following specific steps:
1. Interval division is performed on the data to be compressed to obtain a plurality of compression intervals.
All data to be compressed are divided into a plurality of compression intervals, taking the length of the compression window in LZ77 encoding compression as the interval division unit.
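The interval division can be sketched as follows. This is a minimal illustration, assuming the data is available as a byte string; the function name and the parameter `window_length` are hypothetical and stand for the LZ77 compression window length.

```python
def split_into_compression_intervals(data: bytes, window_length: int) -> list:
    """Cut `data` into consecutive compression intervals of `window_length` bytes.

    The last interval may be shorter when len(data) is not a multiple
    of `window_length`.
    """
    if window_length <= 0:
        raise ValueError("window_length must be positive")
    return [data[i:i + window_length] for i in range(0, len(data), window_length)]


intervals = split_into_compression_intervals(b"abcdefgh", 3)
print(intervals)
```

Concatenating the intervals in order reproduces the original data, so the division itself loses nothing.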
2. The average repeatability of each compression interval is obtained.
The information entropy of the repeated data of each length in the compression interval is calculated, and the average repeatability of the corresponding compression interval is obtained based on the information entropy corresponding to all the different lengths and the length of the compression interval.
The length of the longest repeated data in a compression interval is counted and denoted $B$.
Then, taking one compression interval as an example, the corresponding average repeatability $C$ is calculated from the following quantities:
wherein $b$ represents the length of the repeated data, $b \in [1, B]$; $B$ is the length of the longest repeated data of the current compression interval; $n_b$ is the number of different pieces of data when the length of the repeated data is $b$, which is bounded above by the maximum number of data items of that length; $p_i$ is the probability of the $i$-th different piece of data when the length of the repeated data is $b$, and $-\sum_{i=1}^{n_b} p_i \log p_i$ is the corresponding information entropy; $A$ denotes the length of the compression interval.
The information entropy is computed with the well-known formula. Each time the repeatability of data of a given length is calculated, the whole interval must be traversed once, and this calculation is carried out once for each length; the average repeatability is then obtained by dividing by the corresponding normalizing term.
Within a compression interval, the repeatability of the data is measured using information entropy: the more the data of each length repeats, the larger the corresponding term. The terms for all lengths are accumulated and averaged to give the average repeatability $C$ of the data in the window. The larger $C$ is, the more likely it is that, for any length $b \le B$, the corresponding data is repeated.
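The quantities above can be illustrated with a short sketch. The per-length entropy follows the standard Shannon formula. The way the entropies are combined into a single score in `average_repeatability` is an assumption made only for illustration (each length contributes one minus its entropy normalized by the largest possible entropy, and the contributions are averaged over the $B$ lengths); it is not the formula of the patent. All function names are hypothetical, base-2 logarithms are assumed, and overlapping occurrences are counted.

```python
import math
from collections import Counter


def substring_entropy(interval: bytes, b: int) -> float:
    """Shannon entropy (in bits) of all length-b substrings of `interval`."""
    windows = [interval[i:i + b] for i in range(len(interval) - b + 1)]
    total = len(windows)
    counts = Counter(windows)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def longest_repeat_length(interval: bytes) -> int:
    """Length B of the longest substring that occurs at least twice (brute force)."""
    best = 0
    for b in range(1, len(interval)):
        windows = [interval[i:i + b] for i in range(len(interval) - b + 1)]
        if len(set(windows)) == len(windows):
            break  # no repeat of length b, hence none of any greater length
        best = b
    return best


def average_repeatability(interval: bytes) -> float:
    """ASSUMED aggregate score in [0, 1]; higher means more repetition."""
    B = longest_repeat_length(interval)
    if B == 0:
        return 0.0
    score = 0.0
    for b in range(1, B + 1):
        n_windows = len(interval) - b + 1  # entropy is at most log2(n_windows)
        score += 1.0 - substring_entropy(interval, b) / math.log2(n_windows)
    return score / B


print(average_repeatability(b"abababab"), average_repeatability(b"abcdefgh"))
```

The brute-force search is quadratic or worse in the interval length; a suffix array or rolling hash would be the usual choice for real window sizes.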
3. And re-partitioning all the data based on the average repeatability of all the data of the continuous multiple compression intervals to obtain multiple data intervals.
The average repeatability of the first compression interval is denoted $C_1$. The average repeatability of all data in the first and second compression intervals is obtained and denoted $C_2$. If $C_2 \ge C_1$, the average repeatability $C_3$ of all data in the first, second and third compression intervals is calculated, and so on, until, on reaching the $(j+1)$-th consecutive compression interval, $C_{j+1} < C_j$. All data of the first $j$ consecutive compression intervals are then taken as the first data interval $Q_1^{N}$, where the subscript 1 indicates the first data interval and the superscript $N$ indicates that the data interval contains $N$ pieces of data in total. The average repeatability is then calculated afresh starting from the $(j+1)$-th compression interval, until all data intervals of the data to be compressed are obtained, expressed as $Q_1^{N}, Q_2^{N}, \dots, Q_M^{N}$, where the subscript $m$ denotes the $m$-th data interval, $m \in [1, M]$, and $M$ is the total number of data intervals.
It should be noted that the $N$ of different data intervals may not be equal; for convenience of description, they are all written as $N$.
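The re-partitioning can be sketched as a greedy merge. This is a minimal sketch under two assumptions: the merge continues while the average repeatability of the merged data does not decrease (the comparison in the source text is not legible, so this stopping rule is an interpretation), and the repeatability measure is passed in as a function. The names `repartition` and `toy_repeatability` are hypothetical, and the toy measure is only a stand-in.

```python
def repartition(intervals: list, repeatability) -> list:
    """Greedily merge consecutive compression intervals into data intervals.

    `repeatability` is any function mapping bytes to a score where higher
    means more repetition. ASSUMPTION: merging continues while the score
    of the merged data does not decrease.
    """
    data_intervals = []
    start = 0
    while start < len(intervals):
        merged = intervals[start]
        previous = repeatability(merged)
        end = start + 1
        while end < len(intervals):
            candidate = merged + intervals[end]
            current = repeatability(candidate)
            if current < previous:
                break  # adding this interval lowers the score: close the data interval
            merged, previous = candidate, current
            end += 1
        data_intervals.append(merged)
        start = end  # restart from the first interval that was not merged
    return data_intervals


def toy_repeatability(data: bytes) -> float:
    """Stand-in measure: share of bytes that repeat an already seen byte value."""
    return 1.0 - len(set(data)) / len(data) if data else 0.0


print(repartition([b"abab", b"abab", b"wxyz", b"wxyz"], toy_repeatability))
```

With the toy measure, the two intervals containing only `a` and `b` end up in one data interval and the two containing `w` to `z` in another.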
Step S003, acquiring the distribution characteristics of the repeated data in each data interval, taking the average value of all the distribution characteristics as the screening threshold, and screening out the repeated data whose distribution characteristics are larger than the screening threshold as the reducible entropy data; and establishing an ideal compression model of the reducible entropy data.
The method comprises the following specific steps:
1. The distribution characteristics of the repeated data in each data interval are acquired.
For any data interval, the distance between successive occurrences of any repeated data in the data interval is calculated, the information entropy of the distances is calculated, and the summation result of the corresponding information entropies over all repeated occurrences of the repeated data is obtained. The proportion of the data length of the repeated data in the total length of the data interval is calculated as the weight of the summation result; the resulting product is a characteristic index, the characteristic index is used as the exponent of a preset value, and the resulting exponential function value is the distribution characteristic.
Take the $r$-th repeated data $u_r$ in the $m$-th data interval $Q_m^{N}$ as an example, where $r \in [1, R]$ and $R$ is the total number of repeated data in the $m$-th data interval. First, the total number of occurrences $S$ of the repeated data $u_r$ in the $m$-th data interval is counted, and the position of each occurrence is recorded; for example, the position of the $k$-th occurrence is $x_k$, $k \in [1, S]$. Then the distance between each two adjacent occurrences of $u_r$ is calculated. Taking the distance $d_k$ between the $k$-th occurrence and the next occurrence as an example, $d_k = x_{k+1} - x_k$, where $x_k$ is the position of the $k$-th occurrence of $u_r$ in $Q_m^{N}$ and $x_{k+1}$ is the position of its $(k+1)$-th occurrence.
The distribution characteristic $T$ of the repeated data $u_r$ in the $m$-th data interval is then obtained from the distances:
$$T = e^{\,w \cdot H}$$
where $w$ is the weight of the summation result, i.e. the proportion of the data length of the repeated data $u_r$ in the total length of the data interval; $H$ is the summation result of all corresponding information entropies when $u_r$ occurs repeatedly; and $e$ is the natural constant, which is the preset value in this embodiment of the invention.
$H$ quantifies the overall variation in the distance between adjacent occurrences of the repeated data $u_r$ in the $m$-th data interval. The more regular the positions at which $u_r$ occurs in $Q_m^{N}$, the smaller $H$ is, the fewer positions the subsequent entropy reduction function has to adjust, and the smaller the amount of calculation. $H = -\sum_{k=1}^{S-1} p(d_k) \log p(d_k)$, where $S$ represents the number of occurrences of the repeated data and $p(d_k)$ indicates the probability that the distance $d_k$ occurs.
The proportion of the data length of the repeated data $u_r$ in the total length of the data interval is $w = \frac{l \cdot S}{N}$, where $l$ is the length of $u_r$, $S$ is the number of occurrences of the repeated data, and $N$ is the total length of the $m$-th data interval. The weight is calculated from this proportion: the larger it is, the larger the share of the $m$-th data interval that $u_r$ accounts for by length or by repetition count, and the more compressing it as fully as possible, by adjusting its distribution positions, contributes to the compression rate of the $m$-th data interval. The proportion is therefore used as a weight when quantifying the distance feature.
The distribution characteristic $T$ of the repeated data $u_r$ in the $m$-th data interval is used to judge whether an entropy reduction function needs to be calculated for $u_r$ in that interval in order to adjust its distribution positions. The more often $u_r$ occurs in the $m$-th data interval, the longer its data length, and the more regular its distribution positions, the greater its contribution to the compression rate of the $m$-th data interval when compressed in the ideal state, and the greater the necessity of calculating its entropy reduction function.
The distribution characteristics of all repeated data in the $m$-th data interval are calculated in the above manner.
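The calculation of the distribution characteristic can be sketched as follows. This is a minimal sketch assuming byte-string data, non-overlapping occurrences, base-2 logarithms, and the literal reading of the text in which the distribution characteristic is the natural constant raised to the product of the weight and the entropy summation. The function names are hypothetical.

```python
import math
from collections import Counter


def occurrence_positions(interval: bytes, item: bytes) -> list:
    """Start positions of the non-overlapping occurrences of `item`."""
    positions = []
    i = interval.find(item)
    while i != -1:
        positions.append(i)
        i = interval.find(item, i + len(item))
    return positions


def distance_entropy_sum(positions: list) -> float:
    """Sum over adjacent occurrences of -p(d) * log2 p(d), d being the gap."""
    distances = [b - a for a, b in zip(positions, positions[1:])]
    if not distances:
        return 0.0
    counts = Counter(distances)
    total = len(distances)
    return -sum((counts[d] / total) * math.log2(counts[d] / total) for d in distances)


def distribution_feature(interval: bytes, item: bytes) -> float:
    """exp(weight * entropy sum), weight = len(item) * occurrences / len(interval)."""
    positions = occurrence_positions(interval, item)
    weight = len(item) * len(positions) / len(interval)
    return math.exp(weight * distance_entropy_sum(positions))


print(distribution_feature(b"abXabXabX", b"ab"))   # evenly spaced occurrences
print(distribution_feature(b"ababXXXXab", b"ab"))  # unevenly spaced occurrences
```

Evenly spaced occurrences give a zero entropy sum and therefore a characteristic of 1, the smallest possible value under this reading; uneven spacing and a larger weight both raise it.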
2. The reducible entropy data is filtered.
And taking the average value of all the distribution characteristics as a screening threshold value, and screening out the repeated data corresponding to the distribution characteristics larger than the screening threshold value as the reducible entropy data.
The distribution characteristic of each repeated data is compared with the screening threshold, and the repeated data whose distribution characteristics are larger than the screening threshold are retained. For repeated data whose distribution characteristics are below the screening threshold, processing with the entropy reduction model would require too much subsequent calculation relative to its contribution to the compression rate of the $m$-th data interval, so entropy reduction model processing is considered unnecessary for it.
At this point, all repeated data in the $m$-th data interval that need to take part in entropy reduction model processing have been screened out.
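The screening step amounts to a comparison against the mean. A minimal sketch, assuming the distribution characteristics have already been computed and are held in a dictionary keyed by the repeated data (the function name and the example values are hypothetical):

```python
def screen_reducible(features: dict) -> list:
    """Return the keys whose distribution characteristic exceeds the mean."""
    if not features:
        return []
    threshold = sum(features.values()) / len(features)
    return [item for item, value in features.items() if value > threshold]


# Hypothetical characteristics for three repeated data items.
print(screen_reducible({b"ab": 1.0, b"cd": 3.0, b"ef": 2.0}))
```

Items exactly at the threshold are not retained, matching the "larger than" wording of the text.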
3. And establishing an ideal compression model of the reducible entropy data.
Simulated compression is performed on all data in the data interval in which each piece of reducible entropy data is located, taking the length of a compression interval as the length of the sliding compression window. During the simulated compression, when reducible entropy data that cannot be compressed is encountered, the position of the next occurrence of the repeated data is adjusted so that the reducible entropy data can just be compressed, and the whole data interval is traversed to obtain the corresponding ideal compression model.
After all the repeated data of the $m$-th data interval that need to take part in entropy reduction model processing have been screened out as the reducible entropy data, an ideal compression model is established for the reducible entropy data using the partitioned data and the length of the sliding compression window, such that when the ideal compression model is compressed with the sliding compression window, all reducible entropy data can be compressed. An entropy reduction function of the reducible entropy data is then established according to the actual data distribution of the $m$-th data interval to adjust the distribution positions of the reducible entropy data, so that under the action of the entropy reduction function the distribution of all data in the $m$-th data interval is as close as possible to the ideal compression model.
First, an ideal compression model is established from the arrangement of the data in the $m$-th data interval, the length of the sliding compression window, and the reducible entropy data. In the ideal compression model, the maximum amount of reducible entropy data is compressed at every step as the sliding dictionary slides. It is established as follows:
First, all data in the $m$-th data interval are subjected to simulated compression using the compression interval length $A$ as the length of the sliding compression window. During the simulated compression, whenever reducible entropy data that cannot be compressed is encountered, the position of the next occurrence of that reducible entropy data is adjusted so that it can just be compressed.
For example, suppose that in a certain segment of data in a data interval, one of the reducible entropy data cannot be compressed at its next occurrence during simulated compression. The position of that next occurrence is then adjusted so that it can just be compressed, and the adjusted arrangement forms the ideal compression model, in which this reducible entropy data can just be compressed.
All data in the $m$-th data interval are subjected to simulated compression in the above manner to obtain the ideal compression model of the $m$-th data interval. The ideal compression model contains exactly the same data as the $m$-th data interval; only the arrangement positions differ.
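The idea of the simulated compression can be sketched at the level of occurrence positions only. This is a simplification and an assumption: an occurrence is treated as compressible when it starts no more than one window length after the previous occurrence, and an incompressible occurrence is pulled forward until it just satisfies that condition. The sketch does not rearrange the surrounding data, which the full method must also account for. The function name is hypothetical.

```python
def ideal_positions(positions: list, window: int) -> list:
    """Ideal start positions for one reducible entropy data item.

    `positions` are the actual start positions in ascending order.
    ASSUMPTION: an occurrence can be compressed when it starts at most
    `window` positions after the previous occurrence; otherwise it is
    moved forward so that it starts exactly `window` positions later.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    ideal = []
    for pos in positions:
        if ideal and pos - ideal[-1] > window:
            pos = ideal[-1] + window  # move the occurrence so it is just compressible
        ideal.append(pos)
    return ideal


print(ideal_positions([0, 5, 30], window=10))
```

In the printed example the third occurrence is too far from the second and is pulled forward, while the first two are left where they are.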
The method comprises the steps of judging whether the repeated data is necessary to participate in an entropy reduction model or not by calculating the distribution characteristics of each repeated data, and adjusting the position of the repeated data with larger distribution characteristics by calculating an entropy reduction model function so that the repeated data are positioned in the same sliding window when the sliding compression window slides as much as possible to improve the compression efficiency.
Step S004, acquiring corresponding position adjustment parameters based on the difference between the arrangement position of the reducible entropy data and the ideal compression model; acquiring corresponding direction adjustment parameters based on the positive and negative of the accumulated value of the difference; and adjusting the reducible entropy data by using the corresponding position adjustment parameters and direction adjustment parameters according to different arrangement sequences, and selecting the arrangement sequence which is most similar to the ideal compression model after adjustment as the corresponding sequence adjustment parameters.
In the process of position adjustment, the position adjustment amount, the adjustment direction and the adjustment sequence need to be determined. Because the screened repeated data are a plurality of data, the overall influence effect is different due to different adjustment sequences, the regulation of the adjustment amount and direction of all screened positions is carried out through the difference value of the positions, and then the self-adaption of the entropy reduction function is carried out according to the similarity of different adjustment sequences.
The method comprises the following specific steps:
1. The position adjustment parameters of the reducible entropy data are acquired.
For each piece of reducible entropy data, the average of the differences between its arrangement positions in the data interval and its arrangement positions in the ideal compression model is calculated. Taking the $r$-th repeated data $u_r$ in the $m$-th data interval as an example, the average position difference value $\bar{\Delta}$ is the mean, over all $S$ occurrences, of the difference between $x_k$ and $x'_k$,
wherein $x_k$ is the position of the $k$-th occurrence of the repeated data $u_r$ in the $m$-th data interval, $k \in [1, S]$, where $S$ is the number of all occurrences of $u_r$, and $x'_k$ is the position of the $k$-th occurrence of $u_r$ in the ideal compression model corresponding to the $m$-th data interval.
The average position difference value $\bar{\Delta}$ between each occurrence of the screened repeated data in the $m$-th data interval and the same occurrence of the corresponding repeated data in the ideal compression model of the $m$-th data interval is used as the position adjustment parameter in the entropy reduction function.
The average difference value represents the overall trend of the position differences between all screened repeated data and the ideal compression model; that is, apart from a small part of the data, the position differences between the screened repeated data in the $m$-th data interval and the corresponding data in the ideal compression model fluctuate around this average. When the average difference value $\bar{\Delta}$ is used as the position adjustment parameter, the amount of calculation is smaller and the position adjustment is more accurate.
2. And acquiring direction adjustment parameters of the reducible entropy data.
Because the position of the $k$-th occurrence of the repeated data $u_r$ in the $m$-th data interval differs from the position of its $k$-th occurrence in the ideal compression model, the calculated difference values may be positive or negative. The difference values are accumulated to obtain their sum, and the sign of the sum is then judged. If the sign is positive, most occurrences of the repeated data in the ideal compression model lie ahead of the corresponding occurrences in the $m$-th data interval, so when the data interval is adjusted with the entropy reduction function, the data should be moved forward to bring the interval closer to the ideal compression model. Conversely, if the sign is negative, most occurrences of the repeated data in the ideal compression model lie behind the corresponding occurrences in the $m$-th data interval, and the data should be moved backward to bring the interval closer to the ideal compression model.
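The position and direction adjustment parameters can be sketched together. This is a minimal sketch with two assumptions: the difference is taken as the actual position minus the ideal position, so that a positive sum means the ideal occurrences lie ahead, and the adjustment amount is taken as the mean absolute difference (the text only speaks of the average of the differences). The function name and the direction labels are hypothetical.

```python
def position_and_direction(actual: list, ideal: list) -> tuple:
    """Return (adjustment amount, direction) for one reducible entropy data item.

    `actual[k]` and `ideal[k]` are the positions of the k-th occurrence in
    the data interval and in the ideal compression model respectively.
    """
    if len(actual) != len(ideal) or not actual:
        raise ValueError("position lists must be non-empty and of equal length")
    differences = [a - i for a, i in zip(actual, ideal)]
    total = sum(differences)
    # ASSUMPTION: the amount is the mean absolute difference.
    amount = sum(abs(d) for d in differences) / len(differences)
    if total > 0:
        direction = "forward"   # ideal occurrences lie ahead: move data forward
    elif total < 0:
        direction = "backward"  # ideal occurrences lie behind: move data backward
    else:
        direction = "none"
    return amount, direction


print(position_and_direction([0, 5, 30], [0, 5, 15]))
```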
3. And acquiring sequential adjustment parameters of the reducible entropy data.
Adjusting all the reducible entropy data according to different arrangement sequences by using the position adjustment parameters and the direction adjustment parameters, and then calculating the adjusted second orderAnd selecting the corresponding adjustment sequence with the highest similarity as a sequence adjustment parameter according to the similarity of the data intervals and the corresponding ideal compression models.
The method specifically comprises the following steps: firstly, randomly determining the adjustment sequence of a group of screened repeated data; then, the first step is carried out according to the position adjustment parameter and the direction adjustment parameter in the sequenceAdjusting the data distribution position of each data interval; calculating the adjusted second in the sequence after the adjustment is completedThe data structure of each data interval, namely the arrangement mode of all the data in the whole is similar to the structure of an ideal compression model. The higher the structural similarity is, the more the adjustment is performed in that orderThe data structure of the data interval is closer to an ideal compression model, i.e. the compression efficiency is higher. And selecting the arrangement sequence with the highest structural similarity as a sequence adjustment parameter.
It should be noted that the structural similarity is computed with an existing technique, namely the probability that the data at the same position are identical.
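The order selection described in step 3 can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: the structural similarity is taken as the fraction of positions holding identical data, every permutation is tried exhaustively (practical only for a handful of items), and `move_to_front` is a hypothetical stand-in for the real adjustment driven by the position and direction adjustment parameters.

```python
from itertools import permutations


def structural_similarity(a, b):
    """Fraction of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        return 1.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def move_to_front(seq, item):
    """Hypothetical adjustment: move the first occurrence of item to the front."""
    out = list(seq)
    out.remove(item)
    return [item] + out


def best_adjustment_order(interval, ideal, items, adjust):
    """Try each adjustment order and keep the one most similar to the ideal model."""
    best_order, best_score = None, -1.0
    for order in permutations(items):
        adjusted = list(interval)
        for item in order:
            adjusted = adjust(adjusted, item)  # apply the adjustments in this order
        score = structural_similarity(adjusted, ideal)
        if score > best_score:
            best_order, best_score = order, score
    return best_order, best_score
```

For example, with `interval = list("cab")`, `ideal = list("abc")` and `items = ["a", "b"]`, adjusting `"b"` before `"a"` reproduces the ideal arrangement exactly, while the opposite order does not, so the order itself carries information that must be stored.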
Step S005: performing distribution adjustment on the reducible entropy data by using the position adjustment parameter, the direction adjustment parameter and the sequence adjustment parameter to obtain an adjusted entropy reduction model, and compressing the entropy reduction model.
The specific steps are as follows:
The data of each data interval are distribution-adjusted using the entropy reduction function corresponding to that interval. Specifically, following the sequence adjustment parameter of each data interval, the screened repeated data are adjusted in turn using the position adjustment parameter and the direction adjustment parameter, which changes the distribution characteristics of the whole interval. The adjusted data of each data interval, now in a different arrangement, constitute the entropy reduction model of that interval. The entropy reduction function of each data interval is stored as a header file, and the header file and the entropy reduction model of each data interval are compressed using GZip.
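A minimal sketch of this final packaging step, using Python's standard `gzip` module. It assumes the entropy reduction function of a data interval can be serialized as a small dictionary of adjustment parameters; the field names, the JSON encoding and the 4-byte length prefix are illustrative assumptions and are not taken from the patent.

```python
import gzip
import json
import struct


def pack_interval(header, model):
    """GZip-compress a header (adjustment parameters) together with the model bytes."""
    header_bytes = json.dumps(header, sort_keys=True).encode("utf-8")
    # Length prefix (big-endian unsigned 32-bit) so the header can be split off later.
    raw = struct.pack(">I", len(header_bytes)) + header_bytes + model
    return gzip.compress(raw)


def unpack_interval(blob):
    """Inverse of pack_interval: return (header, model)."""
    raw = gzip.decompress(blob)
    (header_len,) = struct.unpack(">I", raw[:4])
    header = json.loads(raw[4:4 + header_len].decode("utf-8"))
    return header, raw[4 + header_len:]
```

Keeping the parameters in the header is what makes the rearrangement reversible: the decompressor needs them to restore the original ordering of the interval after GZip decoding.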
In summary, the embodiment of the present invention collects the data to be compressed of the ERP management system; divides the data to be compressed into a plurality of compression intervals, and obtains the corresponding average repeatability from the information entropy of the repeated data of different lengths in each compression interval; re-partitions all data based on the average repeatability of all data in a plurality of consecutive compression intervals to obtain a plurality of data intervals; acquires the distribution characteristics of the repeated data in each data interval, takes the average value of all the distribution characteristics as a screening threshold, and screens out the repeated data whose distribution characteristics are greater than the screening threshold as the reducible entropy data; establishes an ideal compression model of the reducible entropy data; acquires the corresponding position adjustment parameters based on the difference between the arrangement position of the reducible entropy data and the ideal compression model; acquires the corresponding direction adjustment parameters based on the sign of the accumulated value of the difference; adjusts the reducible entropy data in different arrangement orders using the corresponding position adjustment parameters and direction adjustment parameters, and selects the arrangement order that is most similar to the ideal compression model after adjustment as the corresponding sequence adjustment parameter; and performs distribution adjustment on the reducible entropy data using the position adjustment parameter, the direction adjustment parameter and the sequence adjustment parameter to obtain an adjusted entropy reduction model, and compresses the entropy reduction model.
The embodiment of the invention can improve the compression efficiency, reduce the retrieval time and realize the high-efficiency compression of the ERP management system data.
It should be noted that the order of the above embodiments of the present invention is for description only and does not indicate the relative merits of the embodiments. Specific embodiments have been described above. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The embodiments in the present specification are described in a progressive manner; the same or similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments.
The above-mentioned embodiments are only used to illustrate the technical solutions of the present application, and not to limit them; modifications of the technical solutions described in the foregoing embodiments, or equivalent replacements of some of their technical features, that do not depart from the spirit of the technical solutions of the embodiments of the present application all fall within the scope of protection of the present application.
Claims (6)
1. A data compression method of an ERP management system is characterized by comprising the following steps:
collecting data to be compressed of an ERP management system;
performing interval division on the data to be compressed to obtain a plurality of compression intervals, and obtaining corresponding average repeatability according to the information entropy of repeated data with different lengths in each compression interval; re-partitioning all data based on the average repeatability of all data of a plurality of consecutive compression intervals to obtain a plurality of data intervals;
acquiring the distribution characteristics of the repeated data in each data interval, taking the average value of all the distribution characteristics as a screening threshold value, and screening out the repeated data corresponding to the distribution characteristics larger than the screening threshold value as the reducible entropy data; establishing an ideal compression model of the reducible entropy data;
acquiring corresponding position adjustment parameters based on the difference between the arrangement position of the reducible entropy data and the ideal compression model; acquiring corresponding direction adjustment parameters based on the positive and negative of the accumulated value of the difference; adjusting the reducible entropy data by using corresponding position adjustment parameters and direction adjustment parameters according to different arrangement sequences, and selecting the arrangement sequence which is most similar to the ideal compression model after adjustment as the corresponding sequence adjustment parameters;
and performing distribution adjustment on the reducible entropy data by using the position adjustment parameter, the direction adjustment parameter and the sequence adjustment parameter to obtain an adjusted entropy reduction model, and compressing the entropy reduction model.
2. The data compression method of an ERP management system according to claim 1, wherein the interval division of the data to be compressed includes:
dividing all the data to be compressed into a plurality of compression intervals by taking the length of the compression window used in LZ77 coding compression as the interval division unit.
3. The data compression method for the ERP management system according to claim 1, wherein the average repeatability obtaining method is as follows:
and calculating the information entropy of the repeated data with each length in the compression interval, and acquiring the average repeatability of the corresponding compression interval based on the information entropy corresponding to all different lengths and the length of the compression interval.
4. The data compression method of an ERP management system according to claim 1, wherein the repartitioning all the data based on the average repeatability of all the data of a plurality of consecutive compression intervals comprises:
recording the average repeatability of the first compression interval; obtaining and recording the average repeatability of all data of the first compression interval and the second compression interval; if the comparison condition between the two recorded values is satisfied, continuing to calculate the average repeatability of all data of the first, second and third compression intervals, and so on, until the condition is no longer satisfied for a given number of consecutive compression intervals, whereupon all data of the preceding j consecutive compression intervals are used as a first data interval;
and then calculating the average repeatability again starting from the (j+1)-th compression interval, until all data intervals of the data to be compressed are obtained.
5. The data compression method for the ERP management system according to claim 1, wherein the method for obtaining the distribution characteristics is:
for any data interval, calculating, for any repeated data in the data interval, the distance at each of its occurrences, calculating the information entropy of the distances, and acquiring the summation result of the corresponding information entropy over all occurrences of the repeated data; and calculating the proportion of the data length of the repeated data in the total length of the data interval as the weight of the summation result, taking the obtained product as a characteristic index, using the characteristic index as the exponent of a preset value, and taking the resulting exponential function value as the distribution characteristic.
6. The data compression method of the ERP management system as claimed in claim 1, wherein the process of establishing the ideal compression model of the reducible entropy data is:
performing simulated compression on all data in the data interval in which each piece of reducible entropy data is located, using the length of a compression interval as the length of the sliding compression window; during the simulated compression, whenever reducible entropy data that cannot be compressed is encountered, adjusting the position of its next occurrence so that the reducible entropy data can just be compressed; and traversing the whole data interval to obtain the corresponding ideal compression model.
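The distribution characteristic of claim 5 can be sketched as follows. Several details are assumptions made only for illustration: the repeated data is treated as a substring, its occurrences are located without overlap, the distance is the gap between the start positions of successive occurrences, the entropy is Shannon entropy in bits over the observed distances, the weight is the length of the repeated data divided by the length of the interval, and the preset value is taken to be e. All function names are hypothetical.

```python
import math
from collections import Counter


def occurrence_positions(interval, pattern):
    """Start positions of non-overlapping occurrences of pattern in interval."""
    positions = []
    start = interval.find(pattern)
    while start != -1:
        positions.append(start)
        start = interval.find(pattern, start + len(pattern))
    return positions


def shannon_entropy(values):
    """Shannon entropy (bits) of the empirical distribution of values."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def distribution_characteristic(interval, pattern, base=math.e):
    """base ** (weight * entropy of the distances between occurrences)."""
    if not interval:
        raise ValueError("interval must not be empty")
    pos = occurrence_positions(interval, pattern)
    distances = [b - a for a, b in zip(pos, pos[1:])]
    weight = len(pattern) / len(interval)  # share of the interval length
    return base ** (weight * shannon_entropy(distances))
```

Under these assumptions, evenly spaced repeats give an entropy of zero and a characteristic of exactly 1, while irregular spacing raises the value, which is consistent with screening for values above the mean.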
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211206424.4A CN115269940B (en) | 2022-09-30 | 2022-09-30 | Data compression method of ERP management system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115269940A true CN115269940A (en) | 2022-11-01 |
CN115269940B CN115269940B (en) | 2022-12-13 |
Family
ID=83757927
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211206424.4A Active CN115269940B (en) | 2022-09-30 | 2022-09-30 | Data compression method of ERP management system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115269940B (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102609491A (en) * | 2012-01-20 | 2012-07-25 | 东华大学 | Column-storage oriented area-level data compression method |
WO2022126902A1 (en) * | 2020-12-18 | 2022-06-23 | 平安科技(深圳)有限公司 | Model compression method and apparatus, electronic device, and medium |
CN114244373A (en) * | 2022-02-24 | 2022-03-25 | 麒麟软件有限公司 | LZ series compression algorithm coding and decoding speed optimization method |
CN114956290A (en) * | 2022-07-27 | 2022-08-30 | 江苏赛沐思环保科技有限公司 | LZ 77-coding-based intelligent treatment method for industrial wastewater |
Non-Patent Citations (1)
Title |
---|
Tang Hong: "Unequal error-correction coding for LZ77-compressed data", Journal of Sichuan University (Engineering Science Edition) * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116828070A (en) * | 2023-08-28 | 2023-09-29 | 无锡市锡容电力电器有限公司 | Intelligent power grid data optimization transmission method |
CN116828070B (en) * | 2023-08-28 | 2023-11-07 | 无锡市锡容电力电器有限公司 | Intelligent power grid data optimization transmission method |
Also Published As
Publication number | Publication date |
---|---|
CN115269940B (en) | 2022-12-13 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||