CN117220686B - Parasitic parameter compression and extraction system and method

Info

Publication number: CN117220686B
Application number: CN202311206468.1A
Authority: CN
Inventors: 孙延辉; 陈瑞; 韦欣; 马胜军; 袁鹏飞; 李世密
Original assignee: Qingdao Zhencheng Technology Co ltd
Current assignee: Qingdao Zhencheng Technology Co ltd
Priority date: 2023-09-18
Filing date: 2023-09-18
Publication date: 2024-02-23
Anticipated expiration: 2043-09-18
Also published as: CN117220686A

Abstract

The application relates to the technical field of data compression and extraction, and in particular to a parasitic parameter compression and extraction system and method. The system comprises: an acquisition module for acquiring the interaction data generated during parasitic parameter extraction and preprocessing the interaction data; a compression module for compressing the preprocessed interaction data based on the LZMA algorithm and storing the compressed interaction data in a memory; and an extraction module for decompressing the interaction data and extracting preset interaction data, and further for calculating a usage evaluation degree of a second CPU before extracting the preset interaction data and determining feature extraction data in the preset interaction data according to the usage evaluation degree. The invention addresses the technical problem that existing extraction systems must read frequently between the hard disk and the memory when extracting compressed data, which wastes resources and time and results in low efficiency.

Description

Parasitic parameter compression and extraction system and method

Technical Field

The present application relates to the technical field of data compression and extraction, and in particular to a parasitic parameter compression and extraction system and method.

Background

A parasitic parameter extraction tool produces a large amount of interaction data during extraction. These data are usually either very large or fragmented. Because the parasitic data cannot be held entirely in the memory, part of it is stored on the hard disk and part in the memory, and every read passes from the hard disk to the memory and then to the CPU, so data is moved between the hard disk and the memory frequently. After the calculation finishes, the CPU writes the result back to the memory, and the result is finally stored on the hard disk. Such an extraction system wastes resources and time and is inefficient.

Disclosure of Invention

To solve the above technical problem, the application provides a parasitic parameter compression and extraction system and method, which address the problem that conventional extraction systems must read frequently between the hard disk and the memory when extracting compressed data, wasting resources and time and lowering efficiency.

In some embodiments of the present application, preprocessing is performed on the interaction data, where the preprocessing includes determining a data partition and a partition amount of the interaction data according to a data bit length of the interaction data, calculating an average data heat value in the data partition, storing the interaction data in the data partition to a corresponding storage space in a memory according to the average data heat value, and dividing the interaction data into a plurality of data partitions, so that execution efficiency of storing the data in the memory is improved, and storage space of the memory is saved.

In some embodiments of the present application, the interactive data is decompressed and the required preset interactive data is extracted, before the preset interactive data is extracted, the usage evaluation degree of the second CPU is calculated according to the usage condition and the extraction condition of the second CPU, the feature extraction data in the preset interactive data is determined according to the usage evaluation degree, and under the condition that the performance of the second CPU and other instruction operations are not affected, important content in the required preset interactive data is extracted, so that the usage efficiency and the extraction efficiency of the CPU are improved.

In some embodiments of the present application, a system and a method for extracting parasitic parameters by compression are provided, the system includes:

the acquisition module is used for acquiring the interaction data in the parasitic parameter extraction process and preprocessing the interaction data;

the compression module is used for compressing the preprocessed interaction data based on an LZMA algorithm and storing the compressed interaction data in a memory;

the extraction module is used for decompressing the interaction data and extracting preset interaction data, and is also used for calculating the use evaluation degree of the second CPU before extracting the preset interaction data, and determining feature extraction data in the preset interaction data according to the use evaluation degree.

In some embodiments of the present application, the extraction module includes:

the first extraction sub-module comprises a first CPU, wherein the first CPU is used for extracting and decompressing the compressed interaction data and setting an index identifier for the decompressed interaction data;

the second extraction sub-module comprises a second CPU, and the second CPU is used for extracting corresponding interaction data in the first extraction sub-module according to the index identifier;

and the computing sub-module is used for acquiring the extraction condition and the use condition when the second CPU extracts the interactive data, and computing the use evaluation degree of the second CPU according to the extraction condition and the use condition.

In some embodiments of the present application, extracting and decompressing the compressed interaction data, setting an index identifier for the decompressed interaction data, including:

acquiring interactive data to be decompressed, extracting the interactive data in the corresponding data partition and storage space from the memory according to the data bit length and the data heat value corresponding to the interactive data to be decompressed, and performing decompression operation on the interactive data to be decompressed and the interactive data extracted from the memory to obtain a decompression result;

converting the interactive data in the decompression result into an interactive text file, wherein the interactive text file comprises a plurality of interactive text segments, and each interactive text segment is provided with an index identifier;

the interactive text file comprises a search index of a plurality of index elements, each index element is associated with an index identifier in one or more interactive text segments, one or more interactive text segments in the interactive text file can be detected by using the index element of one search index, and preset contents of the interactive data can be positioned by using the index identifier in each interactive text segment.
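The indexing scheme described above can be illustrated with a small sketch. It assumes, purely for illustration, that interactive text segments are separated by blank lines, that the index identifier is a running integer, and that the index elements are the lowercase words of each segment. The names `build_search_index` and `locate` are hypothetical and do not come from the application.

```python
from collections import defaultdict


def build_search_index(text: str):
    """Split text into segments, tag each with an index identifier, and
    map each index element (here: a word) to the identifiers it occurs in."""
    segments = {}                      # index identifier -> segment text
    search_index = defaultdict(set)    # index element -> set of identifiers
    blocks = [b.strip() for b in text.split("\n\n") if b.strip()]
    for ident, block in enumerate(blocks):
        segments[ident] = block
        for word in block.lower().split():
            search_index[word].add(ident)
    return segments, search_index


def locate(element: str, segments, search_index):
    """Return the segments associated with one index element."""
    return [segments[i] for i in sorted(search_index.get(element.lower(), ()))]


# Hypothetical sample data: three segments of parasitic values.
sample = "net A cap 1.2fF\n\nnet B res 35ohm\n\nnet A res 12ohm"
segments, search_index = build_search_index(sample)
hits = locate("res", segments, search_index)
```

One index element ("res") retrieves several segments, and each returned segment is still addressable by its own identifier, which is the two-level lookup the paragraph describes.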

In some embodiments of the present application, calculating the usage evaluation degree of the second CPU according to the extraction condition and the usage condition includes:

the extraction condition of the second CPU comprises an extraction rate K1 and an extraction integrity K2, the use condition of the second CPU comprises a second CPU occupancy rate N1 and a second CPU load value N2, and the use evaluation degree of the second CPU is calculated according to the extraction condition and the use condition;

the second CPU has a usage evaluation degree of:

[Formula image not reproduced in this text.]

wherein W is the usage evaluation degree of the second CPU; a1 is the weight coefficient of the extraction rate K1 within the extraction condition of the second CPU, a2 is the weight coefficient of the extraction integrity K2 within the extraction condition of the second CPU, and a1+a2=1; m1 is the weight coefficient of the second CPU occupancy rate N1 within the usage condition of the second CPU, m2 is the weight coefficient of the second CPU load value N2 within the usage condition of the second CPU, and m1+m2=1; q1 is the weight coefficient of the extraction condition in the usage evaluation degree, q2 is the weight coefficient of the usage condition in the usage evaluation degree, and q1+q2=1.
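The formula itself does not survive in this text, so the following is only a sketch of one form that is consistent with the weight definitions: a nested weighted sum, W = q1·(a1·K1 + a2·K2) + q2·(m1·N1 + m2·N2). That form, the function name `usage_evaluation`, and all numeric values are assumptions for illustration, not the formula of the application.

```python
import math


def usage_evaluation(k1, k2, n1, n2, a=(0.5, 0.5), m=(0.5, 0.5), q=(0.5, 0.5)):
    """Assumed form: W = q1*(a1*K1 + a2*K2) + q2*(m1*N1 + m2*N2)."""
    for pair in (a, m, q):
        # Each weight pair must sum to 1, as the text requires.
        if not math.isclose(sum(pair), 1.0):
            raise ValueError("each weight pair must sum to 1")
    extraction = a[0] * k1 + a[1] * k2   # extraction condition (K1, K2)
    usage = m[0] * n1 + m[1] * n2        # usage condition (N1, N2)
    return q[0] * extraction + q[1] * usage


# Hypothetical inputs and weights.
w = usage_evaluation(k1=0.8, k2=0.9, n1=0.6, n2=1,
                     a=(0.4, 0.6), m=(0.5, 0.5), q=(0.7, 0.3))
```

Because each weight pair sums to 1, W stays within the range spanned by its inputs, which makes a fixed set of preset thresholds meaningful.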

In some embodiments of the present application, determining feature extraction data in the preset interaction data according to the usage evaluation degree includes:

extracting preset interaction data, determining feature data contents to be extracted in the preset interaction data, and sequencing the feature data contents according to the extraction importance degree to obtain a feature extraction data set;

each feature data content in the feature extraction data set is provided with a corresponding index identifier, the number of the index identifiers is determined according to the relation between the use evaluation degree and the preset use evaluation degree, and feature extraction data is determined according to the number of the index identifiers and the corresponding feature data content.

In some embodiments of the present application, determining the number of index identifiers according to the relationship between the usage evaluation degree and the preset usage evaluation degree includes:

the extraction module is preset with a first preset usage evaluation degree, a second preset usage evaluation degree and a third preset usage evaluation degree, wherein the first preset usage evaluation degree is smaller than the second preset usage evaluation degree, and the second preset usage evaluation degree is smaller than the third preset usage evaluation degree; a first preset index identifier number, a second preset index identifier number, a third preset index identifier number and a fourth preset index identifier number are also set in advance, wherein the first preset index identifier number is smaller than the second preset index identifier number, the second preset index identifier number is smaller than the third preset index identifier number, and the third preset index identifier number is smaller than the fourth preset index identifier number;

when the usage evaluation degree is smaller than the first preset usage evaluation degree, selecting the first preset index identifier number as the current index identifier number;

when the usage evaluation degree is between the first preset usage evaluation degree and the second preset usage evaluation degree, selecting the second preset index identifier number as the current index identifier number;

when the usage evaluation degree is between the second preset usage evaluation degree and the third preset usage evaluation degree, selecting the third preset index identifier number as the current index identifier number;

and when the usage evaluation degree is larger than the third preset usage evaluation degree, selecting the fourth preset index identifier number as the current index identifier number.
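The four-band selection above amounts to a lookup against three ascending thresholds. A minimal sketch follows; the threshold and count values are hypothetical, and the text does not say which band a value exactly equal to a threshold belongs to, so this sketch assumes it falls into the upper band.

```python
from bisect import bisect_right


def select_by_thresholds(value, thresholds, choices):
    """Pick choices[i], where i is the number of thresholds <= value."""
    if len(choices) != len(thresholds) + 1:
        raise ValueError("need exactly one more choice than thresholds")
    if list(thresholds) != sorted(thresholds):
        raise ValueError("thresholds must be ascending")
    return choices[bisect_right(thresholds, value)]


W_THRESHOLDS = (0.3, 0.6, 0.8)   # hypothetical preset usage evaluation degrees
ID_COUNTS = (2, 4, 6, 8)         # hypothetical preset index identifier numbers

count = select_by_thresholds(0.7, W_THRESHOLDS, ID_COUNTS)
```

A higher usage evaluation degree selects a larger number of index identifiers, so more feature data contents are extracted when the second CPU has headroom.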

In some embodiments of the present application, preprocessing the interaction data includes:

the preprocessing comprises the steps of obtaining the data bit length of the interactive data, and determining the data partition and partition quantity of the interactive data according to the data bit length of the interactive data;

the method comprises the steps of presetting a first preset data bit length, a second preset data bit length and a third preset data bit length, wherein the first preset data bit length is smaller than the second preset data bit length, and the second preset data bit length is smaller than the third preset data bit length; the method comprises the steps of setting a first preset partition amount, a second preset partition amount, a third preset partition amount and a fourth preset partition amount in advance, wherein the first preset partition amount is smaller than the second preset partition amount, the second preset partition amount is smaller than the third preset partition amount, and the third preset partition amount is smaller than the fourth preset partition amount;

when the data bit length is smaller than a first preset data bit length, selecting the first preset partition amount as a current partition amount;

when the data bit length is between a first preset data bit length and a second preset data bit length, selecting the second preset partition amount as a current partition amount;

when the data bit length is between the second preset data bit length and the third preset data bit length, selecting the third preset partition amount as the current partition amount;

and when the data bit length is longer than the third preset data bit length, selecting the fourth preset partition amount as the current partition amount.
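As a sketch of the partitioning step, the following selects a partition amount from the data bit length and then splits the data accordingly. The preset values are hypothetical, and splitting into near-equal contiguous slices is an assumption, since the text does not state how data is distributed among the partitions.

```python
from bisect import bisect_right

BIT_LENGTH_THRESHOLDS = (8_000, 80_000, 800_000)  # hypothetical presets, in bits
PARTITION_AMOUNTS = (2, 4, 8, 16)                 # hypothetical presets


def partition_amount(bit_length: int) -> int:
    """Map a data bit length to one of the four preset partition amounts."""
    return PARTITION_AMOUNTS[bisect_right(BIT_LENGTH_THRESHOLDS, bit_length)]


def split_into_partitions(data: bytes):
    """Split data into the selected number of near-equal contiguous partitions."""
    amount = partition_amount(len(data) * 8)
    size, extra = divmod(len(data), amount)
    parts, start = [], 0
    for i in range(amount):
        # The first `extra` partitions take one additional byte each.
        end = start + size + (1 if i < extra else 0)
        parts.append(data[start:end])
        start = end
    return parts


payload = bytes(range(256)) * 10          # 2560 bytes = 20480 bits
parts = split_into_partitions(payload)
```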

In some embodiments of the present application, calculating an average data heat value in the data partition, sequentially arranging the average data heat value, storing the data partition with the average data heat value greater than a preset data heat value threshold into a first storage space, and storing the data partition with the average data heat value less than the preset data heat value threshold into a second storage space, wherein the first storage space and the second storage space are arranged on a memory;

based on an LZMA algorithm, compressing the interactive data in the first storage space and the second storage space according to a preset compression algorithm, comparing the compressed interactive data with corresponding interactive data in the first storage space and the second storage space, and if the comparison is successful, storing the compressed interactive data in a third storage space.
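The placement and compression steps can be sketched with Python's standard `lzma` module. Two assumptions are made here: partitions whose average heat value equals the threshold go to the second storage space (the text leaves that case open), and the "comparison" is interpreted as a round-trip check that the compressed data decompresses back to the original. The function names and sample data are hypothetical.

```python
import lzma


def place_by_heat(partitions, heat_threshold):
    """partitions: list of (data, heat_values). Returns (first_space, second_space)."""
    first_space, second_space = [], []
    # Arrange partitions by average heat value, hottest first.
    ranked = sorted(partitions, key=lambda p: sum(p[1]) / len(p[1]), reverse=True)
    for data, heats in ranked:
        average = sum(heats) / len(heats)
        (first_space if average > heat_threshold else second_space).append(data)
    return first_space, second_space


def compress_verified(chunks):
    """LZMA-compress each chunk; keep it only if it decompresses back exactly."""
    third_space = []
    for chunk in chunks:
        packed = lzma.compress(chunk)
        if lzma.decompress(packed) == chunk:   # assumed meaning of "comparison"
            third_space.append(packed)
    return third_space


partitions = [(b"C9 3 4 " * 200, [1, 2]), (b"R1 10 20 " * 200, [5, 7])]
first, second = place_by_heat(partitions, heat_threshold=3.0)
third = compress_verified(first + second)
```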

In some embodiments of the present application, before storing the compressed interaction data in the third storage space, the method includes:

calculating the storage duty ratio of the current interaction data in real time, and determining whether the third storage space meets the storage condition according to the storage duty ratio;

if the storage condition is met, storing the current interaction data into the third storage space;

if the storage condition is not met, determining a supplementary space according to the storage duty ratio, and comparing a second empty storage duty ratio of the second storage space with the supplementary space; if the second empty storage duty ratio is larger than the supplementary space, taking the second empty storage duty ratio as a first external storage space of the third storage space; if the second empty storage duty ratio is smaller than the supplementary space, calculating a difference space between the supplementary space and the second empty storage duty ratio, and, based on the difference space, taking a first empty storage duty ratio of the first storage space as a second external storage space of the third storage space;

the storage condition is that the storage duty ratio available in the third storage space is larger than the storage duty ratio of the current interaction data, so that the storage quality of the current interaction data can be ensured.
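The fallback logic above can be sketched as follows. The text does not state how the supplementary space is derived from the storage duty ratio, so this sketch assumes it is the shortfall of the third storage space; the function name `plan_storage` and the unit-free numbers are hypothetical.

```python
def plan_storage(required, third_free, second_free, first_free):
    """Return how much of each storage space to use (same units as the inputs)."""
    if third_free > required:                      # storage condition met
        return {"third": required, "second": 0, "first": 0}
    supplement = required - third_free             # assumed supplementary space
    if second_free > supplement:
        # Free part of the second space serves as the first external space.
        return {"third": third_free, "second": supplement, "first": 0}
    difference = supplement - second_free          # difference space
    if first_free < difference:
        raise MemoryError("not enough free space across all three spaces")
    # Free part of the first space serves as the second external space.
    return {"third": third_free, "second": second_free, "first": difference}


plan = plan_storage(required=50, third_free=30, second_free=10, first_free=40)
```

In this example the third space covers 30 units, the second space contributes its 10 free units, and the remaining 10 units are borrowed from the first space.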

In some embodiments of the present application, a parasitic parameter compression extraction method is further included:

acquiring interaction data in a parasitic parameter extraction process, and preprocessing the interaction data;

based on an LZMA algorithm, compressing the preprocessed interactive data, and storing the compressed interactive data in a memory;

decompressing the interaction data and extracting preset interaction data, and calculating the use evaluation degree of the second CPU before extracting the preset interaction data, and determining feature extraction data in the preset interaction data according to the use evaluation degree.

Compared with the prior art, the parasitic parameter compression and extraction system and the parasitic parameter compression and extraction method have the beneficial effects that:

preprocessing the interactive data, wherein the preprocessing comprises the steps of determining a data partition and partition quantity of the interactive data according to the data bit length of the interactive data, calculating an average data heat value in the data partition, storing the interactive data in the data partition into corresponding storage spaces in a memory according to the average data heat value, and dividing the interactive data into a plurality of data partitions, so that the execution efficiency of storing the data in the memory is improved, and the storage space of the memory is saved.

Decompressing the interaction data and extracting the required preset interaction data, calculating the use evaluation degree of the second CPU according to the use condition and the extraction condition of the second CPU before extracting the preset interaction data, determining the feature extraction data in the preset interaction data according to the use evaluation degree, and extracting important contents in the required preset interaction data under the condition that the performance and other instruction operation of the second CPU are not affected, thereby improving the use efficiency and the extraction efficiency of the CPU.

Drawings

FIG. 1 is a schematic diagram of a parasitic parameter compression extraction system in accordance with a preferred embodiment of the present application;

FIG. 2 is a schematic flow chart of a parasitic parameter compression extraction method in a preferred embodiment of the present application.

Detailed Description

The detailed description of the present application is further described in detail below with reference to the drawings and examples. The following examples are illustrative of the present application, but are not intended to limit the scope of the present application.

In the description of the present application, it should be understood that the terms "center," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like indicate orientations or positional relationships based on the orientation or positional relationships shown in the drawings, merely to facilitate description of the present application and simplify the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and therefore should not be construed as limiting the present application.

The terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more such feature. In the description of the present application, unless otherwise indicated, the meaning of "a plurality" is two or more.

In the description of the present application, it should be noted that, unless explicitly specified and limited otherwise, the terms "mounted," "connected," and "connected" are to be construed broadly, and may be either fixedly connected, detachably connected, or integrally connected, for example; can be mechanically or electrically connected; can be directly connected or indirectly connected through an intermediate medium, and can be communication between two elements. The specific meaning of the terms in this application will be understood by those of ordinary skill in the art in a specific context.

As shown in FIG. 1, a parasitic parameter compression extraction system in a preferred embodiment of the present application includes the acquisition module, the compression module and the extraction module described above.

In this embodiment, the preset interaction data is the interaction data that is required. The usage evaluation degree of the second CPU is calculated before the preset interaction data is extracted; the second CPU is the main CPU. So that other instructions running on the main CPU are not affected, the feature extraction data is limited to the data corresponding to the content of the preset interaction data that must be extracted.

In some embodiments of the present application, the extraction module includes:

In this embodiment, the data bit length is the data length of the interaction data, and the data heat value is set according to how often and in what manner the interaction data occurs. The decompressed interaction data is converted into the interactive text file by a text function, and a plurality of index elements are determined from the characteristic information of the interactive text segments in the interactive text file. Each interactive text segment carries a corresponding index identifier, which can be used to locate preset content of the interaction data. Through the index identifiers, the main CPU can fetch a specific item of the parasitic data from the first CPU for processing. The first CPU thus relieves the computational load on the main CPU, and the index identifiers further reduce the amount of data the main CPU has to process.

the second CPU has a usage evaluation degree of:

[Formula image not reproduced in this text.]

In this embodiment, K1 = V/t, where V is the extraction speed and t is the extraction duration. The second CPU occupancy rate N1 is the remaining occupancy of the second CPU apart from extracting the interaction data. The second CPU load value N2 is derived from the difference between the real-time load value and the normal load value of the second CPU: if the difference lies within the normal load difference interval, N2 is set to 1; otherwise, N2 is set to -1.
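The two derived inputs in this paragraph can be written down directly. In the sketch below, the function names are hypothetical and the normal load difference interval is assumed to be closed (endpoints count as normal), which the text does not specify.

```python
def extraction_rate(v: float, t: float) -> float:
    """K1 = V / t, as stated in the embodiment."""
    if t <= 0:
        raise ValueError("extraction duration must be positive")
    return v / t


def load_value(real_time_load, normal_load, normal_interval):
    """N2 = 1 if (real_time_load - normal_load) lies in normal_interval, else -1."""
    low, high = normal_interval
    return 1 if low <= real_time_load - normal_load <= high else -1


# Hypothetical readings.
k1 = extraction_rate(120.0, 4.0)
n2_normal = load_value(2.5, 2.0, (-1.0, 1.0))
n2_overloaded = load_value(5.0, 2.0, (-1.0, 1.0))
```

Mapping the load difference to +1 or -1 means an abnormal load actively lowers the usage evaluation degree rather than merely contributing nothing.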

In this embodiment, the feature data contents are sorted by extraction importance, and their index identifiers are sorted in the same order. Once the number of index identifiers is determined, for example 4, the feature data contents corresponding to the first four index identifiers in descending order of extraction importance are used as the feature extraction data, which greatly reduces the amount of data the second CPU must process and the CPU occupancy caused by data extraction.
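The selection described here is a top-N cut over an importance-ordered list. A minimal sketch, in which the identifiers, importance scores and content labels are all made up for illustration:

```python
def select_feature_data(features, identifier_count):
    """features: list of (index_identifier, importance, content) tuples.

    Returns the (identifier, content) pairs of the most important entries.
    """
    ranked = sorted(features, key=lambda f: f[1], reverse=True)
    return [(ident, content) for ident, _, content in ranked[:identifier_count]]


# Hypothetical feature extraction data set.
features = [
    ("id-name", 0.1, "net names"),
    ("id-cap", 0.9, "coupling capacitance"),
    ("id-meta", 0.05, "tool metadata"),
    ("id-res", 0.7, "wire resistance"),
    ("id-ind", 0.4, "inductance"),
]
selected = select_feature_data(features, identifier_count=4)
```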

In this embodiment, a higher usage evaluation degree indicates that the second CPU is currently under less operating pressure and can therefore handle a larger data processing amount.

In this embodiment, the interaction data is divided according to its data bit length. For example, if the current partition amount is 10, the interaction data is divided into 10 data partitions, each holding part of the interaction data. The partition amount is set according to the relationship between the data bit length of the interaction data and the preset data bit lengths, so that the interaction data is split, which improves the execution efficiency of storing the data in the memory and saves storage space in the memory.

In this embodiment, the memory is divided into a first storage space, a second storage space and a third storage space, and the data partitions are stored in the corresponding storage spaces in order of their average data heat value, which keeps the data ordered.

In this embodiment, by allocating the space of the first storage space, the second storage space and the third storage space, the storage space of the interactive data in the memory is ensured to be enough, and when the extraction is performed, only the transfer between the memory and the CPU is performed, so that the extraction time is greatly shortened, and the storage and extraction efficiency is improved.

In some embodiments of the present application, as shown in fig. 2, a parasitic parameter compression extraction method is further included:

step S201: acquiring interaction data in a parasitic parameter extraction process, and preprocessing the interaction data;

step S202: based on an LZMA algorithm, compressing the preprocessed interactive data, and storing the compressed interactive data in a memory;

step S203: decompressing the interaction data and extracting preset interaction data, and calculating the use evaluation degree of the second CPU before extracting the preset interaction data, and determining feature extraction data in the preset interaction data according to the use evaluation degree.

In summary, the invention discloses a system and a method for extracting parasitic parameters by compression, wherein the system comprises: the acquisition module is used for acquiring the interaction data in the parasitic parameter extraction process and preprocessing the interaction data; the compression module is used for compressing the preprocessed interaction data based on an LZMA algorithm and storing the compressed interaction data in a memory; the extraction module is used for decompressing the interaction data and extracting preset interaction data, and is also used for calculating the use evaluation degree of the second CPU before extracting the preset interaction data, and determining feature extraction data in the preset interaction data according to the use evaluation degree.

According to the first conception, the interactive data is preprocessed, the data partition and partition quantity of the interactive data are determined according to the data bit length of the interactive data, the average data heat value in the data partition is calculated, the interactive data in the data partition are stored in corresponding storage spaces in the memory according to the average data heat value, the interactive data are divided into a plurality of data partitions, the execution efficiency of storing the data in the memory is improved, and the storage space of the memory is saved.

According to the second conception, the interactive data are decompressed and the required preset interactive data are extracted, the use evaluation degree of the second CPU is calculated according to the use condition and the extraction condition of the second CPU before the preset interactive data are extracted, the feature extraction data in the preset interactive data are determined according to the use evaluation degree, and important contents in the required preset interactive data are extracted under the condition that the performance of the second CPU and other instruction operation are not affected, so that the use efficiency and the extraction efficiency of the CPU are improved.

The foregoing is merely a preferred embodiment of the present application, and it should be noted that modifications and substitutions can be made by those skilled in the art without departing from the technical principles of the present application, and these modifications and substitutions should also be considered as being within the scope of the present application.

Claims

1. A parasitic parameter compression extraction system, comprising:

the extraction module is used for decompressing the interaction data and extracting preset interaction data, and also used for calculating the use evaluation degree of the second CPU before extracting the preset interaction data, and determining feature extraction data in the preset interaction data according to the use evaluation degree;

the extraction module comprises:

the computing sub-module is used for acquiring the extraction condition and the use condition when the second CPU extracts the interaction data, and computing the use evaluation degree of the second CPU according to the extraction condition and the use condition;

extracting and decompressing the compressed interaction data and setting an index identifier for the decompressed interaction data, comprising:

the interactive text file comprises search indexes of a plurality of index elements, each index element is associated with index identifiers in one or more interactive text segments, one or more interactive text segments in the interactive text file can be detected by using the index element of one search index, and preset contents of the interactive data can be positioned by using the index identifier in each interactive text segment;

calculating the usage evaluation degree of the second CPU according to the extraction condition and the usage condition, comprising:

the second CPU has a usage evaluation degree of:

[Formula image not reproduced in this text.]

wherein W is the usage evaluation degree of the second CPU; a1 is the weight coefficient of the extraction rate K1 within the extraction condition of the second CPU, a2 is the weight coefficient of the extraction integrity K2 within the extraction condition of the second CPU, and a1+a2=1; m1 is the weight coefficient of the second CPU occupancy rate N1 within the usage condition of the second CPU, m2 is the weight coefficient of the second CPU load value N2 within the usage condition of the second CPU, and m1+m2=1; q1 is the weight coefficient of the extraction condition in the usage evaluation degree, q2 is the weight coefficient of the usage condition in the usage evaluation degree, and q1+q2=1;

preprocessing the interaction data, including:

2. The parasitic parameter compression and extraction system of claim 1, wherein determining feature extraction data in the preset interaction data according to the usage evaluation degree comprises:

3. The parasitic parameter compression and extraction system of claim 2, wherein determining the number of index identifiers according to the relationship between the usage evaluation degree and a preset usage evaluation degree comprises:

4. The parasitic parameter compression extraction system of claim 1,

calculating an average data heat value in the data partition, arranging the average data heat value according to sequence, storing the data partition with the average data heat value larger than a preset data heat value threshold into a first storage space, and storing the data partition with the average data heat value smaller than the preset data heat value threshold into a second storage space, wherein the first storage space and the second storage space are arranged on a memory;

5. The parasitic parameter compression and extraction system of claim 4, comprising, prior to storing the compressed interaction data in the third storage space:

6. A parasitic parameter compression extraction method applied to the parasitic parameter compression extraction system as claimed in any one of claims 1 to 5, comprising:

decompressing the interaction data and extracting preset interaction data, and further calculating a use evaluation degree of a second CPU before extracting the preset interaction data, and determining feature extraction data in the preset interaction data according to the use evaluation degree;

extracting and decompressing the compressed interactive data, and setting an index identifier for the decompressed interactive data;

extracting corresponding interaction data in the first extraction sub-module according to the index identifier;

acquiring the extraction condition and the use condition of the second CPU when the interactive data are extracted, and calculating the use evaluation degree of the second CPU according to the extraction condition and the use condition;

the second CPU has a usage evaluation degree of:

[Formula image not reproduced in this text.]

preprocessing the interaction data, including: