WO2022170569A1 - Data processing method and apparatus - Google Patents

Data processing method and apparatus

Info

Publication number
WO2022170569A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
data block
data
compressed
compression
Prior art date
Application number
PCT/CN2021/076545
Other languages
French (fr)
Chinese (zh)
Inventor
黄永兵 (Huang Yongbing)
黄韬 (Huang Tao)
姜伟鹏 (Jiang Weipeng)
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority to CN202180067859.2A (CN116368796A)
Priority to PCT/CN2021/076545
Publication of WO2022170569A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems

Definitions

  • the present application relates to the technical field of artificial intelligence, and in particular, to a data processing method and apparatus.
  • AI Artificial Intelligence
  • artificial intelligence is a theory, method, technology, and application system that uses digital computers or machines controlled by digital computers to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results.
  • artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and enable machines to perceive, reason and make decisions.
  • the embodiment of the present invention provides a data processing method, which can reduce the data compression time and improve the data compression performance.
  • a first aspect of the embodiments of the present invention provides a data processing method, comprising: selecting a first reference data block based on the similarity between a first data block to be compressed and each pre-stored reference data block; obtaining a first model associated with the first reference data block; compressing the first data block based on a compression model to generate a first compressed data block, the compression model being obtained according to the first model; and generating a first compressed data packet, where the first compressed data packet includes the first compressed data block and compression model information, and the compression model information is obtained according to the compression model.
  • the first model may be stored directly in its own form, or in the form of a similar model plus the difference data between that similar model and the first model; in the latter case, the first model can be obtained by processing the similar model together with the difference data.
  • the compression model is not necessarily equal to the first model: when the first model itself is used to compress the first data block, the compression model is equal to the first model; when a model obtained by optimizing or further training the first model is used to compress the first data block, the compression model is that optimized or trained model.
  • selecting the first reference data block based on the similarity between the first data block to be compressed and each pre-stored reference data block may specifically mean selecting the first reference data block according to the content similarity between the first data block to be compressed and each pre-stored reference data block.
  • the similarity between the first data block and the reference data block may be calculated based on various similarity algorithms, for example, the similarity algorithm includes a Jaccard similarity algorithm, a Cosine similarity algorithm, and the like.
  • in this way, the training time of the model is reduced, thereby reducing the data compression time.
  • in some embodiments, compressing the first data block based on the compression model to generate the first compressed data block specifically includes: predicting, based on the compression model, the occurrence probability of the unit data in the first data block; and compressing the first data block according to the occurrence probability of the unit data in the first data block to generate the first compressed data block.
  • in some embodiments, compressing the first data block based on the compression model to generate the first compressed data block, the compression model being obtained according to the first model, specifically means: using the first model as the compression model, compressing the first data block to generate the first compressed data block.
  • when the similarity between the first data block to be compressed and the first reference data block is high, for example greater than a threshold, the first model is used directly as the compression model to compress the first data block without retraining the model, which reduces the training time of the model and improves data compression performance.
  • the compression model information includes the first model or an identifier of the first model.
  • in some embodiments, compressing the first data block based on the compression model to generate the first compressed data block, the compression model being obtained according to the first model, specifically means: using a second model as the compression model, compressing the first data block to generate the first compressed data block, where the second model is obtained by training the first model using part of the data in the first data block.
  • this improves the prediction accuracy of the model; and since the first data block and the first reference data block are similar, the training process takes little time, improving data compression performance.
  • the compression model information includes any one of the following: the second model; an identifier of the second model; or an identifier of the first model together with the difference data between the second model and the first model.
  • an identifier of the second model may be included in the compressed model information.
  • the multiple reference data blocks include at least one of the following data blocks: compressed data blocks, data blocks of different data types, and data blocks of different time points in a period.
  • the compressed data block may be a compressed data block within a recent period of time, wherein the length of the time period may be flexibly determined according to the size of the storage space.
  • as a data block temporally adjacent to the data block to be compressed, a recently compressed data block may have greater similarity with the data block to be compressed.
  • data of different types includes, for example, pictures, text, and databases; data blocks at different time points in a period are data blocks at several time points within one cycle, in scenarios where the data changes periodically.
  • the number of stored reference data blocks can be flexibly determined according to the size of the storage space.
  • by selecting reference data blocks along the dimensions of time and space, it is easier to obtain reference data blocks that are more similar to the data blocks to be compressed, thereby reducing model training time.
  • the method provided in the first aspect of the present application further includes: storing the first data block as a reference data block, storing the compression model information, and storing the stored first data block in association with the stored compression model information.
  • the compression model itself can be stored as the compression model information, or the difference data between the compression model and another stored similar model, together with the identifier of that similar model, can be stored as the compression model information.
  • storing the stored first data block in association with the stored compression model information means that when the stored first data block is found, the stored compression model information can be further found; it is not limited to physically storing the first data block and the compression model information together.
  • the method provided by the first aspect of the present application is executed by a computing device, and the computing device includes a neural network processor that compresses the first data block based on the compression model to generate the first compressed data block.
  • in this way, the model-based compression process is accelerated, and the duration of data compression is reduced.
  • the method provided by the first aspect of the present application further includes: obtaining the first compressed data packet; obtaining the compression model according to the compression model information in the first compressed data packet; and decompressing the first compressed data block in the first compressed data packet by using the compression model to obtain the first data block.
  • a second aspect of the present application provides a data processing apparatus, including a processing unit and a storage unit, where the storage unit stores executable code and the processing unit is configured to execute the executable code to: select a first reference data block based on the similarity between a first data block to be compressed and each pre-stored reference data block; obtain a first model associated with the first reference data block; compress the first data block based on a compression model to generate a first compressed data block, the compression model being obtained according to the first model; and generate a first compressed data packet, where the first compressed data packet includes the first compressed data block and compression model information obtained according to the compression model.
  • in some embodiments, the processing unit is configured to execute the executable code to compress the first data block based on the compression model by: predicting the occurrence probability of the unit data in the first data block based on the compression model; and compressing the first data block according to the occurrence probability of the unit data in the first data block to generate the first compressed data block.
  • in some embodiments, the processing unit is configured to execute the executable code to compress the first data block based on the compression model, the compression model being obtained according to the first model, specifically by: using the first model as the compression model, compressing the first data block, and generating the first compressed data block.
  • in some embodiments, the processing unit is configured to execute the executable code to compress the first data block based on the compression model, the compression model being obtained according to the first model, specifically by: using a second model as the compression model, compressing the first data block, and generating the first compressed data block, where the second model is obtained by training the first model using part of the data in the first data block.
  • the processing unit is further configured to execute the executable code to: store the first data block as a reference data block, store the compression model information, and store the stored first data block in association with the stored compression model information.
  • in some embodiments, the processing unit includes a neural network processor, and the neural network processor is configured to execute the executable code to compress the first data block based on the compression model to generate the first compressed data block.
  • the processing unit is further configured to execute the executable code to: obtain the first compressed data packet; obtain the compression model according to the compression model information in the first compressed data packet; and decompress the first compressed data block in the first compressed data packet by using the compression model to obtain the first data block.
  • a third aspect of the present application provides a data processing apparatus, comprising: a selection unit, configured to select a first reference data block based on the similarity between a first data block to be compressed and each pre-stored reference data block; an acquisition unit, configured to obtain a first model associated with the first reference data block; a compression unit, configured to compress the first data block based on a compression model to generate a first compressed data block, the compression model being obtained according to the first model; and a generating unit, configured to generate a first compressed data packet, where the first compressed data packet includes the first compressed data block and compression model information obtained according to the compression model.
  • the compression unit is specifically configured to: predict the occurrence probability of the unit data in the first data block based on the compression model; according to the occurrence probability of the unit data in the first data block , compress the first data block to generate the first compressed data block.
  • the compression unit is specifically configured to: use the first model as the compression model, compress the first data block, and generate the first compressed data block.
  • the compression unit is specifically configured to: use the second model as the compression model, compress the first data block, and generate the first compressed data block, where the second model is obtained by training the first model using part of the data in the first data block.
  • the apparatus provided in the third aspect of the present application further includes: a storage unit, configured to store the first data block as a reference data block, store the compression model information, and store the stored first data block in association with the stored compression model information.
  • the apparatus provided by the third aspect of the present application includes a neural network processor; the compression unit is deployed in the neural network processor.
  • the obtaining unit is further configured to obtain the first compressed data packet, and obtain the compression model according to the compression model information in the first compressed data packet.
  • the apparatus provided by the third aspect of the present application further includes a decompression unit, configured to decompress the first compressed data block in the first compressed data packet by using the compression model to obtain the first data block.
  • a fourth aspect of the present application provides a computer-readable storage medium on which a computer program is stored, wherein when the computer program is executed in a computer or a processor, the computer or processor is caused to execute the method described in the first aspect.
  • a fifth aspect of the present application provides a computer program product, characterized in that, when the computer program product runs in a computer or a processor, the computer or processor is caused to execute the method described in the first aspect.
  • FIG. 1 is a schematic diagram of an artificial intelligence main framework.
  • FIG. 2 is a schematic diagram of a system architecture provided by an embodiment of the present invention.
  • FIG. 3 is a schematic structural diagram of an RNN model.
  • FIG. 4 is a schematic diagram of model prediction by an RNN model.
  • FIG. 5 is a hardware structure diagram of an NPU chip provided by an embodiment of the present invention.
  • FIG. 6 is an architecture diagram of a storage system provided by an embodiment of the present invention.
  • FIG. 7 is a flowchart of a data compression method provided by an embodiment of the present invention.
  • FIG. 8 is a schematic diagram of data block-model pairs pre-stored in a storage node.
  • FIG. 9 is a schematic diagram of a process of data compression based on an AI model.
  • FIG. 10 is a schematic diagram of a process of encoding a data block by a compression module.
  • FIG. 11 is a schematic diagram of a process of data decompression based on an AI model.
  • FIG. 12 is an architecture diagram of a data processing apparatus provided by an embodiment of the present invention.
  • FIG. 13 is an architecture diagram of a data processing apparatus provided by an embodiment of the present invention.
  • FIG. 14 is a schematic structural diagram of a data processing apparatus provided by an embodiment of the present invention.
  • FIG. 15 is a terminal-cloud system architecture provided by an embodiment of the present invention.
  • Figure 1 shows a schematic diagram of an artificial intelligence main frame, which describes the overall workflow of an artificial intelligence system and is suitable for general artificial intelligence field requirements.
  • the "intelligent information chain” reflects a series of processes from data acquisition to processing. For example, it can be the general process of intelligent information perception, intelligent information representation and formation, intelligent reasoning, intelligent decision-making, intelligent execution and output. In this process, data has gone through the process of "data-information-knowledge-wisdom".
  • the "IT value chain” from the underlying infrastructure of artificial intelligence, information (providing and processing technology implementation) to the industrial ecological process of the system reflects the value brought by artificial intelligence to the information technology industry.
  • the artificial intelligence main frame includes the following main components:
  • the infrastructure provides computing power support for artificial intelligence systems, realizes communication with the outside world, and provides support through the basic platform.
  • the infrastructure includes: sensors for communicating with the outside; intelligent chips used to provide computing power, such as the central processing unit (CPU), graphics processing unit (GPU), neural-network processing unit (NPU), application-specific integrated circuit (ASIC), field programmable gate array (FPGA), and other hardware acceleration chips; and the basic platform, including the distributed computing framework, the network, and other related platform guarantees and support, which may include cloud storage and computing, interconnection networks, and so on.
  • the sensor communicates with the outside to obtain data, and provides the data to the intelligent chip in the distributed computing system provided by the basic platform for calculation.
  • the data at the layer above the infrastructure represents the data sources in the field of artificial intelligence.
  • the data involves graphics, images, voice, and text, as well as IoT data from traditional devices, including business data from existing systems and sensory data such as force, displacement, liquid level, temperature, and humidity.
  • Data processing usually includes data training, machine learning, deep learning, search, reasoning, decision-making, etc.
  • machine learning and deep learning can perform symbolic and formalized intelligent information modeling, extraction, preprocessing, training, etc. on data.
  • Reasoning refers to the process of simulating human's intelligent reasoning method in a computer or intelligent system, using formalized information to carry out machine thinking and solving problems according to the reasoning control strategy, and the typical function is search and matching.
  • Decision-making refers to the process of making decisions after reasoning about intelligent information, usually providing functions such as classification, sorting, and prediction.
  • some general capabilities can be formed based on the results of the data processing; these can be algorithms or general systems, such as translation, text analysis, computer vision processing, speech recognition, image recognition, and so on.
  • intelligent products and industry applications refer to the products and applications of artificial intelligence systems in various fields; they are the encapsulation of the overall artificial intelligence solution and the productization of intelligent information decision-making and its applications. Their application areas mainly include intelligent manufacturing, intelligent transportation, smart home, smart medical care, smart security, autonomous driving, safe cities, smart terminals, and so on.
  • an embodiment of the present invention provides a system architecture 200 .
  • the data collection device 260 is used to collect AI model sample data and store it in the database 230 , and the training device 220 generates the target model/rule 201 based on the sample data maintained in the database 230 .
  • the AI model includes, for example, a neural network model.
  • the neural network model is a network structure that imitates the behavioral characteristics of animal neural networks for information processing, and is also referred to as artificial neural networks (ANN).
  • the neural network model includes, for example, at least one of a variety of neural network models such as Convolutional Neural Network (CNN), Deep Neural Networks (DNN), and Recurrent Neural Network (RNN).
  • CNN Convolutional Neural Network
  • DNN Deep Neural Networks
  • RNN Recurrent Neural Network
  • the structure of a neural network model is composed of a large number of nodes (or neurons) connected to each other, and the purpose of processing information is achieved by learning and training on the input information according to a specific operational model.
  • a neural network model includes an input layer, a hidden layer and an output layer.
  • the input layer is responsible for receiving input signals
  • the output layer is responsible for outputting the calculation results of the neural network
  • the hidden layer is responsible for learning, training and other computing processes. It is the memory unit of the network.
  • the memory function is represented by a weight matrix, usually each neuron corresponds to a weight coefficient.
  • the work of each layer in a neural network model can be expressed mathematically as y = a(W·x + b), where x is the input vector of the layer, y is the output value (or output vector) of the layer, and a, W, and b are the model parameters included in the layer.
  • the input vector of the input layer of the model is the input feature vector of the model
  • each element in the input feature vector is the feature value of the object to be predicted
  • the output value output by the output layer of the model is the predicted value of the model
  • the predicted value indicates the prediction result for the object to be predicted.
  • the work of each layer in the neural network model can be understood as completing the transformation from the input space to the output space through five operations on the input space (the set of input vectors). These five operations include: 1. dimension raising/lowering; 2. enlarging/shrinking; 3. rotation; 4. translation; and 5. "bending", where operations 1, 2, and 3 are completed by W·x, operation 4 by +b, and operation 5 by a().
  • the purpose of training the neural network model is to finally obtain the weight matrix of all layers of the trained neural network (the weight matrix formed by the vectors W of many layers). Therefore, the training process of the neural network is essentially learning the way to control the spatial transformation, and more specifically, learning the weight matrix.
  • the training device 220 can compare the predicted value of the current network with the truly desired target value, and update the weight vector of each layer of the neural network according to the difference between the two, so that the output of the model is as close as possible to the truly desired value. For example, if the predicted value of the neural network model is larger than the target value, the weight vectors are adjusted so that the predicted value decreases, and vice versa, continuing until the model can predict the truly desired target value (i.e., the ground truth or label value), thereby obtaining the target model/rule 201.
  • a loss function or an objective function can be predefined; these are important equations for measuring the difference between the predicted value and the target value. Taking the loss function as an example, the higher its output value (the loss), the greater the difference, so training the neural network model becomes the process of reducing this loss as much as possible.
  • the target model/rule 201 obtained by the training device 220 can be applied in different systems or devices.
  • the execution device 210 is configured with an I/O interface 212 for data interaction with external devices, and a “user” can input data to the I/O interface 212 through the client device 240 .
  • the execution device 210 can call data, codes, etc. in the data storage system 250 , and can also store data, instructions, etc. in the data storage system 250 .
  • the calculation module 211 uses the target model/rule 201 to process the input data to output the processing result. Finally, the I/O interface 212 returns the processing result to the client device 240, which is provided to the user.
  • the execution device 210 may also include associated function modules (the associated function module 213 and the associated function module 214 are schematically shown in FIG. 2), which may perform associated processing based on the processing result of the computing module 211 to output associated results.
  • the training device 220 can generate corresponding target models/rules 201 based on different data for different targets, so as to provide users with better results.
  • the user may manually specify input data to the execution device 210, e.g., by operating in an interface provided by the I/O interface 212.
  • the client device 240 can automatically input data to the I/O interface 212 and obtain the result; if automatically inputting data requires the user's authorization, the user can set the corresponding permission in the client device 240.
  • the user can view the result output by the execution device 210 on the client device 240, and the specific presentation form can be a specific manner such as display, sound, and action.
  • the client device 240 can also act as a data collection terminal to store the collected sample data in the database 230.
  • FIG. 2 is only a schematic diagram of a system architecture provided by an embodiment of the present invention, and the positional relationship between the devices, components, modules, etc. shown in the figure does not constitute any limitation.
  • the data storage system 250 is an external memory relative to the execution device 210; in other cases, the data storage system 250 may also be located in the execution device 210.
  • the training device 220 and the execution device 210 may be the same computing device.
  • the training device 220 is, for example, a platform server; after the target model 201 is trained, the platform server acts as the execution device 210 to provide data processing services for users.
  • the embodiments of the present invention relate to a solution for lossless data compression combined with an AI model.
  • Data lossless compression refers to a technical method that reduces the amount of original data to reduce storage space and improve the efficiency of transmission, storage and processing without losing useful information.
  • the basic idea of the data compression algorithm is to replace the repeated characters in the data with shorter coding symbols; the higher the probability of characters appearing, the shorter the coding symbols.
  • the occurrence probability of the unit data in the original data block is predicted by the AI model, so that the original data block can be compressed based on the probability output by the AI model through the entropy coding algorithm.
  • the AI model may be, for example, any one of the above-mentioned models such as the CNN model, the DNN model, the RNN model, and the like.
  • the RNN model is more suitable for processing sequence data.
  • the RNN model can predict the occurrence probability of unit data arranged later based on the unit data arranged earlier in the data block; therefore, compared with models such as the CNN model, the RNN model can make this prediction with higher accuracy.
  • an RNN model will be used below as an example for description. The RNN model has many extended variants, such as the long short-term memory (Long Short-Term Memory, LSTM) model and the gated recurrent unit (Gate Recurrent Unit, GRU) model; these extended models can also be called RNN models.
  • LSTM Long Short-Term Memory
  • GRU Gate Recurrent Unit
  • FIG. 3 is a schematic diagram of the structure of the RNN model.
  • the RNN model is also a neural network model, including an input layer, multiple hidden layers (one layer is schematically shown in FIG. 3), and an output layer, and each layer includes a plurality of neurons.
  • the neurons in the hidden layer of the RNN model are not only fully connected to the previous layer, but also include arrows pointing to themselves and arrows pointing to other neurons in this layer.
  • the arrow pointing to itself and the arrow pointing to other neurons indicate that the neuron also acquires the memory of the model about the previous input data when performing the current calculation, so as to calculate the output data based on the current input data and the previous input data.
  • FIG. 4 is a schematic diagram of model prediction of the RNN model.
  • x_{t-1}, x_t, and x_{t+1} are the model inputs at time t-1, time t, and time t+1, respectively
  • o_{t-1}, o_t, and o_{t+1} are the model outputs at time t-1, time t, and time t+1, respectively
  • S_{t-1}, S_t, and S_{t+1} are the model memories at time t-1, time t, and time t+1, respectively.
  • the model memory S_t can be calculated by the following formula (1): S_t = f(U·x_t + W·S_{t-1})    (1)
  • U and W are model parameters
  • the f() function is an activation function in the neural network, such as a tanh function.
  • the model output o_t is calculated by the following formula (2): o_t = g(V·S_t)    (2), where V is also a model parameter and g() is, for example, a softmax function.
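  • To make formulas (1) and (2) concrete, the following is a minimal NumPy sketch of one RNN time step, assuming f() is tanh and g() is softmax; the hidden size, the 256-way output (matching byte-valued unit data), and the random weights are illustrative choices, not values from the patent.

```python
import numpy as np

def rnn_step(x_t, s_prev, U, W, V):
    """One RNN time step: formula (1) then formula (2)."""
    s_t = np.tanh(U @ x_t + W @ s_prev)   # S_t = f(U*x_t + W*S_{t-1}), formula (1)
    logits = V @ s_t                      # o_t = g(V*S_t), formula (2), g = softmax
    o_t = np.exp(logits - logits.max())
    o_t /= o_t.sum()                      # normalize to a probability distribution
    return s_t, o_t

rng = np.random.default_rng(0)
U = rng.normal(scale=0.1, size=(64, 256))   # input-to-hidden parameters
W = rng.normal(scale=0.1, size=(64, 64))    # hidden-to-hidden (memory) parameters
V = rng.normal(scale=0.1, size=(256, 64))   # hidden-to-output parameters

s = np.zeros(64)                            # initial model memory
x = np.zeros(256); x[ord('a')] = 1.0        # one-hot input for the byte 'a'
s, probs = rnn_step(x, s, U, W, V)          # probs: distribution over the next byte
```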
  • when the RNN model is used to predict the occurrence probability of characters in a sentence, the model does not treat each input character in isolation; instead, the probability of the next character is predicted based on the previously entered characters.
  • for example, for the sentence "我是中国人" ("I am Chinese"), the RNN model predicts the probability that the next character is "人" ("person") based on its correlation with the preceding characters, and this occurrence probability can be expressed as P(人 | 我是中国).
  • the RNN model can be trained based on a loss function that includes the parameters U, V, and W, so that the values of U, V, and W can be adjusted based on the loss function to make the difference between the output of the RNN model (i.e., the predicted value) and the label value decrease continuously.
  • for example, the true occurrence probability of the character "人" after "我是中国" ("I am Chinese") in the text can be obtained in advance as the label value; the characters of "我是中国人" are input into the RNN model in turn, the RNN model outputs a predicted occurrence probability for the character "人", and the predicted value and the label value are substituted into the loss function to adjust the values of the parameters U, V, and W, making the model's prediction for the character "人" in "我是中国人" closer to the label value.
  • the RNN model is trained multiple times by using multiple samples to make the model converge, so that the probability of occurrence of each word in the text can be predicted by sequentially inputting each word in the text into the trained RNN model.
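  • As an illustration of this training loop, here is a hedged PyTorch sketch of training a next-byte predictor with a cross-entropy loss against the label values; the GRU-based model, the dimensions, and the random stand-in sample are illustrative assumptions rather than the patent's concrete design.

```python
import torch
import torch.nn as nn

class BytePredictor(nn.Module):
    """Predicts a distribution over 256 byte values for each next position."""
    def __init__(self, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(256, 32)
        self.rnn = nn.GRU(32, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 256)

    def forward(self, x, state=None):
        h, state = self.rnn(self.embed(x), state)
        return self.head(h), state

model = BytePredictor()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()                 # predicted value vs. label value

sample = torch.randint(0, 256, (1, 512))        # stand-in for a text sample
inputs, labels = sample[:, :-1], sample[:, 1:]  # each next byte is the label

for _ in range(10):                             # a few training steps
    logits, _ = model(inputs)
    loss = loss_fn(logits.reshape(-1, 256), labels.reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()
```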
  • FIG. 5 is a hardware structure diagram of an NPU chip provided by an embodiment of the present invention. Both the model prediction and model training of the RNN model described above with reference to FIG. 4 can be implemented in the NPU chip shown in FIG. 5 .
  • the neural network processor NPU is mounted on the main CPU as a co-processor, and the main CPU assigns tasks.
  • the core part of the NPU is the arithmetic circuit 503, which is controlled by the controller 504 to extract the matrix data in the memory and perform multiplication operations.
  • the arithmetic circuit 503 includes multiple processing units (Process Engine, PE). In some implementations, arithmetic circuit 503 is a two-dimensional systolic array. The arithmetic circuit 503 may also be a one-dimensional systolic array or other electronic circuitry capable of performing mathematical operations such as multiplication and addition. In some implementations, arithmetic circuit 503 is a general-purpose matrix processor.
  • PE Process Engine
  • the bus interface unit (Bus Interface Unit, BIU) 510 is used for the interaction between the bus and the memory unit access controller (Direct Memory Access Controller, DMAC) 505 and the instruction fetch memory (Instruction Fetch Buffer) 509.
  • DMAC Direct Memory Access Controller
  • the instruction fetch memory 509 obtains instructions from the external memory through the bus interface unit 510, and the DMAC 505 obtains the original data of the input matrix A or the weight matrix B from the external memory through the bus interface unit 510.
  • the DMAC 505 is mainly used to transfer the input data in the external memory DDR to the unified memory 506, or to transfer the weight data (for example, the weight matrix B) to the weight memory 502, or to transfer the input data (for example, the input matrix A) to the input memory 501.
  • the operation circuit 503 can read the corresponding data of the weight matrix B from the weight memory 502 and buffer it on each PE in the operation circuit 503 .
  • the operation circuit 503 reads the data of the input matrix A and the weight matrix B from the input memory 501 to perform matrix operation, and stores the partial result or the final result of the matrix C outputted by the operation in the accumulator 508 .
  • Unified memory 506 may be used to store input data and/or output data.
  • the vector calculation unit 507 includes a plurality of operation processing units, and further processes the output of the operation circuit, such as vector multiplication, vector addition, exponential operation, logarithmic operation, size comparison, etc., if necessary.
  • the vector calculation unit 507 is mainly used for non-convolutional/fully connected layer (FCL) network calculations in the neural network, such as pooling (Pooling), batch normalization (Batch Normalization), local response normalization (Local Response Normalization), and so on.
  • FCL Fully Connected Layer
  • vector computation unit 507 may store the processed output vectors to the unified memory 506.
  • vector computation unit 507 may apply a nonlinear function to the output of arithmetic circuit 503 to generate activation values.
  • vector computation unit 507 generates normalized values, merged values, or both.
  • the output vector processed by the vector computing unit 507 can be used as an activation input to the arithmetic circuit 503, eg, for use in subsequent layers in a neural network.
  • An instruction fetch buffer 509 connected to the controller 504 is used to store instructions used by the controller 504 .
  • the unified memory 506, the input memory 501, the weight memory 502 and the instruction fetch memory 509 are all on-chip memories. External memory is independent of the NPU hardware architecture.
  • the operation of each layer in the RNN model shown in FIG. 3 may be performed by the operation circuit 503 or the vector calculation unit 507 .
  • however, model training takes a long time, which makes data compression time-consuming and leads to poor compression performance.
  • an embodiment of the present invention proposes a data compression scheme, which utilizes the correlation of data blocks in the time and/or space dimensions, and uses the AI prediction model of a similar data block as the initial model for the current data block to be compressed, so as to perform data compression.
  • the data compression scheme of the embodiment of the present invention saves the model training time of the AI model and shortens the time-consuming of data compression.
  • the data compression solutions in the embodiments of the present invention can be applied to storage systems and/or storage products; for example, small data block compression can be used in scenarios such as databases and virtualization. It can be understood that the solutions of the embodiments of the present invention are not limited to storage systems, but can be applied to any other scenario that requires data compression. The following description takes the storage system as an example.
  • FIG. 6 is an architecture diagram of a storage system provided by an embodiment of the present invention.
  • the storage system includes a computing node 61 , a storage node 62 and a storage medium 63 .
  • the computing node 61 and the storage node 62 may be physical servers, or may also be virtual entities, such as virtual machines and containers, which are abstracted based on general hardware resources.
  • the storage medium 63 is, for example, a solid state disk (Solid State Disk, SSD), a hard disk drive (HDD), or a storage class memory (Storage Class Memory, SCM); the storage medium 63 may be a storage medium local to the storage node, or a distributed storage medium connected to the storage node.
  • the computing node 61 may perform data access to the storage node 62, such as writing data, reading data, and the like. Specifically, the computing node 61 can send a write request to the storage node 62 to write data.
  • the data to be written in the write request can be, for example, various types of data such as images, databases, and texts, and the data to be written is also the data to be compressed.
  • the storage node 62 calculates, through the similarity analysis module 621, the similarity between the data to be written in the write request and each pre-stored reference data block, selects the reference data block with the highest similarity, obtains the AI model (i.e., the probability prediction model) associated with the selected reference data block, and provides that AI model to the probability prediction module 622 as the initial AI model for the data to be written.
  • the probability prediction module 622 predicts the occurrence probability of unit data (eg, characters) in the data to be written based on the initial AI model, and outputs the occurrence probability of each character to the compression module 623 .
  • the compression module 623 compresses the data to be written based on the occurrence probability of each character in the data to be written, obtains a compressed data packet, and stores the compressed data packet in the storage medium. In addition to the compressed data of the data to be written, the compressed data packet also includes relevant information about the AI model used to compress the data. After storing the compressed data packet in the storage medium, the storage node records the correspondence between the data to be written and the storage address of the compressed data, so that the computing node can later read the data.
  • the storage node 62 may also include a decompression module 624.
  • in response to a read request, the storage node 62 may read the compressed data packet from the storage medium 63, and the probability prediction module 622 and the decompression module 624 decompress the compressed data based on the AI model, so as to recover the written data and return it to the computing node 61.
  • the above-mentioned similarity analysis module 621, probability prediction module 622, compression module 623 and decompression module 624 may be in the form of software, hardware or firmware.
  • the similarity analysis module 621, the compression module 623, and the decompression module 624 may be deployed in the main CPU of FIG. 5, that is, the operations performed by these modules are executed by the main CPU, while the probability prediction module 622 may be deployed in the NPU of FIG. 5, that is, the operations performed by the probability prediction module 622 are executed by the NPU.
  • however, the similarity analysis module 621, the compression module 623, and the decompression module 624 are not limited to being deployed in the main CPU of FIG. 5, and the probability prediction module 622 is not limited to being deployed in the NPU of FIG. 5; they can be deployed according to the specific device configuration.
  • the unit data is not limited to characters, for example, the unit data may be an image or part of data in a database. What is common to data such as images, databases, and texts is that they are all represented by binary numbers in the storage system. Therefore, unit data in these different types of data can be uniformly set as binary numbers with a predetermined length.
  • the predetermined length is, for example, one byte, that is, 8 bits, so that there are 256 possible unit data values in total, from 0 to 255. It can be understood that the predetermined length is not limited to one byte and can be set according to the needs of specific scenarios; for example, it can also be set to two bytes. The total number of possible unit data values changes accordingly with the unit data length. By setting the unit data in this way, different types of data can be processed consistently by the same system.
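  • A minimal Python sketch of this uniform treatment: any raw data is split into fixed-length units, and the unit length determines how many distinct unit values the probability model must cover. The helper name is illustrative.

```python
# With a 1-byte unit there are 256 possible unit values (0..255);
# with a 2-byte unit there would be 65536. UNIT_LEN is a configurable choice.
UNIT_LEN = 1

def to_units(raw: bytes, unit_len: int = UNIT_LEN) -> list[int]:
    """Split a raw byte string into integer unit-data values."""
    return [int.from_bytes(raw[i:i + unit_len], "big")
            for i in range(0, len(raw), unit_len)]

# the same pipeline applies to text, images, or database pages alike
units = to_units("abc".encode())   # [97, 98, 99]
```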
  • FIG. 7 is a flowchart of a data compression method provided by an embodiment of the present invention, and the method includes:
  • Step S701: selecting a reference data block from a plurality of reference data blocks based on the similarity between the data block to be compressed and each pre-stored reference data block;
  • Step S702: acquiring the AI model associated with the selected reference data block;
  • Step S703: compressing the data block based on a compression model to generate a compressed data block, the compression model being obtained according to the AI model associated with the selected reference data block;
  • Step S704: generating a compressed data packet, where the compressed data packet includes the compressed data block and compression model information.
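  • End to end, steps S701-S704 can be pictured with the following hedged Python sketch; the function and field names are illustrative, and zlib stands in for the AI-model-driven entropy coder detailed later, so this shows only the control flow, not the patent's coder.

```python
import zlib  # stand-in for the model-based entropy coder described below

def jaccard(a: bytes, b: bytes) -> float:
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def compress_block(block: bytes, references: dict[bytes, str]) -> dict:
    """references maps each reference data block to its associated model id."""
    # S701: select the most similar pre-stored reference data block
    ref = max(references, key=lambda r: jaccard(block, r))
    # S702: acquire the AI model associated with the selected reference block
    model_id = references[ref]
    # S703: compress the block (a real system would use the acquired model)
    compressed = zlib.compress(block)
    # S704: package the compressed block with the compression model info
    return {"data": compressed, "model_info": model_id}

refs = {b"hello world": "model-1", b"\x00\x01\x02": "model-2"}
packet = compress_block(b"hello there", refs)   # selects model-1's reference
```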
  • the method shown in FIG. 7 may be performed by the storage node 62 shown in FIG. 6 .
  • after receiving the write request from the computing node 61, the storage node 62 performs data compression on the to-be-written data of the write request by using the method shown in FIG. 7.
  • the storage node 62 can compress the to-be-written data of the write request as one data block, or the storage node 62 can divide the to-be-written data into multiple data blocks of a specified size (for example, 8 KB or 32 KB) and compress each data block individually.
  • in step S701, a reference data block is selected from the plurality of reference data blocks based on the similarity between the data block to be compressed and each pre-stored reference data block.
  • Step S701 can be performed by the similarity analysis module 621 in FIG. 6 .
  • the storage node 62 may pre-store multiple reference data blocks and an AI model associated with each reference data block locally or in a storage space such as a storage medium or the cloud, where the AI model is a probability prediction model for the associated reference data block.
  • FIG. 8 is a schematic diagram of reference data block-model pairs pre-stored in a storage node. As shown in FIG. 8, multiple reference data block-model groups can be pre-stored; FIG. 8 schematically shows a first group, a second group, and a third group, each including a specific type of reference data block. The first group includes the most recently compressed data blocks as reference data blocks, the second group includes data blocks of different data types as reference data blocks, and the third group includes data blocks at different time points in a period as reference data blocks.
  • the most recently compressed data block, as a data block temporally adjacent to the data block to be compressed, may have greater similarity with it; therefore, the most recently compressed data blocks can be stored as reference data blocks in the first group.
  • the different types of data include, for example, pictures, texts, databases, etc.
  • the network structures, levels, and weights of AI models corresponding to data blocks of different types of data are quite different, and the AI models corresponding to data blocks of the same type of data are similar.
  • data blocks of different types of data can be stored as reference data blocks in the second group.
  • the data blocks at different time points in the cycle are data blocks at several time points in a cycle in a scenario where data changes periodically.
  • for example, the computing node 61 may correspond to a food delivery application (Application, APP) that writes transaction data to the storage node 62, where the transaction data cycles periodically in units of days. The data blocks of transaction data at the three time points of morning, noon, and evening differ considerably from one another, while the transaction data from the mornings of different days may be highly similar; therefore, the data blocks at these time points in the cycle can be stored as reference data blocks in the third group.
  • in each group in FIG. 8, a rectangular box represents a reference data block, and the number in the rectangular box is the number of that reference data block, such as reference data block 1 and reference data block 2; a square box represents the AI model associated with a reference data block, and the number in the square box is the number of that AI model, such as AI model 1 and AI model 2. A reference data block and an AI model with the same number form an associated reference data block-AI model pair; for example, reference data block 1 is associated with AI model 1.
  • the AI model associated with a reference data block may be stored directly in the form of the AI model, or in the form of a similar model plus the difference data between the similar model and the AI model, where the similar model is, for example, the model associated with another reference data block that has a high degree of similarity to this reference data block.
  • the above grouping of reference data blocks is only illustrative rather than restrictive; different reference data block groups can be set according to different storage scenarios, as long as data blocks that may have greater similarity with the data blocks to be compressed are stored as reference data blocks.
  • the storage node 62 may update the pre-stored reference data blocks. For example, after each data block is compressed, the reference data blocks and corresponding models in the first group in FIG. 8 are updated so that a certain number (e.g., 5-10) of the most recently compressed data blocks remain in the group; the reference data blocks of the same data type in the second group are updated; and the reference data block at the same time point in the cycle in the third group is updated.
  • the similarity analysis module 621 may select a reference data block with higher similarity based on the similarity between the original data block and the reference data blocks (e.g., content similarity, data similarity, etc.).
  • the similarity analysis module 621 can calculate the similarity between the original data block and the reference data blocks based on various similarity algorithms, such as the Jaccard similarity algorithm, the Cosine similarity algorithm, and so on.
  • the Jaccard similarity J(A, B) can be calculated by the following formula (3): J(A, B) = |A ∩ B| / |A ∪ B|    (3)
  • the symbol ∩ denotes taking the intersection of set A and set B
  • the symbol ∪ denotes taking the union of set A and set B.
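  • A small, hedged Python sketch of formula (3): representing each data block as a set of fixed-length byte substrings (shingles) is one common way to obtain the sets A and B; the shingle length here is an illustrative choice, not something the patent specifies.

```python
def jaccard_similarity(block_a: bytes, block_b: bytes, shingle: int = 4) -> float:
    """Formula (3) over sets of byte shingles: |A ∩ B| / |A ∪ B|."""
    a = {block_a[i:i + shingle] for i in range(len(block_a) - shingle + 1)}
    b = {block_b[i:i + shingle] for i in range(len(block_b) - shingle + 1)}
    if not a | b:
        return 0.0
    return len(a & b) / len(a | b)

print(jaccard_similarity(b"abcdefgh", b"abcdxyz"))  # 0.125: one shared shingle
```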
  • the similarity analysis module 621 may calculate the similarity between the original data block and each reference data block, so as to select the reference data block with the highest similarity. In another embodiment, the similarity analysis module 621 may select a corresponding reference data block group based on the characteristics of the original data block, and calculate the similarity between the original data block and each reference data block in that group, thereby selecting the reference data block with the highest similarity in the group. For example, if the data block to be compressed is text-type data, the similarity analysis module 621 may select the reference data block with the highest similarity to the original data block from the second group in FIG. 8.
  • in step S702, the AI model associated with the selected reference data block is acquired.
  • the similarity analysis module 621 can obtain the AI model corresponding to the reference data block based on the association between the reference data block and the AI model, and provide the AI model, as the initial AI model of the original data block, to the probability prediction module 622.
  • in step S703, the data block is compressed based on the compression model, which is obtained according to the AI model associated with the selected reference data block, to generate a compressed data block.
  • a compression model can be obtained by the probability prediction module 622 in FIG. 6 based on the initial AI model, and the occurrence probability of the unit data in the data block to be compressed can be predicted by the compression model. When the probability prediction module 622 is in the form of software, it is the code for performing probability prediction on the unit data in the data block based on the above initial AI model, and the code can be executed by the NPU in FIG. 5 to perform this step.
  • in one embodiment, the initial AI model can be directly used as the AI model (i.e., the compression model) corresponding to the original data block. In another embodiment, the initial AI model can be further trained with part of the data obtained from the original data block until the model converges, and the AI model obtained by this training is used as the compression model to make probability predictions on the original data block.
  • for the model training process, reference may be made to the above description of the training process of the RNN model.
  • specifically, part of the data in the original data block can be obtained by the NPU, and a predetermined number of unit data in that partial data are input into the initial AI model to output the occurrence probability of the unit data following those predetermined unit data. The storage node 62 can then train the initial AI model through the NPU according to the output of the initial AI model and the actual occurrence probability (i.e., the label value) of the next unit data, so as to obtain a trained AI model.
  • this model training will quickly make the model converge, so it takes little model training time.
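  • Continuing the earlier PyTorch sketch (reusing BytePredictor and its imports), a hedged illustration of this fine-tuning step; the data fraction and step count are assumptions, since the patent only says part of the data is used and training stops once the model converges.

```python
def fine_tune(model, block: bytes, fraction: float = 0.1, steps: int = 20):
    """Briefly train the initial AI model on part of the data block."""
    part = torch.tensor(list(block[:max(2, int(len(block) * fraction))]))
    inputs, labels = part[None, :-1], part[None, 1:]   # next-byte label values
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(steps):
        logits, _ = model(inputs)
        loss = nn.functional.cross_entropy(logits.reshape(-1, 256),
                                           labels.reshape(-1))
        opt.zero_grad(); loss.backward(); opt.step()
    return model                                       # the compression model

model = fine_tune(BytePredictor(), bytes(range(256)) * 4)   # toy data block
```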
  • after obtaining the compression model, the storage node 62 may use the compression model, through the NPU, to predict the occurrence probability of each unit data in the original data block, so as to perform data compression on the original data block based on those occurrence probabilities.
  • FIG. 9 is a schematic diagram of a process of data compression based on a compression model.
  • in FIG. 9, the length of the unit data in the data block is 1 byte. As mentioned above, since 1 byte has 8 bits, one byte can represent 256 unit data values from 0 to 255. Assume that the data block is text, so that the 256 unit data values correspond to 256 characters; a text data block is used as the example below, and it can be understood that the processing is the same if the data block is an image or another type of data.
  • when compressing the original data block, the first byte (i.e., the first character) of the data block is first input into the compression module, which compresses data based on, for example, an arithmetic coding algorithm, so the compression module needs to obtain the occurrence probability of each character in the data block to perform data compression based on those probabilities. Since there are no other characters before the first character in the data block, the AI model cannot predict a probability for the first character that depends on previous characters; therefore, the 256 characters can be preset to have equal occurrence probabilities (i.e., 1/256), and the compression module can encode the first character in the data block based on this average probability. It can be understood that the embodiment of the present invention is not limited to this.
  • in other embodiments, the preset occurrence probabilities of the 256 characters may differ. In that case, the first character of the data block may be input into the compression model, which outputs an occurrence probability for each of the 256 characters to the compression module; the compression module may then determine the occurrence probability of the first character based on the respective occurrence probabilities of the 256 characters and encode the first character based on that probability.
  • while the compression module encodes the first character, the first character is also input into the compression model, so that the compression model predicts, based on the first character, the occurrence probability of each of the 256 characters at the second byte of the original data block conditioned on the first character; after performing this probability prediction, the compression model outputs these predicted probabilities to the compression module. After the first character of the data block has been input into the compression module, the second character is input; the compression module determines the occurrence probability of the second character based on the predicted probabilities received from the compression model, and encodes the second character based on that occurrence probability.
  • In this way, each of the N characters in the data block can be encoded in turn, thereby finally obtaining the compressed data block of the data block.
  • In the above description, the compression model outputs, for each byte, the occurrence probabilities of the 256 characters conditioned on the single previous character. It can be understood that the embodiment of the present invention is not limited to this: for example, it can be preset that, for each byte in the original data block, the compression model outputs the occurrence probabilities of the 256 characters conditioned on the first two preceding characters, on the first three preceding characters, and so on.
  • For example, if the compression model is preset to output, for each byte in the original data block, the occurrence probability of each of the 256 characters conditioned on its two preceding characters, then after the first character in the original data block is input to the compression model, the compression model outputs the occurrence probability of each of the 256 characters conditioned on that first character alone, since only one preceding character is available.
  • When the second character in the original data block is input to the compression model, the compression model outputs the occurrence probabilities of the 256 characters conditioned on the first and second characters; when the third character is input to the compression model, it outputs the occurrence probabilities conditioned on the second and third characters; and so on.
  • Based on the occurrence probabilities output by the compression model, the compression module 623 in FIG. 6 compresses the original data block to generate a compressed data block.
  • The compression module 623 can compress the data block based on the occurrence probabilities of the unit data in the data block through various entropy coding algorithms; the entropy coding algorithms include, for example, the arithmetic coding algorithm, the asymmetric numeral systems (ANS) algorithm, the Huffman coding algorithm, and the like. The arithmetic coding algorithm is described below as an example.
  • the data block to be compressed includes three characters a, b, and c arranged in sequence.
  • First, the character a is input into the compression module, together with the average probability (1/256) of each of the 256 characters.
  • The 256 characters are arranged in a predetermined sequence in the compression module, so that the probability interval corresponding to each character can easily be determined from the probability of each character.
  • Assume that a is arranged in the 1st position of the 256 characters, b in the 2nd position, and c in the 3rd position.
  • Fig. 10 is a schematic diagram of the process of encoding the data block "abc" by the compression module.
  • After the compression module receives the character a and the probability of character a (1/256), according to the arrangement position of character a among the 256 characters, it determines the first sub-interval of length 1/256 in the line segment of length 1, namely (0, 0.004), as the current coding interval, as shown on the left side of Fig. 10, where 1/256 is approximated as 0.004.
  • While the character a is input into the compression module, the character a is also input into the compression model; after the character a has been input into the compression module, the second character b in the data block is input into the compression module.
  • After the compression model receives the character a, assume that it predicts the probabilities of the characters appearing after a as a: 1/4, b: 2/4, c: 1/4, that is, the occurrence probability of every character other than a, b, and c is 0.
  • Based on these predicted probabilities and the arrangement positions of the characters a, b, and c among the 256 characters, the compression module divides the previously determined coding interval (0, 0.004) into the sub-intervals corresponding to the characters a, b, and c, namely (0, 0.001), (0.001, 0.003), and (0.003, 0.004); since the second character in the data block is b, the sub-interval (0.001, 0.003) becomes the new current coding interval.
  • While the character b is input into the compression module, the character b is also input into the compression model; after the character b has been input into the compression module, the third character c in the data block is input into the compression module.
  • The compression model predicts the occurrence probability of each character after the character b based on the previous character (character b); assume that the predicted probabilities of the characters appearing after b are a: 1/5, b: 2/5, c: 2/5, that is, the occurrence probability of every character among the 256 characters other than a, b, and c is 0. The compression module accordingly divides the current coding interval (0.001, 0.003) into the sub-intervals (0.001, 0.0014), (0.0014, 0.0022), and (0.0022, 0.003) for a, b, and c respectively; since the third character is c, the final coding interval is (0.0022, 0.003), and any value in this interval, for example 0.0023, can be output as the compressed data block.
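  • To make the interval-narrowing procedure above concrete, the following is a minimal floating-point sketch of arithmetic coding driven by a probability model; Python is used here purely for illustration, real coders use integer arithmetic with renormalization to avoid precision loss, and `predict` stands for whatever model supplies the per-symbol probabilities.

```python
def arithmetic_encode(block, predict):
    # `predict(history)` returns a 256-entry probability list for the
    # next symbol given the symbols seen so far; for the first symbol
    # it returns the uniform distribution (1/256 each).
    low, high = 0.0, 1.0
    history = []
    for symbol in block:
        probs = predict(history)
        span = high - low
        cum = sum(probs[:symbol])  # probability mass of earlier symbols
        # Narrow to this symbol's sub-interval of the current interval.
        low, high = low + span * cum, low + span * (cum + probs[symbol])
        history.append(symbol)
    return (low + high) / 2  # any value inside the final interval works

# With the probabilities of FIG. 10 (uniform for the first character,
# then a:1/4, b:2/4, c:1/4 after 'a', and a:1/5, b:2/5, c:2/5 after
# 'b'), encoding "abc" narrows the interval to approximately
# (0.0022, 0.003), which contains values such as 0.0023.
```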
  • In step S704, a compressed data package of the data block to be compressed is generated, and the compressed data package includes the compressed data block and compression model information.
  • Specifically, the storage node 62 may generate the compressed data packet (also referred to as a compressed package) of the data block to be compressed based on the compressed data block and the relevant information of the compression model used to compress the original data block, so that the compressed data block can later be decompressed through the compressed package to restore the original data block.
  • For example, the compressed data package may include the compressed data block together with an identifier (e.g., a number) of the initial AI model.
  • Alternatively, the compressed data block and the initial AI model itself may be included in the compressed data package.
  • When only the identifier is included, the AI model used for decompression can be obtained through the identifier of the initial AI model in the compressed data package.
  • By including the identifier of the AI model in the compressed data package, it is not necessary to include the AI model itself (that is, all parameters of the AI model), thereby reducing the storage space required for storing the compressed data package and greatly increasing the data compression rate, wherein,
  • the data compression rate is calculated by the following formula (4): data compression rate = (size of the data block before compression) / (size of the compressed data package)  (4)
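  • As a hypothetical illustration of this definition, compressing an 8 KB data block into a 1 KB compressed data package would yield a data compression rate of 8; carrying only a model identifier instead of the full model parameters keeps the package small and therefore keeps this rate high.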
  • In the case where the AI model obtained by training (hereinafter referred to as the post-training AI model) is used as the compression model for compressing the original data block, the difference data between the post-training AI model and the initial AI model can be calculated first, so that the compressed data package may include the identifier of the initial AI model together with the difference data, instead of all parameters of the post-training AI model.
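  • As an illustration of the alternatives above, the following sketch shows one possible in-memory layout for such a compressed data package; the field names and the use of a Python dataclass are assumptions for exposition, not the patent's on-disk format.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CompressedPackage:
    compressed_block: bytes              # entropy-coded payload
    base_model_id: Optional[int] = None  # identifier of the initial AI model
    model_diff: Optional[bytes] = None   # serialized parameter delta between
                                         # the post-training model and the
                                         # initial model, when fine-tuned
    full_model: Optional[bytes] = None   # full serialized model, used only
                                         # when no shared model is referenced
```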
  • the storage node 62 may store the compressed data packet in the storage medium 63, and record the correspondence between the original data block and the storage address of the compressed data packet.
  • When the written data is to be read, the computing node 61 may send a read request for the written data to the storage node 62.
  • Based on the read request, the storage node 62 determines the storage address of at least one compressed data packet corresponding to the written data, reads the compressed data packet, obtains the compression model used to compress the data block according to the compression model information in the compressed data packet, decompresses the compressed data block in the compressed data packet using that compression model, and returns the restored written data to the computing node 61.
  • FIG. 11 is a schematic diagram of a process of data decompression based on a compression model.
  • First, the storage node 62 obtains the compressed data block of the original data block from the compressed data packet.
  • Assume that the compressed data block is 0.0023. Based on the compression model information in the compressed data packet, the storage node obtains the compression model that was used to compress the original data block.
  • As during compression, the probability of the first character of the original data block can be preset to the average probability of the 256 characters (that is, 1/256 ≈ 0.004). The storage node 62 inputs 0.0023 and this average probability to the decompression module. Like the compression module, the decompression module has the 256 characters pre-sorted, so within the interval (0, 1), the interval (0, 0.004) corresponds to the character a. The decompression module determines that the compressed data 0.0023 falls in the interval (0, 0.004), so it can determine that the first character of the original data is the character a.
  • The decoded character a is then input into the compression model, which predicts the probabilities of the characters following a (a: 1/4, b: 2/4, c: 1/4, as during compression) and outputs these predicted probabilities of characters a, b, and c to the decompression module.
  • Based on the predetermined arrangement order of the characters a, b, and c, the decompression module can determine the sub-interval corresponding to each character.
  • The interval corresponding to the character a is the sub-interval (0, 0.001) of the previously determined interval (0, 0.004), the interval corresponding to the character b is the sub-interval (0.001, 0.003), and the interval corresponding to the character c is the sub-interval (0.003, 0.004).
  • The compressed data block 0.0023 falls into the interval (0.001, 0.003) corresponding to the character b, so the decompression module can determine that the second character in the original data block is the character b.
  • The decoded character b is then input into the compression model, so that the compression model can predict the occurrence probability of each character for the third byte.
  • Based on these predicted probabilities, the decompression module can determine that within the previously determined interval (0.001, 0.003), the character a corresponds to the sub-interval (0.001, 0.0014), the character b corresponds to the sub-interval (0.0014, 0.0022), and the character c corresponds to the sub-interval (0.0022, 0.003).
  • Since the compressed data block 0.0023 falls into the interval (0.0022, 0.003) corresponding to the character c, the decompression module can determine that the third character of the original data block is the character c.
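  • The following is a minimal sketch of this decoding loop, mirroring the encoder sketched earlier; it assumes the decoder knows how many symbols to produce (for example, because the original block length is recorded alongside the compressed data), which is an assumption of this sketch rather than a detail stated here.

```python
def arithmetic_decode(code, length, predict):
    # `predict` must be the same probability model that was used for
    # encoding, so the decoder reconstructs identical sub-intervals.
    low, high = 0.0, 1.0
    history = []
    for _ in range(length):
        probs = predict(history)
        span = high - low
        cum = 0.0
        for symbol in range(256):
            nxt = cum + probs[symbol]
            # Output the symbol whose sub-interval contains the code,
            # then narrow the current interval to that sub-interval.
            if low + span * cum <= code < low + span * nxt:
                history.append(symbol)
                low, high = low + span * cum, low + span * nxt
                break
            cum = nxt
    return bytes(history)

# Decoding 0.0023 with the probabilities of FIG. 10 reproduces the
# characters a, b, c, as in the walkthrough above.
```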
  • After the decompression module has sequentially output all characters (i.e., abc) of the original data block, the storage node 62 has completed decompressing the compressed data block and obtained the original data block. The storage node 62 then returns the original data block obtained by decompression to the computing node 61 as the response to the above read request.
  • The data processing scheme of the embodiment of the present invention has been described above by taking the process of compressing and decompressing stored data in a storage system as an example. With this scheme, data compression time can be saved and the data compression rate can be greatly improved. It can be understood that the data processing solutions in the embodiments of the present invention are not limited to storage nodes as described above, but can be used in other scenarios and on other devices.
  • For example, a user's mobile terminal can compress original data through the data compression scheme of the embodiment of the present invention before saving it, thereby saving the data storage time of the mobile terminal and improving the user's experience.
  • FIG. 12 is an architectural diagram of a data processing apparatus provided by an embodiment of the present invention.
  • the data processing apparatus is configured to execute the data processing method shown in FIG. 7 .
  • Executable code is stored in the storage unit of the apparatus, and
  • the processing unit 1201 is configured to execute the executable code to implement: selecting the first reference data block based on the similarity between the first data block to be compressed and each pre-stored reference data block; acquiring the first reference data block associated with the first reference data block model; compressing the first data block based on the compression model to generate a first compressed data block, the compression model is obtained according to the first model; generating a first compressed data packet, the first compressed data packet includes the first compressed data block and compression model information, where the compression model information is obtained according to the compression model.
  • In a possible implementation, the processing unit 1201 being configured to execute the executable code to compress the first data block based on a compression model and generate a first compressed data block, the compression model being obtained according to the first model, is specifically:
  • the processing unit 1201 is configured to execute the executable code to implement: using the first model as the compression model, compressing the first data block, and generating the first compressed data block.
  • In another possible implementation, the processing unit 1201 being configured to execute the executable code to compress the first data block based on a compression model and generate a first compressed data block, the compression model being obtained according to the first model, is specifically:
  • the processing unit 1201 is configured to execute the executable code to implement: using a second model as the compression model, compressing the first data block, and generating the first compressed data block, the second model being obtained by training the first model using part of the data in the first data block.
  • The processing unit 1201 is further configured to execute the executable code to implement: storing the first data block as a reference data block, storing the compression model information, and storing the stored first data block in association with the stored compression model information.
  • The processing unit 1201 includes a neural network processor 1203; the processing unit 1201 being configured to execute the executable code to compress the first data block based on the compression model is specifically: the neural network processor 1203 is configured to execute the executable code to compress the first data block based on the compression model.
  • The processing unit 1201 is further configured to execute the executable code to implement: obtaining the first compressed data packet; obtaining the compression model according to the compression model information in the first compressed data packet; and decompressing the first compressed data block in the first compressed data packet through the compression model to obtain the first data block.
  • FIG. 13 is an architectural diagram of a data processing apparatus provided by an embodiment of the present application.
  • the data processing apparatus is configured to execute the data processing method shown in FIG. 7 , and the data processing apparatus includes:
  • a selection unit 131 configured to select the first reference data block based on the similarity between the first data block to be compressed and each pre-stored reference data block;
  • an obtaining unit 132 configured to obtain the first model associated with the first reference data block
  • a compression unit 133 configured to compress the first data block based on a compression model to generate a first compressed data block, and the compression model is obtained according to the first model;
  • the generating unit 134 is configured to generate a first compressed data packet, where the first compressed data packet includes the first compressed data block and compression model information, where the compression model information is obtained according to the compression model.
  • The compression unit 133 is specifically configured to: predict the occurrence probability of the unit data in the first data block based on the compression model; and compress the first data block according to the occurrence probability of the unit data in the first data block to generate the first compressed data block.
  • The compression unit 133 is specifically configured to: use the second model as the compression model, compress the first data block, and generate the first compressed data block, the second model being obtained by training the first model using part of the data in the first data block.
  • The data processing apparatus further includes: a storage unit 135, configured to store the first data block as a reference data block and store the compression model information, the stored first data block being stored in association with the stored compression model information.
  • the data processing apparatus includes a neural network processor; the compression unit 133 is deployed in the neural network processor.
  • The obtaining unit 132 is further configured to obtain the first compressed data packet, and to obtain the compression model according to the compression model information in the first compressed data packet;
  • the data processing apparatus further includes a decompression unit 136, configured to decompress the first compressed data block in the first compressed data packet through the compression model to obtain the first data block.
  • each unit in the data processing apparatus shown in FIG. 13 may have the form of software, hardware, or firmware, which is not limited in this embodiment of the present invention.
  • The present application further provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed in a computer or a processor, the computer or processor is caused to execute the methods described above with reference to the drawings.
  • the present application also provides a computer program product, which, when the computer program product is executed in a computer or a processor, causes the computer or the processor to execute the method described above with reference to the figures.
  • FIG. 14 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention.
  • the data processing apparatus may include a processor 110 , an internal memory 120 , a connection module 130 , a display 140 and an interface module 150 .
  • the data processing apparatus may also include other modules or components, such as audio modules, etc., which are not shown in FIG. 14 . It can be understood that the structures illustrated in the embodiments of the present invention do not constitute a specific limitation on the data processing apparatus. In other embodiments of the present application, the data processing apparatus may include more or less components than those shown, or combine some components, or separate some components, or arrange different components.
  • the illustrated components may be implemented in hardware, software, or a combination of software and hardware.
  • The processor 110 may include one or more processing units. For example, the processor 110 may include at least one of an application processor (application processor, AP), a modem processor, a GPU, an image signal processor (image signal processor, ISP), a CPU, a video codec, a digital signal processor (digital signal processor, DSP), a baseband processor, and/or an NPU.
  • A CPU, an ISP, an NPU, and a GPU are schematically shown in the processor 110 in FIG. 14, and they can be connected via a bus.
  • different processing units may be independent devices, or may be integrated in one or more processors.
  • the processor 110 may be a chip or chipset.
  • the application processor may be the CPU.
  • a memory may also be provided in the processor 110 for storing instructions and data. This memory may hold instructions or data that have just been used or recycled by the processor 110 . If the processor 110 needs to use the instruction or data again, it can be called directly from the memory. Repeated memory accesses outside the processor 110 are avoided, reducing the latency of the processor 110, thereby increasing the efficiency of the system.
  • Internal memory 120, also called main memory, may be used to store computer executable program code, where the executable program code includes instructions.
  • the processor 110 executes various functional applications and data processing of the data processing apparatus by executing the instructions stored in the internal memory 120 .
  • The internal memory 120 may include a program storage area and a data storage area, where the program storage area can store an operating system, code of application programs, and the like. For example, as shown in FIG. 14, the program storage area of the internal memory 120 stores the following modules for executing the method shown in FIG. 7:
  • the selection module 121, configured to select the first reference data block based on the similarity between the first data block to be compressed and each pre-stored reference data block; the obtaining module 122, configured to obtain the first model associated with the first reference data block; the compression module 123, configured to compress the first data block based on a compression model to generate a first compressed data block, where the compression model is obtained according to the first model; and the generating module 124, configured to generate a first compressed data packet, where the first compressed data packet includes the first compressed data block and compression model information, the compression model information being obtained according to the compression model.
  • The internal memory 120 may include random access memory (Random Access Memory, RAM), such as double data rate synchronous dynamic random access memory (Double Data Rate Synchronous Dynamic Random Access Memory, DDR memory), and may also include non-volatile memory, such as at least one disk storage device, a flash memory device, a universal flash storage (universal flash storage, UFS), and the like.
  • The connection module 130 may be used for wired connections and for wireless connections such as wireless local area networks (wireless local area networks, WLAN).
  • the display screen 140 is used to display text, images, videos, and the like.
  • the display screen 140 includes a display panel.
  • The display panel can be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (active-matrix organic light-emitting diode, AMOLED), or the like.
  • the interface module 150 includes, for example, various interfaces such as an external memory interface and a USB interface.
  • As shown in FIG. 15, an embodiment of the present invention provides a terminal-cloud system architecture 300.
  • The execution device 210 is implemented by one or more servers and, optionally, cooperates with other data processing devices, such as data storage devices, routers, load balancers, and the like; the execution device 210 may be arranged on one physical site or distributed over multiple physical sites.
  • the execution device 210 may use the data in the data storage system 250, or invoke the program code in the data storage system 250 to implement the method shown in FIG. 7 .
  • a user may operate respective user devices (e.g., local device 301 and local device 302) to interact with execution device 210.
  • Each local device may represent any computing device, such as a personal computer, a computer workstation, a smartphone, a tablet computer, a smart camera, a smart car, a cellular phone, a media consumption device, a wearable device, a set-top box, a game console, or another type of device.
  • Each user's local device can interact with the execution device 210 through a communication network of any communication mechanism or communication standard; the communication network can be a wide area network, a local area network, a point-to-point connection, or any combination thereof.
  • one or more aspects of the execution device 210 may be implemented by each local device, for example, the local device 301 may provide the execution device 210 with local data or feedback calculation results.
  • the local device 301 implements the functions of the execution device 210 and provides services for its own users, or provides services for the users of the local device 302 .
  • The above-mentioned embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • When software is used for implementation, they may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of the present invention are generated.
  • the computer may be a general purpose computer, special purpose computer, computer network, or other programmable device.
  • The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center in a wired (e.g., coaxial cable, optical fiber, or digital subscriber line) or wireless (e.g., infrared, radio, or microwave) manner.
  • The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center that integrates one or more available media.
  • It should be understood that the disclosed apparatus and method may be implemented in other manners without exceeding the scope of the present application.
  • The above-described embodiments are only illustrative.
  • For example, the division of the modules or units is only a division by logical function, and there may be other division manners in actual implementation.
  • For example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented.
  • A unit described as a separate component may or may not be physically separated, and a component displayed as a unit may or may not be a physical unit; that is, it may be located in one place, or may be distributed across multiple network units.
  • Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution in this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A data processing method and apparatus in the field of artificial intelligence. The method comprises: selecting a first reference data block on the basis of the similarity between a data block to be compressed and each pre-stored reference data block; obtaining a first model associated with the first reference data block; and obtaining a compression model according to the first model, compressing, on the basis of the compression model, the data block to be compressed to generate a compressed data block, and generating a compressed data packet comprising the compressed data block and compression model information. The data processing method reduces the data compression duration, and improves the data compression performance.

Description

Data Processing Method and Apparatus

Technical Field

The present application relates to the technical field of artificial intelligence, and in particular, to a data processing method and apparatus.

Background

Artificial Intelligence (AI) is a theory, method, technology and application system that uses digital computers or machines controlled by digital computers to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use knowledge to obtain the best results. In other words, artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to enable machines to perceive, reason and make decisions.

In recent years, schemes that combine AI models with data compression have emerged. However, since it takes time to train an AI model, such schemes often suffer from long data compression times and poor compression performance.

Summary of the Invention

The embodiments of the present invention provide a data processing method that can reduce the data compression time and improve data compression performance.
A first aspect of the embodiments of the present invention provides a data processing method, comprising: selecting a first reference data block based on the similarity between a first data block to be compressed and each pre-stored reference data block; acquiring a first model associated with the first reference data block; compressing the first data block based on a compression model to generate a first compressed data block, the compression model being obtained according to the first model; and generating a first compressed data packet, the first compressed data packet including the first compressed data block and compression model information, the compression model information being obtained according to the compression model.

In a possible implementation manner of the first aspect of the present application, the first model may be stored directly in the form of the first model, or may be stored in the form of a similar model of the first model together with the difference data between that similar model and the first model, in which case the first model can be obtained by processing the similar model and the difference data.

In a possible implementation manner of the first aspect of the present application, the compression model is not necessarily identical to the first model: when the first model is used to compress the first data block, the compression model is the first model; when a model obtained by optimizing or training the first model is used to compress the first data block, the compression model is the optimized or trained model.
In a possible implementation manner of the first aspect of the present application, selecting the first reference data block based on the similarity between the first data block to be compressed and each pre-stored reference data block includes: selecting the first reference data block based on the content similarity between the first data block to be compressed and each pre-stored reference data block. The similarity between the first data block and a reference data block may be calculated based on various similarity algorithms, for example, the Jaccard similarity algorithm, the Cosine similarity algorithm, and the like.
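The text names Jaccard and Cosine similarity only as examples; the following sketch assumes one common way of applying Jaccard similarity to raw blocks, namely comparing sets of k-byte shingles, and all names in it are illustrative rather than specified by the patent.

```python
def shingles(block: bytes, k: int = 8) -> set:
    # Represent a data block by the set of its k-byte substrings.
    return {block[i:i + k] for i in range(max(len(block) - k + 1, 0))}

def jaccard(a: bytes, b: bytes) -> float:
    # Jaccard similarity: |A intersect B| / |A union B| over shingle sets.
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb) if (sa or sb) else 0.0

def select_reference(block: bytes, references: dict):
    # Pick the id of the pre-stored reference block most similar to
    # `block`; `references` maps a reference id to its raw bytes.
    return max(references, key=lambda rid: jaccard(block, references[rid]))
```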
By selecting the reference data block with the highest similarity to the data block to be compressed, and performing data compression of the data block to be compressed based on the model corresponding to that reference data block, the time for training a model is reduced, thereby reducing the data compression time.

In a possible implementation manner of the first aspect of the present application, compressing the first data block based on the compression model to generate the first compressed data block specifically includes: predicting the occurrence probability of the unit data in the first data block based on the compression model; and compressing the first data block according to the occurrence probability of the unit data in the first data block to generate the first compressed data block. By predicting the occurrence probability of the unit data in the first data block based on the compression model, the model training time is reduced and the data compression performance is improved.

In a possible implementation manner of the first aspect of the present application, compressing the first data block based on the compression model, the compression model being obtained according to the first model, specifically includes: using the first model as the compression model to compress the first data block and generate the first compressed data block. When the similarity between the first data block to be compressed and the first reference data block is high, for example greater than a threshold, using the first model as the compression model to compress the first data block requires no retraining of the model, which reduces the model training time and improves the data compression performance.

In a possible implementation manner of the first aspect of the present application, the compression model information includes the first model or an identifier of the first model. By including the identifier of the first model in the compression model information, the data volume of the compressed data packet is reduced, storage space is saved, and the compression rate is improved.

In a possible implementation manner of the first aspect of the present application, compressing the first data block based on the compression model, the compression model being obtained according to the first model, specifically includes: using a second model as the compression model to compress the first data block and generate the first compressed data block, the second model being obtained by training the first model using part of the data in the first data block.

By training the first model with part of the data in the first data block to obtain the second model for compressing the first data block, the prediction accuracy of the model is improved; since the first data block has a high similarity to the first reference data block, this training process takes little time, which improves the data compression performance.

In a possible implementation manner of the first aspect of the present application, the compression model information includes any one of the following: the second model; an identifier of the second model; or an identifier of the first model together with the difference data between the second model and the first model. In the case where the first data block is stored as a reference data block in association with the second model, the identifier of the second model may be included in the compression model information. By including the identifier of the first model and the difference data between the second model and the first model in the compression model information, the data volume of the compressed data packet is reduced, storage space is saved, and the compression rate is improved.

In a possible implementation manner of the first aspect of the present application, the multiple reference data blocks include at least one of the following: recently compressed data blocks, data blocks of different data types, and data blocks at different time points in a period. The recently compressed data blocks may be data blocks compressed within a recent period of time, where the length of the time period may be flexibly determined according to the size of the storage space; as data blocks temporally adjacent to the data block to be compressed, they are likely to have a high similarity to it. The different data types include, for example, pictures, text, and databases. The data blocks at different time points in a period are data blocks at several time points within one period, in scenarios where the data changes periodically. The number of stored reference data blocks may be flexibly determined according to the size of the storage space.

By selecting reference data blocks along the dimensions of time and space, it is easier to obtain a reference data block with a higher similarity to the data block to be compressed, thereby reducing the time consumed by model training.

In a possible implementation manner of the first aspect of the present application, the method provided in the first aspect further includes: storing the first data block as a reference data block, and storing the compression model information, the stored first data block being stored in association with the stored compression model information. The compression model itself may be stored as the compression model information, or the difference data between the compression model and another stored similar model, together with the identifier of that similar model, may be stored as the compression model information. In addition, storing the first data block in association with the compression model information means that when the stored first data block is found, the stored compression model can be further found; it does not require the first data block and the compression model information to be stored together.

By updating the reference data blocks after each data compression, it is easier to provide a reference data block with higher similarity for subsequent data blocks to be compressed.

In a possible implementation manner of the first aspect of the present application, the method provided by the first aspect is executed by a computing device that includes a neural network processor, and the neural network processor compresses the first data block based on the compression model to generate the first compressed data block.

By using the neural network processor to run the first model, the compression process is accelerated and the data compression time is reduced.

In a possible implementation manner of the first aspect of the present application, the method provided by the first aspect further includes: obtaining the first compressed data packet; obtaining the compression model according to the compression model information in the first compressed data packet; and decompressing the first compressed data block in the first compressed data packet through the compression model to obtain the first data block.
A second aspect of the present application provides a data processing apparatus, including a processing unit and a storage unit, the storage unit storing executable code, and the processing unit being configured to execute the executable code to implement: selecting a first reference data block based on the similarity between a first data block to be compressed and each pre-stored reference data block; acquiring a first model associated with the first reference data block; compressing the first data block based on a compression model to generate a first compressed data block, the compression model being obtained according to the first model; and generating a first compressed data packet, the first compressed data packet including the first compressed data block and compression model information, the compression model information being obtained according to the compression model.

In a possible implementation manner of the second aspect of the present application, the processing unit being configured to execute the executable code to compress the first data block based on the compression model and generate the first compressed data block specifically means that the processing unit is configured to execute the executable code to implement: predicting the occurrence probability of the unit data in the first data block based on the compression model; and compressing the first data block according to the occurrence probability of the unit data in the first data block to generate the first compressed data block.

In a possible implementation manner of the second aspect of the present application, the processing unit being configured to execute the executable code to compress the first data block based on the compression model, the compression model being obtained according to the first model, specifically means that the processing unit is configured to execute the executable code to implement: using the first model as the compression model, compressing the first data block, and generating the first compressed data block.

In a possible implementation manner of the second aspect of the present application, the processing unit being configured to execute the executable code to compress the first data block based on the compression model, the compression model being obtained according to the first model, specifically means that the processing unit is configured to execute the executable code to implement: using a second model as the compression model, compressing the first data block, and generating the first compressed data block, the second model being obtained by training the first model using part of the data in the first data block.

In a possible implementation manner of the second aspect of the present application, the processing unit is further configured to execute the executable code to implement: storing the first data block as a reference data block, and storing the compression model information, the stored first data block being stored in association with the stored compression model information.

In a possible implementation manner of the second aspect of the present application, the processing unit includes a neural network processor, and the processing unit being configured to execute the executable code to compress the first data block based on the compression model and generate the first compressed data block specifically means that the neural network processor is configured to execute the executable code to compress the first data block based on the compression model and generate the first compressed data block.

In a possible implementation manner of the second aspect of the present application, the processing unit is further configured to execute the executable code to implement: obtaining the first compressed data packet; obtaining the compression model according to the compression model information in the first compressed data packet; and decompressing the first compressed data block in the first compressed data packet through the compression model to obtain the first data block.
A third aspect of the present application provides a data processing apparatus, including: a selection unit, configured to select a first reference data block based on the similarity between a first data block to be compressed and each pre-stored reference data block; an obtaining unit, configured to obtain a first model associated with the first reference data block; a compression unit, configured to compress the first data block based on a compression model to generate a first compressed data block, the compression model being obtained according to the first model; and a generating unit, configured to generate a first compressed data packet, the first compressed data packet including the first compressed data block and compression model information, the compression model information being obtained according to the compression model.

In a possible implementation manner of the third aspect of the present application, the compression unit is specifically configured to: predict the occurrence probability of the unit data in the first data block based on the compression model; and compress the first data block according to the occurrence probability of the unit data in the first data block to generate the first compressed data block.

In a possible implementation manner of the third aspect of the present application, the compression unit is specifically configured to: use the first model as the compression model, compress the first data block, and generate the first compressed data block.

In a possible implementation manner of the third aspect of the present application, the compression unit is specifically configured to: use a second model as the compression model, compress the first data block, and generate the first compressed data block, the second model being obtained by training the first model using part of the data in the first data block.

In a possible implementation manner of the third aspect of the present application, the apparatus provided by the third aspect further includes: a storage unit, configured to store the first data block as a reference data block and store the compression model information, the stored first data block being stored in association with the stored compression model information.

In a possible implementation manner of the third aspect of the present application, the apparatus provided by the third aspect includes a neural network processor, and the compression unit is deployed in the neural network processor.

In a possible implementation manner of the third aspect of the present application, the obtaining unit is further configured to obtain the first compressed data packet and obtain the compression model according to the compression model information in the first compressed data packet, and the apparatus provided by the third aspect further includes a decompression unit, configured to decompress the first compressed data block in the first compressed data packet through the compression model to obtain the first data block.

A fourth aspect of the present application provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed in a computer or a processor, the computer or processor is caused to execute the method described in the first aspect.

A fifth aspect of the present application provides a computer program product; when the computer program product runs in a computer or a processor, the computer or processor is caused to execute the method described in the first aspect.
Brief Description of Drawings

In order to explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the accompanying drawings in the following description show only some embodiments of the present invention, and those of ordinary skill in the art may derive other drawings from these drawings without creative effort.

FIG. 1 shows a schematic diagram of an artificial intelligence main framework;

FIG. 2 is a schematic diagram of a system architecture provided by an embodiment of the present invention;

FIG. 3 is a schematic structural diagram of an RNN model;

FIG. 4 is a schematic diagram of model prediction by an RNN model;

FIG. 5 is a hardware structure diagram of an NPU chip provided by an embodiment of the present invention;

FIG. 6 is an architecture diagram of a storage system provided by an embodiment of the present invention;

FIG. 7 is a flowchart of a data compression method provided by an embodiment of the present invention;

FIG. 8 is a schematic diagram of data blocks and models pre-stored in a storage node;

FIG. 9 is a schematic diagram of a process of data compression based on an AI model;

FIG. 10 is a schematic diagram of a process in which a compression module encodes a data block;

FIG. 11 is a schematic diagram of a process of data decompression based on an AI model;

FIG. 12 is an architecture diagram of a data processing apparatus provided by an embodiment of the present invention;

FIG. 13 is an architecture diagram of a data processing apparatus provided by an embodiment of the present invention;

FIG. 14 is a schematic structural diagram of a data processing apparatus provided by an embodiment of the present invention;

FIG. 15 shows a terminal-cloud system architecture provided by an embodiment of the present invention.

Detailed Description
The technical solutions in the embodiments of the present invention are described below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

FIG. 1 shows a schematic diagram of an artificial intelligence main framework, which describes the overall workflow of an artificial intelligence system and is applicable to general requirements in the artificial intelligence field.

The above artificial intelligence framework is explained below along two dimensions: the "intelligent information chain" (horizontal axis) and the "Internet Technology (IT) value chain" (vertical axis).

The "intelligent information chain" reflects a series of processes from data acquisition to processing. For example, it can be the general process of intelligent information perception, intelligent information representation and formation, intelligent reasoning, intelligent decision-making, and intelligent execution and output. In this process, data undergo a refinement process of "data - information - knowledge - wisdom".

The "IT value chain", from the underlying infrastructure of artificial intelligence and information (provision and processing technology implementations) to the industrial ecology of the system, reflects the value that artificial intelligence brings to the information technology industry.

The artificial intelligence main framework includes the following main components:
(1)基础设施(1) Infrastructure
基础设施为人工智能系统提供计算能力支持,实现与外部世界的沟通,并通过基础平台实现支撑。所述基础设施包括:传感器,用于与外部沟通;智能芯片(中央处理器(Central processing unit,CPU)、图形处理器(Graphics processing unit,GPU)、神经网络处理器(Neural-network process unit,NPU)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现场可编辑逻辑门阵列(Field Programmable Gate Array,FPGA)等硬件加速芯片),用于提供计算能力;基础平台,包括分布式计算 框架及网络等相关的平台保障和支持,可以包括云存储和计算、互联互通网络等。举例来说,传感器与外部沟通获取数据,将这些数据提供给基础平台提供的分布式计算系统中的智能芯片进行计算。The infrastructure provides computing power support for artificial intelligence systems, realizes communication with the outside world, and supports through the basic platform. The infrastructure includes: sensors for communicating with the outside; intelligent chips (Central processing unit (CPU), graphics processing unit (Graphics processing unit, GPU), neural-network process unit (Neural-network process unit, NPU), Application Specific Integrated Circuit (ASIC), Field Programmable Gate Array (Field Programmable Gate Array, FPGA) and other hardware acceleration chips), used to provide computing power; basic platform, including distributed computing framework and Network and other related platform guarantees and support can include cloud storage and computing, interconnection networks, etc. For example, the sensor communicates with the outside to obtain data, and provides the data to the intelligent chip in the distributed computing system provided by the basic platform for calculation.
(2)数据(2) Data
基础设施的上一层的数据用于表示人工智能领域的数据来源。数据涉及到图形、图像、语音、文本,还涉及到传统设备的物联网数据,包括已有系统的业务数据以及力、位移、液位、温度、湿度等感知数据。The data on the upper layer of the infrastructure is used to represent the data sources in the field of artificial intelligence. The data involves graphics, images, voice, and text, as well as IoT data from traditional devices, including business data from existing systems and sensory data such as force, displacement, liquid level, temperature, and humidity.
(3) Data processing
Data processing usually includes manners such as data training, machine learning, deep learning, searching, reasoning, and decision-making.
Machine learning and deep learning can perform symbolic and formalized intelligent information modeling, extraction, preprocessing, training, and the like on data.
Reasoning refers to the process of simulating human intelligent reasoning in a computer or intelligent system, using formalized information to perform machine thinking and solve problems according to a reasoning control strategy; its typical functions are searching and matching.
Decision-making refers to the process of making decisions after reasoning on intelligent information, and usually provides functions such as classification, ranking, and prediction.
(4) General capabilities
After the data has undergone the data processing mentioned above, some general capabilities can be further formed based on the results of the data processing. These may be algorithms or general-purpose systems, for example, translation, text analysis, computer-vision processing, speech recognition, image recognition, and so on.
(5) Smart products and industry applications
Smart products and industry applications refer to products and applications of artificial intelligence systems in various fields. They encapsulate the overall artificial intelligence solution, productize intelligent information decision-making, and implement practical applications. The application fields mainly include intelligent manufacturing, intelligent transportation, smart home, intelligent healthcare, intelligent security, autonomous driving, safe city, intelligent terminals, and the like.
Referring to FIG. 2, an embodiment of the present invention provides a system architecture 200. A data collection device 260 is configured to collect AI model sample data and store it in a database 230, and a training device 220 generates a target model/rule 201 based on the sample data maintained in the database 230.
The AI model includes, for example, a neural network model.
The neural network model is a network structure that imitates the behavioral characteristics of animal neural networks to process information, and is also referred to as an artificial neural network (ANN). The neural network model includes, for example, at least one of a variety of neural network models such as a convolutional neural network (CNN), a deep neural network (DNN), and a recurrent neural network (RNN). The structure of a neural network model is formed by a large number of nodes (or neurons) connected to one another, and achieves the purpose of processing information by learning and training on input information based on a specific operational model. A neural network model includes an input layer, hidden layers, and an output layer. The input layer is responsible for receiving input signals, the output layer is responsible for outputting the computation results of the neural network, and the hidden layers are responsible for computation processes such as learning and training and serve as the memory units of the network. The memory function of the hidden layers is represented by a weight matrix, and usually each neuron corresponds to one weight coefficient.
The work of each layer in a neural network model can be described by the mathematical expression

y = a(W*x + b)

where x is the input vector of the layer, y is the output value (or output vector) of the layer, and a, W, and b are the model parameters included in the layer. The input vector of the model's input layer is the input feature vector of the model, each element of which is a feature value of the object to be predicted; the output value produced by the model's output layer is the predicted value of the model, and the predicted value indicates the prediction result for the object to be predicted. At the physical level, the work of each layer in the neural network model can be understood as completing the transformation from the input space (the set of input vectors) to the output space through five operations on the input space: 1. raising/lowering the dimension; 2. scaling up/down; 3. rotation; 4. translation; 5. "bending". Operations 1, 2, and 3 are performed by W*x, operation 4 is performed by +b, and operation 5 is realized by a(). The word "space" is used here because the object being classified is not a single thing but a class of things, and the space is the set of all individuals of that class. W is the weight vector, and each value in this vector represents the weight value of one neuron in this layer of the neural network. The vector W determines the spatial transformation from the input space to the output space described above, that is, the weight W of each layer controls how the space is transformed. The purpose of training the neural network model is ultimately to obtain the weight matrices of all layers of the trained neural network (the weight matrix formed by the vectors W of many layers). Therefore, the training process of a neural network is essentially learning how to control the spatial transformation, and more specifically, learning the weight matrix.
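As an illustrative aside, not part of the original disclosure, the per-layer computation y = a(W*x + b) can be written out in a few lines of Python with NumPy; the dimensions, the random initialization, and the choice of tanh as the activation a() are all assumptions made only for this sketch:

```python
import numpy as np

def layer_forward(x, W, b):
    """Compute y = a(W*x + b) for one layer, assuming tanh as the activation a()."""
    return np.tanh(W @ x + b)

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))   # weight matrix: operations 1-3 (re-dimension, scale, rotate)
b = rng.normal(size=3)        # bias vector: operation 4 (translation)
x = rng.normal(size=4)        # input vector of the layer
y = layer_forward(x, W, b)    # operation 5 ("bending") applied by the nonlinearity
```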
In the process of training the neural network model, the training device 220 may compare the predicted value of the current network with the desired target value, and update the weight vector of each layer of the neural network according to the difference between the two, so that the output of the neural network model is as close as possible to the value that is actually desired to be predicted. For example, if the predicted value of the neural network model is larger than the target value, the weight vector of the model is adjusted so that the predicted value of the model decreases, and vice versa; adjustment continues in this way until a target model/rule 201 capable of predicting the desired target value (that is, the ground-truth or label value) is obtained. To this end, a loss function or an objective function may be predefined; these are important equations for measuring the difference between the predicted value and the target value. Taking the loss function as an example, a higher output value (loss) of the loss function indicates a larger difference, so training the neural network model becomes a process of reducing this loss as much as possible.
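Continuing the sketch above (again an illustration under stated assumptions, not the embodiment itself: a squared-error loss and plain gradient descent are chosen only to make the loss-reduction idea concrete, reusing the names and the NumPy import from the previous block):

```python
def train_step(x, target, W, b, lr=0.01):
    """Nudge W and b so that the layer's prediction moves toward the target value."""
    y = np.tanh(W @ x + b)                  # forward pass: predicted value
    loss = 0.5 * np.sum((y - target) ** 2)  # assumed loss function: squared error
    grad_z = (y - target) * (1.0 - y ** 2)  # backpropagate through tanh
    W -= lr * np.outer(grad_z, x)           # gradient descent reduces the loss
    b -= lr * grad_z                        # by updating the weight parameters
    return loss
```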
The target model/rule 201 obtained by the training device 220 can be applied in different systems or devices. In FIG. 2, an execution device 210 is configured with an I/O interface 212 for data interaction with external devices, and a "user" can input data to the I/O interface 212 through a client device 240.
The execution device 210 can call data, code, and the like in a data storage system 250, and can also store data, instructions, and the like in the data storage system 250.
A computing module 211 processes the input data using the target model/rule 201 to output a processing result. Finally, the I/O interface 212 returns the processing result to the client device 240 to provide it to the user. The execution device 210 may further include associated function modules (an associated function module 213 and an associated function module 214 are schematically shown in FIG. 2), which may perform associated processing based on the processing result of the computing module 211 to output results associated with the processing result.
More deeply, the training device 220 can generate corresponding target models/rules 201 based on different data for different targets, so as to provide better results for users.
In the case shown in FIG. 2, the user may manually specify the data input to the execution device 210, for example, by operating in an interface provided by the I/O interface 212. In another case, the client device 240 may automatically input data to the I/O interface 212 and obtain results; if the automatic data input by the client device 240 requires the user's authorization, the user can set the corresponding permission in the client device 240. The user can view the results output by the execution device 210 on the client device 240, and the specific presentation form may be display, sound, action, or another specific manner. The client device 240 may also serve as a data collection terminal and store the collected sample data in the database 230.
It should be noted that FIG. 2 is merely a schematic diagram of a system architecture provided by an embodiment of the present invention, and the positional relationships between the devices, components, modules, and the like shown in the figure do not constitute any limitation. For example, in FIG. 2, the data storage system 250 is external memory relative to the execution device 210; in other cases, the data storage system 250 may also be placed in the execution device 210. In addition, the training device 220 and the execution device 210 may be the same computing device. For example, the training device 220 is a platform server that, after training the target model 201, serves as the execution device 210 to provide business processing services to users.
Embodiments of the present invention relate to a solution for lossless data compression combined with an AI model. Lossless data compression is a technique that, without losing useful information, reduces the amount of original data so as to reduce storage space and improve the efficiency of transmission, storage, and processing. The basic idea of a data compression algorithm is to replace characters that appear repeatedly in the data with shorter coding symbols; the higher the probability of a character's occurrence, the shorter its coding symbol. In the lossless data compression solution, an AI model predicts the occurrence probability of unit data in an original data block, so that the original data block can be compressed by an entropy coding algorithm based on the probabilities output by the AI model. The AI model may be, for example, any one of the above models such as the CNN model, the DNN model, and the RNN model. Among them, the RNN model is better suited to processing sequence data; for example, the RNN model can predict the occurrence probability of unit data appearing later in a data block based on the unit data appearing earlier, and therefore the RNN model has higher accuracy than models such as the CNN model. The embodiments of the present invention are described using the RNN model as an example. The RNN model includes many extended models, such as the long short-term memory (LSTM) model and the gated recurrent unit (GRU) model, and these extended models may also be referred to as RNN models.
FIG. 3 is a schematic structural diagram of the RNN model. Referring to FIG. 3, the RNN model is also a neural network model, and includes an input layer, multiple hidden layers (one layer is schematically shown in FIG. 3), and an output layer, and each neural layer includes multiple neurons. A difference from models such as the CNN model is that, in addition to being fully connected to the previous layer, the neurons in a hidden layer of the RNN model also have arrows pointing to themselves and arrows pointing to other neurons in the same layer. The arrows pointing to themselves and to other neurons indicate that, when performing the current computation, a neuron also retrieves the model's memory of previous input data, so that the output data is computed based on both the current input data and the previous input data.
FIG. 4 is a schematic diagram of model prediction by the RNN model. Referring to FIG. 4, x_(t-1), x_t, and x_(t+1) are the model inputs at time t-1, time t, and time t+1, respectively; o_(t-1), o_t, and o_(t+1) are the model outputs at time t-1, time t, and time t+1, respectively; and S_(t-1), S_t, and S_(t+1) are the model memories at time t-1, time t, and time t+1, respectively. The model memory S_t can be calculated by the following formula (1):

S_t = f(U*x_t + W*S_(t-1))   (1)

where U and W are model parameters, and f() is an activation function in the neural network, for example, a tanh function.

The model output o_t is calculated by the following formula (2):

o_t = softmax(V*S_t)   (2)

where V is a model parameter. Combining the above formula (1) and formula (2), it can be seen that the output o_t of the RNN model for the input x_t depends not only on the input x_t but also on the model memory S_(t-1) at time t-1, which is equivalent to also depending on the model inputs before the input x_t, such as x_(t-1).
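A minimal runnable sketch of formulas (1) and (2), assuming tanh as f() and small, arbitrary dimensions (256 one-hot input symbols, 64 hidden units); none of these sizes come from the embodiment:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def rnn_step(x_t, S_prev, U, W, V):
    """One time step: formula (1) updates the memory, formula (2) produces the output."""
    S_t = np.tanh(U @ x_t + W @ S_prev)   # S_t = f(U*x_t + W*S_(t-1))
    o_t = softmax(V @ S_t)                # o_t = softmax(V*S_t)
    return S_t, o_t

rng = np.random.default_rng(0)
U = rng.normal(scale=0.1, size=(64, 256))
W = rng.normal(scale=0.1, size=(64, 64))
V = rng.normal(scale=0.1, size=(256, 64))
S = np.zeros(64)                          # initial memory
x = np.zeros(256); x[97] = 1.0            # one-hot input for byte value 97
S, o = rnn_step(x, S, U, W, V)            # o: distribution over the 256 next symbols
```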
For example, in a scenario where the RNN model is used to predict the occurrence probability of characters in a passage of text, when the characters of a sentence (for example, "我是中国人", "I am Chinese") are input to the RNN model one by one, after each character is input, the RNN model predicts the probability of the next character based on the previously input characters. After "我是中国" has been input to the RNN model, the RNN model predicts, based on the association with the preceding characters, that the next character is "人" with a relatively high probability. Specifically, the occurrence probability of "人" can be expressed mathematically as P(人 | 我, 是, 中, 国), meaning the probability that the character "人" appears given the four characters "我", "是", "中", and "国"; the RNN model computes, based on the pre-trained parameters U, V, and W and the preceding "我是中国", that the probability of "人" is relatively high.
With reference to the above description, the RNN model can be trained based on a loss function that involves the parameters U, V, and W, so that the values of U, V, and W can be adjusted based on the loss function to make the difference between the output of the RNN model (that is, the predicted value) and the label value decrease continuously. For example, to use the RNN model to predict the occurrence probability of each character in a passage of text, multiple sentences may first be obtained from the text to generate multiple training samples for training a probability prediction model corresponding to that text. For example, if the text contains the sentence "我是中国人", the true occurrence probability of the character "人" following "我是中国" in the text can be determined in advance as the label value; the five characters of "我是中国人" are input to the RNN model one by one, the RNN model outputs the predicted value of the occurrence probability of "人", and the predicted value and the label value are substituted into the loss function to adjust the values of the parameters U, V, and W, so that the model's prediction for "人" in "我是中国人" gets closer to the label value. The RNN model is trained multiple times using multiple samples until it converges, and the occurrence probability of each character in the text can then be predicted by inputting the characters of the text into the trained RNN model one by one.
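The label-driven adjustment described above can be sketched as follows, reusing rnn_step and the parameters U, W, V from the previous block; to stay short, this illustration applies the cross-entropy gradient only to the output parameter V, whereas a real trainer would backpropagate through W and U as well (both the loss choice and this simplification are assumptions of the sketch):

```python
def train_output_layer(U, W, V, text_bytes, epochs=50, lr=0.1):
    """Feed the text byte by byte; after each byte, compare the predicted distribution
    for the next byte with the byte that actually follows (the label) and update V."""
    for _ in range(epochs):
        S = np.zeros(W.shape[0])
        for cur, nxt in zip(text_bytes[:-1], text_bytes[1:]):
            x = np.zeros(U.shape[1]); x[cur] = 1.0
            S, o = rnn_step(x, S, U, W, V)      # predicted next-byte distribution
            grad = o.copy(); grad[nxt] -= 1.0   # cross-entropy gradient w.r.t. V @ S
            V -= lr * np.outer(grad, S)         # move the prediction toward the label
    return V

V = train_output_layer(U, W, V, list("我是中国人".encode("utf-8")))
```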
FIG. 5 is a diagram of the hardware structure of an NPU chip provided by an embodiment of the present invention. Both the model prediction and the model training of the RNN model described above with reference to FIG. 4 can be implemented in the NPU chip shown in FIG. 5.
As shown in FIG. 5, the neural-network processing unit (NPU) is mounted on a host CPU as a coprocessor, and the host CPU assigns tasks to it. The core part of the NPU is an arithmetic circuit 503; a controller 504 controls the arithmetic circuit 503 to extract matrix data from memory and perform multiplication operations.
In some implementations, the arithmetic circuit 503 internally includes multiple processing engines (PEs). In some implementations, the arithmetic circuit 503 is a two-dimensional systolic array. The arithmetic circuit 503 may also be a one-dimensional systolic array or other electronic circuitry capable of performing mathematical operations such as multiplication and addition. In some implementations, the arithmetic circuit 503 is a general-purpose matrix processor.
A bus interface unit (BIU) 510 is used for the interaction between the bus and a direct memory access controller (DMAC) 505 and an instruction fetch buffer 509.
The instruction fetch buffer 509 obtains instructions from external memory through the bus interface unit 510, and the DMAC 505 obtains the original data of an input matrix A or a weight matrix B from external memory through the bus interface unit 510.
The DMAC 505 is mainly used to transfer input data in the external memory DDR to a unified memory 506, to transfer weight data (for example, the weight matrix B) to a weight memory 502, or to transfer input data (for example, the input matrix A) to an input memory 501.
For example, the arithmetic circuit 503 may read the data corresponding to the weight matrix B from the weight memory 502 and buffer it on each PE in the arithmetic circuit 503. The arithmetic circuit 503 reads the data of the input matrix A from the input memory 501, performs a matrix operation with the weight matrix B, and stores partial or final results of the resulting matrix C in an accumulator 508. The unified memory 506 may be used to store input data and/or output data.
A vector computation unit 507 includes multiple arithmetic processing units, and, when necessary, further processes the output of the arithmetic circuit, for example, vector multiplication, vector addition, exponential operations, logarithmic operations, and magnitude comparison. The vector computation unit 507 is mainly used for network computations of the non-convolutional/fully connected layers (FCL) in a neural network, such as pooling, batch normalization, and local response normalization.
In some implementations, the vector computation unit 507 may store processed output vectors in the unified memory 506. In some implementations, the vector computation unit 507 may apply a nonlinear function to the output of the arithmetic circuit 503 to generate activation values. In some implementations, the vector computation unit 507 generates normalized values, merged values, or both. In some implementations, the output vectors processed by the vector computation unit 507 can be used as activation inputs to the arithmetic circuit 503, for example, for use in subsequent layers of the neural network.
The instruction fetch buffer 509 connected to the controller 504 is used to store instructions used by the controller 504. The unified memory 506, the input memory 501, the weight memory 502, and the instruction fetch buffer 509 are all on-chip memories. The external memory is independent of this NPU hardware architecture. The operations of the layers in the RNN model shown in FIG. 3 may be performed by the arithmetic circuit 503 or the vector computation unit 507.
In technical solutions that use an AI model to losslessly compress data, model training is time-consuming, so data compression takes a long time and compression performance is poor.
To this end, an embodiment of the present invention proposes a data compression solution that exploits the correlation of data blocks in the time and/or space dimensions and uses the AI prediction model of a similar data block as the initial model of the current data block to be compressed, for performing data compression. The data compression solution of the embodiment of the present invention saves the model training time of the AI model and shortens the time consumed by data compression.
The data compression solution of the embodiments of the present invention can be applied to storage systems and/or storage products, for example, to the compression of small data blocks in scenarios such as databases and virtualization. It can be understood that the solutions of the embodiments of the present invention are not limited to use in storage systems, but can be applied in any other scenario that requires data compression; for example, they can be applied in a mobile terminal to compress data in the mobile terminal. A storage system is used as an example in the following description.
FIG. 6 is an architecture diagram of a storage system provided by an embodiment of the present invention. As shown in FIG. 6, the storage system includes a computing node 61, a storage node 62, and a storage medium 63. The computing node 61 and the storage node 62 may be physical servers, or may be virtual entities abstracted from general-purpose hardware resources, such as virtual machines or containers. The storage medium 63 is, for example, a solid state disk (SSD), a hard disk drive (HDD), storage class memory (SCM), or another storage medium, and the storage medium 63 may be a storage medium local to the storage node or a distributed storage medium connected to the storage node.
The computing node 61 can perform data access to the storage node 62, for example, writing data and reading data. Specifically, the computing node 61 can send a write request to the storage node 62 to write data. The data to be written in the write request can be various types of data, such as images, database data, or text; the data to be written is also the data to be compressed. After receiving the write request, the storage node 62 calculates, through a similarity analysis module 621, the similarity between the data to be written in the write request and each pre-stored reference data block, selects the reference data block with the highest similarity, obtains the AI model (that is, the probability prediction model) associated with the selected reference data block, and provides that AI model to a probability prediction module 622 as the initial AI model for the data to be written. After receiving the initial AI model, the probability prediction module 622 predicts, based on the initial AI model, the occurrence probabilities of the unit data (for example, characters) in the data to be written, and outputs the occurrence probability of each character to a compression module 623. The compression module 623 compresses the data to be written based on the occurrence probabilities of the characters in the data to be written, obtains a compressed data package of the data to be written, and stores the compressed data package in the storage medium; in addition to the compressed data of the data to be written, the compressed data package includes information about the AI model used to compress the data to be written. After storing the compressed data of the data to be written in the storage medium, the storage node records the correspondence between the data to be written and the storage address of the compressed data, to facilitate reading of the written data by the computing node. The storage node 62 may further include a decompression module 624. When the computing node 61 needs to read the written data, the storage node 62 can read the compressed data package from the storage medium 63, obtain the AI model for decompression based on the compressed data package, and decompress the compressed data based on that AI model through the probability prediction module 622 and the decompression module 624, so as to obtain the written data and return it to the computing node 61. The above similarity analysis module 621, probability prediction module 622, compression module 623, and decompression module 624 may take the form of software, hardware, or firmware.
In one implementation, the similarity analysis module 621, the compression module 623, and the decompression module 624 may be deployed on the host CPU in FIG. 5, that is, the host CPU performs the operations of the similarity analysis module 621, the compression module 623, and the decompression module 624; the probability prediction module 622 may be deployed on the NPU in FIG. 5, that is, the NPU performs the operations of the probability prediction module 622. It can be understood that the similarity analysis module 621, the compression module 623, and the decompression module 624 are not limited to deployment on the host CPU in FIG. 5, and the probability prediction module 622 is not limited to deployment on the NPU in FIG. 5; they may be deployed according to the specific device configuration.
It can be understood that although characters are used above as an example of the unit data, the unit data is not limited to characters; for example, the unit data may be part of the data of an image or a database. What data such as images, databases, and text have in common is that in a storage system they are all represented as binary numbers. Therefore, the unit data in these different types of data can be uniformly set as binary numbers of a predetermined length. The predetermined length is, for example, one byte, that is, 8 bits, so that there are 256 possible unit data values in total, from 0 to 255. It can be understood that the predetermined length is not limited to one byte and can be set according to the needs of a specific scenario; for example, it can also be set to two bytes, and the total number of possible unit data values changes accordingly with the unit data length. By setting the unit data in this way, different types of data can be processed consistently by the same system.
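As a small illustration of this uniform byte view (Python's bytes type is used here purely as an example, not as part of the embodiment):

```python
# Any data block, whether text, image, or database page, is viewed as a sequence of
# one-byte unit data, i.e. integers in the range 0..255.
block = "我是中国人".encode("utf-8")       # a text block becomes bytes
units = list(block)                        # e.g. [230, 136, 145, ...]
assert all(0 <= u <= 255 for u in units)
# With a two-byte unit length the alphabet would grow to 256*256 = 65536 symbols.
```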
The steps of the data compression solution of the embodiments of the present invention are described in detail below.
FIG. 7 is a flowchart of a data compression method provided by an embodiment of the present invention. The method includes:
Step S701: select a reference data block from multiple reference data blocks based on the similarity between the data block to be compressed and each pre-stored reference data block.

Step S702: obtain the AI model associated with the selected reference data block.

Step S703: compress the data block based on a compression model to generate a compressed data block, the compression model being obtained according to the AI model associated with the selected reference data block.

Step S704: generate a compressed data package, the compressed data package including the compressed data block and compression model information.
The method shown in FIG. 7 may be performed by the storage node 62 shown in FIG. 6. After receiving a write request from, for example, the computing node 61, the storage node 62 compresses the data to be written of the write request by the method shown in FIG. 7. The storage node 62 may compress the data to be written of the write request as a single data block, or the storage node 62 may divide the data to be written into multiple data blocks of a specified size (for example, 8 KB or 32 KB) and then compress each data block separately.
First, in step S701, a reference data block is selected from the multiple reference data blocks based on the similarity between the data block to be compressed and each pre-stored reference data block.
Step S701 may be performed by the similarity analysis module 621 in FIG. 6. The storage node 62 may pre-store, locally or in storage space such as a storage medium or the cloud, multiple reference data blocks and the AI model associated with each reference data block, where the AI model is a probability prediction model for the associated reference data block. FIG. 8 is a schematic diagram of the reference data block-model pairs pre-stored in the storage node. As shown in FIG. 8, multiple reference data block-model groups may be pre-stored; FIG. 8 schematically shows a first group, a second group, and a third group, each group including a specific type of reference data block. For example, the first group includes recently compressed data blocks as reference data blocks, the second group includes data blocks of different data types as reference data blocks, and the third group includes data blocks at different time points within a cycle as reference data blocks. A recently compressed data block, being temporally adjacent to the data block to be compressed, is likely to have a high similarity with it; therefore, recently compressed data blocks may be stored as reference data blocks in the first group. The different data types include, for example, pictures, text, and databases. The network structures, layers, and weights of the AI models corresponding to data blocks of different data types differ considerably, while the AI models corresponding to data blocks of the same data type are highly similar; therefore, data blocks of different data types may be stored as reference data blocks in the second group. Data blocks at different time points within a cycle are data blocks at several time points in one cycle in scenarios where the data changes periodically. For example, the computing node 61 may correspond to a food-delivery application (APP) that writes transaction data to the storage node 62, where the transaction data cycles periodically with a period of one day: the data blocks of transaction data at the three time points of morning, noon, and evening of a day differ considerably from one another, whereas the morning transaction data of different days may be highly similar. Therefore, the data blocks at the three time points of morning, noon, and evening of a day may be stored as reference data blocks in the third group. In each group in FIG. 8, a rectangular box represents a reference data block, and the number in the rectangular box is the number of that reference data block, for example, reference data block 1 and reference data block 2; a square box represents the AI model associated with each reference data block, and the number in the square box is the number of the AI model, for example, AI model 1 and AI model 2. A reference data block and an AI model with the same number form an associated reference data block-AI model pair; for example, reference data block 1 is associated with AI model 1. The AI model associated with a reference data block may be stored directly in the form of that AI model, or in the form of a similar model of that AI model together with the difference data between the similar model and that AI model, where the similar model is, for example, the model associated with another reference data block that has a high similarity with this reference data block.
It can be understood that the above grouping of the reference data blocks is merely illustrative rather than restrictive, and different reference data block groups can be set according to different storage scenarios. In the storage node 62, data blocks that are likely to have a high similarity with the data block to be compressed, from the time and/or space dimensions, are stored as reference data blocks.
After compressing each data block, the storage node 62 may update the pre-stored reference data blocks. For example, after each data block is compressed, the reference data blocks and corresponding models in the first group in FIG. 8 are updated, so that a certain number (for example, 5 to 10) of the most recently compressed data blocks are retained in that group. After each data block is compressed, the reference data block of the same data type in the second group is updated. After each data block is compressed, the reference data block at the same time point of the cycle in the third group is updated.
For an original data block to be compressed, the similarity analysis module 621 may select a reference data block with higher similarity based on the similarity between the original data block and the reference data blocks (for example, content similarity or data similarity). The similarity analysis module 621 may compute the similarity between the original data block and a reference data block with various similarity algorithms, such as the Jaccard similarity algorithm and the cosine similarity algorithm. The Jaccard similarity J(A, B) may be calculated by the following formula (3):

J(A, B) = |A ∩ B| / |A ∪ B|   (3)

where the symbol ∩ denotes taking the intersection of set A and set B, and the symbol ∪ denotes taking the union of set A and set B. For example, if the original data block is abc and the reference data block is bcd, the similarity between the original data block and the reference data block is (b, c)/(a, b, c, d) = 2/4 = 0.5.
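A direct rendering of formula (3) in Python; treating each block as the set of its byte values is one possible set construction (shingling the blocks into n-grams would be another), so this detail is an illustrative assumption:

```python
def jaccard_similarity(block_a: bytes, block_b: bytes) -> float:
    """Formula (3): |A ∩ B| / |A ∪ B| over the sets of byte values in each block."""
    a, b = set(block_a), set(block_b)
    return len(a & b) / len(a | b)

print(jaccard_similarity(b"abc", b"bcd"))  # -> 0.5, matching the example above
```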
In one implementation, the similarity analysis module 621 may calculate the similarity between the original data block and every reference data block, so as to select the reference data block with the highest similarity. In another implementation, the similarity analysis module 621 may select a corresponding reference data block group based on the characteristics of the original data block, and calculate the similarity between the original data block and each reference data block in that group, so as to select the reference data block with the highest similarity within the group. For example, if the data block to be compressed is text-type data, the similarity analysis module 621 may select, from the second group in FIG. 8, the reference data block with the highest similarity with the original data block.
In step S702, the AI model associated with the selected reference data block is obtained.
After selecting the reference data block, the similarity analysis module 621 can obtain the AI model corresponding to that reference data block based on the association between reference data blocks and AI models, and provide that AI model to the probability prediction module 622 as the initial AI model of the original data block.
In step S703, the data block is compressed based on a compression model to generate a compressed data block, where the compression model is obtained according to the AI model associated with the selected reference data block.
First, the probability prediction module 622 in FIG. 6 may obtain the compression model based on the initial AI model, and predict the occurrence probabilities of the unit data in the data block to be compressed through the compression model. In the case where the probability prediction module 622 takes the form of software, the probability prediction module 622 is the code that implements probability prediction on the unit data in the data block based on the above initial AI model. This code can be processed by the NPU in FIG. 5 to perform this step.
In one case, where the similarity between the selected reference data block and the original data block to be compressed is greater than or equal to a predetermined threshold, the initial AI model can be directly used as the AI model corresponding to the original data block (that is, the compression model), to perform probability prediction on the original data block.
In the case where the similarity between the selected reference data block and the data block to be compressed is less than the predetermined threshold, part of the data can be taken from the original data block to further train the initial AI model until the model converges, and the AI model obtained by this training is used as the compression model for probability prediction on the original data block. For the model training process, reference may be made to the above description of the training process of the RNN model. Specifically, in the storage node 62, the NPU may obtain part of the data in the original data block, and input a predetermined number of unit data of that partial data into the initial AI model, so as to output the occurrence probability of the unit data following the predetermined number of unit data. The storage node 62 can then train the initial AI model through the NPU according to the output of the initial AI model and the predetermined true occurrence probability (that is, the label value) of the unit data following the predetermined number of unit data, to obtain a trained AI model. In this case, because the reference data block corresponding to the initial AI model already has a certain similarity with the original data block, this model training will make the model converge quickly and therefore occupies less model training time.
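A hedged sketch of this branching logic, reusing jaccard_similarity from the earlier block; the threshold value, the "leading slice" used for fine-tuning, and the helper types are all assumptions made for illustration (the embodiment only requires a predetermined threshold and training until convergence):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ReferenceEntry:
    data: bytes     # the reference data block
    model: object   # the associated AI (probability prediction) model

SIM_THRESHOLD = 0.8  # assumed value for the "predetermined threshold"

def choose_compression_model(block: bytes, refs: list[ReferenceEntry],
                             fine_tune: Callable) -> object:
    """Reuse the most similar reference block's model as-is when similar enough;
    otherwise fine-tune it on part of the block until it converges."""
    best = max(refs, key=lambda r: jaccard_similarity(block, r.data))
    if jaccard_similarity(block, best.data) >= SIM_THRESHOLD:
        return best.model                       # initial model used directly
    return fine_tune(best.model, block[:len(block) // 8])  # assumed training slice
```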
After the compression model for compressing the original data block to be compressed is determined, the storage node 62 can use the compression model through the NPU to predict the occurrence probability of each unit data in the original data block, for compressing the original data block based on the occurrence probability of each unit data.
FIG. 9 is a schematic diagram of the process of data compression based on the compression model.
Assume that the length of the unit data in the data block is one byte. As described above, because one byte has 8 bits, one byte can represent 256 unit data values, from 0 to 255. Assume that the data block is text, so these 256 unit data values are 256 characters. A text data block is used as an example in the following description; it can be understood that if the data block is another type of data block, such as an image, the processing is the same.
As shown in FIG. 9, when compressing the original data block, the first byte (that is, the first character) of the data block is first input into the compression module. The compression module performs data compression based on, for example, an arithmetic coding algorithm, so the compression module needs the occurrence probability of each character in the data block in order to compress based on those probabilities. Because there are no characters before the first character of the data block, the AI model cannot predict the probability of the first character conditioned on preceding characters. Therefore, it can be preset that at this point the 256 characters all have the average occurrence probability (that is, 1/256), and the compression module can encode the first character of the data block based on this average probability. It can be understood that the embodiments of the present invention are not limited to the above. In another implementation, in a data block of a predetermined type, the occurrence probabilities of the 256 characters may differ; in that case, the first character of the data block can be input into the compression model, so that the compression model outputs the occurrence probability of each of the 256 characters to the compression module. The compression module can determine the occurrence probability of the first character based on the respective occurrence probabilities of the 256 characters, and encode the first character based on that occurrence probability.
While the compression module encodes the first character, the first character is also input into the compression model, so that the compression model predicts, based on the first character, the occurrence probability of each of the 256 characters at the second byte of the original data block conditioned on the first character; after performing this probability prediction, the compression model outputs to the compression module the predicted occurrence probabilities, conditioned on the first character, of the characters corresponding to the second byte. After the first character of the data block has been input into the compression module, the second character of the data block is input into the compression module, and the compression module determines the occurrence probability of the second character of the data block based on the predicted probabilities of the characters received from the compression model, and encodes the second character based on that occurrence probability. In the same way, the N characters of the data block can be encoded, finally obtaining the compressed data block of the data block. In the above description, the compression model outputs, for each byte of the original data block, the occurrence probabilities of the 256 characters conditioned on the previous character. It can be understood that the embodiments of the present invention are not limited to this; for example, it can be preset that the compression model outputs, for each byte of the original data block, the occurrence probabilities of the 256 characters conditioned on the previous two characters, on the previous three characters, and so on. For example, assume it is preset that the compression model outputs, for each byte of the original data block, the occurrence probabilities of the 256 characters conditioned on the previous two characters: after the first character of the original data block is input into the compression model, the compression model outputs the occurrence probabilities of the 256 characters conditioned on the first character; after the second character of the original data block is input into the compression model, the compression model outputs the occurrence probabilities of the 256 characters conditioned on the first and second characters; after the third character of the original data block is input into the compression model, the compression model outputs the occurrence probabilities of the 256 characters conditioned on the second and third characters; and so on.
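The interleaving described above, feeding a character to the encoder while the model predicts the distribution for the next one, can be sketched as a driver loop; predict_next stands in for the compression model, the uniform first-byte distribution follows the text, and the return format (pairs handed to the entropy coder) is an assumption of this sketch:

```python
import numpy as np

def compress_block(block: bytes, predict_next):
    """FIG. 9 as a loop: byte i is encoded under the distribution predicted from
    bytes 0..i-1; the first byte uses the average probability 1/256."""
    probs = np.full(256, 1.0 / 256)      # uniform distribution for the first byte
    pairs = []
    for byte in block:
        pairs.append((byte, probs))      # hand (symbol, distribution) to the coder
        probs = predict_next(byte)       # model predicts the next byte's distribution
    return pairs                         # consumed by an entropy coder, such as the
                                         # arithmetic-coding sketch further below
```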
After the probability prediction module 622 predicts the occurrence probability of each unit data in the data block to be compressed, the compression module 623 in FIG. 6 can compress the data block according to the occurrence probabilities of the unit data in the data block to be compressed, to generate the compressed data block.
The compression module 623 can compress the data block based on the occurrence probabilities of the unit data in the data block through various entropy coding algorithms, including, for example, the arithmetic coding algorithm, the asymmetric numeral systems algorithm, and the Huffman coding algorithm. The arithmetic coding algorithm is used as an example in the following description.
Assume the data block to be compressed consists of three characters a, b and c arranged in sequence. Referring to FIG. 9, the character a is first input into the compression module, together with the uniform probability over 256 characters. The compression module arranges the 256 characters in a fixed order in advance, so that the probability interval corresponding to each character can be determined from the characters' probabilities. Assume that in the compression module a occupies the 1st position among the 256 characters, b the 2nd and c the 3rd. FIG. 10 is a schematic diagram of the process in which the compression module encodes the data block "abc". After receiving character a and its probability (1/256), the compression module, according to a's position among the 256 characters, determines the 1st subinterval of length 1/256 of the unit-length segment, i.e. (0, 0.004), as the current coding interval, as shown on the left of FIG. 10, where 1/256 is approximated as 0.004.

While character a is input into the compression module, it is also input into the compression model, and after character a, the 2nd character b of the data block is input into the compression module. After receiving character a, the compression model is assumed to predict the probabilities of the characters following a as a: 1/4, b: 2/4 and c: 1/4, i.e. the other characters among the 256 have an occurrence probability of 0. Based on the occurrence probability of the 2nd character b and b's position among the 256 characters, the compression module divides the previously determined coding interval (0, 0.004) into three subintervals corresponding to a, b and c, as shown in the middle of FIG. 10, namely (0, 0.004*1/4=0.001), (0.001, 0.004*3/4=0.003) and (0.003, 0.004), and determines the interval (0.001, 0.003) corresponding to b as the current coding interval.

While character b is input into the compression module, it is also input into the compression model, and after character b, the 3rd character c of the data block is input into the compression module. Based on the previous character (character b), the compression model predicts the occurrence probabilities of the characters following b, assumed to be a: 1/5, b: 2/5 and c: 2/5, i.e. the other characters among the 256 have an occurrence probability of 0. Similarly, the compression module divides the previously determined coding interval (0.001, 0.003) into three subintervals corresponding to a, b and c, as shown on the right of FIG. 10, namely (0.001, (0.003-0.001)*1/5+0.001=0.0014), (0.0014, (0.003-0.001)*3/5+0.001=0.0022) and (0.0022, 0.003), and determines the interval (0.0022, 0.003) corresponding to c as the current coding interval. After determining this interval, the compression module takes any value in it, for example 0.0023, as the compressed data block of the data block abc.
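For concreteness, the interval narrowing described above can be sketched in code. The following Python fragment is a minimal illustration, not part of the embodiment: toy_predict hardcodes the per-step probabilities assumed in the FIG. 10 example in place of a real probability prediction module, and only the three characters a, b and c are modeled.

```python
def arithmetic_encode(block, predict):
    """Encode `block` by narrowing [low, high); `predict(prefix)` returns an
    ordered list of (symbol, probability) pairs for the next position."""
    low, high = 0.0, 1.0
    for i, symbol in enumerate(block):
        dist = predict(block[:i])      # model's prediction for position i
        cursor = low
        for sym, p in dist:            # walk subintervals in the fixed symbol order
            width = (high - low) * p
            if sym == symbol:
                low, high = cursor, cursor + width
                break
            cursor += width
    return (low + high) / 2            # any value in the final interval works

def toy_predict(prefix):
    """Illustrative stand-in for the compression model's predictions."""
    if prefix == "":                   # 1st character: uniform probability 1/256
        return [("a", 1/256), ("b", 1/256), ("c", 1/256)]
    if prefix == "a":                  # assumed prediction after "a"
        return [("a", 1/4), ("b", 2/4), ("c", 1/4)]
    return [("a", 1/5), ("b", 2/5), ("c", 2/5)]  # assumed prediction after "ab"

print(arithmetic_encode("abc", toy_predict))     # a value inside (0.0022, 0.003)
```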
In step S704, a compressed data packet of the data block to be compressed is generated, the compressed data packet including the compressed data block and compression model information.
After obtaining the compressed data block of the data block to be compressed as described above, the storage node 62 may generate a compressed data packet (also called a compressed package) of the data block based on the compressed data block and information about the compression model used to compress the original data block, so that the compressed data block can later be decompressed through the package to recover the original data block.

Specifically, if the initial AI model is directly used as the compression model for compressing the original data block, the compressed data packet may include the compressed data block and an identifier (for example, a number) of the initial AI model. It can be understood that the compressed data packet may alternatively include the compressed data block and the initial AI model itself.

During decompression, the AI model can be retrieved through the identifier of the initial AI model in the compressed data packet and used for decompression. By including the identifier of the AI model in the compressed data packet, rather than the AI model itself (i.e. all of its parameters), the storage space required for storing the packet is reduced and the data compression rate is greatly increased, where the data compression rate is calculated by the following formula (4):
data compression rate = (size of the original data block) / (size of the compressed data packet)    (4)
If, after the initial AI model is trained, the AI model obtained by training (hereinafter, the post-training AI model) is used as the compression model for compressing the original data block, the difference data between the post-training AI model and the initial AI model may first be computed, for example, difference data = post-training AI model - initial AI model, and a compressed data packet is then generated that includes the compressed data block, the identifier of the initial AI model and the difference data. During decompression, the initial AI model is retrieved through its identifier, and the post-training AI model is recovered from the difference data and the initial AI model, i.e. post-training AI model = initial AI model + difference data. Because the reference data block corresponding to the initial AI model is highly similar to the original data block, the difference data between the post-training AI model and the initial AI model is small, so the compressed data packet is small, which reduces the storage space required for storing it and increases the data compression rate. It can be understood that the embodiment of the present invention is not limited thereto; in this case, the compressed data packet may alternatively include the compressed data block and the post-training AI model.
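As a sketch of the two packet layouts just described, the fragment below assumes a model can be viewed as a flat numpy parameter array; the field names (block, model_id, diff) and the helper names are illustrative assumptions, not the embodiment's actual format.

```python
import numpy as np

def make_packet(compressed_block, model_id, initial_params, trained_params=None):
    """Package a compressed block together with compression model information."""
    packet = {"block": compressed_block, "model_id": model_id}
    if trained_params is not None:
        # difference data = post-training model - initial model
        packet["diff"] = trained_params - initial_params
    return packet

def recover_model(packet, model_store):
    """Rebuild the compression model at decompression time."""
    params = model_store[packet["model_id"]]   # look up the initial model by id
    if "diff" in packet:
        params = params + packet["diff"]       # trained = initial + difference data
    return params

initial = np.zeros(4)                          # toy parameter vectors
trained = np.array([0.01, -0.02, 0.0, 0.03])
store = {"m7": initial}
pkt = make_packet(0.0023, "m7", initial, trained)
assert np.allclose(recover_model(pkt, store), trained)
```

Because the reference data block associated with the initial model closely resembles the data being compressed, the diff array stays near zero and is itself highly compressible, which is what keeps the packet small.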
After generating the compressed data packet, the storage node 62 may store it in the storage medium 63 and record the correspondence between the original data block and the storage address of the compressed data packet.

Returning to FIG. 6, after the computing node 61 sends a write request for data to the storage node 62, when that written data needs to be read, the computing node 61 may send a read request for it to the storage node 62. After receiving the read request, the storage node 62 determines the storage address of at least one compressed data packet corresponding to the written data, reads the packet, obtains the compression model used for compressing the data block based on the compression model information in the packet, and uses that model to decompress the compressed data block in the packet, thereby recovering the original written data and returning it to the computing node 61.
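Condensed into a short Python sketch, the read path might look as follows; handle_read, address_map and storage.read are hypothetical names, recover_model is the helper sketched above, and decompress stands for an entropy decoder such as the one sketched after the FIG. 11 walkthrough below.

```python
def handle_read(block_id, address_map, storage, model_store):
    """Storage-node read path for previously written data (illustrative only)."""
    address = address_map[block_id]             # correspondence recorded at write time
    packet = storage.read(address)              # the compressed data packet
    model = recover_model(packet, model_store)  # rebuilt from the packet's model info
    return decompress(packet["block"], model)   # original data block, returned to node 61
```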
FIG. 11 is a schematic diagram of the process of data decompression based on a compression model.
Referring to FIG. 11, after obtaining the compressed data packet, the storage node 62 obtains the compressed data block of the original data block from the packet; in the compression example of FIG. 10 above, the compressed data block is 0.0023. Based on the compression model information in the packet, the storage node obtains the compression model used to compress the original data block. Similarly to the above, the probability of the 1st character of the original data block may be preset as the uniform probability over 256 characters (i.e. 1/256 ≈ 0.004). The storage node 62 inputs 0.0023 and this uniform probability into the decompression module. Like the compression module, the decompression module has ordered the 256 characters in advance, so that within the interval (0, 1), the interval (0, 0.004) corresponds to character a. The decompression module determines that the compressed data 0.0023 falls within (0, 0.004), and therefore the 1st character of the original data is character a. After the decompression module predicts character a, the storage node 62 inputs a into the compression model used for compressing the original data block. As when compressing the original data above, referring to FIG. 10, given input a the compression model outputs predicted probabilities over the 256 characters, i.e. Pa=1/4, Pb=2/4, Pc=1/4, with all other characters having an occurrence probability of 0. The compression model feeds the predicted probabilities of a, b and c to the decompression module, which, according to the predetermined ordering of the characters, determines that for the 2nd character of the original data block, a corresponds to the subinterval (0, 0.001) of the previously determined interval (0, 0.004), b to the subinterval (0.001, 0.003), and c to the subinterval (0.003, 0.004). Since the compressed data block 0.0023 falls within the interval (0.001, 0.003) corresponding to b, the decompression module determines that the 2nd character of the original data block is character b.

Similarly, after the 2nd character is predicted, the predicted character b is input into the compression model, which predicts the probabilities of the characters at the 3rd position, for example, as shown in FIG. 10, Pa=1/5, Pb=2/5, Pc=2/5, and feeds them to the decompression module. Based on the ordering of the characters, the decompression module determines that within the previously determined interval (0.001, 0.003), a corresponds to (0.001, 0.0014), b to (0.0014, 0.0022), and c to (0.0022, 0.003). Since the compressed data block 0.0023 falls within the interval (0.0022, 0.003) corresponding to c, the decompression module determines that the 3rd character of the original data block is character c. After the decompression module has sequentially output all the characters of the original data block (i.e. abc), the storage node 62 has completed decompressing the compressed data block and obtained the original data block, which it then returns to the computing node 61 as the response to the read request.
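The decode loop mirroring this walkthrough can be sketched as follows; it reuses the illustrative toy_predict from the encoding sketch above and, like it, is an assumption-laden toy rather than the embodiment's decompression module.

```python
def arithmetic_decode(code, length, predict):
    """Recover `length` symbols by re-deriving the encoder's subintervals and
    selecting, at each step, the one that contains `code`."""
    low, high, out = 0.0, 1.0, ""
    for _ in range(length):
        dist = predict(out)                # same predictions as at encode time
        cursor = low
        for sym, p in dist:
            width = (high - low) * p
            if cursor <= code < cursor + width:
                out += sym                 # `code` falls in this symbol's interval
                low, high = cursor, cursor + width
                break
            cursor += width
    return out

print(arithmetic_decode(0.0023, 3, toy_predict))   # -> abc
```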
The data processing solution of the embodiment of the present invention has been described above using the compression and decompression of stored data in a storage system as an example. Through this solution, data compression time can be saved and the data compression rate can be greatly improved. It can be understood that the solution is not limited to use in storage nodes as described above and can be applied in other scenarios and other devices. For example, a user's mobile terminal may compress original data using the data compression scheme of the embodiment of the present invention before saving it, thereby saving storage space on the mobile terminal and improving the user's experience.
FIG. 12 is an architectural diagram of a data processing apparatus provided by an embodiment of the present invention. The apparatus is configured to execute the data processing method shown in FIG. 7 and includes a processing unit 1201 and a storage unit 1202, the storage unit storing executable code.

The processing unit 1201 is configured to execute the executable code to implement: selecting a first reference data block based on the similarity between a first data block to be compressed and each pre-stored reference data block; obtaining a first model associated with the first reference data block; compressing the first data block based on a compression model to generate a first compressed data block, the compression model being obtained according to the first model; and generating a first compressed data packet, the first compressed data packet including the first compressed data block and compression model information, the compression model information being obtained according to the compression model.
In a possible implementation, the processing unit 1201 being configured to execute the executable code to compress the first data block based on the compression model obtained according to the first model is specifically: the processing unit 1201 is configured to execute the executable code to use the first model as the compression model, compress the first data block and generate the first compressed data block.

In a possible implementation, the processing unit 1201 being configured to execute the executable code to compress the first data block based on the compression model obtained according to the first model is specifically: the processing unit 1201 is configured to execute the executable code to use a second model as the compression model, compress the first data block and generate the first compressed data block, the second model being obtained by training the first model with part of the data in the first data block, as sketched below.
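One way to realize this second-model variant is sketched below, assuming a PyTorch-style next-byte model; the training fraction, step count and learning rate are illustrative assumptions, not values from the embodiment.

```python
import copy
import torch
import torch.nn.functional as F

def adapt_model(first_model, block, frac=0.1, steps=8, lr=1e-3):
    """Fine-tune a copy of the first model on the leading part of `block`,
    yielding the second model that is then used as the compression model."""
    second_model = copy.deepcopy(first_model)        # keep the first model intact
    opt = torch.optim.SGD(second_model.parameters(), lr=lr)
    head = torch.tensor(list(block[: int(len(block) * frac)]), dtype=torch.long)
    for _ in range(steps):
        opt.zero_grad()
        logits = second_model(head[:-1])             # predicted logits for each next byte
        loss = F.cross_entropy(logits, head[1:])
        loss.backward()
        opt.step()
    return second_model
```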
In a possible implementation, the processing unit 1201 is further configured to execute the executable code to implement: storing the first data block as a reference data block and storing the compression model information, the stored first data block being stored in association with the stored compression model information.

In a possible implementation, the processing unit 1201 includes a neural network processor 1203; the processing unit 1201 being configured to execute the executable code to compress the first data block based on the first model is specifically: the neural network processor 1203 is configured to execute the executable code to compress the first data block based on the first model.

In a possible implementation, the processing unit 1201 is further configured to execute the executable code to implement: obtaining the first compressed data packet; obtaining the compression model according to the compression model information in the first compressed data packet; and decompressing the first compressed data block in the first compressed data packet through the compression model to obtain the first data block.
FIG. 13 is an architectural diagram of a data processing apparatus provided by an embodiment of the present application. The apparatus is configured to execute the data processing method shown in FIG. 7 and includes:

a selection unit 131, configured to select a first reference data block based on the similarity between a first data block to be compressed and each pre-stored reference data block;

an obtaining unit 132, configured to obtain a first model associated with the first reference data block;

a compression unit 133, configured to compress the first data block based on a compression model to generate a first compressed data block, the compression model being obtained according to the first model; and

a generating unit 134, configured to generate a first compressed data packet, the first compressed data packet including the first compressed data block and compression model information, the compression model information being obtained according to the compression model.
In a possible implementation, the compression unit 133 is specifically configured to: predict the occurrence probability of unit data in the first data block based on the compression model; and compress the first data block according to the occurrence probability of the unit data in the first data block to generate the first compressed data block.

In a possible implementation, the compression unit 133 is specifically configured to: use a second model as the compression model to compress the first data block and generate the first compressed data block, the second model being obtained by training the first model with part of the data in the first data block.

In a possible implementation, the data processing apparatus further includes a storage unit 135, configured to store the first data block as a reference data block and to store the compression model information, the stored first data block being stored in association with the stored compression model information.

In a possible implementation, the data processing apparatus includes a neural network processor, and the compression unit 133 is deployed in the neural network processor.

In a possible implementation, the obtaining unit 132 is further configured to obtain the first compressed data packet and to obtain the compression model according to the compression model information in the first compressed data packet, and the data processing apparatus further includes a decompression unit 136, configured to decompress the first compressed data block in the first compressed data packet through the compression model to obtain the first data block.

Each unit in the data processing apparatus shown in FIG. 13 may take the form of software, hardware or firmware, which is not limited in this embodiment of the present invention.
The present application further provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed in a computer or processor, the computer or processor is caused to execute the methods described above with reference to the figures.

The present application further provides a computer program product; when the computer program product runs in a computer or processor, the computer or processor is caused to execute the methods described above with reference to the figures.
FIG. 14 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention. The apparatus may include a processor 110, an internal memory 120, a connection module 130, a display 140 and an interface module 150, and may further include other modules or components, such as an audio module, which are not shown in FIG. 14. It can be understood that the structure illustrated in this embodiment does not constitute a specific limitation on the data processing apparatus. In other embodiments of the present application, the apparatus may include more or fewer components than shown, combine some components, split some components, or arrange the components differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.

The processor 110 may include one or more processing units; for example, the processor 110 may include at least one of an application processor (AP), a modem processor, a GPU, an image signal processor (ISP), a CPU, a video codec, a digital signal processor (DSP), a baseband processor and/or an NPU. FIG. 14 schematically shows the CPU, ISP, NPU and GPU in the processor 110, which may be connected through a bus. The different processing units may be independent devices or may be integrated in one or more processors. For example, the processor 110 may be a chip or chipset, and the application processor may be the CPU.

A memory may also be provided in the processor 110 for storing instructions and data. This memory may hold instructions or data that the processor 110 has just used or uses cyclically. If the processor 110 needs the instructions or data again, it can call them directly from this memory, avoiding repeated accesses to memory outside the processor 110, reducing the processor's waiting time and thus improving the efficiency of the system.

The internal memory 120, also called the main memory, may be used to store computer-executable program code, which includes instructions. The processor 110 executes the various functional applications and data processing of the data processing apparatus by running the instructions stored in the internal memory 120. The internal memory 120 may include a program storage area and a data storage area. The program storage area may store an operating system, application code and the like; for example, as shown in FIG. 14, the program storage area of the internal memory 120 stores the following modules for executing the method shown in FIG. 7: a selection module 121, configured to select a first reference data block based on the similarity between a first data block to be compressed and each pre-stored reference data block; an obtaining module 122, configured to obtain a first model associated with the first reference data block; a compression module 123, configured to compress the first data block based on a compression model to generate a first compressed data block, the compression model being obtained according to the first model; and a generating module 124, configured to generate a first compressed data packet, the first compressed data packet including the first compressed data block and compression model information, the compression model information being obtained according to the compression model.

In addition, the internal memory 120 may include random access memory (RAM), such as double data rate synchronous dynamic random access memory (DDR memory), and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or universal flash storage (UFS).

The connection module 130 may be used, for example, for wired connections and for wireless local area networks (WLAN). The display screen 140 is used to display text, images, video and the like and includes a display panel. The display panel may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, quantum dot light-emitting diodes (QLED), or the like. The interface module 150 includes, for example, various interfaces such as an external memory interface and a USB interface.
Referring to FIG. 15, an embodiment of the present invention provides a terminal-cloud system architecture 300. The execution device 210 is implemented by one or more servers and optionally cooperates with other data processing devices, such as data storage, routers and load balancers; the execution device 210 may be arranged at one physical site or distributed across multiple physical sites. The execution device 210 may use the data in the data storage system 250, or call program code in the data storage system 250, to implement the method shown in FIG. 7.

Users may operate their respective user devices (for example, the local device 301 and the local device 302) to interact with the execution device 210. Each local device may be any computing device, such as a personal computer, computer workstation, smartphone, tablet, smart camera, smart car or other type of cellular phone, media consumption device, wearable device, set-top box or game console.

Each user's local device may interact with the execution device 210 through a communication network of any communication mechanism or communication standard, such as a wide area network, a local area network, a point-to-point connection, or any combination thereof.

In another implementation, one or more aspects of the execution device 210 may be implemented by each local device; for example, the local device 301 may provide local data to, or feed back calculation results to, the execution device 210.

It should be noted that all functions of the execution device 210 may also be implemented by a local device. For example, the local device 301 may implement the functions of the execution device 210 and provide services for its own user, or provide services for the user of the local device 302.
It should be understood that the terms "first", "second" and the like herein are merely used to distinguish similar concepts for simplicity of description and have no other limiting effect.

Those skilled in the art can clearly understand that the descriptions of the embodiments provided in this application may refer to one another. For convenience and brevity of description, for example, for the functions of the apparatuses and devices provided in the embodiments of the present invention and the steps they execute, reference may be made to the relevant descriptions of the method embodiments of this application, and the method embodiments and the apparatus embodiments may also refer to one another.

In the above embodiments, implementation may be in whole or in part by software, hardware, firmware or any combination thereof. When software is used, implementation may be in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired (for example, coaxial cable, optical fiber or digital subscriber line) or wireless (for example, infrared, radio or microwave) means. The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device such as a server or data center integrating one or more available media.

In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways without exceeding the scope of this application. For example, the embodiments described above are merely illustrative; for example, the division into modules or units is only a division by logical function, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units, i.e. they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement this without creative effort.

In addition, the described apparatus and methods, as well as the schematic diagrams of the various embodiments, may be combined or integrated with other systems, modules, techniques or methods without exceeding the scope of this application. Furthermore, the mutual coupling or direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, apparatuses or units, and may be in electrical, mechanical or other form.

The above are only specific embodiments of this application, but the protection scope of this application is not limited thereto. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (19)

1. A data processing method, comprising:
selecting a first reference data block based on a similarity between a first data block to be compressed and each pre-stored reference data block;
obtaining a first model associated with the first reference data block;
compressing the first data block based on a compression model to generate a first compressed data block, the compression model being obtained according to the first model; and
generating a first compressed data packet, the first compressed data packet comprising the first compressed data block and compression model information, the compression model information being obtained according to the compression model.
2. The method according to claim 1, wherein compressing the first data block based on the compression model to generate the first compressed data block specifically comprises:
predicting an occurrence probability of unit data in the first data block based on the compression model; and
compressing the first data block according to the occurrence probability of the unit data in the first data block to generate the first compressed data block.
3. The method according to claim 1 or 2, wherein compressing the first data block based on the compression model obtained according to the first model specifically comprises:
using the first model as the compression model to compress the first data block and generate the first compressed data block.
4. The method according to claim 3, wherein the compression model information comprises the first model or an identifier of the first model.
5. The method according to claim 1 or 2, wherein compressing the first data block based on the compression model obtained according to the first model specifically comprises:
using a second model as the compression model to compress the first data block and generate the first compressed data block, the second model being obtained by training the first model with part of the data in the first data block.
6. The method according to claim 5, wherein the compression model information comprises any one of the following: the second model; an identifier of the second model; or an identifier of the first model and difference data between the second model and the first model.
7. The method according to any one of claims 1-6, wherein the reference data blocks comprise at least one of the following: compressed data blocks, data blocks of different data types, or data blocks at different time points of a cycle.
8. The method according to any one of claims 1-7, further comprising: storing the first data block as a reference data block and storing the compression model information, the stored first data block being stored in association with the stored compression model information.
9. The method according to any one of claims 1-8, further comprising:
obtaining the first compressed data packet;
obtaining the compression model according to the compression model information in the first compressed data packet; and
decompressing the first compressed data block in the first compressed data packet through the compression model to obtain the first data block.
10. A data processing apparatus, comprising a processing unit and a storage unit, the storage unit storing executable code, and the processing unit being configured to execute the executable code to implement:
selecting a first reference data block based on a similarity between a first data block to be compressed and each pre-stored reference data block;
obtaining a first model associated with the first reference data block;
compressing the first data block based on a compression model to generate a first compressed data block, the compression model being obtained according to the first model; and
generating a first compressed data packet, the first compressed data packet comprising the first compressed data block and compression model information, the compression model information being obtained according to the compression model.
11. The apparatus according to claim 10, wherein the processing unit being configured to execute the executable code to compress the first data block based on the compression model obtained according to the first model is specifically:
the processing unit being configured to execute the executable code to use the first model as the compression model to compress the first data block and generate the first compressed data block.
12. The apparatus according to claim 11, wherein the compression model information comprises the first model or an identifier of the first model.
13. The apparatus according to claim 10, wherein the processing unit being configured to execute the executable code to compress the first data block based on the compression model obtained according to the first model is specifically:
the processing unit being configured to execute the executable code to use a second model as the compression model to compress the first data block and generate the first compressed data block, the second model being obtained by training the first model with part of the data in the first data block.
14. The apparatus according to claim 13, wherein the compression model information comprises any one of the following: the second model; an identifier of the second model; or an identifier of the first model and difference data between the second model and the first model.
15. The apparatus according to any one of claims 10-14, wherein the reference data blocks comprise at least one of the following: compressed data blocks, data blocks of different data types, or data blocks at different time points of a cycle.
16. The apparatus according to any one of claims 10-14, wherein the processing unit is further configured to execute the executable code to implement: storing the first data block as a reference data block and storing the compression model information, the stored first data block being stored in association with the stored compression model information.
17. The apparatus according to any one of claims 10-16, wherein the processing unit is further configured to execute the executable code to implement:
obtaining the first compressed data packet;
obtaining the compression model according to the compression model information in the first compressed data packet; and
decompressing the first compressed data block in the first compressed data packet through the compression model to obtain the first data block.
18. A computer-readable storage medium on which a computer program is stored, wherein, when the computer program is executed in a computer or processor, the computer or processor is caused to execute the method according to any one of claims 1-9.
19. A computer program product, wherein, when the computer program product runs in a computer or processor, the computer or processor is caused to execute the method according to any one of claims 1-9.