CN118244993A - Data storage method, data processing method and device, electronic equipment and medium - Google Patents


Info

Publication number
CN118244993A
Authority
CN
China
Prior art keywords
data
pieces
bits
storage space
preset condition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410636185.9A
Other languages
Chinese (zh)
Other versions
CN118244993B (en)
Inventor
张伟豪
刘发强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Lynxi Technology Co Ltd
Original Assignee
Beijing Lynxi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Lynxi Technology Co Ltd filed Critical Beijing Lynxi Technology Co Ltd
Priority to CN202410636185.9A priority Critical patent/CN118244993B/en
Publication of CN118244993A publication Critical patent/CN118244993A/en
Application granted granted Critical
Publication of CN118244993B publication Critical patent/CN118244993B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608 Saving storage space on storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/174 Redundancy elimination performed by the file system
    • G06F16/1744 Redundancy elimination performed by the file system using compression, e.g. sparse files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Computational Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Algebra (AREA)
  • Human Computer Interaction (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The disclosure provides a data storage method, a data processing method and apparatus, an electronic device, and a medium. The data storage method comprises: splitting each of m first data to be stored to obtain m second data and m third data; performing compression encoding on the m second data to obtain fourth data corresponding to the m second data; and storing the fourth data in a first storage space and the m third data in a second storage space. According to the embodiments of the disclosure, occupation of the storage space can be reduced and the utilization efficiency of the storage space improved.

Description

Data storage method, data processing method and device, electronic equipment and medium
Technical Field
The disclosure relates to the field of computer technology, and in particular to a data storage method, a data processing method and apparatus, an electronic device, and a medium.
Background
A neural network generally comprises multiple network layers, each of which typically contains multiple neurons; each neuron in a network layer is connected to the neurons of the previous layer or to the input data. During forward propagation, each network layer that processes a computing task typically needs to read, from a storage space, the activation values output by the neurons of the previous network layer, and computes on those activation values and the neurons' weights to produce the output data of the neural network.
Because a neural network, especially one with a complex structure, has many neurons in each network layer, the activation values output by every network layer must be stored during forward propagation when the network executes a computing task, which occupies a large amount of storage space.
Disclosure of Invention
The disclosure provides a data storage method, a data processing method and device, electronic equipment and medium.
In a first aspect, the present disclosure provides a data storage method, the data storage method comprising:
splitting m first data to be stored respectively to obtain m second data and m third data, wherein the value distribution of the m second data satisfies a preset condition, the value distribution of the m third data does not satisfy the preset condition, and the preset condition comprises: the entropy value of the value distribution is smaller than a preset threshold, and m is greater than 1;
compression encoding is carried out on the m second data to obtain fourth data corresponding to the m second data, wherein the data size of the fourth data is smaller than that of the m second data;
and storing the fourth data in a first storage space and the m third data in a second storage space.
In a second aspect, the present disclosure provides a data processing method, the data processing method comprising:
under the condition that a neural network executes a target processing task, fourth data are read from a first storage space and m pieces of third data are read from a second storage space aiming at a first network layer of the neural network, wherein the fourth data and the m pieces of third data are obtained by storing m pieces of first data according to the data storage method of the first aspect, and the m pieces of first data are m activation values output by the first network layer in a forward propagation process;
decompressing the fourth data to obtain m second data;
generating the m first data according to the m second data and the m third data;
and taking the m first data as input data of a second network layer of the neural network, and processing the input data based on the second network layer to obtain output data of the second network layer.
In a third aspect, the present disclosure provides a data storage device comprising:
a data splitting module, configured to split m first data to be stored respectively to obtain m second data and m third data, where the value distribution of the m second data meets a preset condition, the value distribution of the m third data does not meet the preset condition, and the preset condition includes: the entropy value of the value distribution is smaller than a preset threshold, and m is greater than 1;
a data encoding module, configured to perform compression encoding on the m second data to obtain fourth data corresponding to the m second data, where the data volume of the fourth data is smaller than that of the m second data;
and a data storage module, configured to store the fourth data in the first storage space and the m third data in the second storage space.
In a fourth aspect, the present disclosure provides a data processing apparatus comprising:
a data reading module, configured to, when the neural network executes a target processing task, read fourth data from a first storage space and m third data from a second storage space for a first network layer of the neural network, where the fourth data and the m third data are obtained by storing m first data according to the data storage method of the first aspect, and the m first data are m activation values output by the first network layer during forward propagation;
a data decompression module, configured to decompress the fourth data to obtain m second data;
a data generation module, configured to generate the m first data according to the m second data and the m third data;
and a data processing module, configured to take the m first data as input data of a second network layer of the neural network and process the input data based on the second network layer to obtain output data of the second network layer.
In a fifth aspect, the present disclosure provides an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor, wherein the memory stores one or more computer programs executable by the at least one processor to enable the at least one processor to perform the data storage method of the first aspect or the data processing method of the second aspect.
In a sixth aspect, the present disclosure provides a computer readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the data storage method of the first aspect or the data processing method of the second aspect.
In the embodiments provided by the present disclosure, it is considered that, for m first data, the overall value distribution may not satisfy a preset condition for data compression (for example: the entropy value of the value distribution being smaller than a preset threshold), whereas a local part of each datum, obtained by splitting, may satisfy that condition. Therefore, each first datum is split so that the part whose value distribution satisfies the preset condition becomes the second data and the part that does not becomes the third data. By compression-encoding the second data, the compressible portion of the m first data (for example, the portion whose value-distribution entropy is smaller than the preset threshold) is compressed into fourth data whose data volume is smaller than that of the original m second data. Compared with the storage space consumed by directly storing the m first data, storing the compressed fourth data and the m third data separately reduces occupation of the storage space and improves its utilization efficiency.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The accompanying drawings are included to provide a further understanding of the disclosure, and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure, without limitation to the disclosure. The above and other features and advantages will become more readily apparent to those skilled in the art by describing in detail exemplary embodiments with reference to the attached drawings, in which:
FIG. 1 is a flow chart of a data storage method provided in an embodiment of the present disclosure;
FIG. 2 is a flow chart of a data splitting process provided by an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of data bits of floating point data according to an embodiment of the present disclosure;
FIG. 4 is a flow chart of a compression encoding process provided by an embodiment of the present disclosure;
FIG. 5a is a first schematic diagram of a differential process provided by an embodiment of the present disclosure;
FIG. 5b is a second schematic diagram of the differential processing provided by embodiments of the present disclosure;
FIG. 6 is a flow chart of a data processing method provided by an embodiment of the present disclosure;
FIG. 7 is a block diagram of a data storage device provided by an embodiment of the present disclosure;
FIG. 8 is a block diagram of a data processing apparatus provided by an embodiment of the present disclosure;
Fig. 9 is a block diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
For a better understanding of the technical solutions of the present disclosure, exemplary embodiments of the present disclosure will be described below with reference to the accompanying drawings, in which various details of the embodiments of the present disclosure are included to facilitate understanding, and they should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Embodiments of the disclosure and features of embodiments may be combined with each other without conflict.
As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used herein, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. The terms "connected" or "connected," and the like, are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
When an electronic device executes a computing task, a group of data may be stored directly and read from the storage space when needed. If the data volume is small or hardware resources are sufficient, this direct storage imposes no burden on storage; if the data volume is large or hardware resources are insufficient, it generally imposes a heavy storage load and degrades the utilization efficiency of the storage space. For example, when a neural network performs a computing task, each network layer outputs multiple activation values during forward propagation; storing all of these activation values directly for every network layer tends to occupy a large amount of storage space and affect its utilization efficiency.
In the related art, to prevent a group of data from occupying excessive storage space, the data are generally compressed with a compression algorithm to reduce the amount of data that needs to be stored.
One example is the LZW (Lempel-Ziv-Welch) encoding algorithm, which builds a string-to-code conversion table and maps input strings to codewords according to that table. When encoding data with LZW, if the current string is already in the conversion table, its existing code is used directly; if not, it is added to the table and assigned a new code. In this way the data are compressed, reducing the amount that needs to be stored.
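The patent only names the algorithm; as a rough illustration, a minimal textbook LZW encoder (not the patent's implementation) can be sketched in Python as follows:

```python
def lzw_compress(data: str) -> list:
    """Minimal LZW encoder: emits the code of the longest known prefix
    and learns each newly seen substring as it goes."""
    # Initialize the table with every single character of the input
    dictionary = {ch: i for i, ch in enumerate(sorted(set(data)))}
    next_code = len(dictionary)
    current = ""
    output = []
    for ch in data:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate                 # extend the current match
        else:
            output.append(dictionary[current])  # emit code of longest match
            dictionary[candidate] = next_code   # learn the new substring
            next_code += 1
            current = ch
    if current:
        output.append(dictionary[current])
    return output

codes = lzw_compress("ABABABA")  # 7 characters encoded as 4 codewords
```

Repeated substrings such as "AB" and "ABA" receive their own codes, which is exactly why LZW needs repetition in the input to shrink it.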
Another example is a compression algorithm based on Huffman coding, which encodes source symbols with a variable-length code table constructed from symbol frequencies: symbols that occur frequently receive shorter codes. Constructing a Huffman code generally involves building a Huffman tree in which low-frequency symbols sit deeper in the tree and high-frequency symbols sit shallower, reducing the expected average length of the encoded string and thereby compressing the data.
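Again as an illustration rather than the patent's own code, the tree construction can be sketched with a heap of subtrees, where merging the two rarest subtrees pushes their symbols one level deeper:

```python
import heapq
from collections import Counter

def huffman_codes(data):
    """Build a Huffman code table: frequent symbols get shorter codes.
    Assumes `data` is non-empty."""
    freq = Counter(data)
    # Heap entries: (frequency, tiebreaker, {symbol: code-so-far});
    # the integer tiebreaker keeps tuple comparison away from the dicts.
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:                        # degenerate single-symbol input
        return {sym: "0" for sym in heap[0][2]}
    tiebreak = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)     # two lowest-frequency subtrees
        f2, _, right = heapq.heappop(heap)
        # Merging deepens both subtrees by one level: prefix their codes
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

codes = huffman_codes("aaaabbc")  # 'a' is most frequent, so its code is shortest
```

For "aaaabbc" the frequent symbol 'a' receives a one-bit code while 'b' and 'c' receive two-bit codes, matching the frequency-to-depth relationship described above.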
Other compression algorithms exist in the related art, but, as the descriptions of LZW and Huffman coding above show, such algorithms generally require the data to be stored to follow some regularity. LZW needs the data to contain many repeated substrings; if the characters in a group of data are entirely random, LZW cannot reduce the data volume. Likewise, Huffman coding relies on the occurrence probabilities, that is, the frequencies, of the characters; if the characters occur completely at random, the data volume may not be reduced either.
In view of this, the data storage method provided by the embodiments of the present disclosure considers that, although the overall value distribution of the m first data may not satisfy the preset condition for data compression (for example: the entropy value of the value distribution being smaller than a preset threshold), a local part of each datum, obtained by splitting, may satisfy it. Each first datum is therefore split so that the part whose value distribution satisfies the preset condition becomes the second data and the remaining part becomes the third data. Compression-encoding the second data compresses the compressible portion of the m first data (for example, the portion whose value-distribution entropy is smaller than the preset threshold) into fourth data whose data volume is smaller than that of the original m second data. Compared with the storage space required to store the m first data directly, storing the compressed fourth data and the m third data separately reduces occupation of the storage space and improves its utilization efficiency.
The data storage method and the data processing method of the embodiments of the present disclosure may be performed by an electronic device such as a terminal device or a server. The terminal device may be a vehicle-mounted device, User Equipment (UE), a mobile device, a user terminal, a cellular phone, a cordless phone, a Personal Digital Assistant (PDA), a handheld device, a computing device, a wearable device, or the like, and the method may be implemented by a processor invoking computer-readable program instructions stored in a memory. Alternatively, the method may be performed by a server.
Fig. 1 is a flowchart of a data storage method according to an embodiment of the present disclosure. Referring to fig. 1, the method includes:
In step S11, splitting m first data to be stored respectively to obtain m second data and m third data;
wherein the value distribution of the m second data satisfies a preset condition, the value distribution of the m third data does not satisfy the preset condition, m is greater than 1, and the preset condition may include: the entropy value of the value distribution is smaller than a preset threshold.
In the embodiment of the disclosure, the m first data may be a set of data generated or obtained during the process of executing a computing task by the electronic device, for example, the first data may be an activation value during the forward propagation of the neural network; the neural network may be used to execute a target processing task, where the target processing task may be any one of an image processing task, a voice processing task, a text processing task, and a video processing task. Of course, the first data may be a weight value of a neuron of each network layer of the neural network, or may be set as required, which is not limited herein.
In the disclosed embodiments, the first data may be any type of data that includes at least one data bit. It should be noted that the data bits in the embodiments of the present disclosure may be used to represent a portion of data in a single data.
For example, when the first data is floating-point data, it may include a sign bit representing the sign of the floating-point value, exponent bits representing the value of its exponent portion, and significand bits representing its significand portion. For the floating-point value "-9.625", the sign bit is "1", the exponent bits are "10000010", and the significand bits are "00110100000000000000000". Of course, the first data may also consist of multiple data fields, each of which may be regarded as one data bit of the data. For example, if the first data consists of the two fields "number" and "name", then for "0010user01" the "number" data bit is "0010" and the "name" data bit is "user01".
It should be noted that, in the embodiments of the present disclosure, the preset condition is used to determine whether a value distribution is suitable for data compression. It may be, for example, that the entropy value of the value distribution is smaller than a preset threshold, where the threshold may be set as required; alternatively, the preset condition may be that the occurrence frequency or probability of strings of a preset length in the values is greater than a preset threshold. In practice, the preset condition may be set as required and is not specifically limited here.
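The entropy-based form of the preset condition can be sketched as follows; the threshold value used below is our assumption, since the patent leaves it open:

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def satisfies_preset_condition(values, threshold):
    """The preset condition named by the method: value-distribution
    entropy below a preset threshold (the threshold choice is ours)."""
    return entropy(values) < threshold

# A near-constant group of values has low entropy and is compressible...
low = satisfies_preset_condition([3, 3, 3, 4, 3, 3, 4, 3, 3, 3], 1.0)
# ...while ten distinct values have entropy log2(10), about 3.32 bits
high = satisfies_preset_condition(list(range(10)), 1.0)
```

A constant sequence has entropy 0, the theoretical floor, while a uniformly random sequence over k symbols approaches log2(k); the threshold separates the two regimes.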
In step S12, compression encoding is performed on the m second data to obtain fourth data corresponding to the m second data, where the data size of the fourth data is smaller than the data size of the m second data;
in step S13, the fourth data is stored in the first storage space, and m pieces of third data are stored in the second storage space.
In the embodiment of the present disclosure, the first storage space and the second storage space may be any two different storage spaces, which is not particularly limited in the embodiment of the present disclosure.
It can be seen that, in the embodiments of the present disclosure, each first datum is split so that the part whose value distribution meets the preset condition, for example whose value-distribution entropy is smaller than a preset threshold, becomes the second data, and the part that does not becomes the third data. Compression-encoding the second data then compresses the compressible portion of the m first data into fourth data whose data volume is smaller than that of the original m second data. Therefore, compared with the storage space required to store the m first data directly, storing the compressed fourth data and the m third data reduces excessive occupation of the storage space and improves its utilization efficiency.
Fig. 2 is a flowchart of a data splitting process provided by an embodiment of the present disclosure. In some possible implementations, the first data may include a plurality of data bits; referring to fig. 2, in this embodiment, in step S11, the splitting of the m first data to be stored to obtain m second data and m third data may include:
In step S111, for any one of the first data, a data bit satisfying a preset condition among the data bits of the first data is used as the second data corresponding to the first data;
in step S112, data bits other than the second data in the first data are used as third data corresponding to the first data.
In information theory, entropy is generally used to describe the degree of uncertainty of a source; the entropy value here may be any one of information entropy, cross entropy, relative entropy, conditional entropy, and the like.
That is, in the embodiments of the present disclosure, the m first data may be split by their different types of data bits, and the entropy value of the value distribution within each type of data bit is computed and compared with the preset threshold. If the entropy of the values in a certain type of data bit is low, those values are relatively stable and generally fall within a certain data range, so the data in that type of data bit may be treated as the second data.
Take m = 10 and first data comprising data bit 1 and data bit 2 as an example. Splitting the values of data bit 1 from the 10 data yields 10 sub-data 1, and splitting the values of data bit 2 yields 10 sub-data 2. If the computed entropy of the value distribution of the 10 sub-data 1 is e1, the entropy of the value distribution of the 10 sub-data 2 is e2, e1 is smaller than the preset threshold, and e2 is greater than the preset threshold, then the 10 sub-data 1 may be regarded as the second data of the embodiments of the present disclosure and the 10 sub-data 2 as the third data.
Because the entropy of the 10 sub-data 1 is low, that is, their values generally fall within a certain data range, the 10 sub-data 1 can be compression-encoded, for example with the LZW or Huffman algorithms of the related art, to obtain fourth data whose data volume is far smaller than that of the 10 sub-data 1. Storing the fourth data and the 10 sub-data 2 separately then reduces the amount of data stored and uses the storage space effectively.
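The split described in this example can be sketched as below; the concrete values and the threshold are invented for illustration, since the patent specifies neither:

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (bits) of the value distribution."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

PRESET_THRESHOLD = 2.0  # hypothetical; the patent leaves the threshold open

# m = 10 first data, each made of two data bits (values are invented):
first_data = [(3, 0.625), (4, 0.245), (3, 0.176), (2, 0.121), (3, 0.963),
              (2, 0.315), (3, 0.425), (4, 0.831), (3, 0.462), (4, 0.429)]
sub_data_1 = [d[0] for d in first_data]   # values of data bit 1
sub_data_2 = [d[1] for d in first_data]   # values of data bit 2

e1 = entropy(sub_data_1)  # low: only the values 2, 3, 4 occur
e2 = entropy(sub_data_2)  # high: all ten values are distinct
# e1 falls below the threshold and e2 does not, so:
second_data = sub_data_1  # compressible part, to be compression-encoded
third_data = sub_data_2   # stored as-is in the second storage space
```

Here e1 is about 1.49 bits against e2's log2(10), about 3.32 bits, so only the data-bit-1 values qualify as second data.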
In some possible implementations, the m first data may be floating-point data, and the data bits may include exponent bits. In this embodiment, taking the data bits of the first data that satisfy the preset condition as the second data, as described in step S111, includes: taking the exponent bits of the first data as the second data; and taking the data bits of the first data other than the second data as the third data, as described in step S112, includes: taking the data bits other than the exponent bits as the third data.
In particular, referring to fig. 3, a schematic diagram of the data bits of floating-point data according to an embodiment of the disclosure: the decimal floating-point value "-9.625" is typically stored in a computer in the format "sign bit", "exponent bits", "significand bits". The sign bit may be set to "1" to represent the negative sign of "-9.625", the exponent bits to "10000010" to represent its exponent 3 (with bias), and the significand bits to "00110100000000000000000" to represent its significand.
For floating-point data, the exponent bits and the significand bits generally have different value distributions: within a group of floating-point data, the entropy of the significand bits tends to be high while the entropy of the exponent bits tends to be low. For example, when a neural network performs a computing task, the multiple activation values produced by the same network layer during forward propagation for a given input often lie within a certain data range, so their exponent bits are often identical or differ only slightly, while the significand bits are far more varied. For instance, for the 10 activation values "-9.625, -10.245, -8.176, -7.121, -9.963, -6.315, -8.425, -10.831, -9.462, -10.429", the exponent bits take only a few distinct values while the significand bits are much more random. Therefore, in the embodiments of the present disclosure, when the m first data are floating-point data, the exponent bits of each first datum may be used as the second data and the remaining data bits as the third data.
For example, for the 10 activation values above, the exponent bits of each data item may be taken as its second data, and the other data bits, such as the sign bit and the significand bits, may be taken as its third data. When the exponent bits of the 10 activation values are taken as the second data, since each of the resulting 10 second data has the value 3 or 4, the 10 second data can be compressed into 1 fourth data; by storing the fourth data together with the third data, the occupation of storage space is reduced without losing any of the original data.
Therefore, in the method provided by the embodiment of the disclosure, the plurality of first data are split based on their data bits, the value distribution of each group of data bits is checked against the preset condition, and the data bits that satisfy the preset condition are compressed. Compared with directly storing the plurality of first data, this reduces excessive occupation of the storage space and improves its utilization efficiency.
In some possible implementations, in step S12, the compression encoding the m second data to obtain fourth data corresponding to the m second data includes: acquiring n pieces of reference data, wherein the n pieces of reference data comprise n pieces of data selected from m pieces of second data, and n is more than or equal to 1 and less than m; fourth data is obtained based on the n pieces of reference data and the m pieces of second data.
Fig. 4 is a flowchart of a compression encoding process provided by an embodiment of the present disclosure. Referring to fig. 4, in some possible implementations, the obtaining fourth data based on the n pieces of reference data and the m pieces of second data may include: in step S121, difference processing is performed on the m second data and the n reference data, so as to obtain m difference values; in step S122, fourth data is obtained from the n pieces of reference data and the m pieces of differential values.
In some possible implementations, the obtaining fourth data according to the n pieces of reference data and the m pieces of differential values may include: generating sparse codes corresponding to m second data based on the m differential values, and generating the fourth data based on the n reference data and the sparse codes.
In some possible implementations, in step S121, performing differential processing on the m second data and the n reference data to obtain m differential values includes: in the case where n is equal to 1, differencing each of the m second data against the single reference data to obtain the m differential values; or, in the case where n is greater than 1, obtaining a mapping relationship between the n reference data and the m second data, and differencing each second data against its target reference data based on the mapping relationship to obtain the m differential values, wherein the target reference data is the reference data among the n reference data that has a mapping relationship with that second data.
In the case where n is equal to 1, the reference data may be any one of the m second data. Fig. 5a is a first schematic diagram of the differential processing provided by an embodiment of the present disclosure. As shown in fig. 5a, when the m first data are the 10 activation values "-9.625, -10.245, -8.176, -7.121, -9.963, -6.315, -8.425, -10.831, -9.462, -10.429", the corresponding second data may be "3, 4, 3, 3, 3, 3, 3, 4, 3, 4". If the reference data is 3, differencing each of the m second data against the reference data yields the differential values "0, 1, 0, 0, 0, 0, 0, 1, 0, 1", from which the sparse code "0100000101" is generated. With the fourth data taking the form "reference data + sparse code", the fourth data corresponding to the original 10 first data may be expressed as: 3 + "0100000101". Compared with the storage space consumed by storing the original 10 activation values separately, storing the fourth data obtained by compressing the 10 second data together with the 10 third data corresponding to the 10 activation values significantly reduces the occupation of storage space.
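The n = 1 scheme can be sketched as follows. This is only one plausible reading of the patent's "reference data + sparse code" form: the bitmap marks which positions differ from the reference, and the sketch additionally keeps the nonzero differences so that decoding stays lossless even when a difference is not exactly 1 (an assumption beyond the example in the text, where all differences are 0 or 1):

```python
def compress_single_ref(second_data, reference):
    """n == 1 case: difference every value against one reference and record
    a sparse code (bitmap of nonzero differences) plus those differences."""
    diffs = [v - reference for v in second_data]
    bitmap = "".join("1" if d != 0 else "0" for d in diffs)
    nonzero = [d for d in diffs if d != 0]
    return reference, bitmap, nonzero

def decompress_single_ref(reference, bitmap, nonzero):
    """Rebuild the original second data from the compressed form."""
    it = iter(nonzero)
    return [reference + (next(it) if b == "1" else 0) for b in bitmap]

# exponent values implied by the example's fourth data 3 + "0100000101"
second = [3, 4, 3, 3, 3, 3, 3, 4, 3, 4]
ref, bitmap, nz = compress_single_ref(second, 3)
print(bitmap)  # 0100000101
assert decompress_single_ref(ref, bitmap, nz) == second
```

Positions 2, 8, and 10 (the values equal to 4) are the only ones flagged in the bitmap, matching the sparse code in the example.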
In addition, in the case where n is greater than 1, the reference data may be any n of the m second data. Fig. 5b is a second schematic diagram of the differential processing provided by an embodiment of the present disclosure. As shown in fig. 5b, when the m first data are the 10 activation values "-9.625, -10.245, -8.176, -7.121, -9.963, -6.315, -8.425, -10.831, -9.462, -10.429", the corresponding second data may be "3, 4, 3, 3, 3, 3, 3, 4, 3, 4". Let reference data 1 be 3 and reference data 2 be 4. A mapping relationship may be established in which reference data 1 corresponds to the 1st, 3rd to 7th, and 9th second data, and reference data 2 corresponds to the 2nd, 8th, and 10th second data. Differencing the 1st, 3rd to 7th, and 9th second data against reference data 1 yields differential value 1, which is 0, and differencing the 2nd, 8th, and 10th second data against reference data 2 yields differential value 2, which is also 0, so the sparse code "0000000000" may be generated from differential value 1 and differential value 2. The fourth data is then generated from the two reference data and the sparse code: "3, 4" + "0000000000". The fourth data may include the above mapping relationship, or the mapping relationship may be stored as a default configuration, and this is not particularly limited here.
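The n > 1 scheme can be sketched with an explicit mapping. The `mapping` list used here is a hypothetical encoding of the mapping relationship (an index into the reference list for each second data); the patent leaves the concrete representation open:

```python
def compress_with_mapping(second_data, references, mapping):
    """n > 1 case: mapping[i] is the index of the reference data used for
    the i-th second data; the sparse code is the per-item difference."""
    diffs = [v - references[mapping[i]] for i, v in enumerate(second_data)]
    sparse = "".join(str(d) for d in diffs)
    return references, mapping, sparse

second = [3, 4, 3, 3, 3, 3, 3, 4, 3, 4]
refs = [3, 4]                             # reference data 1 and reference data 2
mapping = [0, 1, 0, 0, 0, 0, 0, 1, 0, 1]  # items 2, 8, 10 map to reference data 2
_, _, sparse = compress_with_mapping(second, refs, mapping)
print(sparse)  # 0000000000  (every difference is zero)
```

Because each second data is differenced against a reference equal to itself, the sparse code collapses to all zeros, at the cost of storing the two references and the mapping.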
It should be noted that the foregoing is merely illustrative. In some possible embodiments, in the case where n is equal to 1, the reference data may be set as required; for example, it may be the average value or the median value of the m second data, or any other value. In addition, in the case where n is greater than 1, the n reference data may also be set such that the i-th second data among the m second data serves as the reference data of the (i+1)-th second data, where i < m. For example, for the second data "3, 4, 3, 3, 3, 3, 3, 4, 3, 4" above, the 1st second data may be used as the reference data of the 2nd second data, the 2nd second data as the reference data of the 3rd second data, ..., and the 9th second data as the reference data of the 10th second data; that is, each second data may be differenced against its preceding value, and a sparse code may be generated from the resulting differential values. This is not particularly limited here.
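The previous-value variant is ordinary delta encoding; a minimal round-trip sketch, assuming the first value is kept verbatim as the starting reference:

```python
def delta_encode(second_data):
    """Difference each value against the immediately preceding value;
    the first value is kept verbatim as the starting reference."""
    first = second_data[0]
    deltas = [b - a for a, b in zip(second_data, second_data[1:])]
    return first, deltas

def delta_decode(first, deltas):
    """Rebuild the sequence by accumulating the deltas onto the reference."""
    out = [first]
    for d in deltas:
        out.append(out[-1] + d)
    return out

second = [3, 4, 3, 3, 3, 3, 3, 4, 3, 4]
first, deltas = delta_encode(second)
print(deltas)  # [1, -1, 0, 0, 0, 0, 1, -1, 1]
assert delta_decode(first, deltas) == second
```

Most deltas are zero for slowly varying exponents, which is what makes the subsequent sparse code compact.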
Therefore, according to the data storage method provided by the embodiment of the disclosure, each first data is split to obtain, as the second data, the part of the first data whose value distribution satisfies the preset condition, and, as the third data, the part that does not satisfy the preset condition. The compressible part of the m first data is then compressed by compression-encoding the second data that satisfies the preset condition, yielding fourth data whose data volume is smaller than that of the original m second data. Compared with the storage space required to directly store the m first data, storing the compressed fourth data and the m third data separately reduces excessive occupation of the storage space and improves its utilization efficiency.
Corresponding to the data storage method provided by the embodiment of the present disclosure, the embodiment of the present disclosure further provides a data processing method, and fig. 6 is a flowchart of the data processing method provided by the embodiment of the present disclosure. Referring to fig. 6, the method includes:
In step S61, in the case where the neural network performs the target processing task, fourth data is read from the first storage space and m pieces of third data are read from the second storage space for the first network layer of the neural network;
The fourth data and the m third data are obtained after m first data are stored according to the data storage method in the embodiment of the disclosure, where the m first data are m activation values output by the first network layer in a forward propagation process;
In step S62, decompressing the fourth data to obtain m second data;
in step S63, m first data are generated according to m second data and m third data;
In step S64, the m first data are used as input data of a second network layer of the neural network, and the input data are processed based on the second network layer to obtain output data of the second network layer.
The neural network may be any neural network, and the embodiments of the present disclosure are not limited thereto.
The target processing task may be any one of an image processing task, a voice processing task, a text processing task, and a video processing task, and may be, for example, a text classification task in a text processing task.
In this embodiment, when the target processing task is executed, in order to improve the utilization efficiency of the storage space, the fourth data and the third data corresponding to the m activation values generated by the first network layer may be generated based on the data storage method, thereby reducing the amount of data stored. The second network layer following the first network layer may then read the fourth data and the third data, restore the fourth data to the m second data, and losslessly recover the original m activation values from the second data and the third data. The second network layer processes the m activation values as its input data to obtain its output data, and by proceeding in this manner the neural network obtains its final output result.
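The store/load cycle described above can be sketched end to end. This is an illustrative composition of the pieces, not the patent's exact storage layout: the exponent field stands in for the second data, the sign plus significand bits for the third data, and a single-reference difference list for the fourth data:

```python
import struct

def store(activations):
    """Split each float32 activation into its exponent field (second data)
    and sign + significand (third data); compress the exponent fields into
    one reference value plus per-item differences (the 'fourth data')."""
    bits = [int.from_bytes(struct.pack(">f", a), "big") for a in activations]
    second = [(b >> 23) & 0xFF for b in bits]   # 8-bit exponent fields
    third = [b & 0x807FFFFF for b in bits]      # sign bit + 23-bit significand
    ref = second[0]
    fourth = (ref, [e - ref for e in second])   # reference + differences
    return fourth, third

def load(fourth, third):
    """Inverse of store(): rebuild the exponents, then the full bit patterns."""
    ref, diffs = fourth
    bits = [t | ((ref + d) << 23) for t, d in zip(third, diffs)]
    return [struct.unpack(">f", b.to_bytes(4, "big"))[0] for b in bits]

# Round the example activations to their float32 representations first,
# so the round trip can be checked for exact (lossless) equality.
acts = [struct.unpack(">f", struct.pack(">f", a))[0]
        for a in [-9.625, -10.245, -8.176, -7.121, -9.963,
                  -6.315, -8.425, -10.831, -9.462, -10.429]]
fourth, third = store(acts)
assert load(fourth, third) == acts  # lossless round trip
```

In the patent's setting, `store` would run when the first network layer emits its activations and `load` would run before the second network layer consumes them, with `fourth` and `third` kept in the first and second storage spaces respectively.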
Therefore, according to the data processing method provided by the embodiment of the disclosure, when the neural network executes the target processing task, the active values generated by each network layer in the forward propagation process of the neural network are stored based on the data storage method, so that the occupation of the storage space can be reduced, and the utilization efficiency of the storage space can be improved.
It will be appreciated that the above-mentioned method embodiments of the present disclosure may be combined with each other to form combined embodiments without departing from their principles and logic; for brevity, such combinations are not described in detail in the present disclosure. It will be appreciated by those skilled in the art that, in the methods of the above embodiments, the specific order of execution of the steps should be determined by their functions and possible inherent logic.
In addition, the disclosure further provides a data storage device, a data processing device, an electronic device, and a computer readable storage medium, any of which may be used to implement any of the data storage methods or data processing methods provided by the disclosure; for the corresponding technical solutions and descriptions, refer to the corresponding descriptions of the method parts, which are not repeated here.
Fig. 7 is a block diagram of a data storage device according to an embodiment of the present disclosure.
Referring to fig. 7, an embodiment of the present disclosure provides a data storage device including: a data splitting module 71, a data encoding module 72 and a data storage module 73.
The data splitting module 71 is configured to split m first data to be stored to obtain m second data and m third data, where the value distribution of the m second data meets a preset condition, and the value distribution of the m third data does not meet the preset condition, where the preset condition includes: the entropy value of the numerical value distribution is smaller than a preset threshold value, and m is more than 1;
The data encoding module 72 is configured to perform compression encoding on the m second data to obtain fourth data corresponding to the m second data, where a data amount of the fourth data is smaller than a data amount of the m second data;
the data storage module 73 is configured to store fourth data in the first storage space and store m pieces of third data in the second storage space.
In some possible implementations, the first data includes a plurality of data bits; the data splitting module 71 may be configured to, when splitting m first data to be stored to obtain m second data and m third data, respectively: for any first data, taking the data bit meeting the preset condition of the data bits of the first data as second data corresponding to the first data; and taking the data bits except the second data in the first data as third data corresponding to the first data.
In some possible implementations, the m first data include floating point type data, and the data bits include exponent bits. The data splitting module 71, when taking the data bits satisfying the preset condition among the data bits of the first data as the second data corresponding to the first data, may be configured to: take the exponent bits of the first data as the second data; and, when taking the data bits of the first data other than the second data as the third data corresponding to the first data, may be configured to: take the data bits of the first data other than the exponent bits as the third data.
In some possible embodiments, when compression encoding the m second data, the data encoding module 72 may be configured to: acquiring n pieces of reference data, wherein the n pieces of reference data comprise n pieces of data selected from m pieces of second data, and n is more than or equal to 1 and less than m; fourth data is obtained based on the n pieces of reference data and the m pieces of second data.
In some possible embodiments, the data encoding module 72, when obtaining fourth data based on the n reference data and the m second data, may be configured to: performing differential processing on the m second data and the n reference data to obtain m differential values; fourth data is obtained from the n pieces of reference data and the m differential values.
In some possible embodiments, the data encoding module 72 may be configured to, when performing differential processing on the m second data and the n reference data to obtain m differential values: under the condition that n is equal to 1, respectively carrying out differential processing on m second data and reference data to obtain m differential values; or under the condition that n is larger than 1, obtaining the mapping relation between n pieces of reference data and m pieces of second data, and differentiating each piece of second data with the target reference data based on the mapping relation to obtain m differential values, wherein the target reference data is the reference data with the mapping relation with the second data in the n pieces of reference data.
Fig. 8 is a block diagram of a data processing apparatus according to an embodiment of the present disclosure.
Referring to fig. 8, an embodiment of the present disclosure provides a data processing apparatus including: a data reading module 81, a data decompressing module 82, a data generating module 83, and a data processing module 84.
The data reading module 81 is configured to, in a case where a neural network performs a target processing task, read fourth data from a first storage space and read m third data from a second storage space for a first network layer of the neural network, where the fourth data and the m third data are obtained by storing m first data according to a data storage method according to an embodiment of the present disclosure, and the m first data are m activation values output by the first network layer in a forward propagation process;
the data decompression module 82 is configured to decompress the fourth data to obtain m second data;
The data generating module 83 is configured to generate the m first data according to the m second data and the m third data;
The data processing module 84 is configured to take the m first data as input data of a second network layer of the neural network, and process the input data based on the second network layer to obtain output data of the second network layer.
Fig. 9 is a block diagram of an electronic device according to an embodiment of the present disclosure.
Referring to fig. 9, an embodiment of the present disclosure provides an electronic device including: at least one processor 901; at least one memory 902, and one or more I/O interfaces 903, connected between the processor 901 and the memory 902; the memory 902 stores one or more computer programs executable by the at least one processor 901, and the one or more computer programs are executed by the at least one processor 901 to enable the at least one processor 901 to perform the data storage method or the data processing method described above.
In some embodiments, the electronic device may be a brain-inspired chip. Since a brain-inspired chip may employ a vectorized computing manner, parameters such as the weight information of a neural network model need to be loaded from an external memory, for example a Double Data Rate (DDR) synchronous dynamic random access memory. The embodiments of the present disclosure therefore achieve high operational efficiency for batch processing.
The disclosed embodiments also provide a computer readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor/processing core, implements the data storage method or data processing method described above. The computer readable storage medium may be a volatile or nonvolatile computer readable storage medium.
Embodiments of the present disclosure also provide a computer program product comprising computer readable code, or a non-transitory computer readable storage medium carrying computer readable code, which when executed in a processor of an electronic device, performs the above-described data storage method or data processing method.
Those of ordinary skill in the art will appreciate that all or some of the steps, systems, functional modules/units in the apparatus, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer-readable storage media, which may include computer storage media (or non-transitory media) and communication media (or transitory media).
The term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable program instructions, data structures, program modules or other data, as known to those skilled in the art. Computer storage media includes, but is not limited to, Random Access Memory (RAM), Read-Only Memory (ROM), Erasable Programmable Read-Only Memory (EPROM), Static Random Access Memory (SRAM), flash memory or other memory technology, portable Compact Disc Read-Only Memory (CD-ROM), Digital Versatile Discs (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable program instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and may include any information delivery media.
The computer readable program instructions described herein may be downloaded from a computer readable storage medium to a respective computing/processing device or to an external computer or external storage device over a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmissions, wireless transmissions, routers, firewalls, switches, gateway computers and/or edge servers. The network interface card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium in the respective computing/processing device.
The computer program instructions for performing the operations of the present disclosure may be assembly instructions, Instruction Set Architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++, or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer readable program instructions may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the present disclosure are implemented by personalizing electronic circuitry, such as programmable logic circuitry, Field Programmable Gate Arrays (FPGAs), or Programmable Logic Arrays (PLAs), with state information of computer readable program instructions, which can execute the computer readable program instructions.
The computer program product described herein may be embodied in hardware, software, or a combination thereof. In an alternative embodiment, the computer program product is embodied as a computer storage medium, and in another alternative embodiment, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK), or the like.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable medium having the instructions stored therein includes an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Example embodiments have been disclosed herein, and although specific terms are employed, they are used and should be interpreted in a generic and descriptive sense only and not for purpose of limitation. In some instances, it will be apparent to one skilled in the art that features, characteristics, and/or elements described in connection with a particular embodiment may be used alone or in combination with other embodiments unless explicitly stated otherwise. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the disclosure as set forth in the appended claims.

Claims (11)

1. A method of data storage, comprising:
splitting m first data to be stored respectively to obtain m second data and m third data, wherein the value distribution of the m second data meets a preset condition, the value distribution of the m third data does not meet the preset condition, and the preset condition comprises: the entropy value of the numerical value distribution is smaller than a preset threshold value, and m is more than 1;
compression encoding is carried out on the m second data to obtain fourth data corresponding to the m second data, wherein the data size of the fourth data is smaller than that of the m second data;
and storing the fourth data in the first storage space, and storing the m third data in the second storage space.
2. The method of claim 1, wherein the first data comprises a plurality of data bits; the splitting of the m first data to be stored respectively to obtain m second data and m third data includes:
For any first data, taking the data bit meeting the preset condition of the data bits of the first data as second data corresponding to the first data;
and taking the data bits except the second data in the first data as third data corresponding to the first data.
3. The method of claim 2, wherein the m first data comprises floating point type data, the data bits comprising exponent bits;
the step of using the data bit satisfying the preset condition as the second data corresponding to the first data, includes: taking the exponent bits of the first data as the second data;
The step of using the data bits of the first data other than the second data as third data corresponding to the first data includes: taking the data bits of the first data other than the exponent bits as the third data.
4. The method of claim 1, wherein the compression encoding the m second data to obtain fourth data corresponding to the m second data comprises:
acquiring n pieces of reference data, wherein the n pieces of reference data comprise n pieces of data selected from the m pieces of second data, and n is more than or equal to 1 and less than m;
and obtaining the fourth data based on the n pieces of reference data and the m pieces of second data.
5. The method of claim 4, wherein the obtaining the fourth data based on the n reference data and the m second data comprises:
performing differential processing on the m second data and the n reference data to obtain m differential values;
And obtaining fourth data according to the n datum data and the m differential values.
6. The method of claim 5, wherein the differentiating the m second data with the n reference data to obtain m differential values comprises:
under the condition that n is equal to 1, respectively carrying out differential processing on the m second data and the reference data to obtain m differential values; or alternatively
And under the condition that n is greater than 1, obtaining the mapping relation between the n pieces of reference data and the m pieces of second data, and differentiating each piece of second data with target reference data based on the mapping relation to obtain m differential values, wherein the target reference data is the reference data with the mapping relation with the second data in the n pieces of reference data.
7. The method of any of claims 1-6, wherein the first data comprises an activation value during a neural network forward propagation.
8. A method of data processing, comprising:
under the condition that a neural network executes a target processing task, fourth data are read from a first storage space and m third data are read from a second storage space aiming at a first network layer of the neural network, wherein the fourth data and the m third data are obtained by storing m first data according to the method of any one of claims 1-7, and the m first data are m activation values output by the first network layer in a forward propagation process;
decompressing the fourth data to obtain m second data;
Generating the m first data according to the m second data and the m third data;
And taking the m pieces of first data as input data of a second network layer of the neural network, and processing the input data based on the second network layer to obtain output data of the second network layer.
9. A data storage device, comprising:
The data splitting module is configured to split m first data to be stored respectively to obtain m second data and m third data, where a value distribution of the m second data meets a preset condition, a value distribution of the m third data does not meet the preset condition, and the preset condition includes: the entropy value of the numerical value distribution is smaller than a preset threshold value, and m is more than 1;
a data coding module configured to compression-code the m pieces of second data to obtain fourth data corresponding to the m pieces of second data, wherein a data volume of the fourth data is smaller than a data volume of the m pieces of second data; and
a data storage module configured to store the fourth data in a first storage space and store the m pieces of third data in a second storage space.
10. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores one or more computer programs executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-8.
11. A computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1-8.
CN202410636185.9A 2024-05-22 2024-05-22 Data storage method, data processing method and device, electronic equipment and medium Active CN118244993B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410636185.9A CN118244993B (en) 2024-05-22 2024-05-22 Data storage method, data processing method and device, electronic equipment and medium


Publications (2)

Publication Number Publication Date
CN118244993A true CN118244993A (en) 2024-06-25
CN118244993B CN118244993B (en) 2024-08-20

Family

ID=91557497

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410636185.9A Active CN118244993B (en) 2024-05-22 2024-05-22 Data storage method, data processing method and device, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN118244993B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100019965A1 (en) * 2008-07-25 2010-01-28 Jing Xiao Data Compression by Multi-Order Differencing
CN107836083A (en) * 2015-05-21 2018-03-23 零点科技公司 Method, apparatus and system for semantic values data compression and decompression
US10452616B1 (en) * 2018-10-29 2019-10-22 EMC IP Holding Company LLC Techniques for improving storage space efficiency with variable compression size unit
CN111597154A (en) * 2020-05-07 2020-08-28 苏州浪潮智能科技有限公司 Structured transaction data compression method, related method and related device
CN112052916A (en) * 2020-10-10 2020-12-08 腾讯科技(深圳)有限公司 Data processing method and device based on neural network and readable storage medium
CN113348474A (en) * 2019-01-24 2021-09-03 微软技术许可有限责任公司 Neural network activation compression with non-uniform mantissas


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
VINCENZO LIGUORI: "From a Lossless (~1.5:1) Compression Algorithm for Llama2 7B Weights to Variable Precision, Variable Range, Compressed Numeric Data Types for CNNs and LLMs", ARXIV.ORG, 16 April 2024 (2024-04-16), pages 1 - 17 *
LI Huan; LIU Haitao: "Segmented Compression Method for DCS Analog Data", Electronic Test, no. 12, 15 June 2013 (2013-06-15), pages 66 - 67 *
WANG Wei: "Design of a High-Performance Floating-Point Processing Unit", China Master's Theses Full-text Database, 15 June 2006 (2006-06-15), pages 1 - 76 *

Also Published As

Publication number Publication date
CN118244993B (en) 2024-08-20

Similar Documents

Publication Publication Date Title
US7365658B2 (en) Method and apparatus for lossless run-length data encoding
US20110181448A1 (en) Lossless compression
US11722148B2 (en) Systems and methods of data compression
JP6242074B2 (en) Method and apparatus for signal data compression and decompression (signal data compression and decompression)
US7907068B2 (en) FIFO radix coder for electrical computers and digital data processing systems
RU2406222C1 (en) Method of coding and method of decoding image signal, method of coding and decoding data source, device to this end and data media on which programs for said methods are stored
CN105260776A (en) Neural network processor and convolutional neural network processor
CN110266316B (en) Data compression and decompression method, device and equipment
US11615301B2 (en) Lossless exponent and lossy mantissa weight compression for training deep neural networks
CN111008698A (en) Sparse matrix multiplication accelerator for hybrid compressed recurrent neural networks
KR20230010854A (en) An improved concept for the representation of neural network parameters
JP5619326B2 (en) Encoding device, decoding device, encoding method, encoding program, decoding method, and decoding program
CN118244993B (en) Data storage method, data processing method and device, electronic equipment and medium
CN111274950A (en) Feature vector data encoding and decoding method, server and terminal
Malach et al. Hardware-based real-time deep neural network lossless weights compression
US8745110B2 (en) Method for counting vectors in regular point networks
US20180145701A1 (en) Sonic Boom: System For Reducing The Digital Footprint Of Data Streams Through Lossless Scalable Binary Substitution
CN112885364B (en) Audio encoding method and decoding method, audio encoding device and decoding device
Ghuge Map and Trie based Compression Algorithm for Data Transmission
Shukla et al. A comparative analysis of lossless compression algorithms on uniformly quantized audio signals
CN112101548A (en) Data compression method and device, data decompression method and device, and electronic device
Raja et al. A new variable-length integer code for integer representation and its application to text compression
CN112200301B (en) Convolution computing device and method
Thakur et al. An improved symbol reduction technique based Huffman coder for efficient entropy coding in the transform coders
US20120016918A1 (en) Method for Compressing Information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant