CN116128040A - Data processing method and device - Google Patents

Data processing method and device

Info

Publication number
CN116128040A
Authority
CN
China
Prior art keywords
data
sequence
target
position information
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310086044.XA
Other languages
Chinese (zh)
Inventor
陈勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dingdao Zhixin Shanghai Semiconductor Co ltd
Original Assignee
Dingdao Zhixin Shanghai Semiconductor Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dingdao Zhixin Shanghai Semiconductor Co ltd filed Critical Dingdao Zhixin Shanghai Semiconductor Co ltd
Priority to CN202310086044.XA priority Critical patent/CN116128040A/en
Publication of CN116128040A publication Critical patent/CN116128040A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Apparatus For Radiation Diagnosis (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a data processing method and device. The method includes: acquiring at least two data objects to be processed; determining, for each data object, the valid data it contains and the storage location information of that valid data within the corresponding data object; determining the valid data whose storage locations correspond across the different data objects, to obtain data groups formed by the location-matched valid data; and performing the corresponding data processing on the obtained data groups, thereby completing the processing of the at least two data objects.

Description

Data processing method and device
Technical Field
The application belongs to the technical field of artificial intelligence, and particularly relates to a data processing method and device.
Background
In the model training of deep neural networks, weights are often quantized and pruned, leaving a large number of zero values among the weights; in addition, the ReLU (activation) operation causes the feature maps to contain a large number of zero values as well. This prevalence of zero values in a network is called sparsification. How to exploit the sparsification characteristic of the data to improve computing performance when processing such data objects has therefore become a problem worth studying in the field.
Disclosure of Invention
Therefore, the application discloses the following technical scheme:
a method of data processing, the method comprising:
acquiring at least two data objects to be processed;
respectively determining effective data in each data object and storage position information corresponding to the effective data in the corresponding data object;
determining effective data corresponding to storage positions among different data objects to obtain a data group formed by the effective data corresponding to the positions;
and executing corresponding data processing on the data group to finish the processing of the at least two data objects.
Optionally, the determining valid data in each data object and storage location information corresponding to the valid data in the corresponding data object respectively includes:
sequentially determining effective values in the data objects to obtain an effective value sequence, wherein the effective values in the effective value sequence are used as effective data in the data objects;
and determining storage position information corresponding to each effective value in the effective value sequence in the data object respectively to obtain a position information sequence.
Optionally, the determining the storage location information corresponding to each valid value in the valid value sequence in the data object, to obtain a location information sequence includes:
sequentially adopting corresponding bitmasks to represent whether each value in the data object is a valid value, and obtaining a bitmask sequence corresponding to the data object;
identifying respective first target bitmasks in the bitmask sequence that represent valid values;
and sequentially determining storage positions corresponding to the first target bitmasks representing the effective values in the bitmask sequence to obtain a bitmask position sequence serving as the position information sequence.
Optionally, the determining the valid data corresponding to the storage locations between different data objects to obtain the data set formed by the valid data corresponding to the locations includes:
determining target position information with the same value in different position information sequences; wherein, a data object corresponds to a position information sequence, and each position information sequence is: a sequence formed by storage position information corresponding to each effective data in the corresponding data object;
and determining target data corresponding to each target position information in different data objects respectively, and taking the target data corresponding to each target position information in different data objects respectively as a data group.
Optionally, the determining the valid data corresponding to the storage locations between different data objects to obtain the data set formed by the valid data corresponding to the locations includes:
determining a second target bit mask of which the storage positions in the bit mask sequences respectively corresponding to the different data objects are corresponding to each other and simultaneously represent valid values;
determining storage positions corresponding to the second target bitmasks in the position information sequences corresponding to the data objects respectively;
and determining target values indicated by the storage positions corresponding to the second target bitmasks in the position information sequences as a data group.
Optionally, the method further comprises:
generating compressed format data corresponding to at least one data object, and storing and/or transmitting the generated compressed format data; the compressed format data includes valid data and corresponding location information.
Optionally, the at least two data objects include a target feature map and a target weight vector, and the acquiring the at least two data objects to be processed includes:
and acquiring the target feature map to be processed currently in the model processing of the neural network model, and the target weight vector required by processing the target feature map.
Optionally, the effective value is non-zero data.
Optionally, the data processing performed on the data set includes at least one of multiplication processing and addition processing.
A data processing apparatus, the apparatus comprising:
an acquisition unit for acquiring at least two data objects to be processed;
a first determining unit, configured to determine valid data in each data object and storage location information corresponding to the valid data in the corresponding data object;
the second determining unit is used for determining effective data corresponding to storage positions among different data objects to obtain a data set formed by the effective data corresponding to the positions;
and the data processing unit is used for executing corresponding data processing on the data group so as to finish the processing of the at least two data objects.
An electronic device, comprising:
a memory for storing at least one set of computer instructions;
a controller for implementing a data processing method as claimed in any one of the preceding claims by invoking and executing said set of instructions stored in said memory.
As can be seen from the above solution, the data processing method, device and electronic equipment disclosed in the present application acquire at least two data objects to be processed; determine, for each data object, the valid data it contains and the storage location information of that valid data within the corresponding data object; determine the valid data whose storage locations correspond across the different data objects, obtaining data groups formed by the location-matched valid data; and then perform the corresponding data processing on the obtained data groups, thereby completing the processing of the at least two data objects.
By determining the data groups formed by the location-matched valid data in the data objects and performing the corresponding data processing on those data groups, the processing of the data objects is completed while the processing of the non-valid data in the data objects is skipped. The sparsification characteristic of the data in the data objects is thereby fully utilized, and the amount of computation required for the data processing is reduced.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings required in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are only embodiments of the present application, and that other drawings can be obtained from them by a person of ordinary skill in the art without inventive effort.
FIG. 1 is a flow chart of a data processing method provided in the present application;
FIG. 2 is an example of a feature value matrix corresponding to a feature map provided herein;
FIG. 3 is a schematic diagram of a process for determining a corresponding position information sequence of a valid value sequence in an associated data object provided herein;
FIG. 4 is an example of a sequence of significance values and a sequence of bitmasks respectively corresponding to a feature map and a weight vector provided herein;
FIG. 5 is a schematic diagram of searching, using the binary search method provided by the present application, for the position of each first target bitmask in the corresponding bitmask sequence;
FIG. 6 is an example of a sequence of significant values and their respective corresponding sequence of location information in a feature map and weight vector provided herein;
FIG. 7 is a schematic diagram of one implementation of determining the data pairs formed by the non-zero feature values and non-zero weight components at corresponding positions provided herein;
FIG. 8 is a schematic diagram of another implementation of determining the data pairs formed by the non-zero feature values and non-zero weight components at corresponding positions provided herein;
FIG. 9 is a schematic diagram of determining valid data pairs for feature maps and weight vectors to implement a skip 0 value operation provided herein;
FIG. 10 is a block diagram of the components of the data processing apparatus provided in the present application;
fig. 11 is a component configuration diagram of the electronic device provided in the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
The application discloses a data processing method, a data processing device and electronic equipment, which optimize the way a data object is processed based on the sparsification characteristic of the data in the data object, so as to achieve technical effects such as improving the computing performance on the data object, reducing the storage and transmission bandwidth required by the system, and reducing the power consumption of the system. The processing method is applicable to, but not limited to, numerous general-purpose or special-purpose computing device environments or configurations, such as personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, and the like.
Referring to the flowchart of the data processing method shown in fig. 1, the data processing method provided in the embodiment of the present application may include the following processing procedures:
step 101, obtaining at least two data objects to be processed.
The at least two data objects may include, but are not limited to, a target feature map and a target weight vector. In the embodiments of the application, a target feature map to be processed in a deep neural network and the target weight vector required for processing that target feature map are taken as an example to describe the scheme.
The target feature map may be, but is not limited to, various types of data to be processed, such as images and speech. The target feature map may be subjected to one-dimensional, two-dimensional or three-dimensional convolution; this is not limited here and can be determined according to actual requirements. For example, for a one-dimensional convolution kernel of size 1*3, a 1*3 feature map may be one-dimensionally convolved based on a 1*3 weight vector, and for a two-dimensional convolution kernel of size 3*3, a 3*3 feature map may be two-dimensionally convolved based on a 3*3 weight vector.
The method provided by the application can be applied to various fields such as natural language processing, image processing, video processing, voice recognition, industrial detection (such as equipment defect detection) and the like.
Step 102, determining valid data in each data object and storage position information corresponding to the valid data in the corresponding data object.
Typically, each data object contains a plurality of data items to be processed, which include valid data and may also include non-valid (invalid) data. The valid data in a data object is the data it contains that contributes to the data processing of the data object; data contained in the data object that makes no such contribution can be regarded as the non-valid or invalid data of the data object.
For the case where the data objects to be processed include the target feature map and the target weight vector, the valid data may be the non-zero data in the data objects, such as the non-zero feature values in the target feature map and the non-zero weight components in the target weight vector, while the zero feature values and zero weight components are regarded as invalid data.
In the model training of deep neural networks, weights are often quantized and pruned, which produces a large number of zero values in the weight vectors; meanwhile, the ReLU (activation) operation also produces a large number of zero values in the feature maps. The main operations in a neural network are multiplication and addition, and zero values in the feature maps or weight vectors contribute nothing to multiplication and addition. Therefore, in the embodiment of the present application, through preprocessing performed before the feature map is processed by the model, or through real-time processing, the non-zero data in the feature maps and weight vectors are treated as valid data and the zero-value data are treated as invalid data.
In this embodiment, the storage location information corresponding to valid data in a data object refers to the logical location/logical storage location of that valid data in the data object, referred to as its position for short. For example, referring to the feature value matrix of a feature map corresponding to a 3*3 convolution kernel shown in fig. 2, the storage location information of the feature values p00, p01, p02 … p22 in the feature value matrix may be expressed as 0, 1, 2 … 8, respectively.
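Purely as an illustration (the application itself describes no code), the following Python sketch enumerates the logical storage positions of a hypothetical 3*3 feature value matrix in the row-major order suggested by fig. 2; the concrete values are made up for the example.

```python
# Hypothetical 3x3 feature value matrix, analogous to p00..p22 in Fig. 2.
feature_map = [
    [1, 0, 2],
    [0, 3, 0],
    [4, 0, 5],
]

# Flatten row by row; the running index is the logical storage position 0..8.
flat = [value for row in feature_map for value in row]
for position, value in enumerate(flat):
    print(position, value)  # position 0 holds p00, position 1 holds p01, ...
```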
Step 103, determining effective data corresponding to storage positions among different data objects to obtain a data set formed by the effective data corresponding to the positions.
There are two alternative cases of valid data whose storage locations correspond across different data objects. In one case, it refers to valid data that occupies the same storage location in the different data objects. In the other case, a mapping relationship between data locations in different data objects is specified in advance according to requirements, and the corresponding valid data refers to the valid data in the different data objects that satisfies the specified location mapping. This embodiment is mainly described taking the first case as an example.
Step 104, performing corresponding data processing on the data set to complete processing on the at least two data objects.
As can be seen from the above solution, the data processing method disclosed in the present application acquires at least two data objects to be processed; determines, for each data object, the valid data it contains and the storage location information of that valid data within the corresponding data object; determines the valid data whose storage locations correspond across the different data objects, obtaining data groups formed by the location-matched valid data; and then performs the corresponding data processing on the obtained data groups, thereby completing the processing of the at least two data objects.
By determining the data groups formed by the location-matched valid data in the data objects and performing the corresponding data processing on those data groups, the processing of the data objects is completed while the processing of the non-valid data in the data objects is skipped. The sparsification characteristic of the data in the data objects is thereby fully utilized and the amount of computation is reduced, which in turn achieves technical effects such as improving the computing performance of the data processing, reducing the storage and transmission bandwidth of the system, and reducing the power consumption of the system.
In an alternative embodiment, step 101 of the data processing method provided in the present application, that is, obtaining at least two data objects to be processed, may be further implemented as: obtaining the target feature map currently to be processed in the model processing of a neural network model (data processing based on the model), and the target weight vector required for processing that target feature map; the obtained target feature map and target weight vector serve as the data objects to be processed.
The data processing requirements of the data processing scene based on the neural network model are met by acquiring a target feature map to be processed currently and a corresponding target weight vector in the neural network model and executing subsequent processing on the target feature map.
In an optional embodiment, after at least two data objects to be processed are obtained, step 102 of the data processing method provided in the present application may, for each data object, sequentially determine the valid values in the data object to obtain a valid value sequence of that data object, with the valid values in the sequence serving as the valid data of the data object; then, the storage location information corresponding to each valid value of the valid value sequence in the data object is determined, yielding a position information sequence that can be used as the storage location information corresponding to the valid data in the data object.
For the example in which the data objects to be processed include the target feature map and the target weight vector, the non-zero feature values in the target feature map can be determined in sequence to obtain a non-zero feature value sequence, and the first position information of the position of each non-zero feature value in the target feature map can be determined in sequence to obtain a first position information sequence. Similarly, the non-zero weight components in the target weight vector can be determined in sequence to obtain a non-zero weight component sequence, and the second position information of the position of each weight component in the target weight vector can be determined in sequence to obtain a second position information sequence.
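A minimal sketch of this extraction step, assuming the hypothetical flattened 3*3 feature map and weight vector below (numeric stand-ins, not data from the application):

```python
def extract_valid(data):
    """Return (valid value sequence, position information sequence) for one data object."""
    values, positions = [], []
    for position, value in enumerate(data):
        if value != 0:  # a valid value is non-zero data
            values.append(value)
            positions.append(position)
    return values, positions

feature_map   = [1, 0, 2, 0, 3, 0, 4, 0, 5]   # non-zero at positions 0, 2, 4, 6, 8
weight_vector = [0, 6, 7, 0, 0, 8, 0, 9, 10]  # non-zero at positions 1, 2, 5, 7, 8

fm_values, fm_positions = extract_valid(feature_map)    # [1,2,3,4,5], [0,2,4,6,8]
wv_values, wv_positions = extract_valid(weight_vector)  # [6,7,8,9,10], [1,2,5,7,8]
```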
Optionally, referring to fig. 3, the process of determining the storage location information corresponding to each valid value in the valid value sequence in the data object to obtain the location information sequence may be further implemented as:
step 301, sequentially adopting the corresponding bitmasks to represent whether each value in the data object is a valid value, so as to obtain a bitmask sequence corresponding to the data object.
Specifically, a bitmask with a preset number of bits may be used to indicate whether a single value in the data object is a valid value. Preferably, in this embodiment a 1-bit mask (i.e. a bitmask of one bit) is used to indicate whether a single value in the data object is a valid value: a bitmask of 1 represents a valid value and a bitmask of 0 represents an invalid value. In practical applications, the meanings (valid or invalid) represented by the different bitmask values may be set as needed, without limitation.
Correspondingly, for the example in which the data objects include a target feature map and a target weight vector, corresponding bitmasks can be sequentially adopted to represent whether each feature value in the target feature map is a non-zero feature value, where 1 represents a non-zero feature value and 0 represents a zero feature value, giving the first bitmask sequence corresponding to the target feature map; and corresponding bitmasks can be sequentially adopted to represent whether each weight component in the target weight vector is a non-zero weight component, where 1 represents a non-zero weight component and 0 represents a zero weight component, giving the second bitmask sequence corresponding to the target weight vector.
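A sketch of the bitmask generation, continuing the hypothetical values above (1 marks a non-zero entry and 0 a zero entry, matching the convention in the text; illustrative only):

```python
def to_bitmask(data):
    """One 1-bit mask per value: 1 for a non-zero (valid) value, 0 otherwise."""
    return [1 if value != 0 else 0 for value in data]

first_bitmask  = to_bitmask([1, 0, 2, 0, 3, 0, 4, 0, 5])   # [1,0,1,0,1,0,1,0,1], i.e. 101010101
second_bitmask = to_bitmask([0, 6, 7, 0, 0, 8, 0, 9, 10])  # [0,1,1,0,0,1,0,1,1], i.e. 011001011
```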
Step 302, each first target bitmask in the bitmask sequence representing a valid value is identified.
Optionally, each first target bitmask in the first bitmask sequence that represents a non-zero feature value may be identified, and each first target bitmask in the second bitmask sequence that represents a non-zero weight component may be identified.
Step 303, sequentially determining the storage position corresponding to each first target bitmask representing a valid value in the bitmask sequence, to obtain a bitmask position sequence serving as the position information sequence.
Specifically, positions, corresponding to the first target bitmasks representing the non-zero feature values, in the first bitmask sequence are sequentially determined to obtain a first bitmask position sequence, and position information in the first bitmask position sequence is sequentially used as first position information, corresponding to the non-zero feature values, in the target feature map, so that the first position information sequence is obtained.
Similarly, the positions of the first target bitmasks representing the non-zero weight components in the second bitmask sequence can be sequentially determined to obtain the second bitmask position sequence, and the position information in the second bitmask position sequence is sequentially used as the second position information of the positions of the non-zero weight components in the target weight vectors, so that the second position information sequence is obtained.
Referring to the example of fig. 4: by performing valid value identification on the feature map and the weight vector, and based on the valid/invalid value characterization of the bitmasks, it can be determined that the non-zero feature value sequence and the first bitmask sequence corresponding to the feature map are ABCDE and 101010101 respectively, and that the non-zero weight component sequence and the second bitmask sequence corresponding to the weight vector are abcde and 011001011 respectively. On this basis, each first target bitmask representing a valid value in the first bitmask sequence, and its position in the first bitmask sequence, can be further identified, and the positions of the first target bitmasks in the first bitmask sequence are used as the first position information corresponding to the non-zero feature values in the feature map, giving the first position information sequence; likewise, each first target bitmask representing a valid value in the second bitmask sequence, and its position in the second bitmask sequence, is identified, and those positions are used as the second position information corresponding to the non-zero weight components in the weight vector, giving the second position information sequence.
Preferably, in this embodiment a binary (dichotomy) search is used to find the position of each first target bitmask in the corresponding bitmask sequence; the complexity of this search is on the order of lg(n). The search process is as follows: the bitmasks in the bitmask sequence are divided into two parts that are processed in parallel, and an OR operation is performed over all bits of each part. If the OR result of a part is 0, processing of that part ends; if the OR result is 1, that part is halved again and the OR operation is applied to each half. By repeating this process, every bitmask whose value is 1 in the bitmask sequence (i.e. every first target bitmask representing a valid value) is found in turn, together with its position in the bitmask sequence.
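A minimal recursive sketch of this halving search, under the assumption that the bitmask sequence is held as a Python list of 0/1 values (an illustration, not the hardware implementation described here):

```python
def find_set_bits(bits, offset=0):
    """Recursively halve the bitmask sequence and return the positions whose bit is 1."""
    if not any(bits):        # OR over the whole part is 0: nothing valid here, stop
        return []
    if len(bits) == 1:       # a single remaining bit that is 1: record its position
        return [offset]
    mid = len(bits) // 2     # OR was 1: halve the part and process both halves
    return (find_set_bits(bits[:mid], offset) +
            find_set_bits(bits[mid:], offset + mid))

# Bitmask 101010101 of the feature map -> positions 0, 2, 4, 6, 8
print(find_set_bits([1, 0, 1, 0, 1, 0, 1, 0, 1]))
```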
Further, referring to fig. 5 in combination for an example of identifying the positions of the first target bitmasks in the corresponding bitmask sequence: to increase speed, this example searches for the first target bitmasks based on the above binary-division principle while starting from the head and the tail of the bitmask sequence simultaneously. The positions found for the first target bitmasks in the corresponding bitmask sequence are then used as the positions of the corresponding valid values (such as the non-zero feature values and non-zero weight components) in their data objects; specifically, for the feature map and weight vector example of fig. 4, the found positions of the first target bitmasks are used as the positions of the non-zero feature values in the feature map, forming the first position information sequence. The search process can be understood with reference to fig. 5 and fig. 6. Taking the bitmask sequence 101010101 corresponding to the non-zero feature value sequence ABCDE as an example, it is divided into the two parts 10101 and 0101, which are searched simultaneously, 10101 from the head and 0101 from the tail. Taking the search of 10101 as an example, it is further divided into 101 and 01; the OR over the bits of 101 is 1 and the OR over the bits of 01 is 1, so both parts continue to be halved. By cycling through this process, the bitmasks whose value is 1 are located, namely the 1st and 3rd bits of 101 and the 2nd bit of 01 (three set bitmasks in 10101 in total) together with the 2nd and 4th bits of the tail part 0101, and their positions in the original bitmask sequence are 0, 2, 4, 6 and 8. The first position information sequence corresponding to the non-zero feature value sequence ABCDE is therefore 02468. In other embodiments, the position information sequence may also be calculated by other calculation rules.
Similarly, the positions found for the first target bitmasks in the second bitmask sequence can be used as the positions of the non-zero weight components in the weight vector, forming the second position information sequence; as shown in fig. 6, the second position information sequence of the non-zero weight component sequence abcde in the weight vector is 12578.
In addition, each bitmask in the bitmask sequence may instead be read sequentially by traversal, and it is judged whether the read bitmask represents a valid value, for example whether its value is 1. If so, the bitmask is determined to be a first target bitmask representing a valid value, its position in the bitmask sequence is identified, and that position is used as the position of the corresponding valid value in the data object (such as the feature map), thereby forming the position information sequence of the valid value sequence of the data object. Compared with the binary-search-based approach, this implementation has higher complexity; in practical applications, the required implementation may be selected freely, without limitation.
In an optional embodiment, for the case where valid data corresponding in storage location across different data objects means valid data with the same storage locations, step 103 of the data processing method provided in the present application may directly determine the target position information having the same value in the different position information sequences, determine the target data corresponding to each piece of target position information in the different data objects, and use these target data as the valid data whose storage locations correspond across the different data objects, thereby obtaining the data groups formed by the location-matched valid data.
Here, each data object corresponds to one position information sequence, and each position information sequence is a sequence formed by the storage location information of each piece of valid data in the corresponding data object. For example, when the data objects include a target feature map and a target weight vector, the target feature map corresponds to a first position information sequence, such as 02468 in fig. 6, and the target weight vector corresponds to a second position information sequence, such as 12578 in fig. 6.
For a deep-learning neural network model, the "feature value-weight component" pairs at the same position need to be operated on. Accordingly, the position information having the same value in the different position information sequences is determined as the target position information; in the above example, these are the matching entries 2-2 and 8-8 of the first position information sequence 02468 and the second position information sequence 12578. The target feature value corresponding to each piece of target position information in the target feature map (or its non-zero feature value sequence) and the target weight component corresponding to each piece of target position information in the target weight vector (or its non-zero weight component sequence) are then determined, and the target feature value and target weight component corresponding to the same target position information are taken as a data group: for example, B and b, which correspond to 2-2 in the target feature map and the target weight vector, form the data group B-b, and E and e, which correspond to 8-8, form the data group E-e.
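A sketch of this byte-level matching, continuing the hypothetical numeric stand-ins used above (feature values 1..5 at positions 0,2,4,6,8 and weight components 6..10 at positions 1,2,5,7,8), with positions 2 and 8 playing the roles of the B-b and E-e pairs; illustrative only.

```python
def match_pairs(fm_values, fm_positions, wv_values, wv_positions):
    """Form data groups from valid data whose storage positions are identical."""
    wv_at = {pos: val for pos, val in zip(wv_positions, wv_values)}
    return [(pos, fm_val, wv_at[pos])
            for pos, fm_val in zip(fm_positions, fm_values)
            if pos in wv_at]  # same storage position in both data objects

print(match_pairs([1, 2, 3, 4, 5], [0, 2, 4, 6, 8],
                  [6, 7, 8, 9, 10], [1, 2, 5, 7, 8]))
# [(2, 2, 7), (8, 5, 10)] -> the data groups analogous to B-b and E-e
```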
For the first case, another embodiment of determining the data groups is further provided: the second target bitmasks whose storage positions correspond in the bitmask sequences of the different data objects and which simultaneously represent valid values are determined; the storage positions corresponding to these second target bitmasks in the position information sequences of the respective data objects are determined; and the target values indicated by those storage positions in the respective position information sequences are used as the valid data whose storage locations correspond across the different data objects, thereby obtaining the data groups formed by the location-matched valid data.
For the example in which the data objects include a target feature map and a target weight vector, the bitmasks whose positions correspond in the first bitmask sequence of the target feature map and the second bitmask sequence of the target weight vector, and which simultaneously represent non-zero values, may be determined; the position information corresponding to such a bitmask in the first bitmask sequence/first position information sequence and in the second bitmask sequence/second position information sequence is then determined; and the feature value indicated by the former position information together with the weight component indicated by the latter position information is determined as a data pair.
The former embodiment of the first case involves byte-level operations, whereas the latter involves bit-level operations, so the latter embodiment processes the data objects more efficiently.
Further taking the target feature map and the target weight vector as an example, two detailed implementations of the bit-level operation for determining the data groups formed by the location-matched valid data in the different data objects are provided below.
In one implementation, referring to the schematic diagram of determining data pairs shown in fig. 7, a bitwise AND operation is performed on the first bitmask sequence of the target feature map and the second bitmask sequence of the target weight vector. Only when the feature value and the weight component at a corresponding position are both non-zero is the result of the AND operation 1, and the feature values and weight components for which the result is 1 form the valid "feature value-weight component" data pairs. After the bitmask of the valid data pairs is obtained, the head-and-tail bidirectional binary search described above is used to obtain the position of each valid data pair, such as 2 and 8 in fig. 7. By matching these positions against the first position information sequence of the target feature map and the second position information sequence of the target weight vector, and mapping them back onto the non-zero feature value sequence and the non-zero weight component sequence, the valid data pairs formed by the non-zero feature value and non-zero weight component at each such position are obtained, such as the data pair B-b at position 2 and the data pair E-e at position 8 in fig. 7.
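A sketch of the bitwise AND step over the two example masks, treating the left-most bit as position 0 to match the position numbering used in the text (illustrative only):

```python
fm_mask = 0b101010101          # first bitmask sequence (feature map)
wv_mask = 0b011001011          # second bitmask sequence (weight vector)

pair_mask = fm_mask & wv_mask  # 1 only where feature value and weight component are both non-zero
print(f"{pair_mask:09b}")      # 001000001 -> valid data pairs at positions 2 and 8 (B-b and E-e)
```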
In another implementation, referring to the schematic diagram of determining data pairs shown in fig. 8, a bitwise exclusive-OR (XOR) operation is performed on the bitmask sequences of the target feature map and the target weight vector. A result of 1 indicates that the two bitmasks at that position have different values, i.e. one represents valid data and the other represents invalid data; such data does not participate in the final calculation and needs to be removed. The bitmask produced by the XOR operation is then bitwise ANDed with the bitmask of the target feature map and with the bitmask of the target weight vector respectively, and a result of 1 marks the invalid data to be eliminated. After the invalid data is removed, the remaining valid data at corresponding positions in the target feature map and the target weight vector form the valid data pairs.
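The XOR-based variant can be sketched over the same hypothetical masks (illustrative only, not the claimed implementation):

```python
fm_mask = 0b101010101
wv_mask = 0b011001011

xor_mask   = fm_mask ^ wv_mask    # 1 where exactly one of the two entries is valid
drop_in_fm = xor_mask & fm_mask   # valid feature-map entries to mark as invalid
drop_in_wv = xor_mask & wv_mask   # valid weight-vector entries to mark as invalid

keep = fm_mask & ~drop_in_fm           # remaining valid feature-map entries
assert keep == (wv_mask & ~drop_in_wv) # remaining weight entries pair up position-wise
assert keep == (fm_mask & wv_mask)     # and agree with the AND result of the first variant
```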
Optionally, the first implementation may open up a new storage space and store the found valid data in that new space. The second implementation does not need to open up a new space: the valid data remains in its original storage space and does not need to be moved, and removing invalid data only requires marking it as invalid. Optionally, the data of the target feature map and the target weight vector may be stored in separate FIFO (first-in, first-out) queues, and during processing the valid data at corresponding positions can be taken in turn from the head of each FIFO for processing.
In an alternative embodiment, the processing manner required for performing data processing on the obtained data set may be determined according to the service requirement, and optionally, the data processing performed on the data set includes at least one of multiplication processing and addition processing.
Taking the case where the at least two data objects include the target feature map and the target weight vector as an example, a multiply-accumulate operation may be performed on the determined data pairs: the non-zero feature value and the non-zero weight component in each data pair are first multiplied, and the multiplication results of the data pairs are then accumulated. For example, performing multiply-accumulate on the two data pairs B-b and E-e in fig. 7 yields B*b+E*e.
In the above example, the target feature map and the target weight vector are both of size 3x3, so 9 pieces of data would ordinarily have to be computed. After the method of the present application is adopted, the zero values are skipped and only 2 pairs of valid data (pairs in which the feature value and the weight component at the corresponding position are both non-zero) are processed, reducing the amount of computation from 9 to 2, as shown in fig. 9. The computational cost of the data processing is thus greatly reduced based on the sparsification characteristic of the data in data objects such as the feature map and the weight vector.
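Putting the pieces together, a compact sketch of the skip-zero multiply-accumulate for the hypothetical 3x3 example above (9 products reduced to 2); this is an illustration only, not the implementation claimed by the application.

```python
feature_map   = [1, 0, 2, 0, 3, 0, 4, 0, 5]   # flattened 3x3 feature map
weight_vector = [0, 6, 7, 0, 0, 8, 0, 9, 10]  # flattened 3x3 weight vector

# Dense reference: 9 multiplications.
dense = sum(f * w for f, w in zip(feature_map, weight_vector))

# Skip-zero version: only the 2 positions where both values are non-zero are multiplied.
sparse = sum(f * w for f, w in zip(feature_map, weight_vector) if f != 0 and w != 0)

assert dense == sparse == 2 * 7 + 5 * 10   # only positions 2 and 8 actually contribute
```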
The method does not need to change the existing network model, only needs to preprocess (compress) the data object before the convolution operation of the model, is easy to realize, and is suitable for all network models.
In an alternative embodiment, the data processing method provided in the present application may further include the following processes:
generating compressed format data corresponding to at least one data object, and storing and/or transmitting the generated compressed format data; the compressed format data includes valid data and corresponding location information.
Optionally, the embodiment performs compression processing on at least one data object, and generates corresponding compression format data for the data object. The compression of the data object is realized by removing invalid values in the data object, such as zero values in the feature map and the weight vector, so as to reduce the data volume of the data object.
Within a data object, a relative positional relationship exists among the data, so each piece of data in the data object carries a position attribute, and when data objects are processed, the data in different data objects must be grouped by position to form the data groups. However, when a data object is compressed by removing its invalid values, the relative positional relationship among the valid data changes and the position information of the valid values in the data object is lost. Therefore, when compressing a data object, a corresponding position index must also be generated for the valid value sequence formed by the valid values in the data object, so that the positions of the retained valid values in the data object can be restored.
Optionally, the position index may be the position information sequence of the valid values of the valid value sequence in the data object, or it may be the bitmask sequence corresponding to the data object; the position of each valid value in the data object is then characterized either by the entries of the position information sequence, or by the position in the bitmask sequence of each bitmask that represents a valid value. For specific implementations of the position information sequence, the bitmask sequence, and so on, reference may be made to the descriptions in the foregoing embodiments, which are not repeated here.
Correspondingly, the compressed format data generated for the data object may include an effective value sequence corresponding to the data object, and a position information sequence corresponding to each effective value of the effective value sequence in the data object; alternatively, a sequence of valid values and a sequence of bitmasks corresponding to the data object may be included. When the data object is processed later, the corresponding position of each effective value in the effective value sequence in the data object can be restored based on the position information sequence or the bit mask sequence in the compressed format data.
For example, first compression format data corresponding to the target feature map is generated, the first compression format data is stored and/or transmitted, and/or second compression format data corresponding to the target weight vector is generated, and the second compression format data is stored and/or transmitted.
The first compressed format data may include the non-zero feature value sequence of the target feature map, together with the first position information sequence or the first bitmask sequence described above; the second compressed format data may include the non-zero weight component sequence of the target weight vector, together with the second position information sequence or the second bitmask sequence described above.
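As one possible illustration of such compressed-format data, the sketch below pairs the non-zero value sequence with a bitmask position index; the class, field and function names are hypothetical and not taken from the application.

```python
from dataclasses import dataclass

@dataclass
class CompressedObject:
    values: list    # valid (non-zero) value sequence
    bitmask: int    # 1 bit per original entry, most-significant bit first
    length: int     # number of entries in the original data object

def compress(data):
    mask, values = 0, []
    for value in data:
        mask = (mask << 1) | (1 if value != 0 else 0)
        if value != 0:
            values.append(value)
    return CompressedObject(values, mask, len(data))

def decompress(obj):
    restored, it = [], iter(obj.values)
    for i in range(obj.length):
        bit = (obj.bitmask >> (obj.length - 1 - i)) & 1
        restored.append(next(it) if bit else 0)
    return restored

original = [1, 0, 2, 0, 3, 0, 4, 0, 5]
assert decompress(compress(original)) == original   # positions of valid values are restored
```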
It should be noted that, because the data volume of the position index is generally small, compressing a data object by removing its invalid values still reduces its data volume compared with the original data object, even though the compressed format data carries the position index information such as the position information sequence or the bitmask sequence. In particular, the implementation that uses the bitmask sequence as the position index achieves a higher compression ratio, since that position index only occupies the bit level.
In the application example in which the data objects include a target feature map and a target weight vector, i.e. in the data processing scenario of a neural network model, a feature map can be compressed when it is output by the preceding convolution layer, and the compressed-format feature map data is transmitted to the next convolution layer; the weight vector of each convolution layer in the neural network model can likewise be compressed and stored in that convolution layer. On this basis, when a convolution layer receives the compressed-format feature map output by the preceding convolution layer and needs to process it, it can obtain the compressed-format weight vector it maintains, restore the positions of the non-zero feature values of the non-zero feature value sequence and of the non-zero weight components of the non-zero weight component sequence based on the position information sequences or bitmask sequences in the compressed-format feature map and weight vector respectively, determine the "non-zero feature value-non-zero weight component" data pairs whose positions coincide, and perform the required processing, such as multiply-accumulate, on those data pairs.
According to the embodiment, by generating the compressed format data corresponding to at least one data object and storing and/or transmitting the generated compressed format data, the storage and transmission bandwidth required by the system can be reduced further based on the sparsification characteristic of the data in the data object.
The embodiment of the application further provides a data processing device, where the composition structure of the device is shown in fig. 10, and the device includes:
an acquiring unit 1001, configured to acquire at least two data objects to be processed;
a first determining unit 1002, configured to determine valid data in each data object, and storage location information corresponding to the valid data in the corresponding data object;
a second determining unit 1003, configured to determine valid data corresponding to storage locations between different data objects, and obtain a data group formed by valid data corresponding to the locations;
the data processing unit 1004 is configured to perform corresponding data processing on the data set, so as to complete processing on the at least two data objects.
In an embodiment, the first determining unit 1002 is specifically configured to:
sequentially determining effective values in the data objects to obtain an effective value sequence, wherein the effective values in the effective value sequence are used as effective data in the data objects;
and determining storage position information corresponding to each effective value in the effective value sequence in the data object respectively to obtain a position information sequence.
In an embodiment, when determining storage location information corresponding to each valid value in the valid value sequence in the data object, the first determining unit 1002 is specifically configured to:
sequentially adopting corresponding bitmasks to represent whether each value in the data object is a valid value, and obtaining a bitmask sequence corresponding to the data object;
identifying respective first target bitmasks in the bitmask sequence that represent valid values;
and sequentially determining storage positions corresponding to the first target bitmasks representing the effective values in the bitmask sequence to obtain a bitmask position sequence serving as the position information sequence.
In an embodiment, the second determining unit 1003 is specifically configured to:
determining target position information with the same value in different position information sequences; wherein, a data object corresponds to a position information sequence, and each position information sequence is: a sequence formed by storage position information corresponding to each effective data in the corresponding data object;
and determining target data corresponding to each target position information in different data objects respectively, and taking the target data corresponding to each target position information in different data objects respectively as a data group.
In an embodiment, the second determining unit 1003 is specifically configured to:
determining a second target bit mask of which the storage positions in the bit mask sequences respectively corresponding to the different data objects are corresponding to each other and simultaneously represent valid values;
determining storage positions corresponding to the second target bitmasks in the position information sequences corresponding to the data objects respectively;
and determining target values indicated by the storage positions corresponding to the second target bitmasks in the position information sequences as a data group.
In an embodiment, the data processing unit 1004 is further configured to:
generating compressed format data corresponding to at least one data object, and storing and/or transmitting the generated compressed format data; the compressed format data includes valid data and corresponding location information.
In an embodiment, the at least two data objects include a target feature map and a target weight vector; the obtaining unit 1001 is specifically configured to: and acquiring the target feature map to be processed currently in the model processing of the neural network model, and the target weight vector required by processing the target feature map.
In one embodiment, the valid value is non-zero data.
In one embodiment, the data processing performed on the data set includes at least one of a multiplication process and an addition process.
The data processing apparatus disclosed in the embodiments of the present application corresponds to the data processing method disclosed in the embodiments of the method, so that the description is relatively simple, and the relevant similarities refer to the description of the embodiments of the method, and are not described in detail herein.
The embodiment of the application also discloses an electronic device, and the composition structure of the electronic device, as shown in fig. 11, at least includes:
a memory 10 for storing a set of computer instructions;
the set of computer instructions may be implemented in the form of a computer program.
A processor 20 for implementing a data processing method as disclosed in any of the method embodiments above by executing a set of computer instructions.
The processor 20 may be a central processing unit (CPU), a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a neural network processor (NPU), a deep learning processor (DPU), or another programmable logic device, etc.
The electronic device is provided with a display device and/or a display interface, and can be externally connected with the display device.
Optionally, the electronic device further includes a camera assembly, and/or an external camera assembly is connected thereto.
In addition, the electronic device may include communication interfaces, communication buses, and the like. The memory, processor and communication interface communicate with each other via a communication bus.
The communication interface is used for communication between the electronic device and other devices. The communication bus may be a peripheral component interconnect standard (Peripheral Component Interconnect, PCI) bus or an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus or the like, and may be classified as an address bus, a data bus, a control bus, or the like.
It should be noted that, in the present specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different manner from other embodiments, and identical and similar parts between the embodiments are referred to each other.
For convenience of description, the above system or apparatus is described as being functionally divided into various modules or units, respectively. Of course, the functions of each element may be implemented in one or more software and/or hardware elements when implemented in the present application.
From the above description of embodiments, it will be apparent to those skilled in the art that the present application may be implemented in software plus a necessary general purpose hardware platform. Based on such understanding, the technical solutions of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to perform the methods described in the embodiments or some parts of the embodiments of the present application.
Finally, it is further noted that relational terms such as first, second, third, fourth, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The foregoing is merely a preferred embodiment of the present application. It should be noted that those skilled in the art may make various modifications and adaptations without departing from the principles of the present application, and such modifications and adaptations shall also be regarded as falling within the scope of protection of the present application.

Claims (10)

1. A method of data processing, the method comprising:
acquiring at least two data objects to be processed;
respectively determining valid data in each data object and storage position information corresponding to the valid data in the corresponding data object;
determining valid data whose storage positions correspond among the different data objects, to obtain a data group formed by the position-corresponding valid data;
and executing corresponding data processing on the data group to complete the processing of the at least two data objects.
2. The method according to claim 1, wherein the determining valid data in each data object and storage position information corresponding to the valid data in the corresponding data object includes:
sequentially determining valid values in the data objects to obtain a valid value sequence, wherein the valid values in the valid value sequence serve as the valid data in the data objects;
and determining the storage position information corresponding to each valid value in the valid value sequence in the data object, to obtain a position information sequence.
3. The method according to claim 2, wherein the determining the storage position information corresponding to each valid value in the valid value sequence in the data object, to obtain the position information sequence, includes:
sequentially using corresponding bitmasks to represent whether each value in the data object is a valid value, to obtain a bitmask sequence corresponding to the data object;
identifying each first target bitmask in the bitmask sequence that represents a valid value;
and sequentially determining the storage positions corresponding to the first target bitmasks representing valid values in the bitmask sequence, to obtain a bitmask position sequence as the position information sequence.
4. The method according to claim 1, wherein the determining valid data whose storage positions correspond among the different data objects, to obtain the data group formed by the position-corresponding valid data, includes:
determining target position information having the same value in different position information sequences, wherein each data object corresponds to one position information sequence, and each position information sequence is a sequence formed by the storage position information corresponding to each piece of valid data in the corresponding data object;
and determining the target data respectively corresponding to each piece of target position information in the different data objects, and taking the target data respectively corresponding to each piece of target position information in the different data objects as a data group.
5. The method according to claim 3, wherein the determining valid data whose storage positions correspond among the different data objects, to obtain the data group formed by the position-corresponding valid data, includes:
determining second target bitmasks which are located at corresponding storage positions in the bitmask sequences respectively corresponding to the different data objects and which all represent valid values;
determining, in the position information sequences respectively corresponding to the data objects, the storage positions corresponding to the second target bitmasks;
and determining, as a data group, the target values indicated by the storage positions corresponding to the second target bitmasks in the position information sequences.
6. The method of claim 1, further comprising:
generating compressed format data corresponding to at least one data object, and storing and/or transmitting the generated compressed format data, wherein the compressed format data includes the valid data and the corresponding position information.
7. The method of any of claims 1-6, wherein the at least two data objects comprise a target feature map and a target weight vector, and the acquiring at least two data objects to be processed comprises:
acquiring the target feature map currently to be processed in processing of a neural network model, and the target weight vector required for processing the target feature map.
8. The method of any of claims 2, 3, and 5, wherein the valid value is non-zero data.
9. The method of claim 1, wherein the data processing performed on the data group comprises at least one of multiplication processing and addition processing.
10. A data processing apparatus, the apparatus comprising:
an acquisition unit, configured to acquire at least two data objects to be processed;
a first determining unit, configured to determine valid data in each data object and storage position information corresponding to the valid data in the corresponding data object;
a second determining unit, configured to determine valid data whose storage positions correspond among the different data objects, to obtain a data group formed by the position-corresponding valid data;
and a data processing unit, configured to perform corresponding data processing on the data group so as to complete the processing of the at least two data objects.
CN202310086044.XA 2023-02-01 2023-02-01 Data processing method and device Pending CN116128040A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310086044.XA CN116128040A (en) 2023-02-01 2023-02-01 Data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310086044.XA CN116128040A (en) 2023-02-01 2023-02-01 Data processing method and device

Publications (1)

Publication Number Publication Date
CN116128040A true CN116128040A (en) 2023-05-16

Family

ID=86297084

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310086044.XA Pending CN116128040A (en) 2023-02-01 2023-02-01 Data processing method and device

Country Status (1)

Country Link
CN (1) CN116128040A (en)

Similar Documents

Publication Publication Date Title
CN107229757B (en) Video retrieval method based on deep learning and Hash coding
US20230196837A1 (en) Action recognition method and apparatus, and device and storage medium
CN109614874B (en) Human behavior recognition method and system based on attention perception and tree skeleton point structure
CN112639828A (en) Data processing method, method and equipment for training neural network model
CN111382867A (en) Neural network compression method, data processing method and related device
TWI740726B (en) Sorting method, operation method and apparatus of convolutional neural network
CN111445418A (en) Image defogging method and device and computer equipment
CN112749666B (en) Training and action recognition method of action recognition model and related device
CN114529982B (en) Lightweight human body posture estimation method and system based on streaming attention
EP4318313A1 (en) Data processing method, training method for neural network model, and apparatus
KR20210034462A (en) Method for training generative adversarial networks to generate per-pixel annotation
CN112789627A (en) Neural network processor, data processing method and related equipment
TW202133032A (en) Image normalization processing method, apparatus and storage medium
CN111008631A (en) Image association method and device, storage medium and electronic device
CN111240746A (en) Floating point data inverse quantization and quantization method and equipment
CN111860253A (en) Multitask attribute identification method, multitask attribute identification device, multitask attribute identification medium and multitask attribute identification equipment for driving scene
CN111814534A (en) Visual task processing method and device and electronic system
WO2022001364A1 (en) Method for extracting data features, and related apparatus
CN114298289A (en) Data processing method, data processing equipment and storage medium
CN114820755B (en) Depth map estimation method and system
CN116363561A (en) Time sequence action positioning method, device, equipment and storage medium
CN116128040A (en) Data processing method and device
CN116051846A (en) Image feature extraction method, image feature extraction device, computer equipment and storage medium
CN112418388A (en) Method and device for realizing deep convolutional neural network processing
CN113807330B (en) Three-dimensional sight estimation method and device for resource-constrained scene

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination