CN111832692A - Data processing method, device, terminal and storage medium

Data processing method, device, terminal and storage medium

Info

Publication number
CN111832692A
Authority
CN
China
Prior art keywords
intermediate data
target intermediate
neural network
data
layer
Prior art date
Legal status
Withdrawn
Application number
CN202010676005.1A
Other languages
Chinese (zh)
Inventor
刘君 (Liu Jun)
Current Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN202010676005.1A
Publication of CN111832692A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608 Saving storage space on storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices


Abstract

The embodiment of the application provides a data processing method, a data processing device, a terminal and a storage medium, relating to the technical field of chips. The method comprises the following steps: compressing target intermediate data, wherein the target intermediate data refers to data that needs to be used in the nth-layer processing of a neural network, and n is a positive integer; in response to the nth layer of the neural network needing to use the target intermediate data for processing, acquiring the compressed target intermediate data; decompressing the compressed target intermediate data to obtain the target intermediate data; and processing the nth layer of the neural network according to the target intermediate data. The embodiment of the application can avoid the increase in terminal power consumption caused by swapping the target intermediate data out to DRAM.

Description

Data processing method, device, terminal and storage medium
Technical Field
The embodiment of the application relates to the technical field of chips, in particular to a data processing method, a data processing device, a terminal and a storage medium.
Background
With the development of terminal technology, terminals have introduced AI (Artificial Intelligence) chips, which can be used to implement the processing procedures of neural networks.
In the related art, during implementation of a neural network, target intermediate data (for example, weights and/or feature maps) needs to be exchanged from an SRAM (Static Random Access Memory) included in the AI chip to an external DRAM (Dynamic Random Access Memory).
However, the above-described related art may cause an increase in power consumption of the terminal.
Disclosure of Invention
The embodiment of the application provides a data processing method, a data processing device, a terminal and a storage medium. The technical scheme is as follows:
in one aspect, an embodiment of the present application provides a data processing method, where the method includes:
compressing target intermediate data, wherein the target intermediate data refers to data required to be used during nth layer processing of a neural network, and n is a positive integer;
in response to the nth layer of the neural network needing to use the target intermediate data for processing, acquiring the compressed target intermediate data;
decompressing the compressed target intermediate data to obtain the target intermediate data;
and processing the nth layer of the neural network according to the target intermediate data.
In another aspect, an embodiment of the present application provides a data processing apparatus, where the apparatus includes:
the data compression module is used for compressing target intermediate data, wherein the target intermediate data refers to data which needs to be used during nth layer processing of the neural network, and n is a positive integer;
the data acquisition module is used for acquiring the compressed target intermediate data in response to the nth layer of the neural network needing to use the target intermediate data for processing;
the data decompression module is used for decompressing the compressed target intermediate data to obtain the target intermediate data;
and the data processing module is used for processing the nth layer of the neural network according to the target intermediate data.
In another aspect, an embodiment of the present application provides a terminal, where the terminal includes a processor and a memory, where the memory stores a computer program, and the computer program is loaded and executed by the processor to implement the data processing method according to the above aspect.
In still another aspect, an embodiment of the present application provides a computer-readable storage medium, in which a computer program is stored, and the computer program is loaded and executed by a processor to implement the data processing method according to the above aspect.
In another aspect, the present application provides a computer program product which, when executed, implements the above data processing method.
The technical scheme provided by the embodiment of the application can bring the following beneficial effects:
the intermediate data corresponding to the nth layer is compressed while it is temporarily not needed and decompressed when it needs to be used, so that the memory space of the terminal is used efficiently and the requirement on memory space is reduced.
Drawings
Fig. 1 is a schematic structural diagram of an AI chip according to an embodiment of the present application;
FIG. 2 is a flow chart of a data processing method provided by an embodiment of the present application;
FIG. 3 is a flow chart of a data processing method provided by another embodiment of the present application;
FIG. 4 is a flow chart of a data processing method provided by another embodiment of the present application;
fig. 5 is a schematic structural diagram of a Unet network provided in an embodiment of the present application;
FIG. 6 is a flow chart of a data processing method provided by another embodiment of the present application;
FIG. 7 is a flow chart of a data processing method provided by yet another embodiment of the present application;
FIG. 8 is a block diagram of a data processing apparatus provided in one embodiment of the present application;
fig. 9 is a block diagram of a terminal according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
The execution subject of the embodiment of the present application may be a terminal, for example, the terminal may be a mobile phone, a PC (personal computer), a tablet computer, or other electronic devices.
Illustratively, the terminal is provided with an AI chip, which may also be referred to as an AI accelerator or a computing card, and which is a chip dedicated to handling the large number of computing tasks in artificial intelligence applications. The AI chip can process data such as images, voice and video. Optionally, the AI chip includes an NPU (Neural-network Processing Unit), a GPU (Graphics Processing Unit), an FPGA (Field-Programmable Gate Array), an ASIC (Application Specific Integrated Circuit), a brain-like chip, a reconfigurable general-purpose AI chip, and the like. The technical scheme provided by the embodiment of the application can be executed by an AI chip.
Referring to fig. 1, which shows a schematic structural diagram of an AI chip according to an embodiment of the present application, the AI chip 100 includes a compression unit 110, a storage unit 120, and a decompression unit 130. The AI chip 100 runs a neural network processing engine (which may also be called a neural network inference engine), and the neural network includes N layers: Layer 1, Layer 2, Layer 3, ..., Layer N. In the embodiment of the present application, the compression unit 110 is configured to compress target intermediate data (intermediate data that is temporarily unused by the next layer), where the target intermediate data refers to data that needs to be used in the processing of the nth layer of the neural network. The storage unit 120 is used to store the compressed target intermediate data. The decompression unit 130 is used to decompress the compressed target intermediate data.
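For illustration only, the three units can be modeled with the following Python sketch. The class and method names, the dict-backed storage, and the use of zlib as the compression codec are assumptions made for the sketch; the embodiment does not prescribe a particular codec or data structure.

```python
import zlib

class CompressionUnit:
    """Models compression unit 110: compresses target intermediate data."""
    def compress(self, data: bytes) -> bytes:
        return zlib.compress(data)

class StorageUnit:
    """Models storage unit 120: keyed on-chip storage for compressed blobs."""
    def __init__(self):
        self._blobs = {}
    def store(self, key: str, blob: bytes) -> None:
        self._blobs[key] = blob
    def fetch(self, key: str) -> bytes:
        return self._blobs.pop(key)  # read the blob and release its slot

class DecompressionUnit:
    """Models decompression unit 130: restores the original bytes."""
    def decompress(self, blob: bytes) -> bytes:
        return zlib.decompress(blob)
```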
Referring to fig. 2, a flowchart of a data processing method according to an embodiment of the present application is shown. The embodiment of the application can be applied to the AI chip shown in fig. 1, and the method can include the following steps.
Step 201, compressing the target intermediate data.
The target intermediate data refers to data that needs to be used in the nth-layer processing of the neural network, and n is a positive integer. A neural network is a computational model formed by a large number of interconnected nodes (or neurons). A neural network includes an input layer, hidden layers, and an output layer. In a possible implementation, the hidden layers include a convolutional layer, an activation layer, a pooling layer, and a fully-connected layer. In the embodiment of the present application, the nth layer of the neural network refers to any layer in the neural network; for example, the nth layer is a convolutional layer, a pooling layer, a fully-connected layer, or the like, which is not limited in the embodiment of the present application.
In one example, the target intermediate data includes a feature map corresponding to the nth layer of the neural network, and the feature map corresponding to the nth layer of the neural network refers to a feature map which needs to be used by the nth layer of the neural network for processing. For example, the pooling layer of the neural network needs to pool the feature map of the convolutional layer, and at this time, the feature map of the convolutional layer may be referred to as a feature map corresponding to the pooling layer.
In another example, the target intermediate data includes weights of an nth layer of the neural network, which refers to network weights of the nth layer of the neural network.
In yet another example, the target intermediate data includes both the corresponding feature maps of the nth layer of the neural network and the weights of the nth layer of the neural network.
It should be noted that the compression timing of the feature map corresponding to the nth layer of the neural network differs from that of the weight of the nth layer of the neural network; the compression timings are described in the following embodiments and not detailed here.
Step 202, in response to the nth layer of the neural network needing to use the target intermediate data for processing, acquiring the compressed target intermediate data.
The processing of the neural network refers to a process of processing input data to finally obtain an output result. The input data may be images, voice, video, etc., and the embodiment of the present application is not limited thereto.
When the nth layer of the neural network needs to use the target intermediate data for processing, the terminal acquires the compressed target intermediate data. In one example, when the nth layer of the neural network needs to use the feature map corresponding to the nth layer, the terminal acquires the compressed feature map. In another example, when the nth layer needs to use the weight of the nth layer, the terminal acquires the compressed weight. In yet another example, when the nth layer needs to use both the feature map corresponding to the nth layer and the weight of the nth layer, the terminal may acquire the feature map and the weight at the same time, or acquire the weight first and then the feature map, or acquire the feature map first and then the weight, which is not limited in this embodiment.
Step 203, decompressing the compressed target intermediate data to obtain the target intermediate data.
After acquiring the compressed target intermediate data, the terminal decompresses it to obtain the target intermediate data.
Step 204, processing the nth layer of the neural network according to the target intermediate data.
The nth layer of the neural network then performs its processing according to the target intermediate data.
To sum up, in the technical solution provided by this embodiment of the application, the intermediate data corresponding to the nth layer is compressed while it is temporarily not needed and decompressed when it needs to be used, so that the memory space of the terminal is used efficiently and the requirement on memory space is reduced.
In addition, because the embodiment of the application does not need to swap the intermediate data out to DRAM, DRAM memory bandwidth is not occupied and remains available to other applications, achieving system-level optimization.
In addition, the embodiment of the application implements on-demand decompression, avoiding wasted space.
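For concreteness, steps 201 through 204 can be sketched as follows; zlib again stands in for the unspecified codec, and the zero-filled feature map and the ReLU placeholder operation are invented for the example.

```python
import zlib
import numpy as np

def compress_then_use(feature_map: np.ndarray, n: int) -> np.ndarray:
    sram = {}
    # Step 201: compress the target intermediate data while layer n does not need it.
    sram[f"layer{n}"] = zlib.compress(feature_map.tobytes())
    # ... intervening layers run here; only the compressed blob occupies memory ...
    # Steps 202-203: when layer n is reached, fetch and decompress on demand.
    raw = zlib.decompress(sram.pop(f"layer{n}"))
    restored = np.frombuffer(raw, dtype=feature_map.dtype).reshape(feature_map.shape)
    # Step 204: layer n processes the restored data (ReLU as a placeholder op).
    return np.maximum(restored, 0.0)

fmap = np.zeros((64, 56, 56), dtype=np.float32)  # highly compressible example
out = compress_then_use(fmap, n=7)
```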
Referring to fig. 3, a flowchart of a data processing method according to another embodiment of the present application is shown. The embodiment of the application can be applied to the AI chip shown in fig. 1, and the method can include the following steps.
Step 301, sending a compression request to a compression unit through a compression interface.
In an embodiment of the present application, the compression request is used to request compression of the target intermediate data.
In a possible implementation manner, the AI chip includes a control unit. The control unit calls the compression interface to send a compression request to the compression unit, and in response to the compression request, the compression unit compresses the target intermediate data.
Step 302, storing the compressed target intermediate data in a storage unit.
In a possible implementation, the storage unit includes an SRAM. SRAM is a type of random access memory: as long as the SRAM remains powered on, the data stored within it is retained. In another possible implementation, the storage unit comprises at least one of: NOR Flash (a type of non-volatile flash memory), PCM (Phase Change Memory), and MRAM (Magnetic Random Access Memory). NOR Flash can be used in stand-alone and embedded applications. PCM stores data by exploiting the large conductivity difference between the crystalline and amorphous states of chalcogenide. MRAM is a non-volatile magnetic random access memory that combines the high-speed read and write capability of SRAM with the high integration density of DRAM. The embodiment of the present application does not limit the type of the storage unit.
Step 303, in response to the nth layer of the neural network needing to use the target intermediate data for processing, acquiring the compressed target intermediate data from the storage unit.
In a possible implementation manner, the control unit calls the data acquisition interface to send a data acquisition request to the storage unit, where the data acquisition request is used to request that the compressed target intermediate data be sent to the decompression unit.
Step 304, a decompression request is sent to the decompression unit through the decompression interface.
In the embodiment of the present application, the decompression request is used to request decompression of the compressed target intermediate data.
In a possible implementation, the control unit calls the decompression interface to send a decompression request to the decompression unit. In response to the decompression request, the decompression unit decompresses the acquired compressed target intermediate data.
Step 305, processing the nth layer of the neural network according to the target intermediate data.
In a possible implementation manner, after the decompression unit finishes decompressing the compressed target intermediate data, it sends the target intermediate data to the storage unit, so that the neural network processing engine can read the target intermediate data from the storage unit and process it.
In the technical solution provided by the embodiment of the application, the target intermediate data is compressed when not needed and decompressed and read when needed, which reduces the memory occupation of the storage unit, lowers the memory requirement on the storage unit, saves SRAM space, and reduces the memory requirement on the AI chip.
In addition, the target intermediate data is compressed and accessed through interfaces, a design that is flexible and convenient and improves convenience and flexibility in system applications.
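A minimal sketch of the request-driven interfaces in steps 301 through 305 follows; the request type, the interface method names, and the dict standing in for the storage unit are illustrative assumptions.

```python
import zlib
from dataclasses import dataclass

@dataclass
class CompressRequest:
    key: str     # identification info for the blob
    data: bytes  # target intermediate data to compress

class ControlUnit:
    """Drives the compression and decompression units through interfaces."""
    def __init__(self):
        self._storage = {}  # stands in for storage unit 120
    def call_compression_interface(self, req: CompressRequest) -> None:
        # Steps 301-302: compress on request, then store the compressed blob.
        self._storage[req.key] = zlib.compress(req.data)
    def call_decompression_interface(self, key: str) -> bytes:
        # Steps 303-304: fetch the compressed blob and decompress it.
        return zlib.decompress(self._storage.pop(key))

ctrl = ControlUnit()
ctrl.call_compression_interface(CompressRequest("5-convolution", b"\x00" * 1024))
restored = ctrl.call_decompression_interface("5-convolution")  # used in step 305
```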
In an exemplary embodiment, the terminal stores the compressed target intermediate data to the storage unit by:
First, identification information of the compressed target intermediate data is determined.
In the embodiment of the present application, the identification information is used to uniquely identify the compressed target intermediate data. The terminal can locate the target intermediate data according to its identification information, which facilitates later acquisition. For example, it may be determined from the identification information which layer's feature map, or which layer's weight, the compressed target intermediate data is.
In one example, the terminal inputs the compressed target intermediate data into a hash table to obtain a hash value of the compressed target intermediate data, wherein the hash value corresponds to a mapping address in the hash table; determining index information of the hash value according to the number of layers of the target intermediate data and the operation corresponding to the target intermediate data; and determining the hash value and the index information of the hash value as identification information.
A hash table, which may also be referred to as a hash map, is a data structure that is accessed directly according to a key value. A hash value is a unique and extremely compact numerical representation of a piece of data.
After the terminal inputs the compressed target intermediate data into the hash table, it obtains the hash value of the compressed target intermediate data and its mapping address in the hash table. However, the hash value alone does not indicate which layer's feature map or weight the compressed target intermediate data is. The terminal therefore determines index information for the hash value according to the layer number of the target intermediate data and the operation corresponding to the target intermediate data, and then determines the hash value and the index information of the hash value as the identification information. For example, if the compressed target intermediate data is the feature map of a convolutional layer included in the 5th layer of the neural network, the terminal may determine the index information of the hash value to be 5-convolution; if the 5th layer includes multiple convolutional layers, the terminal may refine the index information to the position, among all convolutional layers of the 5th layer, of the convolutional layer that generated the feature map, so that the compressed target intermediate data can be acquired more accurately. For another example, if the compressed target intermediate data is the weight of a pooling layer included in the 6th layer of the neural network, the terminal may determine the index information of the hash value to be 6-pooling.
In another example, the terminal determines index information of the compressed target intermediate data according to the number of layers in which the target intermediate data is located and the operation corresponding to the target intermediate data.
And the terminal directly determines the index information of the compressed target intermediate data according to the layer number of the target intermediate data and the operation corresponding to the target intermediate data, and the operation is simple and convenient.
Second, the compressed target intermediate data and the identification information are correspondingly stored in the storage unit.
When the compressed target intermediate data is subsequently acquired, the terminal can quickly acquire the compressed target intermediate data from the storage unit according to the identification information of the compressed target intermediate data.
Assume the terminal stores the compressed target intermediate data in a hash table, taking the feature map of the convolutional layer included in the 5th layer as an example. When the terminal needs to acquire this compressed feature map, it first determines that the index information of the hash value is 5-convolution, then determines the corresponding hash value from 5-convolution, and finally determines the mapping address of the compressed feature map in the hash table, thereby acquiring the feature map of the convolutional layer included in the 5th layer.
Assume instead the terminal accesses the compressed target intermediate data by index, again taking the feature map of the convolutional layer included in the 5th layer as an example. When the terminal needs to acquire this compressed feature map, it first determines that its index information is 5-convolution and can then acquire the compressed feature map directly according to 5-convolution.
In the embodiment of the application, one piece of identification information is determined for each piece of compressed target intermediate data, which makes subsequent reads convenient and the read granularity controllable.
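As a concrete illustration of the two identification schemes, the sketch below derives a hash value (SHA-256 is an assumed choice; the embodiment does not name a hash function) and builds index info of the form '<layer number>-<operation>'.

```python
import hashlib
import zlib

storage = {}  # index info -> (hash value, compressed blob)

def store_with_identification(data: bytes, layer: int, op: str) -> None:
    compressed = zlib.compress(data)
    hash_value = hashlib.sha256(compressed).hexdigest()  # uniquely identifies the blob
    index_info = f"{layer}-{op}"                         # e.g. "5-convolution"
    storage[index_info] = (hash_value, compressed)

def fetch_by_index(layer: int, op: str) -> bytes:
    # E.g. fetch the feature map of the convolutional layer in the 5th layer.
    _hash_value, compressed = storage[f"{layer}-{op}"]
    return zlib.decompress(compressed)

store_with_identification(b"feature-map-bytes", layer=5, op="convolution")
fmap_bytes = fetch_by_index(5, "convolution")
```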
In an exemplary embodiment, the target intermediate data is the feature map of the mth layer of the neural network, where m is a positive integer. Fig. 4 is a flowchart illustrating a data processing method according to another embodiment of the present application.
Step 401, in response to the (m+2)th layer of the neural network not using the feature map of the mth layer of the neural network for processing, compressing the feature map.
When the (m+2)th layer of the neural network does not use the feature map of the mth layer for processing, the feature map of the mth layer is temporarily in an unused state and can be compressed at this time, reducing the space it occupies in the storage unit.
Step 402, in response to the nth layer of the neural network needing to use the feature map for processing, acquiring the compressed feature map.
When the nth layer of the neural network needs to use the feature map of the mth layer for processing, the terminal acquires the compressed feature map.
For example, fig. 5 shows a schematic structural diagram of a Unet network. The Unet network includes convolution operations, upsampling operations, max pooling operations, stacking operations, and the like. A convolution operation processes an original image according to a template to obtain a new image; an upsampling operation enlarges an original image so that the enlarged image can be displayed on a higher-resolution display device; a max pooling operation takes the maximum value in an image region as the pooled value of that region; and a stacking operation concatenates two feature maps together.
In this embodiment, the mth layer is the input layer at the position of the solid-line frame in fig. 5 and the nth layer is the output layer at the position of the dotted-line frame in fig. 5. After the (m+1)th layer completes its processing based on the feature map of the mth layer, the terminal compresses the feature map of the mth layer through the compression unit 110 and stores the compressed feature map in the storage unit 120.
When the neural network reaches the nth layer, the nth layer needs to be stacked with the feature map of the mth layer, so the nth layer needs to use the feature map of the mth layer for processing; at this time, the terminal acquires the compressed feature map.
To reduce the occupation of storage unit space, if the feature map of a given layer of the neural network will not be used again after the next layer finishes processing, that feature map may be deleted.
Step 403, decompressing the compressed feature map to obtain the feature map.
Continuing the example above, the terminal decompresses the compressed feature map through the decompression unit 130 to obtain the feature map of the mth layer.
Step 404, processing the nth layer of the neural network according to the feature map.
After acquiring the feature map of the mth layer, the nth layer performs its subsequent processing.
For the description of steps 403 to 404, reference may be made to the above embodiments, which are not described herein again.
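The Unet scenario can be sketched as follows. The layer operations are placeholders and zlib again stands in for the codec; what matters is the timing: the layer-m feature map is compressed once the (m+1)th layer has consumed it, and decompressed only when the nth layer stacks with it.

```python
import zlib
import numpy as np

def unet_like_pass(x: np.ndarray) -> np.ndarray:
    sram = {}
    skip = x + 1.0                        # layer m output, reused much later
    y = np.maximum(skip - 0.5, 0.0)       # layer m+1 consumes the skip map
    # Layer m+2 does not use the layer-m feature map: compress it now.
    sram["m-featmap"] = zlib.compress(skip.tobytes())
    del skip                              # only the compressed copy remains
    for _ in range(3):                    # layers m+2 .. n-1 (placeholder ops)
        y = np.maximum(y * 0.9, 0.0)
    # Layer n stacks with the layer-m map: decompress on demand.
    restored = np.frombuffer(zlib.decompress(sram.pop("m-featmap")),
                             dtype=x.dtype).reshape(x.shape)
    return np.concatenate([y, restored], axis=0)  # the stacking operation

out = unet_like_pass(np.zeros((16, 28, 28), dtype=np.float32))
```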
In an exemplary embodiment, the target intermediate data is the weight of the nth layer of the neural network. Fig. 6 is a flowchart illustrating a data processing method according to another embodiment of the present application.
Step 601, before loading the neural network, compressing the weight of the nth layer of the neural network.
In a possible implementation, the weights of the layers included in the neural network are compressed separately before loading the neural network. For example, if the neural network includes 3 layers, the compression produces 3 compressed weights, one per layer.
Step 602, in response to the nth layer of the neural network needing to use the weight for processing, acquiring the compressed weight.
When the nth layer of the neural network needs to use its weight for processing, the terminal acquires the compressed weight.
Step 603, decompressing the compressed weight to obtain the weight.
Step 604, processing the nth layer of the neural network according to the weight.
In a possible implementation manner, after the nth layer of the neural network completes processing, the weight may be deleted to reduce the occupation of the storage unit space.
For the description of steps 602 to 604, reference may be made to the above embodiments, which are not repeated herein.
Compared with the related art, in which the weights of all layers of the neural network are decompressed as a whole, the embodiment of the application decompresses on demand and avoids wasting storage unit space.
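A per-layer sketch of steps 601 through 604, under the same assumptions (zlib as the codec, a matrix multiply plus ReLU as the placeholder layer operation):

```python
import zlib
import numpy as np

def compress_weights(weights):
    # Step 601: before loading the network, compress each layer's weight separately.
    return [(zlib.compress(w.tobytes()), w.shape, w.dtype) for w in weights]

def run_network(x, compressed_weights):
    for blob, shape, dtype in compressed_weights:
        # Steps 602-603: decompress only the current layer's weight, on demand.
        w = np.frombuffer(zlib.decompress(blob), dtype=dtype).reshape(shape)
        x = np.maximum(x @ w, 0.0)  # step 604: the layer's processing
        del w                       # delete after use to free storage unit space
    return x

layers = [np.ones((8, 8), dtype=np.float32) for _ in range(3)]
y = run_network(np.ones((1, 8), dtype=np.float32), compress_weights(layers))
```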
In an exemplary embodiment, the target intermediate data includes the weight of the nth layer of the neural network and the feature map of the mth layer. Fig. 7 is a flowchart illustrating a data processing method according to another embodiment of the present application.
Step 701, before loading the neural network, compressing the weight of the nth layer of the neural network.
In a possible implementation, after the training of the neural network is completed, the weights of the layers included in the neural network are separately compressed.
Step 702, in response to the (m+2)th layer of the neural network not using the feature map of the mth layer of the neural network for processing, compressing the feature map of the mth layer.
When the (m+2)th layer of the neural network does not use the feature map of the mth layer for processing, the feature map of the mth layer is temporarily in an unused state and can be compressed, reducing the space it occupies in the storage unit.
Step 703, in response to the nth layer of the neural network needing to use the target intermediate data for processing, acquiring the compressed target intermediate data.
The target intermediate data comprises the weight of the nth layer and the feature map of the mth layer. The terminal may acquire the weight of the nth layer first and then the feature map of the mth layer; it may acquire the feature map of the mth layer first and then the weight of the nth layer; or it may acquire both at the same time.
Step 704, decompressing the compressed target intermediate data to obtain the target intermediate data.
The terminal may decompress the weight of the nth layer first and then the feature map of the mth layer; it may decompress the feature map of the mth layer first and then the weight of the nth layer; or it may decompress both at the same time.
Step 705, processing the nth layer of the neural network according to the target intermediate data.
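Combining the two, the nth layer's step can decompress in either order, as noted above; this short sketch (same zlib assumption, an elementwise multiply as a stand-in operation) takes the weight first.

```python
import zlib
import numpy as np

def process_layer_n(sram: dict, m: int, n: int, shape: tuple) -> np.ndarray:
    # Either ordering works; here: weight of layer n first, then feature map of layer m.
    w = np.frombuffer(zlib.decompress(sram.pop(f"{n}-weight")),
                      dtype=np.float32).reshape(shape)
    fmap = np.frombuffer(zlib.decompress(sram.pop(f"{m}-featmap")),
                         dtype=np.float32).reshape(shape)
    return np.maximum(fmap * w, 0.0)  # placeholder for the layer-n operation
```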
The following are embodiments of the apparatus of the present application that may be used to perform embodiments of the method of the present application. For details which are not disclosed in the embodiments of the apparatus of the present application, reference is made to the embodiments of the method of the present application.
Referring to fig. 8, a block diagram of a data processing apparatus provided in an embodiment of the present application is shown, where the apparatus has a function of implementing the above method example, and the function may be implemented by hardware or by hardware executing corresponding software. The apparatus 800 may include: a data compression module 810, a data acquisition module 820, a data decompression module 830, and a data processing module 840.
A data compression module 810, configured to compress target intermediate data, where the target intermediate data is data that needs to be used when processing an nth layer of a neural network, and n is a positive integer;
a data acquisition module 820, configured to acquire the compressed target intermediate data in response to the nth layer of the neural network needing to use the target intermediate data for processing;
a data decompression module 830, configured to decompress the compressed target intermediate data to obtain the target intermediate data;
a data processing module 840, configured to process an nth layer of the neural network according to the target intermediate data.
To sum up, in the technical solution provided by this embodiment of the application, the intermediate data corresponding to the nth layer is compressed while it is temporarily not needed and decompressed when it needs to be used, so that the memory space of the terminal is used efficiently and the requirement on memory space is reduced.
Optionally, the data compression module 810 is configured to:
sending a compression request to a compression unit through a compression interface, the compression request requesting compression of the target intermediate data,
in response to the compression request, the compression unit compresses the target intermediate data.
Optionally, the apparatus 800 further includes: and a data storage module (not shown).
The data storage module is used for storing the compressed target intermediate data to a storage unit;
a data acquisition module 820 for:
and acquiring the compressed target intermediate data from the storage unit.
Optionally, the data storage module includes: an identification determination unit and a data storage unit.
An identification determining unit, configured to determine identification information of the compressed target intermediate data, where the identification information is used to uniquely identify the compressed target intermediate data;
and the data storage unit is used for correspondingly storing the compressed target intermediate data and the identification information to the storage unit.
Optionally, the identification determining unit is configured to:
inputting the compressed target intermediate data into a hash table to obtain a hash value of the compressed target intermediate data, wherein the hash value corresponds to a mapping address in the hash table;
determining index information of the hash value according to the number of layers of the target intermediate data and the operation corresponding to the target intermediate data;
and determining the hash value and the index information of the hash value as the identification information.
Optionally, the identification determining unit is configured to:
and determining index information of the compressed target intermediate data according to the layer number of the target intermediate data and the operation corresponding to the target intermediate data.
Optionally, the data decompression module 830 is configured to:
sending a decompression request to a decompression unit through a decompression interface, the decompression request requesting decompression of the compressed target intermediate data,
in response to the decompression request, the decompression unit decompresses the target intermediate data.
Optionally, the target intermediate data is a weight of an nth layer of the neural network;
the data compression module 810 is configured to:
compressing the weights of the nth layer of the neural network prior to loading the neural network.
Optionally, the target intermediate data further includes a feature map of an mth layer of the neural network, where m is a positive integer less than or equal to n;
the data compression module 810 is further configured to:
compressing the feature map of the mth layer in response to the (m+2)th layer of the neural network not using the feature map of the mth layer of the neural network for processing.
Optionally, the storage unit comprises a Static Random Access Memory (SRAM).
It should be noted that, when the apparatus provided in the foregoing embodiment implements the functions thereof, only the division of the functional modules is illustrated, and in practical applications, the functions may be distributed by different functional modules according to needs, that is, the internal structure of the apparatus may be divided into different functional modules to implement all or part of the functions described above. In addition, the apparatus and method embodiments provided by the above embodiments belong to the same concept, and specific implementation processes thereof are described in the method embodiments for details, which are not described herein again.
Referring to fig. 9, a block diagram of a terminal according to an embodiment of the present application is shown.
The terminal in the embodiment of the present application may include one or more of the following components: a processor 910 and a memory 920.
Processor 910 may include one or more processing cores. The processor 910 connects various parts within the terminal using various interfaces and lines, and performs the functions of the terminal and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 920 and calling data stored in the memory 920. Optionally, the processor 910 may be implemented in hardware using at least one of a Digital Signal Processor (DSP), a Field-Programmable Gate Array (FPGA), and a Programmable Logic Array (PLA). The processor 910 may integrate one or more of a Central Processing Unit (CPU), a modem, and the like. The CPU mainly handles the operating system, application programs, and so on; the modem is used to handle wireless communication. It is understood that the modem may not be integrated into the processor 910 and may instead be implemented by a separate chip.
Optionally, the processor 910, when executing the program instructions in the memory 920, implements the methods provided by the various method embodiments described above.
The Memory 920 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). Optionally, the memory 920 includes a non-transitory computer-readable medium. The memory 920 may be used to store instructions, programs, code sets, or instruction sets. The memory 920 may include a program storage area and a data storage area, wherein the program storage area may store instructions for implementing an operating system, instructions for at least one function, instructions for implementing the various method embodiments described above, and the like; the storage data area may store data created according to the use of the terminal, and the like.
The structure of the terminal described above is only illustrative, and in actual implementation, the terminal may include more or less components, such as: a display screen, etc., which are not limited in this embodiment.
Those skilled in the art will appreciate that the configuration shown in fig. 9 is not intended to be limiting and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components may be used.
In an exemplary embodiment, a computer-readable storage medium is also provided, in which a computer program is stored, which is loaded and executed by a processor of a computer device to implement the individual steps in the above-described method embodiments.
In an exemplary embodiment, a computer program product is also provided for implementing the above method when executed.
The above description is only exemplary of the present application and should not be taken as limiting the present application, and any modifications, equivalents, improvements and the like that are made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (13)

1. A method of data processing, the method comprising:
compressing target intermediate data, wherein the target intermediate data refers to data required to be used during nth layer processing of a neural network, and n is a positive integer;
in response to the nth layer of the neural network needing to use the target intermediate data for processing, acquiring the compressed target intermediate data;
decompressing the compressed target intermediate data to obtain the target intermediate data;
and processing the nth layer of the neural network according to the target intermediate data.
2. The method of claim 1, wherein compressing the target intermediate data comprises:
sending a compression request to a compression unit through a compression interface, the compression request requesting compression of the target intermediate data,
in response to the compression request, the compression unit compresses the target intermediate data.
3. The method of claim 1, wherein after compressing the target intermediate data, further comprising:
storing the compressed target intermediate data to a storage unit;
the acquiring the compressed target intermediate data includes:
and acquiring the compressed target intermediate data from the storage unit.
4. The method of claim 3, wherein storing the compressed target intermediate data to a storage unit comprises:
determining identification information of the compressed target intermediate data, wherein the identification information is used for uniquely identifying the compressed target intermediate data;
and correspondingly storing the compressed target intermediate data and the identification information to the storage unit.
5. The method of claim 4, wherein the determining the identification information of the compressed target intermediate data comprises:
inputting the compressed target intermediate data into a hash table to obtain a hash value of the compressed target intermediate data, wherein the hash value corresponds to a mapping address in the hash table;
determining index information of the hash value according to the number of layers of the target intermediate data and the operation corresponding to the target intermediate data;
and determining the hash value and the index information of the hash value as the identification information.
6. The method of claim 4, wherein the determining the identification information of the compressed target intermediate data comprises:
and determining index information of the compressed target intermediate data according to the layer number of the target intermediate data and the operation corresponding to the target intermediate data.
7. The method according to claim 1, wherein the decompressing the compressed target intermediate data to obtain the target intermediate data comprises:
sending a decompression request to a decompression unit through a decompression interface, the decompression request requesting decompression of the compressed target intermediate data,
in response to the decompression request, the decompression unit decompresses the target intermediate data.
8. The method according to any one of claims 1 to 7, wherein the target intermediate data is a weight of an nth layer of the neural network;
the compressing the target intermediate data comprises the following steps:
compressing the weights of the nth layer of the neural network prior to loading the neural network.
9. The method of claim 8, wherein the target intermediate data further comprises a feature map of an mth layer of the neural network, the m being a positive integer less than or equal to the n;
after compressing the weight of the nth layer of the neural network, the method further includes:
compressing the feature map of the mth layer in response to the (m+2)th layer of the neural network not using the feature map of the mth layer of the neural network for processing.
10. The method according to any one of claims 1 to 7, wherein the storage unit comprises a Static Random Access Memory (SRAM).
11. A data processing apparatus, characterized in that the apparatus comprises:
the data compression module is used for compressing target intermediate data, wherein the target intermediate data refers to data which needs to be used during nth layer processing of the neural network, and n is a positive integer;
the data acquisition module is used for acquiring the compressed target intermediate data in response to the nth layer of the neural network needing to use the target intermediate data for processing;
the data decompression module is used for decompressing the compressed target intermediate data to obtain the target intermediate data;
and the data processing module is used for processing the nth layer of the neural network according to the target intermediate data.
12. A terminal, characterized in that it comprises a processor and a memory, said memory storing a computer program which is loaded and executed by said processor to implement the data processing method according to any one of claims 1 to 10.
13. A computer-readable storage medium, in which a computer program is stored, which is loaded and executed by a processor to implement the data processing method according to any one of claims 1 to 10.
CN202010676005.1A, filed 2020-07-14 (priority date 2020-07-14): Data processing method, device, terminal and storage medium. Publication: CN111832692A. Status: Withdrawn.

Priority Applications (1)

Application Number: CN202010676005.1A; Priority Date: 2020-07-14; Filing Date: 2020-07-14; Title: Data processing method, device, terminal and storage medium


Publications (1)

Publication Number: CN111832692A; Publication Date: 2020-10-27

Family

Family ID: 72923210

Family Applications (1)

Application Number: CN202010676005.1A (filed 2020-07-14); Title: Data processing method, device, terminal and storage medium; Status: Withdrawn

Country Status (1)

CN: CN111832692A


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106096023A (en) * 2016-06-24 2016-11-09 腾讯科技(深圳)有限公司 Method for reading data, method for writing data and data server
CN106447034A * 2016-10-27 2017-02-22 中国科学院计算技术研究所 Neural network processor based on data compression, design method and chip
CN108255859A (en) * 2016-12-29 2018-07-06 航天信息股份有限公司 A kind of method and system for being used to establish index for mass digital certificate
US20190190538A1 (en) * 2017-12-18 2019-06-20 Facebook, Inc. Accelerator hardware for compression and decompression
CN111047020A (en) * 2018-10-12 2020-04-21 上海寒武纪信息科技有限公司 Neural network operation device and method supporting compression and decompression
CN110163370A (en) * 2019-05-24 2019-08-23 上海肇观电子科技有限公司 Compression method, chip, electronic equipment and the medium of deep neural network

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114723033A (en) * 2022-06-10 2022-07-08 成都登临科技有限公司 Data processing method, data processing device, AI chip, electronic device and storage medium
CN114723033B (en) * 2022-06-10 2022-08-19 成都登临科技有限公司 Data processing method, data processing device, AI chip, electronic device and storage medium

Similar Documents

Publication Publication Date Title
US11909422B2 (en) Neural network processor using compression and decompression of activation data to reduce memory bandwidth utilization
US20190318231A1 (en) 2019-10-17 Method for acceleration of a neural network model of an electronic equipment and a device thereof
WO2022037257A1 (en) Convolution calculation engine, artificial intelligence chip, and data processing method
CN110991630A (en) Convolutional neural network processor for edge calculation
TWI766568B (en) Processing device for executing convolution neural network computation and operation method thereof
CN111338695A (en) Data processing method based on pipeline technology and related product
WO2021147276A1 (en) Data processing method and apparatus, and chip, electronic device and storage medium
CN110647981B (en) Data processing method, data processing device, computer equipment and storage medium
US11494237B2 (en) Managing workloads of a deep neural network processor
CN111832692A (en) Data processing method, device, terminal and storage medium
CN110458285B (en) Data processing method, data processing device, computer equipment and storage medium
CN115456149B (en) Impulse neural network accelerator learning method, device, terminal and storage medium
US20200285955A1 (en) Method for accelerating deep learning and user terminal
US20210224632A1 (en) Methods, devices, chips, electronic apparatuses, and storage media for processing data
CN116051345A (en) Image data processing method, device, computer equipment and readable storage medium
US11507349B2 (en) Neural processing element with single instruction multiple data (SIMD) compute lanes
CN112612427A (en) Vehicle stop data processing method and device, storage medium and terminal
US11741349B2 (en) Performing matrix-vector multiply operations for neural networks on electronic devices
CN115456858B (en) Image processing method, device, computer equipment and computer readable storage medium
CN113435591B (en) Data processing method, device, computer equipment and storage medium
CN117370488A (en) Data processing method, device, electronic equipment and computer readable storage medium
CN114077889A (en) Neural network processor and data processing method
KR20220034542A (en) STORAGE DEVICE, and METHOD OF OPERATING STORAGE DEVICE
CN118153552A (en) Data analysis method, device, computer equipment and storage medium
CN116360575A (en) Data processing method, device, terminal equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication (application publication date: 2020-10-27)