CN110399972A - Data processing method, device and electronic equipment

Info

Publication number
CN110399972A
Authority
CN
China
Prior art keywords: convolution, item, operator, information, mask
Prior art date
Legal status
Granted
Application number
CN201910661953.5A
Other languages
Chinese (zh)
Other versions
CN110399972B (en)
Inventor
戴彦
李天健
Current Assignee
Shanghai Sensetime Intelligent Technology Co Ltd
Original Assignee
Shanghai Sensetime Intelligent Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shanghai Sensetime Intelligent Technology Co Ltd
Priority to CN201910661953.5A
Publication of CN110399972A
Application granted
Publication of CN110399972B
Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 17/00 - Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F 17/10 - Complex mathematical operations
    • G06F 17/15 - Correlation function computation including computation of convolution operations
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks

Abstract

This application provides a data processing method and apparatus. The method includes: obtaining mask information of a first convolution item, where the first convolution item comprises a sparse convolution operation between data to be processed and a convolution kernel, and the mask information of the first convolution item is used to mark the nonzero elements in the first convolution item; determining, according to the mask information of the first convolution item, information of a first convolution operator corresponding to the first convolution item among multiple preset convolution operators; and performing the sparse convolution operation on the first convolution item based on the information of the first convolution operator. With the data processing method provided by this application, the amount of computation performed by a computing device when executing convolution operations in a deep neural network can be reduced, and the efficiency of those convolution operations improved.

Description

Data processing method, device and electronic equipment
Technical field
This application relates to the field of deep learning, and in particular to a data processing method, device and electronic equipment.
Background art
With the rapid development of artificial intelligence (Artificial Intelligence, AI) technology, deep learning techniques based on deep neural networks, such as convolutional neural networks, can perform image recognition and detection, speech recognition and other tasks with high accuracy, and are widely used in fields such as security monitoring, intelligent driving, human-computer interaction and intelligent healthcare.
The amount of convolution computation in a deep neural network is usually very large; how to execute the convolution operations in a deep neural network efficiently is an urgent problem for those skilled in the art.
Summary of the invention
The embodiments of this application provide a data processing method, device and electronic equipment.
According to a first aspect of the embodiments of this application, a data processing method is provided, applied to a computing device. The method includes: obtaining mask information of a first convolution item, where the first convolution item includes a sparse convolution operation between data to be processed and a convolution kernel, and the mask information of the first convolution item is used to mark the nonzero elements in the first convolution item; determining, according to the mask information of the first convolution item, information of a first convolution operator corresponding to the first convolution item among multiple preset convolution operators; and performing the sparse convolution operation on the first convolution item based on the information of the first convolution operator.
In one possible implementation, the computing device stores the multiple preset convolution operators.
In one possible implementation, determining, according to the mask information of the first convolution item, the information of the first convolution operator corresponding to the first convolution item among the multiple preset convolution operators includes: determining a target operator address of the first convolution item according to the mask information of the first convolution item. Performing the sparse convolution operation on the first convolution item based on the information of the first convolution operator includes: obtaining the first convolution operator from the target operator address, and performing the sparse convolution operation on the first convolution item using the obtained first convolution operator.
In one possible implementation, determining, according to the mask information of the first convolution item, the information of the first convolution operator corresponding to the first convolution item among the multiple preset convolution operators includes: determining the information of the first convolution operator corresponding to the first convolution item according to the mask information of the first convolution item and a mapping relationship between preset convolution operator information and preset mask information.
In one possible implementation, the mapping relationship between the preset convolution operator information and the preset mask information is stored in a mapping table containing multiple entries.
In one possible implementation, determining the information of the first convolution operator corresponding to the first convolution item according to the mask information of the first convolution item and the mapping relationship between preset convolution operator information and preset mask information includes: looking up the mapping table based on the mask information of the first convolution item, and taking the preset convolution operator information in the entry of the mapping table whose preset mask information matches the mask information of the first convolution item as the information of the first convolution operator.
In one possible implementation, the mask information includes a mask of the sparse data in the first convolution item.
In one possible implementation, the mask information includes any one of the following: a data mask corresponding to the data to be processed; a convolution kernel mask corresponding to the convolution kernel; or a combined mask corresponding to both the data to be processed and the convolution kernel.
In one possible implementation, determining, according to the mask information of the first convolution item, the information of the first convolution operator corresponding to the first convolution item among the multiple preset convolution operators includes: determining, in a first iteration, the information of the first convolution operator corresponding to the first convolution item among the multiple preset convolution operators according to the mask information of the first convolution item.
Performing the sparse convolution operation on the first convolution item based on the information of the first convolution operator includes: performing, in a second iteration, the sparse convolution operation on the first convolution item based on the information of the first convolution operator, where the second iteration is an iteration subsequent to the first iteration.
In one possible implementation, a convolution operation is performed in the first iteration on a second convolution item that is different from the first convolution item.
In one possible implementation, the convolution operation on the second convolution item and the determination of the information of the first convolution operator are performed in parallel.
In this way, by determining the information of the first convolution operator corresponding to the first convolution item in advance, based on its mask information, in an earlier iteration, and performing the convolution operation on the first convolution item in a later iteration directly with the first convolution operator, the efficiency of the convolution operation is further improved.
In one possible implementation, the second iteration is the next iteration after the first iteration.
In one possible implementation, the method further includes storing the information of the first convolution operator.
In one possible implementation, the method further includes: obtaining, in the second iteration, the first convolution operator stored in a memory based on the information of the first convolution operator.
In one possible implementation, the method further includes: performing, in the first iteration, a sparse convolution operation on the second convolution item based on information of a second convolution operator corresponding to the second convolution item, where the first convolution item is the next convolution item after the second convolution item.
In one possible implementation, the method further includes: obtaining, in the first iteration, the mask information of the second convolution item or the information of the second convolution operator corresponding to the second convolution item.
In one possible implementation, if the first iteration is the first iteration of the current data processing procedure, then in the first iteration the second convolution operator corresponding to the second convolution item is determined based on the mask information of the second convolution item and the convolution operation is performed on the second convolution item with the second convolution operator; in addition, in the first iteration the first convolution operator corresponding to the first convolution item is determined based on the mask information of the first convolution item and the information of the first convolution operator is stored.
In one possible implementation, if the first iteration is not the first iteration of the current data processing procedure, then in the first iteration the information of the second convolution operator can be obtained from a memory and the second convolution operator obtained based on that information; alternatively, in the first iteration the mask information of the second convolution item can be obtained and the information of the second convolution operator corresponding to that mask information obtained from the memory.
In one possible implementation, before obtaining the mask information of the first convolution item, the method further includes: determining the mask information of the first convolution item according to the nonzero elements included in the sparse data of the first convolution item.
In one possible implementation, the mask information of the first convolution item includes a preset number of mask elements, and the number M of the preset convolution operators and the number n of the mask elements satisfy either of the following formulas: M = 2^n, or M = 2^n - 1.
According to a second aspect of the embodiments of this application, a data processing device is provided, including: a mask obtaining module, configured to obtain mask information of a first convolution item, where the first convolution item includes a sparse convolution operation between data to be processed and a convolution kernel, and the mask information of the first convolution item is used to mark the nonzero elements in the first convolution item; an operator information determining module, configured to determine, according to the mask information of the first convolution item, information of a first convolution operator corresponding to the first convolution item among multiple preset convolution operators; and a computing module, configured to perform the sparse convolution operation on the first convolution item based on the information of the first convolution operator.
Optionally, the operator information determining module is configured to determine a target operator address of the first convolution item according to the mask information of the first convolution item;
the computing module is configured to obtain the first convolution operator from the target operator address and to perform the sparse convolution operation on the first convolution item using the obtained first convolution operator.
Optionally, the operator information determining module is configured to determine the information of the first convolution operator corresponding to the first convolution item according to the mask information of the first convolution item and a mapping relationship between preset convolution operator information and preset mask information.
Optionally, the mapping relationship between the preset convolution operator information and the preset mask information is stored in a mapping table containing multiple entries;
the operator information determining module is configured to look up the mapping table based on the mask information of the first convolution item and to take the preset convolution operator information in the entry of the mapping table whose preset mask information matches the mask information of the first convolution item as the information of the first convolution operator.
Optionally, the mask information includes any one of the following:
a data mask corresponding to the data to be processed;
a convolution kernel mask corresponding to the convolution kernel;
a combined mask corresponding to both the data to be processed and the convolution kernel.
Optionally, the operator information determining module is configured to determine, in a first iteration, the information of the first convolution operator corresponding to the first convolution item among the multiple preset convolution operators according to the mask information of the first convolution item;
the computing module is configured to perform, in a second iteration, the sparse convolution operation on the first convolution item based on the information of the first convolution operator, where the second iteration is the next iteration after the first iteration.
Optionally, the computing module is configured to perform, in the first iteration, a sparse convolution operation on a second convolution item based on information of a second convolution operator corresponding to the second convolution item, where the first convolution item is the next convolution item after the second convolution item.
Optionally, the mask obtaining module is configured to obtain the mask information of the second convolution item in the first iteration.
Optionally, the device further includes:
a mask information determining module, configured to determine the mask information of the first convolution item according to the nonzero elements included in the sparse data of the first convolution item.
Optionally, the mask information of the first convolution item includes a preset number of mask elements;
the number M of the preset convolution operators and the number n of the mask elements satisfy either of the following formulas: M = 2^n, or M = 2^n - 1.
Optionally, the device further includes:
a storage module, configured to store at least one of the mask information of the first convolution item and the information of the first convolution operator.
According to a third aspect of the embodiments of this application, a convolution operation accelerator is provided, including:
an operator memory, configured to store multiple preset convolution operators;
a controller, configured to determine, according to mask information of a first convolution item, information of a first convolution operator corresponding to the first convolution item among the multiple preset convolution operators;
a register cache area, configured to provide input data for the convolution operation;
a computing unit, configured to perform a sparse convolution operation on the first convolution item according to the first convolution operator output by the operator memory.
Optionally, a branch multiplex predictor is provided in the controller;
the branch multiplex predictor is configured to determine, in a first iteration, the information of the first convolution operator corresponding to the first convolution item among the multiple preset convolution operators according to the mask information of the first convolution item, so that a computing device can perform, in a second iteration, the sparse convolution operation on the first convolution item based on the information of the first convolution operator;
where the second iteration is the next iteration after the first iteration.
According to a fourth aspect of the embodiments of this application, a computer-readable storage medium is provided. The storage medium stores a computer program, and when the computer program is executed by a processor, the method of any one of the implementations of the first aspect is realized.
According to a fifth aspect of the embodiments of this application, a chip is provided, including:
a processor, configured to execute the method of any possible implementation of the first aspect.
According to a sixth aspect of the embodiments of this application, an electronic device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the method of any one of the implementations of the first aspect when executing the program.
With the data processing method provided by the embodiments of this application, the computing device determines the information of the corresponding first convolution operator based on the mask information of the first convolution item, then determines the first convolution operator based on that information, and performs the sparse convolution operation on the first convolution item with the first convolution operator. This reduces the amount of computation performed by the computing device when executing convolution operations in a deep neural network and improves the efficiency of the convolution operations.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit this application.
Detailed description of the invention
The drawings herein are incorporated into and form part of this specification; they show embodiments consistent with this application and, together with the specification, serve to explain the principles of this application.
Fig. 1 is a flowchart of a data processing method of this application shown according to an exemplary embodiment;
Fig. 2 is a flowchart of another data processing method of this application shown according to an exemplary embodiment;
Fig. 3 is a flowchart of another data processing method of this application shown according to an exemplary embodiment;
Fig. 4 is a flowchart of another data processing method of this application shown according to an exemplary embodiment;
Fig. 5 is a schematic diagram of convolution operation acceleration of this application shown according to an exemplary embodiment;
Fig. 6 is a schematic diagram of another data processing method of this application shown according to an exemplary embodiment;
Fig. 7 is a block diagram of a data processing device of this application shown according to an exemplary embodiment;
Fig. 8 is a block diagram of another data processing device of this application shown according to an exemplary embodiment;
Fig. 9 is a block diagram of another data processing device of this application shown according to an exemplary embodiment;
Fig. 10 is a structural schematic diagram of a convolution operation accelerator of this application shown according to an exemplary embodiment;
Fig. 11 is a structural schematic diagram of an electronic device of this application shown according to an exemplary embodiment.
Detailed description of embodiments
Exemplary embodiments are described in detail here, with examples illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with this application; rather, they are merely examples of devices and methods consistent with some aspects of this application as detailed in the appended claims.
The terms used in this application are for the purpose of describing particular embodiments only and are not intended to limit this application. The singular forms "a", "said" and "the" used in this application and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" used herein refers to and includes any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third and so on may be used in this application to describe various pieces of information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of this application, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon" or "in response to determining".
Before introducing this application, the artificial neural network knowledge involved in the embodiments of this application is briefly introduced:
A neural network may include network units such as convolutional layers (Convolutional Layer), pooling layers (Pooling Layer), activation layers (Activation Layer) and fully connected layers (Fully Connected Layer), stacked in a certain way.
Regarding convolutional layers: a convolutional layer in an artificial neural network is a layer that transforms the original input image, or the feature map output by the previous layer, by means of convolution operations. The convolutional layer is a type of layer commonly used by deep neural networks (Deep Neural Network, DNN) when processing images. A deep neural network built on convolutional layers is called a convolutional neural network (Convolutional Neural Network, CNN).
Regarding convolution operations: within a convolutional layer, multiple convolution kernels are usually used to perform different convolution operations on the input image in order to extract features of various forms from the image.
Regarding the input data of a convolutional layer: in general, the first convolutional layer of a deep neural network takes an image as its input data, while later convolutional layers take the feature map (Feature Map) output by the preceding convolutional layer as input data. In the embodiments of this application, the input image of the first convolutional layer and the input feature maps of the other convolutional layers may both be referred to as data to be processed.
In practical application scenarios, an intelligent task may be composed of multiple neural networks such as convolutional neural networks. A convolutional neural network is computation-intensive: a single data pass may involve one or more convolutional layers performing dozens or even hundreds of convolution operations. The amount of convolution computation involved in an intelligent task is therefore usually very large and requires a large amount of computing resources, and the deeper and more complex the network, the greater the demand for computing resources.
In order to reduce the amount of convolution computation in a neural network, so that devices with limited computing capability, such as edge devices, can also execute intelligent tasks well using neural networks, the embodiments of this application provide a data processing method to improve the efficiency with which a computing device executes convolution operations. The computing device may be an embedded AI (Artificial Intelligence) chip, or a terminal device that performs edge computing.
Referring to Fig. 1, a flowchart of a data processing method shown according to an exemplary embodiment, the method can be applied in a computing device and includes the following steps:
In step 11, the mask information of a first convolution item is obtained;
In a deep neural network such as a convolutional neural network (CNN), for one convolutional layer, the process of performing a convolution operation between the input data to be processed and one convolution kernel is called the computation of one convolution item. That is, in the embodiments of this application, the data to be processed and the convolution kernel that are about to participate in a convolution operation are together referred to as a convolution item.
The first convolution item in the embodiments of this application may be the convolution item on which the computing device is currently about to operate, or the convolution item still to be processed after the computing device has completed the current round of convolution operations; this will be described in detail later with examples.
The mask information of the first convolution item is used to determine the convolution operator that performs the sparsified convolution operation on the first convolution item. The mask information is used to mark the nonzero elements in the first convolution item.
In step 12, according to the mask information of the first convolution item, the information of the first convolution operator corresponding to the first convolution item among multiple preset convolution operators is determined;
In the embodiments of this application, the computing device can use the mask information of the first convolution item to determine the information of the first convolution operator, where the first convolution operator is the convolution operator used by the computing device when performing the sparse convolution operation on the first convolution item. In the embodiments of this application, the information of the first convolution operator can be the first convolution operator itself, or index information used to look up the first convolution operator, such as an operator address.
In the embodiments of this application, a preset storage location of the computing device stores multiple preset convolution operators, and the computing device can determine, from these preset convolution operators, the information of the first convolution operator corresponding to the first convolution item according to the mask information of the first convolution item.
In step 13, the sparse convolution operation is performed on the first convolution item based on the information of the first convolution operator.
After determining the information of the first convolution operator, the computing device can determine the first convolution operator and then use it to perform the sparse convolution operation on the first convolution item.
In the embodiments of this application, before performing the sparse convolution operation on the first convolution item, the computing device can determine the information of the corresponding first convolution operator in advance based on the mask information of the first convolution item, then determine the first convolution operator based on that information and perform the sparse convolution operation on the first convolution item with it. This effectively saves computing resources and reduces the amount of computation performed by the computing device when executing the convolution operations in a deep neural network, so that devices with limited computing capability, such as edge devices (for example mobile phones, security cameras, vehicles, smart home devices and various IoT (Internet of Things) devices that perform edge computing), can also execute the convolution operations of a deep neural network efficiently, which improves the utilization of the computing resources in the computing device and the artificial intelligence experience of the device.
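As a concrete illustration of steps 11 to 13, the following minimal sketch (an assumption for illustration only, not the implementation of this application) models the preset convolution operators as Python callables selected by the mask string; all names are illustrative.

    # Sketch of steps 11-13: mask information -> operator information -> sparse convolution.
    def process_first_conv_item(data, kernel, mask_info, preset_operators):
        operator = preset_operators[mask_info]      # step 12: select the matching operator
        return operator(kernel, data)               # step 13: perform the sparse convolution

    # One illustrative preset operator, specialised for mask "0010" (only the third
    # data element is nonzero; see the worked feature-map example later on).
    preset_operators = {
        "0010": lambda w, f: [0.0, 0.0, w[2] * f[2], w[1] * f[2], w[0] * f[2], 0.0],
    }
    print(process_first_conv_item([0.0, 0.0, 1.5, 0.0], [2.0, 3.0, 1.0], "0010", preset_operators))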
Regarding the implementation of steps 12 and 13, depending on when the first convolution item occurs during the convolution operations, there are the following two cases:
Case one: the first convolution item is the convolution item the computing device is currently executing; that is, the determination of the information of the first convolution operator is not tied to the computation of another convolution item.
In case one, the first convolution item is the convolution item currently pending on the computing device. In step 12, the computing device uses the mask information of the first convolution item to determine the information of the first convolution operator. Correspondingly, step 13 is: determining the first convolution operator using the information of the first convolution operator, and performing the sparse convolution operation on the first convolution item with the first convolution operator.
Case two: the determination of the information of the first convolution operator is related to the execution of the previous round of convolution operations.
For case two, referring to Fig. 2, a flowchart of another data processing method shown according to an exemplary embodiment, step 12 may include:
Step 121: in a first iteration, determining, according to the mask information of the first convolution item, the information of the first convolution operator corresponding to the first convolution item among the multiple preset convolution operators;
Correspondingly, step 13 may include:
Step 131: in a second iteration, performing the sparse convolution operation on the first convolution item based on the information of the first convolution operator, where the second iteration is the next iteration after the first iteration.
In the embodiments of this application, the following process can be regarded as one iteration: from the input of data into the computing device to the completion of the sparse convolution operation on the data to be processed contained in that input.
In the embodiments of this application, if the first iteration is related to the computation of the first convolution item, then the input data obtained by the computing device in the first iteration includes not only the data to be processed of the current convolution operation but also the mask information of the convolution item involved in the next convolution operation, i.e. the mask information of the first convolution item. Here, the first convolution item is the convolution item on which the computing device performs the convolution operation in the second iteration, and in time the second iteration follows the first iteration.
For ease of understanding, the relationship between the first iteration, the second iteration, the first convolution item and the mask information of the first convolution item can be shown as in Table 1:
Table 1
Iteration | Input mask information | Operand
First iteration | Mask information of the first convolution item | Second convolution item
Second iteration | (not listed here) | First convolution item
As can be seen from Table 1, in the embodiments of this application the second convolution item is the operand of the convolution operation in the first iteration, and the first convolution item is the operand of the convolution operation in the second iteration, where in time the second iteration comes after the first iteration. Correspondingly, the convolution operation on the first convolution item comes, in time, after the convolution operation on the second convolution item.
In the embodiments of this application, the information of the first convolution operator corresponding to the first convolution item can be determined in the first iteration according to the mask information of the first convolution item. In the second iteration, when the computing device performs the sparse convolution operation on the first convolution item, it can quickly determine the first convolution operator from the information of the first convolution operator determined in advance in the first iteration, and then perform the sparse convolution operation on the first convolution item with the first convolution operator.
Correspondingly, in the embodiments of this application, the method further includes:
in the first iteration, performing a sparse convolution operation on the second convolution item based on the information of the second convolution operator corresponding to the second convolution item, where the first convolution item is the next convolution item after the second convolution item.
It should be noted that this step can be carried out simultaneously with step 121.
Corresponding to case two above, the computing device can determine the information of the first convolution operator based on the mask information of the first convolution item before, or while, performing the convolution operation on the second convolution item.
In some embodiments of this application, the implementation of step 121 may specifically be: in the first iteration, determining the target operator address of the first convolution item according to the mask information of the first convolution item.
Correspondingly, step 131 may include:
obtaining the first convolution operator from the target operator address in the second iteration, and performing the sparse convolution operation on the first convolution item with the obtained first convolution operator.
Regarding the implementation of step 13: in the embodiments of this application, a preset storage location of the computing device stores the preset convolution operators. After determining the target operator address, the computing device can use the target operator address to obtain the corresponding target convolution operator from the preset storage location and perform the sparse convolution operation on the convolution item with the target convolution operator.
In some embodiments of this application, a preset convolution operator set can be pre-configured in the computing device. The set contains multiple preset convolution operators, and each convolution operator is used to perform one kind of convolution operation. The preset convolution operator set includes sparse convolution operators for performing sparsified convolution operations. A sparsified convolution operation sets certain elements of a convolution item to 0, turning them into zero-valued elements; when the computing device performs the convolution operation on the convolution item with a convolution operator, it can skip the operations related to these zero-valued elements and perform the convolution operation only on the data corresponding to nonzero elements.
After the computing device determines the target operator address according to the preset mask information, it looks up the corresponding target convolution operator in the preset convolution operator set and calls the program code corresponding to the target convolution operator to perform the sparse convolution operation on the convolution item.
Referring to Fig. 3, a flowchart of another data processing method shown according to an exemplary embodiment, before step 11 the method may further include:
Step 10: determining the mask information of the first convolution item according to the nonzero elements included in the sparse data of the first convolution item.
On how to determine the mask information of a convolution item: in the embodiments of this application, for a convolution item, some of its elements can first be set to zero by a preset sparsification method, such as pruning; the convolution item after sparsification can be called sparse data in the embodiments of this application. The mask information is then determined according to the position information of the nonzero elements in the sparse data. In this way, in the embodiments of this application, the mask information of a convolution item can mark the nonzero elements in the sparse data.
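As a simple illustration, and under the assumed convention of one mask element per data element with '1' marking a nonzero entry, the mask can be derived from the sparse data as follows (a sketch, not the implementation of this application):

    # Sketch: derive the mask information from the sparse data left after pruning.
    def mask_of(sparse_data, eps=0.0):
        return "".join("1" if abs(x) > eps else "0" for x in sparse_data)

    print(mask_of([0.0, 0.0, 1.5, 0.0]))   # "0010", matching the feature-map example below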
In the embodiments of this application, depending on the object of sparsification, the mask information may fall into the following three categories:
Data mask: mask information corresponding to the data to be processed; its role is to mark the nonzero elements in the data to be processed.
The sparse convolution implementation corresponding to the data mask can be called the Dense-Sparse convolution implementation: the nonzero elements in the convolution kernel are dense, the nonzero elements in the data to be processed are sparse, and the mask information is used to mark the nonzero elements in the data to be processed after sparsification.
Convolution kernel mask: mask information corresponding to the convolution kernel; its role is to mark the nonzero elements in the convolution kernel.
The sparse convolution implementation corresponding to the convolution kernel mask can be called the Sparse-Dense convolution implementation: the nonzero elements in the convolution kernel are sparse, the nonzero elements in the input data to be processed, such as a feature map tile, are dense, and the mask information is used to mark the nonzero elements in the convolution kernel after sparsification.
Combined mask: mask information corresponding to both the data to be processed and the convolution kernel; its role is to mark the nonzero elements in the data to be processed and in the convolution kernel at the same time.
The sparse convolution implementation corresponding to the combined mask can be called the Sparse-Sparse convolution implementation: the nonzero elements in the convolution kernel are sparse and the nonzero elements in the input data to be processed are also sparse; the mask information marks the nonzero elements contained in both the sparsified data to be processed and the sparsified convolution kernel.
In practical application scenarios, one of these convolution implementations can be selected to perform the sparse convolution operation according to the actual application requirements; correspondingly, the mask information of the corresponding category is included in the mask information of the convolution item input to the computing device and is used to mark the nonzero elements in the convolution item.
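The three categories can be illustrated with a small sketch (assumed conventions and names; the bit layout of the combined mask, kernel mask followed by data mask, follows the example given later in this description):

    # Sketch of the three mask categories (illustrative only).
    def data_mask(feature_tile):                 # Dense-Sparse: sparse input data
        return "".join("1" if x != 0.0 else "0" for x in feature_tile)

    def kernel_mask(kernel):                     # Sparse-Dense: sparse convolution kernel
        return "".join("1" if w != 0.0 else "0" for w in kernel)

    def combined_mask(kernel, feature_tile):     # Sparse-Sparse: both are sparse
        return kernel_mask(kernel) + data_mask(feature_tile)

    print(combined_mask([0.0, 3.0, 0.0], [0.0, 0.0, 1.5, 0.0]))   # "0100010"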
The embodiments of this application involve determining the sparse convolution operator indirectly from the mask information. To illustrate how the method provided by this application improves the efficiency of convolution operations, the following describes the relationship between mask information and convolution operators by way of example:
Once the mask category is determined, each preset mask information corresponds to a specific sparse convolution operator.
Illustratively, taking the data to be processed, i.e. a feature map tile, as the example, the corresponding mask information is the data mask described above; the following content explains the correspondence between mask information and sparse convolution operators with a concrete example.
In practical application scenarios, the feature map f and the convolution kernel w are generally 4-dimensional tensors, and the data format can be the NCHW format. Illustratively, for a feature map f in NCHW format, N denotes the number of input feature maps; C denotes the number of channels (Channel) of the feature map data, for example, for image data in RGB format the number of channels is 3, corresponding to the R channel, G channel and B channel respectively; H denotes the vertical (Height) component of the pixel coordinate system of the feature map; and W denotes the width (Width) component of the pixel coordinate system of the feature map.
This application takes the convolution of a one-dimensional convolution kernel with a one-dimensional feature map tile as an example to illustrate the correspondence between mask information and convolution operators.
It is assumed that the size of the convolution kernel (kernel size) is 1 × 3 and the stride is 1. The one-dimensional convolution kernel includes three elements, labeled w0, w1, w2; the values of the kernel elements are shown in Table 2:
Table 2
w0 | w1 | w2
2.0f | 3.0f | 1.0f
Assume the one-dimensional feature map tile (Tile) includes four elements, labeled f0, f1, f2, f3; the correspondence between the feature map elements and their values can be shown as in Table 3:
Table 3
f0 | f1 | f2 | f3
0.0f | 0.0f | 1.5f | 0.0f
Without sparsification, the computing device computes according to the default convolution operator as follows:
Output0=w2 × f0;
Output1=w1 × f0+w2 × f1;
Output2=w0 × f0+w1 × f1+w2 × f2;
Output3=w0 × f1+w1 × f2+w2 × f3;
Output4=w0 × f2+w1 × f3;
Output5=w0 × f3.
It can be seen that over the entire convolution operation the computing device needs to execute 12 multiplication terms, namely: w2 × f0, w1 × f0, w2 × f1, w0 × f0, w1 × f1, w2 × f2, w0 × f1, w1 × f2, w2 × f3, w0 × f2, w1 × f3, w0 × f3.
From Table 3, since only f2 among the elements of the feature map tile is not 0, only the following three output results are nonzero: Output2 = w2 × f2; Output3 = w1 × f2; Output4 = w0 × f2. That is, among the 12 multiplication terms executed by the computing device, only the three terms w2 × f2, w1 × f2 and w0 × f2 have nonzero results.
Based on this, the embodiments of this application design a sparse convolution operator. The computing device calls this sparse convolution operator and directly carries out the computation of the three terms w2 × f2, w1 × f2 and w0 × f2, without executing all of the 12 terms above in turn; the amount of computation is 3/12 of the original.
The sparse convolution operator corresponds to the element characteristics of the feature map tile shown in Table 3. From Table 3, the values of f0, f1 and f3 are zero. Assuming that every element of the originally input feature map tile is nonzero, in the embodiments of this application the feature map tile data shown in Table 3 can be marked by the mask information (m0, m1, m2, m3).
Illustratively, assume that a mask element with value 0 indicates that the data at the corresponding position is set to 0. The mask information produced for Table 3 is then 0010, and from the computation above it can be determined that the sparse convolution operator corresponding to mask information 0010 performs the operations: Output2 = w2 × f2; Output3 = w1 × f2; Output4 = w0 × f2.
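A runnable sketch of this example (an assumed Python representation, not the operator code of this application): the dense path and the operator specialised for mask 0010 produce the same outputs, with 12 versus 3 multiplications.

    # Dense computation of the 1 x 3 kernel over the 4-element tile (12 multiplications).
    def dense_conv_1x3(w, f):
        return [w[2]*f[0],
                w[1]*f[0] + w[2]*f[1],
                w[0]*f[0] + w[1]*f[1] + w[2]*f[2],
                w[0]*f[1] + w[1]*f[2] + w[2]*f[3],
                w[0]*f[2] + w[1]*f[3],
                w[0]*f[3]]

    # Sparse operator for mask 0010: only Output2..Output4 are computed (3 multiplications).
    def operator_0010(w, f):
        out = [0.0] * 6
        out[2] = w[2] * f[2]
        out[3] = w[1] * f[2]
        out[4] = w[0] * f[2]
        return out

    w = [2.0, 3.0, 1.0]          # Table 2
    f = [0.0, 0.0, 1.5, 0.0]     # Table 3
    assert dense_conv_1x3(w, f) == operator_0010(w, f)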
The 4 mask elements m0, m1, m2, m3 can therefore form 16 mask combinations, which can correspond to 16 independent convolution operators. The correspondence between mask information and convolution operators can be shown as in Table 4:
Table 4
Mask information | Convolution operator
0000 | Operator 0
0001 | Operator 1
0010 | Operator 2
0011 | Operator 3
0100 | Operator 4
0101 | Operator 5
0110 | Operator 6
0111 | Operator 7
1000 | Operator 8
1001 | Operator 9
1010 | Operator 10
1011 | Operator 11
1100 | Operator 12
1101 | Operator 13
1110 | Operator 14
1111 | Operator 15
The list shown in Table 4 contains 16 convolution operators in total, Operator 0 to Operator 15, and each mask information corresponds to one convolution operator.
Apart from the empty operator, Operator 0, corresponding to mask 0000, and the dense operator, Operator 15, corresponding to mask information 1111, the remaining operators Operator 1 to Operator 14 can be called sparse convolution operators and are used to perform sparse convolution operations on the convolution item to be operated on.
In the embodiments of this application, the sparse convolution operator corresponding to a mask information can be understood as the program code that performs the sparse convolution operation corresponding to that specific mask combination.
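A sketch of what such per-mask code could look like (an assumption for illustration: each operator is generated here as a Python closure that only visits the tile positions its mask marks as nonzero, whereas a real implementation would store 16 separately specialised code blocks):

    # Sketch: one specialised operator per 4-bit mask, as in Table 4.
    def make_operator(mask_bits):                  # e.g. "0010" -> Operator 2
        nonzero = [i for i, b in enumerate(mask_bits) if b == "1"]
        def operator(w, f):                        # 1 x 3 kernel, stride 1, 4-element tile
            out = [0.0] * (len(f) + len(w) - 1)
            for i in nonzero:                      # zero-valued tile elements are skipped
                for k in range(len(w)):
                    out[i + len(w) - 1 - k] += w[k] * f[i]
            return out
        return operator

    OPERATORS = {format(m, "04b"): make_operator(format(m, "04b")) for m in range(16)}
    print(OPERATORS["0010"]([2.0, 3.0, 1.0], [0.0, 0.0, 1.5, 0.0]))   # same result as above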
The example above illustrates the correspondence between mask information and convolution operators using the data mask. Similarly, this correspondence also applies to the correspondence between convolution kernel masks and convolution operators, and between combined masks and convolution operators.
For example, for the combined mask, still taking the convolution of the one-dimensional convolution kernel and the one-dimensional feature map tile above as an example, the sparsified convolution operator can be determined using a combination of 7 mask elements.
Still taking the convolution item above as an example, assume the mask of the convolution kernel is 010; this mask marks the convolution kernel elements obtained after pruning the kernel of Table 2, as shown in Table 5:
Table 5
w0 | w1 | w2
0.0f | 3.0f | 0.0f
Assume that the mask information marking the nonzero elements in the feature map tile is still 0010, as shown in Table 3. Then, for the Sparse-Sparse convolution operation performed on the above convolution item, the sparse convolution operator corresponding to the preset mask 0100010 performs the operation: Output3 = w1 × f2.
In this example, the high three bits "010" of the preset mask information 0100010 are the mask marking the nonzero elements in the convolution kernel, and the low four bits "0010" are the mask marking the nonzero elements in the feature map tile. In this example, provided the image features extracted after the convolution operator corresponding to the preset mask information 0100010 performs the Sparse-Sparse convolution operation meet the requirements of the intelligent task, the computing device only needs to compute a single product term with this convolution operator when executing the above convolution item; the amount of computation is reduced from the original 12 product terms to 1 product term, which effectively reduces the amount of convolution computation of the computing device.
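A sketch of this Sparse-Sparse case (assumed representation; with the combined mask 0100010 a single product, w1 × f2, remains):

    # Sketch: a combined-mask operator that only visits positions where both the
    # kernel element and the tile element are marked nonzero.
    def combined_operator(kernel_mask, data_mask):
        wk = [k for k, b in enumerate(kernel_mask) if b == "1"]
        fi = [i for i, b in enumerate(data_mask) if b == "1"]
        def operator(w, f):
            out = [0.0] * (len(f) + len(w) - 1)
            for i in fi:
                for k in wk:                      # here: the single product w1 * f2
                    out[i + len(w) - 1 - k] += w[k] * f[i]
            return out
        return operator

    op = combined_operator("010", "0010")
    print(op([0.0, 3.0, 0.0], [0.0, 0.0, 1.5, 0.0]))   # only Output3 = 4.5 is nonzero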
It should be understood that in some embodiments of this application the number M of convolution operators and the number n of mask elements can satisfy the relationship M = 2^n; in other embodiments of this application they can instead satisfy the relationship M = 2^n - 1.
In practical applications, a sparse convolution operator can be selected according to the requirements of the intelligent task to perform the sparse convolution operation on the convolution item, thereby reducing the amount of computation of the convolutional layers in the neural network and, in turn, the amount of computation of the computing device when executing the intelligent task.
In some embodiments of this application, step 12 may include:
determining the information of the first convolution operator corresponding to the first convolution item according to the mask information of the first convolution item and the mapping relationship between preset convolution operator information and preset mask information.
In some embodiments, the mapping relationship between the preset convolution operators and the preset mask information can be stored in a mapping table containing multiple entries.
That is, the mapping table includes multiple entries, and each entry indicates the mapping relationship between one preset mask information and the information of one preset convolution operator.
Correspondingly, step 12 may include:
looking up the mapping table based on the mask information of the first convolution item, and taking the preset convolution operator information in the entry of the mapping table whose preset mask information matches the mask information of the first convolution item as the information of the first convolution operator.
The above process matches the mask information of the first convolution item against the entries of the mapping table and determines the convolution operator information in the target entry as the information of the first convolution operator, where the target entry contains the mapping relationship between the mask information of the first convolution item and the information of the first convolution operator. In some embodiments of this application, the information of the first convolution operator can be the address of the first convolution operator.
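A small sketch of such a lookup (an assumed layout: each entry here stores the operator label from Table 4, whereas in practice the entry would more likely hold an operator address):

    # Sketch of the mapping table and its lookup (illustrative layout only).
    MAPPING_TABLE = [
        {"preset_mask": format(m, "04b"), "operator_info": f"Operator {m}"}
        for m in range(16)
    ]

    def first_operator_info(mask_info):
        for entry in MAPPING_TABLE:
            if entry["preset_mask"] == mask_info:   # the matched entry is the target entry
                return entry["operator_info"]
        raise KeyError(mask_info)

    print(first_operator_info("0010"))              # "Operator 2"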
Based on this, in the data processing method provided by this application, the address of the sparse convolution operator, i.e. the target operator address, can first be determined according to the mask information of the convolution item; execution then branches to the target convolution operator pointed to by the target operator address, and the convolution operation is performed on the convolution item with that target convolution operator.
In some embodiments of this application, the computing device can query a preset operator address list according to the preset mask information to determine the target operator address.
Corresponding to case one above, referring to Fig. 4, a flowchart of another data processing method shown according to an exemplary embodiment, step 12 may include:
In step 1211, the preset operator address list is matched according to the mask information of the first convolution item to determine the target operator address of the first convolution item, where the preset operator address list includes the correspondence between mask information and convolution operator addresses;
In some embodiments of this application, the computing device can store a preset operator address list, which includes the correspondence between mask information and convolution operator addresses.
The mask information can be a combination of a preset number of mask elements. Each mask element occupies 1 bit, and its value can be a binary value, for example 0 or 1. In the embodiments of this application, the meaning of the binary value of a mask element can be agreed upon in advance; for example, a mask element with value 0 marks a pixel in the convolution item, such as a feature map tile, that has been set to 0, while a mask element with value 1 marks a nonzero pixel value in the convolution item, such as a feature map tile.
Step 13 may include:
Step 1311, obtaining the first convolution operator based on the target operator address of the first convolution item;
In some embodiments of this application, a preset convolution operator set can be pre-stored in the computing device; for example, the instruction sets corresponding to the convolution operators are stored in the cache of the processor of the electronic device.
Step 1311 can then specifically be: looking up the first convolution operator in the preset convolution operator set based on the target operator address of the first convolution item.
Step 1312, performing the sparse convolution operation on the first convolution item using the first convolution operator.
Corresponding to case one above, illustratively, referring to Fig. 5, a schematic diagram of convolution operation acceleration shown according to an exemplary embodiment, and still taking the 16 convolution operators shown in Table 4 as an example: at program compilation time, 16 operator addresses are defined for the 16 convolution operators corresponding to the 16 mask combinations. The address of each convolution operator can be determined from a base address, the mask information and the operator block size, where the base address (Base Address) is the starting address of all the convolution operators and the operator block size indicates the amount of storage occupied by the operator code; in the embodiments of this application each operator block size can be 0x100. The operator address corresponding to a mask information is then: base address + mask * 0x100. For example, the binary mask information 0101 corresponds to the hexadecimal value 5, so the operator address (Operator Address) corresponding to mask information 0101 is Base + 0x500, from which Operator 5 can be found. Similarly, the binary mask information 0110 corresponds to the hexadecimal value 6, so the operator address corresponding to mask information 0110 can be expressed as Base + 0x600, from which Operator 6 can be found; the binary mask information 1010 corresponds to the hexadecimal value A, so the operator address corresponding to mask information 1010 can be expressed as Base + 0xA00, from which Operator 10 can be found. In this way each mask information corresponds to the address of one convolution operator.
While the processor of the computing device executes the program, a control circuit (Control Circuit) controls the target address of the program counter (Program Counter, PC) based on the current mask information. For example, if the sparse convolution mask of the convolution item currently to be operated on, such as a feature map block, is 0101, the computing device controls the PC to jump to the address Base + 0x500, executes the sparse convolution operation on that convolution item using the convolution operator at that address, and returns to the normal program flow after the operation has finished. No iterative computation process is involved here.
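As an illustrative software analogue only (the operator bodies below are hypothetical and the bit ordering follows the sketch above), the PC jump can be modelled as a jump table keyed by the mask: each entry is a specialised kernel that touches only the positions its mask marks as non-zero, so no per-element zero test or loop is needed when it runs.

```python
from typing import Callable, Dict, List

Kernel = Callable[[List[float], List[float]], float]

# Hypothetical specialised kernels for a 4-element block and a 4-element
# convolution kernel; each one hard-codes the non-zero positions of its mask.
def op_0101(x: List[float], w: List[float]) -> float:
    return x[1] * w[1] + x[3] * w[3]   # mask 0101: positions 1 and 3 are non-zero

def op_1001(x: List[float], w: List[float]) -> float:
    return x[0] * w[0] + x[3] * w[3]   # mask 1001: positions 0 and 3 are non-zero

def op_1100(x: List[float], w: List[float]) -> float:
    return x[0] * w[0] + x[1] * w[1]   # mask 1100: positions 0 and 1 are non-zero

# Jump table standing in for "PC jumps to Base + mask * 0x100".
OPERATORS: Dict[str, Kernel] = {"0101": op_0101, "1001": op_1001, "1100": op_1100}

def sparse_block_conv(block: List[float], kernel: List[float], mask: str) -> float:
    return OPERATORS[mask](block, kernel)   # dispatch replaces per-element checks

print(sparse_block_conv([0.0, 2.0, 0.0, 4.0], [1.0, 1.0, 1.0, 1.0], "0101"))  # 6.0
```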
Corresponding to case two above, in some embodiments, before the computing device obtains the mask information of the first convolution item in the first iteration, the method may further include: obtaining the mask information of the second convolution item in the first iteration, where the second convolution item is the object on which the sparse convolution operation is executed in the first iteration.
Based on this, in some embodiments of the application, a branch multiplex predictor (Branch Multiplex Predictor) may be designed in the computing device.
When the sparse convolution operation is to be executed for the second convolution item, the mask information of the second convolution item is first input to the branch multiplex predictor.
After obtaining the mask information of the second convolution item, the branch multiplex predictor determines the second target operator address corresponding to the second convolution item according to the mask information of the second convolution item, and accurately predicts the first target operator address corresponding to the first convolution item according to the mask information of the first convolution item. In this way, after the processor has executed the sparse convolution operation on the second convolution item using the second target convolution operator found from the second target operator address, it can automatically jump to the first target convolution operator according to the first target operator address predicted in advance and execute the sparse convolution operation on the first convolution item.
Illustratively, taking the Dense-Sparse convolution implementation as an example and referring to Fig. 6, a schematic diagram of accelerating convolution operations according to an exemplary embodiment, assume there are three consecutive feature map blocks, denoted Tile 1, Tile 2 and Tile 3, whose corresponding mask information is 0101, 1001 and 1100 respectively. The correspondence between the feature map blocks and the mask information can be as shown in Table 6:
Table 6
Feature map block    Mask information
Tile 1    0101
Tile 2    1001
Tile 3    1100
In the first iteration, the mask information 0101 corresponding to Tile 1 and the mask information 1001 corresponding to Tile 2 are input to the branch multiplex predictor; the branch multiplex predictor determines the corresponding target operator addresses from masks 0101 and 1001 in turn: base + 0x500 and base + 0x900. Here, the mask information 1001 corresponding to Tile 2 is the mask information of the first convolution item in the embodiments of the application, and the mask information 0101 corresponding to Tile 1 is the mask information of the second convolution item.
The processor first finds Operator 5 according to base + 0x500 and uses Operator 5 to execute the sparse convolution operation on the convolution item to be operated on that belongs to Tile 1; it can then automatically find Operator 9 according to base + 0x900, as shown in the lower part of Fig. 6.
In the second iteration, the sparse mask 1001 corresponding to Tile 2 and the sparse mask 1100 corresponding to Tile 3 are input to the branch multiplex predictor. The processor uses Operator 9, determined in the first iteration, to execute the sparse convolution operation on the convolution item belonging to Tile 2, and, while the sparse convolution operation is being executed on the convolution item belonging to Tile 2 (i.e. the second convolution item), the sparse convolution operator address of Tile 3 is predicted from the mask information 1100 corresponding to Tile 3. As a result, in the third iteration the processor can use the operator address predicted in advance by the branch multiplex predictor, base + 0xC00, to find the target convolution operator Operator 12 and execute the sparse convolution operation on the convolution item belonging to Tile 3 using Operator 12.
In the embodiments of the application, before the sparse convolution operation on the second convolution item has finished, the processor can accurately predict the next jump address from the mask information of the first convolution item through the branch multiplex predictor, so that after finishing the sparse convolution operation on the second convolution item it can immediately find the next convolution operator, i.e. the first convolution operator, according to the operator address of the first convolution item predicted in advance, and execute the sparse convolution operation on the first convolution item.
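A minimal software sketch of this pipeline, under the same assumptions as the earlier address sketch (illustrative base address 0x8000, operator block size 0x100), walks the three tiles of the example: each iteration executes the operator whose address was predicted in the previous iteration, while the address for the next tile is predicted from its mask.

```python
BASE, BLOCK = 0x8000, 0x100   # illustrative base address; block size from the text

tiles = [("Tile 1", "0101"), ("Tile 2", "1001"), ("Tile 3", "1100")]

def predict_address(mask: str) -> int:
    """Branch multiplex predictor rule: base + mask * 0x100."""
    return BASE + int(mask, 2) * BLOCK

predicted = predict_address(tiles[0][1])          # address used in iteration 1
for i, (name, mask) in enumerate(tiles):
    current_address = predicted                   # operator executed this iteration
    next_mask = tiles[i + 1][1] if i + 1 < len(tiles) else None
    predicted = predict_address(next_mask) if next_mask else None
    print(f"iteration {i + 1}: run Operator {int(mask, 2)} "
          f"at Base + 0x{current_address - BASE:X} for {name}; "
          f"next predicted address: "
          f"{'Base + 0x%X' % (predicted - BASE) if predicted else 'none'}")
# iteration 1: run Operator 5 at Base + 0x500 for Tile 1; next predicted address: Base + 0x900
# iteration 2: run Operator 9 at Base + 0x900 for Tile 2; next predicted address: Base + 0xC00
# iteration 3: run Operator 12 at Base + 0xC00 for Tile 3; next predicted address: none
```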
Processors generally use very long pipelines, and a single branch misprediction may cost several to tens of clock cycles; therefore, the longer the processor pipeline, the more it needs accurate branch prediction.
In the embodiments of the application, since the branch multiplex predictor can accurately predict the next jump address from the mask information of the first convolution item, branch mispredictions are effectively avoided. This prevents the processor from having to roll back and flush the pipeline after the branch predictor mispredicts the next jump address, and effectively improves the rate and performance with which a pipelined processor executes convolution operations, so that even devices with limited computing power can efficiently execute the convolution operations of a deep neural network.
The data processing method provided by the application can be applied to artificial intelligence application scenarios such as intelligent driving, human-computer interaction and security monitoring.
It should be noted that the neural networks involved in the embodiments of the application may include deep neural networks (deep neural network, DNN) such as convolutional neural networks (convolutional neural network, CNN); the embodiments of the application do not limit the specific form of the neural network.
The above examples explain, taking one-dimensional convolution as an example, how a computing device realizes fast convolution operations. It should be understood that the above method of realizing fast convolution operations with convolution operators corresponding to sparse masks can also be applied to application scenarios involving higher-dimensional convolution operations; the application places no restriction on this.
For the method embodiments described above, they are expressed as a series of action combinations for simplicity of description, but those skilled in the art should understand that the disclosure is not limited by the described order of actions, because according to the disclosure certain steps can be performed in other orders or simultaneously.
Secondly, those skilled in the art should also understand that the embodiments described in this specification are optional embodiments, and the actions and modules involved are not necessarily required by the disclosure.
Corresponding to the foregoing method embodiments, the disclosure also provides embodiments of a device implementing the corresponding functions.
Referring to Fig. 7, a block diagram of a data processing device according to an exemplary embodiment, an embodiment of the application provides a data processing device that can be set in a computing device. The device may include:
a mask obtaining module 21, configured to obtain the mask information of the first convolution item, where the first convolution item includes the sparse convolution operation between the data to be processed and the convolution kernel, and the mask information of the first convolution item is configured to mark the non-zero elements in the first convolution item;
an operator information determination module 23, configured to determine, according to the mask information of the first convolution item, the information of the first convolution operator corresponding to the first convolution item among multiple preset convolution operators;
a computing module 25, configured to perform the sparse convolution operation on the first convolution item based on the information of the first convolution operator.
In other embodiments of the application, the operator information determination module 23 may be configured to determine the target operator address of the first convolution item according to the mask information of the first convolution item;
correspondingly, the computing module 25 may be configured to obtain the first convolution operator from the target operator address and perform the sparse convolution operation on the first convolution item using the obtained first convolution operator.
In other embodiments of the application, the operator information determination module 23 may be configured to determine the information of the first convolution operator corresponding to the first convolution item according to the mask information of the first convolution item and the mapping relationship between preset convolution operator information and preset mask information.
In some embodiments of the application, the mapping relationship between the preset convolution operator information and the preset mask information is stored in a mapping table containing multiple entries;
correspondingly, the operator information determination module 23 may be configured to look up the mapping table based on the mask information of the first convolution item, and to take, as the information of the first convolution operator, the preset convolution operator information in the entry of the mapping table whose preset mask information matches the mask information of the first convolution item.
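A minimal sketch of this table lookup is shown below; the entry contents (operator identifiers and addresses) and the helper name are assumptions made for illustration, not the application's actual data layout.

```python
from typing import Dict, TypedDict

class OperatorInfo(TypedDict):
    operator_id: int
    address: int

# Mapping table: each entry pairs a preset mask with preset operator information.
MAPPING_TABLE: Dict[str, OperatorInfo] = {
    "0101": {"operator_id": 5,  "address": 0x8500},
    "1001": {"operator_id": 9,  "address": 0x8900},
    "1100": {"operator_id": 12, "address": 0x8C00},
}

def determine_operator_info(mask_of_first_item: str) -> OperatorInfo:
    """Return the operator information whose preset mask matches the given mask."""
    try:
        return MAPPING_TABLE[mask_of_first_item]
    except KeyError:
        raise ValueError(f"no table entry for mask {mask_of_first_item}") from None

print(determine_operator_info("1001"))   # entry for mask 1001 (Operator 9 at 0x8900)
```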
In any of the above data processing device embodiments of the application, the mask information may include any of the following:
a data mask corresponding to the data to be processed;
a convolution kernel mask corresponding to the convolution kernel;
a combined mask corresponding to the data to be processed and the convolution kernel.
In one data processing device embodiment of the application, the operator information determination module 23 may be configured to determine, in the first iteration and according to the mask information of the first convolution item, the information of the first convolution operator corresponding to the first convolution item among the multiple preset convolution operators;
correspondingly, the computing module 25 may be configured to perform the sparse convolution operation on the first convolution item in the second iteration based on the information of the first convolution operator, where the second iteration is the next iteration of the first iteration.
In another device embodiment of the application, the computing module 25 may be further configured to perform the sparse convolution operation on the second convolution item in the first iteration based on the information of the second convolution operator corresponding to the second convolution item, where the first convolution item is the next convolution item of the second convolution item.
Correspondingly, in another device embodiment of the application, the mask obtaining module 21 may be further configured to obtain the mask information of the second convolution item in the first iteration.
Referring to Fig. 8, a structural block diagram of another data processing device according to an exemplary embodiment, on the basis of the device embodiment shown in Fig. 7 the device may further include:
a mask information determination module 20, configured to determine the mask information of the first convolution item according to the non-zero elements included in the sparse data of the first convolution item.
In the embodiments of the application, the mask information of the first convolution item may include a preset number of mask elements;
the quantity M of the preset convolution operators and the quantity n of the mask elements may satisfy either of the following formulas: M = 2^n, or M = 2^n - 1. For example, with n = 4 mask elements, M = 2^4 = 16 convolution operators, as in the 16-operator example above.
The mask obtaining module 21 may be further configured to obtain the mask information of the second convolution item in the first iteration.
Referring to Fig. 9, a structural block diagram of another data processing device according to an exemplary embodiment, on the basis of the device embodiment shown in Fig. 7 the device may further include:
a storage module 22, configured to store at least one of the mask information of the first convolution item and the information of the first convolution operator.
For the device embodiments, since they substantially correspond to the method embodiments, reference can be made to the relevant parts of the description of the method embodiments. The device embodiments described above are merely illustrative: the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, i.e. they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the application. Those of ordinary skill in the art can understand and implement this without creative effort.
Correspondingly, the application also provides a convolution operation accelerator. Referring to Fig. 10, a structural schematic diagram of a convolution operation accelerator according to an exemplary embodiment, the accelerator includes:
an operator memory 100, configured to store the preset convolution operator set;
In the embodiments of the application, the operator memory 100 may be an ordinary memory or a large-capacity instruction cache (Instruction Cache), and is used to store the multiple preset convolution operators, such as the 16 convolution operators in the example above. If the operator memory is a large-capacity instruction cache, the processor of the computing device can quickly read the target convolution operator from the large-capacity instruction cache, reducing the instruction-fetch overhead of obtaining the target convolution operator.
a controller 200, configured to determine, according to the mask information of the first convolution item, the information of the first convolution operator corresponding to the first convolution item among the multiple preset convolution operators;
A branch multiplex predictor 201 is provided in the controller 200.
The branch multiplex predictor 201 is configured to determine, in the first iteration and according to the mask information of the first convolution item, the information of the first convolution operator corresponding to the first convolution item among the multiple preset convolution operators, so that the computing device can perform the sparse convolution operation on the first convolution item in the second iteration based on the information of the first convolution operator.
As shown in Fig. 10, the controller 200 can obtain the mask information of the convolution items and the base address of all the convolution operators from registers (not shown) that store mask information, and input them to the branch multiplex predictor 201. In the illustrated example, the branch multiplex predictor 201 determines the operator address of the second convolution item based on mask 1, i.e. 0101, and predicts the operator address of the next convolution item, i.e. the first convolution item, based on mask 2, i.e. 1001.
a register cache area, configured to provide input data for the convolution operation;
In the embodiments of the application, the register cache area may be a storage region composed of multiple registers and caches, and is used to store the data and/or mask information of the convolution items to be operated on;
As shown in Fig. 10, the register cache area may include:
a convolution kernel memory 401, configured to store the convolution kernels of the neural network, where the convolution kernels include sparse convolution kernels after sparsification or dense convolution kernels without sparsification, for example the sparse convolution kernels in the Sparse-Sparse convolution implementation or the dense convolution kernels in the Dense-Sparse convolution implementation;
a convolution kernel cache 402, configured to cache the convolution kernel of the second convolution item, such as the dense convolution kernel in the Dense-Sparse convolution implementation;
a data memory 501, configured to store the data to be processed, where the data to be processed may include data to be processed after sparsification or data to be processed without sparsification;
a data cache 502, configured to cache the data to be processed involved in the second convolution item, such as the sparsified data to be processed in the Dense-Sparse convolution implementation;
a computing unit 300, configured to execute the sparse convolution operation on the second convolution item according to the target convolution operator output by the operator memory.
In the embodiments of the application, the register cache area is connected to the computing unit and provides the computing unit 300 with the input data for executing the sparse convolution operation.
The computing unit 300 executes the sparse convolution operation on the second convolution item according to the target convolution operator output by the operator memory 100, obtains the convolution calculation result, outputs the convolution calculation result to the output cache 601, and then stores the convolution calculation result in the output memory 602.
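The way these blocks cooperate can be illustrated with a small, purely software model; the class names, field names and the single illustrative kernel below are assumptions made for the sketch and are not the interfaces of the accelerator itself.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Kernel = Callable[[List[float], List[float]], float]

@dataclass
class OperatorMemory:                  # models operator memory 100
    operators: Dict[int, Kernel]
    def fetch(self, address: int) -> Kernel:
        return self.operators[address]

@dataclass
class Controller:                      # models controller 200 with predictor 201
    base: int
    block: int = 0x100
    def predict_address(self, mask: str) -> int:
        return self.base + int(mask, 2) * self.block

@dataclass
class RegisterCacheArea:               # models the kernel/data memories and caches
    kernel: List[float]
    tile: List[float]

@dataclass
class ComputingUnit:                   # models computing unit 300
    results: List[float] = field(default_factory=list)   # stands in for output cache 601
    def run(self, op: Kernel, regs: RegisterCacheArea) -> None:
        self.results.append(op(regs.tile, regs.kernel))

# Wire the blocks together for one feature map block with mask 0101.
base = 0x8000
operator_memory = OperatorMemory({base + 0x500: lambda x, w: x[1] * w[1] + x[3] * w[3]})
controller, computing_unit = Controller(base), ComputingUnit()
registers = RegisterCacheArea(kernel=[1.0, 1.0, 1.0, 1.0], tile=[0.0, 2.0, 0.0, 4.0])
computing_unit.run(operator_memory.fetch(controller.predict_address("0101")), registers)
print(computing_unit.results)          # [6.0]
```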
In the embodiments of the application, each time before the computing device executes a convolution operation on a convolution item, it can first determine the sparse convolution operator adapted to that convolution item and use the determined sparse convolution operator to execute the sparse convolution operation on the convolution item. This effectively reduces the amount of convolution computation, thereby saving the computing resources occupied by convolution operations and lowering the demands that the convolution operations of a deep neural network place on the computing power of the device.
In addition, this specification also provides a data processing device, which may include a memory and a processor, where the memory is configured to store computer instructions that can be run on the processor, and the processor is configured to call the executable instructions stored in the memory to implement the data processing method described in any embodiment of this specification.
Corresponding to the above data processing method, the embodiments of the application also provide an electronic device. Fig. 11 shows a schematic structural diagram of the electronic device according to an exemplary embodiment of the application. Referring to Fig. 11, at the hardware level the electronic device includes a processor, an internal bus, a network interface, a memory and a non-volatile memory, and may of course also include the hardware required by other services. The processor reads the corresponding computer program from the non-volatile memory into the memory and then runs it, forming at the logical level the data processing device for accelerating convolution operations provided by the application. Of course, besides the software implementation, the application does not exclude other implementations, such as logic devices or a combination of software and hardware; that is to say, the executing subject of the following processing flow is not limited to logic units and may also be hardware or logic devices.
Those skilled in the art will understand that one or more embodiments of this specification may be provided as a method, a system or a computer program product. Therefore, one or more embodiments of this specification may take the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware. Moreover, one or more embodiments of this specification may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The embodiments of this specification also provide a computer-readable storage medium on which a computer program can be stored; when the program is executed by a processor, the steps of the method for accelerating convolution operations provided by any of the embodiments of Figs. 1 to 6 of this specification are implemented.
The embodiments of this specification also provide a chip, including:
a processor, configured to execute the data processing method provided by any of the embodiments of Figs. 1 to 6 of this specification.
Embodiments of the subject matter and the functional operations described in this specification can be implemented in digital electronic circuitry, in tangibly embodied computer software or firmware, in computer hardware including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them. Embodiments of the subject matter described in this specification can be implemented as one or more computer programs, i.e. one or more modules of computer program instructions encoded on a tangible non-transitory program carrier for execution by, or to control the operation of, a data processing apparatus. Alternatively or additionally, the program instructions can be encoded on an artificially generated propagated signal, such as a machine-generated electrical, optical or electromagnetic signal, which is generated to encode information for transmission to a suitable receiver apparatus for execution by a data processing apparatus. A computer storage medium can be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.
The processes and logic flows described in this specification can be performed by one or more programmable computers executing one or more computer programs to perform the corresponding functions by operating on input data and generating output. The processes and logic flows can also be performed by special-purpose logic circuitry, such as an FPGA (field programmable gate array), and the apparatus can also be implemented as special-purpose logic circuitry.
Computers suitable for executing a computer program include, for example, general-purpose and/or special-purpose microprocessors. The essential elements of a computer include a processing unit for performing or executing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include one or more mass storage devices for storing data, such as magnetic disks, magneto-optical disks or optical disks, or will be operatively coupled to such mass storage devices to receive data from them, transfer data to them, or both. However, a computer need not have such devices. Moreover, a computer can be embedded in another device, such as a mobile phone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device such as a universal serial bus (USB) flash drive, to name just a few.
Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including, for example, semiconductor memory devices (such as EPROM, EEPROM and flash memory devices), magnetic disks (such as internal hard disks or removable disks), magneto-optical disks, and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special-purpose logic circuitry.
Although this specification contains many specific implementation details, these should not be construed as limiting the scope of any invention or of what may be claimed, but rather as describing features of specific embodiments of particular inventions. Certain features described in this specification in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable sub-combination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be removed from the combination, and the claimed combination may be directed to a sub-combination or a variation of a sub-combination.
Similarly, although operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In some cases, multitasking and parallel processing may be advantageous. Moreover, the separation of various system modules and components in the embodiments described above should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.
Thus, specific embodiments of the subject matter have been described. Other embodiments are within the scope of the appended claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve desirable results. In addition, the processes depicted in the accompanying drawings do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some implementations, multitasking and parallel processing may be advantageous.
The above is merely a preferred embodiment of one or more embodiments of this specification and is not intended to limit one or more embodiments of this specification. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of one or more embodiments of this specification shall be included within the protection scope of one or more embodiments of this specification.

Claims (10)

1. A data processing method, applied to a computing device, the method comprising:
obtaining mask information of a first convolution item, wherein the first convolution item comprises a sparse convolution operation between data to be processed and a convolution kernel, and the mask information of the first convolution item is used to mark non-zero elements in the first convolution item;
determining, according to the mask information of the first convolution item, information of a first convolution operator corresponding to the first convolution item among multiple preset convolution operators;
performing the sparse convolution operation on the first convolution item based on the information of the first convolution operator.
2. The method according to claim 1, wherein the determining, according to the mask information of the first convolution item, the information of the first convolution operator corresponding to the first convolution item among the multiple preset convolution operators comprises:
determining a target operator address of the first convolution item according to the mask information of the first convolution item;
and the performing the sparse convolution operation on the first convolution item based on the information of the first convolution operator comprises:
obtaining the first convolution operator from the target operator address, and performing the sparse convolution operation on the first convolution item using the obtained first convolution operator.
3. The method according to claim 1 or 2, wherein the mask information comprises any of the following:
a data mask corresponding to the data to be processed;
a convolution kernel mask corresponding to the convolution kernel;
a combined mask corresponding to the data to be processed and the convolution kernel.
4. The method according to any one of claims 1 to 3, wherein the determining, according to the mask information of the first convolution item, the information of the first convolution operator corresponding to the first convolution item among the multiple preset convolution operators comprises:
determining, in a first iteration and according to the mask information of the first convolution item, the information of the first convolution operator corresponding to the first convolution item among the multiple preset convolution operators;
and the performing the sparse convolution operation on the first convolution item based on the information of the first convolution operator comprises:
performing the sparse convolution operation on the first convolution item in a second iteration based on the information of the first convolution operator, wherein the second iteration is the next iteration of the first iteration.
5. The method according to claim 4, wherein the method further comprises:
performing a sparse convolution operation on a second convolution item in the first iteration based on information of a second convolution operator corresponding to the second convolution item, wherein the first convolution item is a next convolution item of the second convolution item.
6. A data processing device, wherein the device comprises:
a mask obtaining module, configured to obtain mask information of a first convolution item, wherein the first convolution item comprises a sparse convolution operation between data to be processed and a convolution kernel, and the mask information of the first convolution item is used to mark non-zero elements in the first convolution item;
an operator information determination module, configured to determine, according to the mask information of the first convolution item, information of a first convolution operator corresponding to the first convolution item among multiple preset convolution operators;
a computing module, configured to perform the sparse convolution operation on the first convolution item based on the information of the first convolution operator.
7. A convolution operation accelerator, comprising:
an operator memory, configured to store multiple preset convolution operators;
a controller, configured to determine, according to mask information of a first convolution item, information of a first convolution operator corresponding to the first convolution item among the multiple preset convolution operators;
a register cache area, configured to provide input data for a convolution operation;
a computing unit, configured to execute a sparse convolution operation on the first convolution item according to the first convolution operator output by the operator memory.
8. A computer-readable storage medium, wherein the storage medium stores a computer program, and when the computer program is executed by a processor, the method according to any one of claims 1 to 5 is implemented.
9. A chip, comprising:
a processor, configured to execute the data processing method according to any one of claims 1 to 5.
10. An electronic device, wherein the device comprises a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor, when executing the program, implements the method according to any one of claims 1 to 5.
CN201910661953.5A 2019-07-22 2019-07-22 Data processing method and device and electronic equipment Active CN110399972B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910661953.5A CN110399972B (en) 2019-07-22 2019-07-22 Data processing method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910661953.5A CN110399972B (en) 2019-07-22 2019-07-22 Data processing method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN110399972A true CN110399972A (en) 2019-11-01
CN110399972B CN110399972B (en) 2021-05-25

Family

ID=68324793

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910661953.5A Active CN110399972B (en) 2019-07-22 2019-07-22 Data processing method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN110399972B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112883982A (en) * 2021-01-08 2021-06-01 西北工业大学 Data zero-removing coding and packaging method for neural network sparse features
CN113032843A (en) * 2021-03-30 2021-06-25 北京地平线信息技术有限公司 Method and apparatus for obtaining and processing tensor data with digitally signed information
CN113269316A (en) * 2021-03-26 2021-08-17 复旦大学 Sparse data selection logic module supporting sparse neural network computing accelerator
CN114092708A (en) * 2021-11-12 2022-02-25 北京百度网讯科技有限公司 Characteristic image processing method and device and storage medium

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105095902A (en) * 2014-05-23 2015-11-25 华为技术有限公司 Method and apparatus for extracting image features
CN106487939A (en) * 2015-08-26 2017-03-08 阿里巴巴集团控股有限公司 A kind of method and apparatus determining User IP subnet, a kind of electronic equipment
CN107239825A (en) * 2016-08-22 2017-10-10 北京深鉴智能科技有限公司 Consider the deep neural network compression method of load balancing
CN107527358A (en) * 2017-08-23 2017-12-29 北京图森未来科技有限公司 A kind of dense optical flow method of estimation and device
CN107729999A (en) * 2016-08-12 2018-02-23 北京深鉴科技有限公司 Consider the deep neural network compression method of matrix correlation
CN107886164A (en) * 2017-12-20 2018-04-06 东软集团股份有限公司 A kind of convolutional neural networks training, method of testing and training, test device
US20180129935A1 (en) * 2016-11-07 2018-05-10 Electronics And Telecommunications Research Institute Convolutional neural network system and operation method thereof
WO2018084974A1 (en) * 2016-11-04 2018-05-11 Google Llc Convolutional neural network
EP3396524A1 (en) * 2017-04-28 2018-10-31 INTEL Corporation Instructions and logic to perform floating-point and integer operations for machine learning
CN108805889A (en) * 2018-05-07 2018-11-13 中国科学院自动化研究所 The fining conspicuousness method for segmenting objects of margin guide and system, equipment
CN109840585A (en) * 2018-01-10 2019-06-04 中国科学院计算技术研究所 A kind of operation method and system towards sparse two-dimensional convolution

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105095902A (en) * 2014-05-23 2015-11-25 华为技术有限公司 Method and apparatus for extracting image features
CN106487939A (en) * 2015-08-26 2017-03-08 阿里巴巴集团控股有限公司 A kind of method and apparatus determining User IP subnet, a kind of electronic equipment
CN107729999A (en) * 2016-08-12 2018-02-23 北京深鉴科技有限公司 Consider the deep neural network compression method of matrix correlation
CN107239825A (en) * 2016-08-22 2017-10-10 北京深鉴智能科技有限公司 Consider the deep neural network compression method of load balancing
WO2018084974A1 (en) * 2016-11-04 2018-05-11 Google Llc Convolutional neural network
US20180129935A1 (en) * 2016-11-07 2018-05-10 Electronics And Telecommunications Research Institute Convolutional neural network system and operation method thereof
EP3396524A1 (en) * 2017-04-28 2018-10-31 INTEL Corporation Instructions and logic to perform floating-point and integer operations for machine learning
CN107527358A (en) * 2017-08-23 2017-12-29 北京图森未来科技有限公司 A kind of dense optical flow method of estimation and device
CN107886164A (en) * 2017-12-20 2018-04-06 东软集团股份有限公司 A kind of convolutional neural networks training, method of testing and training, test device
CN109840585A (en) * 2018-01-10 2019-06-04 中国科学院计算技术研究所 A kind of operation method and system towards sparse two-dimensional convolution
CN108805889A (en) * 2018-05-07 2018-11-13 中国科学院自动化研究所 The fining conspicuousness method for segmenting objects of margin guide and system, equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BIPIN B ET AL: "Image Convolution Optimization Using Sparse Matrix Vector Multiplication Technique", 《2016 INTL. CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI)》 *
连自锋 (LIAN, Zifeng): "Research on Image Recognition Algorithms Based on Deep Neural Networks", 《China Doctoral Dissertations Full-text Database, Information Science and Technology》 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112883982A (en) * 2021-01-08 2021-06-01 西北工业大学 Data zero-removing coding and packaging method for neural network sparse features
CN112883982B (en) * 2021-01-08 2023-04-18 西北工业大学 Data zero-removing coding and packaging method for neural network sparse features
CN113269316A (en) * 2021-03-26 2021-08-17 复旦大学 Sparse data selection logic module supporting sparse neural network computing accelerator
CN113269316B (en) * 2021-03-26 2022-10-11 复旦大学 Sparse data selection logic module supporting sparse neural network computing accelerator
CN113032843A (en) * 2021-03-30 2021-06-25 北京地平线信息技术有限公司 Method and apparatus for obtaining and processing tensor data with digitally signed information
CN113032843B (en) * 2021-03-30 2023-09-15 北京地平线信息技术有限公司 Method and apparatus for obtaining and processing tensor data with digital signature information
CN114092708A (en) * 2021-11-12 2022-02-25 北京百度网讯科技有限公司 Characteristic image processing method and device and storage medium

Also Published As

Publication number Publication date
CN110399972B (en) 2021-05-25

Similar Documents

Publication Publication Date Title
CN110399972A (en) Data processing method, device and electronic equipment
US9021241B2 (en) Combined branch target and predicate prediction for instruction blocks
CN102460420B (en) Conditional operation in an internal processor of a memory device
CN104040492B (en) Microprocessor accelerated code optimizer and dependency reordering method
CN111656367A (en) System and architecture for neural network accelerator
TWI287747B (en) Instruction processing method, apparatus and system, and storage medium having stored thereon instructions
WO2020147410A1 (en) Pedestrian detection method and system, computer device, and computer readable storage medium
EP3398113B1 (en) Loop code processor optimizations
KR102378887B1 (en) Method and Apparatus of Bounding Box Regression by a Perimeter-based IoU Loss Function in Object Detection
WO2020076392A1 (en) Modifying machine learning models to improve locality
CN103250131A (en) Single cycle multi-ranch prediction including shadow cache for early far branch prediction
CN111400868B (en) Distributed workshop scheduling optimization method and system with order and robot carrying functions
WO2021042763A1 (en) Image searches based on word vectors and image vectors
JP2021505978A (en) Storage and loading methods, devices, systems and storage media for visual self-location estimation maps
CN103189853A (en) Method and apparatus for providing efficient context classification
CN104050710A (en) 3-d graphics rendering with implicit geometry
TW202324209A (en) Data processing method and non-transitory computer program product for neural network sequential inputs
US11714992B1 (en) Neural network processing based on subgraph recognition
CN108875914B (en) Method and device for preprocessing and post-processing neural network data
JP2022516549A (en) Chip operating frequency setting
CN110083433A (en) Embedded software running method and device, terminal and computer readable storage medium
WO2020186518A1 (en) Method and apparatus for debugging, and system on chip
CN112947932A (en) Method and device for optimizing vectorization in compiling process and electronic equipment
WO2017116927A1 (en) Zero cache memory system extension
CN114443174A (en) Code loading method, code loading device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant