CN109688395A - Operation method, device and Related product - Google Patents
- Publication number: CN109688395A (application CN201811638388.2A, China)
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
- Classifications: H04N9/64 (circuits for processing colour signals); H04N9/68 (circuits for processing colour signals for controlling the amplitude of colour signals, e.g. automatic chroma control circuits)
Abstract
This disclosure relates to an operation method, a device and a related product. The product includes a control module, and the control module includes an instruction cache unit, an instruction processing unit and a storage queue unit. The instruction cache unit stores computation instructions associated with an artificial neural network operation; the instruction processing unit parses a computation instruction to obtain multiple operation instructions; the storage queue unit stores an instruction queue, the instruction queue including multiple operation instructions or computation instructions to be executed in the order of the queue. Through the above method, the disclosure can improve the operation efficiency of the related product when performing neural network model operations.
Description
Technical field
This disclosure relates to the field of artificial intelligence, and more particularly to an operation method, a device and a related product.
Background art
In the field of artificial intelligence, deep learning technology is now widely applied in image and video processing. A neural network trained on a specific data set can achieve high accuracy in classification and detection tasks. However, because it is constrained by the data set and the preprocessing tools used during training, a trained neural network can usually only be used in application environments whose input data type and format are consistent with those of the training set, so its reusability is low. When input data of a different type needs to be used, the network usually has to be retrained, or the input data has to be preprocessed, which leads to long network preprocessing times and low data-processing efficiency.
Summary of the invention
In view of this, the present disclosure proposes an operation method, a device and a related product. By adjusting the model definition file of a Caffe image processing model according to a first format and a second format, the input data format supported by the Caffe image processing model generated from the adjusted model definition file becomes the first format, which effectively improves the matching degree and reusability of the Caffe image processing model.
According to one aspect of the disclosure, an operation method is provided, characterized in that the method is applied in a heterogeneous computing architecture, the heterogeneous computing architecture including a general-purpose processor and an artificial intelligence processor, the method comprising:
when a pending task is received, judging whether a first format of the input image data of the pending task is consistent with a second format of the input data supported by a preset Caffe image processing model;
when the first format and the second format are inconsistent, adjusting the model definition file of the Caffe image processing model according to the first format and the second format, so that the input image data supported by the Caffe image processing model generated from the adjusted model definition file is in the first format.
In one possible implementation, the first format and the second format are both three-primary-colour image data formats;
wherein adjusting the model definition file according to the first format and the second format comprises:
adjusting the model definition file according to the channel numbers and channel orders of the first format and the second format.
In one possible implementation, the channel number of the second format is less than that of the first format, and the channel order of the first format is the same as that of the second format;
wherein adjusting the model definition file according to the channel numbers and channel orders of the first format and the second format comprises:
adding, to the convolution kernel corresponding to the first convolutional layer in the model definition file, a convolution channel whose kernel weights are zero, so that the first convolutional layer in the adjusted model definition file supports input image data in the first format.
In one possible implementation, the channel number of the second format is less than that of the first format, and the channel order of the second format is different from that of the first format;
wherein adjusting the model definition file according to the channel numbers and channel orders of the first format and the second format comprises:
adjusting the channel order of the convolution kernel corresponding to the first convolutional layer in the model definition file, and adding to that convolution kernel a convolution channel whose kernel weights are zero, so that the modified first convolutional layer supports input image data in the first format.
In one possible implementation, the channel number of the second format is equal to that of the first format, and the channel order of the first format is different from that of the second format;
wherein adjusting the model definition file according to the channel numbers and channel orders of the first format and the second format comprises:
adjusting the channel order of the convolution kernel corresponding to the first convolutional layer in the model definition file, so that the first convolutional layer in the adjusted model definition file supports input image data in the first format.
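When only the channel order differs, the adjustment amounts to permuting the input-channel axis of the first-layer kernel weights. Below is a minimal NumPy sketch assuming the (out_channels, in_channels, kH, kW) weight layout that Caffe uses for convolution layers; the function name is illustrative and not taken from the patent:

```python
import numpy as np

def permute_input_channels(kernel, perm):
    """Reorder the input-channel axis of a first-layer convolution kernel.

    kernel: array of shape (out_channels, in_channels, kH, kW), the
            layout Caffe uses for convolution weights.
    perm:   perm[i] is the old-format channel index that supplies the
            i-th channel of the new format.
    """
    return kernel[:, perm, :, :]

# A BGR-trained kernel adapted to RGB input: channel 0 of the new kernel
# takes the old R channel (index 2), and so on.
bgr_kernel = np.arange(6).reshape(2, 3, 1, 1)
rgb_kernel = permute_input_channels(bgr_kernel, [2, 1, 0])
```

Because the convolution sums over input channels, permuting the kernel channels together with the input channels leaves the layer's output unchanged.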
In one possible implementation, the channel number of the second format is greater than that of the first format, the channel order of the first format is different from that of the second format, and the weights of the channels of the second format in excess of the first format are zero;
wherein adjusting the model definition file according to the channel numbers and channel orders of the first format and the second format comprises:
deleting the convolution channels with zero weights from the convolution kernel corresponding to the first convolutional layer in the model definition file, and adjusting the order of the remaining channels in the convolution kernel, so that the first convolutional layer in the adjusted model definition file supports input image data in the first format.
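This case combines channel deletion with reordering. A minimal sketch, again assuming Caffe's (out_channels, in_channels, kH, kW) weight layout; the helper name and example values are illustrative:

```python
import numpy as np

def drop_zero_and_reorder(kernel, perm):
    """Delete input channels whose weights are all zero, then permute the
    surviving channels to match the new input format.
    Assumed kernel layout: (out_channels, in_channels, kH, kW)."""
    keep = [c for c in range(kernel.shape[1]) if np.any(kernel[:, c])]
    trimmed = kernel[:, keep]           # drop all-zero channels
    return trimmed[:, perm]             # reorder the remaining channels

# Example: the model's kernel covers ABGR with an all-zero A channel,
# while the task's input is RGB: drop A, then flip B,G,R to R,G,B.
abgr = np.zeros((1, 4, 1, 1))
abgr[0, 1:, 0, 0] = [10.0, 20.0, 30.0]  # B, G, R weights
rgb = drop_zero_and_reorder(abgr, [2, 1, 0])
```

Dropping an all-zero channel is lossless: that channel contributed nothing to the convolution sum in the first place.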
In one possible implementation, the channel number of the second format is greater than that of the first format, the channel order of the second format is the same as that of the first format, and the weights of the channels of the second format in excess of the first format are zero;
wherein adjusting the model definition file according to the channel numbers and channel orders of the first format and the second format comprises:
deleting the convolution channels with zero weights from the convolution kernel corresponding to the first convolutional layer in the model definition file, so that the first convolutional layer in the adjusted model definition file supports input image data in the first format.
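When the orders already agree, deletion alone suffices. A sketch under the same assumed (out_channels, in_channels, kH, kW) layout; names are illustrative:

```python
import numpy as np

def drop_zero_channels(kernel):
    """Delete input channels whose weights are all zero, preserving the
    order of the rest (layout: out_channels, in_channels, kH, kW)."""
    keep = [c for c in range(kernel.shape[1]) if np.any(kernel[:, c])]
    return kernel[:, keep]

# An ABGR kernel whose alpha channel is all zero reduces to a BGR kernel.
abgr = np.zeros((1, 4, 2, 2))
abgr[:, 1:] = 1.0                       # B, G, R channels carry weights
bgr = drop_zero_channels(abgr)
```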
In one possible implementation, the first format is a luma-chroma image data format and the second format is a three-primary-colour image data format;
wherein modifying the model definition file according to the first format and the second format comprises:
adding a first data conversion layer to the model definition file, the first data conversion layer being located before the first convolutional layer and used to convert input image data in the first format to the second format.
In one possible implementation, the first format is a luma-chroma image data format and the second format is a three-primary-colour image data format;
wherein modifying the model definition file according to the first format and the second format comprises:
adding a second data conversion layer to the model definition file, the second data conversion layer being used to convert input image data in the first format to a third format;
modifying the first convolutional layer in the model definition file according to the third format, so that the first convolutional layer supports input image data in the third format,
wherein the third format is a four-channel data format, the four channels comprising three primary-colour channels and one additional transparency channel.
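A minimal sketch of what such a second data conversion layer could compute: YUV input becomes a four-channel tensor of three primary-colour channels plus a transparency channel. The BT.601 full-range coefficients are an assumption (the patent does not fix the conversion formula), and the zero-filled transparency channel pairs with the zero-weight kernel channel added on the model side:

```python
import numpy as np

def yuv_to_four_channel(yuv):
    """Convert (H, W, 3) YUV data (U, V centred on 0) to a four-channel
    B, G, R + transparency tensor. Coefficients are BT.601 (assumed)."""
    y, u, v = yuv[..., 0], yuv[..., 1], yuv[..., 2]
    b = y + 1.772 * u
    g = y - 0.344136 * u - 0.714136 * v
    r = y + 1.402 * v
    alpha = np.zeros_like(y)    # matches the zero-weight kernel channel
    return np.stack([b, g, r, alpha], axis=-1)

# Grey pixels (U = V = 0) map to equal B, G, R values.
img = np.full((2, 2, 3), 0.25)
img[..., 1:] = 0.0              # U = V = 0
out = yuv_to_four_channel(img)
```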
In one possible implementation, the second data conversion layer is located before the first convolutional layer of the Caffe image processing model.
In one possible implementation, modifying the first convolutional layer in the model definition file according to the third format comprises:
adjusting the channel order of the convolution kernel corresponding to the first convolutional layer, and adding to the convolution kernel a convolution channel whose kernel weights are zero, so that the input data supported by the adjusted first convolutional layer is in the third format.
In one possible implementation, the method further comprises:
when the input image data of the pending task is received, generating the Caffe image processing model from the adjusted model definition file and the weight file;
inputting the input image data into the generated Caffe image processing model for processing to obtain an image processing result.
According to one aspect of the disclosure, an arithmetic device is provided, characterized in that the arithmetic device is used in a heterogeneous computing architecture, the heterogeneous computing architecture including a general-purpose processor and an artificial intelligence processor, the device comprising:
a judgment module, for judging, when a pending task is received, whether the first format of the input image data of the pending task is consistent with the second format of the input data supported by a preset Caffe image processing model;
an adjustment module, for adjusting, when the first format and the second format are inconsistent, the model definition file of the Caffe image processing model according to the first format and the second format, so that the input image data supported by the Caffe image processing model generated from the adjusted model definition file is in the first format.
In one possible implementation, the first format and the second format are both three-primary-colour image data formats;
wherein the adjustment module comprises:
a first adjustment submodule, for adjusting the model definition file according to the channel numbers and channel orders of the first format and the second format.
In one possible implementation, the channel number of the second format is less than that of the first format, and the channel order of the first format is the same as that of the second format;
wherein the first adjustment submodule comprises:
a first adjustment unit, for adding, to the convolution kernel corresponding to the first convolutional layer in the model definition file, a convolution channel whose kernel weights are zero, so that the first convolutional layer in the adjusted model definition file supports input image data in the first format.
In one possible implementation, the channel number of the second format is less than that of the first format, and the channel order of the second format is different from that of the first format;
wherein the first adjustment submodule comprises:
a second adjustment unit, for adjusting the channel order of the convolution kernel corresponding to the first convolutional layer in the model definition file, and adding to that convolution kernel a convolution channel whose kernel weights are zero, so that the modified first convolutional layer supports input image data in the first format.
In one possible implementation, the channel number of the second format is equal to that of the first format, and the channel order of the first format is different from that of the second format;
wherein the first adjustment submodule comprises:
a third adjustment unit, for adjusting the channel order of the convolution kernel corresponding to the first convolutional layer in the model definition file, so that the first convolutional layer in the adjusted model definition file supports input image data in the first format.
In one possible implementation, the channel number of the second format is greater than that of the first format, the channel order of the first format is different from that of the second format, and the weights of the channels of the second format in excess of the first format are zero;
wherein the first adjustment submodule comprises:
a third adjustment unit, for deleting the convolution channels with zero weights from the convolution kernel corresponding to the first convolutional layer in the model definition file, and adjusting the order of the remaining channels in the convolution kernel, so that the first convolutional layer in the adjusted model definition file supports input image data in the first format.
In one possible implementation, the channel number of the second format is greater than that of the first format, the channel order of the second format is the same as that of the first format, and the weights of the channels of the second format in excess of the first format are zero;
wherein the first adjustment submodule comprises:
a fourth adjustment unit, for deleting the convolution channels with zero weights from the convolution kernel corresponding to the first convolutional layer in the model definition file, so that the first convolutional layer in the adjusted model definition file supports input image data in the first format.
In one possible implementation, the first format is a luma-chroma image data format and the second format is a three-primary-colour image data format;
wherein the adjustment module comprises:
a second adjustment submodule, for adding a first data conversion layer to the model definition file, the first data conversion layer being located before the first convolutional layer and used to convert input image data in the first format to the second format.
In one possible implementation, the first format is a luma-chroma image data format and the second format is a three-primary-colour image data format;
wherein the adjustment module comprises:
a third adjustment submodule, for adding a second data conversion layer to the model definition file, the second data conversion layer being used to convert input image data in the first format to a third format;
a fourth adjustment submodule, for modifying the first convolutional layer in the model definition file according to the third format, so that the first convolutional layer supports input image data in the third format,
wherein the third format is a four-channel data format, the four channels comprising three primary-colour channels and one additional transparency channel.
In one possible implementation, the fourth adjustment submodule comprises:
a fourth adjustment unit, for adjusting the channel order of the convolution kernel corresponding to the first convolutional layer, and adding to the convolution kernel a convolution channel whose kernel weights are zero, so that the input data supported by the adjusted first convolutional layer is in the third format.
In one possible implementation, the device further comprises:
a model generation module, for generating, when the input image data of the pending task is received, the Caffe image processing model from the adjusted model definition file and the weight file;
an input processing module, for inputting the input image data into the generated Caffe image processing model for processing to obtain an image processing result.
According to another aspect of the disclosure, a neural network chip is further provided, characterized in that the chip includes the arithmetic device described in any one of the above.
According to one aspect of the disclosure, an electronic device is provided, characterized in that the electronic device includes the above neural network chip.
According to another aspect of the disclosure, a board card is further provided, characterized in that the board card includes: a memory device, an interface device, a control device and the neural network chip described above;
wherein the neural network chip is connected to the memory device, the control device and the interface device respectively;
the memory device is used for storing data;
the interface device is used for implementing data transmission between the neural network chip and an external device;
the control device is used for monitoring the state of the neural network chip.
In one possible implementation, the memory device includes multiple groups of storage units, each group of storage units being connected to the neural network chip by a bus, the storage units being DDR SDRAM;
the chip includes a DDR controller, used for controlling data transmission to and data storage in each storage unit;
the interface device is a standard PCIe interface.
With the above operation method, when the first format of the input image data of a pending task is inconsistent with the second format of the input data supported by the Caffe image processing model, the model definition file of the Caffe image processing model is adjusted according to the first format and the second format, so that the input data format supported by the Caffe image processing model generated from the adjusted model definition file is the first format. As a result, when the pending task is processed, even if the data format of its input image is inconsistent with the input data format supported by the Caffe image processing model, the data can still be fed smoothly into the Caffe image processing model generated from the adjusted model definition file, which effectively improves the matching degree and reusability of the Caffe image processing model.
Other features and aspects of the disclosure will become clear from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments, features and aspects of the disclosure together with the specification, and serve to explain the principles of the disclosure.
Fig. 1 shows a flowchart of an operation method according to an embodiment of the disclosure;
Fig. 2a and Fig. 2b show schematic diagrams of the first convolutional layer of a BGR model according to an embodiment of the disclosure;
Fig. 3a and Fig. 3b show schematic diagrams of convolution kernel adjustment when the first format of the input image data is an ARGB image, according to an embodiment of the disclosure;
Fig. 4 shows a schematic diagram of the processing flow when the first format of the input image data is a YUV image, according to an embodiment of the disclosure;
Fig. 5 shows a flowchart of an operation method according to an embodiment of the disclosure;
Fig. 6 shows a block diagram of an arithmetic device according to an embodiment of the disclosure;
Fig. 7 shows a structural block diagram of a board card according to an embodiment of the disclosure.
Detailed description of embodiments
Various exemplary embodiments, features and aspects of the disclosure are described in detail below with reference to the accompanying drawings. Identical reference numerals in the drawings denote elements with identical or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless specifically noted.
The word "exemplary" here means "serving as an example, embodiment or illustration". Any embodiment described here as "exemplary" should not be construed as preferred over or advantageous to other embodiments.
In addition, numerous specific details are given in the following detailed description to better illustrate the disclosure. Those skilled in the art will understand that the disclosure can equally be implemented without certain of these details. In some instances, methods, means, elements and circuits well known to those skilled in the art are not described in detail, so as to highlight the gist of the disclosure.
Referring to Fig. 1, Fig. 1 shows a flowchart of an operation method according to an embodiment of the disclosure. It should be noted that the operation method of this embodiment can be applied in a server or a terminal. The method comprises:
Step S100: when a pending task is received, judging whether the first format of the input image data of the pending task is consistent with the second format of the input data supported by the preset Caffe image processing model.
Step S200: when the first format and the second format are inconsistent, adjusting the model definition file of the Caffe image processing model according to the first format and the second format, so that the input data supported by the Caffe image processing model generated from the adjusted model definition file is in the first format.
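The control flow of steps S100 and S200 can be sketched as follows. This is an illustrative outline, not the patent's code; the function and parameter names are invented:

```python
def adjust_if_needed(first_format, second_format, adjust_fn):
    """Top-level flow of steps S100/S200: compare the format of the
    task's input image data with the format the preset Caffe model
    supports, and rewrite the model definition file only on mismatch."""
    if first_format == second_format:       # S100: formats consistent
        return False                        # nothing to adjust
    adjust_fn(first_format, second_format)  # S200: adjust the prototxt
    return True

calls = []
changed = adjust_if_needed("ARGB", "BGR", lambda a, b: calls.append((a, b)))
```

The adjustment callback stands in for whichever kernel rewrite or conversion-layer insertion the format pair requires.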
With the above operation method, when the first format of the input image data of a received pending task is inconsistent with the second format of the input data that the Caffe image processing model can support, the model definition file of the Caffe image processing model can be adjusted on the Caffe side according to the first format and the second format, replacing the format conversion and preprocessing of the input image data that would otherwise be performed on the CPU side. The embodiments of the disclosure require neither complicated operations nor a large amount of computing resources, which effectively improves data-processing efficiency and saves network preprocessing time.
It should be noted that, in the above embodiments of the disclosure, the Caffe image processing model may be based on the deep-learning convolutional neural network framework Caffe (Convolutional Architecture for Fast Feature Embedding).
It should also be noted that the above operation method may be performed on a trained convolutional neural network (i.e. the Caffe image processing model). Those skilled in the art will appreciate that two files are used to generate a Caffe image processing model: one is the structure file (prototxt, abbreviated pt), i.e. the aforementioned model definition file; the other is the weight file (caffemodel). The object adjusted in the above operation method may be the model definition file (pt) stored on disk.
By adjusting the model definition file of the Caffe image processing model, no matter what format the input image data takes, the model definition file can be adjusted correspondingly for the specific data format of the input image data, and the corresponding Caffe image processing model can then be generated from the adjusted model definition file. The Caffe image processing model generated in the embodiments of the disclosure can smoothly read input image data of different format types and process the read input image data, which effectively improves the reusability and matching degree of the Caffe image processing model.
Moreover, the method provided by the embodiments of the disclosure only needs to adjust the model definition file of the Caffe image processing model. Compared with the traditional way of performing data conversion and preprocessing of the input image on the CPU side, it requires neither complicated operations nor a large amount of computing resources. Compared with the traditional way of splitting and re-merging data within the neural network, the method provided by the embodiments of the disclosure needs no extra computational network structure and no manual configuration of the network structure, avoiding many additional operations. This effectively improves image-processing efficiency, saves image-processing time and reduces resource consumption.
Further, there are generally two kinds of characterization of image data. One describes image colour according to the principle of the three primary colours (red, green, blue) plus light mixing; that is, image colour is characterized using a data format that records the three primary colours of each pixel (such as the BGR format), a three-primary-colour image data format. The other describes image colour according to the principle of luminance and chrominance; that is, image colour is characterized using a data format that records the luminance and chrominance of each pixel (such as the YUV format), a luma-chroma image data format.
A convolutional neural network trained on a conventional data set usually takes three-channel input (BGR by default under OpenCV); that is, the input data format supported by a Caffe image processing model is usually the BGR format. The embodiments described below are illustrated with the data format supported by the Caffe image processing model being BGR. It should be understood that the BGR format described below is only illustrative and not limiting. Based on the embodiments provided by the disclosure, those skilled in the art can likewise handle a Caffe image processing model whose input data format is four-channel input, which is not repeated here.
Fig. 2 a and Fig. 2 b show the schematic diagram of the first floor convolutional layer of the BGR model according to one embodiment of the disclosure.Refering to figure
2b, in the Caffe image processing model of above-mentioned BGR triple channel input, convolution kernel weight scale corresponding to first floor convolutional layer
For 3*Kh*Kw, the convolution window (Kh*Kw) under BGR triple channel is respectively corresponded.
In one possible embodiment, when the first format and the second format are both three-primary-colour image data formats, the first format and the second format may differ in channel order (the arrangement of the red, green and blue channels) and in channel number. Therefore, when the model definition file is adjusted according to the first format and the second format, the adjustment can be made according to the channel numbers and channel orders of the first format and the second format.
That is, according to the differences in channel number and channel order between the first format and the second format, the convolution kernel of the first convolutional layer in the model definition file is adjusted by adding channels, deleting channels and rearranging the channel order, so that the first convolutional layer in the adjusted model definition file can support input image data in the first format.
In one possible implementation, when the channel number of the second format is less than that of the first format and the channel order of the first format is the same as that of the second format, adjusting the model definition file according to the channel numbers and channel orders of the first format and the second format may include:
adding, to the convolution kernel corresponding to the first convolutional layer in the model definition file, a convolution channel whose kernel weights are zero, so that the first convolutional layer in the adjusted model definition file supports input image data in the first format.
For example, the first format is the ABGR format and the second format is the BGR format. In the model definition file of the Caffe image processing model, the convolution kernel weights corresponding to the first convolutional layer correspond to convolution windows arranged in BGR three-channel order. When the model definition file is adjusted, a convolution channel whose kernel weights are zero is added to the convolution kernel corresponding to the first convolutional layer, so that in the adjusted model definition file the kernel weights of the first convolutional layer correspond to convolution windows arranged in ABGR four-channel order.
When the first format and the second format are both data formats that record the three primary colours of each pixel, differ only in channel number, and the channel number of the first format is greater than that of the second format, adding one corresponding convolution channel to the convolution kernel of the first convolutional layer in the model definition file is simple to operate and easy to implement.
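The ABGR example above can be sketched in a few lines of NumPy, assuming the (out_channels, in_channels, kH, kW) weight layout Caffe uses for convolution layers; the function name is illustrative:

```python
import numpy as np

def pad_zero_channel(kernel, position):
    """Insert an all-zero input channel at the given position of a
    first-layer convolution kernel (layout: out, in, kH, kW), so a
    BGR-trained kernel can consume e.g. ABGR input unchanged."""
    out_c, _, kh, kw = kernel.shape
    zero = np.zeros((out_c, 1, kh, kw), dtype=kernel.dtype)
    return np.concatenate([kernel[:, :position], zero,
                           kernel[:, position:]], axis=1)

# BGR kernel -> ABGR kernel: a zero A channel is prepended at index 0.
bgr = np.ones((2, 3, 1, 1))
abgr = pad_zero_channel(bgr, 0)
```

Since the new channel's weights are all zero, the extra A channel of the input contributes nothing to the convolution sum, and the layer's output matches that of the original BGR model.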
Further, when the channel number of the second format is less than that of the first format and the channel order of the second format is different from that of the first format, the first format and the second format differ not only in channel number but also in channel order. In this case, adjusting the model definition file according to the channel numbers and channel orders of the first format and the second format specifically includes:
adjusting the channel order of the convolution kernel corresponding to the first convolutional layer in the model definition file, and adding to that convolution kernel a convolution channel whose kernel weights are zero, so that the modified first convolutional layer supports input image data in the first format.
Fig. 3a and Fig. 3b show schematic diagrams of convolution kernel adjustment when the first format of the input image data is an ARGB image, according to an embodiment of the disclosure. Referring to Fig. 3a, the first format is the ARGB format and the second format is the BGR format; in the model definition file of the Caffe image processing model, the convolution kernel weights of the first convolutional layer correspond to convolution windows arranged in BGR three-channel order. As shown in Fig. 3b, when the model definition file is adjusted, the kernel weights of the first convolutional layer can be rearranged into the convolution windows of the corresponding RGB three-channel order, and an A channel whose weights are zero is added to the adjusted kernel weights, so that in the adjusted model definition file the convolution windows corresponding to the kernel weights of the first convolutional layer are ARGB four-channel convolution windows.
That is, when the first format of the input image data is ARGB and the second format supported by the Caffe image processing model is BGR, the first-layer convolution kernel weights in the model definition file can be rearranged and an A channel with zero weights added, so that convolving the input image data yields a result equal to that obtained with BGR input. This adjustment rearranges the first-layer kernel weights only once during the operation of the whole network: the input image data does not need to be pre-processed, the user does not need to perform additional operations, and the conversion from BGR weights to ARGB weights is performed automatically. Because the weight conversion occurs only once per run, it effectively reduces pre-processing time and improves the processing efficiency of the whole network.
In one possible implementation, when the channel number of the second format equals that of the first format but the channel order of the first format differs from that of the second format, the two formats differ only in channel order. In this case, the adjustment of the model definition file is specifically:
adjusting the channel order of the convolution kernels of the first-layer convolutional layer in the model definition file, so that the first-layer convolutional layer in the adjusted model definition file supports input image data in the first format.
For example, when the first format is BGR and the second format is RGB, the first-layer convolution kernel weights in the model definition file of the Caffe image processing model correspond to convolution windows arranged over the three RGB channels. Since the channel numbers of the two formats are identical, no channels need to be added or removed; only the channel order needs to be adjusted. The adjustment to the model definition file is therefore to change the channel order of the first-layer convolution kernels from RGB to BGR.
Likewise, in such cases only the channel order of the first-layer convolution kernels in the model definition file needs to be adjusted, which is simple to perform and easy to implement.
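The pure channel-order case reduces to a permutation of the input-channel axis of the first-layer kernels. A minimal sketch, again with an assumed (out_channels, in_channels, kH, kW) layout:

```python
import numpy as np

def rgb_to_bgr_kernels(w_rgb):
    """Permute kernels for RGB input so that they accept BGR input."""
    return w_rgb[:, [2, 1, 0]]   # channel order R,G,B -> B,G,R

w = np.random.rand(4, 3, 3, 3)
pix_rgb = np.random.rand(3, 3, 3)
pix_bgr = pix_rgb[::-1]          # same pixel window, BGR layout
out_rgb = (w * pix_rgb).sum()
out_bgr = (rgb_to_bgr_kernels(w) * pix_bgr).sum()
assert np.allclose(out_rgb, out_bgr)
```

Applying the same permutation to weights and input leaves every per-channel product, and hence the convolution output, unchanged.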
As another possible implementation, when the channel number of the second format is greater than that of the first format, the channel order of the first format differs from that of the second format, and the weights of the channel that the second format has in excess of the first format are zero, the adjustment of the model definition file can be:
deleting the zero-weight convolution channel from the first-layer convolution kernels in the model definition file, and adjusting the order of the remaining channels, so that the first-layer convolutional layer in the adjusted model definition file supports input image data in the first format.
For example, when the first format is BGR and the second format is RGB0, the first-layer convolution kernel weights in the model definition file of the Caffe image processing model correspond to convolution windows arranged over the four RGB0 channels. In this case, adjusting the first-layer convolution kernels only requires deleting the zero-weight convolution channel and reordering the remaining three channels (RGB) into BGR order.
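The delete-and-reorder case can be sketched in the same assumed numpy layout; the names and shapes below are illustrative only:

```python
import numpy as np

def rgb0_to_bgr_kernels(w_rgb0):
    """Turn kernels for RGB0 input into kernels for BGR input."""
    w_rgb = w_rgb0[:, :3]        # drop the zero-weight "0" channel
    return w_rgb[:, [2, 1, 0]]   # reorder R,G,B -> B,G,R

w = np.random.rand(4, 4, 3, 3)
w[:, 3] = 0                      # the extra channel's weights are zero
pix_rgb = np.random.rand(3, 3, 3)
pix_rgb0 = np.concatenate([pix_rgb, np.zeros((1, 3, 3))])
pix_bgr = pix_rgb[::-1]
out_rgb0 = (w * pix_rgb0).sum(axis=(1, 2, 3))
out_bgr = (rgb0_to_bgr_kernels(w) * pix_bgr).sum(axis=(1, 2, 3))
assert np.allclose(out_rgb0, out_bgr)
```

Since the deleted channel carried only zero weights, removing it and permuting the survivors preserves the convolution result exactly.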
As another possible implementation, when the channel number of the second format is greater than that of the first format, the channel order of the first format is identical to that of the second format, and the weights of the channel that the second format has in excess of the first format are zero, the adjustment of the model definition file can be:
deleting the zero-weight convolution channel from the first-layer convolution kernels in the model definition file, so that the first-layer convolutional layer in the adjusted model definition file supports input image data in the first format.
For example, when the first format is BGR and the second format is BGR0, the first-layer convolution kernel weights in the model definition file of the Caffe image processing model correspond to convolution windows arranged over the four BGR0 channels. In this case, adjusting the first-layer convolution kernels likewise only requires deleting the zero-weight convolution channel.
As a possible embodiment, when the first format is a luma-chroma image data format (that is, a YUV format) and the second format is a tristimulus image data format (e.g., a BGR or ARGB format), the two formats represent pixels according to different principles, so adjusting the model definition file requires a format conversion of the first format. In that case, modifying the model definition file according to the first format and the second format specifically includes:
adding a first data conversion layer to the model definition file, the first data conversion layer being located before the first-layer convolutional layer and being configured to convert input image data in the first format into the second format.
That is, a first data conversion layer can be added before the first-layer convolutional layer in the model definition file, and the added layer converts the first format directly into the second format.
For example, when the first format is a YUV format and the second format is ABGR, the adjustment of the model definition file can be: adding a first data conversion layer before the first-layer convolutional layer of the model definition file, the first data conversion layer converting the YUV format directly into the ABGR format.
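The patent does not specify which color-space equations the data conversion layer uses; one common possibility is the BT.601 full-range mapping, sketched below purely for illustration (the coefficients are an assumption, not taken from the patent):

```python
import numpy as np

def yuv_to_rgb(yuv):
    """Convert a (3, H, W) YUV image to RGB using BT.601 full-range
    coefficients; a data conversion layer placed before the first
    convolutional layer could apply a mapping of this kind."""
    y = yuv[0]
    u = yuv[1] - 128.0   # chroma is stored with a 128 offset
    v = yuv[2] - 128.0
    r = y + 1.402 * v
    g = y - 0.344136 * u - 0.714136 * v
    b = y + 1.772 * u
    return np.stack([r, g, b])

neutral = np.full((3, 2, 2), 128.0)  # mid luma, neutral chroma
rgb = yuv_to_rgb(neutral)
assert np.allclose(rgb, 128.0)       # neutral YUV maps to mid gray
```

Converting to ABGR rather than RGB would then only require stacking the channels in ABGR order with a constant alpha plane.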
As a possible embodiment, when the first format and the second format characterize image data in different ways (that is, the first format is a YUV format and the second format is a BGR format), the conversion from the first format to the second format is comparatively complicated. To further reduce the difficulty of the adjustment, modifying the model definition file according to the first format and the second format further includes:
first, adding a second data conversion layer to the model definition file, the second data conversion layer converting input image data in the first format into a third format; the layer can be added before the first-layer convolutional layer. The third format is a four-channel data format whose channels comprise the primary display channels (B, G, R) and an additional transparency channel (A); that is, the third format can be any permutation of the four channels A, B, G and R. The first-layer convolutional layer in the model definition file can then be modified according to the third format, so that the first-layer convolutional layer supports input image data in the third format.
When the model definition file is modified in this way, the added second data conversion layer first converts the first format into the third format. The third format obtained after conversion characterizes pixels in the same way as the second format (that is, both record the three primary colors of each pixel), and the first-layer convolutional layer in the model definition file is then modified according to the converted third format. This reduces the difficulty of adding the second data conversion layer that performs the data format conversion and further simplifies the processing, so that the operation method of the present disclosure is easier to implement.
Fig. 4 shows a schematic diagram of the processing flow when the first format of the input image data is a YUV image, according to an embodiment of the present disclosure. Referring to Fig. 4, when the first format is a YUV format and the second format is a BGR format, the YUV conversion can be performed by adding a dedicated network layer, an MLUYUVtoRGB layer (that is, the second data conversion layer), to the model definition file of the Caffe image processing model. This layer provides a variety of conversion modes, for example converting YUV image data into the RGB0, BGR0 or ARGB format. The first-layer convolutional layer of the Caffe image processing model is then adjusted according to the conversion result (that is, the converted third format), so that the input data format supported by the first-layer convolutional layer is the third format.
When the first format is a YUV format (the input image data is a YUV image), it is only necessary to insert the corresponding second data conversion layer into the Caffe image processing model; after the second data conversion layer converts the format of the YUV image, the first-layer convolutional layer is adjusted according to the converted data format. Support for YUV images is thereby achieved with a simple operation and high image processing efficiency.
It should be noted here that the insertion of the data conversion layer into the Caffe image processing model can be performed by the user operating directly on the Caffe image processing model, or can be realized by computer instructions. The present disclosure does not limit this.
Further, in the above embodiment in which the first format is a YUV format and the second format is a BGR format, modifying the first-layer convolutional layer in the model definition file according to the third format, so that the input data supported by the first-layer convolutional layer is in the third format, may include:
when the third format obtained by conversion is ARGB, the case is identical to that in which the first format is ARGB and the second format is BGR, and the first-layer convolutional layer can be adjusted directly in the manner described above. That is, the convolution channel order of the first-layer convolution kernels is adjusted and an A convolution channel is added to the kernels, the kernel weights of the A channel being zero, so that the input data supported by the first-layer convolutional layer is in the ARGB format.
Similarly, when the third format obtained by conversion is BGRA, it likewise suffices to adjust the convolution channel order of the first-layer convolution kernels and to add an A convolution channel, again with zero kernel weights, so that the input data supported by the first-layer convolutional layer is in the BGRA format.
It should be pointed out that when the second data conversion layer is added before the first-layer convolutional layer, the third format into which it converts the first format can be any of several four-channel formats. To make the specific format of the final third format easy to identify, the third format produced by the second data conversion layer can be given a default value (the ARGB format). When the user needs to convert to a four-channel data format other than ARGB, a format parameter can be set in the first-layer convolutional layer, and this parameter indicates the specific format of the final third format.
It should also be noted here that the format parameter added by the user in the first-layer convolutional layer can be yuv_input. When the format parameter is set to yuv_input:BGR0, it indicates that the current conversion of the YUV input image yields the third format BGR0.
In addition, in the above embodiment the format parameter can also be realized in other ways, which are not specifically limited here.
Thus, when the data format of the input image data (the first format) is a YUV format, the user only needs to insert the corresponding second data conversion layer (the MLUYUVtoRGB layer) into the Caffe image processing model and to add to the first-layer convolutional layer the format parameter yuv_input indicating the converted format. Support for YUV input images is thereby achieved with a simple operation and high image processing efficiency.
Fig. 5 shows a flowchart of an operation method according to an embodiment of the present disclosure. In a possible embodiment, referring to Fig. 5, the operation method may further include:
step S300: upon receiving the input image data of a task to be processed, generating the Caffe image processing model according to the adjusted model definition file and the weight file;
step S400: inputting the input image data into the generated Caffe image processing model for processing, and obtaining an image processing result.
That is, after the model definition file of the Caffe image processing model has been modified in any of the above ways, the corresponding Caffe image processing model can be generated from the modified model definition file and the previously trained weight file. The input image data of the received task to be processed (such as input image data in the ARGB format or input image data in a YUV format) can then be input into the generated Caffe image processing model for processing. The Caffe image processing model reads the input image data and processes it accordingly to obtain the image processing result, so that the Caffe image processing model supports multiple types of image input, which effectively improves the reusability of the generated Caffe image processing model.
In summary, in any of the above operation methods, when the data format of the input image data (the first format) is inconsistent with the data format that the Caffe image processing model can support (the second format), the model definition file of the Caffe image processing model is adjusted at the Caffe image processing model side according to the first format and the second format. Compared with the traditional approach of using a CPU for data format conversion and pre-processing, this avoids performing the channel rearrangement and other complex operations on the CPU for every input image, and thus avoids consuming a large amount of CPU computing resources. Compared with the traditional approach of handling four-channel input images such as ARGB by splitting and re-merging the data within the neural network, any of the above operation methods requires no additional user operations and overcomes the large changes, complex processing and difficult debugging caused by the traditional way of modifying the neural network.
Fig. 6 shows a block diagram of an arithmetic unit 100 according to an embodiment of the present disclosure. Referring to Fig. 6, the arithmetic unit 100 is used in a heterogeneous computing architecture that includes a general-purpose processor and an artificial intelligence processor, and comprises:
a judgment module 110, configured to judge, when a task to be processed is received, whether the first format of the input image data of the task is consistent with the second format of the input data supported by a preset Caffe image processing model; and
an adjustment module 120, configured to adjust, when the first format and the second format are inconsistent, the model definition file of the Caffe image processing model according to the first format and the second format, so that the Caffe image processing model generated from the adjusted model definition file supports input image data in the first format.
In one possible implementation, the first format and the second format are tristimulus image data formats;
wherein the adjustment module 120 comprises:
a first adjustment submodule, configured to adjust the model definition file according to the channel numbers and channel orders of the first format and the second format.
In one possible implementation, the channel number of the second format is less than that of the first format, and the channel order of the first format is identical to that of the second format;
wherein the first adjustment submodule comprises:
a first adjustment unit, configured to add, to the first-layer convolution kernels in the model definition file, a convolution channel whose kernel weights are zero, so that the first-layer convolutional layer in the adjusted model definition file supports input image data in the first format.
In one possible implementation, the channel number of the second format is less than that of the first format, and the channel order of the second format differs from that of the first format;
wherein the first adjustment submodule comprises:
a second adjustment unit, configured to adjust the channel order of the first-layer convolution kernels in the model definition file and to add, to those kernels, a convolution channel whose kernel weights are zero, so that the modified first-layer convolutional layer supports input image data in the first format.
In one possible implementation, the channel number of the second format is equal to that of the first format, and the channel order of the first format differs from that of the second format;
wherein the first adjustment submodule comprises:
a third adjustment unit, configured to adjust the channel order of the first-layer convolution kernels in the model definition file, so that the first-layer convolutional layer in the adjusted model definition file supports input image data in the first format.
In one possible implementation, the channel number of the second format is greater than that of the first format, the channel order of the first format differs from that of the second format, and the weights of the channel that the second format has in excess of the first format are zero;
wherein the first adjustment submodule comprises:
a fourth adjustment unit, configured to delete the zero-weight convolution channel from the first-layer convolution kernels in the model definition file and to adjust the order of the remaining channels, so that the first-layer convolutional layer in the adjusted model definition file supports input image data in the first format.
In one possible implementation, the channel number of the second format is greater than that of the first format, the channel order of the second format is identical to that of the first format, and the weights of the channel that the second format has in excess of the first format are zero;
wherein the first adjustment submodule comprises:
a fifth adjustment unit, configured to delete the zero-weight convolution channel from the first-layer convolution kernels in the model definition file, so that the first-layer convolutional layer in the adjusted model definition file supports input image data in the first format.
In one possible implementation, the first format is a luma-chroma image data format, and the second format is a tristimulus image data format;
wherein the adjustment module 120 comprises:
a second adjustment submodule, configured to add a first data conversion layer to the model definition file, the first data conversion layer being located before the first-layer convolutional layer and being configured to convert input image data in the first format into the second format.
In one possible implementation, the first format is a luma-chroma image data format, and the second format is a tristimulus image data format;
wherein the adjustment module 120 comprises:
a third adjustment submodule, configured to add a second data conversion layer to the model definition file, the second data conversion layer being configured to convert input image data in the first format into a third format; and
a fourth adjustment submodule, configured to modify the first-layer convolutional layer in the model definition file according to the third format, so that the first-layer convolutional layer supports input image data in the third format,
wherein the third format is a four-channel data format whose channels comprise the primary display channels and an additional transparency channel.
In one possible implementation, the fourth adjustment submodule comprises:
a fourth adjustment unit, configured to adjust the channel order of the first-layer convolution kernels and to add, to the kernels, a convolution channel whose kernel weights are zero, so that the input data supported by the adjusted first-layer convolutional layer is in the third format.
In one possible implementation, the arithmetic unit further comprises:
a model generation module, configured to generate, upon receiving the input image data of a task to be processed, the Caffe image processing model according to the adjusted model definition file and the weight file; and
an input processing module, configured to input the input image data into the generated Caffe image processing model for processing and to obtain the image processing result.
In one possible implementation, a chip is also disclosed, which includes the above arithmetic unit 100.
In one possible implementation, a chip packaging structure is disclosed, which includes the above chip.
In one possible implementation, a board is also disclosed, which includes the above chip packaging structure. Referring to Fig. 7, which provides such a board: in addition to the above chip 389, the board may also include other supporting components, including but not limited to a memory device 390, an interface device 391 and a control device 392.
The memory device 390 is connected to the chip in the chip packaging structure through a bus and is used for storing data. The memory device may include multiple groups of storage units 393, each group being connected to the chip through a bus. It can be understood that each group of storage units may be DDR SDRAM (Double Data Rate Synchronous Dynamic Random Access Memory).
DDR can double the speed of SDRAM without raising the clock frequency, since it allows data to be read on both the rising edge and the falling edge of the clock pulse; the speed of DDR is twice that of standard SDRAM. In one embodiment, the memory device may include four groups of storage units, and each group may include multiple DDR4 particles (chips). In one embodiment, the chip may internally include four 72-bit DDR4 controllers, of which 64 bits are used for data transmission and 8 bits for ECC checking. It can be understood that when DDR4-3200 particles are used in each group of storage units, the theoretical bandwidth of data transmission can reach 25600 MB/s.
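The quoted figure follows directly from the bus width and transfer rate; a quick arithmetic check (the variable names are illustrative):

```python
# DDR4-3200 performs 3200 mega-transfers per second; with a 64-bit
# data path each transfer moves 8 bytes, giving the per-group
# theoretical bandwidth stated above.
transfers_per_second = 3200      # MT/s for DDR4-3200
bytes_per_transfer = 64 // 8     # 64-bit data path -> 8 bytes
bandwidth_mb_s = transfers_per_second * bytes_per_transfer
assert bandwidth_mb_s == 25600   # MB/s
```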
In one embodiment, each group of storage units includes multiple double-data-rate synchronous dynamic random access memories arranged in parallel. DDR can transmit data twice within one clock cycle. A controller for controlling the DDR is provided in the chip and controls the data transmission to, and the data storage in, each storage unit.
The interface device is electrically connected to the chip in the chip packaging structure and is used to implement data transmission between the chip and an external device (such as a server or a computer). For example, in one embodiment, the interface device may be a standard PCIe interface, and the data to be processed is transferred from the server to the chip through the standard PCIe interface to realize the data transfer. Preferably, when a PCIe 3.0 x16 interface is used for transmission, the theoretical bandwidth can reach 16000 MB/s. In another embodiment, the interface device may also be another interface; the present application does not limit the specific form of such other interfaces, provided the interface unit can realize the transfer function. In addition, the calculation result of the chip is likewise sent back to the external device (such as a server) by the interface device.
The control device is electrically connected to the chip and is used for monitoring the state of the chip. Specifically, the chip may be electrically connected to the control device through an SPI interface. The control device may include a micro controller unit (MCU). Since the chip may include multiple processing chips, multiple processing cores or multiple processing circuits, it can drive multiple loads and may therefore be in different working states such as multi-load and light-load. The control device can regulate the working states of the multiple processing chips, multiple processing cores and/or multiple processing circuits in the chip.
In some embodiments, an electronic device is also disclosed, which includes the above board.
The electronic device includes a data processing apparatus, robot, computer, printer, scanner, tablet computer, intelligent terminal, mobile phone, driving recorder, navigator, sensor, webcam, server, cloud server, camera, video camera, projector, watch, earphone, mobile storage, wearable device, vehicle, household appliance, and/or medical device.
The vehicle includes an airplane, a ship and/or a car; the household appliance includes a television, an air conditioner, a microwave oven, a refrigerator, an electric rice cooker, a humidifier, a washing machine, an electric lamp, a gas stove and a range hood; the medical device includes a nuclear magnetic resonance instrument, a B-mode ultrasound instrument and/or an electrocardiograph.
The embodiments of the present disclosure have been described above. The above description is exemplary rather than exhaustive, and the disclosure is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the illustrated embodiments. The terms used herein were chosen to best explain the principles of the embodiments, their practical application or their improvement over technologies found in the market, or to enable other persons of ordinary skill in the art to understand the embodiments disclosed herein.
Claims (27)
1. An operation method, characterized in that the method is applied in a heterogeneous computing architecture, the heterogeneous computing architecture including a general-purpose processor and an artificial intelligence processor, the method comprising:
when a task to be processed is received, judging whether a first format of the input image data of the task to be processed is consistent with a second format of the input data supported by a preset Caffe image processing model; and
when the first format and the second format are inconsistent, adjusting a model definition file of the Caffe image processing model according to the first format and the second format, so that the input image data supported by the Caffe image processing model generated from the adjusted model definition file is in the first format.
2. The method according to claim 1, characterized in that the first format and the second format are tristimulus image data formats;
wherein adjusting the model definition file according to the first format and the second format comprises:
adjusting the model definition file according to the channel numbers and channel orders of the first format and the second format.
3. The method according to claim 2, characterized in that the channel number of the second format is less than the channel number of the first format, and the channel order of the first format is identical to the channel order of the second format;
wherein adjusting the model definition file according to the channel numbers and channel orders of the first format and the second format comprises:
adding, to the first-layer convolution kernels in the model definition file, a convolution channel whose kernel weights are zero, so that the first-layer convolutional layer in the adjusted model definition file supports input image data in the first format.
4. The method according to claim 2, characterized in that the channel number of the second format is less than the channel number of the first format, and the channel order of the second format differs from the channel order of the first format;
wherein adjusting the model definition file according to the channel numbers and channel orders of the first format and the second format comprises:
adjusting the channel order of the first-layer convolution kernels in the model definition file, and adding, to those kernels, a convolution channel whose kernel weights are zero, so that the modified first-layer convolutional layer supports input image data in the first format.
5. The method according to claim 2, characterized in that the channel number of the second format is equal to the channel number of the first format, and the channel order of the first format differs from the channel order of the second format;
wherein adjusting the model definition file according to the channel numbers and channel orders of the first format and the second format comprises:
adjusting the channel order of the first-layer convolution kernels in the model definition file, so that the first-layer convolutional layer in the adjusted model definition file supports input image data in the first format.
6. The method according to claim 2, characterized in that the channel number of the second format is greater than the channel number of the first format, the channel order of the first format differs from the channel order of the second format, and the weights of the channel that the second format has in excess of the first format are zero;
wherein adjusting the model definition file according to the channel numbers and channel orders of the first format and the second format comprises:
deleting the zero-weight convolution channel from the first-layer convolution kernels in the model definition file, and adjusting the order of the remaining channels in the convolution kernels, so that the first-layer convolutional layer in the adjusted model definition file supports input image data in the first format.
7. The method according to claim 2, characterized in that the channel number of the second format is greater than the channel number of the first format, the channel sequence of the second format is identical to the channel sequence of the first format, and the weight of each channel that the second format has in excess of the first format is zero;
wherein adjusting the model definition file according to the channel numbers and channel sequences of the first format and the second format comprises:
deleting the zero-weight convolutional channels from the convolution kernel corresponding to the first convolutional layer in the model definition file, so that the first convolutional layer in the adjusted model definition file supports input image data of the first format.
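The inverse adjustment — dropping input channels whose weights are identically zero, so the kernel matches a narrower input format — can be sketched as follows (names and shapes are illustrative):

```python
import numpy as np

def drop_zero_channels(weights):
    """Delete input channels whose weights are identically zero from a
    (out_c, in_c, kH, kW) kernel; return the slimmed kernel and the
    indices of the channels that were kept."""
    keep = [c for c in range(weights.shape[1]) if np.any(weights[:, c])]
    return weights[:, keep], keep

w = np.ones((4, 4, 3, 3), dtype=np.float32)
w[:, 3] = 0.0                       # e.g. an unused padding channel
w_slim, kept = drop_zero_channels(w)
```

Since a zero-weight channel contributes nothing to the convolution sum, removing it (together with the corresponding input plane) preserves the layer's output exactly.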
8. The method according to claim 1, characterized in that the first format is a luma-chroma image data format and the second format is a tristimulus image data format;
wherein modifying the model definition file according to the first format and the second format comprises:
adding a first data conversion layer to the model definition file, the first data conversion layer being located before the first convolutional layer and configured to convert input image data of the first format into the second format.
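Such a conversion layer maps luma-chroma (YUV) planes to tristimulus (RGB) values. A NumPy sketch using full-range BT.601 coefficients — one common convention; the claim itself does not fix the conversion matrix:

```python
import numpy as np

def yuv_to_rgb(yuv):
    """Convert planar YUV data of shape (3, H, W), values in [0, 255],
    to RGB using full-range BT.601 coefficients (assumed convention)."""
    y = yuv[0]
    u = yuv[1] - 128.0
    v = yuv[2] - 128.0
    r = y + 1.402 * v
    g = y - 0.344136 * u - 0.714136 * v
    b = y + 1.772 * u
    return np.clip(np.stack([r, g, b]), 0.0, 255.0)

gray = np.full((3, 2, 2), 128.0)   # neutral chroma -> equal R, G, B
rgb = yuv_to_rgb(gray)
```

With neutral chroma (U = V = 128), the chroma terms vanish and all three output channels equal the luma value.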
9. The method according to claim 1, characterized in that the first format is a luma-chroma image data format and the second format is a tristimulus image data format;
wherein modifying the model definition file according to the first format and the second format comprises:
adding a second data conversion layer to the model definition file, the second data conversion layer being configured to convert input image data of the first format into a third format; and
modifying the first convolutional layer in the model definition file according to the third format, so that the first convolutional layer supports input image data of the third format,
wherein the third format is a four-channel data format whose four channels comprise the primary display channels and an additional transparency channel.
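Producing the four-channel third format from three-channel data amounts to appending a transparency plane. A minimal sketch (the constant alpha value is an assumption; the claim only requires a fourth transparency channel):

```python
import numpy as np

def rgb_to_rgba(rgb, alpha=255.0):
    """Append a constant transparency plane to (3, H, W) image data,
    yielding the four-channel format the modified first layer expects."""
    a = np.full((1,) + rgb.shape[1:], alpha, dtype=rgb.dtype)
    return np.concatenate([rgb, a], axis=0)

rgba = rgb_to_rgba(np.zeros((3, 4, 4)))
```

Pairing this with a zero-weight fourth kernel channel (as in claim 11) makes the transparency plane invisible to the convolution.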
10. The method according to claim 9, characterized in that the second data conversion layer is located before the first convolutional layer of the Caffe image processing model.
11. The method according to claim 9, characterized in that
modifying the first convolutional layer in the model definition file according to the third format comprises:
adjusting the channel sequence of the convolution kernel corresponding to the first convolutional layer, and adding, to the convolution kernel, a convolutional channel whose convolution kernel weight is zero, so that the input data supported by the adjusted first convolutional layer is of the third format.
12. The method according to any one of claims 1 to 11, characterized in that the method further comprises:
when the input image data of the task to be processed is received, generating a Caffe image processing model according to the adjusted model definition file and a weight file; and
inputting the input image data into the generated Caffe image processing model for processing, to obtain an image processing result.
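The kernel adjustments in the claims above are response-preserving: the adjusted first layer applied to first-format input should match the original layer applied to second-format input. A NumPy sanity check of the combined reorder-plus-zero-pad case, using a naive convolution under assumed shapes (all names are illustrative):

```python
import numpy as np

def conv2d_valid(x, w):
    """Naive valid-mode cross-correlation: x (C, H, W), w (O, C, kH, kW)."""
    o, c, kh, kw = w.shape
    _, h, wd = x.shape
    out = np.zeros((o, h - kh + 1, wd - kw + 1))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            patch = x[:, i:i + kh, j:j + kw]
            out[:, i, j] = np.tensordot(w, patch,
                                        axes=([1, 2, 3], [0, 1, 2]))
    return out

rng = np.random.default_rng(0)
w_bgr = rng.random((4, 3, 3, 3))   # kernel trained for 3-channel BGR input
x_rgba = rng.random((4, 8, 8))     # incoming 4-channel RGBA image
# Adjust the kernel: reorder BGR -> RGB, then add a zero-weight alpha channel.
w_adj = np.concatenate([w_bgr[:, [2, 1, 0]], np.zeros((4, 1, 3, 3))], axis=1)
# Same response as feeding the original kernel the BGR view of the image:
same = np.allclose(conv2d_valid(x_rgba, w_adj),
                   conv2d_valid(x_rgba[[2, 1, 0]], w_bgr))
```

The equality holds because the channel sum of a convolution is invariant under matching permutations of kernel and input channels, and the zero-weight alpha channel adds nothing.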
13. An operation device, characterized in that the operation device is used in a heterogeneous computing architecture comprising a general-purpose processor and an artificial intelligence processor, the operation device comprising:
a judgment module, configured to judge, when a task to be processed is received, whether a first format of the input image data of the task is consistent with a second format of the input data supported by a preset Caffe image processing model; and
an adjustment module, configured to adjust, when the first format and the second format are inconsistent, the model definition file of the Caffe image processing model according to the first format and the second format, so that the input image data supported by the Caffe image processing model generated from the adjusted model definition file is of the first format.
14. The device according to claim 13, characterized in that the first format and the second format are tristimulus image data formats;
wherein the adjustment module comprises:
a first adjustment submodule, configured to adjust the model definition file according to the channel numbers and channel sequences of the first format and the second format.
15. The device according to claim 14, characterized in that the channel number of the second format is less than the channel number of the first format, and the channel sequence of the first format is identical to the channel sequence of the second format;
wherein the first adjustment submodule comprises:
a first adjustment unit, configured to add, to the convolution kernel corresponding to the first convolutional layer in the model definition file, a convolutional channel whose convolution kernel weight is zero, so that the first convolutional layer in the adjusted model definition file supports input image data of the first format.
16. The device according to claim 14, characterized in that the channel number of the second format is less than the channel number of the first format, and the channel sequence of the second format is different from the channel sequence of the first format;
wherein the first adjustment submodule comprises:
a second adjustment unit, configured to adjust the channel sequence of the convolution kernel corresponding to the first convolutional layer in the model definition file, and to add, to that convolution kernel, a convolutional channel whose convolution kernel weight is zero, so that the modified first convolutional layer supports input image data of the first format.
17. The device according to claim 14, characterized in that the channel number of the second format is equal to the channel number of the first format, and the channel sequence of the first format is different from the channel sequence of the second format;
wherein the first adjustment submodule comprises:
a third adjustment unit, configured to adjust the channel sequence of the convolution kernel corresponding to the first convolutional layer in the model definition file, so that the first convolutional layer in the adjusted model definition file supports input image data of the first format.
18. The device according to claim 14, characterized in that the channel number of the second format is greater than the channel number of the first format, the channel sequence of the first format is different from the channel sequence of the second format, and the weight of each channel that the second format has in excess of the first format is zero;
wherein the first adjustment submodule comprises:
a third adjustment unit, configured to delete the zero-weight convolutional channels from the convolution kernel corresponding to the first convolutional layer in the model definition file, and to adjust the sequence of the remaining channels in the convolution kernel, so that the first convolutional layer in the adjusted model definition file supports input image data of the first format.
19. The device according to claim 14, characterized in that the channel number of the second format is greater than the channel number of the first format, the channel sequence of the second format is identical to the channel sequence of the first format, and the weight of each channel that the second format has in excess of the first format is zero;
wherein the first adjustment submodule comprises:
a fourth adjustment unit, configured to delete the zero-weight convolutional channels from the convolution kernel corresponding to the first convolutional layer in the model definition file, so that the first convolutional layer in the adjusted model definition file supports input image data of the first format.
20. The device according to claim 13, characterized in that the first format is a luma-chroma image data format and the second format is a tristimulus image data format;
wherein the adjustment module comprises:
a second adjustment submodule, configured to add a first data conversion layer to the model definition file, the first data conversion layer being located before the first convolutional layer and configured to convert input image data of the first format into the second format.
21. The device according to claim 13, characterized in that the first format is a luma-chroma image data format and the second format is a tristimulus image data format;
wherein the adjustment module comprises:
a third adjustment submodule, configured to add a second data conversion layer to the model definition file, the second data conversion layer being configured to convert input image data of the first format into a third format; and
a fourth adjustment submodule, configured to modify the first convolutional layer in the model definition file according to the third format, so that the first convolutional layer supports input image data of the third format,
wherein the third format is a four-channel data format whose four channels comprise the primary display channels and an additional transparency channel.
22. The device according to claim 21, characterized in that the fourth adjustment submodule comprises:
a fourth adjustment unit, configured to adjust the channel sequence of the convolution kernel corresponding to the first convolutional layer, and to add, to the convolution kernel, a convolutional channel whose convolution kernel weight is zero, so that the input data supported by the adjusted first convolutional layer is of the third format.
23. The device according to any one of claims 13 to 22, characterized in that it further comprises:
a model generation module, configured to generate, when the input image data of the task to be processed is received, a Caffe image processing model according to the adjusted model definition file and a weight file; and
an input processing module, configured to input the input image data into the generated Caffe image processing model for processing, to obtain an image processing result.
24. A neural network chip, characterized in that the chip comprises the operation device according to any one of claims 13 to 23.
25. An electronic device, characterized in that the electronic device comprises the neural network chip according to claim 24.
26. A board, characterized in that the board comprises: a memory device, an interface device, a control device, and the neural network chip according to claim 24;
wherein the neural network chip is connected to the memory device, the control device, and the interface device, respectively;
the memory device is configured to store data;
the interface device is configured to implement data transmission between the neural network chip and an external device; and
the control device is configured to monitor the state of the neural network chip.
27. The board according to claim 26, characterized in that
the memory device comprises multiple groups of storage units, each group of storage units being connected to the neural network chip by a bus, the storage units being DDR SDRAM;
the chip comprises a DDR controller, configured to control the data transmission to and data storage in each storage unit; and
the interface device is a standard PCIE interface.
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911408131.2A CN111193917B (en) | 2018-12-29 | 2018-12-29 | Operation method, device and related product |
CN201911416056.4A CN111222635A (en) | 2018-12-29 | 2018-12-29 | Operation method, device and related product |
CN201911407453.5A CN111191788A (en) | 2018-12-29 | 2018-12-29 | Operation method, device and related product |
CN201811638388.2A CN109688395B (en) | 2018-12-29 | 2018-12-29 | Operation method, device and related product |
CN201911406583.7A CN111193916B (en) | 2018-12-29 | 2018-12-29 | Operation method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811638388.2A CN109688395B (en) | 2018-12-29 | 2018-12-29 | Operation method, device and related product |
Related Child Applications (4)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911416056.4A Division CN111222635A (en) | 2018-12-29 | 2018-12-29 | Operation method, device and related product |
CN201911406583.7A Division CN111193916B (en) | 2018-12-29 | 2018-12-29 | Operation method |
CN201911407453.5A Division CN111191788A (en) | 2018-12-29 | 2018-12-29 | Operation method, device and related product |
CN201911408131.2A Division CN111193917B (en) | 2018-12-29 | 2018-12-29 | Operation method, device and related product |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109688395A true CN109688395A (en) | 2019-04-26 |
CN109688395B CN109688395B (en) | 2020-01-14 |
Family
ID=66191368
Family Applications (5)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911406583.7A Active CN111193916B (en) | 2018-12-29 | 2018-12-29 | Operation method |
CN201911407453.5A Pending CN111191788A (en) | 2018-12-29 | 2018-12-29 | Operation method, device and related product |
CN201811638388.2A Active CN109688395B (en) | 2018-12-29 | 2018-12-29 | Operation method, device and related product |
CN201911416056.4A Pending CN111222635A (en) | 2018-12-29 | 2018-12-29 | Operation method, device and related product |
CN201911408131.2A Active CN111193917B (en) | 2018-12-29 | 2018-12-29 | Operation method, device and related product |
Family Applications Before (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911406583.7A Active CN111193916B (en) | 2018-12-29 | 2018-12-29 | Operation method |
CN201911407453.5A Pending CN111191788A (en) | 2018-12-29 | 2018-12-29 | Operation method, device and related product |
Family Applications After (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911416056.4A Pending CN111222635A (en) | 2018-12-29 | 2018-12-29 | Operation method, device and related product |
CN201911408131.2A Active CN111193917B (en) | 2018-12-29 | 2018-12-29 | Operation method, device and related product |
Country Status (1)
Country | Link |
---|---|
CN (5) | CN111193916B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110430444A (en) * | 2019-08-12 | 2019-11-08 | 北京中科寒武纪科技有限公司 | A kind of video stream processing method and system |
CN110795993A (en) * | 2019-09-12 | 2020-02-14 | 深圳云天励飞技术有限公司 | Method and device for constructing model, terminal equipment and medium |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111193916B (en) * | 2018-12-29 | 2022-03-29 | 中科寒武纪科技股份有限公司 | Operation method |
Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090073494A1 (en) * | 2007-09-05 | 2009-03-19 | Yuji Takemoto | Information processing apparatus, information processing method, and computer-readable recording medium storing information processing program |
CN101782929A (en) * | 2009-01-19 | 2010-07-21 | 环隆电气股份有限公司 | Packaging model converting method |
CN104794252A (en) * | 2014-01-17 | 2015-07-22 | 中国石油集团工程设计有限责任公司 | Three-dimensional model data processing method and electronic terminal |
WO2017024242A1 (en) * | 2015-08-05 | 2017-02-09 | Equifax Inc. | Model integration tool |
CN106980817A (en) * | 2017-02-27 | 2017-07-25 | 南京邮电大学 | A kind of terrified video frequency identifying method based on Caffe frameworks |
CN107092926A (en) * | 2017-03-30 | 2017-08-25 | 哈尔滨工程大学 | Service robot object recognition algorithm based on deep learning |
US20180137406A1 (en) * | 2016-11-15 | 2018-05-17 | Google Inc. | Efficient Convolutional Neural Networks and Techniques to Reduce Associated Computational Costs |
CN108171162A (en) * | 2017-12-27 | 2018-06-15 | 重庆交通开投科技发展有限公司 | Crowded degree detection method, apparatus and system |
CN108563768A (en) * | 2018-04-19 | 2018-09-21 | 中国平安财产保险股份有限公司 | Data transfer device, device, equipment and the storage medium of different data model |
CN108628945A (en) * | 2018-03-29 | 2018-10-09 | 成都明镜视觉科技有限公司 | A method of fbx model files are automatically converted to fbs model files |
CN108694441A (en) * | 2017-04-07 | 2018-10-23 | 上海寒武纪信息科技有限公司 | A kind of network processing unit and network operations method |
US10115206B2 (en) * | 2014-06-03 | 2018-10-30 | Nec Corporation | Detection system, detection method, and program storage medium |
CN108733634A (en) * | 2017-04-20 | 2018-11-02 | 北大方正集团有限公司 | The recognition methods of bibliography and identification device |
US10140553B1 (en) * | 2018-03-08 | 2018-11-27 | Capital One Services, Llc | Machine learning artificial intelligence system for identifying vehicles |
US20180349103A1 (en) * | 2017-06-03 | 2018-12-06 | Apple Inc. | Integration of learning models into a software development system |
Family Cites Families (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7930260B2 (en) * | 2008-02-14 | 2011-04-19 | Ebay Inc. | System and method for real time pattern identification |
JP6007474B2 (en) * | 2011-10-07 | 2016-10-12 | ソニー株式会社 | Audio signal processing apparatus, audio signal processing method, program, and recording medium |
CN104809426B (en) * | 2014-01-27 | 2019-04-05 | 日本电气株式会社 | Training method, target identification method and the device of convolutional neural networks |
US10089580B2 (en) * | 2014-08-11 | 2018-10-02 | Microsoft Technology Licensing, Llc | Generating and using a knowledge-enhanced model |
US20170069078A1 (en) * | 2015-09-09 | 2017-03-09 | Delta Industrial Services, Inc. | Method and apparatus for web converting vision inspection system setup |
CN105260773B (en) * | 2015-09-18 | 2018-01-12 | 华为技术有限公司 | A kind of image processing apparatus and image processing method |
CN105117611B (en) * | 2015-09-23 | 2018-06-12 | 北京科技大学 | Based on the determining method and system of the TCM tongue diagnosis model of convolutional Neural metanetwork |
EP3432228A4 (en) * | 2016-03-14 | 2019-04-10 | Omron Corporation | Expandability retention device |
CN106295245B (en) * | 2016-07-27 | 2019-08-30 | 广州麦仑信息科技有限公司 | Method of the storehouse noise reduction based on Caffe from coding gene information feature extraction |
US10621486B2 (en) * | 2016-08-12 | 2020-04-14 | Beijing Deephi Intelligent Technology Co., Ltd. | Method for optimizing an artificial neural network (ANN) |
CN106469299B (en) * | 2016-08-31 | 2019-07-19 | 北京邮电大学 | A kind of vehicle search method and device |
CN106446937A (en) * | 2016-09-08 | 2017-02-22 | 天津大学 | Multi-convolution identifying system for AER image sensor |
US20180157940A1 (en) * | 2016-10-10 | 2018-06-07 | Gyrfalcon Technology Inc. | Convolution Layers Used Directly For Feature Extraction With A CNN Based Integrated Circuit |
US10402527B2 (en) * | 2017-01-04 | 2019-09-03 | Stmicroelectronics S.R.L. | Reconfigurable interconnect |
US10147193B2 (en) * | 2017-03-10 | 2018-12-04 | TuSimple | System and method for semantic segmentation using hybrid dilated convolution (HDC) |
CN107679620B (en) * | 2017-04-19 | 2020-05-26 | 赛灵思公司 | Artificial neural network processing device |
US20180341686A1 (en) * | 2017-05-26 | 2018-11-29 | Nanfang Hu | System and method for data search based on top-to-bottom similarity analysis |
US11537368B2 (en) * | 2017-06-03 | 2022-12-27 | Apple Inc. | Integrating machine learning models into an interpreted software development environment |
CN107341506A (en) * | 2017-06-12 | 2017-11-10 | 华南理工大学 | A kind of Image emotional semantic classification method based on the expression of many-sided deep learning |
CN109522254B (en) * | 2017-10-30 | 2022-04-12 | 上海寒武纪信息科技有限公司 | Arithmetic device and method |
CN108875900B (en) * | 2017-11-02 | 2022-05-24 | 北京旷视科技有限公司 | Video image processing method and device, neural network training method and storage medium |
CN108288035A (en) * | 2018-01-11 | 2018-07-17 | 华南理工大学 | The human motion recognition method of multichannel image Fusion Features based on deep learning |
CN108345869B (en) * | 2018-03-09 | 2022-04-08 | 南京理工大学 | Driver posture recognition method based on depth image and virtual data |
CN108309251B (en) * | 2018-03-20 | 2020-09-25 | 清华大学 | Quantitative photoacoustic imaging method based on deep neural network |
CN108710941A (en) * | 2018-04-11 | 2018-10-26 | 杭州菲数科技有限公司 | The hard acceleration method and device of neural network model for electronic equipment |
CN108596892B (en) * | 2018-04-23 | 2020-08-18 | 西安交通大学 | Weld defect identification method based on improved LeNet-5 model |
CN108805266B (en) * | 2018-05-21 | 2021-10-26 | 南京大学 | Reconfigurable CNN high-concurrency convolution accelerator |
CN108985448B (en) * | 2018-06-06 | 2020-11-17 | 北京大学 | Neural network representation standard framework structure |
CN111193916B (en) * | 2018-12-29 | 2022-03-29 | 中科寒武纪科技股份有限公司 | Operation method |
Patent Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090073494A1 (en) * | 2007-09-05 | 2009-03-19 | Yuji Takemoto | Information processing apparatus, information processing method, and computer-readable recording medium storing information processing program |
CN101782929A (en) * | 2009-01-19 | 2010-07-21 | 环隆电气股份有限公司 | Packaging model converting method |
CN104794252A (en) * | 2014-01-17 | 2015-07-22 | 中国石油集团工程设计有限责任公司 | Three-dimensional model data processing method and electronic terminal |
US10115206B2 (en) * | 2014-06-03 | 2018-10-30 | Nec Corporation | Detection system, detection method, and program storage medium |
WO2017024242A1 (en) * | 2015-08-05 | 2017-02-09 | Equifax Inc. | Model integration tool |
US20180137406A1 (en) * | 2016-11-15 | 2018-05-17 | Google Inc. | Efficient Convolutional Neural Networks and Techniques to Reduce Associated Computational Costs |
CN106980817A (en) * | 2017-02-27 | 2017-07-25 | 南京邮电大学 | A kind of terrified video frequency identifying method based on Caffe frameworks |
CN107092926A (en) * | 2017-03-30 | 2017-08-25 | 哈尔滨工程大学 | Service robot object recognition algorithm based on deep learning |
CN108694441A (en) * | 2017-04-07 | 2018-10-23 | 上海寒武纪信息科技有限公司 | A kind of network processing unit and network operations method |
CN108733634A (en) * | 2017-04-20 | 2018-11-02 | 北大方正集团有限公司 | The recognition methods of bibliography and identification device |
US20180349103A1 (en) * | 2017-06-03 | 2018-12-06 | Apple Inc. | Integration of learning models into a software development system |
CN108171162A (en) * | 2017-12-27 | 2018-06-15 | 重庆交通开投科技发展有限公司 | Crowded degree detection method, apparatus and system |
US10140553B1 (en) * | 2018-03-08 | 2018-11-27 | Capital One Services, Llc | Machine learning artificial intelligence system for identifying vehicles |
CN108628945A (en) * | 2018-03-29 | 2018-10-09 | 成都明镜视觉科技有限公司 | A method of fbx model files are automatically converted to fbs model files |
CN108563768A (en) * | 2018-04-19 | 2018-09-21 | 中国平安财产保险股份有限公司 | Data transfer device, device, equipment and the storage medium of different data model |
Non-Patent Citations (4)
Title |
---|
10KM: "SSD (Single Shot MultiBox Detector): training anomaly caused by an incorrect number of image channels in the dataset", CSDN *
BEIHANGZXM123: "Getting-started tutorial for the Caffe deep learning framework", CSDN *
鄠邑原野: "An introduction to Caffe", CSDN *
魏正: "Research and implementation of face recognition based on deep learning on the Caffe platform", China Master's Theses Full-text Database *
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110430444A (en) * | 2019-08-12 | 2019-11-08 | 北京中科寒武纪科技有限公司 | A kind of video stream processing method and system |
CN110430444B (en) * | 2019-08-12 | 2022-06-07 | 中科寒武纪科技股份有限公司 | Video stream processing method and system |
CN110795993A (en) * | 2019-09-12 | 2020-02-14 | 深圳云天励飞技术有限公司 | Method and device for constructing model, terminal equipment and medium |
Also Published As
Publication number | Publication date |
---|---|
CN111193916B (en) | 2022-03-29 |
CN111193916A (en) | 2020-05-22 |
CN111222635A (en) | 2020-06-02 |
CN111193917A (en) | 2020-05-22 |
CN109688395B (en) | 2020-01-14 |
CN111193917B (en) | 2021-08-10 |
CN111191788A (en) | 2020-05-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109688395A (en) | Operation method, device and Related product | |
CN109993296A (en) | Quantify implementation method and Related product | |
CN110909870B (en) | Training device and method | |
CN106357998B (en) | A kind of OSD graph display processing unit | |
US20200210837A1 (en) | Network structure processing method and device and related products | |
CN109711367A (en) | Operation method, device and Related product | |
CN108259802A (en) | A kind of interface conversion circuit, display methods and electronic equipment | |
JP7269381B2 (en) | Computing apparatus, methods, printed circuit boards, and computer-readable storage media | |
CN110263301A (en) | Method and apparatus for determining the color of text | |
WO2021073638A1 (en) | Method and apparatus for running neural network model, and computer device | |
CN109740746A (en) | Operation method, device and Related product | |
CN109711538B (en) | Operation method, device and related product | |
CN209283366U (en) | A kind of two-way HDMI HD video subtitle superposition equipment based on FPGA | |
CN108881923B (en) | Method for reducing buffer capacity of JPEG coding and decoding line | |
CN103870111B (en) | Information processing method and electronic equipment | |
CN109089113A (en) | A kind of method of the automatic carry out reproduction ratio adjustment of flat panel TV | |
CN115454923A (en) | Data calculation device, board card, method and storage medium | |
CN110020720B (en) | Operator splicing method and device | |
CN115455798A (en) | Device, board card and method for correcting dead pixel and readable storage medium | |
CN109376856B (en) | Data processing method and processing device | |
JP7020010B2 (en) | Systems and methods for authoring and rendering thermal output during video playback on computer displays with thermal output capabilities, as well as computer equipment, methods, non-temporary computer-readable media, and programs. | |
CN111125627A (en) | Method for pooling multi-dimensional matrices and related products | |
CN112395008A (en) | Operation method, operation device, computer equipment and storage medium | |
CN112232498B (en) | Data processing device, integrated circuit chip, electronic equipment, board card and method | |
CN111047030A (en) | Operation method, operation device, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CB02 | Change of applicant information | ||
Address after: Room 644, Comprehensive Research Building, No. 6 South Road, Haidian District Academy of Sciences, Beijing, 100190
Applicant after: Zhongke Cambrian Technology Co., Ltd
Address before: Room 644, Comprehensive Research Building, No. 6 South Road, Haidian District Academy of Sciences, Beijing, 100190
Applicant before: Beijing Zhongke Cambrian Technology Co., Ltd.