CN109754011A

CN109754011A - Caffe-based data processing method and device and related products

Info

Publication number: CN109754011A
Application number: CN201811639458.6A
Authority: CN
Inventors: Not published (inventor name withheld)
Original assignee: Beijing Zhongke Cambrian Technology Co Ltd
Current assignee: Cambricon Technologies Corp Ltd
Priority date: 2018-12-29
Filing date: 2018-12-29
Publication date: 2019-05-14
Anticipated expiration: 2038-12-29
Also published as: CN109754011B

Abstract

This application relates to a Caffe-based data processing method, a device, and related products. According to a configuration command, a normalization parameter and the operator type of the CNN first-layer convolutional layer are defined in a Caffe file to obtain a configured Caffe file. The configured Caffe file is then compiled into an executable file, and the executable file is run on an artificial intelligence processor, so that the artificial intelligence processor performs feature normalization on the input data of the convolutional layer and executes a convolution operation on the feature-normalized data. In this method, the feature normalization of the input data is moved into the convolutional layer, and the operator defined in the Caffe file is one that the artificial intelligence processor can run directly. The artificial intelligence processor can therefore fuse the normalization of the input data with the convolution operation, which greatly improves the efficiency with which a convolutional neural network performs image number recognition and, in turn, makes deep learning application tasks more efficient.

Description

Caffe-based data processing method and device and related products

Technical Field

The present application relates to the field of deep learning technologies, and in particular, to a data processing method and apparatus based on Caffe, and a related product.

Background

Deep learning refers to a set of algorithms that apply various machine learning techniques on multilayer neural networks to solve problems involving images, text, and the like. In deep learning tasks, for example in the image domain, a convolutional neural network is typically used; it is a deep feedforward artificial neural network and has been applied successfully to image recognition.

The first layer of a convolutional neural network is a convolutional layer that extracts features from the image. Before the convolutional layer extracts features, feature normalization (i.e., standardization) must be performed on the image data, where feature normalization means giving each dimension of the image data zero mean and unit variance. At present, to normalize image data for a convolutional neural network, a central processing unit calls the Open Source Computer Vision Library (OpenCV) to apply mean and variance processing to the image data. The processed image data is then used as the input data of the convolutional neural network, and the central processing unit compiles and runs each layer of the network, layer by layer, on that input data.

However, the above method of recognizing the number of images using a convolutional neural network has a problem of low efficiency.

Disclosure of Invention

Therefore, it is necessary to provide a data processing method and apparatus based on Caffe and related products for solving the problem of low efficiency of the above method for identifying the number of images by using convolutional neural network.

In a first aspect, an embodiment of the present invention provides a data processing method based on Caffe, where the method includes:

acquiring a configuration command; the configuration command is used for indicating parameter configuration of the Caffe file;

according to the configuration command, defining a normalization parameter and an operator type of a Convolutional Neural Network (CNN) first-layer convolutional layer in the Caffe file to obtain a configured Caffe file; the normalization parameter is a parameter used for performing feature normalization on the input data of the CNN convolutional layer;

compiling the configured Caffe file to obtain an executable file, and running the executable file on an artificial intelligence processor; the executable file is used for instructing the artificial intelligence processor to carry out feature standardization on the input data of the CNN convolutional layer and executing convolution operation on the data after feature standardization.

In one embodiment, the configured Caffe file further comprises artificial intelligence processor logic and general-purpose processor logic; the artificial intelligence processor logic represents the execution order of the statements when an artificial intelligence processor layer in the Caffe file is executed; the general-purpose processor logic represents the execution order of the statements when a general-purpose processor layer in the Caffe file is executed;

then, before running the executable file on an artificial intelligence processor, the method comprises:

adding a logic switching identifier in the executable file according to the switching instruction; the logic switching identifier is used for indicating the operation of the CNN convolutional layer to be the logic of the artificial intelligence processor.

In one embodiment, defining a standardized parameter and an operator type of the CNN first layer convolution layer in the Caffe file to obtain a configured Caffe file includes:

adding the normalization parameters to the convolutional layer parameters in the Caffe file, and defining the operator type of the CNN first-layer convolutional layer in the factory pattern of the Caffe file, to form the configured Caffe file.

In one embodiment, the normalized parameters are parameters obtained by training according to a preset model.

In one embodiment, the normalization parameters include a mean-subtraction parameter and a scaling parameter;

the mean-subtraction parameter indicates that a mean-subtraction operation is to be performed on the input data;

and the scaling parameter indicates that a scaling operation is to be performed on the data obtained after the mean-subtraction operation.

In one embodiment, the mean-subtraction parameter comprises a first mean-subtraction parameter or a second mean-subtraction parameter;

the first mean-subtraction parameter indicates that the mean is subtracted from pixels of the input data at the same spatial position; or,

the second mean-subtraction parameter indicates that the mean is subtracted from each channel of the input data.

In a second aspect, an embodiment of the present invention provides a data processing method based on Caffe, where the method includes:

receiving an executable file, wherein the executable file is a file obtained by compiling computer equipment according to a configured Caffe file; the configured Caffe file comprises standardized parameters and an operator type of the CNN first-layer convolutional layer;

and according to the executable file, carrying out characteristic standardization processing on the input data, and carrying out convolution operation on the input data after the characteristic standardization processing.

In one embodiment, the performing, according to the executable file, a feature normalization process on the input data includes:

performing a mean-subtraction operation on the input data according to the mean-subtraction parameter and the function called for the operator type carried in the executable file;

and scaling the mean-subtracted data according to the scaling parameter.

In one embodiment, performing the mean-subtraction operation on the input data according to the mean-subtraction parameter and the function called for the operator type carried in the executable file includes:

if the mean-subtraction parameter is a first mean-subtraction parameter, performing the mean-subtraction operation on pixels of the input data at the same spatial position, according to the first mean-subtraction parameter and the function called for the operator type carried in the executable file;

and if the mean-subtraction parameter is a second mean-subtraction parameter, performing the mean-subtraction operation on the channels of the input data, according to the second mean-subtraction parameter and the function called for the operator type carried in the executable file.

In a third aspect, an embodiment of the present invention provides a Caffe-based data processing apparatus, where the apparatus includes:

the acquisition module is used for acquiring a configuration command; the configuration command is used for indicating parameter configuration of the Caffe file;

the definition module is used for defining a normalization parameter and an operator type of a CNN first-layer convolutional layer in the Caffe file according to the configuration command to obtain a configured Caffe file; the normalization parameter is a parameter used for performing feature normalization on the input data of the CNN convolutional layer;

the processing module is used for compiling the configured Caffe file to obtain an executable file and running the executable file on an artificial intelligence processor; the executable file is used for instructing the artificial intelligence processor to carry out feature standardization on the input data of the CNN convolutional layer and executing convolution operation on the data after feature standardization.

In a fourth aspect, an embodiment of the present invention provides a Caffe-based data processing apparatus, where the apparatus includes:

the receiving module is used for receiving an executable file, wherein the executable file is a file obtained by compiling the computer equipment according to the configured Caffe file; the configured Caffe file comprises standardized parameters and an operator type of the CNN first-layer convolutional layer;

and the operation module is used for carrying out characteristic standardization processing on the input data according to the executable file and carrying out convolution operation on the input data after the characteristic standardization processing.

In a fifth aspect, an embodiment of the present invention provides a Caffe-based data processing apparatus, including a memory and a processor, where the memory stores a computer program, and the processor implements the method steps in any one of the first and second aspects when executing the computer program.

In a sixth aspect, an embodiment of the present invention provides a combined processing device, where the combined processing device includes the Caffe-based data processing device according to the fifth aspect, a general interconnection interface, and other processing devices except for the Caffe-based data processing device; the Caffe-based data processing device interacts with the other processing devices.

In a seventh aspect, an embodiment of the present invention provides a machine learning chip, where the machine learning chip includes the combined processing apparatus as described in the sixth aspect.

In an eighth aspect, an embodiment of the present invention provides a board card, where the board card includes the machine learning chip according to the seventh aspect.

In a ninth aspect, an embodiment of the present invention provides an electronic device, where the electronic device includes the board card according to the eighth aspect.

According to the Caffe-based data processing method and device and the related products, a computer device defines a normalization parameter and the operator type of the CNN first-layer convolutional layer in a Caffe file according to a configuration command to obtain a configured Caffe file, compiles the configured Caffe file into an executable file, and runs the executable file on an artificial intelligence processor, so that the artificial intelligence processor performs feature normalization on the input data of the CNN convolutional layer and performs a convolution operation on the feature-normalized data. In this method, the feature normalization of the input data is moved into the CNN convolutional layer, and the operator type that the computer device defines in the Caffe file is an operator that the artificial intelligence processor can run directly. The artificial intelligence processor can therefore fuse the normalization of the input data with the convolution operation, which greatly improves the efficiency of the convolutional neural network for image number recognition and, in turn, makes deep learning application tasks more efficient.

Drawings

Fig. 1 is an application environment diagram of a Caffe-based data processing method according to an embodiment;

Fig. 2 is a schematic flowchart of a Caffe-based data processing method according to an embodiment;

Fig. 3 is a schematic flowchart of a Caffe-based data processing method according to an embodiment;

Fig. 4 is a schematic flowchart of a Caffe-based data processing method according to an embodiment;

Fig. 5 is a block diagram of a Caffe-based data processing apparatus according to an embodiment;

Fig. 6 is a block diagram of a Caffe-based data processing apparatus according to an embodiment;

Fig. 7 is a schematic structural diagram of a combined processing device according to an embodiment;

Fig. 8 is a schematic structural diagram of another combined processing device according to an embodiment;

Fig. 9 is a schematic structural diagram of a board card according to an embodiment.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application.

The terms "first," "second," "third," and "fourth," etc. in the description and claims of this application and in the accompanying drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.

Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.

The Caffe-based data processing method provided by the present application can be applied to the application environment shown in fig. 1, and the computer device can be a server, and the computer device includes a processor, a memory, a network interface, and a database, which are connected through a system bus. Wherein the processor is configured to provide computational and control capabilities. The memory comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database is used for storing data of the Caffe-based data processing method. The network interface is used for communicating with other external devices through network connection. The computer program is executed by a processor to implement a Caffe-based data processing method.

In order to make the objects, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein merely illustrate the present application and are not intended to limit it. The Caffe-based data processing method provided by the embodiments of the application aims to solve the technical problem that the prior-art method of image number recognition using a convolutional neural network is inefficient. The following describes in detail, through embodiments and with reference to the drawings, the technical solutions of the present application and how they solve this technical problem. The specific embodiments below may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments. It should be noted that, in the Caffe-based data processing method provided by the present invention, the execution subject in fig. 2 is a computer device, and the execution subject in fig. 3 and fig. 4 is an artificial intelligence processor. The execution subjects in fig. 2 to fig. 4 may also be a Caffe-based data processing apparatus, which may be implemented as part or all of the Caffe-based data processing by software, hardware, or a combination of software and hardware.

An embodiment of a data processing method based on Caffe is described below with an execution subject as a computer device.

In an embodiment, fig. 2 provides a Caffe-based data processing method, and this embodiment relates to a specific process in which a computer device configures a Caffe file, compiles the configured Caffe file into an executable file, and runs the executable file on an artificial intelligence processor, so that the artificial intelligence processor performs feature normalization on input data of a convolutional layer, and performs convolution operation on the feature-normalized data. As shown in fig. 2, the method includes:

S101, acquiring a configuration command; the configuration command is used for indicating parameter configuration of the Caffe file.

In this embodiment, after obtaining the configuration command, the computer device may perform parameter configuration on the Caffe file. The computer device may obtain the configuration command by directly receiving a configuration command input by the user, or by actively obtaining a configuration file and parsing the configuration command from it. Other manners are also possible, and this embodiment is not limited in this respect.

S102, defining a standardized parameter and an operator type of a Convolutional Neural Network (CNN) first layer convolutional layer in the Caffe file according to the configuration command to obtain a configured Caffe file; the normalization parameter indicates a parameter for performing characteristic normalization on the input data of the CNN convolutional layer.

Based on the configuration command obtained in step S101, the computer device defines the normalization parameter and the operator type of the first-layer convolutional layer of the Convolutional Neural Network (CNN) in the Caffe file, so as to obtain the configured Caffe file. The normalization parameter is a parameter used for performing feature normalization on the input data of the CNN convolutional layer; it may be, for example, a mean-subtraction parameter or a scaling parameter, or another parameter, which is not limited in this embodiment. The operator type of the CNN first-layer convolutional layer identifies the operator used to perform feature normalization on the input data, for example ConvFirstOp, which is an operator in cnml that implements the first-layer convolution operation. cnml, also called the machine learning library, is an application programming interface (API) for deep learning inference. That is, cnml provides an API for the operations of the ConvFirstOp operator, and through this API the function corresponding to ConvFirstOp can be called directly from cnml.

S103, compiling the configured Caffe file to obtain an executable file, and running the executable file on an artificial intelligence processor; the executable file is used for instructing the artificial intelligence processor to carry out feature standardization on the input data of the CNN convolutional layer and executing first-layer convolution operation on the data after feature standardization.

In this embodiment, the computer device compiles the configured Caffe file obtained in S102 into an executable file and runs the executable file on the artificial intelligence processor. According to the executable file, the artificial intelligence processor performs feature normalization on the input data of the CNN first-layer convolutional layer and performs the first-layer convolution operation on the feature-normalized data. The artificial intelligence processor may be, for example, a Machine Learning Unit (MLU). Taking an MLU as the artificial intelligence processor, in practice the MLU may, according to the operator of the first-layer convolutional layer carried in the executable file (e.g., ConvFirstOp) and the ConvFirstOp API provided by cnml, call the function corresponding to ConvFirstOp from cnml, perform feature normalization on the input data of the CNN convolutional layer, and perform the convolution operation on the normalized data. When the operator type of the CNN first-layer convolutional layer is defined as ConvFirstOp, the artificial intelligence processor (MLU) can be used for inference when feature normalization is performed on the input data of the CNN convolutional layer.

In the Caffe-based data processing method provided in this embodiment, a computer device defines a normalization parameter and the operator type of the CNN first-layer convolutional layer in a Caffe file according to a configuration command to obtain a configured Caffe file, compiles the configured Caffe file into an executable file, and runs the executable file on an artificial intelligence processor, so that the artificial intelligence processor performs feature normalization on the input data of the CNN convolutional layer and performs a convolution operation on the feature-normalized data. In this method, the feature normalization of the input data is moved into the CNN convolutional layer, and the operator type that the computer device defines in the Caffe file is an operator that the artificial intelligence processor can run directly. The artificial intelligence processor can therefore fuse the normalization of the input data with the convolution operation, which greatly improves the efficiency of the convolutional neural network for image number recognition and, in turn, makes deep learning application tasks more efficient.

On the basis of the above embodiment, optionally, the configured Caffe file further includes artificial intelligence processor logic and general processor logic; the artificial intelligence processor logic represents the execution sequence of the statements when executing the artificial intelligence processor layer in the Caffe file; the general processor logic represents the execution sequence of the sentences when executing the general processor layer in the Caffe file; then, before running the executable file on an artificial intelligence processor, the method comprises: adding a logic switching identifier in the executable file according to the switching instruction; the logic switching identifier is used for indicating the operation of the CNN convolutional layer to be the logic of the artificial intelligence processor.

This embodiment takes an MLU as the artificial intelligence processor and a CPU as the general-purpose processor. The configured Caffe file then further includes MLU logic and CPU logic, where the MLU logic represents the execution order of the statements in the MLU layers of the Caffe file, and the CPU logic represents the execution order of the statements in the CPU layers of the Caffe file. In practice, because the configured Caffe file includes both MLU logic and CPU logic, the computer device may add a logic switching identifier to the executable file according to a switching instruction; the logic switching identifier indicates that the operation of the CNN convolutional layer is to be performed with MLU logic. The switching instruction may be an instruction entered manually by the user that carries the logic switching identifier; on receiving the switching instruction, the computer device adds the identifier it carries to the executable file. For example, the logic switching identifier may be represented by 0 or 1, and when the identifier takes the value chosen to denote MLU logic, the computer device performs the operation of the CNN convolutional layer with MLU logic. It should be understood that, if no logic switch is made according to the logic switching identifier, the computer device may by default perform the operation of the CNN first-layer convolutional layer with CPU logic. A separate logic switching identifier may also be provided for CPU logic, so that when the operation needs to be performed with CPU logic, the computer device switches logic according to that added CPU identifier. The logic switching identifier may be a number, a letter, a combination of numbers and letters, or any other identifier that the CPU or the MLU can recognize; its specific form is not limited in this embodiment.
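The switching behaviour described above can be sketched as follows. This is only an illustration under stated assumptions: the dictionary standing in for the executable file, the key `logic_switch`, the value `MLU_LOGIC = 1`, and all function names are hypothetical and do not come from the patent or from any Cambricon library.

```python
MLU_LOGIC = 1  # hypothetical identifier value that selects MLU logic


def run_on_mlu(layer_name):
    return f"{layer_name}: MLU logic"


def run_on_cpu(layer_name):
    return f"{layer_name}: CPU logic"


def add_switch_identifier(executable, switch_instruction):
    """Copy the identifier carried by the switching instruction into the executable."""
    updated = dict(executable)
    updated["logic_switch"] = switch_instruction["identifier"]
    return updated


def run_first_conv(executable, layer_name="conv1"):
    """Choose the logic for the first convolutional layer from the identifier."""
    # With no identifier present, fall back to CPU logic, as the text describes.
    if executable.get("logic_switch") == MLU_LOGIC:
        return run_on_mlu(layer_name)
    return run_on_cpu(layer_name)


executable = {"layers": ["conv1"]}
print(run_first_conv(executable))
switched = add_switch_identifier(executable, {"identifier": MLU_LOGIC})
print(run_first_conv(switched))
```

The default branch mirrors the statement that CPU logic is used when no switch is made; a separate CPU identifier could be handled by adding another comparison.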

In the data processing method based on Caffe provided by this embodiment, before an executable file runs on an artificial intelligence processor, a computer device adds a logic switching identifier in the executable file according to a switching instruction, and switches the operation of a CNN convolutional layer to the logic of the artificial intelligence processor according to the logic switching identifier, so that the operation suitable for the artificial intelligence processor is freely switched between the artificial intelligence processor and a general processor by setting the logic switching identifier, thereby greatly improving the efficiency of the convolutional neural network for image number recognition, and further, making the deep learning related application task more efficient.

Regarding the step in the above embodiment of defining the normalization parameter and the operator type of the CNN first-layer convolutional layer in the Caffe file to obtain the configured Caffe file, an embodiment of the present application provides a Caffe-based data processing method in which one implementation of step S102 includes: adding the normalization parameters to the convolutional layer parameters in the Caffe file, and defining the operator type of the CNN first-layer convolutional layer in the factory pattern of the Caffe file, to form the configured Caffe file.

Here, the convolutional layer parameter in the Caffe file is ConvolutionParameter in caffe.proto. The factory pattern in the Caffe file is layer_factory, for example: src/caffe/layer_factory. In this embodiment, the computer device adds the normalization parameter to the convolutional layer parameter in the Caffe file and defines the operator type of the CNN first-layer convolutional layer in the factory pattern of the Caffe file; that is, it adds the normalization parameter to ConvolutionParameter in caffe.proto and defines the operator type of the CNN first-layer convolutional layer in layer_factory, thereby obtaining the configured Caffe file. Because the computer device defines the normalization parameter and the operator type of the CNN first-layer convolutional layer at these specific locations, the artificial intelligence processor can perform the normalization and the convolution operation on the input data according to the executable file compiled from the configured Caffe file, which greatly improves the efficiency of the convolutional neural network in image number recognition.
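The factory mechanism referred to above can be illustrated with a small registry. This sketch shows the general factory pattern only; it is not Caffe's C++ layer factory, and the names `LAYER_REGISTRY`, `register_layer`, `ConvFirstLayer`, and the type string `"ConvFirst"` are hypothetical.

```python
LAYER_REGISTRY = {}  # maps a layer type name to a creator function


def register_layer(type_name):
    """Decorator that records a creator function under the given type name."""
    def decorator(creator):
        LAYER_REGISTRY[type_name] = creator
        return creator
    return decorator


class ConvFirstLayer:
    """Hypothetical first-layer convolution that carries normalization parameters."""

    def __init__(self, params):
        self.mean_value = params.get("mean_value", [])
        self.std = params.get("std", 1.0)


@register_layer("ConvFirst")
def create_conv_first(params):
    return ConvFirstLayer(params)


def create_layer(type_name, params):
    """Look up the creator registered for a type name and build the layer."""
    if type_name not in LAYER_REGISTRY:
        raise KeyError(f"unknown layer type: {type_name}")
    return LAYER_REGISTRY[type_name](params)


layer = create_layer("ConvFirst", {"mean_value": [104, 117, 123], "std": 0.017})
print(type(layer).__name__, layer.mean_value, layer.std)
```

Registering the creator under a type name is what lets a configuration file select the first-layer operator by name, without the caller knowing the concrete class.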

A specific value must be provided for the normalization parameter so that the artificial intelligence processor can complete the feature normalization when it normalizes the input data according to the executable file. In an embodiment, the value of the normalization parameter is therefore obtained by training with a preset model, where the preset model is a model constructed in advance by the user for training the normalization parameter on a training set. Training with a preset model is only one way of obtaining the value; the value may also be obtained by the user from empirical statistics over large amounts of data or by other methods, which is not limited in this embodiment. In this embodiment, the computer device takes the values of the normalization parameters from training with the preset model, so that the artificial intelligence processor normalizes the input data according to those parameters and the differences between features in the input data are greatly reduced.
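The text does not specify the preset model, so the following is only one plausible, assumed way of deriving such values: computing a per-channel mean over a small training set. The function name `channel_means` and the toy data are hypothetical.

```python
def channel_means(images):
    """Per-channel mean over a list of images, each a list of (R, G, B) pixels."""
    sums = [0.0, 0.0, 0.0]
    count = 0
    for image in images:
        for pixel in image:
            for c in range(3):
                sums[c] += pixel[c]
            count += 1
    return [s / count for s in sums]


# Toy training set: two images with two pixels each.
training_set = [
    [(100, 110, 120), (108, 124, 126)],
    [(104, 117, 123), (104, 117, 123)],
]
print(channel_means(training_set))  # [104.0, 117.0, 123.0]
```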

In addition, because feature normalization means giving each dimension of the input data zero mean and unit variance, the normalization parameters include a mean-subtraction parameter and a scaling parameter. The mean-subtraction parameter indicates that a mean-subtraction operation is performed on the input data, and the scaling parameter indicates that a scaling operation is performed on the data obtained after the mean-subtraction operation. Optionally, the mean-subtraction parameter comprises a first mean-subtraction parameter or a second mean-subtraction parameter: the first mean-subtraction parameter indicates that the mean is subtracted from pixels of the input data at the same spatial position, and the second mean-subtraction parameter indicates that the mean is subtracted from each channel of the input data.

Performing feature normalization on the input data means subtracting the mean from, and scaling, each dimension of the input data, so the normalization parameters include a mean-subtraction parameter and a scaling parameter. The mean-subtraction parameter indicates that the mean is subtracted over all training sets, and the scaling parameter indicates that the data obtained after mean subtraction over all training sets is scaled. The mean-subtraction parameters include a first mean-subtraction parameter, which indicates that the mean is subtracted from pixels of the input data at the same spatial position, and a second mean-subtraction parameter, which indicates that the mean is subtracted from each channel of the input data. The first mean-subtraction parameter may be the mean_file parameter, and the second mean-subtraction parameter may be the mean_value parameter. It should be noted that the mean_file parameter may be defined by specifying a binaryproto mean file. For example, suppose the scaling parameter is the std parameter with a value of 0.017. If the mean-subtraction parameter is the mean_file parameter, the artificial intelligence processor subtracts the mean from pixels of the input data at the same spatial position and then scales the mean-subtracted data by 0.017 to obtain the final data, that is, the feature-normalized input data. If instead the mean-subtraction parameter is the mean_value parameter with values 104, 117, and 123, the artificial intelligence processor subtracts the mean from each channel of the input data (104 from every R channel, 117 from every G channel, and 123 from every B channel) and then scales the mean-subtracted data by 0.017; the final data is the feature-normalized input data. For example, the first convolutional layer may be defined as:

It should be understood that the artificial intelligence processor subtracts the mean from the input data according to the mean_file parameter or the mean_value parameter. In another example, following the ways of setting the normalization parameters described above, in practice the normalization data (that is, the mean-subtraction parameter and the scaling parameter) are obtained from the training of a preset model. If the mean-subtraction parameter is mean, the scaling parameter is std, and the operator provided on the artificial intelligence processor is ConvFirstOp, then mean subtraction and scaling are applied to the input data according to the formula out = ((data - mean) / std) * filter + bias to obtain the output data, where filter is the convolution kernel and bias is the bias, both obtained by training the neural network.
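The formula above can be sketched in plain Python for a single-channel input. This is only an illustration under stated assumptions: the function name conv_first, the sliding-window convolution with stride 1 and no padding, and the sample values are hypothetical, and std is treated as a divisor exactly as the formula is written. It does not use or represent the cnml ConvFirstOp API.

```python
from typing import List

Matrix = List[List[float]]


def conv_first(data: Matrix, mean: float, std: float,
               filt: Matrix, bias: float) -> Matrix:
    """Normalize, then convolve: out = ((data - mean) / std) * filt + bias."""
    # Feature normalization: subtract the mean, then divide by std
    # (divisor convention taken from the formula in the text).
    norm = [[(v - mean) / std for v in row] for row in data]

    kh, kw = len(filt), len(filt[0])
    out_h = len(norm) - kh + 1
    out_w = len(norm[0]) - kw + 1

    out: Matrix = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Sliding-window sum of products, as in a CNN convolutional layer.
            acc = bias
            for di in range(kh):
                for dj in range(kw):
                    acc += norm[i + di][j + dj] * filt[di][dj]
            row.append(acc)
        out.append(row)
    return out


# Hypothetical 3 x 3 input and 2 x 2 kernel, chosen only for illustration.
data = [[10.0, 12.0, 14.0],
        [16.0, 18.0, 20.0],
        [22.0, 24.0, 26.0]]
filt = [[1.0, 0.0],
        [0.0, 1.0]]
result = conv_first(data, mean=10.0, std=2.0, filt=filt, bias=0.5)
print(result)
```

Because normalization and convolution happen inside one function, the caller never sees the intermediate normalized data, which mirrors the idea of performing feature normalization inside the first convolutional layer.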

In this embodiment, the artificial intelligence processor subtracts the mean from the input data according to the configured mean-subtraction parameter (the mean_file parameter or the mean_value parameter) and then scales the mean-subtracted data, thereby normalizing the input data and greatly reducing the differences between features in the input data.

An embodiment of the Caffe-based data processing method is described below with the artificial intelligence processor as the executing entity. Terms such as the artificial intelligence processor, the normalization parameters, the operator type of the CNN convolutional layer, the mean-subtraction parameter and the scaling parameter, as well as some of the data interactions, have already been explained in the embodiments above and are not described again in the following embodiments.

In an embodiment, fig. 3 provides a Caffe-based data processing method. This embodiment concerns the specific process in which an artificial intelligence processor performs feature normalization on input data according to an executable file and performs a convolution operation on the feature-normalized input data. As shown in fig. 3, the method includes:

S201, receiving an executable file, where the executable file is obtained by a computer device compiling a configured Caffe file; the configured Caffe file comprises normalization parameters and the operator type of the first convolutional layer of the CNN.

In this embodiment, taking the MLU as an example of the artificial intelligence processor, the MLU receives the executable file. The executable file is obtained by the computer device compiling the configured Caffe file, where the configured Caffe file is the Caffe file obtained after the normalization parameters and the operator type of the first convolutional layer of the CNN have been defined in the original Caffe file.

S202, performing feature normalization on the input data according to the executable file, and performing a convolution operation on the feature-normalized input data.

In this step, based on the executable file received in step S201, the artificial intelligence processor performs feature normalization on the input data according to the executable file and performs the first-layer convolution operation on the feature-normalized input data. For example, where the artificial intelligence processor is an MLU, the MLU may, according to the first-layer convolution operator carried in the executable file (e.g., ConvFirstOp), call the ConvFirstOp API from cnml to perform feature normalization on the input data of the first convolutional layer of the CNN and then perform the first-layer convolution operation on the feature-normalized data.

In the Caffe-based data processing method provided by this embodiment, the artificial intelligence processor performs feature normalization on the input data according to the received executable file and performs a convolution operation on the feature-normalized input data.

Considering that the feature normalization performed by the artificial intelligence processor on the input data includes two steps, namely a mean-subtraction operation and a scaling operation, in one embodiment, as shown in fig. 4, S202 above includes:

S301, performing a mean-subtraction operation on the input data according to the function called for the operator type carried in the executable file and the mean-subtraction parameter.

In this embodiment, the artificial intelligence processor performs the mean-subtraction operation on the input data according to the function called for the operator type carried in the executable file and the mean-subtraction parameter. Because the executable file is compiled from a Caffe file configured with the normalization parameters (including the mean-subtraction parameter and the scaling parameter) and the operator type of the first convolutional layer of the CNN, the artificial intelligence processor can obtain the first-layer convolution operator and the mean-subtraction parameter directly from the executable file, call the corresponding function according to the operator, and perform the mean-subtraction operation on the input data.

The defined mean-subtraction parameter may be a first mean-subtraction parameter or a second mean-subtraction parameter, so step S301 has two implementations:

Optionally, one implementation of step S301 includes: if the mean-subtraction parameter is a first mean-subtraction parameter, performing the mean-subtraction operation on pixels of the input data at the same spatial position according to the function called for the operator type carried in the executable file and the first mean-subtraction parameter.

The first mean-subtraction parameter is, for example, the mean_file parameter, which specifies the mean file as a binaryproto file. For example, if the mean-subtraction parameter is the mean_file parameter, the artificial intelligence processor subtracts the mean from pixels of the input data at the same spatial position, that is, it subtracts the mean at each pixel position across all pictures, and the result is the mean-subtracted input data.
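As a rough illustration of subtracting the mean at the same spatial position, the following sketch averages a set of pictures position by position and subtracts that mean image from one picture. The helper names mean_image_of and subtract_mean_image and the two-picture data set are hypothetical; reading an actual binaryproto mean file is outside the scope of this sketch.

```python
from typing import List

Image = List[List[List[float]]]  # indexed as [channel][row][col]


def mean_image_of(images: List[Image]) -> Image:
    """Average a set of same-shaped images position by position."""
    n = len(images)
    channels = len(images[0])
    height = len(images[0][0])
    width = len(images[0][0][0])
    return [[[sum(img[c][i][j] for img in images) / n
              for j in range(width)]
             for i in range(height)]
            for c in range(channels)]


def subtract_mean_image(image: Image, mean_image: Image) -> Image:
    """Subtract the mean image element-wise from an image of the same shape."""
    return [[[px - m for px, m in zip(row, mean_row)]
             for row, mean_row in zip(chan, mean_chan)]
            for chan, mean_chan in zip(image, mean_image)]


# Hypothetical two-picture data set: one channel, 1 x 2 pixels each.
pictures = [[[[2.0, 4.0]]], [[[4.0, 8.0]]]]
mean_img = mean_image_of(pictures)
centered = subtract_mean_image(pictures[0], mean_img)
print(mean_img, centered)
```

Each position has its own mean here, in contrast to the per-channel variant, where a single value is shared by every pixel of a channel.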

Optionally, another implementation of step S301 includes: if the mean-subtraction parameter is a second mean-subtraction parameter, performing the mean-subtraction operation on the channels of the input data according to the function called for the operator type carried in the executable file and the second mean-subtraction parameter.

The second mean-subtraction parameter is, for example, the mean_value parameter, which has three values representing the means of the three channels (the R, G and B channels). For example, if mean_value is set to 104, 117 and 123, the artificial intelligence processor subtracts the mean from each channel of the input data, that is, 104 is subtracted from every R channel, 117 from every G channel and 123 from every B channel, giving the mean-subtracted input data.
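The per-channel case can be sketched in a few lines of Python. The function name subtract_channel_means and the sample pixel values are hypothetical, and the channel order R, G, B simply follows the example in the text.

```python
from typing import List, Sequence

Image = List[List[List[float]]]  # [channel][row][col], channels in R, G, B order


def subtract_channel_means(image: Image,
                           mean_values: Sequence[float]) -> Image:
    """Subtract one mean value from every pixel of the matching channel."""
    if len(image) != len(mean_values):
        raise ValueError("need exactly one mean value per channel")
    return [[[px - m for px in row] for row in chan]
            for chan, m in zip(image, mean_values)]


# Hypothetical 1 x 2 RGB picture.
image = [
    [[110.0, 104.0]],  # R channel
    [[120.0, 117.0]],  # G channel
    [[130.0, 123.0]],  # B channel
]
centered = subtract_channel_means(image, (104.0, 117.0, 123.0))
print(centered)
```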

S302, scaling the mean-subtracted data according to the scaling parameter.

In this step, after the artificial intelligence processor has performed the mean-subtraction operation on the input data in step S301, it scales the mean-subtracted data according to the scaling parameter, and the resulting data is the feature-normalized input data. Like the mean-subtraction parameter, the scaling parameter is one of the normalization parameters, so the artificial intelligence processor can obtain it directly from the executable file and use it.

In the Caffe-based data processing method provided in this embodiment, the artificial intelligence processor performs the mean-subtraction operation on the input data according to the function called for the operator type carried in the executable file and the mean-subtraction parameter, and then scales the mean-subtracted data according to the scaling parameter to obtain the final normalized input data, thereby greatly reducing the differences between features in the input data.
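The two steps S301 and S302 can be chained as in the following sketch. All names (s301_subtract_mean, s302_scale, feature_normalize) are hypothetical, and the dispatch on the kind of mean-subtraction parameter is only one possible way to express the two implementations of S301. Dividing by std follows the formula out = ((data - mean) / std) * filter + bias given earlier in the description; if the scaling parameter is meant as a multiplier instead, the division would become a multiplication.

```python
from typing import List, Sequence, Union

Image = List[List[List[float]]]  # [channel][row][col]


def s301_subtract_mean(image: Image,
                       mean: Union[Image, Sequence[float]]) -> Image:
    """S301: subtract a mean image (first kind) or per-channel means (second kind)."""
    per_channel = isinstance(mean[0], (int, float))
    out: Image = []
    for c, chan in enumerate(image):
        new_chan = []
        for i, row in enumerate(chan):
            if per_channel:
                # Second kind: one value shared by the whole channel.
                new_chan.append([px - mean[c] for px in row])
            else:
                # First kind: a separate mean for each spatial position.
                new_chan.append([px - m for px, m in zip(row, mean[c][i])])
        out.append(new_chan)
    return out


def s302_scale(image: Image, std: float) -> Image:
    """S302: scale the mean-subtracted data (assumed here: divide by std)."""
    return [[[px / std for px in row] for row in chan] for chan in image]


def feature_normalize(image: Image,
                      mean: Union[Image, Sequence[float]],
                      std: float) -> Image:
    """Run S301 and then S302."""
    return s302_scale(s301_subtract_mean(image, mean), std)


# Hypothetical 1 x 2 RGB picture; std = 2.0 is chosen only for readability.
image = [[[106.0, 108.0]], [[119.0, 121.0]], [[125.0, 127.0]]]
by_channel = feature_normalize(image, mean=[104.0, 117.0, 123.0], std=2.0)
by_pixel = feature_normalize(image, mean=image, std=2.0)
print(by_channel, by_pixel)
```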

It should be understood that although the steps in the flow charts of fig. 2-4 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in fig. 2-4 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

In one embodiment, as shown in fig. 5, there is provided a Caffe-based data processing apparatus comprising: an obtaining module 10, a defining module 11 and a processing module 12, wherein:

the obtaining module 10 is configured to obtain a configuration instruction; the configuration instruction is used to indicate parameter configuration of the Caffe file;

the defining module 11 is configured to define, according to the configuration instruction, normalization parameters and the operator type of the first convolutional layer of the convolutional neural network (CNN) in the Caffe file, so as to obtain a configured Caffe file; the normalization parameters are parameters for performing feature normalization on input data of the CNN convolutional layer;

the processing module 12 is configured to compile the configured Caffe file to obtain an executable file, and run the executable file on an artificial intelligence processor; the executable file is used to instruct the artificial intelligence processor to perform feature normalization on the input data of the CNN convolutional layer and to perform a convolution operation on the feature-normalized data.

The implementation principle and technical effect of the data processing apparatus based on Caffe provided in this embodiment are similar to those of the above embodiment of the data processing method based on Caffe, and are not described herein again.

In one embodiment, as shown in fig. 6, there is provided a Caffe-based data processing apparatus, comprising: a receiving module 13 and an operation module 14, wherein:

the receiving module 13 is configured to receive an executable file, where the executable file is obtained by a computer device compiling a configured Caffe file; the configured Caffe file comprises normalization parameters and the operator type of the first convolutional layer of the CNN;

and the operation module 14 is configured to perform feature normalization processing on the input data according to the executable file, and perform convolution operation on the input data after the feature normalization processing.

For specific limitations of the Caffe-based data processing apparatus, reference may be made to the limitations of the Caffe-based data processing method above, which are not repeated here. The modules in the above Caffe-based data processing apparatus may be implemented wholly or partially by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call them and execute the operations corresponding to the modules.

In an embodiment, the present application further provides a Caffe-based data processing apparatus, including a processor and a memory, where the memory stores a computer program, and the processor implements the following steps when executing the computer program:

defining, according to the configuration instruction, normalization parameters and the operator type of the first convolutional layer of the CNN in the Caffe file to obtain a configured Caffe file; the normalization parameters are parameters for performing feature normalization on input data of the CNN convolutional layer;

Or,

receiving an executable file, where the executable file is obtained by a computer device compiling a configured Caffe file; the configured Caffe file comprises normalization parameters and the operator type of the CNN convolutional layer;

Referring to fig. 7, an embodiment of the present application further provides a combined processing device, which includes the above Caffe-based data processing device, a universal interconnection interface, and other processing devices besides the Caffe-based data processing device. The Caffe-based data processing device interacts with the other processing devices to jointly complete the computing operation specified by the user. The other processing devices include one or more types of general-purpose or special-purpose processors, such as a central processing unit (CPU), a graphics processing unit (GPU), or a neural network processor; the number of processors included in the other processing devices is not limited. The other processing devices serve as the interface between the Caffe-based data processing device and external data and control, handling data transfer and basic control of the data processing device such as starting and stopping; they may also cooperate with the Caffe-based data processing device to complete computing tasks. The universal interconnection interface is used to transmit data and control instructions between the Caffe-based data processing device and the other processing devices. The Caffe-based data processing device obtains the required input data from the other processing devices and writes it into the on-chip shared memory of the Caffe-based data processing device; it may obtain control instructions from the other processing devices and write them onto the chip of the data processing device; and it may read the data in its shared memory and transmit that data to the other processing devices.

Optionally, as shown in fig. 8, the above combined processing device may further include a storage device, which is connected to the Caffe-based data processing device and to the other processing devices respectively. The storage device is used to store data of the Caffe-based data processing device and the other processing devices, and is particularly suitable for data required for computation that cannot be held entirely in the internal storage of the Caffe-based data processing device or the other processing devices.

The combined processing device can be used as a system on chip (SoC) for equipment such as mobile phones, robots, unmanned aerial vehicles and video monitoring equipment, which effectively reduces the core area of the control part, increases the processing speed and reduces the overall power consumption. In this case, the universal interconnection interface of the combined processing device is connected to certain components of the equipment, such as a camera, a display, a mouse, a keyboard, a network card or a Wi-Fi interface.

In an embodiment, the present application further provides a machine learning chip, which includes the above Caffe-based data processing device and/or combined processing device.

In an embodiment, the present application further provides a chip packaging structure, which includes the above chip.

In an embodiment, the present application further provides a board card, which includes the above chip packaging structure. Referring to fig. 9, in addition to the chip packaging structure 81, the board card may include other components, including but not limited to: a storage device 82, an interface device 83, and a control device 84. The storage device 82 is connected by a bus to the machine learning chip 811 in the chip packaging structure 81 and is used for storing data. The storage device 82 may include a plurality of groups of storage units 821, and each group of storage units 821 is connected to the machine learning chip 811 by a bus. It is understood that each group of storage units 821 may be DDR SDRAM (Double Data Rate SDRAM).

DDR doubles the speed of SDRAM without increasing the clock frequency, because it allows data to be read on both the rising and falling edges of the clock pulse; DDR is therefore twice as fast as standard SDRAM. In one embodiment, the storage device may include 4 groups of the storage units, and each group of storage units may include a plurality of DDR4 chips. In one embodiment, the machine learning chip may internally include four 72-bit DDR4 controllers, where 64 bits of each 72-bit DDR4 controller are used for data transmission and 8 bits are used for ECC checking. It can be understood that when DDR4-3200 chips are used in each group of storage units, the theoretical data transmission bandwidth can reach 25600 MB/s. In one embodiment, each group of storage units includes a plurality of double data rate synchronous dynamic random access memories arranged in parallel, which can transfer data twice in one clock cycle. A controller for the DDR is provided in the chip to control the data transmission and data storage of each storage unit.
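The 25600 MB/s figure can be checked with a short calculation. The sketch assumes that DDR4-3200 denotes 3200 million transfers per second and that only the 64 data bits of the 72-bit controller carry payload; the function name ddr_bandwidth_mb_s is hypothetical.

```python
def ddr_bandwidth_mb_s(transfers_mt_s: int, data_bits: int) -> int:
    """Theoretical bandwidth in MB/s: transfers per second times bytes per transfer."""
    return transfers_mt_s * data_bits // 8


CONTROLLER_BITS = 72  # total controller width from the text
ECC_BITS = 8          # bits reserved for ECC checking
DATA_BITS = CONTROLLER_BITS - ECC_BITS  # 64 bits carry data

# DDR4-3200: 3200 MT/s over a 64-bit data path.
bandwidth = ddr_bandwidth_mb_s(3200, DATA_BITS)
print(bandwidth)
```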

The interface device 83 is electrically connected to the machine learning chip 811 in the chip packaging structure 81 and is used for data transmission between the machine learning chip 811 and an external device (such as a server or a computer). For example, in one embodiment, the interface device 83 may be a standard PCIe (Peripheral Component Interconnect Express) interface, through which the server transmits the data to be processed to the machine learning chip. Preferably, when a PCIe 3.0 x16 interface is used for transmission, the theoretical bandwidth can reach 16000 MB/s. In another embodiment, the interface device 83 may be another type of interface; the embodiments of the present application do not limit its specific form, as long as the interface device can implement the transfer function. In addition, the calculation results of the machine learning chip 811 are transmitted back to the external device (e.g., a server) by the interface device 83.
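The 16000 MB/s figure corresponds to the raw signalling rate of a PCIe 3.0 x16 link. The sketch below assumes the PCIe 3.0 values of 8 GT/s per lane and 128b/130b line encoding; the function name pcie_bandwidth_mb_s is hypothetical, and the second value it returns shows the slightly lower rate left after encoding overhead.

```python
from typing import Tuple


def pcie_bandwidth_mb_s(gt_per_s: float, lanes: int,
                        payload_bits: int = 128,
                        encoded_bits: int = 130) -> Tuple[float, float]:
    """Return (raw, after-encoding) one-direction bandwidth in MB/s."""
    # One transfer moves one bit per lane; divide by 8 to convert bits to bytes.
    raw = gt_per_s * 1000 * lanes / 8
    return raw, raw * payload_bits / encoded_bits


raw, effective = pcie_bandwidth_mb_s(8.0, 16)
print(raw, round(effective, 2))
```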

The control device 84 is electrically connected to the machine learning chip 811 and is used to monitor the state of the chip. Specifically, the machine learning chip 811 and the control device 84 may be electrically connected through an SPI (Serial Peripheral Interface) interface. The control device may include a microcontroller unit (MCU). Because the machine learning chip may include a plurality of Caffe-based data processing devices and/or combined processing devices, it can drive a plurality of loads and can therefore be in different working states such as multi-load and light load. The control device 84 can be used to regulate the working states of the plurality of data processing devices and/or combined processing devices in the machine learning chip.

In some embodiments, an electronic device is provided that includes the above board card. The electronic device comprises a data processing device, a robot, a computer, a printer, a scanner, a tablet computer, an intelligent terminal, a mobile phone, a driving recorder, a navigator, a sensor, a webcam, a server, a cloud server, a camera, a video camera, a projector, a watch, an earphone, a mobile storage device, a wearable device, a vehicle, a household appliance, and/or a medical device. The vehicle comprises an airplane, a ship and/or a motor vehicle; the household appliance comprises a television, an air conditioner, a microwave oven, a refrigerator, an electric cooker, a humidifier, a washing machine, an electric lamp, a gas stove and a range hood; the medical device comprises a nuclear magnetic resonance apparatus, a B-mode ultrasound apparatus and/or an electrocardiograph.

Those skilled in the art should also appreciate that the embodiments described in this specification are all alternative embodiments and that the acts and modules involved are not necessarily required for this application. In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.

In the embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in actual implementation; for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical or take other forms.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in the form of hardware, or may be implemented in the form of a software program module.

The integrated units, if implemented in the form of software program modules and sold or used as stand-alone products, may be stored in a computer readable memory. Based on this understanding, the technical solution of the present application in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method described in the embodiments of the present application. The aforementioned memory includes various media capable of storing program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.

It will be understood by those skilled in the art that all or part of the processing in the above embodiments may be implemented by a program instructing the relevant hardware. The program may be stored in a computer readable memory, and the memory may include: a flash drive, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and the like.

The foregoing detailed description of the embodiments of the present application has been presented to illustrate the principles and implementations of the present application, and the above description of the embodiments is only provided to help understand the method and the core concept of the present application; meanwhile, for a person skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims

1. A data processing method based on Caffe is characterized by comprising the following steps:

2. The method of claim 1, wherein the configured Caffe file further comprises artificial intelligence processor logic and general purpose processor logic; the artificial intelligence processor logic represents the execution order of statements when an artificial intelligence processor layer in the Caffe file is executed; the general purpose processor logic represents the execution order of statements when a general purpose processor layer in the Caffe file is executed;

3. The method according to claim 1, wherein defining the normalization parameters and the operator type of the CNN first layer convolution layer in the Caffe file to obtain a configured Caffe file comprises:

4. The method of claim 1, wherein the normalization parameter takes the form of data trained according to a predetermined model.

5. The method of claim 1, wherein the normalization parameters comprise: a mean-subtraction parameter and a scaling parameter;

6. The method of claim 5, wherein the mean-subtraction parameter comprises a first mean-subtraction parameter or a second mean-subtraction parameter;

the first mean-subtraction parameter indicates that the mean is subtracted from pixels of the input data at the same spatial position;

the second mean-subtraction parameter indicates that the mean is subtracted from the channels of the input data.

7. A data processing method based on Caffe is characterized by comprising the following steps:

8. The method of claim 7, wherein performing feature normalization on the input data according to the executable file comprises:

performing a mean-subtraction operation on the input data according to the function called for the operator type carried in the executable file and a mean-subtraction parameter;

9. The method of claim 8, wherein performing the mean-subtraction operation on the input data according to the function called for the operator type carried in the executable file and the mean-subtraction parameter comprises:

if the mean-subtraction parameter is a first mean-subtraction parameter, performing the mean-subtraction operation on pixels of the input data at the same spatial position according to the function called for the operator type carried in the executable file and the first mean-subtraction parameter.

10. The method of claim 8, wherein performing the mean-subtraction operation on the input data according to the function called for the operator type carried in the executable file and the mean-subtraction parameter comprises:

if the mean-subtraction parameter is a second mean-subtraction parameter, performing the mean-subtraction operation on the channels of the input data according to the function called for the operator type carried in the executable file and the second mean-subtraction parameter.

11. A Caffe-based data processing apparatus, comprising:

12. A Caffe-based data processing apparatus, comprising:

13. A Caffe-based data processing apparatus comprising a memory and a processor, said memory storing a computer program, wherein said processor implements the steps of the method according to any one of claims 1 to 10 when executing said computer program.

14. A combined processing device, characterized in that the combined processing device comprises the Caffe-based data processing device according to claim 13, a universal interconnect interface and other processing devices than the Caffe-based data processing device; the Caffe-based data processing device interacts with the other processing devices.

15. A machine learning chip, characterized in that it comprises a combined processing device according to claim 14.

16. A board card comprising the machine learning chip of claim 15.

17. An electronic device, characterized in that it comprises the board card according to claim 16.