Disclosure of Invention
The embodiments of the present application provide a neural network-based data processing method and apparatus, a storage medium, and a server, aiming to solve the problems that a neural network with a pipeline structure involves a large creation workload, that a chip cannot accommodate its calculation process, and that the calculation efficiency of the neural network is low. The technical solution is as follows:
in one aspect, a data processing method based on a neural network is provided, and the method includes:
when input data is input into a cyclic network unit in a neural network, acquiring the number of channels of the input data, where the input data is initial data input into the neural network, or the input data is data obtained from the previous output data of the cyclic network unit, the cyclic network unit includes n convolution units of different sizes, and n is greater than or equal to 2;
selecting, from the n convolution units, a target convolution unit matching the number of channels;
inputting the input data into the target convolution unit;
and performing a convolution operation on the input data by using the target convolution unit, and determining the obtained first operation result as the output data of the cyclic network unit.
In a possible implementation, the cyclic network unit further includes n first-in first-out (FIFO) queues, each FIFO queue being connected to one convolution unit;
before the selecting of the target convolution unit matching the number of channels from the n convolution units, the method further includes: selecting, from the n FIFO queues, a target FIFO queue matching the number of channels; and inputting the input data into the target FIFO queue;
the selecting of the target convolution unit matching the number of channels from the n convolution units includes: selecting, from the n convolution units, the target convolution unit connected to the target FIFO queue;
and the inputting of the input data into the target convolution unit includes: inputting the input data output from the target FIFO queue into the target convolution unit.
In a possible implementation, the n FIFO queues are further connected to a predetermined convolution unit in the cyclic network unit;
the method further includes: acquiring the number of channels of the input data output from the target FIFO queue;
and the selecting of the target convolution unit matching the number of channels from the n convolution units includes: if the number of channels does not match the convolution unit connected to the target FIFO queue, determining the predetermined convolution unit as the target convolution unit.
In a possible implementation, the cyclic network unit further includes n instruction control units and n data master control units, each instruction control unit is connected to one FIFO queue and one data master control unit, and each data master control unit is connected to one convolution unit; the inputting of the input data output from the target FIFO queue into the target convolution unit includes:
buffering the output data of the target FIFO queue by using the instruction control unit;
inputting the buffered output data into the data master control unit, where the rate of the buffered output data matches the processing rate of the data master control unit;
providing an output window of a preset size for the output data by using a row-column controller in the data master control unit;
and inputting the output data within the output window into the target convolution unit.
In one possible implementation, the method further includes:
if the input data corresponds to a pooling unit, performing a pooling operation on the output data by using the pooling unit, and determining the obtained second operation result as the next input data of the cyclic network unit;
and if the input data does not correspond to a pooling unit, determining the output data as the next input data of the cyclic network unit.
In one possible implementation, before the inputting of the input data into the cyclic network unit in the neural network, the method further includes:
performing a convolution operation on the input data by using a preprocessing convolution unit to obtain a third operation result;
and determining the third operation result as the input data.
In one possible implementation, when n is 3, the convolution units include a 3 × 3 convolution unit, a 5 × 5 convolution unit, and a 1 × 1 convolution unit.
In one aspect, a data processing apparatus based on a neural network is provided, the apparatus comprising:
an obtaining module, configured to obtain the number of channels of input data when the input data is input to a cyclic network unit in a neural network, where the input data is initial data input to the neural network, or the input data is data obtained according to previous output data of the cyclic network unit, the cyclic network unit includes n convolution units of different sizes, and n is greater than or equal to 2;
a selecting module, configured to select a target convolution unit matched with the number of channels from the n convolution units;
an input module, configured to input the input data into the target convolution unit;
and a determining module, configured to perform a convolution operation on the input data by using the target convolution unit, and determine the obtained first operation result as the output data of the cyclic network unit.
In one aspect, there is provided a computer readable storage medium having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by a processor to implement the neural network-based data processing method as described above.
In one aspect, a server is provided, which includes a processor and a memory, where at least one instruction is stored in the memory, and the instruction is loaded and executed by the processor to implement the neural network-based data processing method described above.
The technical solution provided by the embodiments of the present application has at least the following beneficial effects:
because the cyclic network unit in the neural network includes n convolution units of different sizes, a matching target convolution unit can be selected for the input data according to the number of channels of the input data, the target convolution unit is used to perform a convolution operation on the input data, and the obtained first operation result is determined as the output data of the cyclic network unit. The output data can then be fed back into the cyclic network unit as new input data, so that data can be processed cyclically by a single cyclic network unit. In this way, the pipeline structure can be replaced by one cyclic network unit, which reduces the workload of creating the chip; the cyclic network unit is small in size, so the chip can accommodate the calculation process; and the cyclic network unit reduces the calculation time between different network layers, thereby improving data processing efficiency.
Detailed Description
To make the objects, technical solutions, and advantages of the embodiments of the present application clearer, the embodiments of the present application are further described in detail below with reference to the accompanying drawings.
As shown in fig. 1, a neural network in the prior art includes x network units, a fully connected unit, and a classifier, where the x network units form a pipeline structure, the last network unit is connected to the fully connected unit, and the fully connected unit is connected to the classifier. When input data is input into the neural network, the input data is processed by the first network unit, and the output data of the first network unit serves as the input data of the second network unit; by analogy, the output data of the last network unit is input into the fully connected unit and the classifier to obtain the output of the neural network.
Because of these disadvantages of the pipeline structure, a cyclic network unit is created in this embodiment, and data can be processed by reusing the cyclic network unit. Specifically, the neural network in the present application includes a cyclic network unit, a fully connected unit, and a classifier, where the cyclic network unit is connected to the fully connected unit, and the fully connected unit is connected to the classifier. When input data is input into the neural network, the input data is processed by the cyclic network unit, and the output data of the cyclic network unit is fed back into the cyclic network unit as its next input data; by analogy, the last output data of the cyclic network unit is input into the fully connected unit and the classifier to obtain the output of the neural network.
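As a rough software illustration of this reuse, the loop below sketches how one cyclic network unit can stand in for a pipeline of x distinct units. This is a hedged sketch, not the hardware described in this application: the `process` callback and the iteration count are illustrative assumptions.

```python
def cyclic_forward(initial_data, num_iterations, process):
    """Sketch of reusing one cyclic network unit instead of a pipeline.

    `process` is a placeholder for one pass through the cyclic network
    unit (convolution and optional pooling); it is an assumption, not
    part of this application.
    """
    data = initial_data
    for _ in range(num_iterations):
        # The unit's output is fed back in as its next input data.
        data = process(data)
    # The last output data would go on to the fully connected unit.
    return data
```

For example, `cyclic_forward(1, 3, lambda d: d * 2)` applies the same unit three times in sequence, just as the pipeline of fig. 1 would apply three distinct network units.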
It should be noted that the network unit or the cyclic network unit may be connected to a pooling unit as required, so that the pooling unit performs the pooling operation; this is not described in detail in this embodiment.
The data processing flow of the neural network is described in detail below.
Referring to fig. 2, a flowchart of a neural network-based data processing method provided in an embodiment of the present application is shown; the method may be applied to a server. The neural network-based data processing method may include the following steps:
Step 201, when input data is input into a cyclic network unit in a neural network, acquiring the number of channels of the input data, where the input data is initial data input into the neural network, or the input data is data obtained from the previous output data of the cyclic network unit, and the cyclic network unit includes n convolution units of different sizes.
When the input data is input into the cyclic network unit for the first time, the input data may be initial data; for example, it may be an image input into the neural network. When the input data is not input for the first time, the input data may be data obtained from the previous output data of the cyclic network unit; for example, it may be the previous output data itself, or data obtained by performing a pooling operation on the previous output data of the cyclic network unit.
It should be noted that the cyclic network unit may include n convolution units of different sizes so as to process data with different numbers of channels, where n is greater than or equal to 2. This embodiment does not limit the sizes of the convolution units.
In one example, when n is 3, the convolution unit may include a 3 × 3 convolution unit, a 5 × 5 convolution unit, and a 1 × 1 convolution unit.
In this embodiment, before the input data is input into the cyclic network unit in the neural network, a preprocessing convolution unit may also be used to perform a convolution operation on the input data to obtain a third operation result, and the third operation result is determined as the input data.
In one example, the preprocessing convolution unit may be a 1 × 1 convolution unit. In that case, the convolution operation may be performed on the input data by the 1 × 1 convolution unit, and the obtained third operation result used as the input data.
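For reference, a 1 × 1 convolution linearly mixes the input channels at each spatial position without touching neighboring positions. The pure-Python sketch below illustrates this; it is an assumption-laden software analogue, not the hardware preprocessing unit of this application, and the list-of-lists layout is chosen only for clarity.

```python
def conv1x1(feature_map, weights):
    """Minimal 1x1 convolution over a [C_in][H][W] feature map.

    `weights` is [C_out][C_in]; each output channel at each pixel is a
    weighted sum of the input channels at that same pixel. Illustrative
    only -- layouts and data types are assumptions.
    """
    c_in = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    return [
        [
            [
                sum(w_row[c] * feature_map[c][y][x] for c in range(c_in))
                for x in range(width)
            ]
            for y in range(height)
        ]
        for w_row in weights
    ]
```

Note that a 1 × 1 convolution can change the number of channels (C_in to C_out) while leaving the spatial size unchanged, which is why it is a natural preprocessing step before channel-dependent unit selection.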
Step 202, selecting a target convolution unit matching the number of channels from the n convolution units.
In this embodiment, one of the n convolution units may be selected and determined as the target convolution unit.
Taking the classification of the input data by a prior-art neural network as an example, the input data can be divided into first input data in network units 1-3 (i.e., Conv 1-3), second input data in network units 4-9 (i.e., Conv 4-9), and third input data in the remaining network units. The first input data has a large length and width and a small number of channels, and is suitable for processing by the 3 × 3 convolution unit; therefore, if the input data is the first input data, the target convolution unit may be the 3 × 3 convolution unit. The second input data has a large number of channels and is suitable for processing by the 5 × 5 convolution unit; if it were processed by the 3 × 3 convolution unit instead, an additional calculation unit would be needed to achieve the processing effect of the 5 × 5 convolution unit. Therefore, if the input data is the second input data, the target convolution unit may be the 5 × 5 convolution unit. The third input data is suitable for processing by the 1 × 1 convolution unit, so if the input data is the third input data, the target convolution unit may be the 1 × 1 convolution unit, as shown in fig. 3.
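The selection logic above might be sketched as follows. The numeric channel-count thresholds are illustrative assumptions only, since the application groups the data by network-unit position (Conv 1-3, Conv 4-9, remaining units) rather than by explicit numeric boundaries:

```python
def select_target_convolution(num_channels,
                              few_channels_max=128,
                              many_channels_max=512):
    """Pick a convolution unit size from the number of channels.

    The thresholds are hypothetical stand-ins for the grouping into
    first, second, and third input data described in the text.
    """
    if num_channels <= few_channels_max:
        return "3x3"   # first input data: large length/width, few channels
    if num_channels <= many_channels_max:
        return "5x5"   # second input data: many channels
    return "1x1"       # third input data: remaining network units
```
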
Referring to fig. 4, in a first implementation, the cyclic network unit further includes n first-in first-out (FIFO) queues, each FIFO queue being connected to one convolution unit. Before step 202, a target FIFO queue matching the number of channels may be selected from the n FIFO queues, and the input data is input into the target FIFO queue. Since each FIFO queue is connected to one convolution unit, step 202 may specifically be selecting, from the n convolution units, the target convolution unit connected to the target FIFO queue. That is, the convolution unit connected to the target FIFO queue is determined as the target convolution unit.
Since an error may occur when selecting the FIFO queue according to the number of channels, in order to improve the accuracy of selecting the target convolution unit, in this embodiment a secondary determination may be performed on the number of channels of the input data output from the target FIFO queue, so that an accurate target convolution unit is selected based on this secondary determination. In a second implementation, the number of channels of the input data output from the target FIFO queue may be obtained; if the number of channels matches the convolution unit connected to the target FIFO queue, that convolution unit is determined as the target convolution unit; if the number of channels does not match the convolution unit connected to the target FIFO queue, the predetermined convolution unit is determined as the target convolution unit. Here, the n FIFO queues are all connected to a predetermined convolution unit in the cyclic network unit. Taking the three convolution units above as an example, the predetermined convolution unit may be the 1 × 1 convolution unit.
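The secondary determination and its fallback to the predetermined unit might look like this. This is a hedged sketch: the `matches` predicate is an assumed placeholder for the channel-count comparison, and the string labels stand in for the hardware convolution units:

```python
def resolve_target_convolution(queue_conv, num_channels, matches,
                               predetermined_conv="1x1"):
    """Secondary determination on data leaving the target FIFO queue.

    `queue_conv` is the convolution unit wired to the target FIFO queue;
    `matches(num_channels, conv)` is an assumed predicate re-checking the
    channel count against that unit. On a mismatch, fall back to the
    predetermined convolution unit (here the 1x1 unit).
    """
    if matches(num_channels, queue_conv):
        return queue_conv
    return predetermined_conv
```

The fallback guarantees that every datum still reaches some convolution unit even when the first-stage queue selection was wrong.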
Step 203, inputting the input data into the target convolution unit.
Since the input data has already been input into the target FIFO queue, the input data output from the target FIFO queue may be input into the target convolution unit.
In this embodiment, when the cyclic network unit further includes n instruction control units and n data master control units, where each instruction control unit is connected to one FIFO queue and one data master control unit, and each data master control unit is connected to one convolution unit, inputting the input data output from the target FIFO queue into the target convolution unit may include: buffering the output data of the FIFO queue by using the instruction control unit; inputting the buffered output data into the data master control unit, where the rate of the buffered output data matches the processing rate of the data master control unit; providing an output window of a preset size for the output data by using a row-column controller in the data master control unit; and inputting the output data within the output window into the target convolution unit.
Referring to fig. 5, an instruction control unit and a data master control unit are connected between each FIFO queue and its convolution unit. The instruction control unit serves as a data interface that receives the data output by the FIFO queue; after the data output by the FIFO queue enters the instruction control unit, it is buffered once so that the rate of the data output by the FIFO queue is adapted to the rate at which the data master control unit receives data. The buffered data then enters the data master control unit, where a row-column controller provides output windows of corresponding sizes for data with different numbers of channels; finally, the processed data is output to each convolution unit through the output windows of the data master control unit. The number of channels and the size of the output window are positively correlated.
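A software analogue of the row-column controller and its channel-dependent window might be sketched as follows. Both the window anchoring convention and the size formula are assumptions: the text states only that a window of preset size is provided and that window size grows with the number of channels.

```python
def output_window(plane, row, col, size):
    """Return a size x size window of a 2-D data plane anchored at (row, col).

    Sketch of the row-column controller in the data master control unit;
    the top-left anchoring convention is an assumption.
    """
    return [r[col:col + size] for r in plane[row:row + size]]


def window_size_for_channels(num_channels, base=3, step=64, max_size=7):
    """Assumed positive correlation between channel count and window size.

    Grows the window by 2 for every `step` channels, capped at `max_size`;
    all three parameters are hypothetical.
    """
    return min(base + 2 * (num_channels // step), max_size)
```
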
Step 204, performing a convolution operation on the input data by using the target convolution unit, and determining the obtained first operation result as the output data of the cyclic network unit.
In this embodiment, if the input data corresponds to a pooling unit, the pooling unit performs a pooling operation on the output data, and the obtained second operation result is determined as the next input data of the cyclic network unit; if the input data does not correspond to a pooling unit, the output data itself is determined as the next input data of the cyclic network unit.
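The hand-off from one pass to the next can be sketched as below. The 2 × 2 max pooling is an illustrative choice, not the pooling unit specified by this application:

```python
def max_pool2x2(plane):
    """Illustrative 2x2 max pooling with stride 2 over a 2-D plane."""
    return [
        [
            max(plane[y][x], plane[y][x + 1],
                plane[y + 1][x], plane[y + 1][x + 1])
            for x in range(0, len(plane[0]) - 1, 2)
        ]
        for y in range(0, len(plane) - 1, 2)
    ]


def next_input(output_data, pooling_unit=None):
    """If a pooling unit corresponds to this pass, pool the output data;
    otherwise the output data itself becomes the next input data."""
    if pooling_unit is not None:
        return pooling_unit(output_data)
    return output_data
```
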
In summary, in the neural network-based data processing method provided in the embodiment of the present application, because the cyclic network unit in the neural network includes n convolution units of different sizes, a matching target convolution unit can be selected for the input data according to the number of channels of the input data, the target convolution unit is then used to perform a convolution operation on the input data, and the obtained first operation result is determined as the output data of the cyclic network unit. The output data can then be fed back into the cyclic network unit as new input data, so that data can be processed cyclically by a single cyclic network unit. In this way, the pipeline structure can be replaced by one cyclic network unit, which reduces the workload of creating the chip; the cyclic network unit is small in size, so the chip can accommodate the calculation process; and the cyclic network unit reduces the calculation time between different network layers, thereby improving data processing efficiency.
Referring to fig. 6, a block diagram of a neural network-based data processing apparatus provided in an embodiment of the present application is shown; the apparatus may be applied to a server. The neural network-based data processing apparatus may include:
an obtaining module 610, configured to obtain the number of channels of input data when the input data is input to a cyclic network unit in a neural network, where the input data is initial data input to the neural network, or the input data is data obtained according to previous output data of the cyclic network unit, the cyclic network unit includes n convolution units of different sizes, and n is greater than or equal to 2;
a selecting module 620, configured to select a target convolution unit matched with the number of channels from the n convolution units;
an input module 630, configured to input the input data into the target convolution unit;
the determining module 640 is configured to perform convolution operation on the input data by using the target convolution unit, and determine an obtained first operation result as output data of the cyclic network unit.
In one possible implementation, the cyclic network unit further includes n FIFO queues, each FIFO queue being connected to one convolution unit;
the selecting module 620 is further configured to select, from the n FIFO queues, a target FIFO queue matching the number of channels before selecting the target convolution unit matching the number of channels from the n convolution units;
the input module 630 is further configured to input the input data into the target FIFO queue;
the selecting module 620 is further configured to select, from the n convolution units, the target convolution unit connected to the target FIFO queue;
and the input module 630 is further configured to input the input data output from the target FIFO queue into the target convolution unit.
In one possible implementation, the n FIFO queues are further connected to a predetermined convolution unit in the cyclic network unit;
the obtaining module 610 is further configured to acquire the number of channels of the input data output from the target FIFO queue;
and the selecting module 620 is further configured to determine the predetermined convolution unit as the target convolution unit if the number of channels does not match the convolution unit connected to the target FIFO queue.
In a possible implementation, when the cyclic network unit further includes n instruction control units and n data master control units, each instruction control unit is connected to one FIFO queue and one data master control unit, and each data master control unit is connected to one convolution unit, the input module 630 is further configured to:
buffer the output data of the FIFO queue by using the instruction control unit;
input the buffered output data into the data master control unit, where the rate of the buffered output data matches the processing rate of the data master control unit;
provide an output window of a preset size for the output data by using a row-column controller in the data master control unit;
and input the output data within the output window into the target convolution unit.
In one possible implementation, the determining module 640 is further configured to:
if the input data corresponds to a pooling unit, perform a pooling operation on the output data by using the pooling unit, and determine the obtained second operation result as the next input data of the cyclic network unit;
and if the input data does not correspond to a pooling unit, determine the output data as the next input data of the cyclic network unit.
In one possible implementation, the apparatus further includes:
an operation module, configured to perform a convolution operation on the input data by using a preprocessing convolution unit before the input data is input into the cyclic network unit in the neural network, to obtain a third operation result; and determine the third operation result as the input data.
In one possible implementation, when n is 3, the convolution units include a 3 × 3 convolution unit, a 5 × 5 convolution unit, and a 1 × 1 convolution unit.
In summary, in the neural network-based data processing apparatus provided in the embodiment of the present application, because the cyclic network unit in the neural network includes n convolution units of different sizes, a matching target convolution unit can be selected for the input data according to the number of channels of the input data, the target convolution unit is then used to perform a convolution operation on the input data, and the obtained first operation result is determined as the output data of the cyclic network unit. The output data can then be fed back into the cyclic network unit as new input data, so that data can be processed cyclically by a single cyclic network unit. In this way, the pipeline structure can be replaced by one cyclic network unit, which reduces the workload of creating the chip; the cyclic network unit is small in size, so the chip can accommodate the calculation process; and the cyclic network unit reduces the calculation time between different network layers, thereby improving data processing efficiency.
One embodiment of the present application provides a computer-readable storage medium having stored therein at least one instruction, at least one program, code set, or set of instructions that is loaded and executed by a processor to implement a neural network-based data processing method as described above.
One embodiment of the present application provides a server, which includes a processor and a memory, where at least one instruction is stored in the memory, and the instruction is loaded and executed by the processor to implement the neural network-based data processing method described above.
It should be noted that, in the neural network-based data processing apparatus provided in the above embodiment, the division into the functional modules described above is merely illustrative; in practical applications, the functions may be allocated to different functional modules as needed, that is, the internal structure of the neural network-based data processing apparatus may be divided into different functional modules to complete all or part of the functions described above. In addition, the neural network-based data processing apparatus provided in the above embodiment and the neural network-based data processing method belong to the same concept; the specific implementation process is described in detail in the method embodiment and is not repeated here.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is not intended to limit the embodiments of the present application; any modifications, equivalent replacements, improvements, and the like made within the spirit and principles of the embodiments of the present application shall fall within the protection scope of the embodiments of the present application.