CN109190761A - Data processing method, device, equipment and storage medium - Google Patents

Data processing method, device, equipment and storage medium Download PDF

Info

Publication number
CN109190761A
CN109190761A (application CN201810887707.7A)
Authority
CN
China
Prior art keywords
data
input data
network
layer
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810887707.7A
Other languages
Chinese (zh)
Inventor
杨少雄
赵晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu Online Network Technology Beijing Co Ltd
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority: CN201810887707.7A (patent CN109190761A/en)
Publication: CN109190761A/en
Legal status: Pending

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present application provide a data processing method, apparatus, device, and storage medium. After the first input data of the n-th layer of a neural network is obtained, a target multi-order function corresponding to the n-th layer is determined based on a preset association between network layer numbers and multi-order functions; the first input data is processed with the target multi-order function to obtain the second input data of the n-th layer; and the n-th layer is trained on the second input data. The technical solution provided by the embodiments can reduce the feature loss in each layer's input data, increase the model's convergence rate, and improve model accuracy.

Description

Data processing method, device, equipment and storage medium
Technical field
Embodiments of the present application relate to the field of computer technology, and in particular to a data processing method, apparatus, device, and storage medium.
Background technique
In the prior art, the batch normalization (batch-norm) algorithm is used to address the vanishing-gradient problem that arises as data propagates through a neural network. It does two things: first, it normalizes the input data of each layer so that every layer's input follows a preset data distribution; second, it applies a linear transform to the normalized input data so that each layer's input remains differentiated, which helps the model converge. However, the linear transform currently used by batch-norm consists only of scaling and shifting, which causes the input data to lose many features; this hinders early convergence and also reduces the accuracy of the trained model.
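For orientation, the prior-art batch-norm pipeline criticized above can be sketched as follows. This is a minimal illustration, not the patent's method; the function name `batch_norm` and the default parameter values are chosen for this example.

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Prior-art batch normalization: normalize each feature, then apply
    the first-order linear transform gamma * x_hat + beta (scaling and
    shifting only) -- the step the patent proposes to replace."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)  # step 1: normalization
    return gamma * x_hat + beta              # step 2: scale and shift

batch = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
out = batch_norm(batch)
```

With the default `gamma` and `beta` the output has zero mean per feature; in a real network these would be learned per channel.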
Summary of the invention
Embodiments of the present application provide a data processing method, apparatus, device, and storage medium to reduce the feature loss in each layer's input data of a neural network.
A first aspect of the embodiments provides a data processing method, comprising: after obtaining the first input data of the n-th layer of a neural network, determining the target multi-order function corresponding to the n-th layer based on a preset association between network layer numbers and multi-order functions; processing the first input data with the target multi-order function to obtain the second input data of the n-th layer; and training the n-th layer based on the second input data, where n is a positive integer.
A second aspect of the embodiments provides a data processing apparatus, comprising: a determining module, configured to determine the target multi-order function corresponding to the n-th layer of a neural network, based on the preset association between network layer numbers and multi-order functions, after the first input data of the n-th layer is obtained; a processing module, configured to process the first input data with the target multi-order function to obtain the second input data of the n-th layer; and a training module, configured to train the n-th layer based on the second input data, where n is a positive integer.
A third aspect of the embodiments provides a computer device, comprising: one or more processors; and a storage apparatus for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method of the first aspect.
A fourth aspect of the embodiments provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method of the first aspect.
Based on the above aspects, after the first input data of the n-th layer of a neural network is obtained, the embodiments determine the target multi-order function corresponding to the n-th layer from the preset association between layer numbers and multi-order functions, process the first input data with that function to obtain the second input data of the n-th layer, and train the n-th layer on the second input data. Because a multi-order function, rather than the first-order linear function of the prior art, is used to process each layer's input data, more data details and features are retained after processing, which improves the model's convergence rate and training efficiency and increases its accuracy.
It should be appreciated that the content of this summary is not intended to identify key or essential features of the embodiments, nor to limit the scope of the application. Other features of the present application will become readily apparent from the description below.
Detailed description of the invention
Fig. 1 is a schematic diagram of a prior-art scenario for processing the input data of each layer of a neural network;
Fig. 2 is a schematic diagram of a scenario, provided by an embodiment of the present application, for processing the input data of each layer of a neural network;
Fig. 3 is a flowchart of a data processing method provided by an embodiment of the present application;
Fig. 4 is a flowchart of one way of performing step S13, provided by an embodiment of the present application;
Fig. 5 is a schematic structural diagram of a data processing apparatus provided by an embodiment of the present application;
Fig. 6 is a schematic structural diagram of a training module 33 provided by an embodiment of the present application.
Specific embodiment
Embodiments of the application are described more fully below with reference to the accompanying drawings. Although certain embodiments are shown in the drawings, it should be understood that the application may be embodied in various forms and should not be construed as limited to the embodiments set forth here; rather, these embodiments are provided so that the application will be understood thoroughly and completely. It should be understood that the drawings and embodiments are for illustration only and are not intended to limit the protection scope of the application.
The terms "first", "second", "third", "fourth", and so on (if present) in the specification, claims, and drawings are used to distinguish similar objects and do not describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments described here can, for example, be practiced in orders other than those illustrated or described. Moreover, the terms "comprise" and "have" and any variations thereof are intended to cover non-exclusive inclusion: a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to those expressly listed, but may include other steps or units not expressly listed or inherent to that process, method, product, or device.
Fig. 1 is a schematic diagram of a prior-art scenario for processing the input data of each layer of a neural network. In Fig. 1, b0 denotes the input data and b1 denotes a certain layer of the neural network. As shown in Fig. 1, before the input data b0 enters b1 it must pass through normalization and a first-order linear transform, so that each layer's input stays differentiated. When the input data is transformed linearly, the prior art generally uses scaling and shifting; however, scaling and shifting cause the input data to lose many features, which hinders early convergence and also reduces the accuracy of the model.
Fig. 2 is a schematic diagram of a scenario, provided by an embodiment of the present application, for processing the input data of each layer of a neural network. In Fig. 2, a0 denotes the input data and a1 denotes a certain layer of the network. As shown in Fig. 2, before the input data a0 enters a1 it passes through normalization and a multi-order transform. The normalization here is similar to the prior art, but the embodiment processes the input data with a multi-order function, which not only preserves the differentiation between each layer's input data but also ensures that the features of the input data are lost not at all or only slightly. Training a network layer on input data processed by a multi-order function therefore lets the layer converge sooner, improving the model's training efficiency and accuracy. Fig. 2 is, of course, merely illustrative and not a unique limitation on the application.
The technical solution is explained in detail below with reference to exemplary embodiments:
Fig. 3 is a flowchart of a data processing method provided by an embodiment of the present application. The method may be performed by a data processing apparatus. As shown in Fig. 3, the method includes S11-S13:
S11: after the first input data of the n-th layer of a neural network is obtained, determine the target multi-order function corresponding to the n-th layer based on a preset association between network layer numbers and multi-order functions, where n is a positive integer.
The name "target multi-order function" in this embodiment serves only to distinguish the multi-order function corresponding to the n-th layer from the multi-order functions corresponding to other layers; it carries no other meaning.
The association between layer numbers and multi-order functions in this embodiment can be set as needed. For example, in one possible design, the expressions of the multi-order functions corresponding to different layers differ; in that case the orders of the functions may be the same or different across layers. In another possible design, the expressions of the multi-order functions are the same for all layers but their orders differ; for instance, when the expressions are the same, the order of each layer's function can be computed from a preset functional relation between the layer number and the function order. These two designs are, of course, only illustrative and do not uniquely limit the application.
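One way to realize the second design, in which the expression is shared and the order is computed from the layer number, can be sketched as follows. The specific relation `order = 2 + (n % 3)` is purely an assumption for illustration, since the patent leaves the concrete association open.

```python
def target_polynomial(n, c=0.0):
    """Return a hypothetical target multi-order function for layer n.
    The order is derived from the layer number by an assumed relation."""
    order = 2 + (n % 3)  # assumed layer-number -> order relation
    def f(x):
        # polynomial x^order + ... + x^2 + x + c, applied element-wise
        return sum(x ** k for k in range(1, order + 1)) + c
    return f

f1 = target_polynomial(1)  # order 3: x^3 + x^2 + x + c
```

Any other monotone relation between layer number and order would fit the same design; only the lookup-by-layer-number structure is essential.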
The first input data in this embodiment may be input data that has already been normalized, or input data that has not. When the first input data has not been normalized, this embodiment may further include a step of normalizing it after it is obtained; the specific normalization method follows the prior art and is not repeated here.
S12: process the first input data with the target multi-order function to obtain the second input data of the n-th layer.
As an example, assume the expression of the multi-order function corresponding to the n-th layer is:
x³ + x² + x + c
where c is a constant and x is the input data. Substituting the first input data into this expression yields the second input data of the n-th layer.
This is, of course, only an exemplary illustration and not a unique limitation on the application.
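Substituting concrete values into the example expression above can be sketched as follows; the constant `c = 0.1` and the sample first input data are assumptions for illustration only.

```python
import numpy as np

def apply_polynomial(x, c=0.1):
    """Apply the example third-order function x^3 + x^2 + x + c
    element-wise to the (already normalized) first input data."""
    return x ** 3 + x ** 2 + x + c

first_input = np.array([0.5, -0.5, 1.0])      # assumed first input data
second_input = apply_polynomial(first_input)  # data fed to the n-th layer
```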
S13: train the n-th layer based on the second input data.
The method of training the n-th layer on the second input data follows the prior art and is not repeated in this embodiment.
In this embodiment, after the first input data of the n-th layer of a neural network is obtained, the target multi-order function corresponding to the n-th layer is determined from the preset association between layer numbers and multi-order functions; the first input data is processed with that function to obtain the second input data of the n-th layer; and the n-th layer is trained on the second input data. Because a multi-order function, rather than the first-order linear function of the prior art, is used to process each layer's input data, more data details and features are retained after processing, which improves the model's convergence rate and training efficiency and increases its accuracy.
The embodiment above is further extended and optimized below with reference to exemplary embodiments:
Fig. 4 is a flowchart of one way of performing step S13, provided by an embodiment of the present application. As shown in Fig. 4, on the basis of the embodiment of Fig. 3, step S13 may include sub-steps S21-S22:
S21: perform preset data augmentation on the second input data to obtain augmented data.
The data augmentation in this embodiment includes at least one of the following: GAN-based data augmentation, weighted summation between data, linear compression of data, and linear translation of data.
When augmented data is obtained by weighted summation, the optional operations include the following:
In one possible design, weighted summation is performed within the second input data. The data taking part in the weighted sum can be drawn at random from the second input data; the number of summed data points can be preset or set at random; and the weight of each data point can be chosen at random, or all summed data points can share the same weight. For example, when five data points are summed, the weight of each can be set to 1/5. This is, of course, merely illustrative and not a unique limitation on the application.
In another possible design, data from the first input data can be weighted-summed with data from the second input data to obtain augmented data. As in the previous design, the number of summed data points can be preset or set at random. However, to preserve the differentiation between the inputs of adjacent network layers, in this design the largest weight in the sum is preferably always assigned to a data point from the second input data.
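The two weighted-sum designs can be sketched together as follows. The sample sizes, the equal weights 1/k, and the specific weight `w_second = 0.6` giving the second-input row the largest share are illustrative assumptions, not values prescribed by the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment_within(second, k=5, num_new=10):
    """First design: weighted sums of k rows drawn at random from the
    second input data alone, with equal weights 1/k."""
    rows = [second[rng.choice(len(second), size=k, replace=False)].mean(axis=0)
            for _ in range(num_new)]
    return np.stack(rows)

def augment_mixed(first, second, num_new=10, w_second=0.6):
    """Second design: mix a row of the first input data with a row of
    the second input data; the second-input row keeps the larger weight."""
    rows = [w_second * second[rng.integers(len(second))]
            + (1.0 - w_second) * first[rng.integers(len(first))]
            for _ in range(num_new)]
    return np.stack(rows)

first = rng.normal(size=(20, 4))
second = rng.normal(size=(20, 4))
aug1 = augment_within(second)
aug2 = augment_mixed(first, second)
```

Either function yields extra rows that can be appended to the second input data before the training sub-step S22.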
S22: train the n-th layer based on the second input data and the augmented data.
By performing preset data augmentation on the second input data to obtain augmented data, and training the n-th layer on both the second input data and the augmented data, this embodiment ensures that each layer has sufficient data for training, improving the accuracy of each layer and of the whole neural network.
Fig. 5 is a schematic structural diagram of a data processing apparatus provided by an embodiment of the present application. As shown in Fig. 5, the apparatus 30 includes:
a determining module 31, configured to determine the target multi-order function corresponding to the n-th layer of a neural network, based on the preset association between network layer numbers and multi-order functions, after the first input data of the n-th layer is obtained;
a processing module 32, configured to process the first input data with the target multi-order function to obtain the second input data of the n-th layer;
a training module 33, configured to train the n-th layer based on the second input data;
where n is a positive integer.
Optionally, in the association, the expressions of the multi-order functions corresponding to different network layers differ.
Optionally, in the association, the orders of the multi-order functions corresponding to different network layers differ.
The apparatus of this embodiment can be used to perform the method of the embodiment of Fig. 3; its operation and benefits are similar and are not repeated here.
Fig. 6 is a schematic structural diagram of the training module 33 provided by an embodiment of the present application. As shown in Fig. 6, on the basis of the embodiment of Fig. 5, the training module 33 includes:
a data augmentation submodule 331, configured to perform preset data augmentation on the second input data to obtain augmented data;
a training submodule 332, configured to train the n-th layer based on the second input data and the augmented data.
Optionally, the preset data augmentation includes at least one of the following: GAN-based data augmentation, weighted summation between data, linear compression of data, and linear translation of data.
The apparatus of this embodiment can be used to perform the method of the embodiment of Fig. 4; its operation and benefits are similar and are not repeated here.
An embodiment of the present application also provides a computer device, comprising: one or more processors; and a storage apparatus for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method of any of the embodiments above.
In this embodiment, after the first input data of the n-th layer of a neural network is obtained, the target multi-order function corresponding to the n-th layer is determined from the preset association between layer numbers and multi-order functions; the first input data is processed with that function to obtain the second input data of the n-th layer; and the n-th layer is trained on the second input data. Because a multi-order function, rather than the first-order linear function of the prior art, is used to process each layer's input data, more data details and features are retained after processing, which improves the model's convergence rate and training efficiency and increases its accuracy.
An embodiment of the present application also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method of any of the embodiments above.
In this embodiment, after the first input data of the n-th layer of a neural network is obtained, the target multi-order function corresponding to the n-th layer is determined from the preset association between layer numbers and multi-order functions; the first input data is processed with that function to obtain the second input data of the n-th layer; and the n-th layer is trained on the second input data. Because a multi-order function, rather than the first-order linear function of the prior art, is used to process each layer's input data, more data details and features are retained after processing, which improves the model's convergence rate and training efficiency and increases its accuracy.
The functions described herein may be performed, at least in part, by one or more hardware logic components. For example, and without limitation, exemplary types of hardware logic components that can be used include: field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems-on-chip (SOCs), complex programmable logic devices (CPLDs), and so forth.
Program code for implementing the disclosed methods may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus, such that when executed by the processor or controller, the functions/operations specified in the flowcharts and/or block diagrams are carried out. The program code may execute entirely on a machine, partly on a machine, partly on a machine and partly on a remote machine as a stand-alone software package, or entirely on a remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium include an electrical connection based on one or more wires, a portable computer disk, a hard disk, random-access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact-disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
Moreover, although the operations are depicted in a particular order, this should not be understood as requiring that they be performed in the order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although the discussion above contains several specific implementation details, these should not be construed as limitations on the scope of the disclosure. Certain features described in the context of separate embodiments can also be implemented in combination in a single implementation; conversely, various features described in the context of a single implementation can also be implemented in multiple implementations separately or in any suitable sub-combination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are only example forms of implementing the claims.

Claims (12)

1. A data processing method, comprising:
after obtaining first input data of an n-th layer of a neural network, determining a target multi-order function corresponding to the n-th layer based on a preset association between network layer numbers and multi-order functions;
processing the first input data with the target multi-order function to obtain second input data of the n-th layer; and
training the n-th layer based on the second input data;
wherein n is a positive integer.
2. The method according to claim 1, wherein, in the association, the expressions of the multi-order functions corresponding to different network layers differ.
3. The method according to claim 2, wherein, in the association, the orders of the multi-order functions corresponding to different network layers differ.
4. The method according to any one of claims 1-3, wherein training the n-th layer based on the second input data comprises:
performing preset data augmentation on the second input data to obtain augmented data; and
training the n-th layer based on the second input data and the augmented data.
5. The method according to claim 4, wherein the preset data augmentation includes at least one of: GAN-based data augmentation, weighted summation between data, linear compression of data, and linear translation of data.
6. A data processing apparatus, comprising:
a determining module, configured to determine a target multi-order function corresponding to an n-th layer of a neural network, based on a preset association between network layer numbers and multi-order functions, after first input data of the n-th layer is obtained;
a processing module, configured to process the first input data with the target multi-order function to obtain second input data of the n-th layer; and
a training module, configured to train the n-th layer based on the second input data;
wherein n is a positive integer.
7. The apparatus according to claim 6, wherein, in the association, the expressions of the multi-order functions corresponding to different network layers differ.
8. The apparatus according to claim 7, wherein, in the association, the orders of the multi-order functions corresponding to different network layers differ.
9. The apparatus according to any one of claims 6-8, wherein the training module comprises:
a data augmentation submodule, configured to perform preset data augmentation on the second input data to obtain augmented data; and
a training submodule, configured to train the n-th layer based on the second input data and the augmented data.
10. The apparatus according to claim 9, wherein the preset data augmentation includes at least one of: GAN-based data augmentation, weighted summation between data, linear compression of data, and linear translation of data.
11. A computer device, comprising:
one or more processors; and
a storage apparatus for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-5.
12. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the method according to any one of claims 1-5.
CN201810887707.7A 2018-08-06 2018-08-06 Data processing method, device, equipment and storage medium Pending CN109190761A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810887707.7A CN109190761A (en) 2018-08-06 2018-08-06 Data processing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810887707.7A CN109190761A (en) 2018-08-06 2018-08-06 Data processing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN109190761A true CN109190761A (en) 2019-01-11

Family

ID=64920704

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810887707.7A Pending CN109190761A (en) 2018-08-06 2018-08-06 Data processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN109190761A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113162960A (en) * 2020-01-23 2021-07-23 华为技术有限公司 Data processing method, device, equipment and medium


Similar Documents

Publication Publication Date Title
Evci et al. Rigging the lottery: Making all tickets winners
Gomez et al. Learning sparse networks using targeted dropout
CN109543830B (en) Splitting accumulator for convolutional neural network accelerator
US20190340510A1 (en) Sparsifying neural network models
WO2022068623A1 (en) Model training method and related device
US11531888B2 (en) Method, device and computer program for creating a deep neural network
CN107610146A (en) Image scene segmentation method, apparatus, computing device and computer-readable storage medium
CN113379042B (en) Business prediction model training method and device for protecting data privacy
CN112639833A (en) Adaptable neural network
KR20200072588A (en) deep-learning model learning apparatus based CNN and method therefor
Cai et al. Training low bitwidth convolutional neural network on RRAM
CN114490065A (en) Load prediction method, device and equipment
CN113541985B (en) Internet of things fault diagnosis method, model training method and related devices
Yassin et al. Effect of swarm size parameter on Binary Particle Swarm optimization-based NARX structure selection
Smutnicki et al. Very fast non-dominated sorting
CN109190761A (en) Data processing method, device, equipment and storage medium
CN116166967B (en) Data processing method, equipment and storage medium based on meta learning and residual error network
CN107292320A (en) System and its index optimization method and device
CN115496993B (en) Target detection method, device, equipment and storage medium based on frequency domain fusion
CN105117330B (en) CNN code test methods and device
US10853691B1 (en) Neural network architecture
CN114372539B (en) Machine learning framework-based classification method and related equipment
WO2020011936A1 (en) Hierarchical parallelism in a network of distributed neural network cores
CN109388784A (en) Minimum entropy Density Estimator device generation method, device and computer readable storage medium
CN117063182A (en) Data processing method and device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination