Detailed Description
The disclosure is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the related invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts relevant to the related invention.
It should be noted that, where no conflict arises, the embodiments of the disclosure and the features in those embodiments may be combined with one another. The disclosure is described in detail below with reference to the drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the model generation method or model generation apparatus of the disclosure may be applied.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102 and 103, a network 104 and a server 105. The network 104 is the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links or fiber-optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 over the network 104, for example to receive or send messages. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as model generation applications, conversation applications, live-streaming applications, search applications, instant messaging tools, mailbox clients and social platform software.
The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices with a communication function, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers and desktop computers. When the terminal devices 101, 102, 103 are software, they may be installed in the electronic devices listed above. They may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. No specific limitation is imposed here.
The server 105 may be a server that provides various services, for example a backend server that supports the model generation applications on the terminal devices 101, 102, 103. A terminal device may package some parameters used for model generation (such as training sample data) into a model generation request and send the request to the backend server. The backend server may analyze and otherwise process the received model generation request and other data, and feed the processing result (for example, the various parameters of the model) back to the terminal device.
It should be noted that the model generation method provided by embodiments of the disclosure is generally executed by the server 105; accordingly, the model generation apparatus is generally provided in the server 105. Optionally, the model generation method provided by embodiments of the disclosure may also be executed by the terminal devices 101, 102, 103.
It should be noted that the server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster composed of multiple servers or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. No specific limitation is imposed here.
It should be understood that the numbers of terminal devices, networks and servers in Fig. 1 are merely illustrative. There may be any number of terminal devices, networks and servers, as the implementation requires.
Referring to Fig. 2, it shows a flow 200 of one embodiment of the model generation method. This embodiment is described mainly by taking as an example the application of the method to an electronic device with a certain computing capability; the electronic device may be the server shown in Fig. 1. The model generation method includes the following steps:
Step 201, obtaining training sample data.
In this embodiment, the execution body of the model generation method (for example, the server shown in Fig. 1) may obtain training sample data.
Here, the training sample data may be used to train a model to be trained, so as to generate a new model.
In this embodiment, the model to be trained may be an untrained neural network or a neural network whose training has not been completed. Here, a neural network refers to an artificial neural network. Common neural networks include, for example, deep neural networks (Deep Neural Network, DNN), convolutional neural networks (Convolutional Neural Network, CNN) and recurrent neural networks (Recurrent Neural Network, RNN).
Optionally, the network structure of the model to be trained may be preset. For example, it is necessary to specify which layers the neural network includes, the connection order between layers, which neurons each layer includes, the weight and bias term (bias) of each neuron, the activation function of each layer, and so on. The network structure of the model to be trained may be represented by various network parameters, which may include but are not limited to weights and bias terms.
As an example, when the model to be trained is a deep convolutional neural network, because a deep convolutional neural network is a multilayer neural network, it is necessary to determine which layers it includes (for example, convolutional layers, pooling layers, fully connected layers and a classifier), the connection order between layers, and which network parameters each layer includes (for example, weights, bias terms and the convolution stride). The convolutional layers may be used to extract image features. For each convolutional layer, one may determine the number of convolution kernels, the size of each kernel, the weight of each neuron in each kernel, the bias term corresponding to each kernel, the stride between two adjacent convolutions, and so on.
Step 202, in a forward propagation process based on the training sample data and the model to be trained, performing calculation using data of a first precision type to obtain an actual output of the first precision type.
In this embodiment, the execution body may, in the forward propagation process based on the training sample data and the model to be trained, perform calculation using data of the first precision type to obtain an actual output of the first precision type.
In this embodiment, the model training process may be computed with floating-point data. By precision, floating-point data may be divided into the following types: the half-precision type, the single-precision type and the double-precision type. In general, 16-bit floating-point data belongs to the half-precision type, 32-bit floating-point data to the single-precision type, and 64-bit floating-point data to the double-precision type.
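The three precision types can be illustrated with a minimal Python sketch. It assumes the IEEE 754 binary16, binary32 and binary64 encodings, which the `struct` module exposes through the format codes `e`, `f` and `d`; the helper names `round_to` and `bit_width` are hypothetical and chosen only for this illustration.

```python
import struct

# Assumed mapping from precision type to a little-endian struct format code
# (IEEE 754 binary16, binary32 and binary64 respectively).
FORMATS = {"half": "<e", "single": "<f", "double": "<d"}


def round_to(value, precision):
    """Round a Python float to the nearest value representable in `precision`."""
    fmt = FORMATS[precision]
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def bit_width(precision):
    """Number of bits occupied by one value of the given precision type."""
    return struct.calcsize(FORMATS[precision]) * 8


for name in FORMATS:
    # The same decimal constant is stored with a different rounding error
    # in each precision type.
    print(name, bit_width(name), round_to(0.1, name))
```

The lower the precision, the larger the rounding error of a stored value, which is the trade-off between speed and accuracy that the later embodiments rely on.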
In this embodiment, the training sample data is fed into the model to be trained, and the actual output is then obtained at the output layer of the model to be trained; this process may be called forward propagation. The output-layer error is determined from the target output and the actual output of the model to be trained.
In this embodiment, performing calculation using data of the first precision type means that the data participating in the calculation is of the first precision type; that is, both the training sample data and the network parameters of the model to be trained that participate in the calculation are of the first precision type. If the obtained training sample data or the network parameters of the model to be trained are not of the first precision type, they may be converted to the first precision type before the forward propagation calculation is performed.
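As a minimal sketch of this step, the following Python code converts the samples and network parameters to the first precision type and then runs the forward pass of a single linear neuron. The neuron, the helper `cast` and the function `forward` are assumptions made for illustration only; low-precision arithmetic is emulated by rounding after every operation with the standard `struct` module.

```python
import struct

# Assumed mapping from precision type to an IEEE 754 struct format code.
FORMATS = {"half": "<e", "single": "<f", "double": "<d"}


def cast(value, precision):
    """Round a float to the nearest value representable in `precision`."""
    fmt = FORMATS[precision]
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def forward(samples, weights, bias, precision):
    """Forward pass of one linear neuron, y = w . x + b, in `precision`."""
    # Convert the training samples and the network parameters to the
    # first precision type before any calculation takes place.
    xs = [[cast(x, precision) for x in row] for row in samples]
    ws = [cast(w, precision) for w in weights]
    outputs = []
    for row in xs:
        acc = cast(bias, precision)
        for x, w in zip(row, ws):
            # Round after each multiply and each add to emulate arithmetic
            # carried out entirely in the first precision type.
            acc = cast(acc + cast(x * w, precision), precision)
        outputs.append(acc)
    return outputs


print(forward([[1.0, 2.0]], [0.5, 0.25], 0.0, "half"))
```

The returned actual output is itself of the first precision type, because every intermediate result is rounded to that type.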
Step 203, in a backpropagation process based on the actual output and the model to be trained, performing calculation using data of a second precision type.
In this embodiment, the execution body may, in the backpropagation process based on the actual output and the model to be trained, perform calculation using data of the second precision type. The network parameters of the model to be trained can thereby be updated, so that a new model is generated on the basis of the model to be trained.
In this embodiment, the output-layer error value is used to propagate the error backward and thereby adjust the network parameters of the model to be trained; this process may be called backpropagation. As an example, the backpropagation algorithm (Back Propagation Algorithm, BP algorithm) and a gradient descent method (for example, stochastic gradient descent) may be used to adjust the network parameters of the model to be trained.
In this embodiment, performing calculation using data of the second precision type means that the data participating in the calculation is of the second precision type; that is, both the actual output and the network parameters of the model to be trained that participate in the calculation are of the second precision type. If the obtained actual output or the network parameters of the model to be trained are not of the second precision type, they may be converted to the second precision type before the backpropagation calculation is performed.
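A minimal sketch of the backward step is given below, again for a single linear neuron. The mean squared-error loss, the gradient-descent update, the learning rate `lr` and the names `cast` and `backward` are assumptions introduced for illustration; the disclosure itself only requires that the calculation use data of the second precision type.

```python
import struct

# Assumed mapping from precision type to an IEEE 754 struct format code.
FORMATS = {"half": "<e", "single": "<f", "double": "<d"}


def cast(value, precision):
    """Round a float to the nearest value representable in `precision`."""
    fmt = FORMATS[precision]
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def backward(samples, outputs, targets, weights, bias, precision, lr=0.1):
    """One gradient-descent update of y = w . x + b, computed in `precision`."""
    # Convert the actual output and the network parameters to the second
    # precision type before the backward calculation.
    ys = [cast(y, precision) for y in outputs]
    ws = [cast(w, precision) for w in weights]
    b = cast(bias, precision)
    n = len(samples)
    grad_w = [0.0] * len(ws)
    grad_b = 0.0
    for row, y, t in zip(samples, ys, targets):
        # Output-layer error: actual output minus target output.
        err = cast(y - cast(t, precision), precision)
        for i, x in enumerate(row):
            term = cast(err * cast(x, precision), precision)
            grad_w[i] = cast(grad_w[i] + term, precision)
        grad_b = cast(grad_b + err, precision)
    # Gradient of the mean squared error, applied as a descent step.
    new_ws = [cast(w - cast(lr * g / n, precision), precision)
              for w, g in zip(ws, grad_w)]
    new_b = cast(b - cast(lr * grad_b / n, precision), precision)
    return new_ws, new_b


print(backward([[1.0]], [1.0], [0.0], [1.0], 0.0, "single", lr=0.5))
```

The updated parameters come out in the second precision type, so a higher second precision keeps small gradient contributions that a lower precision would round away.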
Here, the first precision type is different from the second precision type.
It should be noted that the prior art performs calculation with data of the same precision type in both forward propagation and backpropagation. In the disclosure, forward propagation and backpropagation use data of different precision types. Accordingly, the technical effects may include at least the following:
First, a new way of generating a model is provided.
Second, the model training process is divided into two parts, one of which is computed with higher-precision data and the other with lower-precision data, which can both increase the speed of model training and preserve its accuracy.
With continued reference to Fig. 3, Fig. 3 is a schematic diagram of an application scenario of the model generation method according to the embodiment shown in Fig. 2. In the application scenario of Fig. 3:
First, the server 301 may obtain training sample data.
Next, the server 301 may, in the forward propagation process based on the training sample data and the model to be trained, perform calculation using data of the first precision type to obtain an actual output of the first precision type.
Then, the server 301 may, in the backpropagation process based on the actual output and the model to be trained, perform calculation using data of the second precision type. The network parameters of the model to be trained can thereby be updated to obtain updated network parameters, so that a new model is generated. Here, the first precision type is different from the second precision type.
In the method provided by the above embodiment of the disclosure, during model training the forward propagation process is computed with data of the first precision type and the backpropagation process with data of the second precision type, the first precision type being different from the second precision type. The network parameters of the model to be trained can thereby be updated to generate a new model. The technical effects may include at least providing a new way of generating a model.
In some embodiments, the first precision type or the second precision type is the half-precision type. That is, either of the following two modes applies:
In the first mode, the first precision type is the half-precision type, and the second precision type is either of the following: the single-precision type or the double-precision type.
In the second mode, the second precision type is the half-precision type, and the first precision type is either of the following: the single-precision type or the double-precision type.
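The two modes can be enumerated with a short Python sketch. The dictionary `PRECISION_BITS` and the function `half_precision_pairs` are hypothetical names used only to list the (first precision type, second precision type) combinations that the two modes allow.

```python
from itertools import product

# Bit widths of the three precision types named in this description.
PRECISION_BITS = {"half": 16, "single": 32, "double": 64}


def half_precision_pairs():
    """Return (first, second) pairs in which exactly one side is half precision."""
    return [
        (first, second)
        for first, second in product(PRECISION_BITS, repeat=2)
        # The two precision types must differ, and one of them must be half.
        if first != second and "half" in (first, second)
    ]


for first, second in half_precision_pairs():
    print("forward:", first, "backward:", second)
```

Pairs whose first element is `half` correspond to the first mode, and pairs whose second element is `half` correspond to the second mode.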
It should be noted that the prior art generally holds that data of the half-precision type is suitable for transmission (it transmits quickly) but unsuitable for calculation (its precision is insufficient). In some implementations of the disclosure, the inventors conceived of dividing model training into two parts, performing one part of the calculation (forward propagation or backpropagation) with data of the half-precision type and the other part with data of higher precision. This overcomes the technical prejudice that data of the half-precision type is unsuitable for calculation and achieves the following: one part of the calculation can be performed with half-precision data, which increases calculation speed, while the other part is performed with higher-precision data, which preserves the accuracy of model training.
In some embodiments, the precision indicated by the first precision type (which may be called the first precision) is lower than the precision indicated by the second precision type (which may be called the second precision).
It should be noted that when the first precision is lower than the second precision, the forward propagation process uses the lower precision and the backpropagation process uses the higher precision. Calculation speed can thus be increased in the forward propagation process, and the accuracy of the updated network parameters can be improved in the backpropagation process. This both increases the speed of model training and preserves its accuracy.
In some embodiments, the precision indicated by the first precision type is higher than the precision indicated by the second precision type.
It should be noted that when the first precision is higher than the second precision, the forward propagation process uses the higher precision and the backpropagation process uses the lower precision. The accuracy of the calculation can thus be preserved in the forward propagation process, and calculation speed can be increased in the backpropagation process. This both preserves the accuracy of model training and increases its speed.
With further reference to Fig. 4, it shows a flow 400 of another embodiment of the model generation method. The flow 400 of the model generation method includes the following steps:
Step 401, obtaining training sample data and network parameters of a model to be trained.
In this embodiment, the execution body of the model generation method (for example, the server shown in Fig. 1) may obtain training sample data and the network parameters of the model to be trained.
Here, for implementation details of step 401, reference may be made to the description of step 201, which is not repeated here.
Step 402, in response to determining that the training sample data is not data of the first precision type, converting the training sample data to data of the first precision type to generate first training sample data.
In this embodiment, the execution body may first determine whether the training sample data is of the first precision type and, if not, convert the training sample data to data of the first precision type, thereby generating the first training sample data.
Step 403, in response to determining that the network parameters of the model to be trained are not data of the first precision type, converting the network parameters to data of the first precision type to generate first network parameters.
In this embodiment, the execution body may first determine whether the network parameters are of the first precision type and, if not, convert the network parameters to data of the first precision type, thereby generating the first network parameters.
Here, the order in which step 402 and step 403 are executed is not limited.
Step 404, performing forward propagation calculation using the first training sample data and the first network parameters to obtain an actual output of the first precision type.
In this embodiment, the execution body may perform forward propagation calculation using the first training sample data and the first network parameters to obtain an actual output of the first precision type.
Step 405, converting the actual output from the first precision type to the second precision type.
In this embodiment, the execution body may convert the actual output from the first precision type to the second precision type.
Step 406, in response to determining that the network parameters of the model to be trained are not data of the second precision type, converting the network parameters to data of the second precision type to generate second network parameters.
In this embodiment, the execution body may first determine whether the network parameters are of the second precision type and, if not, convert the network parameters to data of the second precision type, thereby generating the second network parameters.
Step 407, performing backpropagation calculation according to the actual output of the second precision type and the second network parameters, so as to update the second network parameters.
In this embodiment, the execution body may perform backpropagation calculation according to the actual output of the second precision type and the second network parameters, so as to update the second network parameters. Model training can thereby be carried out to generate a new model.
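Steps 402 to 407 can be sketched end to end in Python as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the model is a single linear neuron without a bias term, the loss is the mean squared error, the defaults (half precision forward, single precision backward, learning rate 0.1) are arbitrary, and `cast`, `cast_all` and `train_step` are hypothetical names. Precision conversion is emulated with the standard `struct` module.

```python
import struct

# Assumed mapping from precision type to an IEEE 754 struct format code.
FORMATS = {"half": "<e", "single": "<f", "double": "<d"}


def cast(value, precision):
    """Round a float to the nearest value representable in `precision`."""
    fmt = FORMATS[precision]
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def cast_all(values, precision):
    return [cast(v, precision) for v in values]


def train_step(xs, targets, weights, first="half", second="single", lr=0.1):
    """One pass through steps 402-407 for a linear neuron without bias."""
    # Steps 402-403: convert samples and parameters to the first precision type.
    xs1 = [cast_all(row, first) for row in xs]
    w1 = cast_all(weights, first)
    # Step 404: forward propagation, rounding every result to the first type.
    outputs = []
    for row in xs1:
        acc = 0.0
        for x, w in zip(row, w1):
            acc = cast(acc + cast(x * w, first), first)
        outputs.append(acc)
    # Step 405: convert the actual output to the second precision type.
    outputs2 = cast_all(outputs, second)
    # Step 406: convert the network parameters to the second precision type.
    w2 = cast_all(weights, second)
    # Step 407: backpropagation and parameter update in the second type.
    grads = [0.0] * len(w2)
    for row, y, t in zip(xs, outputs2, targets):
        err = cast(y - cast(t, second), second)
        for i, x in enumerate(row):
            term = cast(err * cast(x, second), second)
            grads[i] = cast(grads[i] + term, second)
    n = len(xs)
    return [cast(w - cast(lr * g / n, second), second)
            for w, g in zip(w2, grads)]


# Usage: fit y = 2x from three samples, starting from a zero weight.
xs = [[1.0], [2.0], [3.0]]
targets = [2.0, 4.0, 6.0]
weights = [0.0]
for _ in range(50):
    weights = train_step(xs, targets, weights)
print(weights)
```

In this sketch the parameters returned by step 407 are kept in the second precision type between iterations and are converted to the first precision type only for the forward pass, so the rounding introduced by the lower precision does not accumulate in the stored parameters.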
As can be seen from Fig. 4, compared with the embodiment corresponding to Fig. 2, the flow 400 of the model generation method in this embodiment highlights the steps of converting the precision of the data. Accordingly, the technical effects of the scheme described in this embodiment may include at least the following:
First, a new way of generating a model is provided.
Second, a more comprehensive model generation method is provided.
With further reference to Fig. 5, as an implementation of the methods shown in the above figures, the disclosure provides an embodiment of a model generation apparatus. This apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus can be applied in various electronic devices.
As shown in Fig. 5, the model generation apparatus 500 of this embodiment includes an acquisition unit 501, a forward propagation unit 502 and a backpropagation unit 503. The acquisition unit is configured to obtain training sample data. The forward propagation unit is configured to perform calculation using data of the first precision type, in the forward propagation process based on the training sample data and the model to be trained, to obtain an actual output of the first precision type. The backpropagation unit is configured to perform calculation using data of the second precision type in the backpropagation process based on the actual output and the model to be trained, where the first precision type is different from the second precision type.
In this embodiment, for the specific processing of the acquisition unit 501, the forward propagation unit 502 and the backpropagation unit 503 of the model generation apparatus 500 and the technical effects they bring, reference may be made to the descriptions of step 201, step 202 and step 203 in the embodiment corresponding to Fig. 2, respectively, which are not repeated here.
In some optional implementations of this embodiment, the above determination unit is further configured to: determine the gradient value of a layer to be updated of the model to be trained as a first gradient value; and determine the scale factor of the layer to be updated according to the first gradient value and the current weight values of the weights in the layer to be updated.
In some optional implementations of this embodiment, the first precision type or the second precision type is the half-precision type.
In some optional implementations of this embodiment, the precision indicated by the first precision type is lower than the precision indicated by the second precision type.
In some optional implementations of this embodiment, the precision indicated by the first precision type is higher than the precision indicated by the second precision type.
In some optional implementations of this embodiment, the forward propagation unit is further configured to: in response to determining that the training sample data is not data of the first precision type, convert the training sample data to data of the first precision type to generate first training sample data; in response to determining that the network parameters of the model to be trained are not data of the first precision type, convert the network parameters to data of the first precision type to generate first network parameters; and perform forward propagation calculation using the first training sample data and the first network parameters to obtain an actual output of the first precision type.
In some optional implementations of this embodiment, the backpropagation unit is further configured to: convert the actual output from the first precision type to the second precision type; in response to determining that the network parameters of the model to be trained are not data of the second precision type, convert the network parameters to data of the second precision type to generate second network parameters; and perform backpropagation calculation according to the actual output of the second precision type and the second network parameters, so as to update the second network parameters.
It should be noted that, for the implementation details and technical effects of the units in the model generation apparatus provided by embodiments of the disclosure, reference may be made to the descriptions of other embodiments in the disclosure, which are not repeated here.
Referring now to Fig. 6, it shows a schematic structural diagram of an electronic device 600 (for example, the terminal device or the server in Fig. 1) suitable for implementing embodiments of the disclosure. The electronic device shown in Fig. 6 is only an example and should not impose any limitation on the functions or scope of use of embodiments of the disclosure.
As shown in Fig. 6, the electronic device 600 may include a processing unit (for example, a central processing unit or a graphics processor) 601, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage device 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the electronic device 600. The processing unit 601, the ROM 602 and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
In general, the following devices may be connected to the I/O interface 605: input devices 606 such as a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer and gyroscope; output devices 607 such as a liquid crystal display (LCD), speaker and vibrator; storage devices 608 such as a magnetic tape and hard disk; and a communication device 609. The communication device 609 may allow the electronic device 600 to communicate with other devices, wirelessly or by wire, to exchange data. Although Fig. 6 shows an electronic device 600 with various devices, it should be understood that it is not required to implement or have all of the devices shown. More or fewer devices may alternatively be implemented or provided.
In particular, according to embodiments of the disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the disclosure include a computer program product that includes a computer program carried on a computer-readable medium, the computer program containing program code for executing the methods shown in the flowcharts. In such embodiments, the computer program may be downloaded and installed from a network through the communication device 609, installed from the storage device 608, or installed from the ROM 602. When the computer program is executed by the processing unit 601, the above functions defined in the methods of embodiments of the disclosure are performed.
It should be noted that the computer-readable medium of the disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples of computer-readable storage media may include but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by, or in connection with, an instruction execution system, apparatus or device. In the disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and it can send, propagate or transmit a program for use by, or in connection with, an instruction execution system, apparatus or device. The program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to electric wire, optical cable, RF (radio frequency), or any suitable combination of the above.
The above computer-readable medium may be included in the above electronic device, or it may exist separately without being assembled into the electronic device.
The above computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: obtain training sample data; in the forward propagation process based on the training sample data and the model to be trained, perform calculation using data of the first precision type to obtain an actual output of the first precision type; and in the backpropagation process based on the actual output and the model to be trained, perform calculation using data of the second precision type, where the first precision type is different from the second precision type.
Computer program code for performing the operations of the disclosure may be written in one or more programming languages or a combination thereof. These programming languages include object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the drawings illustrate the architecture, functions and operations of possible implementations of systems, methods and computer program products according to various embodiments of the disclosure. In this regard, each box in a flowchart or block diagram may represent a module, program segment or part of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the boxes may occur in an order different from that marked in the drawings. For example, two boxes shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each box in the block diagrams and/or flowcharts, and combinations of boxes in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in embodiments of the disclosure may be implemented in software or in hardware. The name of a unit does not, under certain circumstances, constitute a limitation on the unit itself; for example, the acquisition unit may also be described as "a unit that obtains training sample data".
The above description is only a preferred embodiment of the disclosure and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of disclosure involved in the disclosure is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the above disclosed concept, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the disclosure.