CN107392310A - neural network model training method and device - Google Patents

Neural network model training method and device

Info

Publication number
CN107392310A
CN107392310A (Application CN201610320443.8A)
Authority
CN
China
Prior art keywords
weight parameter
data
valued
output data
training sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610320443.8A
Other languages
Chinese (zh)
Inventor
Zhang Mo (张默)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Moshanghua Technology Co Ltd
Original Assignee
Beijing Moshanghua Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Moshanghua Technology Co Ltd filed Critical Beijing Moshanghua Technology Co Ltd
Priority to CN201610320443.8A priority Critical patent/CN107392310A/en
Publication of CN107392310A publication Critical patent/CN107392310A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

This application discloses a neural network model training method and device. The method includes: obtaining a training sample and randomly generating initial weight parameters for a neural network model; ternarizing (converting to three values) the initial weight parameters to obtain ternary weight parameters; computing output data using the input data of the training sample and the ternary weight parameters; judging whether the output data and the label data satisfy a loss condition; when the output data and the label data satisfy the loss condition, adjusting the ternary weight parameters, recomputing the output data using the input data of the training sample and the adjusted ternary weight parameters, and returning to and continuing from the step of comparing the output data with the label data and obtaining a comparison result; and, when the output data and the label data do not satisfy the loss condition, taking the ternary weight parameters as the target weight parameters of the neural network model. Embodiments of the present invention improve the training efficiency of neural network models.

Description

Neural network model training method and device
Technical field
This application belongs to the field of artificial intelligence, and in particular relates to a neural network training method and device.
Background technology
A neural network model is an information processing system whose structure and function mimic those of the biological brain. In recent years, neural network models have developed rapidly in fields such as computer vision and speech recognition. A neural network model is a multi-layer network structure comprising an input layer, an output layer, and several intermediate layers (also called hidden layers) arranged between the input layer and the output layer, with successive layers connected in sequence. Each layer has several neurons (also called nodes); weight parameters exist between the nodes of one layer and the nodes of the next layer, and the node values of the next layer are computed from the nodes of the previous layer and the weight parameters.
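In standard notation (added here for illustration; the patent states this only in words), if a^{(l)} denotes the vector of node values in layer l, W^{(l)} the weight parameters between layer l and layer l+1, and f the activation function of the nodes, this computation is:

$$a^{(l+1)} = f\big(W^{(l)}\, a^{(l)}\big)$$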
The weight parameters of a neural network are usually unknown and must be trained. Known inputs and their corresponding outputs are used to continually "train" the network: according to the inputs and outputs, the network continually adjusts the weight parameters between its nodes so that its response matches the output corresponding to each input. After training ends, given an input, the neural network computes an output according to the adjusted weight parameters.
However, when a traditional neural network model is trained, its weight parameters are usually defined with a floating-point data type (single or double precision). Because floating-point data occupies considerable memory in a computer system, the training time of the neural network grows and a large amount of memory is consumed. How to train a neural network model that computes more efficiently and occupies less memory is therefore a problem in urgent need of a solution.
Content of the invention
In view of this, the technical problem to be solved by this application is the large amount of computation and the high memory occupation present in traditional neural network models.
To solve the above technical problem, this application discloses a training method and device for a neural network model. During training, the method converts the weight parameters, which would otherwise be stored in a floating-point type, into a three-valued integer data type, and continually adjusts the weight parameters during training so that the neural network model composed of these weight parameters is more compact and more efficient.
An embodiment of the present invention provides a training method for a neural network model, including:
obtaining a training sample, where the training sample includes input data and label data corresponding to the input data;
randomly generating initial weight parameters for the neural network model;
ternarizing the initial weight parameters to obtain ternary weight parameters;
computing output data using the input data of the training sample and the ternary weight parameters;
judging whether the output data and the label data satisfy a loss condition;
when the output data and the label data satisfy the loss condition, adjusting the ternary weight parameters;
recomputing output data using the input data of the training sample and the adjusted ternary weight parameters, and returning to and continuing from the step of comparing the output data with the label data and obtaining a comparison result;
when the comparison result does not satisfy the loss condition, taking the ternary weight parameters as the target weight parameters of the neural network model.
An embodiment of the present invention provides a training device for a neural network model, including:
a first acquisition module, for obtaining a training sample, where the training sample includes input data and label data corresponding to the input data;
a first generation module, for randomly generating initial weight parameters for the neural network model;
a first computing module, for ternarizing the initial weight parameters to obtain ternary weight parameters;
a second computing module, for computing output data using the input data of the training sample and the ternary weight parameters;
a first judging module, for judging whether the output data and the label data satisfy a loss condition;
a first adjusting module, for adjusting the ternary weight parameters when the output data and the label data satisfy the loss condition;
a third computing module, for recomputing output data using the input data of the training sample and the adjusted ternary weight parameters, and returning to and continuing from the step of comparing the output data with the label data and obtaining a comparison result;
a second acquisition module, for taking the ternary weight parameters as the target weight parameters of the neural network model when the comparison result does not satisfy the loss condition.
Compared with the prior art, the application can obtain the following technical effects:
1) low memory occupation;
2) a smaller amount of computation, so that the computation time is reduced;
3) higher computational efficiency and more accurate computation results.
Of course, a product implementing the application need not achieve all of the above technical effects at the same time.
Brief description of the drawings
The accompanying drawings described here provide a further understanding of the application and form a part of it; the schematic embodiments of the application and their descriptions are used to explain the application and do not constitute an improper limitation of it. In the drawings:
Fig. 1 is a flow chart of one embodiment of the neural network model training method of the application;
Fig. 2 is a structural diagram of one embodiment of the neural network model training device of the application.
Embodiments
Embodiments of the application are described in detail below with reference to the drawings and examples, so that the process by which the application applies technical means to solve the technical problem and achieve its technical effect can be fully understood and implemented accordingly.
The embodiments of the present invention are mainly applicable to various fields of artificial intelligence, such as computer vision and speech recognition. When training a neural network model, multiple input data must be fed into the neural network model, and the weight parameters of the model are continually adjusted according to the resulting output data, so as to obtain an optimal neural network model capable of independently completing the corresponding work.
Feeding input data into the neural network model to obtain the corresponding output data requires a large amount of calculation. Moreover, because a large quantity of input data is needed to obtain an accurate neural network model, the amount of calculation in the whole training process is very large. On top of this, because the weight parameters of the neural network model use a floating-point data type, and floating-point data occupies considerable memory in a computer system, the calculation speed drops and the calculation time grows, ultimately making the training process of the whole model extremely inefficient.
To solve this technical problem, after a series of studies the inventor proposes the technical scheme of this application. In this embodiment, the weight parameters of the neural network are converted from their original floating-point type into ternary weight parameters, so that when input data are fed into the neural network model for calculation, the calculation process becomes simpler and the computational efficiency improves. At the same time, because the memory occupied by ternary weight parameters is far smaller than that of the original floating-point parameters, the memory occupied by the whole neural network model is reduced and storage space is saved.
The technical scheme of the application is described in detail below with reference to the drawings.
As shown in Fig. 1, a flow chart of one embodiment of a neural network training method of the present invention, the method may include the following steps:
101: Obtain a training sample, where the training sample includes input data and label data corresponding to the input data.
The training sample consists of known inputs gathered in advance together with the correct output results (label data) corresponding to those inputs. During training, there may be a single training sample, that is, one training sample may be fed into the neural network model repeatedly for training; there may also be multiple training samples, that is, different training samples may be fed into the neural network model in sequence for training.
102: Randomly generate initial weight parameters for the neural network model.
Before training, the weight parameters of the neural network model are unknown; training the neural network model amounts to determining its weight parameters.
The embodiment of the application generates the initial weight parameters using a random generation method. The initial weight parameters may form a vector whose dimension is determined by the number of layers of the neural network model and the number of nodes in each layer.
103: Ternarize the initial weight parameters to obtain ternary weight parameters.
The initial weight parameters are usually stored as floating-point data; through a numerical conversion, the original floating-point data can be converted into integer data or data of another type.
Preferably, each weight datum in the initial weight parameters may be compared with a weight threshold:
when the weight datum is greater than the weight threshold, it is ternarized to the value 1; when the weight datum is equal to the weight threshold, it is ternarized to the value 0; and when the weight datum is less than the weight threshold, it is ternarized to the value -1. This yields the ternary weight parameters.
104: Compute output data using the input data of the training sample and the ternary weight parameters.
When input data are fed in, the neural network model can advance layer by layer according to its weight parameters until the last layer of the model is reached, obtaining the final output result.
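To see why ternary weights make this forward computation cheaper (a sketch under the assumption of a plain dot product per node; this is my illustration, not the patent's code): with weights in {-1, 0, 1} and a scale factor alpha, the product alpha * (w_t . x) needs only additions and subtractions plus one final multiplication:

    import numpy as np

    def ternary_dot(w_t, alpha, x):
        # alpha * dot(w_t, x) with w_t in {-1, 0, 1}: the multiplications
        # reduce to adding the inputs where w_t == 1 and subtracting them
        # where w_t == -1; entries where w_t == 0 are skipped entirely.
        return alpha * (x[w_t == 1].sum() - x[w_t == -1].sum())

    x = np.array([0.5, 2.0, -1.0, 3.0])
    w_t = np.array([1, 0, -1, 1], dtype=np.int8)
    print(ternary_dot(w_t, 0.8, x))  # 0.8 * (0.5 + 3.0 - (-1.0)) = 3.6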
This output result is the output of the current neural network model and is not necessarily the same as the label data in the training sample; the label data is the correct output result corresponding to the input data, and is called label data for ease of distinction.
105: Judge whether the output data and the label data satisfy the loss condition.
If there is no difference at all between the output data and the label data, the training process of the neural network model is entirely correct. When there are differences between the output data and the label data, the training process of the neural network model contains errors; the larger the difference, the more errors the training process contains.
Preferably, the distance between the output data and the label data may be computed to measure the size of the difference between them: compute the distance between the output data and the label data, and judge whether the distance is greater than a predetermined distance. The larger the computed distance value, the larger the difference between them; the smaller the distance value, the smaller the difference between them.
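A minimal sketch of this distance-based check (the Euclidean metric and the threshold value are illustrative assumptions; the patent does not fix a particular metric):

    import numpy as np

    def meets_loss_condition(output, label, max_dist=0.1):
        # True while the output is still too far from the label data,
        # i.e. the ternary weight parameters need further adjustment.
        return np.linalg.norm(np.asarray(output) - np.asarray(label)) >= max_dist

    print(meets_loss_condition([0.9, 0.2], [1.0, 0.0]))  # True: distance ~ 0.224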
106: When the output data and the label data satisfy the loss condition, adjust the ternary weight parameters.
When the difference between the output data and the label data satisfies the loss condition, the ternary weight parameters may be adjusted. Adjusting the ternary weight parameters may use a penalty strategy, that is, penalizing the ternary weight parameters.
When the difference between the output data and the label data is expressed as a distance, the comparison result satisfying the loss condition may mean that the comparison between the distance and some predetermined threshold satisfies a preset loss condition.
Preferably, when the distance is greater than or equal to some predetermined threshold, the comparison result satisfies the loss condition; when the distance is less than that predetermined threshold, the comparison result does not satisfy the loss condition.
107: Recompute output data using the input data of the training sample and the adjusted ternary weight parameters, and return to and continue from the step of comparing the output data with the label data and obtaining a comparison result.
Preferably, a method of repeated training may be used to obtain the optimal neural network model. When using repeated training, the input data of training samples must be fed into the neural network model repeatedly to obtain output data. The training samples used in each round of training may be the same or different.
That is, the training sample in step 107 may be the same as the training sample in step 101. For example, if the neural network is used to classify image content, the same 100 pictures may be used and input 100 times into the neural network model to obtain the ternary weight parameters;
the training sample in step 107 may also differ from the training sample in step 101. For example, if the neural network is used to classify image content, 10,000 pictures may be used in total, with 100 pictures input each time over 100 rounds of input to the neural network model, to obtain the ternary weight parameters.
108: When the output data and the label data do not satisfy the loss condition, take the ternary weight parameters as the target weight parameters of the neural network model.
When the comparison result does not satisfy the loss condition, that is, when the difference between the output data and the label data is small, the training result of the current neural network model already meets the requirements. At this point, the ternary weight parameters from the current training process may be taken as the target weight parameters of the neural network model.
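Putting steps 101 to 108 together, a minimal end-to-end sketch might look as follows; it uses the eps-based ternarization and scale factor described in the next embodiment. The single linear layer, the gradient-style adjustment of the underlying floating-point weights, and all names are assumptions for illustration; the patent does not fix a particular network structure or adjustment rule, and this toy loop shows the control flow rather than guaranteed convergence:

    import numpy as np

    def ternarize(w, eps=1e-6):
        # Step 103: ternarize w with dead zone eps and compute a scale factor.
        w_t = np.where(w > eps, 1, np.where(w < -eps, -1, 0))
        nnz = np.count_nonzero(w_t)
        alpha = float(np.sum(w * w_t)) / nnz if nnz else 0.0
        return w_t, alpha

    def train(x, label, max_dist=0.1, lr=0.1, max_iters=1000):
        rng = np.random.default_rng(0)
        w = rng.standard_normal((label.size, x.size))  # step 102: random init
        w_t, alpha = ternarize(w)                      # step 103
        for _ in range(max_iters):
            output = alpha * (w_t @ x)                 # step 104: forward pass
            err = output - label
            if np.linalg.norm(err) < max_dist:         # step 108: loss condition
                break                                  # no longer satisfied: done
            w -= lr * np.outer(err, x)                 # steps 106/107: adjust
            w_t, alpha = ternarize(w)                  # and re-ternarize
        return w_t, alpha                              # target weight parameters

    w_t, alpha = train(np.array([0.5, -1.0, 2.0]), np.array([1.0, 0.0]))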
In the above embodiment, the weight parameters of the neural network are converted from the original floating-point data into ternary weight parameters. Using ternary weight parameters can reduce the memory occupied by the neural network, increase computational efficiency, and reduce the training time of the neural network model.
As another embodiment, to improve the accuracy of the neural network model, in step 103, ternarizing the initial weight parameters to obtain ternary weight parameters may include:
using ternary weight parameters that approximate the initial weight parameters as closely as possible. Specifically, a scale factor may be defined and a dot-product computation carried out between the scale factor and the ternary weight parameters; the distance between the initial weight parameters and the result of this computation is obtained, and when this distance is minimal, the current ternary weight parameters are obtained.
The initial weight parameters may be ternarized according to the following calculation formula:

$$\mathop{\arg\min}_{(\alpha,\, w_t)} J(w; \alpha, w_t) = \lVert w - \alpha w_t \rVert_2^2, \quad \text{s.t. } \alpha \ge 0,\ w_{t,i} \in \{-1, 0, 1\} \tag{A1}$$
where w is the initial weight parameter, w_t is the unknown ternary weight parameter, and α is a scale factor used to normalize w_t; J(w; α, w_t) represents the Euclidean distance between the initial weight parameter and the scaled ternary weight parameter.
The scale factor is found such that the Euclidean distance between the initial weight parameter and the ternary weight parameter is minimal, and the ternary weight parameter is then computed accordingly.
Without loss of generality, assume w is a vector in n-dimensional space, i.e. w ∈ R^n. Let w_i and w_{t,i} denote the i-th components of w and w_t respectively. Expanding formula (A1) gives:

$$J(w; \alpha, w_t) = w^T w - 2\alpha\, w^T w_t + \alpha^2\, w_t^T w_t \tag{A2}$$
where T denotes transposition. Taking partial derivatives of (A2) with respect to α and w_t gives:

$$\frac{\partial J}{\partial \alpha} = -2\, w^T w_t + 2\alpha\, w_t^T w_t, \qquad \frac{\partial J}{\partial w_t} = -2\alpha\, w + 2\alpha^2\, w_t \tag{A3}$$
Setting the partial derivatives in formula (A3) equal to 0 gives:

$$\alpha = \frac{w^T w_t}{w_t^T w_t}, \qquad w_t = \frac{w}{\alpha} \tag{A4}$$
In formula (A4), α and w_t depend on each other, subject to the constraints α > 0 and w_{t,i} ∈ {-1, 0, 1}. Under these two constraints, an iterative (alternating) method can be used to obtain the optimal solution w_t* and α* at which formula (A1) attains its minimum:

$$w_{t,i}^* = \begin{cases} 1, & w_i > \alpha^*/2 \\ 0, & |w_i| \le \alpha^*/2 \\ -1, & w_i < -\alpha^*/2 \end{cases}, \qquad \alpha^* = \frac{w^T w_t^*}{(w_t^*)^T w_t^*} \tag{A5}$$
That is, substituting α* and w_t* into formula (A1) yields its minimum value. Because the data in w are all stored as floating-point numbers, and floating-point data in a computer system is rarely exactly equal to 0, a small threshold eps is set, where eps is a number close to 0 but not equal to 0; for example, eps = 1e-6. In this case, the optimal solution for w_t is approximately:

$$w_{t,i}^* \approx \begin{cases} 1, & w_i > eps \\ 0, & |w_i| \le eps \\ -1, & w_i < -eps \end{cases} \tag{A6}$$
where eps is the weight threshold and w_i is the i-th weight datum in the initial weight parameters.
The current w_t* is the ternary weight parameter that needs to be obtained.
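A minimal NumPy sketch of this eps-based approximation together with the scale factor from formula (A4) (the function name and default values are assumptions for illustration):

    import numpy as np

    def ternarize(w, eps=1e-6):
        # Approximate solution of (A1): ternarize w with dead zone eps, per (A6).
        w = np.asarray(w, dtype=np.float64)
        w_t = np.where(w > eps, 1, np.where(w < -eps, -1, 0)).astype(np.int8)
        # Scale factor from (A4): alpha = (w . w_t) / (w_t . w_t); for a sign
        # pattern this is the mean absolute value of the weights mapped to +/-1.
        nnz = np.count_nonzero(w_t)
        alpha = float(np.sum(w * w_t)) / nnz if nnz else 0.0
        return w_t, alpha

    w = np.array([0.8, -0.3, 1e-7, 1.2, -0.9])
    w_t, alpha = ternarize(w)
    print(w_t, alpha)  # [ 1 -1  0  1 -1 ] 0.8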
In this embodiment, ternary weights whose training effect is closest to that of the initial weight parameters are used, which improves the training precision of the neural network model and makes its training effect better.
As another embodiment, multiple training samples may be used to train the neural network model. In step 101, a training sample is obtained, where the training sample includes input data and label data corresponding to the input data; the training samples may include multiple groups.
In step 104, computing output data using the input data of the training sample and the ternary weight parameters may include:
computing output data using the input data in one group of training samples and the ternary weight parameters.
In step 107, recomputing output data using the input data of the training sample and the adjusted ternary weight parameters, and returning to and continuing from the step of comparing the output data with the label data and obtaining a comparison result, may include:
recomputing output data using the input data of a group of training samples not yet used for training and the adjusted ternary weight parameters, and returning to and continuing from the step of comparing the output data with the label data and obtaining a comparison result.
In the above embodiment, a large number of training samples are used to train the ternary weights of the neural network model, so that the trained model is suited to a variety of different test samples; this can to a certain extent improve the computational precision of the neural network model and the accuracy of the whole model.
As shown in Fig. 2, a structural diagram of one embodiment of a neural network training device of the present invention, the device may include the following modules:
First acquisition module 201: for obtaining a training sample, where the training sample includes input data and label data corresponding to the input data.
First generation module 202: for randomly generating initial weight parameters for the neural network model.
First computing module 203: for ternarizing the initial weight parameters to obtain ternary weight parameters.
Preferably, the first computing module may include:
a first comparing unit, for comparing each weight datum in the initial weight parameters with a weight threshold;
a first acquisition unit, for ternarizing a weight datum to the value 1 when it is greater than the weight threshold, to the value 0 when it is equal to the weight threshold, and to the value -1 when it is less than the weight threshold, thereby obtaining the ternary weight parameters.
Second computing module 204: for computing output data using the input data of the training sample and the ternary weight parameters.
First judging module 205: for judging whether the output data and the label data satisfy the loss condition.
The first judging module may include:
a fourth computing unit, for computing the distance between the output data and the label data;
a first judging unit, for judging whether the distance is greater than a predetermined distance.
First adjusting module 206: for adjusting the ternary weight parameters when the output data and the label data satisfy the loss condition.
Third computing module 207: for recomputing output data using the input data of the training sample and the adjusted ternary weight parameters, and returning to and continuing from the step of comparing the output data with the label data and obtaining a comparison result.
Second acquisition module 208: for taking the ternary weight parameters as the target weight parameters of the neural network model when the output data and the label data do not satisfy the loss condition.
The above device can train a neural network model using ternary weight parameters, obtaining faster computation and higher computational efficiency, and improving the computational precision of the neural network model at the same time as its efficiency.
As another embodiment, to further improve the accuracy of the neural network model, the first computing module may also include:
a first computing unit, for ternarizing the initial weight parameters according to the following calculation formula; specifically, the first computing unit may be used to compute the α and w_t at which the Euclidean measure J(w; α, w_t) attains its minimum:

$$\mathop{\arg\min}_{(\alpha,\, w_t)} J(w; \alpha, w_t) = \lVert w - \alpha w_t \rVert_2^2, \quad \text{s.t. } \alpha \ge 0,\ w_{t,i} \in \{-1, 0, 1\}$$

where w is the initial weight parameter, w_t is the unknown ternary weight parameter, and α is a scale factor used to normalize w_t; J(w; α, w_t) represents the Euclidean distance between the initial weight parameter and the scaled ternary weight parameter.
The scale factor is found such that the Euclidean distance between the initial weight parameter and the ternary weight parameter is minimal, and the ternary weight parameters are computed accordingly, approximately as in formula (A6) above, where eps is the weight threshold and w_i is the i-th weight datum in the initial weight parameters.
In this embodiment, ternary weights whose training effect is closest to that of the initial weight parameters are used, which improves the training precision of the neural network model and makes its training effect better.
As another embodiment, multiple training samples may be used to train the neural network model. In this embodiment:
the first acquisition module may include:
a first acquisition unit, for obtaining at least one group of training samples, where each training sample includes input data and label data corresponding to the input data;
preferably, the second computing module includes:
a second computing unit, for computing output data using the input data in one group of training samples and the ternary weight parameters;
preferably, the third computing module includes:
a third computing unit, for recomputing output data using the input data of a group of training samples not yet used for training and the adjusted ternary weight parameters, and returning to and continuing from the step of comparing the output data with the label data and obtaining a comparison result.
In the above embodiment, a large number of training samples are used to train the ternary weights of the neural network model, so that the trained model is suited to a variety of different test samples; this can to a certain extent improve the computational precision of the neural network model and the accuracy of the whole model.
The above describes and shows some preferred embodiments of the application, but, as stated above, it should be understood that the application is not limited to the forms disclosed here, should not be regarded as excluding other embodiments, and can be used in various other combinations, modifications, and environments; within the scope contemplated here, it can be modified through the above teachings or through the techniques or knowledge of the related art. Changes and modifications made by those skilled in the art that do not depart from the spirit and scope of the application shall all fall within the protection scope of the appended claims.
In a typical configuration, a computing device includes one or more processors (CPUs), an input/output interface, a network interface, and memory.
Memory may include computer-readable media in the form of volatile memory, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and can store information by any method or technology. The information can be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined here, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
Certain words are used in the specification and claims to refer to particular components. Those skilled in the art should understand that hardware manufacturers may refer to the same component by different nouns. This specification and the claims do not distinguish components by differences in name, but by differences in function. "Comprising", as used throughout the specification and claims, is an open term and should be interpreted as "including but not limited to". "Substantially" means that, within an acceptable error range, those skilled in the art can solve the technical problem and basically achieve the technical effect. In addition, the word "coupled" here encompasses any means of direct or indirect electrical coupling. Therefore, if a first device is said to be coupled to a second device, the first device may be directly electrically coupled to the second device, or indirectly electrically coupled to the second device through other devices or coupling means. The subsequent description of the specification presents preferred embodiments for implementing the application; the description is for the purpose of illustrating the general principles of the application and is not intended to limit its scope. The protection scope of the application shall be defined by the appended claims.
It should also be noted that the terms "comprise", "include", or any other variants thereof are intended to cover non-exclusive inclusion, so that goods or systems including a series of elements include not only those elements but also other elements not explicitly listed, or elements inherent to such goods or systems. Without further limitation, an element defined by the sentence "including a ..." does not exclude the existence of other identical elements in the goods or system that include that element.

Claims (10)

  1. A neural network model training method, characterized by comprising:
    obtaining a training sample, wherein the training sample includes input data and label data corresponding to the input data;
    randomly generating initial weight parameters for a neural network model;
    ternarizing the initial weight parameters to obtain ternary weight parameters;
    computing output data using the input data of the training sample and the ternary weight parameters;
    judging whether the output data and the label data satisfy a loss condition;
    when the output data and the label data satisfy the loss condition, adjusting the ternary weight parameters;
    recomputing output data using the input data of the training sample and the adjusted ternary weight parameters, and returning to and continuing from the step of comparing the output data with the label data and obtaining a comparison result;
    when the output data and the label data do not satisfy the loss condition, taking the ternary weight parameters as the target weight parameters of the neural network model.
  2. The method according to claim 1, characterized in that ternarizing the initial weight parameters to obtain ternary weight parameters comprises:
    comparing each weight datum in the initial weight parameters with a weight threshold;
    when the weight datum is greater than the weight threshold, ternarizing it to the value 1; when the weight datum is equal to the weight threshold, ternarizing it to the value 0; and when the weight datum is less than the weight threshold, ternarizing it to the value -1, thereby obtaining the ternary weight parameters.
  3. The method according to claim 1, characterized in that ternarizing the initial weight parameters to obtain ternary weight parameters comprises:
    ternarizing the initial weight parameters according to the following calculation formula:

    $$\mathop{\arg\min}_{(\alpha,\, w_t)} J(w; \alpha, w_t) = \lVert w - \alpha w_t \rVert_2^2, \quad \text{s.t. } \alpha \ge 0,\ w_{t,i} \in \{-1, 0, 1\}$$

    wherein w is the initial weight parameter, w_t is the ternary weight parameter, and α is a scale factor used to normalize w_t; J(w; α, w_t) represents the Euclidean distance between the initial weight parameter and the scaled ternary weight parameter;
    finding the scale factor such that the Euclidean distance between the initial weight parameter and the ternary weight parameter is minimal, and computing the ternary weight parameter approximately as

    $$w_{t,i} = \begin{cases} 1, & w_i > eps \\ 0, & |w_i| \le eps \\ -1, & w_i < -eps \end{cases}$$

    wherein eps is the weight threshold and w_i is the i-th weight datum in the initial weight parameter.
  4. The method according to claim 1, characterized in that the obtained training samples comprise multiple groups;
    computing output data using the input data of the training sample and the ternary weight parameters comprises:
    computing output data using the input data in one group of training samples and the ternary weight parameters;
    recomputing output data using the input data of the training sample and the adjusted ternary weight parameters, and returning to and continuing from the step of comparing the output data with the label data and obtaining a comparison result, comprises:
    recomputing output data using the input data of a group of training samples not yet used for training and the adjusted ternary weight parameters, and returning to and continuing from the step of comparing the output data with the label data and obtaining a comparison result.
  5. The method according to claim 1, characterized in that judging whether the output data and the label data satisfy the loss condition comprises:
    computing the distance between the output data and the label data;
    judging whether the distance is greater than a predetermined distance.
  6. A neural network model training device, characterized by comprising:
    a first acquisition module, for obtaining a training sample, wherein the training sample includes input data and label data corresponding to the input data;
    a first generation module, for randomly generating initial weight parameters for a neural network model;
    a first computing module, for ternarizing the initial weight parameters to obtain ternary weight parameters;
    a second computing module, for computing output data using the input data of the training sample and the ternary weight parameters;
    a first judging module, for judging whether the output data and the label data satisfy a loss condition;
    a first adjusting module, for adjusting the ternary weight parameters when the output data and the label data satisfy the loss condition;
    a third computing module, for recomputing output data using the input data of the training sample and the adjusted ternary weight parameters, and returning to and continuing from the step of comparing the output data with the label data and obtaining a comparison result;
    a second acquisition module, for taking the ternary weight parameters as the target weight parameters of the neural network model when the output data and the label data do not satisfy the loss condition.
  7. The device according to claim 6, characterized in that the first computing module comprises:
    a first comparing unit, for comparing each weight datum in the initial weight parameters with a weight threshold;
    a first acquisition unit, for ternarizing a weight datum to the value 1 when it is greater than the weight threshold, to the value 0 when it is equal to the weight threshold, and to the value -1 when it is less than the weight threshold, thereby obtaining the ternary weight parameters.
  8. The device according to claim 6, characterized in that the first computing module comprises:
    a first computing unit, for ternarizing the initial weight parameters according to the following calculation formula:

    $$\mathop{\arg\min}_{(\alpha,\, w_t)} J(w; \alpha, w_t) = \lVert w - \alpha w_t \rVert_2^2, \quad \text{s.t. } \alpha \ge 0,\ w_{t,i} \in \{-1, 0, 1\}$$

    wherein w is the initial weight parameter, w_t is the ternary weight parameter, and α is a scale factor used to normalize w_t; J(w; α, w_t) represents the Euclidean distance between the initial weight parameter and the scaled ternary weight parameter;
    finding the scale factor such that the Euclidean distance between the initial weight parameter and the ternary weight parameter is minimal, and computing the ternary weight parameter approximately as

    $$w_{t,i} = \begin{cases} 1, & w_i > eps \\ 0, & |w_i| \le eps \\ -1, & w_i < -eps \end{cases}$$

    wherein eps is the weight threshold and w_i is the i-th weight datum in the initial weight parameter.
  9. The device according to claim 6, characterized in that the training samples comprise multiple groups;
    the second computing module comprises:
    a second computing unit, for computing output data using the input data in one group of training samples and the ternary weight parameters;
    the third computing module comprises:
    a third computing unit, for recomputing output data using the input data of a group of training samples not yet used for training and the adjusted ternary weight parameters, and returning to and continuing from the step of comparing the output data with the label data and obtaining a comparison result.
  10. The device according to claim 6, characterized in that the first judging module comprises:
    a fourth computing unit, for computing the distance between the output data and the label data;
    a first judging unit, for judging whether the distance is greater than a predetermined distance.
CN201610320443.8A 2016-05-16 2016-05-16 neural network model training method and device Pending CN107392310A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610320443.8A CN107392310A (en) 2016-05-16 2016-05-16 neural network model training method and device

Publications (1)

Publication Number Publication Date
CN107392310A true CN107392310A (en) 2017-11-24

Family

ID=60338660

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610320443.8A Pending CN107392310A (en) 2016-05-16 2016-05-16 neural network model training method and device

Country Status (1)

Country Link
CN (1) CN107392310A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107862387A (en) * 2017-12-05 2018-03-30 深圳地平线机器人科技有限公司 The method and apparatus for training the model of Supervised machine learning
CN107862387B (en) * 2017-12-05 2022-07-08 深圳地平线机器人科技有限公司 Method and apparatus for training supervised machine learning models
CN108647714A (en) * 2018-05-09 2018-10-12 平安普惠企业管理有限公司 Acquisition methods, terminal device and the medium of negative label weight
CN112020721A (en) * 2018-06-15 2020-12-01 富士通株式会社 Training method and device for classification neural network for semantic segmentation, and electronic equipment
CN109102017A (en) * 2018-08-09 2018-12-28 百度在线网络技术(北京)有限公司 Neural network model processing method, device, equipment and readable storage medium storing program for executing
CN109102017B (en) * 2018-08-09 2021-08-03 百度在线网络技术(北京)有限公司 Neural network model processing method, device, equipment and readable storage medium
CN110211593A (en) * 2019-06-03 2019-09-06 北京达佳互联信息技术有限公司 Audio recognition method, device, electronic equipment and storage medium
CN110782030A (en) * 2019-09-16 2020-02-11 平安科技(深圳)有限公司 Deep learning weight updating method, system, computer device and storage medium
WO2021051556A1 (en) * 2019-09-16 2021-03-25 平安科技(深圳)有限公司 Deep learning weight updating method and system, and computer device and storage medium
CN111798875A (en) * 2020-07-21 2020-10-20 杭州芯声智能科技有限公司 VAD implementation method based on three-value quantization compression

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20171124

Assignee: Apple R&D (Beijing) Co.,Ltd.

Assignor: BEIJING MOSHANGHUA TECHNOLOGY Co.,Ltd.

Contract record no.: 2019990000054

Denomination of invention: Neural network model training method and apparatus

License type: Exclusive License

Record date: 20190211

RJ01 Rejection of invention patent application after publication

Application publication date: 20171124