CN108320026A - Machine learning model training method and device - Google Patents
- Publication number: CN108320026A
- Application number: CN201710344182.8A
- Authority
- CN
- China
- Prior art keywords
- sample data
- order
- loss function
- average gradient
- current round
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Abstract
The present invention relates to a machine learning model training method and device. The method includes: obtaining the clean sample data existing before the current round of cleaning dirty sample data; determining a first second-order average gradient of the loss function of a machine learning model according to the existing clean sample data and the current model parameters of the model; determining a second second-order average gradient of the loss function according to the current model parameters and the clean sample data obtained in the current round by cleaning part of the dirty sample data taken from the dirty sample data; obtaining an overall second-order average gradient of the loss function according to the first and second second-order average gradients; adjusting the current model parameters according to the overall second-order average gradient; and, if the adjusted model parameters do not satisfy a training termination condition, taking the next round as the current round and returning to the step of obtaining the clean sample data existing before the current round of cleaning dirty sample data to continue training, until the training termination condition is satisfied. The method reduces the number of iterative updates and thereby reduces the machine resources those updates consume.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to a machine learning model training method and device.
Background art
Machine learning generally refers to a process in which a computer analyzes input sample data through a series of algorithms to build an initial model, and then updates the model parameters of the initial model through iterative training to obtain a final, suitable model.
In conventional methods, model parameters are updated by gradient descent. When updating model parameters with gradient descent, the gradient of the loss function is computed and the model parameters are iteratively updated according to that gradient, so that the model gradually converges and its accuracy improves.
However, with the traditional gradient-descent-based update of model parameters, each iteration improves the model's accuracy only slightly, so a relatively large number of iterative updates is needed, which consumes considerable machine resources.
Summary of the invention
In view of this, it is necessary to provide a machine learning model training method and device that address the technical problem that updating model parameters by gradient descent consumes substantial machine resources for iterative updates.
A machine learning model training method includes:
obtaining the clean sample data existing before the current round of cleaning dirty sample data;
determining a first second-order average gradient of the loss function of a machine learning model according to the existing clean sample data and the current model parameters of the model;
obtaining the clean sample data produced in the current round by cleaning part of the dirty sample data taken from the dirty sample data;
determining a second second-order average gradient of the loss function according to the clean sample data obtained in the current round and the current model parameters;
obtaining an overall second-order average gradient of the loss function according to the first second-order average gradient and the second second-order average gradient;
adjusting the current model parameters according to the overall second-order average gradient; and
when the adjusted model parameters do not satisfy a training termination condition, taking the next round as the current round and returning to the step of obtaining the clean sample data existing before the current round of cleaning dirty sample data to continue training, until the adjusted model parameters satisfy the training termination condition.
A machine learning model training device includes:
a sample data acquisition module, configured to obtain the clean sample data existing before the current round of cleaning dirty sample data;
a second-order average gradient determining module, configured to determine a first second-order average gradient of the loss function of a machine learning model according to the existing clean sample data and the current model parameters of the model;
the sample data acquisition module being further configured to obtain the clean sample data produced in the current round by cleaning part of the dirty sample data taken from the dirty sample data;
the second-order average gradient determining module being further configured to determine a second second-order average gradient of the loss function according to the clean sample data obtained in the current round and the current model parameters, and to obtain an overall second-order average gradient of the loss function according to the first and second second-order average gradients; and
a model parameter adjustment module, configured to adjust the current model parameters according to the overall second-order average gradient, and, when the adjusted model parameters do not satisfy a training termination condition, to take the next round as the current round and notify the sample data acquisition module to work, until the adjusted model parameters satisfy the training termination condition.
With the above machine learning model training method and device, the first second-order average gradient of the loss function is computed from the existing clean data, and the second second-order average gradient is computed from the clean sample data obtained by the current round of cleaning; from these, the overall second-order average gradient of the loss function under the current model parameters is obtained, and the current model parameters are updated according to it. Because updating model parameters by a second-order average gradient converges the model faster than gradient descent does, fewer iterative updates are needed, which reduces the machine resources consumed during parameter updating.
Description of the drawings
Fig. 1 is the internal structure schematic diagram of electronic equipment in one embodiment;
Fig. 2 is the flow diagram of machine learning model training method in one embodiment;
Fig. 3 is a flow diagram of the step of determining the second second-order average gradient of the loss function in one embodiment;
Fig. 4 is the flow diagram of machine learning model training method in another embodiment;
Fig. 5 is the structural schematic diagram of machine learning model training device in one embodiment;
Fig. 6 is the structural schematic diagram of machine learning model training device in another embodiment.
Detailed description of embodiments
In order to make the purpose, technical solution, and advantages of the present invention clearer, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present invention and are not intended to limit it.
Fig. 1 is a schematic diagram of the internal structure of an electronic device in one embodiment. The electronic device may be a terminal or a server. The terminal may be a personal computer or a mobile electronic device, the mobile electronic device including at least one of a mobile phone, a tablet computer, a personal digital assistant, a wearable device, or the like. The server may be implemented as an independent server or as a server cluster composed of multiple physical servers. As shown in Fig. 1, the electronic device includes a processor, a non-volatile storage medium, an internal memory, and a network interface connected through a system bus. The non-volatile storage medium may store an operating system and computer-readable instructions which, when executed, cause the processor to perform a machine learning model training method. The processor provides computing and control capabilities and supports the operation of the entire electronic device. The internal memory may store computer-readable instructions which, when executed by the processor, cause the processor to perform a machine learning model training method. The network interface is used for connecting to a network for communication. Those skilled in the art will understand that the structure shown in Fig. 1 is only a block diagram of the part of the structure relevant to the present solution and does not limit the electronic device to which the solution is applied; a specific electronic device may include more or fewer components than shown, combine certain components, or arrange components differently.
Fig. 2 is a flow diagram of a machine learning model training method in one embodiment. This embodiment is mainly illustrated by applying the method to the electronic device in Fig. 1. Referring to Fig. 2, the method specifically includes the following steps:
S202: obtain the clean sample data existing before the current round of cleaning dirty sample data.
Specifically, the electronic device may perform machine learning on all the sample data to obtain an initial model of the machine learning model, and then clean the dirty sample data round by round, adjusting the initial model round by round toward convergence so as to improve its accuracy. Adjusting the model toward convergence can be achieved by adjusting its model parameters.
During each round of cleaning dirty sample data, the electronic device may obtain the clean sample data existing before the current round. In one embodiment, the existing clean sample data may be the qualifying sample data that has already been cleaned before the current round of cleaning dirty sample data. The dirty sample data may be the sample data among all the sample data that has not yet been cleaned.
For example, before the first round of cleaning, the dirty sample data may be all 100 sample data, with no existing clean sample data; if the first round cleans 10 dirty sample data, yielding 10 cleaned items, then before the second round the existing clean data are those 10 cleaned items, and the dirty sample data number 100 - 10 = 90.
S204: determine the first second-order average gradient of the loss function of the machine learning model according to the existing clean sample data and the current model parameters of the machine learning model.
Here, the current model parameters are the model parameters of the machine learning model before the current round of parameter adjustment. The loss function evaluates the degree of inconsistency between the predicted values and the actual values of the machine learning model; the smaller the loss function value, the better the model performs.
A gradient is a vector indicating the direction in which the loss function value changes most and the maximum rate of change of the loss function value. A second-order gradient is the direction of maximum change and the maximum rate of change of the loss function value obtained from the second derivative, or an approximate second derivative, of the loss function. An approximate second derivative is a derivative obtained without twice differentiating the loss function, but close in its gradient properties to the second derivative obtained by twice differentiating it.
The first second-order average gradient of the loss function is the average of at least one second-order gradient of the loss function of the machine learning model under the current model parameters, each sought from one of the clean sample data existing before the current round of cleaning dirty sample data.
S206: obtain the clean sample data produced in the current round by cleaning part of the dirty sample data taken from the dirty sample data.
Here, the dirty sample data before the current round of cleaning is the sample data among all the sample data other than the clean sample data existing before the current round; it is to be understood that the dirty sample data here refers to all currently dirty sample data. For example, if there are 100 sample data in total and 20 clean sample data exist before the current round of cleaning, then the dirty sample data number 100 - 20 = 80. The part of the dirty sample data is a portion extracted from all the current dirty sample data according to a preset rule; for example, 10 dirty sample data extracted from the 80 dirty sample data constitute the part of the dirty sample data.
The electronic device may itself clean the part of the dirty sample data it extracts in the current round to obtain the cleaned sample data, or it may obtain, directly from a sample data cleaning device, the clean sample data resulting from cleaning the part of the sample data extracted from the dirty sample data in the current round.
S208: determine the second second-order average gradient of the loss function according to the clean sample data obtained by the current round of cleaning and the current model parameters.
Here, as above, a second-order gradient is the direction of maximum change and the maximum rate of change of the loss function value obtained from the second derivative, or an approximate second derivative, of the loss function. The second second-order average gradient of the loss function is the average of at least one second-order gradient of the loss function of the machine learning model under the current model parameters, each sought from one of the clean sample data obtained by the current round of cleaning.
S210: obtain the overall second-order average gradient of the loss function according to the first second-order average gradient and the second second-order average gradient.
Specifically, the electronic device may compute a weighted average of the first second-order average gradient and the second second-order average gradient to obtain the overall second-order average gradient of the loss function.
In one embodiment, step S210 includes: summing the first second-order average gradient and the second second-order average gradient, weighted by a corresponding first weight and second weight respectively, to obtain the overall second-order average gradient of the loss function. The first weight is the proportion of all sample data accounted for by the clean sample data existing before the current round of cleaning dirty sample data; the second weight is the proportion of all sample data accounted for by the dirty sample data before the current round of cleaning.
Here, the dirty sample data before the current round of cleaning is the sample data among all the sample data other than the clean sample data existing before the current round of cleaning dirty sample data.
In one embodiment, the overall second-order average gradient of the loss function can be calculated according to the following formula:

g(θ) = (|R_clean| / |R|) · g_c(θ) + (|R_dirty| / |R|) · g_s(θ)

where g(θ) is the overall second-order average gradient of the loss function; R_clean is the clean data existing before the current round of cleaning dirty sample data, and |R_clean| is its quantity; R is all the sample data, and |R| is the total quantity of sample data; g_c(θ) is the first second-order average gradient of the loss function, the subscript c abbreviating clean and marking the clean data existing before the current round; R_dirty is the dirty sample data before the current round of cleaning, and |R_dirty| is its quantity; g_s(θ) is the second second-order average gradient of the loss function, the subscript s abbreviating sample. Here |R| = |R_dirty| + |R_clean|, |R_clean|/|R| is the first weight, and |R_dirty|/|R| is the second weight.
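The weighted combination above can be sketched in a few lines. This is a non-authoritative illustration: the set sizes and the two gradient vectors are made-up values, not from the patent.

```python
import numpy as np

# Sketch of the overall second-order average gradient g(θ) as a weighted sum
# of the clean-data gradient g_c and the newly-cleaned-data gradient g_s.
# All concrete numbers are illustrative assumptions.

n_clean, n_dirty = 20, 80          # |R_clean| and |R_dirty|; |R| = 100
n_total = n_clean + n_dirty

g_c = np.array([0.5, -1.0])        # first second-order average gradient (assumed)
g_s = np.array([1.0, 2.0])         # second second-order average gradient (assumed)

w_clean = n_clean / n_total        # first weight  |R_clean| / |R|
w_dirty = n_dirty / n_total        # second weight |R_dirty| / |R|

g_total = w_clean * g_c + w_dirty * g_s
print(g_total)                     # [0.9 1.4]
```

Because the weights are the two sets' shares of all sample data, they sum to one, so g(θ) is a convex combination of the two average gradients.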
S212: adjust the current model parameters according to the overall second-order average gradient.
Here, adjusting the current model parameters according to the overall second-order average gradient means adjusting the value of the current model parameters along the descent direction of the overall second-order average gradient of the loss function, with the overall second-order average gradient value as the descent step, so that the loss function value decreases at its maximum rate of change.
The electronic device may adjust the current model parameters according to the overall second-order average gradient alone, or according to a learning rate together with the overall second-order average gradient. In the latter case, the electronic device may adjust the value of the current model parameters along the descent direction of the overall second-order average gradient of the loss function, with the product of the learning rate and the overall second-order average gradient value as the descent step, so that the loss function value decreases at its maximum rate of change.
The learning rate regulates the stride of the gradient descent of the loss function. It may be a fixed value, or a dynamic value that changes correspondingly during the adjustment of the model parameters.
In one embodiment, the current model parameters can be adjusted according to the following formula:

θ_new = θ^(d) − γ · g(θ^(d))

where θ_new is the adjusted model parameter, θ^(d) is the current model parameter, γ is the learning rate, and g(θ^(d)) is the overall second-order average gradient of the loss function at the current model parameter θ^(d).
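The update formula above is a single vector operation. A minimal sketch, with made-up values for the parameters, learning rate, and gradient:

```python
import numpy as np

# Sketch of the parameter update θ_new = θ^(d) - γ · g(θ^(d)) described above.
# All values are illustrative assumptions.

theta = np.array([1.0, 2.0])      # current model parameters θ^(d)
gamma = 0.1                       # learning rate γ
g_overall = np.array([0.9, 1.4])  # overall second-order average gradient g(θ^(d))

theta_new = theta - gamma * g_overall
print(theta_new)                  # [0.91 1.86]
```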
S214: when the adjusted model parameters do not satisfy the training termination condition, take the next round as the current round and return to S202 to continue training, until the adjusted model parameters satisfy the training termination condition.
Here, the training termination condition may be that the number of iterative cleanings reaches a preset number. Specifically, the electronic device may judge whether the number of iterative cleanings has reached the preset number and, if so, determine that the adjusted model parameters satisfy the training termination condition. The training termination condition may also be that, after a cleaning update, the rate of change of the loss function value falls within a preset range. Specifically, the electronic device may judge whether the rate of change of the loss function value is within the preset range and, if so, determine that the adjusted model parameters satisfy the training termination condition.
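The two termination conditions above can be sketched as a single predicate. The function name, the round limit, and the tolerance are illustrative assumptions, not values from the patent:

```python
# Sketch of the two termination conditions described above; the thresholds
# max_rounds and loss_change_tol are illustrative assumptions.

def should_stop(round_count, loss_history, max_rounds=50, loss_change_tol=1e-4):
    """Stop when the preset round count is reached, or when the relative
    change of the loss between consecutive rounds falls within tolerance."""
    if round_count >= max_rounds:
        return True
    if len(loss_history) >= 2 and loss_history[-2] != 0:
        rel_change = abs(loss_history[-1] - loss_history[-2]) / abs(loss_history[-2])
        if rel_change <= loss_change_tol:
            return True
    return False

print(should_stop(50, [1.0, 0.5]))        # True: round limit reached
print(should_stop(3, [0.5, 0.49999]))     # True: loss change within tolerance
print(should_stop(3, [1.0, 0.5]))         # False: keep training
```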
With the above machine learning model training method, the first second-order average gradient of the loss function is computed from the existing clean data, and the second second-order average gradient from the clean sample data obtained by the current round of cleaning; from these, the overall second-order average gradient of the loss function under the current model parameters is obtained and used to update the current model parameters. Because updating model parameters by a second-order average gradient converges the model faster than gradient descent does, fewer iterative updates are needed, which in turn reduces the machine resources consumed during parameter updating.
In addition, the electronic device seeks the overall second-order average gradient of the loss function from the existing clean sample data and the clean sample data obtained by the current round of cleaning. This ensures that the model parameters are adjusted on the basis of clean sample data, avoids the errors introduced by adjusting model parameters on a mixture of clean and dirty data, and improves the accuracy of the parameter adjustment.
In one embodiment, step S204 includes: substituting the existing clean sample data and the current model parameters of the machine learning model into the loss function; seeking the first first-order partial derivative and the first second-order partial derivative matrix of the loss function with the existing clean sample data and the current model parameters substituted in; and determining the first second-order average gradient of the loss function of the machine learning model according to the inverse of the first second-order partial derivative matrix and the first first-order partial derivative.
Here, the first first-order partial derivative is the derivative, sought when the model parameters are the current model parameters, of the loss function with the existing clean sample data substituted in. The first second-order partial derivative matrix is the derivative obtained by twice differentiating that loss function under the current model parameters; the derivative obtained by twice differentiating the loss function is a matrix, and twice differentiating means differentiating the first first-order partial derivative of the loss function again. The inverse of the first second-order partial derivative matrix may be the inverse matrix obtained by the electronic device performing an inverse operation on the first second-order partial derivative matrix, or an approximation of that inverse matrix. The existing clean sample data may be one or more items.
When there is one item of existing clean sample data, the electronic device may directly take the product of the first first-order partial derivative and the inverse of the first second-order partial derivative matrix sought on that item as the first second-order average gradient of the loss function.
When there are multiple items of existing clean sample data, the electronic device may seek, for each item, the product of the first first-order partial derivative and the inverse of the first second-order partial derivative matrix obtained from that item, yielding the first second-order partial gradients of the loss function, and then take the average of these partial gradients to obtain the first second-order average gradient of the loss function.
In one embodiment, step S204 includes: calculating the first second-order average gradient of the loss function of the machine learning model according to the following formula:

g_c(θ) = (1 / |R_clean|) · Σ_{i ∈ R_clean} H(φ(x_i^c, y_i^c, θ))^{-1} · ∇φ(x_i^c, y_i^c, θ)

where g_c(θ) is the first second-order average gradient; the superscript c abbreviates clean, indicating that the sample data used to calculate the first second-order average gradient is clean data; R_clean is the existing clean sample data; φ(·) denotes the loss function; H(φ(·)) denotes the second-order partial derivative matrix of the loss function; x_i^c is the i-th input data in the existing clean sample data and y_i^c the i-th output data; θ is the current model parameter; H(φ(x_i^c, y_i^c, θ))^{-1} is the inverse of the first second-order partial derivative matrix of the loss function; ∇φ(x_i^c, y_i^c, θ) is the first first-order partial derivative of the loss function; and their product is then a first second-order partial gradient.
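The per-sample average of Newton-style directions H^{-1}·∇φ described above can be sketched as follows. The squared-error loss, the small ridge term that keeps each Hessian invertible, and all data are illustrative assumptions, not the patent's loss function:

```python
import numpy as np

# Sketch of the first second-order average gradient g_c(θ): for each clean
# sample, form the direction H^{-1}·∇φ and average the directions.
# The quadratic loss and the ridge term are illustrative assumptions.

def loss_grad(x, y, theta):
    """∇φ for a squared-error loss φ = 0.5 · (xᵀθ − y)²."""
    return (x @ theta - y) * x

def loss_hessian(x, y, theta):
    """H(φ) for the same loss, plus a small ridge so it is invertible."""
    return np.outer(x, x) + 1e-3 * np.eye(len(x))

def first_second_order_avg_gradient(xs, ys, theta):
    directions = [
        np.linalg.inv(loss_hessian(x, y, theta)) @ loss_grad(x, y, theta)
        for x, y in zip(xs, ys)
    ]
    return np.mean(directions, axis=0)

rng = np.random.default_rng(0)
xs = rng.normal(size=(5, 3))       # five clean input samples (assumed)
ys = rng.normal(size=5)            # their outputs (assumed)
theta = np.zeros(3)

g_c = first_second_order_avg_gradient(xs, ys, theta)
print(g_c.shape)                   # (3,)
```

In practice the inverse would usually be replaced by solving a linear system or by an approximate inverse, as the text's mention of approximate second derivatives suggests.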
In one embodiment, step S208 includes: substituting the clean sample data obtained by the current round of cleaning and the current model parameters into the loss function; seeking the second first-order partial derivative and the second second-order partial derivative matrix of the loss function with the clean sample data obtained by the current round of cleaning and the current model parameters substituted in; and determining the second second-order average gradient of the loss function according to the inverse of the second second-order partial derivative matrix and the second first-order partial derivative.
Here, the second first-order partial derivative is the derivative, sought when the model parameters are the current model parameters, of the loss function with the clean sample data obtained by the current round of cleaning substituted in. The second second-order partial derivative matrix is the derivative obtained by twice differentiating that loss function under the current model parameters; the derivative obtained by twice differentiating the loss function is a matrix, and twice differentiating means differentiating the second first-order partial derivative of the loss function again. The inverse of the second second-order partial derivative matrix may be the inverse matrix obtained by the electronic device performing an inverse operation on the second second-order partial derivative matrix, or an approximation of that inverse matrix. The clean sample data obtained by the current round of cleaning may be one or more items.
When there is one item of clean sample data obtained by the current round of cleaning, the electronic device may directly take the product of the second first-order partial derivative and the inverse of the second second-order partial derivative matrix sought on that item as the second second-order average gradient of the loss function.
When there are multiple items of clean sample data obtained by the current round of cleaning, the electronic device may seek, for each item, the product of the second first-order partial derivative and the inverse of the second second-order partial derivative matrix obtained from that item, yielding the second second-order partial gradients of the loss function, and then take the average of these partial gradients to obtain the second second-order average gradient of the loss function.
Fig. 3 is a flow diagram of the step, in one embodiment, of determining the second second-order average gradient of the loss function according to the inverse of the second second-order partial derivative matrix and the second first-order partial derivative (for short, the second second-order average gradient determining step). As shown in Fig. 3, this step specifically includes the following steps:
S302: obtain the sampling probability of the dirty sample data corresponding to each item of clean sample data obtained by the current round of cleaning.
Here, when part of the dirty sample data is extracted from all the dirty sample data for cleaning, each item of dirty sample data corresponds to a probability of being extracted; that probability is the sampling probability. The sampling probability is proportional to the degree to which cleaning the dirty sample data improves the model's accuracy. For example, if the sampling probability of dirty sample data d1 is 60% and that of dirty sample data d2 is 50%, then cleaning d1 improves the model's accuracy more than cleaning d2 does.
S304: for each item of clean sample data obtained by the current round of cleaning, seek the ratio of the product of the corresponding inverse of the second second-order partial derivative matrix and the second first-order partial derivative to the sampling probability of the corresponding dirty sample data.
Here, the inverse of the second second-order partial derivative matrix and the second first-order partial derivative corresponding to each item of clean sample data obtained by the current round of cleaning are obtained by substituting that item into the loss function and seeking, under the current model parameters, the inverse of the second second-order partial derivative matrix and the second first-order partial derivative of the loss function. For example, for clean sample data c1 obtained by the current round of cleaning, they are the inverse of the second second-order partial derivative matrix and the second first-order partial derivative of the loss function under the current model parameters, sought after substituting c1 into the loss function.
The dirty sample data corresponding to an item of clean sample data obtained by the current round of cleaning is the dirty data that has a state transformation relation with that item across the current round of cleaning: cleaning the dirty sample data before the current round yields the corresponding clean sample data after the round. For example, if the current round cleans dirty sample data d1 and obtains clean sample data c1, then d1 is the dirty sample data corresponding to the clean sample data c1 obtained by the current round.
Specifically, having obtained the inverse of the second second-order partial derivative matrix and the second first-order partial derivative corresponding to each item of clean sample data obtained by the current round, the electronic device may seek their product and then the ratio of that product to the sampling probability of the dirty sample data corresponding to that item, obtaining at least one ratio.
S306: seek the average of the ratios to obtain the second second-order average gradient of the loss function.
Specifically, the electronic device may seek the average of the ratios according to the quantity of the part of the dirty sample data taken from the dirty sample data in the current round, obtaining the second second-order average gradient of the loss function.
In one embodiment, step S306 includes computing the second second-order average gradient of the loss function according to the following formula:

    g_s(θ) = (1/|S|) · Σ_{i∈S} H(φ(x_i^(c), y_i^(c), θ))^(-1) · ∇φ(x_i^(c), y_i^(c), θ) / p(i)

where g_s(θ) is the second second-order average gradient; S is the cleaned sample data obtained in this round; c is short for clean, indicating that the sample data used to compute the second second-order average gradient is clean data; p(i) is the sampling probability of the i-th dirty sample drawn in this round; φ(·) denotes the loss function; H(φ(·)) denotes the second-order partial-derivative matrix of the loss function; x_i^(c) is the i-th input datum in the cleaned sample data obtained in this round; y_i^(c) is the i-th output datum in the cleaned sample data obtained in this round; θ is the current model parameters; H(φ(x_i^(c), y_i^(c), θ))^(-1) is the inverse of the loss function's second second-order partial-derivative matrix; ∇φ(x_i^(c), y_i^(c), θ) is the second first-order partial derivative of the loss function.
In this embodiment, for each piece of cleaned sample data obtained in this round, the ratio of the product of the corresponding inverse second-order partial-derivative matrix and second first-order partial derivative to the sampling probability of the corresponding dirty sample data is computed, and from these ratios the second second-order average gradient of the loss function is obtained. Here the sampling probability is proportional to the improvement in model accuracy expected from cleaning the dirty sample data. That is, when computing the second second-order average gradient of the loss function, the improvement in model accuracy brought by cleaning the dirty sample data is taken into account, making the computed second second-order average gradient of the loss function more accurate.
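As a concrete illustration of steps S304–S306, the sketch below computes a second second-order average gradient in the manner just described. It is a minimal, hypothetical example rather than the patent's implementation: the loss is assumed to be a regularized squared loss (the small ridge term `lam` is an added assumption that keeps each per-sample second-order partial-derivative matrix invertible), and the sampling probabilities `p` are simply supplied as given.

```python
import numpy as np

def loss_grad(x, y, theta, lam=0.1):
    # First-order partial derivative of the assumed loss
    # phi(x, y, theta) = 0.5*(x @ theta - y)**2 + 0.5*lam*||theta||^2
    return (x @ theta - y) * x + lam * theta

def loss_hessian(x, lam=0.1):
    # Second-order partial-derivative (Hessian) matrix of the same loss;
    # the lam * I term keeps it invertible
    return np.outer(x, x) + lam * np.eye(x.size)

def second_avg_gradient(X_clean, y_clean, p, theta):
    # For each sample cleaned this round: the ratio of H^-1 * grad to the
    # sampling probability p[i] of the corresponding dirty sample,
    # then the average of the ratios (steps S304-S306).
    ratios = [
        np.linalg.solve(loss_hessian(x), loss_grad(x, y, theta)) / p_i
        for x, y, p_i in zip(X_clean, y_clean, p)
    ]
    return np.mean(ratios, axis=0)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))   # cleaned sample data obtained this round
y = rng.normal(size=5)
p = np.full(5, 0.2)           # sampling probabilities (illustrative values)
theta = np.zeros(3)           # current model parameters
g_s = second_avg_gradient(X, y, p, theta)
```

Dividing each term by p(i) is inverse-probability weighting: samples that are drawn rarely count proportionally more when they do appear, which keeps the average representative of the whole dirty set.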
In one embodiment, the dirty sample data includes user-feature sample data and correspondingly labeled user profile labels. The method further includes: after the adjusted model parameters meet the training termination condition, obtaining user feature data, inputting the user feature data into the machine learning model with the adjusted model parameters, and outputting a user profile label.
Here, a user profile is a labeled user model abstracted from data that reflect user characteristics, such as the user's social attributes, living habits, and consumption behavior. A user profile label is a highly refined feature identifier obtained by analyzing user information.
User-feature sample data refers to sample data that characterize user features. In one embodiment, the user-feature sample data includes data such as the user's social attributes, living habits, and consumption behavior.
After the adjusted model parameters meet the training termination condition, a user-profile machine learning model meeting the requirements is obtained. User feature data is acquired and input into the user-profile machine learning model with the adjusted model parameters, which then outputs the user profile label corresponding to that user feature data. Outputting user profile labels with the user-profile machine learning model whose parameters meet the training termination condition improves the accuracy of the output user profile labels.
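As an illustration of this inference step, the sketch below feeds an encoded user-feature vector through a trained scorer and returns a user profile label. Everything here is hypothetical: the label names, the 3x3 parameter matrix standing in for the adjusted model parameters, and the linear argmax scorer are illustrative stand-ins, not the patent's model.

```python
import numpy as np

# Hypothetical label set; the names are illustrative only.
PROFILE_LABELS = ["budget-conscious", "premium-shopper", "occasional-buyer"]

def predict_profile_label(user_features, theta):
    # Score each candidate profile label with a linear model and
    # return the label with the highest score.
    scores = theta @ user_features          # one score per label
    return PROFILE_LABELS[int(np.argmax(scores))]

theta = np.array([[ 0.5, -0.2, 0.1],        # adjusted model parameters (illustrative)
                  [-0.3,  0.8, 0.0],
                  [ 0.1,  0.1, 0.4]])
features = np.array([0.2, 0.9, 0.1])        # encoded social attributes, habits, consumption
label = predict_profile_label(features, theta)
```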
In one embodiment, as shown in Figure 4, another machine learning model training method is provided. The method includes the following steps:
S402: obtain the existing cleaned sample data from before this round of cleaning of the dirty sample data.
S404: substitute the existing cleaned sample data and the current model parameters of the machine learning model into the loss function.
S406: compute the first first-order partial derivative and the first second-order partial-derivative matrix of the loss function with the existing cleaned sample data and the current model parameters substituted in.
S408: determine the first second-order average gradient of the loss function of the machine learning model from the inverse of the first second-order partial-derivative matrix and the first first-order partial derivative.
In one embodiment, the first second-order average gradient of the loss function of the machine learning model can be computed according to the following formula:

    g_c(θ) = (1/|R_clean|) · Σ_{i∈R_clean} H(φ(x_i^(c), y_i^(c), θ))^(-1) · ∇φ(x_i^(c), y_i^(c), θ)

where g_c(θ) is the first second-order average gradient; c is short for clean, indicating that the sample data used to compute the first second-order average gradient is clean data; R_clean is the existing cleaned sample data; φ(·) denotes the loss function; H(φ(·)) denotes the second-order partial-derivative matrix of the loss function; x_i^(c) is the i-th input datum in the existing cleaned sample data; y_i^(c) is the i-th output datum in the existing cleaned sample data; θ is the current model parameters; H(φ(x_i^(c), y_i^(c), θ))^(-1) is the inverse of the loss function's first second-order partial-derivative matrix; ∇φ(x_i^(c), y_i^(c), θ) is the first first-order partial derivative of the loss function.
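A minimal sketch of this computation follows, under the assumption of a regularized squared loss; the ridge term `lam` is an added assumption so that each per-sample second-order partial-derivative matrix is invertible, and the data are random placeholders.

```python
import numpy as np

def loss_grad(x, y, theta, lam=0.1):
    # First-order partial derivative of the assumed loss
    # phi(x, y, theta) = 0.5*(x @ theta - y)**2 + 0.5*lam*||theta||^2
    return (x @ theta - y) * x + lam * theta

def loss_hessian(x, lam=0.1):
    # Second-order partial-derivative (Hessian) matrix of the same loss
    return np.outer(x, x) + lam * np.eye(x.size)

def first_avg_gradient(X_clean, y_clean, theta):
    # g_c(theta): average of H(phi)^-1 * grad(phi) over the existing
    # cleaned sample data (steps S404-S408)
    terms = [
        np.linalg.solve(loss_hessian(x), loss_grad(x, y, theta))
        for x, y in zip(X_clean, y_clean)
    ]
    return np.mean(terms, axis=0)

rng = np.random.default_rng(1)
X_clean = rng.normal(size=(8, 3))   # existing cleaned sample data R_clean
y_clean = rng.normal(size=8)
theta = np.zeros(3)                 # current model parameters
g_c = first_avg_gradient(X_clean, y_clean, theta)
```

Note the design choice of `np.linalg.solve(H, grad)` rather than explicitly forming `H^-1`: it applies the inverse matrix to the gradient without materializing it, which is cheaper and numerically better behaved.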
S410: obtain the cleaned sample data produced in this round by cleaning part of the dirty sample data sampled from the dirty sample data, where the dirty sample data includes user-feature sample data and correspondingly labeled user profile labels.
S412: substitute the cleaned sample data obtained in this round and the current model parameters into the loss function.
S414: compute the second first-order partial derivative and the second second-order partial-derivative matrix of the loss function with the cleaned sample data obtained in this round and the current model parameters substituted in.
S416: obtain the sampling probability of the dirty sample data corresponding to each piece of cleaned sample data obtained in this round.
S418: for each piece of cleaned sample data obtained in this round, compute the ratio of the product of the corresponding inverse second-order partial-derivative matrix and second first-order partial derivative to the sampling probability of the corresponding dirty sample data.
S420: compute the average of the ratios to obtain the second second-order average gradient of the loss function.
In one embodiment, the second second-order average gradient of the loss function can be computed according to the following formula:

    g_s(θ) = (1/|S|) · Σ_{i∈S} H(φ(x_i^(c), y_i^(c), θ))^(-1) · ∇φ(x_i^(c), y_i^(c), θ) / p(i)

where g_s(θ) is the second second-order average gradient; S is the cleaned sample data obtained in this round; c is short for clean, indicating that the sample data used to compute the second second-order average gradient is clean data; p(i) is the sampling probability of the i-th dirty sample drawn in this round; φ(·) denotes the loss function; H(φ(·)) denotes the second-order partial-derivative matrix of the loss function; x_i^(c) is the i-th input datum in the cleaned sample data obtained in this round; y_i^(c) is the i-th output datum in the cleaned sample data obtained in this round; θ is the current model parameters; H(φ(x_i^(c), y_i^(c), θ))^(-1) is the inverse of the loss function's second second-order partial-derivative matrix; ∇φ(x_i^(c), y_i^(c), θ) is the second first-order partial derivative of the loss function.
S422: sum the first second-order average gradient and the second second-order average gradient, weighted respectively by the corresponding first weight and second weight, to obtain the overall second-order average gradient of the loss function.
Here, the first weight is the proportion of all sample data accounted for by the existing cleaned sample data before this round of cleaning of the dirty sample data. The second weight is the proportion of all sample data accounted for by the dirty sample data before this round of cleaning.
In one embodiment, the overall second-order average gradient of the loss function can be computed according to the following formula:

    g(θ) = (|R_clean|/|R|) · g_c(θ) + (|R_dirty|/|R|) · g_s(θ)

where g(θ) is the overall second-order average gradient of the loss function; R_clean is the existing clean data before this round of cleaning of the dirty sample data; R is all the sample data; |R_clean| is the quantity of existing clean data before this round of cleaning; |R| is the quantity of all sample data; g_c(θ) is the first second-order average gradient of the loss function, with c short for clean, marking the existing clean data before this round of cleaning; R_dirty is the dirty sample data before this round of cleaning; |R_dirty| is the quantity of dirty sample data before this round of cleaning; g_s(θ) is the second second-order average gradient of the loss function, with s short for sample. Here |R| = |R_dirty| + |R_clean|; |R_clean|/|R| is the first weight and |R_dirty|/|R| is the second weight.
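The weighted combination can be written directly in code. The sketch below assumes g_c and g_s have already been computed; the numeric values are illustrative placeholders only.

```python
import numpy as np

def overall_gradient(g_c, g_s, n_clean, n_dirty):
    # Overall second-order average gradient: weight g_c by |R_clean|/|R|
    # and g_s by |R_dirty|/|R|, where |R| = |R_clean| + |R_dirty|.
    n_total = n_clean + n_dirty
    return (n_clean / n_total) * g_c + (n_dirty / n_total) * g_s

g_c = np.array([0.2, -0.4])   # first second-order average gradient (illustrative)
g_s = np.array([0.6,  0.0])   # second second-order average gradient (illustrative)
g = overall_gradient(g_c, g_s, n_clean=80, n_dirty=20)
```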
S424: adjust the current model parameters according to the overall second-order average gradient.
In one embodiment, the current model parameters can be adjusted according to the following formula:

    θ_new = θ^(d) − γ · g(θ^(d))

where θ_new is the adjusted model parameters; θ^(d) is the current model parameters; γ is the learning rate; g(θ^(d)) is the overall second-order average gradient of the loss function under the current model parameters θ^(d).
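The adjustment itself is a single step. In the sketch below the learning rate and the gradient values are illustrative placeholders, not values from the patent.

```python
import numpy as np

def update_parameters(theta, g, learning_rate=0.5):
    # Newton-style parameter adjustment: theta_new = theta - gamma * g(theta),
    # where g is the overall second-order average gradient.
    return theta - learning_rate * g

theta = np.array([1.0, -2.0])        # current model parameters (illustrative)
g = np.array([0.28, -0.32])          # overall second-order average gradient (illustrative)
theta_new = update_parameters(theta, g)
```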
S426: determine whether the adjusted model parameters meet the training termination condition; if not, take the next round as the current round and return to step S402; if so, proceed to step S428.
S428: obtain user feature data, input the user feature data into the machine learning model with the adjusted model parameters, and output the user profile label.
In the machine learning model training method above, the first second-order average gradient of the loss function is computed from the existing clean data, and the second second-order average gradient of the loss function is computed from the cleaned sample data obtained in this round; from these, the overall second-order average gradient of the loss function under the current model parameters is obtained, and the current model parameters are updated accordingly. Updating the model parameters with a second-order average gradient makes the model converge faster than gradient descent, reducing the number of iterative updates required and, in turn, the machine resources consumed in updating the model parameters.
In addition, the electronic device computes the overall second-order average gradient of the loss function from the existing cleaned sample data and the cleaned sample data obtained in this round. This guarantees that the model parameters are adjusted based on clean sample data and avoids the errors introduced by adjusting model parameters on a mixture of clean and dirty data, improving the accuracy of model parameter adjustment.
Furthermore, when computing the second second-order average gradient of the loss function, the improvement in model accuracy brought by cleaning the dirty sample data is taken into account, making the computed second second-order average gradient of the loss function more accurate.
Finally, outputting the user profile label corresponding to the user feature data with the user-profile machine learning model whose parameters meet the training termination condition improves the accuracy of the output user profile labels.
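Steps S402–S426 can be assembled into a single loop. The sketch below is a toy end-to-end run under stated assumptions: a regularized squared loss, "cleaning" that simply recovers the true label, neutral sampling probabilities (p(i) taken as 1, so the importance weights drop out), and an illustrative termination test on the update norm. It shows the alternation the method describes — clean a batch, compute g_c on previously cleaned data and g_s on this round's batch, combine them by the |R_clean|/|R| and |R_dirty|/|R| weights, and take a Newton-style step — and is not the patent's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(2)

def grad(x, y, theta, lam=0.1):
    # First-order partial derivative of the assumed regularized squared loss
    return (x @ theta - y) * x + lam * theta

def hess(x, lam=0.1):
    # Second-order partial-derivative matrix; lam * I keeps it invertible
    return np.outer(x, x) + lam * np.eye(x.size)

def newton_term(x, y, theta):
    # H(phi)^-1 * grad(phi) for one sample
    return np.linalg.solve(hess(x), grad(x, y, theta))

d = 3
theta_true = rng.normal(size=d)
X = rng.normal(size=(40, d))
y_true = X @ theta_true                              # labels recovered by "cleaning"
y_dirty = y_true + rng.normal(scale=2.0, size=40)    # corrupted labels (pre-cleaning)
dirty_idx = list(range(40))                          # R_dirty: indices not yet cleaned
clean_X, clean_y = [], []                            # R_clean: grows each round

theta = np.zeros(d)
for _ in range(20):                                  # training rounds
    n_c, n_d = len(clean_X), len(dirty_idx)          # |R_clean|, |R_dirty| before cleaning
    # S410: draw and "clean" part of the dirty data
    batch = [dirty_idx.pop() for _ in range(min(5, n_d))]
    # S402-S408: first second-order average gradient over existing clean data
    g_c = (np.mean([newton_term(x, y, theta) for x, y in zip(clean_X, clean_y)], axis=0)
           if clean_X else np.zeros(d))
    # S412-S420: second second-order average gradient over this round's batch;
    # p(i) is taken as 1.0 here (neutral importance weight, a simplification)
    g_s = (np.mean([newton_term(X[i], y_true[i], theta) for i in batch], axis=0)
           if batch else np.zeros(d))
    clean_X += [X[i] for i in batch]
    clean_y += [y_true[i] for i in batch]
    # S422: overall second-order average gradient; S424: adjust parameters
    g = (n_c * g_c + n_d * g_s) / (n_c + n_d)
    theta = theta - 0.8 * g
    # S426: illustrative termination condition
    if np.linalg.norm(g) < 1e-6 and not dirty_idx:
        break
```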
As shown in Figure 5, in one embodiment, a machine learning model training device 500 is provided. The device 500 includes a sample data acquisition module 502, a second-order average gradient determination module 504, and a model parameter adjustment module 506, where:
The sample data acquisition module 502 is configured to obtain the existing cleaned sample data from before this round of cleaning of the dirty sample data.
The second-order average gradient determination module 504 is configured to determine the first second-order average gradient of the loss function of the machine learning model according to the existing cleaned sample data and the current model parameters of the machine learning model.
The sample data acquisition module 502 is further configured to obtain the cleaned sample data produced in this round by cleaning part of the dirty sample data sampled from the dirty sample data.
The second-order average gradient determination module 504 is further configured to determine the second second-order average gradient of the loss function according to the cleaned sample data obtained in this round and the current model parameters, and to obtain the overall second-order average gradient of the loss function from the first second-order average gradient and the second second-order average gradient.
The model parameter adjustment module 506 is configured to adjust the current model parameters according to the overall second-order average gradient and, when the adjusted model parameters do not meet the training termination condition, to take the next round as the current round and notify the sample data acquisition module 502 to operate, until the adjusted model parameters meet the training termination condition.
In one embodiment, the second-order average gradient determination module 504 is further configured to substitute the existing cleaned sample data and the current model parameters of the machine learning model into the loss function; to compute the first first-order partial derivative and the first second-order partial-derivative matrix of the loss function with the existing cleaned sample data and the current model parameters substituted in; and to determine the first second-order average gradient of the loss function of the machine learning model from the inverse of the first second-order partial-derivative matrix and the first first-order partial derivative.
In one embodiment, the second-order average gradient determination module 504 is further configured to compute the first second-order average gradient of the loss function of the machine learning model according to the following formula:

    g_c(θ) = (1/|R_clean|) · Σ_{i∈R_clean} H(φ(x_i^(c), y_i^(c), θ))^(-1) · ∇φ(x_i^(c), y_i^(c), θ)

where g_c(θ) is the first second-order average gradient; c is short for clean, indicating that the sample data used to compute the first second-order average gradient is clean data; R_clean is the existing cleaned sample data; φ(·) denotes the loss function; H(φ(·)) denotes the second-order partial-derivative matrix of the loss function; x_i^(c) is the i-th input datum in the existing cleaned sample data; y_i^(c) is the i-th output datum in the existing cleaned sample data; θ is the current model parameters; H(φ(x_i^(c), y_i^(c), θ))^(-1) is the inverse of the loss function's first second-order partial-derivative matrix; ∇φ(x_i^(c), y_i^(c), θ) is the first first-order partial derivative of the loss function.
In one embodiment, the second-order average gradient determination module 504 is further configured to substitute the cleaned sample data obtained in this round and the current model parameters into the loss function; to compute the second first-order partial derivative and the second second-order partial-derivative matrix of the loss function with the cleaned sample data obtained in this round and the current model parameters substituted in; and to determine the second second-order average gradient of the loss function from the inverse of the second second-order partial-derivative matrix and the second first-order partial derivative.
In one embodiment, the second-order average gradient determination module 504 is further configured to obtain the sampling probability of the dirty sample data corresponding to each piece of cleaned sample data obtained in this round; for each piece of cleaned sample data obtained in this round, to compute the ratio of the product of the corresponding inverse second-order partial-derivative matrix and second first-order partial derivative to the sampling probability of the corresponding dirty sample data; and to compute the average of the ratios to obtain the second second-order average gradient of the loss function.
In one embodiment, the second-order average gradient determination module 504 is further configured to compute the second second-order average gradient of the loss function according to the following formula:

    g_s(θ) = (1/|S|) · Σ_{i∈S} H(φ(x_i^(c), y_i^(c), θ))^(-1) · ∇φ(x_i^(c), y_i^(c), θ) / p(i)

where g_s(θ) is the second second-order average gradient; S is the cleaned sample data obtained in this round; c is short for clean, indicating that the sample data used to compute the second second-order average gradient is clean data; p(i) is the sampling probability of the i-th dirty sample drawn in this round; φ(·) denotes the loss function; H(φ(·)) denotes the second-order partial-derivative matrix of the loss function; x_i^(c) is the i-th input datum in the cleaned sample data obtained in this round; y_i^(c) is the i-th output datum in the cleaned sample data obtained in this round; θ is the current model parameters; H(φ(x_i^(c), y_i^(c), θ))^(-1) is the inverse of the loss function's second second-order partial-derivative matrix; ∇φ(x_i^(c), y_i^(c), θ) is the second first-order partial derivative of the loss function.
As shown in Figure 6, in one embodiment, the dirty sample data includes user-feature sample data and correspondingly labeled user profile labels, and the device 500 further includes:
A user profile label output module 508, configured to obtain user feature data after the adjusted model parameters meet the training termination condition, input the user feature data into the machine learning model with the adjusted model parameters, and output the user profile label.
It should be noted that the terms "first" and "second" used in this application are only for distinction and are not used to limit order, size, subordination, or the like.
One of ordinary skill in the art will appreciate that all or part of the flow of the methods in the above embodiments can be implemented by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, can include the flows of the embodiments of the methods above. The storage medium can be a non-volatile storage medium such as a magnetic disk, an optical disc, or a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM), or the like.
The technical features of the above embodiments can be combined arbitrarily. For brevity of description, not all possible combinations of the technical features of the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it shall be considered within the scope of this specification.
The above embodiments express only several implementations of the present invention, and their description is relatively specific and detailed, but they are not to be construed as limiting the scope of the patent. It should be pointed out that, for those of ordinary skill in the art, various modifications and improvements can be made without departing from the concept of the present invention, all of which fall within the protection scope of the present invention. Therefore, the protection scope of the present patent shall be determined by the appended claims.
Claims (15)
1. A machine learning model training method, comprising:
obtaining existing cleaned sample data from before a current round of cleaning of dirty sample data;
determining a first second-order average gradient of a loss function of a machine learning model according to the existing cleaned sample data and current model parameters of the machine learning model;
obtaining cleaned sample data produced in the current round by cleaning part of the dirty sample data sampled from the dirty sample data;
determining a second second-order average gradient of the loss function according to the cleaned sample data obtained in the current round and the current model parameters;
obtaining an overall second-order average gradient of the loss function according to the first second-order average gradient and the second second-order average gradient;
adjusting the current model parameters according to the overall second-order average gradient; and
when the adjusted model parameters do not meet a training termination condition, taking a next round as the current round and returning to the step of obtaining the existing cleaned sample data from before the current round of cleaning of the dirty sample data to continue training, until the adjusted model parameters meet the training termination condition.
2. The method according to claim 1, wherein determining the first second-order average gradient of the loss function of the machine learning model according to the existing cleaned sample data and the current model parameters of the machine learning model comprises:
substituting the existing cleaned sample data and the current model parameters of the machine learning model into the loss function;
computing a first first-order partial derivative and a first second-order partial-derivative matrix of the loss function with the existing cleaned sample data and the current model parameters substituted in; and
determining the first second-order average gradient of the loss function of the machine learning model according to an inverse of the first second-order partial-derivative matrix and the first first-order partial derivative.
3. The method according to claim 2, wherein determining the first second-order average gradient of the loss function of the machine learning model according to the inverse of the first second-order partial-derivative matrix and the first first-order partial derivative comprises:
computing the first second-order average gradient of the loss function of the machine learning model according to the following formula:
    g_c(θ) = (1/|R_clean|) · Σ_{i∈R_clean} H(φ(x_i^(c), y_i^(c), θ))^(-1) · ∇φ(x_i^(c), y_i^(c), θ)
wherein g_c(θ) is the first second-order average gradient; c is short for clean, indicating that the sample data used to compute the first second-order average gradient is clean data; R_clean is the existing cleaned sample data; φ(·) denotes the loss function; H(φ(·)) denotes the second-order partial-derivative matrix of the loss function; x_i^(c) is the i-th input datum in the existing cleaned sample data; y_i^(c) is the i-th output datum in the existing cleaned sample data; θ is the current model parameters; H(φ(x_i^(c), y_i^(c), θ))^(-1) is the inverse of the first second-order partial-derivative matrix of the loss function; and ∇φ(x_i^(c), y_i^(c), θ) is the first first-order partial derivative of the loss function.
4. The method according to claim 1, wherein determining the second second-order average gradient of the loss function according to the cleaned sample data obtained in the current round and the current model parameters comprises:
substituting the cleaned sample data obtained in the current round and the current model parameters into the loss function;
computing a second first-order partial derivative and a second second-order partial-derivative matrix of the loss function with the cleaned sample data obtained in the current round and the current model parameters substituted in; and
determining the second second-order average gradient of the loss function according to an inverse of the second second-order partial-derivative matrix and the second first-order partial derivative.
5. The method according to claim 4, wherein determining the second second-order average gradient of the loss function according to the inverse of the second second-order partial-derivative matrix and the second first-order partial derivative comprises:
obtaining a sampling probability of the dirty sample data corresponding to each piece of cleaned sample data obtained in the current round;
for each piece of cleaned sample data obtained in the current round, computing a ratio of the product of the corresponding inverse second-order partial-derivative matrix and second first-order partial derivative to the sampling probability of the corresponding dirty sample data; and
computing the average of the ratios to obtain the second second-order average gradient of the loss function.
6. The method according to claim 5, wherein computing the average of the ratios to obtain the second second-order average gradient of the loss function comprises:
computing the second second-order average gradient of the loss function according to the following formula:
    g_s(θ) = (1/|S|) · Σ_{i∈S} H(φ(x_i^(c), y_i^(c), θ))^(-1) · ∇φ(x_i^(c), y_i^(c), θ) / p(i)
wherein g_s(θ) is the second second-order average gradient; S is the cleaned sample data obtained in the current round; c is short for clean, indicating that the sample data used to compute the second second-order average gradient is clean data; p(i) is the sampling probability of the i-th dirty sample drawn in the current round; φ(·) denotes the loss function; H(φ(·)) denotes the second-order partial-derivative matrix of the loss function; x_i^(c) is the i-th input datum in the cleaned sample data obtained in the current round; y_i^(c) is the i-th output datum in the cleaned sample data obtained in the current round; θ is the current model parameters; H(φ(x_i^(c), y_i^(c), θ))^(-1) is the inverse of the second second-order partial-derivative matrix of the loss function; and ∇φ(x_i^(c), y_i^(c), θ) is the second first-order partial derivative of the loss function.
7. The method according to claim 1, wherein obtaining the overall second-order average gradient of the loss function according to the first second-order average gradient and the second second-order average gradient comprises:
summing the first second-order average gradient and the second second-order average gradient, weighted respectively by a corresponding first weight and second weight, to obtain the overall second-order average gradient of the loss function;
wherein the first weight is the proportion of all sample data accounted for by the existing cleaned sample data before the current round of cleaning of the dirty sample data; and
the second weight is the proportion of all sample data accounted for by the dirty sample data before the current round of cleaning.
8. The method according to any one of claims 1 to 7, wherein the dirty sample data comprises user-feature sample data and correspondingly labeled user profile labels, and the method further comprises:
after the adjusted model parameters meet the training termination condition, obtaining user feature data, inputting the user feature data into the machine learning model with the adjusted model parameters, and outputting a user profile label.
9. A machine learning model training device, wherein the device comprises:
a sample data acquisition module, configured to obtain existing cleaned sample data from before a current round of cleaning of dirty sample data;
a second-order average gradient determination module, configured to determine a first second-order average gradient of a loss function of a machine learning model according to the existing cleaned sample data and current model parameters of the machine learning model;
the sample data acquisition module being further configured to obtain cleaned sample data produced in the current round by cleaning part of the dirty sample data sampled from the dirty sample data;
the second-order average gradient determination module being further configured to determine a second second-order average gradient of the loss function according to the cleaned sample data obtained in the current round and the current model parameters, and to obtain an overall second-order average gradient of the loss function according to the first second-order average gradient and the second second-order average gradient; and
a model parameter adjustment module, configured to adjust the current model parameters according to the overall second-order average gradient and, when the adjusted model parameters do not meet a training termination condition, to take a next round as the current round and notify the sample data acquisition module to operate, until the adjusted model parameters meet the training termination condition.
10. The device according to claim 9, wherein the second-order average gradient determination module is further configured to substitute the existing cleaned sample data and the current model parameters of the machine learning model into the loss function; to compute a first first-order partial derivative and a first second-order partial-derivative matrix of the loss function with the existing cleaned sample data and the current model parameters substituted in; and to determine the first second-order average gradient of the loss function of the machine learning model according to an inverse of the first second-order partial-derivative matrix and the first first-order partial derivative.
11. The device according to claim 10, wherein the second-order average gradient determination module is further configured to compute the first second-order average gradient of the loss function of the machine learning model according to the following formula:
    g_c(θ) = (1/|R_clean|) · Σ_{i∈R_clean} H(φ(x_i^(c), y_i^(c), θ))^(-1) · ∇φ(x_i^(c), y_i^(c), θ)
wherein g_c(θ) is the first second-order average gradient; c is short for clean, indicating that the sample data used to compute the first second-order average gradient is clean data; R_clean is the existing cleaned sample data; φ(·) denotes the loss function; H(φ(·)) denotes the second-order partial-derivative matrix of the loss function; x_i^(c) is the i-th input datum in the existing cleaned sample data; y_i^(c) is the i-th output datum in the existing cleaned sample data; θ is the current model parameters; H(φ(x_i^(c), y_i^(c), θ))^(-1) is the inverse of the first second-order partial-derivative matrix of the loss function; and ∇φ(x_i^(c), y_i^(c), θ) is the first first-order partial derivative of the loss function.
12. The device according to claim 9, wherein the second-order average gradient determination module is further configured to substitute the cleaned sample data obtained in the current round and the current model parameters into the loss function; to compute a second first-order partial derivative and a second second-order partial-derivative matrix of the loss function with the cleaned sample data obtained in the current round and the current model parameters substituted in; and to determine the second second-order average gradient of the loss function according to an inverse of the second second-order partial-derivative matrix and the second first-order partial derivative.
13. The device according to claim 12, wherein the second-order average gradient determination module is further configured to obtain a sampling probability of the dirty sample data corresponding to each piece of cleaned sample data obtained in the current round; for each piece of cleaned sample data obtained in the current round, to compute a ratio of the product of the corresponding inverse second-order partial-derivative matrix and second first-order partial derivative to the sampling probability of the corresponding dirty sample data; and to compute the average of the ratios to obtain the second second-order average gradient of the loss function.
14. The device according to claim 13, wherein the second-order average gradient determination module is further configured to compute the second second-order average gradient of the loss function according to the following formula:
    g_s(θ) = (1/|S|) · Σ_{i∈S} H(φ(x_i^(c), y_i^(c), θ))^(-1) · ∇φ(x_i^(c), y_i^(c), θ) / p(i)
wherein g_s(θ) is the second second-order average gradient; S is the cleaned sample data obtained in the current round; c is short for clean, indicating that the sample data used to compute the second second-order average gradient is clean data; p(i) is the sampling probability of the i-th dirty sample drawn in the current round; φ(·) denotes the loss function; H(φ(·)) denotes the second-order partial-derivative matrix of the loss function; x_i^(c) is the i-th input datum in the cleaned sample data obtained in the current round; y_i^(c) is the i-th output datum in the cleaned sample data obtained in the current round; θ is the current model parameters; H(φ(x_i^(c), y_i^(c), θ))^(-1) is the inverse of the second second-order partial-derivative matrix of the loss function; and ∇φ(x_i^(c), y_i^(c), θ) is the second first-order partial derivative of the loss function.
15. The device according to any one of claims 9 to 14, wherein the dirty sample data comprises user feature sample data and correspondingly calibrated user-portrait labels; and the device further comprises: a user-portrait label output module configured to, after the adjusted model parameters meet the training termination condition, obtain user feature data, input the user feature data into the machine learning model with the adjusted model parameters, and output a user-portrait label.
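As a minimal illustration of the inference step in claim 15 (the function name, the binary logistic form, and the label names are assumptions of this sketch, not claimed by the patent):

```python
import numpy as np

def output_user_portrait_label(theta, user_features,
                               labels=("label_a", "label_b")):
    """Apply trained model parameters theta to user feature data and emit a
    user-portrait label. Binary logistic sketch; label names are placeholders."""
    score = 1.0 / (1.0 + np.exp(-np.dot(theta, user_features)))
    return labels[int(score >= 0.5)]
```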
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710344182.8A CN108320026B (en) | 2017-05-16 | 2017-05-16 | Machine learning model training method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108320026A true CN108320026A (en) | 2018-07-24 |
CN108320026B CN108320026B (en) | 2022-02-11 |
Family
ID=62892248
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710344182.8A Active CN108320026B (en) | 2017-05-16 | 2017-05-16 | Machine learning model training method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108320026B (en) |
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2015078185A1 (en) * | 2013-11-29 | 2015-06-04 | 华为技术有限公司 | Convolutional neural network and target object detection method based on same |
CN103646019A (en) * | 2013-12-31 | 2014-03-19 | 哈尔滨理工大学 | Method and device for fusing multiple machine translation systems |
CN104809139A (en) * | 2014-01-29 | 2015-07-29 | 日本电气株式会社 | Code file query method and device |
CN106062786A (en) * | 2014-09-12 | 2016-10-26 | 微软技术许可有限责任公司 | Computing system for training neural networks |
WO2016062044A1 (en) * | 2014-10-24 | 2016-04-28 | 华为技术有限公司 | Model parameter training method, device and system |
CN106295460A (en) * | 2015-05-12 | 2017-01-04 | 株式会社理光 | The detection method of people and equipment |
CN105678740A (en) * | 2015-12-30 | 2016-06-15 | 完美幻境(北京)科技有限公司 | Camera geometrical calibration processing method and apparatus |
CN105931224A (en) * | 2016-04-14 | 2016-09-07 | 浙江大学 | Pathology identification method for routine scan CT image of liver based on random forests |
CN105844706A (en) * | 2016-04-19 | 2016-08-10 | 浙江大学 | Full-automatic three-dimensional hair modeling method based on single image |
CN106548210A (en) * | 2016-10-31 | 2017-03-29 | 腾讯科技(深圳)有限公司 | Machine learning model training method and device |
Non-Patent Citations (2)
Title |
---|
朱斐等: "一种解决连续空间问题的真实在线自然梯度AC算法", 《软件学报》 * |
谢锦等: "基于图像不变特征深度学习的交通标志分类", 《计算机辅助设计与图形学学报》 * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112703511A (en) * | 2018-09-27 | 2021-04-23 | 华为技术有限公司 | Operation accelerator and data processing method |
CN112703511B (en) * | 2018-09-27 | 2023-08-25 | 华为技术有限公司 | Operation accelerator and data processing method |
CN109710793A (en) * | 2018-12-25 | 2019-05-03 | 科大讯飞股份有限公司 | A kind of Hash parameter determines method, apparatus, equipment and storage medium |
WO2020153934A1 (en) * | 2019-01-21 | 2020-07-30 | Hewlett-Packard Development Company, L.P. | Fault prediction model training with audio data |
US11409589B1 (en) | 2019-10-23 | 2022-08-09 | Relativity Oda Llc | Methods and systems for determining stopping point |
US11921568B2 (en) | 2019-10-23 | 2024-03-05 | Relativity Oda Llc | Methods and systems for determining stopping point |
CN113625175A (en) * | 2021-10-11 | 2021-11-09 | 北京理工大学深圳汽车研究院(电动车辆国家工程实验室深圳研究院) | SOC estimation method and system based on cloud big data platform |
Also Published As
Publication number | Publication date |
---|---|
CN108320026B (en) | 2022-02-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108320026A (en) | Machine learning model training method and device | |
CN107358293B (en) | Neural network training method and device | |
TWI689871B (en) | Gradient lifting decision tree (GBDT) model feature interpretation method and device | |
CN106529569B (en) | Threedimensional model triangular facet feature learning classification method and device based on deep learning | |
CN106951499B (en) | A kind of knowledge mapping representation method based on translation model | |
CN108229267A (en) | Object properties detection, neural metwork training, method for detecting area and device | |
CN106611052A (en) | Text label determination method and device | |
CN108596774A (en) | Socialization information recommendation algorithm based on profound internet startup disk feature and system | |
CN109523018A (en) | A kind of picture classification method based on depth migration study | |
CN109583468A (en) | Training sample acquisition methods, sample predictions method and corresponding intrument | |
CN109299258A (en) | A kind of public sentiment event detecting method, device and equipment | |
CN107943874A (en) | Knowledge mapping processing method, device, computer equipment and storage medium | |
CN110276442A (en) | A kind of searching method and device of neural network framework | |
CN111291165B (en) | Method and device for embedding training word vector into model | |
CN107003834B (en) | Pedestrian detection device and method | |
CN105931271B (en) | A kind of action trail recognition methods of the people based on variation BP-HMM | |
CN108804577B (en) | Method for estimating interest degree of information tag | |
CN111210111B (en) | Urban environment assessment method and system based on online learning and crowdsourcing data analysis | |
CN109189921A (en) | Comment on the training method and device of assessment models | |
CN108228684A (en) | Training method, device, electronic equipment and the computer storage media of Clustering Model | |
CN103617146B (en) | A kind of machine learning method and device based on hardware resource consumption | |
Ozturk | Parametric estimation of location and scale parameters in ranked set sampling | |
CN109213831A (en) | Event detecting method and device calculate equipment and storage medium | |
CN103729431B (en) | Massive microblog data distributed classification device and method with increment and decrement function | |
CN105045906B (en) | The predictor method and device of impression information clicking rate |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||