CN107506775A - model training method and device - Google Patents
- Publication number
- CN107506775A CN107506775A CN201610421438.6A CN201610421438A CN107506775A CN 107506775 A CN107506775 A CN 107506775A CN 201610421438 A CN201610421438 A CN 201610421438A CN 107506775 A CN107506775 A CN 107506775A
- Authority
- CN
- China
- Prior art keywords
- learning model
- deep learning
- model
- training
- target domain
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24147—Distances to closest patterns, e.g. nearest neighbour classification
Abstract
This application discloses a model training method and device. The method includes: obtaining a source-domain deep learning model; obtaining training data of a target domain; adjusting the weight parameters of the source-domain deep learning model using the target-domain training data, to obtain a target-domain deep learning model; extracting data features of the training data using the target-domain deep learning model; and training a k-nearest-neighbor classification model with the data features, to obtain a recognition model. The embodiments of the present application reduce model training cost, ensure model accuracy, and reduce the risk of overfitting.
Description
Technical field
The present application belongs to the technical field of data recognition, and in particular relates to a model training method and device based on deep learning and k-nearest-neighbor classification.
Background
In practical applications, data such as images, sound, and text often need to be recognized so that corresponding operations can be performed according to the recognition result. For example, image recognition is performed on image data to identify image categories and thereby classify images; voice recognition is performed on sound data to determine a user's age, gender, and so on.
At present, recognition of data such as images, sound, and text is typically realized using a recognition model, so a recognition model must first be trained.
Taking image recognition as an example, the recognition model is an image classifier. To train an image classifier, sample images must be obtained and their image features extracted; the image classifier is then trained using these image features. To improve the comprehensiveness and accuracy of the feature representation, image features are usually extracted with a deep learning model, so obtaining an image classifier requires first training a deep learning model. However, training a deep learning model usually requires a large amount of training data, and collecting large amounts of training data is typically time-consuming and expensive. If only a small amount of training data is used, the resulting deep learning model is inaccurate, the image classifier is correspondingly inaccurate, and a risk of overfitting arises.
Content of the invention
In view of this, the technical problem to be solved by the present application is to provide a model training method and device that address the inaccuracy of trained models and the risk of overfitting in the prior art.
To solve the above technical problem, the present application discloses a model training method, including:
obtaining a source-domain deep learning model;
obtaining training data of a target domain;
adjusting weight parameters of the source-domain deep learning model using the target-domain training data, to obtain a target-domain deep learning model;
extracting data features of the training data using the target-domain deep learning model;
training a k-nearest-neighbor classification model with the data features, to obtain a recognition model.
Preferably, training the k-nearest-neighbor classification model with the data features to obtain the recognition model includes:
performing feature dimension reduction on the data features, to obtain low-dimensional features;
training the k-nearest-neighbor classification model with the low-dimensional features, to obtain the recognition model.
Preferably, adjusting the weight parameters of the source-domain deep learning model using the target-domain training data to obtain the target-domain deep learning model includes:
taking the weight parameters of the source-domain deep learning model as initial parameters;
setting a learning rate lower than a preset rate;
adjusting the weight parameters of the source-domain deep learning model using the target-domain training data and the learning rate, to obtain the target-domain deep learning model.
Preferably, adjusting the weight parameters of the source-domain deep learning model using the target-domain training data and the learning rate to obtain the target-domain deep learning model includes:
repeatedly adjusting the weight parameters of the source-domain deep learning model using the target-domain training data and the learning rate, with the number of adjustments kept below a preset number, to obtain the target-domain deep learning model.
Preferably, obtaining the source-domain deep learning model includes:
obtaining a source-domain deep learning model that categorically matches the target domain.
A model training device, including:
a model obtaining module, configured to obtain a source-domain deep learning model;
a data obtaining module, configured to obtain training data of a target domain;
a learning-model training module, configured to adjust weight parameters of the source-domain deep learning model using the target-domain training data, to obtain a target-domain deep learning model;
a feature extraction module, configured to extract data features of the training data using the target-domain deep learning model;
a recognition-model training module, configured to train a k-nearest-neighbor classification model with the data features, to obtain a recognition model.
Preferably, the recognition-model training module includes:
a dimension reduction unit, configured to perform feature dimension reduction on the data features, to obtain low-dimensional features;
a recognition-model training unit, configured to train the k-nearest-neighbor classification model with the low-dimensional features, to obtain the recognition model.
Preferably, the learning-model training module includes:
a parameter setting unit, configured to take the weight parameters of the source-domain deep learning model as initial parameters and set a learning rate lower than a preset rate;
a learning-model training unit, configured to adjust the weight parameters of the source-domain deep learning model using the target-domain training data and the learning rate, to obtain the target-domain deep learning model.
Preferably, the learning-model training unit is specifically configured to repeatedly adjust the weight parameters of the source-domain deep learning model using the target-domain training data and the learning rate, with the number of adjustments kept below a preset number, to obtain the target-domain deep learning model.
Preferably, the model obtaining module is specifically configured to obtain a source-domain deep learning model that categorically matches the target domain.
Compared with the prior art, the present application can achieve the following technical effects:
A target-domain deep learning model is obtained by transfer learning from a source-domain deep learning model and target-domain training data. Because the source-domain deep learning model has already been fully trained, a fully trained target-domain deep learning model can be obtained by adjusting the weight parameters of the source-domain model even when only a small amount of target-domain training data is selected. The target-domain deep learning model is then used to extract data features for training the recognition model, and choosing a k-nearest-neighbor classification model as the recognition model reduces the risk of overfitting. The embodiments of the present application therefore realize model training based on limited training data, reducing training cost while ensuring model accuracy and reducing overfitting risk.
Of course, a product implementing the present application need not achieve all of the above technical effects at the same time.
Brief description of the drawings
The accompanying drawings described here provide a further understanding of the present application and form a part of it. The schematic embodiments and their descriptions are used to explain the application and do not constitute an improper limitation of it. In the drawings:
Fig. 1 is a flow chart of one embodiment of a model training method according to the embodiments of the present application;
Fig. 2 is a flow chart of one embodiment of the deep learning model training process of the embodiments of the present application;
Fig. 3 is a structural diagram of one embodiment of a model training device according to the embodiments of the present application.
Detailed description
The embodiments of the present application are described in detail below with reference to the drawings and examples, so that how the application applies technical means to solve technical problems and achieve technical effects can be fully understood and implemented.
The technical solution of the embodiments of the present application is mainly applied to data recognition, in particular image recognition and voice recognition. To recognize data, a recognition model must first be trained; extracted data features are then input into the recognition model to perform data recognition.
To reduce the cost of collecting training data and alleviate overfitting of the recognition model, the inventor found that a fully trained source-domain deep learning model, together with a smaller amount of training data, can be used to obtain, through transfer learning, a fully trained deep learning model of the target domain and thereby extract data features. This solves the costly and time-consuming problem caused by the need to obtain a large amount of training data. The data features are then input into a k-nearest-neighbor classification model, training of which yields the recognition model and alleviates its overfitting problem.
The inventor therefore proposes the technical solution of the present application. In the embodiments of the present application, a source-domain deep learning model and training data of a target domain are obtained; the weight parameters of the source-domain deep learning model are adjusted using the target-domain training data to obtain a target-domain deep learning model; the data features of the training data are extracted using the target-domain deep learning model; and a k-nearest-neighbor classification model is trained with the data features to obtain a recognition model. Through transfer learning, the deep learning model of the target domain can be trained using only a small amount of target-domain training data together with the source-domain deep learning model, so accurate deep learning and recognition models can be obtained even with limited training data, reducing the cost of collecting data while ensuring model accuracy. Using the k-nearest-neighbor classification model further avoids the overfitting problem caused by training a deep learning model with a small amount of data.
The technical solution of the present application is described in detail below with reference to the accompanying drawings.
Fig. 1 is a flow chart of one embodiment of a model training method according to the embodiments of the present application. The method may include the following steps:
101: Obtain a source-domain deep learning model.
The source-domain deep learning model can be trained by a variety of methods, for example a deep convolutional neural network, an AutoEncoder (an unsupervised learning algorithm), or a DBM (Deep Boltzmann Machine).
The source-domain deep learning model is illustrated below by taking a deep convolutional neural network as an example.
Assume the source-domain deep convolutional neural network is configured as shown in Fig. 2, mainly including 2 convolution layers (convolution1 and convolution2), 5 pooling layers (pooling1 to pooling5), 9 Inception layers (inception1 to inception9), 3 fully-connected (full-connection) layers (full-connection1 to full-connection3), 3 softmax layers (softmax1 to softmax3), and 1 dropout layer (dropout1). The softmax1, softmax2, full-connection1, and full-connection2 layers are added mainly to prevent the gradient from vanishing during BP (Back Propagation) training.
To obtain a more accurate deep learning model, the weight parameters can be initialized with random numbers and the LearningRate (learning rate) set to a suitable value, e.g. LearningRate = 0.01, so that the model converges quickly. When the classification accuracy stabilizes, the learning rate is reduced and training continues, until the model converges to a good value. The weight parameters of the deep convolutional neural network obtained after training constitute the deep learning model.
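The train-then-reduce-learning-rate schedule described above can be sketched in plain Python on a one-parameter toy model. The quadratic loss, decay factor, and thresholds here are illustrative assumptions, not values from the application:

```python
def train_with_lr_decay(w=5.0, lr=0.01, decay=0.1, tol=1e-8, max_steps=100000):
    """Gradient descent on loss(w) = w**2; when progress stalls,
    reduce the learning rate and continue, as in the training
    schedule for the source-domain model."""
    prev_loss = w * w
    for _ in range(max_steps):
        w -= lr * 2 * w             # gradient of w**2 is 2w
        loss = w * w
        if prev_loss - loss < tol:  # accuracy has stabilized
            if lr < 1e-6:           # converged to a good value
                break
            lr *= decay             # turn the learning rate down and continue
        prev_loss = loss
    return w

w_final = train_with_lr_decay()
print(abs(w_final) < 1e-2)  # True: the weight has converged near the optimum
```

The same plateau-then-decay logic applies unchanged when `w` is the full weight tensor of a deep network; only the loss and gradient computations differ.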
It should be noted that Fig. 2 shows only one possible deep neural network; the application is not limited to it. Any deep neural network capable of extracting data features falls within the protection scope of the application.
102: Obtain training data of the target domain.
Because the source-domain deep learning model is fully trained on a large amount of source-domain data, the embodiments of the present application can perform transfer learning based on it and select only a small amount of target-domain data. For example, the source-domain training data may be a labeled image dataset of 1,000,000 images covering 1,000 categories, where the image labels may be information such as the age and gender of a face, while the target-domain training data may simply be 10,000 labeled images covering 7 categories.
103: Adjust the weight parameters of the source-domain deep learning model using the target-domain training data, to obtain the target-domain deep learning model.
The weight parameters of the source-domain deep learning model can serve as the initial parameters of the target-domain deep learning model; the initial parameters are then adjusted using the target-domain training data to obtain the target-domain deep learning model.
The weight parameters of the source-domain deep learning model can be adjusted by setting the learning rate: using a small learning rate, e.g. one lower than a preset rate, each adjustment decreases or increases a weight parameter by the product of the weight update and the learning rate, until the model converges or a preset adjustment condition is met.
Alternatively, an adjustment parameter can be set, and each weight parameter increased or decreased by that adjustment parameter each time, until the model converges or the preset adjustment condition is met.
Each adjustment of the weight parameters is made on the basis of the parameters obtained from the previous adjustment.
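The fine-tuning idea above, initializing from the source weights and making small, capped adjustments where each update builds on the previous one, can be illustrated on a toy one-weight linear model. The model, data, and hyperparameters are hypothetical stand-ins for the deep networks discussed here:

```python
def gd_fit(w_init, data, lr, steps):
    """Plain gradient descent on mean squared error for the toy model
    y = w * x; each update builds on the previously adjusted weight."""
    w = w_init
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad  # adjust the weight by gradient times learning rate
    return w

# "Source domain": plenty of data, full training from scratch.
source = [(x, 2.0 * x) for x in range(1, 6)]
w_src = gd_fit(0.0, source, lr=0.01, steps=1000)

# "Target domain": only two samples; start from the source weight and
# use a small learning rate with a capped number of adjustments.
target = [(1.0, 2.2), (2.0, 4.4)]
w_tgt = gd_fit(w_src, target, lr=0.001, steps=50)

print(round(w_src, 3), round(w_tgt, 3))  # the fine-tuned weight moves toward 2.2
```

Even with few samples and few updates, the fine-tuned weight moves from the source optimum toward the target optimum, which is the essence of the transfer step.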
104: Extract the data features of the training data using the target-domain deep learning model.
The training data is input into the target-domain deep learning model, where multiple convolution layers and pooling layers apply multi-layer convolution computation and multi-layer pooling to each training sample, extracting its data features. The extracted data features have strong robustness.
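The stacked convolution-and-pooling extraction of step 104 can be sketched in one dimension. The kernels and input signal are illustrative and not taken from the network of Fig. 2:

```python
def conv1d(signal, kernel):
    """Valid 1-D convolution (cross-correlation, as in CNN layers)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def max_pool(signal, size=2):
    """Non-overlapping max pooling over windows of `size`."""
    return [max(signal[i:i + size])
            for i in range(0, len(signal) - size + 1, size)]

# Two stacked convolution + pooling stages, as in the multi-layer extraction.
x = [0, 1, 3, 1, 0, 2, 6, 2, 0]
h1 = max_pool(conv1d(x, [1, 2, 1]))   # first conv + pool: [8, 5, 16]
feat = max_pool(conv1d(h1, [1, -1]))  # second conv + pool: the "data feature"
print(feat)  # [3]
```

Each stage halves the resolution while keeping the strongest responses, which is why the resulting features are compact and robust to small shifts in the input.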
105: Train a k-nearest-neighbor (k-NearestNeighbor, KNN) classification model with the data features, to obtain the recognition model.
The data features extracted by the target-domain deep learning model are input into the k-nearest-neighbor classification model for training, yielding the recognition model. Because the k-nearest-neighbor classification model is a nonparametric model, training the recognition model with it reduces the risk of overfitting.
Besides the k-nearest-neighbor classification model itself, variants based on the k-nearest-neighbor classification model can also be used to obtain the recognition model.
The basic idea of the k-nearest-neighbor classification model is: if most of the k most similar samples (i.e., the nearest samples in feature space) of a sample belong to a certain category, the sample also belongs to that category. In the k-nearest-neighbor classification model, the selected neighbors are correctly classified objects, and the classification decision for a sample to be classified depends only on the categories of the one or several nearest samples. The model is simple, easy to understand, easy to implement, and requires no parameter estimation.
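The majority-vote idea above can be written as a minimal classifier; the toy features and labels stand in for those extracted by the deep learning model:

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Majority vote among the k training samples nearest to `query`
    in feature space; `train` is a list of (feature_vector, label)."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

train = [((0.1, 0.2), "cat"), ((0.2, 0.1), "cat"), ((0.3, 0.2), "cat"),
         ((0.9, 0.8), "dog"), ((0.8, 0.9), "dog"), ((0.7, 0.8), "dog")]
print(knn_classify(train, (0.2, 0.2)))  # cat
print(knn_classify(train, (0.8, 0.8)))  # dog
```

Note that "training" here is just storing the labeled features: no parameters are estimated, which is exactly why the model is nonparametric and less prone to overfitting a small dataset.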
In this embodiment, on the basis of the trained source-domain deep learning model, a fully trained deep learning model of the target domain can be obtained using only a small amount of target-domain training data. Even with limited training data, an accurate target-domain deep learning model can be obtained through transfer learning, which improves the accuracy of the target-domain deep learning model and reduces the cost of collecting training data; combining it with the k-nearest-neighbor classification model to obtain the recognition model alleviates the model's overfitting problem.
Because the data features extracted by the target-domain deep learning model have a high dimensionality, training the recognition model involves a large amount of computation, is slow, and struggles to achieve good performance. Therefore, to reduce the redundancy of the data features and further improve training accuracy, in another embodiment, training the k-nearest-neighbor classification model with the data features to obtain the recognition model may include:
performing feature dimension reduction on the data features, to obtain low-dimensional features;
training the k-nearest-neighbor classification model with the low-dimensional features, to obtain the recognition model.
The data features can be reduced in dimension using methods such as principal component analysis or linear discriminant analysis, to obtain the low-dimensional features. The low-dimensional features and the corresponding data labels are then input into the k-nearest-neighbor classification model for training, yielding the recognition model.
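As a minimal stand-in for full principal component analysis, the leading principal direction can be found by power iteration on the covariance matrix, and the features projected onto it; the data here are illustrative, not from the application:

```python
def top_component(feats, iters=200):
    """Power iteration on the covariance matrix to find the leading
    principal direction, then project each feature onto it
    (a minimal stand-in for full PCA)."""
    n, d = len(feats), len(feats[0])
    mean = [sum(f[i] for f in feats) / n for i in range(d)]
    centered = [[f[i] - mean[i] for i in range(d)] for f in feats]
    cov = [[sum(r[i] * r[j] for r in centered) / n for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        v = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
    # Project each centered feature onto the direction: d dims -> 1 dim.
    return [sum(c[i] * v[i] for i in range(d)) for c in centered], v

feats = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9)]
low_dim, direction = top_component(feats)
print([round(x, 2) for x in direction])  # roughly the diagonal (0.71, 0.70)
```

For this diagonally spread data the leading direction is close to (0.707, 0.707), so the 2-D features collapse to one informative coordinate, exactly the redundancy reduction the embodiment describes.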
In this embodiment, reducing the high-dimensional data features to low-dimensional features improves the training speed of the recognition model.
In addition, in another embodiment, adjusting the weight parameters of the source-domain deep learning model using the target-domain training data to obtain the target-domain deep learning model may include:
taking the weight parameters of the source-domain deep learning model as initial parameters;
setting a learning rate lower than a preset rate;
adjusting the weight parameters of the source-domain deep learning model using the target-domain training data and the learning rate, to obtain the target-domain deep learning model.
The weight parameters of the source-domain deep learning model are adjusted until the model converges or a preset adjustment condition is met, at which point the target-domain deep learning model is obtained. Because too many adjustments would also increase the risk of overfitting, to further reduce that risk, the preset adjustment condition can be that the number of adjustments is below a preset number.
Therefore, in another embodiment, adjusting the weight parameters of the source-domain deep learning model using the target-domain training data and the learning rate to obtain the target-domain deep learning model may include:
repeatedly adjusting the weight parameters of the source-domain deep learning model using the target-domain training data and the learning rate, with the number of adjustments kept below a preset number, to obtain the target-domain deep learning model.
In addition, to ensure the accuracy of the target-domain deep learning model, the source domain and target domain can be two categorically matched fields, such as cats and dogs, or flowers and trees. Therefore, obtaining the source-domain deep learning model can be:
obtaining a deep learning model of a source domain that categorically matches the target domain.
In practical applications, the technical solution of the embodiments of the present application can be used, for example, to perform image recognition on image data, identifying image categories so as to classify images, or to perform voice recognition on sound data, to determine a user's age, gender, and so on.
Fig. 3 is a structural diagram of one embodiment of a model training device provided by the embodiments of the present application. The device may include:
a model obtaining module 301, configured to obtain a source-domain deep learning model;
a data obtaining module 302, configured to obtain training data of a target domain.
Because the source-domain deep learning model is fully trained on a large amount of source-domain data, the embodiments of the present application can select only a small amount of target-domain data and perform transfer learning based on the source-domain deep learning model.
a learning-model training module 303, configured to adjust the weight parameters of the source-domain deep learning model using the target-domain training data, to obtain the target-domain deep learning model.
The learning-model training module takes the weight parameters of the source-domain deep learning model as the initial parameters of the target-domain deep learning model, and then adjusts the initial parameters using the target-domain training data.
The learning-model training module can also set an adjustment parameter and increase or decrease each weight parameter by that adjustment parameter each time, until the model converges or a preset adjustment condition is met, at which point the target-domain deep learning model is obtained.
Each adjustment of the weight parameters is made on the basis of the parameters obtained from the previous adjustment.
a feature extraction module 304, configured to extract the data features of the training data using the target-domain deep learning model.
The training data is input into the target-domain deep learning model, which applies multi-layer convolution computation and multi-layer pooling to each training sample; the feature extraction module extracts the resulting data features.
a recognition-model training module 305, configured to train a k-nearest-neighbor classification model with the data features, to obtain the recognition model.
The data features extracted by the feature extraction module are input into the recognition-model training module and used to train the k-nearest-neighbor classification model, yielding the recognition model. Using the k-nearest-neighbor classification model reduces the risk of overfitting of the recognition model. Besides the k-nearest-neighbor classification model itself, variants based on it can also be used to train the recognition model.
In this embodiment, on the basis of the trained source-domain deep learning model, the deep learning model of the target domain can be obtained using only a small amount of target-domain training data. Even with limited training data, an accurate target-domain deep learning model can be obtained through transfer learning, improving the accuracy of the target-domain deep learning model and reducing the cost of collecting training data; combining it with the k-nearest-neighbor classification model to obtain the data recognition model alleviates the model's overfitting problem.
Because the data features extracted by the target-domain deep learning model have a high dimensionality, in order to reduce their redundancy and further improve training speed and accuracy, in another embodiment the recognition-model training module includes:
a dimension reduction unit, configured to perform feature dimension reduction on the data features, to obtain low-dimensional features;
a recognition-model training unit, configured to train the k-nearest-neighbor classification model with the low-dimensional features, to obtain the recognition model.
The dimension reduction unit can realize the dimension reduction of the data features using methods such as principal component analysis or linear discriminant analysis, to obtain the low-dimensional features. The data features extracted by the feature extraction module are input into the dimension reduction unit of the recognition-model training module to obtain the low-dimensional features; the low-dimensional features and the corresponding data labels are then input into the recognition-model training unit, which trains the k-nearest-neighbor classification model to obtain the recognition model.
In this embodiment, the dimension reduction unit produces low-dimensional features, improving the training speed of the recognition model.
In another embodiment, the target-domain training data is input into the learning-model training module and used for training to obtain the target-domain deep learning model, and the learning-model training module includes:
a parameter setting unit, configured to take the weight parameters of the source-domain deep learning model as initial parameters and set a learning rate lower than a preset rate;
a learning-model training unit, configured to adjust the weight parameters of the source-domain deep learning model using the target-domain training data and the learning rate, to obtain the target-domain deep learning model.
After the parameter setting unit sets the initial weight parameters of the target-domain deep learning model, the learning-model training unit adjusts the weight parameters of the source-domain deep learning model until the model converges or a preset adjustment condition is met, at which point the target-domain deep learning model is obtained. To reduce the risk of overfitting, the number of weight-parameter adjustments can be controlled: the preset adjustment condition can be that the number of adjustments is below a preset number.
Therefore, in another embodiment, the learning-model training unit is specifically configured to repeatedly adjust the weight parameters of the source-domain deep learning model using the target-domain training data and the learning rate, with the number of adjustments kept below the preset number, to obtain the target-domain deep learning model.
In addition, when the source domain and target domain have a categorical matching relationship, such as cats and dogs, or flowers and trees, the target-domain deep learning model obtained has better accuracy. Therefore, the model obtaining module is specifically configured to obtain a source-domain deep learning model that categorically matches the target domain.
In practical applications, the technical solution of the embodiments of the present application can be used, for example, to perform image recognition on image data, identifying image categories so as to classify images, or to perform voice recognition on sound data, to determine a user's age, gender, and so on.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of computer-readable media such as volatile memory, random access memory (RAM), and/or non-volatile memory, e.g. read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and can store information by any method or technology. The information can be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassette, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
Certain terms are used throughout the description and claims to refer to particular components. Those skilled in the art should understand that hardware manufacturers may refer to the same component by different names. This description and the claims distinguish components by functional difference rather than by difference in name. The term "comprising", as used throughout the description and claims, is open-ended and should be interpreted as "including but not limited to". "Substantially" means that, within an acceptable error range, those skilled in the art can solve the stated technical problem and basically achieve the stated technical effect. Furthermore, the term "coupled" here encompasses any direct or indirect electrical coupling; thus, if a first device is coupled to a second device, the first device may be electrically coupled to the second device directly, or indirectly through other devices or coupling means. The subsequent description presents preferred embodiments for implementing the application; it is intended to illustrate the general principles of the application and does not limit its scope. The protection scope of the application is defined by the appended claims.
It should also be noted that, term " comprising ", "comprising" or its any other variant are intended to nonexcludability
Comprising, so that commodity or system including a series of elements not only include those key elements, but also including without clear and definite
The other element listed, or also include for this commodity or the intrinsic key element of system.In the feelings not limited more
Under condition, the key element that is limited by sentence "including a ...", it is not excluded that in the commodity including the key element or system also
Other identical element be present.
The foregoing has shown and described some preferred embodiments of the application. However, as noted above, it should be understood that the application is not limited to the forms disclosed herein, and these should not be regarded as excluding other embodiments; the application can be used in various other combinations, modifications, and environments, and can be altered within the scope contemplated herein through the above teachings or through the skill or knowledge of the related art. All changes and variations made by those skilled in the art that do not depart from the spirit and scope of the application shall fall within the scope of protection of the appended claims.
Claims (10)
- 1. A model training method, characterized by comprising: obtaining a source-domain deep learning model; obtaining training data of a target domain; adjusting weight parameters of the source-domain deep learning model by using the training data of the target domain, to obtain a target-domain deep learning model; extracting data features of the training data by using the target-domain deep learning model; and training a K-nearest-neighbor classification model by using the data features, to obtain a recognition model.
- 2. The method according to claim 1, characterized in that training the K-nearest-neighbor classification model by using the data features to obtain the recognition model comprises: performing feature dimensionality reduction on the data features to obtain low-dimensional features; and training the K-nearest-neighbor classification model by using the low-dimensional features, to obtain the recognition model.
- 3. The method according to claim 1, characterized in that adjusting the weight parameters of the source-domain deep learning model by using the training data of the target domain to obtain the target-domain deep learning model comprises: taking the weight parameters of the source-domain deep learning model as initial parameters; setting a learning rate lower than a preset rate; and adjusting the weight parameters of the source-domain deep learning model by using the training data of the target domain and the learning rate, to obtain the target-domain deep learning model.
- 4. The method according to claim 3, characterized in that adjusting the weight parameters of the source-domain deep learning model by using the training data of the target domain and the learning rate to obtain the target-domain deep learning model comprises: using the training data of the target domain and the learning rate, repeatedly adjusting the weight parameters of the source-domain deep learning model for a number of adjustments smaller than a preset number of times, to obtain the target-domain deep learning model.
- 5. The method according to claim 1, characterized in that obtaining the source-domain deep learning model comprises: obtaining a source-domain deep learning model whose category matches that of the target domain.
- 6. A model training apparatus, characterized by comprising: a model acquisition module, configured to obtain a source-domain deep learning model; a data acquisition module, configured to obtain training data of a target domain; a learning model training module, configured to adjust weight parameters of the source-domain deep learning model by using the training data of the target domain, to obtain a target-domain deep learning model; a feature extraction module, configured to extract data features of the training data by using the target-domain deep learning model; and a recognition model training module, configured to train a K-nearest-neighbor classification model by using the data features, to obtain a recognition model.
- 7. The apparatus according to claim 6, characterized in that the recognition model training module comprises: a dimensionality reduction unit, configured to perform feature dimensionality reduction on the data features to obtain low-dimensional features; and a recognition model training unit, configured to train the K-nearest-neighbor classification model by using the low-dimensional features, to obtain the recognition model.
- 8. The apparatus according to claim 6, characterized in that the learning model training module comprises: a parameter setting unit, configured to take the weight parameters of the source-domain deep learning model as initial parameters and to set a learning rate lower than a preset rate; and a learning model training unit, configured to adjust the weight parameters of the source-domain deep learning model by using the training data of the target domain and the learning rate, to obtain the target-domain deep learning model.
- 9. The apparatus according to claim 8, characterized in that the learning model training unit is specifically configured to, using the training data of the target domain and the learning rate, repeatedly adjust the weight parameters of the source-domain deep learning model for a number of adjustments smaller than a preset number of times, to obtain the target-domain deep learning model.
- 10. The apparatus according to claim 6, characterized in that the model acquisition module is specifically configured to obtain a source-domain deep learning model whose category matches that of the target domain.
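The method of claim 1 is a transfer-learning pipeline: fine-tune a pretrained source-domain model on target-domain data, use the fine-tuned model as a feature extractor, and train a K-nearest-neighbor classifier on the extracted features. A minimal, self-contained sketch of that flow (all names are hypothetical, and a toy linear layer stands in for the deep model, since the claims fix neither an architecture nor a framework):

```python
import math

# Toy stand-in for the source-domain deep learning model: a single linear
# layer acting as a feature extractor. A real implementation would
# fine-tune a pretrained deep network instead.
def extract_features(weights, x):
    """Forward pass of the toy model: features = W @ x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def fine_tune(source_weights, data, labels, lr=0.05, steps=50):
    """Adjust the source-domain weights on target-domain training data.

    SGD regression toward one-hot label vectors is used here purely to
    make the toy model's features class-separable.
    """
    w = [row[:] for row in source_weights]          # warm start from source
    for _ in range(steps):
        for x, y in zip(data, labels):
            feats = extract_features(w, x)
            target = [1.0 if i == y else 0.0 for i in range(len(w))]
            for i, row in enumerate(w):
                err = feats[i] - target[i]
                for j in range(len(row)):
                    row[j] -= lr * err * x[j]       # gradient step
    return w

def knn_predict(train_feats, train_labels, query, k=3):
    """Classify a query feature vector by majority vote of its k neighbors."""
    dists = sorted((math.dist(f, query), lbl)
                   for f, lbl in zip(train_feats, train_labels))
    votes = [lbl for _, lbl in dists[:k]]
    return max(set(votes), key=votes.count)

# Target-domain training data: toy 2-D points, two classes.
data = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [0, 0, 1, 1]
source_weights = [[0.5, 0.5], [0.5, 0.5]]           # "pretrained" stand-in

w = fine_tune(source_weights, data, labels)         # target-domain model
feats = [extract_features(w, x) for x in data]      # extracted data features
print(knn_predict(feats, labels, extract_features(w, [0.95, 0.05])))
```

Because the deep model only supplies features while KNN does the final classification, the small target-domain dataset never has to train a classifier head from scratch, which is the overfitting-reduction argument in the abstract.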
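Claim 2 inserts feature dimensionality reduction before the K-nearest-neighbor training. PCA is used below as one common choice; the claim does not prescribe a particular method, and `reduce_dims` is an illustrative name:

```python
import numpy as np

def reduce_dims(features, n_components):
    """Project feature vectors onto their top principal components (PCA)."""
    X = np.asarray(features, dtype=float)
    X_centered = X - X.mean(axis=0)
    # Right singular vectors of the centered data are the principal axes.
    _, _, vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ vt[:n_components].T

# Toy features: the third coordinate is constant, so it carries no variance
# and contributes nothing after the projection.
feats = [[1.0, 2.0, 5.0], [2.0, 4.0, 5.0], [3.0, 6.0, 5.0], [4.0, 8.0, 5.0]]
low = reduce_dims(feats, 2)
print(low.shape)
```

Lower-dimensional features make the KNN distance computations cheaper and less noisy, which matches the claim's motivation for reducing before training.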
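Claim 3 warm-starts from the source-domain weights and fine-tunes with a learning rate set lower than a preset rate, so the transferred weights are adjusted rather than overwritten. A hedged sketch (`PRESET_RATE` and the scaling factor are assumed values, not from the patent):

```python
# Assumed value standing in for the claim's "preset rate".
PRESET_RATE = 0.01

def configure_fine_tuning(source_weights, requested_lr):
    """Return initial parameters and a learning rate capped below the preset rate."""
    initial = [row[:] for row in source_weights]    # copy; source stays intact
    # Ensure the rate actually used is strictly below the preset rate.
    lr = min(requested_lr, PRESET_RATE * 0.1)
    return initial, lr

src = [[1.0, 2.0], [3.0, 4.0]]
init, lr = configure_fine_tuning(src, requested_lr=0.5)
print(lr)
```

Keeping the fine-tuning rate small is the standard way to preserve the features learned in the source domain while adapting them to the target domain.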
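Claim 4 additionally caps the number of weight adjustments below a preset count, a simple guard against overfitting the typically small target-domain training set. A sketch with an assumed cap (`MAX_ADJUSTMENTS` is not a value from the patent):

```python
MAX_ADJUSTMENTS = 20   # assumed stand-in for the claim's "preset number of times"

def fine_tune_capped(weights, update_step, n_requested):
    """Apply update_step repeatedly, but strictly fewer times than the cap."""
    steps = min(n_requested, MAX_ADJUSTMENTS - 1)   # strictly below the cap
    for _ in range(steps):
        weights = update_step(weights)
    return weights

# Toy usage: each "adjustment" just increments a counter standing in for
# one real gradient update over the target-domain training data.
result = fine_tune_capped(0, lambda w: w + 1, n_requested=100)
print(result)
```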
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610421438.6A CN107506775A (en) | 2016-06-14 | 2016-06-14 | model training method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610421438.6A CN107506775A (en) | 2016-06-14 | 2016-06-14 | model training method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107506775A true CN107506775A (en) | 2017-12-22 |
Family
ID=60679042
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610421438.6A Pending CN107506775A (en) | 2016-06-14 | 2016-06-14 | model training method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107506775A (en) |
- 2016-06-14 CN CN201610421438.6A patent/CN107506775A/en active Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101794396A (en) * | 2010-03-25 | 2010-08-04 | 西安电子科技大学 | System and method for recognizing remote sensing image target based on migration network learning |
US20160078359A1 (en) * | 2014-09-12 | 2016-03-17 | Xerox Corporation | System for domain adaptation with a domain-specific class means classifier |
CN104239554A (en) * | 2014-09-24 | 2014-12-24 | 南开大学 | Cross-domain and cross-category news commentary emotion prediction method |
CN105069472A (en) * | 2015-08-03 | 2015-11-18 | 电子科技大学 | Vehicle detection method based on convolutional neural network self-adaption |
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108108355A (en) * | 2017-12-25 | 2018-06-01 | 北京牡丹电子集团有限责任公司数字电视技术中心 | Text emotion analysis method and system based on deep learning |
CN108268362A (en) * | 2018-02-27 | 2018-07-10 | 郑州云海信息技术有限公司 | A kind of method and device that curve graph is drawn under NVcaffe frames |
CN109376766B (en) * | 2018-09-18 | 2023-10-24 | 平安科技(深圳)有限公司 | Portrait prediction classification method, device and equipment |
CN109376766A (en) * | 2018-09-18 | 2019-02-22 | 平安科技(深圳)有限公司 | A kind of portrait prediction classification method, device and equipment |
CN110942323A (en) * | 2018-09-25 | 2020-03-31 | 优估(上海)信息科技有限公司 | Evaluation model construction method, device and system |
CN109558873A (en) * | 2018-12-03 | 2019-04-02 | 哈尔滨工业大学 | A kind of mode identification method based on this stack autoencoder network that changes |
CN109558873B (en) * | 2018-12-03 | 2019-11-05 | 哈尔滨工业大学 | A kind of mode identification method based on this stack autoencoder network that changes |
CN111310519B (en) * | 2018-12-11 | 2024-01-05 | 成都智叟智能科技有限公司 | Goods deep learning training method based on machine vision and data sampling |
CN111310519A (en) * | 2018-12-11 | 2020-06-19 | 成都智叟智能科技有限公司 | Goods deep learning training method based on machine vision and data sampling |
CN109840274A (en) * | 2018-12-28 | 2019-06-04 | 北京百度网讯科技有限公司 | Data processing method and device, storage medium |
CN111797870A (en) * | 2019-04-09 | 2020-10-20 | Oppo广东移动通信有限公司 | Optimization method and device of algorithm model, storage medium and electronic equipment |
WO2020232874A1 (en) * | 2019-05-20 | 2020-11-26 | 平安科技(深圳)有限公司 | Modeling method and apparatus based on transfer learning, and computer device and storage medium |
CN110458572B (en) * | 2019-07-08 | 2023-11-24 | 创新先进技术有限公司 | User risk determining method and target risk recognition model establishing method |
CN110458572A (en) * | 2019-07-08 | 2019-11-15 | 阿里巴巴集团控股有限公司 | The determination method of consumer's risk and the method for building up of target risk identification model |
CN112580408A (en) * | 2019-09-30 | 2021-03-30 | 杭州海康威视数字技术股份有限公司 | Deep learning model training method and device and electronic equipment |
CN112580408B (en) * | 2019-09-30 | 2024-03-12 | 杭州海康威视数字技术股份有限公司 | Deep learning model training method and device and electronic equipment |
CN111553429A (en) * | 2020-04-30 | 2020-08-18 | 中国银行股份有限公司 | Fingerprint identification model migration method, device and system |
CN111553429B (en) * | 2020-04-30 | 2023-11-03 | 中国银行股份有限公司 | Fingerprint identification model migration method, device and system |
CN111601418A (en) * | 2020-05-25 | 2020-08-28 | 博彦集智科技有限公司 | Color temperature adjusting method and device, storage medium and processor |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107506775A (en) | model training method and device | |
Zhou et al. | A classification model of power equipment defect texts based on convolutional neural network | |
CN108021947B (en) | A kind of layering extreme learning machine target identification method of view-based access control model | |
Xu et al. | Maize diseases identification method based on multi-scale convolutional global pooling neural network | |
CN108197109A (en) | A kind of multilingual analysis method and device based on natural language processing | |
CN106651830A (en) | Image quality test method based on parallel convolutional neural network | |
CN106445919A (en) | Sentiment classifying method and device | |
CN108491874A (en) | A kind of image list sorting technique for fighting network based on production | |
CN106960040B (en) | A kind of classification of URL determines method and device | |
CN108898162A (en) | A kind of data mask method, device, equipment and computer readable storage medium | |
CN111680160A (en) | Deep migration learning method for text emotion classification | |
Chellapandi et al. | Comparison of pre-trained models using transfer learning for detecting plant disease | |
CN105389583A (en) | Image classifier generation method, and image classification method and device | |
CN110909125B (en) | Detection method of media rumor of news-level society | |
CN109902672A (en) | Image labeling method and device, storage medium, computer equipment | |
CN108062302A (en) | A kind of recognition methods of particular text information and device | |
CN108846047A (en) | A kind of picture retrieval method and system based on convolution feature | |
CN105045913B (en) | File classification method based on WordNet and latent semantic analysis | |
CN106709528A (en) | Method and device of vehicle reidentification based on multiple objective function deep learning | |
CN107392310A (en) | neural network model training method and device | |
CN109062958B (en) | Primary school composition automatic classification method based on TextRank and convolutional neural network | |
CN110110092A (en) | A kind of knowledge mapping construction method and relevant device | |
CN109947928A (en) | A kind of retrieval type artificial intelligence question and answer robot development approach | |
CN108805102A (en) | A kind of video caption detection and recognition methods and system based on deep learning | |
CN106997484A (en) | A kind of method and device for optimizing user credit model modeling process |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
EE01 | Entry into force of recordation of patent licensing contract | ||
Application publication date: 20171222 Assignee: Apple R&D (Beijing) Co., Ltd. Assignor: BEIJING MOSHANGHUA TECHNOLOGY CO., LTD. Contract record no.: 2019990000054 Denomination of invention: Method and device for recognizing class of social contact short texts and method and device for training classification models License type: Exclusive License Record date: 20190211 |
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20171222 |