CN109145937A - Method and device for model training - Google Patents

Method and device for model training

Info

Publication number
CN109145937A
Authority
CN
China
Prior art keywords
sample data
cluster labels
cluster
data
class categories
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810664585.5A
Other languages
Chinese (zh)
Inventor
张志伟
王树强
王希爱
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd filed Critical Beijing Dajia Internet Information Technology Co Ltd
Priority to CN201810664585.5A priority Critical patent/CN109145937A/en
Publication of CN109145937A publication Critical patent/CN109145937A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/23 Clustering techniques
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Abstract

An embodiment of the invention provides a method and device for model training. The method comprises: obtaining the sample data to be trained under a specified classification category; performing feature extraction on the sample data to obtain the characteristic information corresponding to the specified classification category; clustering the characteristic information to obtain multiple cluster labels; performing data balancing on the sample data corresponding to the cluster labels; taking the balanced sample data as target sample data; and training a specified model using the target sample data. Through this unsupervised procedure, the invention refines the labels within an existing classification category and balances the samples inside the category, providing balanced sample data for the model. Training on the balanced sample data yields an optimized model, and predictions made with the optimized model are more accurate, improving the prediction accuracy of the model.

Description

Method and device for model training
Technical field
The present invention relates to the field of modeling technologies, and in particular to a method of model training, a device of model training, a model training system, and one or more machine-readable media.
Background technique
Image classification distinguishes images of different categories according to their semantic information. It is an important basic problem in computer vision and the foundation of other high-level visual tasks such as image detection, image segmentation, object tracking, and behavior analysis.
In recent years, deep learning has been widely applied in related fields such as video and image processing, speech recognition, and natural language processing. The convolutional neural network (CNN), an important branch of deep learning, has substantially improved prediction accuracy in image classification tasks thanks to its strong fitting capability and end-to-end global optimization.
Although current image classification models have a certain capability to classify images, a large number of mispredicted samples remain, and how to further optimize an image classification model is still a problem to be solved.
Summary of the invention
The technical problem to be solved by the embodiments of the present invention is to provide a method of model training, so as to address the large number of mispredicted samples and the low prediction accuracy of existing models.
Correspondingly, the embodiments of the invention also provide a device of model training, a model training system, and one or more machine-readable media, to guarantee the implementation and application of the above method.
To solve the above problems, the invention discloses a method of model training, the method comprising:
Obtaining the sample data to be trained under a specified classification category;
Performing feature extraction on the sample data to be trained to obtain the characteristic information corresponding to the specified classification category;
Clustering the characteristic information corresponding to the specified classification category to obtain multiple cluster labels;
Performing data balancing on the sample data corresponding to the cluster labels;
Taking the balanced sample data as target sample data;
Training a specified model using the target sample data.
Preferably, before performing data balancing on the sample data corresponding to the cluster labels, the method further includes:
Screening out, from the multiple cluster labels, the cluster labels that meet a preset condition.
Preferably, screening out the cluster labels that meet the preset condition from the multiple cluster labels comprises:
Counting, from the sample data of the specified classification category, the quantity of sample data corresponding to each cluster label;
Calculating the average number of samples per cluster label within the specified classification category;
Discarding the cluster labels whose quantity of sample data is lower than the average, to obtain the retained cluster labels.
Preferably, performing data balancing on the sample data corresponding to the cluster labels comprises:
Comparing the quantity of sample data in each candidate cluster label, and selecting at least one cluster label with the smallest quantity for data augmentation.
Preferably, clustering the characteristic information corresponding to the specified classification category to obtain multiple cluster labels comprises:
Setting the number of cluster centers;
Clustering the characteristic information using the cosine distance according to the number of cluster centers, to obtain multiple cluster labels.
The invention also discloses a device of model training, the device comprising:
A sample data obtaining module, configured to obtain the sample data to be trained under a specified classification category;
A feature extraction module, configured to perform feature extraction on the sample data to be trained to obtain the characteristic information corresponding to the specified classification category;
A clustering module, configured to cluster the characteristic information corresponding to the specified classification category to obtain multiple cluster labels;
A balance processing module, configured to perform data balancing on the sample data corresponding to the cluster labels;
A model training module, configured to take the balanced sample data as target sample data and train a specified model using the target sample data.
Preferably, the device further includes:
A screening module, configured to screen out, from the multiple cluster labels, the cluster labels that meet a preset condition.
Preferably, the screening module includes:
A statistics submodule, configured to count, from the sample data of the specified classification category, the quantity of sample data corresponding to each cluster label;
A sample-average calculation submodule, configured to calculate the average number of samples per cluster label within the specified classification category;
A sample discarding submodule, configured to discard the cluster labels whose quantity of sample data is lower than the average, to obtain the retained cluster labels.
Preferably, the balance processing module is further configured to:
Compare the quantity of sample data in each candidate cluster label, and select at least one cluster label with the smallest quantity for data augmentation.
Preferably, the clustering module is further configured to:
Set the number of cluster centers;
Cluster the characteristic information using the cosine distance according to the number of cluster centers, to obtain multiple cluster labels.
The invention also discloses a model training system, comprising:
One or more processors; and
One or more machine-readable media storing instructions that, when executed by the one or more processors, cause the system to perform the method of model training described above.
The invention also discloses one or more machine-readable media storing instructions that, when executed by one or more processors, cause the processors to perform the method of model training described above.
Compared with the prior art, the embodiments of the present invention have the following advantages:
In the embodiments of the present invention, after the sample data to be trained under a specified classification category is obtained, feature extraction can be performed on the sample data to obtain the characteristic information corresponding to the specified classification category; the characteristic information is then clustered to obtain multiple cluster labels, and data balancing is performed on the sample data within the cluster labels. This unsupervised procedure refines the labels within an existing classification category and balances the samples inside the category, providing balanced sample data for the model. Training on the balanced sample data yields an optimized model, and predictions made with the optimized model are more accurate, improving the prediction accuracy of the model.
Detailed description of the invention
Fig. 1 is a flowchart of the steps of a method embodiment of model training according to an embodiment of the present invention;
Fig. 2 is a structural block diagram of a device embodiment of model training according to an embodiment of the present invention.
Specific embodiment
To make the above objectives, features, and advantages of the present invention clearer and easier to understand, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
Referring to Fig. 1, a flowchart of the steps of a method embodiment of model training according to an embodiment of the present invention is shown; the method may specifically include the following steps:
Step 101: obtaining the sample data to be trained under a specified classification category.
In one embodiment, the specified classification category may be a category determined by a preset classification model, and/or a category specified manually; the embodiments of the present invention impose no restriction on this.
In one embodiment, the specified classification category may be a coarse-grained category. For example, to distinguish whether there are people in an image, the specified classification categories may include the two categories "someone" and "no one".
The embodiments of the present invention can be applied to the processing of sample data within a category. For each specified classification category, all the sample data under that category can be obtained, i.e., all the sample data in the training set identified as belonging to the specified classification category; for example, all sample data identified as the "someone" category.
As an example, the sample data may include image data.
Step 102: performing feature extraction on the sample data to be trained to obtain the characteristic information corresponding to the specified classification category.
After the sample data to be trained under a specified classification category is obtained, feature extraction can be performed on it by a classification model, such as a convolutional neural network model, to obtain the characteristic information corresponding to the specified classification category.
In one embodiment, for the i-th specified classification category, the extracted characteristic information (i.e., the feature set) can be denoted feature_i.
Step 103: clustering the characteristic information corresponding to the specified classification category to obtain multiple cluster labels.
After the characteristic information corresponding to the specified classification category is obtained, it can be clustered to obtain multiple cluster labels.
In one embodiment, K-means can be used to cluster the characteristic information feature_i of each specified classification category.
K-means is a hard, distance-based clustering algorithm: it uses distance as the measure of similarity, i.e., the closer two objects are, the more similar they are considered. The algorithm treats a cluster as a group of objects that are close together, so its final goal is to obtain compact and independent clusters. The input of the K-means algorithm is the number of cluster centers k and the characteristic information corresponding to the specified classification category; its output is multiple cluster labels satisfying the minimum-variance criterion.
Specifically, in one embodiment, step 103 may include the following sub-steps:
Sub-step S11: setting the number of cluster centers.
In one embodiment, the number of cluster centers K can be selected according to prior knowledge of the data. For the purpose of filtering noisy data, K can be set greater than the value suggested by the prior knowledge. For example, if the specified classification category "someone" contains 3 subcategories, namely "man", "woman", and "child", then K can be set greater than 3, e.g., K = 5.
It should be noted that the number of cluster centers can be set by receiving an input value from a user, or set automatically after the prior knowledge is obtained; the embodiments of the present invention impose no restriction on this.
Sub-step S12: clustering the characteristic information using the cosine distance according to the number of cluster centers, to obtain multiple cluster labels.
In this embodiment, to match the feature space extracted by the classification model, the distance measure of the K-means algorithm can use the cosine distance, i.e., clustering according to cosine similarity, finally obtaining multiple cluster labels j.
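Sub-steps S11 and S12 can be sketched as follows (an illustrative implementation, not the patent's own code). L2-normalizing the feature vectors makes cosine-distance K-means equivalent to ordinary K-means, since for unit vectors ||a - b||^2 = 2 - 2*cos(a, b):

```python
import numpy as np

def cosine_kmeans(features, k, n_iter=50):
    """K-means under cosine distance via L2-normalized features.

    Initialization is simply the first k rows, which keeps the sketch
    deterministic; a real implementation would use k-means++.
    """
    # Normalize rows to unit length so cosine similarity = dot product.
    x = features / np.linalg.norm(features, axis=1, keepdims=True)
    centers = x[:k].copy()
    for _ in range(n_iter):
        # Assign each sample to the most cosine-similar center.
        labels = (x @ centers.T).argmax(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members):  # keep the old center if a cluster empties
                c = members.mean(axis=0)
                centers[j] = c / np.linalg.norm(c)
    return labels

# Two well-separated feature directions -> two cluster labels.
feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
labels = cosine_kmeans(feats, k=2)
```

The same effect can be had with an off-the-shelf K-means applied to the normalized features, which is why normalization rather than a custom distance function is the usual trick.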
In a preferred embodiment of the present invention, the method can also include the following step:
Screening out, from the multiple cluster labels, the cluster labels that meet a preset condition.
After multiple cluster labels are obtained under a specified classification category, the cluster labels that do not meet the preset condition can be deleted, and the cluster labels that meet it retained.
In a preferred embodiment of the present invention, screening out the cluster labels that meet the preset condition from the multiple cluster labels can further include the following sub-steps:
Sub-step S21: counting, from the sample data of the specified classification category, the quantity of sample data corresponding to each cluster label.
In one embodiment, the quantity of sample data corresponding to cluster label j within specified classification category i can be denoted Count_{i,j}. For example, counting the number of samples whose cluster label is "man" within the specified classification category "someone".
Sub-step S22: calculating the average number of samples per cluster label within the specified classification category.
In one embodiment, the average number of samples per cluster label = the quantity of all samples in the specified classification category (Count_i) / the number of cluster labels.
Sub-step S23: discarding the cluster labels whose quantity of sample data is lower than the average, to obtain the retained cluster labels.
In one embodiment, a threshold τ can be set in advance; a theoretical choice is τ = 1/K (K is the number of cluster labels). If Count_{i,j} < τ * Count_i, i.e., the sample quantity of some cluster label is less than the average sample quantity, cluster label j is discarded. After all cluster labels have been traversed, the finally retained cluster labels are obtained.
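Sub-steps S21 to S23 can be sketched as a single hypothetical helper (the function name is an illustration, not the patent's code) that keeps only the cluster labels whose count reaches τ * Count_i with τ = 1/K:

```python
from collections import Counter

def retain_cluster_labels(cluster_labels, k):
    """Discard cluster labels with below-average sample counts.

    Implements the rule Count_{i,j} < tau * Count_i with tau = 1/K:
    a label survives only if its count reaches the per-label average.
    """
    counts = Counter(cluster_labels)   # Count_{i,j} for each label j
    total = len(cluster_labels)        # Count_i for category i
    tau = 1.0 / k
    return sorted(j for j, c in counts.items() if c >= tau * total)

# Toy example with K = 3: label 2 holds only 1 of 10 samples (< 10/3),
# so it is treated as noise and dropped.
labels = [0, 0, 0, 0, 1, 1, 1, 1, 1, 2]
kept = retain_cluster_labels(labels, k=3)
```

Since K is deliberately set larger than the expected number of subcategories, this filter is what removes the extra noise clusters that the oversized K produces.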
Step 104: performing data balancing on the sample data corresponding to the cluster labels.
For the remaining cluster labels, data balancing can be performed using data augmentation or other data balancing algorithms, so that the data within the specified classification category becomes balanced.
In a preferred embodiment of the present invention, step 104 can further include the following sub-step:
Comparing the quantity of sample data in each candidate cluster label, and selecting at least one cluster label with the smallest quantity for data augmentation.
For example, suppose there are three cluster labels under the specified classification category "someone", namely "man", "woman", and "child", and the cluster label "child" has very little corresponding sample data for some reason; that is, the "child" cluster label is unbalanced, which would eventually degrade the model's ability to recognize "child". The sample data within the "child" cluster label can therefore be increased so that the data is balanced.
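As a sketch of this balancing step (helper name assumed; the patent does not prescribe a specific augmentation method), the smallest cluster labels can be oversampled with augmented copies until they match the largest:

```python
import numpy as np

def balance_by_augmentation(samples_by_label, seed=0):
    """Oversample the smallest cluster labels up to the largest one.

    Horizontal flipping stands in for whatever augmentation pipeline
    (crops, color jitter, ...) is actually used; the description only
    says "data enhancement" without fixing a method.
    """
    rng = np.random.default_rng(seed)
    target = max(len(v) for v in samples_by_label.values())
    balanced = {}
    for label, samples in samples_by_label.items():
        out = list(samples)
        while len(out) < target:
            src = samples[rng.integers(len(samples))]
            out.append(np.fliplr(src))  # augmented (flipped) copy
        balanced[label] = out
    return balanced

# "child" is the under-represented cluster label in this toy example.
imgs = {"man": [np.zeros((4, 4)) for _ in range(5)],
        "child": [np.ones((4, 4))]}
balanced = balance_by_augmentation(imgs)
```

Undersampling the majority clusters would balance the counts just as well, at the cost of discarding data; augmentation keeps every original sample.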
Step 105: taking the balanced sample data as target sample data, and training a specified model using the target sample data.
Specifically, the balanced sample data can serve as the final training data (i.e., the target sample data), and the model can be retrained on it. The resulting model is therefore an optimized model whose samples within the specified classification category are balanced, and data prediction with this optimized model is more accurate.
In the embodiments of the present invention, after the sample data to be trained under a specified classification category is obtained, feature extraction can be performed on the sample data to obtain the characteristic information corresponding to the specified classification category; the characteristic information is then clustered to obtain multiple cluster labels, and data balancing is performed on the sample data within the cluster labels. This unsupervised procedure refines the labels within an existing classification category and balances the samples inside the category, providing balanced sample data for the model. Training on the balanced sample data yields an optimized model, and predictions made with the optimized model are more accurate, improving the prediction accuracy of the model.
It should be noted that, for the sake of simple description, the method embodiments are described as a series of combined actions, but those skilled in the art should understand that the embodiments of the present invention are not limited by the described order of actions, because according to the embodiments of the present invention some steps may be performed in other orders or simultaneously. Secondly, those skilled in the art should also know that the embodiments described in the specification are all preferred embodiments, and the actions involved are not necessarily required by the embodiments of the present invention.
Referring to Fig. 2, a structural block diagram of a device embodiment of model training according to an embodiment of the present invention is shown; the device may specifically include the following modules:
A sample data obtaining module 201, configured to obtain the sample data to be trained under a specified classification category;
A feature extraction module 202, configured to perform feature extraction on the sample data to be trained to obtain the characteristic information corresponding to the specified classification category;
A clustering module 203, configured to cluster the characteristic information corresponding to the specified classification category to obtain multiple cluster labels;
A balance processing module 204, configured to perform data balancing on the sample data corresponding to the cluster labels;
A model training module 205, configured to take the balanced sample data as target sample data and train a specified model using the target sample data.
In a preferred embodiment of the present invention, the device can also include the following module:
A screening module, configured to screen out, from the multiple cluster labels, the cluster labels that meet a preset condition.
In a preferred embodiment of the present invention, the screening module can further include the following submodules:
A statistics submodule, configured to count, from the sample data of the specified classification category, the quantity of sample data corresponding to each cluster label;
A sample-average calculation submodule, configured to calculate the average number of samples per cluster label within the specified classification category;
A sample discarding submodule, configured to discard the cluster labels whose quantity of sample data is lower than the average, to obtain the retained cluster labels.
In a preferred embodiment of the present invention, the balance processing module 204 is further configured to:
Compare the quantity of sample data in each candidate cluster label, and select at least one cluster label with the smallest quantity for data augmentation.
In a preferred embodiment of the present invention, the clustering module 203 is further configured to:
Set the number of cluster centers;
Cluster the characteristic information using the cosine distance according to the number of cluster centers, to obtain multiple cluster labels.
As for the device embodiment, since it is substantially similar to the method embodiment above, its description is relatively brief; for relevant parts, refer to the corresponding description of the method embodiment.
The embodiments of the invention also provide a model training system, comprising:
One or more processors; and
One or more machine-readable media storing instructions that, when executed by the one or more processors, cause the system to perform the method of model training described above.
The embodiments of the invention also provide one or more machine-readable media storing instructions that, when executed by one or more processors, cause the processors to perform the method of model training described above.
All the embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts between the embodiments can be referred to each other.
Those skilled in the art should understand that the embodiments of the present invention can be provided as a method, a device, or a computer program product. Therefore, the embodiments of the present invention can take the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware. Moreover, the embodiments of the present invention can take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical memory) containing computer-usable program code.
The embodiments of the present invention are described with reference to flowcharts and/or block diagrams of the method, terminal device (system), and computer program product according to the embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations thereof, can be realized by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing terminal device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing terminal device produce a device for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of guiding a computer or another programmable data processing terminal device to operate in a specific manner, such that the instructions stored in the computer-readable memory produce a manufacture including an instruction device, which realizes the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing terminal device, such that a series of operation steps are executed on the computer or other programmable terminal device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable terminal device provide steps for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although the preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they know the basic inventive concept. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all changes and modifications falling within the scope of the embodiments of the present invention.
Finally, it should be noted that, herein, relational terms such as first and second are used merely to distinguish one entity or operation from another, without necessarily requiring or implying any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or terminal device including a series of elements includes not only those elements but also other elements not explicitly listed, or also includes elements intrinsic to that process, method, article, or terminal device. In the absence of further restrictions, an element defined by the sentence "including a ..." does not exclude the presence of other identical elements in the process, method, article, or terminal device including that element.
The method and device of model training provided by the present invention have been described in detail above. Specific examples have been used herein to illustrate the principle and implementation of the invention, and the above embodiments are only used to help understand the method of the invention and its core concept. At the same time, for those of ordinary skill in the art, there will be changes in the specific implementation and application scope according to the idea of the present invention. In conclusion, the content of this specification should not be construed as a limitation on the present invention.

Claims (12)

1. a kind of method of model training, which is characterized in that the described method includes:
Obtain the sample data to be trained in specified class categories;
Feature extraction is carried out to the sample data to be trained, obtains the corresponding characteristic information of the specified class categories;
The corresponding characteristic information of the specified class categories is clustered, multiple cluster labels are obtained;
Data balancing processing is carried out to the corresponding sample data of the cluster labels;
Using data balancing treated sample data as target sample data;
Using the target sample data, training designated model.
2. the method according to claim 1, wherein it is described to the corresponding sample data of the cluster labels into Before row data equilibrium treatment, further includes:
The cluster labels for meeting preset condition are filtered out from the multiple cluster labels.
3. according to the method described in claim 2, it is characterized in that, it is described filtered out from the multiple cluster labels meet it is pre- If the cluster labels of condition, comprising:
From the sample data of the specified class categories, the quantity of the corresponding sample data of each cluster labels is counted respectively;
It calculates in the specified class categories, the average of samples of each cluster labels;
The quantity for abandoning sample data is lower than the cluster labels of the average of samples, obtains the cluster labels of reservation.
4. according to the method described in claim 3, it is characterized in that, described carry out the corresponding sample data of the cluster labels Data balancing processing, comprising:
The quantity of sample data in more each candidate cluster label, at least one cluster labels for choosing minimum number are counted It is handled according to enhancing.
5. method according to claim 1-4, which is characterized in that described corresponding to the specified class categories Characteristic information is clustered, and multiple cluster labels are obtained, comprising:
Set the quantity of cluster centre;
According to the quantity of the cluster centre, the characteristic information is clustered by the way of cosine cosine distance, is obtained To multiple cluster labels.
6. A model training device, comprising:
a sample data acquisition module, configured to acquire sample data to be trained in a specified classification category;
a feature extraction module, configured to perform feature extraction on the sample data to be trained to obtain feature information corresponding to the specified classification category;
a clustering module, configured to cluster the feature information corresponding to the specified classification category to obtain multiple cluster labels;
a balance processing module, configured to perform data balancing on the sample data corresponding to the cluster labels;
a model training module, configured to take the sample data after data balancing as target sample data, and to train a designated model using the target sample data.
7. The device according to claim 6, further comprising:
a screening module, configured to screen out, from the multiple cluster labels, the cluster labels that meet a preset condition.
8. The device according to claim 7, wherein the screening module comprises:
a counting submodule, configured to count, in the sample data of the specified classification category, the quantity of sample data corresponding to each cluster label;
an average computation submodule, configured to calculate the average number of samples per cluster label in the specified classification category;
a sample discarding submodule, configured to discard the cluster labels whose quantity of sample data is lower than the average number of samples, to obtain the retained cluster labels.
9. The device according to claim 8, wherein the balance processing module is further configured to:
compare the quantity of sample data in each candidate cluster label, and perform data augmentation on at least one cluster label with the smallest quantity.
10. The device according to any one of claims 6-9, wherein the clustering module is further configured to:
set the number of cluster centers;
cluster the feature information using cosine distance according to the number of cluster centers, to obtain the multiple cluster labels.
11. A model training system, comprising:
one or more processors; and
one or more machine-readable media having instructions stored thereon that, when executed by the one or more processors, cause the electronic device to perform the model training method according to any one of claims 1-5.
12. One or more machine-readable media having instructions stored thereon that, when executed by one or more processors, cause the processors to perform the model training method according to any one of claims 1-5.
CN201810664585.5A 2018-06-25 2018-06-25 A kind of method and device of model training Pending CN109145937A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810664585.5A CN109145937A (en) 2018-06-25 2018-06-25 A kind of method and device of model training

Publications (1)

Publication Number Publication Date
CN109145937A true CN109145937A (en) 2019-01-04

Family

ID=64802349

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810664585.5A Pending CN109145937A (en) 2018-06-25 2018-06-25 A kind of method and device of model training

Country Status (1)

Country Link
CN (1) CN109145937A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103048273A (en) * 2012-11-09 2013-04-17 江苏大学 Fruit near infrared spectrum sorting method based on fuzzy clustering
CN103617429A (en) * 2013-12-16 2014-03-05 苏州大学 Sorting method and system for active learning
US8787692B1 (en) * 2011-04-08 2014-07-22 Google Inc. Image compression using exemplar dictionary based on hierarchical clustering
CN106599935A (en) * 2016-12-29 2017-04-26 重庆邮电大学 Three-decision unbalanced data oversampling method based on Spark big data platform
CN106682684A (en) * 2016-11-23 2017-05-17 天津津航计算技术研究所 K-means clustering-based target recognition method
CN107194430A (en) * 2017-05-27 2017-09-22 北京三快在线科技有限公司 A kind of screening sample method and device, electronic equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhang Liangjun (张良均): "MATLAB数据分析与挖掘实战" (MATLAB Data Analysis and Mining in Practice), China Machine Press (机械工业出版社), 30 June 2015 *

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109934281A (en) * 2019-03-08 2019-06-25 电子科技大学 A kind of unsupervised training method of two sorter networks
CN111797876A (en) * 2019-04-09 2020-10-20 Oppo广东移动通信有限公司 Data classification method and device, storage medium and electronic equipment
CN110147851A (en) * 2019-05-29 2019-08-20 北京达佳互联信息技术有限公司 Method for screening images, device, computer equipment and storage medium
CN110147851B (en) * 2019-05-29 2022-04-01 北京达佳互联信息技术有限公司 Image screening method and device, computer equipment and storage medium
CN110570469A (en) * 2019-08-16 2019-12-13 广州威尔森信息科技有限公司 intelligent identification method for angle position of automobile picture
CN110570469B (en) * 2019-08-16 2020-08-25 广州威尔森信息科技有限公司 Intelligent identification method for angle position of automobile picture
CN110689066A (en) * 2019-09-24 2020-01-14 成都考拉悠然科技有限公司 Training method combining face recognition data equalization and enhancement
CN112749724A (en) * 2019-10-31 2021-05-04 阿里巴巴集团控股有限公司 Method and equipment for training classifier and predicting application performance expansibility
CN111191122A (en) * 2019-12-20 2020-05-22 重庆邮电大学 Learning resource recommendation system based on user portrait
CN113128536A (en) * 2019-12-31 2021-07-16 奇安信科技集团股份有限公司 Unsupervised learning method, system, computer device and readable storage medium
WO2021189830A1 (en) * 2020-03-26 2021-09-30 平安科技(深圳)有限公司 Sample data optimization method, apparatus and device, and storage medium
CN111553173A (en) * 2020-04-23 2020-08-18 苏州思必驰信息科技有限公司 Natural language generation training method and device
CN111553173B (en) * 2020-04-23 2023-09-15 思必驰科技股份有限公司 Natural language generation training method and device
CN111611457A (en) * 2020-05-20 2020-09-01 北京金山云网络技术有限公司 Page classification method, device, equipment and storage medium
CN111611457B (en) * 2020-05-20 2024-01-02 北京金山云网络技术有限公司 Page classification method, device, equipment and storage medium
CN111611388A (en) * 2020-05-29 2020-09-01 北京学之途网络科技有限公司 Account classification method, device and equipment
CN111783869A (en) * 2020-06-29 2020-10-16 杭州海康威视数字技术股份有限公司 Training data screening method and device, electronic equipment and storage medium
CN111970584A (en) * 2020-07-08 2020-11-20 国网宁夏电力有限公司电力科学研究院 Method, device and equipment for processing data and storage medium
CN111860671A (en) * 2020-07-28 2020-10-30 中山大学 Classification model training method and device, terminal equipment and readable storage medium
CN114077860A (en) * 2020-08-18 2022-02-22 鸿富锦精密电子(天津)有限公司 Method and system for sorting parts before assembly, electronic device and storage medium
CN112861512A (en) * 2021-02-05 2021-05-28 北京百度网讯科技有限公司 Data processing method, device, equipment and storage medium
CN113487320A (en) * 2021-06-28 2021-10-08 深圳索信达数据技术有限公司 Fraud transaction detection method, device, computer equipment and storage medium
CN113408282B (en) * 2021-08-06 2021-11-09 腾讯科技(深圳)有限公司 Method, device, equipment and storage medium for topic model training and topic prediction
CN113408282A (en) * 2021-08-06 2021-09-17 腾讯科技(深圳)有限公司 Method, device, equipment and storage medium for topic model training and topic prediction
WO2023166747A1 (en) * 2022-03-04 2023-09-07 日本電信電話株式会社 Training data generation device, training data generation method, and program

Similar Documents

Publication Publication Date Title
CN109145937A (en) A kind of method and device of model training
CN111523621B (en) Image recognition method and device, computer equipment and storage medium
CN110942072B (en) Quality score based on quality assessment, detection model training and detection method and device
CN109190646B (en) A kind of data predication method neural network based, device and nerve network system
CN110633745A (en) Image classification training method and device based on artificial intelligence and storage medium
CN104899579A (en) Face recognition method and face recognition device
CN113591527A (en) Object track identification method and device, electronic equipment and storage medium
CN110298297A (en) Flame identification method and device
CN104778230B (en) A kind of training of video data segmentation model, video data cutting method and device
CN113656582B (en) Training method of neural network model, image retrieval method, device and medium
CN110245679A (en) Image clustering method, device, electronic equipment and computer readable storage medium
CN113837308B (en) Knowledge distillation-based model training method and device and electronic equipment
CN108197177A (en) Monitoring method, device, storage medium and the computer equipment of business object
CN111783712A (en) Video processing method, device, equipment and medium
CN109214412A (en) A kind of training method and device of disaggregated model
CN111160959B (en) User click conversion prediction method and device
CN115062709A (en) Model optimization method, device, equipment, storage medium and program product
CN112712068B (en) Key point detection method and device, electronic equipment and storage medium
CN113806501B (en) Training method of intention recognition model, intention recognition method and equipment
CN115567371B (en) Abnormity detection method, device, equipment and readable storage medium
CN112884730B (en) Cooperative significance object detection method and system
CN115546554A (en) Sensitive image identification method, device, equipment and computer readable storage medium
CN115129671A (en) Log detection method, log detection device and computer-readable storage medium
CN115052154A (en) Model training and video coding method, device, equipment and storage medium
CN110188798B (en) Object classification method and model training method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190104
