CN109189767A - Data processing method, device, electronic equipment and storage medium - Google Patents
- Publication number: CN109189767A
- Application number: CN201810866737.XA
- Authority: CN (China)
- Prior art keywords: training data, data, prediction, label, training
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
- G06Q10/06395—Quality analysis or management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/12—Hotels or restaurants
Abstract
The data processing method of the present disclosure belongs to the field of computer technology and addresses the high cost and low efficiency of manual data processing in the prior art. The data processing method of the embodiments of the present disclosure includes: training a target model based on training data; predicting test data with the target model to determine the prediction accuracy of the target model; predicting the training data with the target model to determine the predicted label and prediction confidence of each piece of training data; and processing the training data according to the preset label, predicted label, and prediction confidence of the training data, together with the prediction accuracy. By determining the prediction accuracy of the target model from test data, and combining that accuracy with the prediction confidence of the training data to process the training data, the method helps improve the efficiency and accuracy of data processing and reduce its cost.
Description
Technical field
The present disclosure relates to the field of computer technology, and in particular to a data processing method, a device, electronic equipment, and a storage medium.
Background
Classifying objects with a trained model is the conventional approach to object classification today, where the objects include, but are not limited to, images, user behaviors, and merchants. Take the classification of hotel image quality on a hotel and travel platform as an example. A hotel image quality classification model is usually first trained on hotel images that have been manually annotated with quality grade labels. The trained model is then used to classify a target hotel image and determine its quality grade. In prior-art applications that train a classification model on training data and then classify objects with the trained model, the quality of the training data directly affects the classification accuracy of the resulting model. A scheme for improving the training data is therefore needed.
Summary of the invention
The present disclosure provides a data processing method that helps improve the efficiency and accuracy of data processing and reduce its cost.
In a first aspect, an embodiment of the present disclosure provides a data processing method, including:
training a target model based on training data, where the training data includes preset labels;
predicting test data with the target model to determine the prediction accuracy of the target model;
predicting the training data with the target model to determine the predicted label and prediction confidence of each piece of training data; and
processing the training data according to the preset label, predicted label, and prediction confidence of the training data, together with the prediction accuracy.
In a second aspect, an embodiment of the present disclosure provides a data processing device, including:
a target model training module, configured to train a target model based on training data, where the training data includes preset labels;
a model prediction accuracy determining module, configured to predict test data with the target model and determine the prediction accuracy of the target model;
a training data prediction module, configured to predict the training data with the target model trained by the target model training module and determine the predicted label and prediction confidence of each piece of training data; and
a data processing module, configured to process the training data according to the preset label, predicted label, and prediction confidence of the training data, together with the prediction accuracy.
In a third aspect, an embodiment of the present disclosure further provides electronic equipment, including a memory, a processor, and a computer program that is stored in the memory and can run on the processor. When the processor executes the computer program, it implements the data processing method described in the embodiments of the present disclosure.
In a fourth aspect, an embodiment of the present disclosure provides a computer-readable storage medium storing a computer program. When the program is executed by a processor, it implements the steps of the data processing method described in the embodiments of the present disclosure.
The data processing method provided by the embodiments of the present disclosure trains a target model based on training data, where the training data includes preset labels; then predicts test data with the target model to determine the prediction accuracy of the target model; predicts the training data with the target model to determine the predicted label and prediction confidence of each piece of training data; and finally processes the training data according to the preset label, predicted label, and prediction confidence of the training data, together with the prediction accuracy. This addresses the high cost and low efficiency of manual data processing in the prior art, as well as the low reliability of processing results caused by human subjectivity. By determining the prediction accuracy of the target model from test data, and combining that accuracy with the prediction confidence of each piece of training data to denoise and classify the training data, the method helps improve the efficiency and accuracy of data processing and reduce its cost.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present disclosure more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present disclosure; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flow chart of the data processing method of embodiment one of the present disclosure;
Fig. 2 is a flow chart of the data processing method of embodiment two of the present disclosure;
Fig. 3 is a schematic diagram of the confusion matrix constructed by the data processing method of embodiment two of the present disclosure;
Fig. 4 is a flow chart of the data processing method of embodiment three of the present disclosure;
Fig. 5 is a first structural diagram of the data processing device of embodiment four of the present disclosure;
Fig. 6 is a second structural diagram of the data processing device of embodiment four of the present disclosure;
Fig. 7 is a first structural diagram of the data processing device of embodiment five of the present disclosure.
Detailed description
The technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present disclosure. All other embodiments that a person of ordinary skill in the art obtains from these embodiments without creative effort fall within the scope of protection of the present disclosure.
Embodiment one
An embodiment of the present disclosure provides a data processing method. As shown in Fig. 1, the method includes steps 110 to 140.
Step 110: train a target model based on training data.
The training data includes preset labels.
Supervised model training first requires collecting a large number of training samples as training data. Each training sample is one piece of training data, and each piece of training data usually has a sample label set in advance. Take training an image quality grading model as an example: the training data are individual images. Before the model is trained, a sample label is set for each piece of training data, that is, for each image, and the sample label indicates the quality grade of the image. For a three-class model, the sample label of each piece of training data can be set in advance to any one of the quality grades excellent, normal, and poor.
In a specific implementation, the preset label of each piece of training data may be set manually, or may be set through data analysis and processing.
After the training data is collected, the training data is used as the input of the target model and the preset labels of the training data as its output, and the target model is trained through supervised training.
In some embodiments of the present disclosure, the target model may be a three-class network based on MobileNet (a lightweight deep convolutional neural network proposed by Google for mobile phones and other embedded devices), or another supervised network. The present disclosure does not limit the structure of the target model, as long as it is a supervised network. For the specific method of training the target model based on training data, refer to prior-art methods for training supervised network models; the present disclosure does not limit this.
In a specific implementation, the present disclosure is not limited to image quality grading models. The target model may also be another classification model, such as an image classification model, a user classification model, or a product classification model. The target model is also not limited to a three-class model; it may be a two-class model, a four-class model, and so on. The value range of the target model's output matches the value range of the preset labels of the training data.
Step 120: predict test data with the target model to determine the prediction accuracy of the target model.
In a specific implementation, test samples also need to be obtained in advance as test data, and a sample label is set for each test sample, that is, for each piece of test data. The sample label is the preset label of the test data and indicates its true attribute information. When the test data are images, for example, the preset label can be attribute information such as the true category or true grade of the image.
After the target model is trained, the preset test data is used as the input of the target model to determine the prediction result of each piece of test data. The prediction result includes the predicted label and the prediction confidence of the input test data. Take the case where the test data are images and the target model is a three-class image quality grading model. After an image serving as one piece of test data is input to the image quality grading model, the model predicts the image quality grade of the input image and outputs that grade (for example, any one of excellent, normal, and poor) together with the confidence score that the image belongs to that grade.
After each piece of test data is predicted by the target model, a corresponding predicted label and prediction confidence are obtained. The predicted label of each piece of test data input to the target model is then compared with its preset label, and the ratio of the number of test data whose predicted label equals the preset label to the total number of test data is taken as the prediction accuracy of the target model.
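As a minimal sketch of the accuracy calculation described above, the following Python function computes the share of test data whose predicted label matches the preset label. The function name `prediction_accuracy` and the sample labels are illustrative assumptions, not part of the disclosure.

```python
def prediction_accuracy(preset_labels, predicted_labels):
    """Return the fraction of test items whose predicted label equals the preset label."""
    if len(preset_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not preset_labels:
        raise ValueError("test set is empty")
    matches = sum(p == q for p, q in zip(preset_labels, predicted_labels))
    return matches / len(preset_labels)


# Hypothetical test set of five hotel images with grades S, A and BC.
preset = ["S", "A", "BC", "A", "S"]
predicted = ["S", "A", "A", "A", "BC"]
accuracy = prediction_accuracy(preset, predicted)  # 3 of 5 labels agree
```

In this made-up example three of the five predicted labels agree with the preset labels, so the accuracy is 0.6.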
Step 130: predict the training data with the target model to determine the predicted label and prediction confidence of each piece of training data.
After the target model is trained, the preset training data is used as the input of the target model to determine the prediction result of each piece of training data. The prediction result includes the predicted label and the prediction confidence of the input training data. Take the case where the training data are images and the target model is a three-class image quality grading model. After an image serving as one piece of training data is input to the image quality grading model, the model predicts the image quality grade of the input image and outputs that grade (for example, any one of excellent, normal, and poor) together with the confidence score that the image belongs to that grade.
After each piece of training data is predicted by the target model, a corresponding predicted label and prediction confidence are obtained.
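One plausible way to obtain the predicted label and the prediction confidence, assuming the target model outputs one score per class (for example softmax probabilities), is to take the highest-scoring class and its score. The disclosure does not specify how the confidence is computed, so the helper `label_and_confidence`, the label order, and the scores below are hypothetical.

```python
LABELS = ("S", "A", "BC")  # assumed order of the model's output classes


def label_and_confidence(class_scores, labels=LABELS):
    """Pick the class with the highest score and return (label, score)."""
    if len(class_scores) != len(labels):
        raise ValueError("expected one score per label")
    best = max(range(len(class_scores)), key=class_scores.__getitem__)
    return labels[best], class_scores[best]


# Hypothetical per-class scores for one training image.
label, confidence = label_and_confidence([0.1, 0.2, 0.7])
```

Here the third class has the highest score, so the sketch returns the label `BC` with a confidence of 0.7.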
Step 140: process the training data according to the preset label, predicted label, and prediction confidence of the training data, together with the prediction accuracy.
Further analysis of the preset label and predicted label of each piece of training data shows that, for some training data, the two differ. For example, an image may have the preset label "normal" quality grade while the target model predicts the label "poor". Training data whose preset label and predicted label are inconsistent are defined in the present disclosure as training data with abnormal prediction results. Such data may make the model obtained by training inaccurate, so they need to be processed according to the specific situation. The present application first sets a data processing condition according to the prediction accuracy, and then processes the training data according to whether the preset label and the predicted label agree and according to the relationship between the prediction confidence and the data processing condition.
In one embodiment of the present disclosure, because the preset labels of training data are usually annotated manually, noise is very likely to be present, for example cases where the true class label of a piece of training data is inconsistent with its preset label. Training the target model on mislabeled training data lowers the prediction accuracy of the model. The common prior-art approach is to remove this noise manually. The inventors of the present application found that manual data processing is costly and inefficient, and that human subjectivity makes the processing results unreliable. The inventors further found that, when the preset label and predicted label of a piece of training data differ (that is, the prediction result is abnormal) and the prediction confidence of its predicted label is high enough to meet a preset confidence condition, the piece of training data can be regarded as having a wrong preset label and treated as noise data. The noise in the data can therefore be removed by the data processing method disclosed in the present application. The preset confidence condition can be determined according to the prediction accuracy of the target model.
In another embodiment of the present disclosure, it is assumed that the training data contains no noise. In that case, when the preset label and predicted label of a piece of training data differ (that is, the prediction result is abnormal) and the prediction confidence of its predicted label is high enough to meet a preset confidence condition, the piece of training data can be regarded as difficult to predict: the target model can hardly tell whether it belongs to the class of the preset label or to the class of the predicted label. In other words, this piece of training data is quite similar to training data whose preset label is that predicted label, so the training data with abnormal prediction results are determined to be easily confused training data. The preset confidence condition can be determined according to the prediction accuracy of the target model.
In some embodiments of the present disclosure, after the target model is trained, the step of predicting the training data with the target model to determine the predicted label and prediction confidence of each piece of training data may be performed first, followed by the step of predicting the test data with the target model to determine the prediction accuracy of the target model.
The data processing method provided by the embodiments of the present disclosure trains a target model based on training data, where the training data includes preset labels; then predicts test data with the target model to determine the prediction accuracy of the target model; predicts the training data with the target model to determine the predicted label and prediction confidence of each piece of training data; and finally processes the training data according to the preset label, predicted label, and prediction confidence of the training data, together with the prediction accuracy. This addresses the high cost and low efficiency of manual data processing in the prior art, as well as the low reliability of processing results caused by human subjectivity. By determining the prediction accuracy of the target model from test data, and combining that accuracy with the prediction confidence of each piece of training data to denoise and classify the training data, the method helps improve the efficiency and accuracy of data processing and reduce its cost.
Embodiment two
An embodiment of the present disclosure provides a data processing method. As shown in Fig. 2, the method includes steps 210 to 250.
Step 210: train a target model based on training data.
The training data includes preset labels.
For the specific implementation of training the target model based on training data, refer to embodiment one; it is not repeated here.
Step 220: predict test data with the target model to determine the prediction accuracy of the target model.
For the specific implementation of predicting the test data with the target model and determining the prediction accuracy of the target model, refer to embodiment one; it is not repeated here.
Step 230: predict the training data with the target model to determine the predicted label and prediction confidence of each piece of training data.
For the specific implementation of predicting the training data with the target model and determining the predicted label and prediction confidence of each piece of training data, refer to embodiment one; it is not repeated here.
After each piece of training data is predicted by the target model, a corresponding predicted label and prediction confidence are obtained.
Step 240: process the training data according to the preset label, predicted label, and prediction confidence of the training data, together with the prediction accuracy.
In some embodiments of the present disclosure, processing the training data according to the preset label, predicted label, and prediction confidence of the training data, together with the prediction accuracy, includes: classifying the training data with abnormal prediction results by each pairwise combination of preset label and predicted label to determine several groups of abnormal training data, where the training data with abnormal prediction results are the training data whose preset label differs from the predicted label; and, for each group of abnormal training data, determining that the abnormal training data whose prediction confidence meets a preset first data processing condition are noise data. The preset first data processing condition is determined according to the prediction accuracy. For example, the condition may select the A% of training data with the highest prediction confidence, where A% is determined according to the prediction accuracy, for instance A% equal to the prediction accuracy.
Suppose the target model trained on the training data is a three-class MobileNet network model, the training data are hotel images, and the preset labels of the hotel images cover three quality grades: S, A, and BC. If predicting the preset test data with the target model gives a prediction accuracy of 60%, the first data processing condition can be determined from that accuracy as follows: in each group of training data with abnormal prediction results, the 60% of training data with the highest prediction confidence are noise data.
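The first data processing condition can be sketched as a ranking step applied to one abnormal group. The following Python code is an illustration only: the function name `split_group_by_confidence`, the item identifiers, the confidence values, and the choice to round the cutoff down are all assumptions that the disclosure does not specify.

```python
import math


def split_group_by_confidence(group, accuracy):
    """Split one abnormal group into (noise, retained).

    `group` is a list of (item_id, confidence) pairs. The share of items
    equal to `accuracy`, taken from the highest confidence downwards, is
    treated as noise; the rest is retained.
    """
    ranked = sorted(group, key=lambda pair: pair[1], reverse=True)
    cutoff = math.floor(len(ranked) * accuracy)  # rounding down is an assumption
    return ranked[:cutoff], ranked[cutoff:]


# Hypothetical group: preset label S, predicted label BC.
group = [("img1", 0.95), ("img2", 0.40), ("img3", 0.88),
         ("img4", 0.55), ("img5", 0.71)]
noise, retained = split_group_by_confidence(group, 0.60)
```

With five items and an accuracy of 60%, the three most confident items are marked as noise and the two least confident ones are kept as hard examples.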
In some embodiments of the present disclosure, a confusion matrix of the training data may be constructed to classify the training data with abnormal prediction results by each pairwise combination of preset label and predicted label, yielding several groups of abnormal training data. The training data with abnormal prediction results are the training data whose preset label differs from the predicted label.
For example, the preset labels S, A, and BC are first used as both the row index and the column index of the confusion matrix, indexing the first to third rows and the first to third columns respectively. Then the number of training data whose preset label is S and whose predicted label is S is used as the value of the element in the first row and first column (the element indexed by row S and column S); the number of training data whose preset label is S and whose predicted label is A is used as the value of the element in the first row and second column (row S, column A); and so on, until the confusion matrix of the training data is constructed. The resulting confusion matrix is shown in Fig. 3. The confusion matrix shows how many training data with a given preset label are predicted as each predicted label: each element is the number of training data whose preset label corresponds to the element's row index and whose predicted label corresponds to the element's column index, and the sum of the elements in each row is the total number of training data with that row's preset label. In other words, each element corresponds to one group of training data; when the preset label of the row index differs from the predicted label of the column index, the group is a group of abnormal training data. If the element at row S and column BC in Fig. 3 is 589, then 589 of the training data whose preset label is S have the predicted label BC. These 589 training data with abnormal prediction results are then sorted by prediction confidence from high to low. The prediction results of the top 60% can be regarded as approximately reliable. Because their predicted labels differ from their preset labels, these top 60% are very likely noise, that is, mislabeled training data, so the top 60% are determined to be noise data. The remaining 40% can be regarded as training data that are harder to distinguish; they are retained, which helps improve the recognition accuracy of the model later.
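The confusion matrix described above can be sketched in a few lines of Python. Rows are indexed by preset label and columns by predicted label, as in the text; the function name `confusion_matrix` and the tiny sample label lists are made up for illustration and do not reproduce the values of Fig. 3.

```python
from collections import Counter

LABELS = ("S", "A", "BC")  # row index = preset label, column index = predicted label


def confusion_matrix(preset_labels, predicted_labels, labels=LABELS):
    """Count training items for every (preset label, predicted label) pair."""
    counts = Counter(zip(preset_labels, predicted_labels))
    return [[counts[(row, col)] for col in labels] for row in labels]


# Hypothetical labels for eight training images.
preset = ["S", "S", "S", "A", "A", "BC", "BC", "BC"]
predicted = ["S", "A", "BC", "A", "S", "BC", "BC", "A"]
matrix = confusion_matrix(preset, predicted)
row_sums = [sum(row) for row in matrix]  # items per preset label
```

Every off-diagonal element counts one group of abnormal training data, and each row sum equals the number of training items carrying that row's preset label.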
Using the method above, the training data of each preset label can be grouped by predicted label to obtain several groups of training data. For example, the training data whose preset label is S can be divided into three groups: one whose predicted label is S, one whose predicted label is A, and one whose predicted label is BC. Training data whose preset label differs from the predicted label are determined to be abnormal training data in this embodiment. That is, the group with preset label S and predicted label A and the group with preset label S and predicted label BC are determined to be two groups of abnormal training data. In this way, six groups of abnormal training data are determined in this embodiment. The noise data in each of these six groups are then determined based on the first data processing condition.
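The grouping step can be illustrated as follows. This is a sketch under assumptions: each training record is represented as a hypothetical tuple `(item_id, preset_label, predicted_label, confidence)`, and the records themselves are invented.

```python
from collections import defaultdict


def group_abnormal(records):
    """Group records whose preset and predicted labels differ.

    Returns a dict keyed by (preset_label, predicted_label); each value is a
    list of (item_id, confidence) pairs in input order.
    """
    groups = defaultdict(list)
    for item_id, preset, predicted, confidence in records:
        if preset != predicted:  # abnormal prediction result
            groups[(preset, predicted)].append((item_id, confidence))
    return dict(groups)


# Hypothetical training records.
records = [
    ("img1", "S", "S", 0.91),
    ("img2", "S", "A", 0.62),
    ("img3", "S", "BC", 0.80),
    ("img4", "A", "S", 0.55),
    ("img5", "A", "BC", 0.73),
    ("img6", "BC", "S", 0.67),
    ("img7", "BC", "A", 0.58),
    ("img8", "BC", "BC", 0.95),
    ("img9", "S", "BC", 0.44),
]
groups = group_abnormal(records)
```

With three labels there are at most 3 × 2 = 6 off-diagonal label pairs, which matches the six abnormal groups mentioned in the text.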
Step 250: optimize the target model based on the training data other than the noise data.
In other embodiments of the present disclosure, after the step of processing the training data according to the preset label, predicted label, and prediction confidence of the training data, together with the prediction accuracy, the method further includes: optimizing the target model based on the training data other than the noise data.
Following the description of step 240 in this embodiment, the bottom 40% of each group of abnormal training data (ranked by prediction confidence), together with the training data whose preset label and predicted label agree, are used to further train and optimize the target model.
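Putting the pieces together, the training set used for further optimization could be assembled roughly as follows. The record layout `(item_id, preset_label, predicted_label, confidence)`, the function name `remove_noise`, and the rounding rule are assumptions for illustration; the retraining itself is not shown.

```python
from collections import defaultdict


def remove_noise(records, accuracy):
    """Return the records kept for retraining.

    Within each (preset, predicted) group of abnormal records, the share
    `accuracy` with the highest confidence is dropped as noise. Records with
    matching labels and the remaining abnormal records are kept.
    """
    groups = defaultdict(list)
    for record in records:
        _, preset, predicted, _ = record
        if preset != predicted:
            groups[(preset, predicted)].append(record)

    noise_ids = set()
    for group in groups.values():
        ranked = sorted(group, key=lambda r: r[3], reverse=True)
        cutoff = int(len(ranked) * accuracy)  # rounding down is an assumption
        noise_ids.update(r[0] for r in ranked[:cutoff])

    return [r for r in records if r[0] not in noise_ids]


# Hypothetical records: five abnormal (S predicted as BC) and two consistent.
records = [
    ("img1", "S", "S", 0.90),
    ("img2", "S", "BC", 0.95),
    ("img3", "S", "BC", 0.85),
    ("img4", "S", "BC", 0.70),
    ("img5", "S", "BC", 0.50),
    ("img6", "S", "BC", 0.30),
    ("img7", "A", "A", 0.80),
]
kept = remove_noise(records, accuracy=0.60)
```

In this example the three most confident abnormal records are removed, and the two consistent records plus the two least confident abnormal records remain for retraining.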
The data processing method provided by the embodiments of the present disclosure trains a target model based on training data, where the training data includes preset labels; then predicts test data with the target model to determine the prediction accuracy of the target model; predicts the training data with the target model to determine the predicted label and prediction confidence of each piece of training data; and finally processes the training data according to the preset label, predicted label, and prediction confidence of the training data, together with the prediction accuracy. This addresses the high cost and low efficiency of manual data processing in the prior art, as well as the low reliability of processing results caused by human subjectivity. By determining the prediction accuracy of the target model from test data, and combining that accuracy with the prediction confidence of each piece of training data to denoise and classify the training data, the method helps improve the efficiency and accuracy of data processing and reduce its cost.
Further, because the preset labels of training data are usually annotated manually, wrong preset labels are very likely to occur, and training the target model on mislabeled training data lowers the prediction accuracy of the model. Therefore, when the preset label and predicted label of a piece of training data differ, that is, when the prediction result is abnormal, noise data are determined by combining the prediction accuracy of the target model with the distribution of prediction confidence across the training data. This effectively identifies training data with wrong preset labels. Training and optimizing the target model further on the training data that remain after the noise is removed further improves the prediction accuracy of the target model.
Through research on the prior art, the inventors of the present application found that, when data are denoised with the sentiment consistency discrimination method, whether a piece of data is noise is decided by whether the integrated sentiment polarity value is consistent with the sentiment polarity of the adjective-noun pair: consistent data are kept and inconsistent data are deleted. This method is tied to sentiment polarity tasks and is not general. In addition, a probabilistic sampling model based on a multimodal deep convolutional neural network removes noise by deleting, with probability P, examples whose sentiment scores are similar across all classes. Its core idea is that the larger the difference between the positive and negative sentiment scores predicted for a training example, the more likely the example is to be retained in the training set; otherwise, the more likely it is to be deleted from the training set. This probabilistic sampling model decides whether to retain an example directly from the difference between the absolute predicted values of different classes, which is not well founded for noise data caused by wrong labels. Take hotel quality grade recognition as an example. Suppose an image that is truly grade S is wrongly labeled as grade BC. Because examples of these two grades are inherently distinguishable, the difference between the predicted values for grade S and grade BC is still large; the only discrepancy is that the predicted label S differs from the preset label BC. Yet such noise data caused by a wrong label would still be retained under sentiment-polarity-based denoising, which lowers the prediction accuracy of the trained model.
Moreover, in supervised learning where the training data contains label noise, it is inappropriate to judge noise directly from the absolute prediction values. The present disclosure makes full use of the probability distribution of the samples in each predicted category and, based on the accuracy of the model, achieves a good trade-off between noise data and hard cases (samples that are difficult to distinguish). The disclosure first trains the target model on the original training data and then uses test data to determine the prediction accuracy A% of the target model. It then determines the categories of prediction errors and removes the A% of the data with the highest confidence to obtain cleaned training data. Retraining the model on the cleaned training data effectively improves the prediction accuracy of the resulting model.
Embodiment three
This embodiment of the present disclosure provides a data processing method. As shown in FIG. 4, the method comprises step 410 to step 450.
Step 410: train a target model based on training data.
The training data includes preset labels.
For the specific implementation of training the target model based on the training data, see embodiment one; it is not repeated in this embodiment.
Step 420: predict test data with the target model and determine the prediction accuracy of the target model.
For the specific implementation of predicting the test data with the target model and determining the prediction accuracy of the target model, see embodiment one; it is not repeated in this embodiment.
Step 430: predict the training data with the target model and determine the prediction label and prediction confidence of each piece of training data.
For the specific implementation of predicting the training data with the target model and determining the prediction label and prediction confidence of each piece of training data, see embodiment one; it is not repeated in this embodiment.
After each piece of training data is predicted by the target model, a corresponding prediction label and prediction confidence are obtained.
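As a rough illustration of what step 430 produces for one piece of training data, the sketch below derives a prediction label and a prediction confidence from per-class probabilities. It assumes that the confidence is the probability of the most probable class; the details are deferred to embodiment one in this text, so this choice and the function name `predict_with_confidence` are assumptions.

```python
def predict_with_confidence(class_probabilities):
    """Return (prediction_label, prediction_confidence) for one sample.

    class_probabilities: dict mapping each label to the model's probability.
    Assumption: the confidence is the probability of the winning class.
    """
    label = max(class_probabilities, key=class_probabilities.get)
    return label, class_probabilities[label]


label, confidence = predict_with_confidence({"S": 0.7, "A": 0.2, "BC": 0.1})
```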
Step 440: process the training data according to the preset label, prediction label and prediction confidence of the training data, together with the prediction accuracy.
In some embodiments of the present disclosure, processing the training data according to the preset label, prediction label and prediction confidence of the training data, together with the prediction accuracy, comprises: classifying the training data whose prediction results are abnormal according to the pairwise combinations of preset label and prediction label, so as to determine several groups of abnormal training data, where the training data whose prediction results are abnormal comprises the training data whose preset label differs from its prediction label; and, for each group of abnormal training data, determining the abnormal training data whose prediction confidence satisfies a preset second data processing condition to be easily confused training data, where the preset second data processing condition is determined according to the prediction accuracy. For example, the second data processing condition selects the B% of the training data with the highest prediction confidence, where B% is determined according to the prediction accuracy, for instance by setting B% equal to the prediction accuracy.
For example, the training data is first classified by preset label. In this embodiment the preset labels are S, A and BC, so the training data can be divided into 3 classes. Each class of training data is then further divided into groups by prediction label; in this embodiment each class can be further divided into 3 groups. With this classification method, the training data in this embodiment is divided into 9 groups, whose combinations of preset label and prediction label are respectively: S and S, S and A, S and BC, A and S, A and A, A and BC, BC and S, BC and A, BC and BC. The training data in the groups whose preset label and prediction label differ is then determined to be training data whose prediction results are abnormal. In this embodiment, the training data corresponding to the combinations S and A, S and BC, A and S, A and BC, BC and S, and BC and A is determined to be training data whose prediction results are abnormal.
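The grouping in this example can be sketched as below. The record layout and the helper names `group_by_label_pair` and `abnormal_groups` are hypothetical; only the labels S, A and BC and the 9-group / 6-group split come from the text.

```python
LABELS = ["S", "A", "BC"]


def group_by_label_pair(records):
    """Split records into one group per (preset label, prediction label) pair.

    records: list of (sample_id, preset_label, predicted_label)
    """
    groups = {(preset, predicted): [] for preset in LABELS for predicted in LABELS}
    for sample_id, preset, predicted in records:
        groups[(preset, predicted)].append(sample_id)
    return groups


def abnormal_groups(groups):
    """Keep only the groups whose preset label and prediction label differ."""
    return {pair: ids for pair, ids in groups.items() if pair[0] != pair[1]}


records = [
    ("d1", "S", "S"),
    ("d2", "S", "BC"),
    ("d3", "A", "S"),
    ("d4", "BC", "BC"),
]
groups = group_by_label_pair(records)
abnormal = abnormal_groups(groups)
```

Three labels give 9 groups in total, of which the 6 off-diagonal groups hold the abnormal training data.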
Further, for each group of abnormal training data, the training data in the group can be sorted in descending order of prediction confidence, and the training data in the group whose prediction confidence satisfies the preset second data processing condition, for example the top B% of the training data, is determined to be easily confused training data. The preset second data processing condition is determined according to the prediction accuracy. For example, the preset second data processing condition selects the B% of the training data with the highest confidence, where B% is equal to the prediction accuracy of the target model, or, depending on specific business requirements, B% is set to 90% of the prediction accuracy of the target model.
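A minimal sketch of the selection within a single abnormal group follows. The `scale` parameter models the choice between B% equal to the prediction accuracy (`scale=1.0`) and B% equal to 90% of it (`scale=0.9`); the function name, the input layout and rounding down are assumptions made for illustration.

```python
def select_easily_confused(group, accuracy, scale=1.0):
    """Pick the top-B% most confident samples of one abnormal group.

    group:    list of (sample_id, confidence) for one (preset, predicted) pair
    accuracy: prediction accuracy of the target model, in [0, 1]
    scale:    business-specific factor applied to the accuracy to obtain B
    """
    fraction = accuracy * scale
    ranked = sorted(group, key=lambda item: item[1], reverse=True)
    count = int(len(ranked) * fraction)  # rounding down is an assumption
    return [sample_id for sample_id, _ in ranked[:count]]


group = [("d1", 0.6), ("d2", 0.9), ("d3", 0.8), ("d4", 0.7)]
selected = select_easily_confused(group, accuracy=0.5)
selected_scaled = select_easily_confused(group, accuracy=0.5, scale=0.9)
```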
Step 450: optimize the target model based on the easily confused training data.
In other embodiments of the present disclosure, after the training data is processed according to the preset label, prediction label and prediction confidence of the training data, together with the prediction accuracy, the method further comprises: optimizing the target model based on the easily confused training data.
When the training data contains no noise, that is, no mislabeled training data, a difference between the preset label and the prediction label of a piece of training data (an abnormal prediction result) indicates that the piece of training data is hard to predict: the target model has difficulty deciding whether it belongs to the category of the preset label or to the category of the prediction label. The target model can therefore be further optimized with the easily confused training data.
In some embodiments of the present disclosure, optimizing the target model based on the easily confused training data comprises: determining, according to the prediction confidence, the prediction difficulty matched by the easily confused training data in each group of abnormal training data; and iteratively optimizing the target model, in order of prediction difficulty from easy to hard, based on the easily confused training data matching each prediction difficulty.
In some embodiments of the present disclosure, for each group of abnormal training data, the prediction difficulty matched by the easily confused training data in the group is determined according to the prediction confidence. The higher the prediction confidence, the more confident the model is that the piece of training data belongs to the prediction label, and therefore the harder it is to tell whether the piece of training data belongs to the grade of the preset label or the grade of the prediction label. In a specific implementation, the prediction difficulty can be divided into several levels, for example 3 levels (high, medium and low), according to the amount of abnormal training data or the value range of the prediction confidence. Then, for each group of abnormal training data, the 30% of the easily confused data with the highest prediction confidence is matched to the high level of prediction difficulty, the 30% of the easily confused data with the lowest prediction confidence is matched to the low level of prediction difficulty, and the remaining easily confused data in the group is matched to the medium level of prediction difficulty.
Further, the target model is first optimized based on the training data matching the low level of prediction difficulty in the easily confused data of all groups, to obtain target model M1. Target model M1 is then optimized based on the training data matching the medium level of prediction difficulty in the easily confused data of all groups, to obtain target model M2. Finally, target model M2 is optimized based on the training data matching the high level of prediction difficulty in the easily confused data of all groups, to obtain target model M3, which serves as the optimized target model.
Training from easy to hard in this way gradually strengthens the feature-learning ability of the model and can improve model training efficiency.
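The easy-to-hard schedule can be sketched as follows. The 30% split and the low, medium, high order follow the text; `fine_tune` is a hypothetical callable standing in for whatever optimization step is actually used, and the integer rounding is an assumption.

```python
def split_by_difficulty(confused, percent=30):
    """Assign the easily confused samples of one group to difficulty levels.

    confused: list of (sample_id, confidence)
    percent:  share assigned to each of the 'high' and 'low' levels
    """
    ranked = [sid for sid, _ in sorted(confused, key=lambda x: x[1], reverse=True)]
    k = len(ranked) * percent // 100  # integer rounding is an assumption
    return {
        "high": ranked[:k],                    # highest confidence: hardest
        "medium": ranked[k:len(ranked) - k],
        "low": ranked[len(ranked) - k:],       # lowest confidence: easiest
    }


def optimize_in_stages(model, groups, fine_tune):
    """Optimize from easy to hard, yielding M1, M2 and finally M3."""
    levels = [split_by_difficulty(group) for group in groups]
    for level in ("low", "medium", "high"):
        batch = [sid for per_group in levels for sid in per_group[level]]
        model = fine_tune(model, batch)  # hypothetical training step
    return model


group = [("d%d" % i, i / 10) for i in range(10)]
# Stand-in for training: record which samples each stage saw.
stages = optimize_in_stages([], [group], lambda model, batch: model + [batch])
```

In this toy run the recorded stages show the low-difficulty samples first, then the medium ones, then the high-difficulty ones, mirroring M1, M2 and M3.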
In other embodiments of the present disclosure, optimizing the target model based on the easily confused training data comprises: determining similar training data for the easily confused training data, where the preset label of the similar training data is identical to the prediction label of the easily confused data; constructing similar training data pairs from the similar training data and the easily confused data; and optimizing the target model based on the similar training data pairs.
Taking this embodiment as a specific example, training data whose preset label is identical to the prediction label of a piece of easily confused training data is first taken as its similar training data. For instance, training data Data1 whose preset label is S is taken as similar training data for easily confused training data Data2 whose prediction label is S. A similar training data pair is then constructed from Data1 and Data2. Taking images as the training data, if the preset label of an image Picture1 is grade S and its prediction label is grade BC, then Picture1 has a certain similarity to the images Picture2, Picture3, ... whose preset label is grade BC. A similar training data pair can therefore be constructed from Picture1 and Picture2, another from Picture1 and Picture3, and so on. The target model can then be optimized based on the constructed similar image pairs.
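The pair construction in this example can be sketched as below. The function name `build_similar_pairs` and the input layouts are hypothetical; the rule that a pair joins an easily confused sample with training data whose preset label equals its prediction label is the one stated in the text.

```python
def build_similar_pairs(confused, training):
    """Pair each easily confused sample with its similar training data.

    confused: list of (sample_id, predicted_label) for easily confused data
    training: list of (sample_id, preset_label) for the whole training set
    """
    by_preset = {}
    for sample_id, preset in training:
        by_preset.setdefault(preset, []).append(sample_id)

    pairs = []
    for sample_id, predicted in confused:
        # Similar data: preset label equals the confused sample's prediction label.
        for other_id in by_preset.get(predicted, []):
            if other_id != sample_id:
                pairs.append((sample_id, other_id))
    return pairs


training = [
    ("Picture1", "S"),
    ("Picture2", "BC"),
    ("Picture3", "BC"),
    ("Picture4", "A"),
]
confused = [("Picture1", "BC")]  # preset label S, predicted as BC
pairs = build_similar_pairs(confused, training)
```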
For the specific implementation of optimizing the target model based on the constructed similar image pairs, see the prior art; it is not repeated in this embodiment.
In the field of image retrieval, using the data processing method of the present disclosure to select hard cases (examples that are difficult to distinguish) can further improve algorithm performance. Image retrieval generally pre-trains a classification model and then constructs image pairs to further improve the discriminability of the image features; the image pairs include pairs within the same category and pairs across different categories. The pairs across different categories can be selected with the data processing method of the present disclosure: a base model predicts the data in the training data set to obtain the various kinds of confused training data, for example grade S wrongly predicted as grade BC. Assuming the data itself is clean and contains no noise, this indicates that the training data predicted as grade BC has a certain similarity to the training data originally labeled as grade BC. Constructing image pairs from the training data predicted as grade BC and the training data originally of grade BC, compared with constructing similar image pairs from training data chosen at random from grades S and BC, can further improve the feature representation of the images and the prediction accuracy of the trained model.
The data processing method provided by this embodiment of the present disclosure trains a target model based on training data, where the training data includes preset labels; then predicts test data with the target model to determine the prediction accuracy of the target model; predicts the training data with the target model to determine the prediction label and prediction confidence of each piece of training data; and finally processes the training data according to the preset label, prediction label and prediction confidence of the training data, together with the prediction accuracy. This solves the prior-art problems that manual data processing is costly and inefficient, and that human subjectivity makes the processing results unreliable. The method determines the prediction accuracy of the target model from test data, and combines that accuracy with the prediction confidence of each piece of training data to denoise and classify the training data, which helps improve data-processing efficiency and accuracy and reduces data-processing cost.
Further, when the preset label and the prediction label of a piece of training data differ, that is, when the prediction result is abnormal, the training data that is difficult to distinguish is determined by combining the prediction accuracy of the target model with the distribution of the prediction confidence of each piece of training data, and the target model is further optimized based on that hard-to-distinguish training data, which further improves the prediction accuracy of the target model.
Embodiment four
This embodiment of the present disclosure provides a data processing device. As shown in FIG. 5, the device comprises:
a target model training module 510, configured to train a target model based on training data, where the training data includes preset labels;
a model prediction accuracy determining module 520, configured to predict test data with the target model and determine the prediction accuracy of the target model;
a training data prediction module 530, configured to predict the training data with the target model trained by the target model training module 510 and determine the prediction label and prediction confidence of each piece of training data;
a data processing module 540, configured to process the training data according to the preset label, prediction label and prediction confidence of the training data, together with the prediction accuracy.
Optionally, as shown in FIG. 6, the data processing module 540 further comprises:
a first data grouping submodule 5401, configured to classify the training data whose prediction results are abnormal according to the pairwise combinations of preset label and prediction label, so as to determine several groups of abnormal training data, where the training data whose prediction results are abnormal comprises the training data whose preset label differs from its prediction label;
a noise data determining submodule 5402, configured to determine, for each group of abnormal training data, the abnormal training data whose prediction confidence satisfies a preset first data processing condition to be noise data, where the preset first data processing condition is determined according to the prediction accuracy.
Optionally, as shown in FIG. 6, the device further comprises:
a first model optimization module 550, configured to optimize the target model based on the training data other than the noise data.
The data processing device provided by this embodiment of the present disclosure trains a target model based on training data, where the training data includes preset labels; then predicts test data with the target model to determine the prediction accuracy of the target model; predicts the training data with the target model to determine the prediction label and prediction confidence of each piece of training data; and finally processes the training data according to the preset label, prediction label and prediction confidence of the training data, together with the prediction accuracy. This solves the prior-art problems that manual data processing is costly and inefficient, and that human subjectivity makes the processing results unreliable. The device determines the prediction accuracy of the target model from test data, and combines that accuracy with the prediction confidence of each piece of training data to denoise and classify the training data, which helps improve data-processing efficiency and accuracy and reduces data-processing cost.
Further, because the preset labels of training data are usually annotated manually, some preset labels are very likely to be wrong, and training the target model on mislabeled training data lowers the model's prediction accuracy. Therefore, when the preset label and the prediction label of a piece of training data differ, that is, when the prediction result is abnormal, noise data is determined by combining the prediction accuracy of the target model with the distribution of the prediction confidence of each piece of training data, so that training data with wrong preset labels can be identified effectively. The target model is then further optimized with the training data that remains after the noise is removed, which further improves the prediction accuracy of the target model.
Embodiment five
With reference to embodiment four, in another embodiment of the present disclosure, as shown in FIG. 7, the data processing module 540 further comprises:
a first data grouping submodule 5401, configured to classify the training data whose prediction results are abnormal according to the pairwise combinations of preset label and prediction label, so as to determine several groups of abnormal training data, where the training data whose prediction results are abnormal comprises the training data whose preset label differs from its prediction label;
an easily confused training data determining submodule 5403, configured to determine, for each group of abnormal training data, the abnormal training data whose prediction confidence satisfies a preset second data processing condition to be easily confused training data, where the preset second data processing condition is determined according to the prediction accuracy.
Optionally, as shown in FIG. 7, the device further comprises:
a second model optimization module 560, configured to optimize the target model based on the easily confused training data.
In one embodiment of the present disclosure, the second model optimization module 560 is further configured to:
determine, according to the prediction confidence, the prediction difficulty matched by the easily confused training data in each group of abnormal training data;
iteratively optimize the target model, in order of prediction difficulty from easy to hard, based on the easily confused training data matching each prediction difficulty.
In another embodiment of the present disclosure, the second model optimization module 560 is further configured to:
determine similar training data for the easily confused training data, where the preset label of the similar training data is identical to the prediction label of the easily confused data;
construct similar training data pairs from the similar training data and the easily confused data;
optimize the target model based on the similar training data pairs.
The data processing device provided by this embodiment of the present disclosure is used to implement the steps of the data processing method described in embodiments one to three of the present disclosure. For the specific implementation of each module of the device, see the corresponding steps; details are not repeated here.
The data processing device provided by this embodiment of the present disclosure trains a target model based on training data, where the training data includes preset labels; then predicts test data with the target model to determine the prediction accuracy of the target model; predicts the training data with the target model to determine the prediction label and prediction confidence of each piece of training data; and finally processes the training data according to the preset label, prediction label and prediction confidence of the training data, together with the prediction accuracy. This solves the prior-art problems that manual data processing is costly and inefficient, and that human subjectivity makes the processing results unreliable. The device determines the prediction accuracy of the target model from test data, and combines that accuracy with the prediction confidence of each piece of training data to classify the training data, which helps improve data-processing efficiency and accuracy and reduces data-processing cost.
Further, when the preset label and the prediction label of a piece of training data differ, that is, when the prediction result is abnormal, the training data that is difficult to distinguish is determined by combining the prediction accuracy of the target model with the distribution of the prediction confidence of each piece of training data, and the target model is further optimized based on that hard-to-distinguish training data, which further improves the prediction accuracy of the target model.
Correspondingly, the present disclosure further provides an electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the data processing method described in any one of embodiments one to three of the present disclosure. The electronic device may be a PC, a mobile terminal, a personal digital assistant, a tablet computer, or the like.
The present disclosure further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the data processing method described in any one of embodiments one to three of the present disclosure.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments can be referred to one another. Because the device embodiments are essentially similar to the method embodiments, they are described relatively briefly; for related details, see the corresponding parts of the method embodiments.
The data processing method and device provided by the present disclosure have been described in detail above. Specific examples are used herein to explain the principles and implementations of the disclosure, and the description of the above embodiments is intended only to help in understanding the method of the disclosure and its core idea. For those of ordinary skill in the art, the specific implementations and the scope of application may vary in accordance with the idea of the disclosure. In summary, the content of this specification should not be construed as limiting the disclosure.
From the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by software together with a necessary general-purpose hardware platform, and of course can also be implemented in hardware. Based on this understanding, the above technical solutions, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk or an optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments or in certain parts of the embodiments.
Claims (16)
1. A data processing method, characterized by comprising:
training a target model based on training data, wherein the training data includes preset labels;
predicting test data with the target model, and determining the prediction accuracy of the target model;
predicting the training data with the target model, and determining the prediction label and prediction confidence of each piece of training data;
processing the training data according to the preset label, prediction label and prediction confidence of the training data, together with the prediction accuracy.
2. The method according to claim 1, characterized in that the step of processing the training data according to the preset label, prediction label and prediction confidence of the training data, together with the prediction accuracy, comprises:
classifying the training data whose prediction results are abnormal according to the pairwise combinations of the preset label and the prediction label, and determining several groups of abnormal training data, wherein the training data whose prediction results are abnormal comprises the training data whose preset label differs from its prediction label;
for each group of abnormal training data, determining the abnormal training data whose prediction confidence satisfies a preset first data processing condition to be noise data, wherein the preset first data processing condition is determined according to the prediction accuracy.
3. The method according to claim 2, characterized in that, after the step of processing the training data according to the preset label, prediction label and prediction confidence of the training data, together with the prediction accuracy, the method further comprises:
optimizing the target model based on the training data other than the noise data.
4. The method according to claim 1, characterized in that the step of processing the training data according to the preset label, prediction label and prediction confidence of the training data, together with the prediction accuracy, comprises:
classifying the training data whose prediction results are abnormal according to the pairwise combinations of the preset label and the prediction label, and determining several groups of abnormal training data, wherein the training data whose prediction results are abnormal comprises the training data whose preset label differs from its prediction label;
for each group of abnormal training data, determining the abnormal training data whose prediction confidence satisfies a preset second data processing condition to be easily confused training data, wherein the preset second data processing condition is determined according to the prediction accuracy.
5. The method according to claim 4, characterized in that, after the step of processing the training data according to the preset label, prediction label and prediction confidence of the training data, together with the prediction accuracy, the method further comprises:
optimizing the target model based on the easily confused training data.
6. The method according to claim 5, characterized in that the step of optimizing the target model based on the easily confused training data comprises:
determining, according to the prediction confidence, the prediction difficulty matched by the easily confused training data in each group of abnormal training data;
iteratively optimizing the target model, in order of prediction difficulty from easy to hard, based on the easily confused training data matching each prediction difficulty.
7. The method according to claim 5, characterized in that the step of optimizing the target model based on the easily confused training data comprises:
determining similar training data for the easily confused training data, wherein the preset label of the similar training data is identical to the prediction label of the easily confused data;
constructing similar training data pairs from the similar training data and the easily confused data;
optimizing the target model based on the similar training data pairs.
8. A data processing device, characterized by comprising:
a target model training module, configured to train a target model based on training data, wherein the training data includes preset labels;
a model prediction accuracy determining module, configured to predict test data with the target model and determine the prediction accuracy of the target model;
a training data prediction module, configured to predict the training data with the target model trained by the target model training module and determine the prediction label and prediction confidence of each piece of training data;
a data processing module, configured to process the training data according to the preset label, prediction label and prediction confidence of the training data, together with the prediction accuracy.
9. The device according to claim 8, wherein the data processing module further comprises:
a first data grouping submodule, configured to classify training data with abnormal prediction results according to pairwise combinations of the preset label and the prediction label, and to determine several groups of abnormal training data, wherein the training data with abnormal prediction results includes training data whose preset label differs from its prediction label; and
a noise data determining submodule, configured to determine, for each group of abnormal training data, that the abnormal training data whose prediction result confidence meets a preset first data processing condition is noise data, wherein the preset first data processing condition is determined according to the prediction accuracy.
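The grouping and noise determination in this claim can be sketched as follows. The claim states only that the first data processing condition is determined according to the prediction accuracy; using the accuracy itself as a confidence threshold is an assumption made for this sketch, as are the record layout and the names `group_abnormal` and `find_noise`. Groups are keyed by the ordered pair (preset label, prediction label), which is one possible reading of "pairwise combinations".

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Record = dict  # hypothetical layout: {"id", "preset", "pred", "conf"}
GroupKey = Tuple[str, str]  # (preset label, prediction label)


def group_abnormal(records: List[Record]) -> Dict[GroupKey, List[Record]]:
    """Group samples whose preset label differs from the prediction label."""
    groups: Dict[GroupKey, List[Record]] = defaultdict(list)
    for r in records:
        if r["preset"] != r["pred"]:
            groups[(r["preset"], r["pred"])].append(r)
    return dict(groups)


def find_noise(groups: Dict[GroupKey, List[Record]],
               accuracy: float) -> Dict[GroupKey, List[Record]]:
    """Within each group, flag samples meeting the first condition as noise.

    Assumption: the condition is "confidence >= accuracy", i.e. an accurate
    model that confidently disagrees with a label suggests the label is wrong.
    """
    threshold = accuracy  # hypothetical rule derived from prediction accuracy
    return {key: [r for r in grp if r["conf"] >= threshold]
            for key, grp in groups.items()}
```

Samples flagged this way are candidates for exclusion before the model is optimized again; a different accuracy-derived rule can be swapped in by changing how `threshold` is computed.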
10. The device according to claim 9, further comprising:
a first model optimization module, configured to optimize the target model based on the training data other than the noise data.
11. The device according to claim 8, wherein the data processing module further comprises:
a first data grouping submodule, configured to classify training data with abnormal prediction results according to pairwise combinations of the preset label and the prediction label, and to determine several groups of abnormal training data, wherein the training data with abnormal prediction results includes training data whose preset label differs from its prediction label; and
an easily confused training data determining submodule, configured to determine, for each group of abnormal training data, that the abnormal training data whose prediction result confidence meets a preset second data processing condition is easily confused training data, wherein the preset second data processing condition is determined according to the prediction accuracy.
12. The device according to claim 11, further comprising:
a second model optimization module, configured to optimize the target model based on the easily confused training data.
13. The device according to claim 12, wherein the second model optimization module is further configured to:
determine, according to the prediction result confidence, the prediction difficulty matched by the easily confused training data in each group of the abnormal training data; and
iteratively optimize the target model based on the easily confused training data matched to each prediction difficulty, in order of prediction difficulty from easy to hard.
14. The device according to claim 12, wherein the second model optimization module is further configured to:
determine similar training data for the easily confused training data, wherein the preset label of the similar training data is identical to the prediction label of the easily confused training data;
construct similar training data pairs based on the similar training data and the easily confused training data; and
optimize the target model based on the similar training data pairs.
15. An electronic device, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the data processing method according to any one of claims 1 to 7.
16. A computer-readable storage medium storing a computer program, wherein the program, when executed by a processor, implements the steps of the data processing method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810866737.XA CN109189767B (en) | 2018-08-01 | 2018-08-01 | Data processing method and device, electronic equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810866737.XA CN109189767B (en) | 2018-08-01 | 2018-08-01 | Data processing method and device, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109189767A (en) | 2019-01-11 |
CN109189767B (en) | 2021-07-23 |
Family
ID=64920386
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810866737.XA Active CN109189767B (en) | 2018-08-01 | 2018-08-01 | Data processing method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109189767B (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110705596A (en) * | 2019-09-04 | 2020-01-17 | 北京三快在线科技有限公司 | White screen detection method and device, electronic equipment and storage medium |
CN110909688A (en) * | 2019-11-26 | 2020-03-24 | 南京甄视智能科技有限公司 | Face detection small model optimization training method, face detection method and computer system |
CN110929785A (en) * | 2019-11-21 | 2020-03-27 | 中国科学院深圳先进技术研究院 | Data classification method and device, terminal equipment and readable storage medium |
CN111078877A (en) * | 2019-12-05 | 2020-04-28 | 支付宝(杭州)信息技术有限公司 | Data processing method, training method of text classification model, and text classification method and device |
CN111144216A (en) * | 2019-11-27 | 2020-05-12 | 北京三快在线科技有限公司 | Picture label generation method and device, electronic equipment and readable storage medium |
CN111325278A (en) * | 2020-02-26 | 2020-06-23 | 重庆金山医疗技术研究院有限公司 | Image processing method, device and storage medium |
WO2020178687A1 (en) * | 2019-03-07 | 2020-09-10 | International Business Machines Corporation | Computer model machine learning based on correlations of training data with performance trends |
CN111724136A (en) * | 2020-06-23 | 2020-09-29 | 平安医疗健康管理股份有限公司 | Method and device for entering information of first page of medical record and computer equipment |
CN112749516A (en) * | 2021-02-03 | 2021-05-04 | 江南机电设计研究所 | System combination model reliability intelligent evaluation method suitable for multi-type data characteristics |
CN113360643A (en) * | 2021-05-27 | 2021-09-07 | 重庆南鹏人工智能科技研究院有限公司 | Electronic medical record data quality evaluation method based on short text classification |
CN114417987A (en) * | 2022-01-11 | 2022-04-29 | 支付宝(杭州)信息技术有限公司 | Model training method, data identification method, device and equipment |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070112701A1 (en) * | 2005-08-15 | 2007-05-17 | Microsoft Corporation | Optimization of cascaded classifiers |
CN101378519A (en) * | 2008-09-28 | 2009-03-04 | 宁波大学 | Reduced-reference image quality assessment method based on Contourlet transform |
CN101540048A (en) * | 2009-04-21 | 2009-09-23 | 北京航空航天大学 | Image quality evaluation method based on support vector machine |
CN102567744A (en) * | 2011-12-29 | 2012-07-11 | 中国科学院自动化研究所 | Method for determining quality of iris image based on machine learning |
CN104834898A (en) * | 2015-04-09 | 2015-08-12 | 华南理工大学 | Quality classification method for portrait photography images |
CN105046277A (en) * | 2015-07-15 | 2015-11-11 | 华南农业大学 | Robust mechanism research method of feature significance in image quality evaluation |
CN105426826A (en) * | 2015-11-09 | 2016-03-23 | 张静 | Method for improving crowd-sourced labeling data quality based on label noise correction |
CN106709511A (en) * | 2016-12-08 | 2017-05-24 | 华中师范大学 | Urban rail transit panoramic monitoring video fault detection method based on deep learning |
CN107463953A (en) * | 2017-07-21 | 2017-12-12 | 上海交通大学 | Image classification method and system based on quality embedding under noisy labels |
CN107562859A (en) * | 2017-08-29 | 2018-01-09 | 武汉斗鱼网络科技有限公司 | Classification model training system and implementation method thereof |
CN107688823A (en) * | 2017-07-20 | 2018-02-13 | 北京三快在线科技有限公司 | Image feature acquisition method and device, and electronic equipment |
CN107704806A (en) * | 2017-09-01 | 2018-02-16 | 深圳市唯特视科技有限公司 | Method for predicting face image quality based on deep convolutional neural networks |
CN108122002A (en) * | 2017-12-18 | 2018-06-05 | 东软集团股份有限公司 | Training sample acquisition method and device |
CN108345846A (en) * | 2018-01-29 | 2018-07-31 | 华东师范大学 | Human behavior recognition method and recognition system based on convolutional neural networks |
Non-Patent Citations (3)
Title |
---|
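GUOQING GUI et al.: "Data-Driven Support Vector Machine with Optimization Techniques for Structural Health Monitoring and Damage Detection", 《KSCE JOURNAL OF CIVIL ENGINEERING》 * |
XIA Zhanguo (夏战国) et al.: "Semi-supervised Gaussian process classification algorithm for class-imbalanced data" (类不均衡的半监督高斯过程分类算法), 《通信学报》 (Journal on Communications) * |
FAN Yuanyuan (范媛媛) et al.: "Parameter optimization for image quality evaluation based on BP neural network" (基于BP神经网络的图像质量评价参数优化), 《应用光学》 (Journal of Applied Optics) * |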
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2596438A (en) * | 2019-03-07 | 2021-12-29 | Ibm | Computer model machine learning based on correlations of training data with performance trends |
US11809966B2 (en) | 2019-03-07 | 2023-11-07 | International Business Machines Corporation | Computer model machine learning based on correlations of training data with performance trends |
WO2020178687A1 (en) * | 2019-03-07 | 2020-09-10 | International Business Machines Corporation | Computer model machine learning based on correlations of training data with performance trends |
CN110705596A (en) * | 2019-09-04 | 2020-01-17 | 北京三快在线科技有限公司 | White screen detection method and device, electronic equipment and storage medium |
CN110929785A (en) * | 2019-11-21 | 2020-03-27 | 中国科学院深圳先进技术研究院 | Data classification method and device, terminal equipment and readable storage medium |
CN110929785B (en) * | 2019-11-21 | 2023-12-05 | 中国科学院深圳先进技术研究院 | Data classification method, device, terminal equipment and readable storage medium |
CN110909688A (en) * | 2019-11-26 | 2020-03-24 | 南京甄视智能科技有限公司 | Face detection small model optimization training method, face detection method and computer system |
CN111144216A (en) * | 2019-11-27 | 2020-05-12 | 北京三快在线科技有限公司 | Picture label generation method and device, electronic equipment and readable storage medium |
CN111078877B (en) * | 2019-12-05 | 2023-03-21 | 支付宝(杭州)信息技术有限公司 | Data processing method, training method of text classification model, and text classification method and device |
CN111078877A (en) * | 2019-12-05 | 2020-04-28 | 支付宝(杭州)信息技术有限公司 | Data processing method, training method of text classification model, and text classification method and device |
CN111325278B (en) * | 2020-02-26 | 2023-08-29 | 重庆金山医疗技术研究院有限公司 | Image processing method, device and storage medium |
CN111325278A (en) * | 2020-02-26 | 2020-06-23 | 重庆金山医疗技术研究院有限公司 | Image processing method, device and storage medium |
CN111724136A (en) * | 2020-06-23 | 2020-09-29 | 平安医疗健康管理股份有限公司 | Method and device for entering information of first page of medical record and computer equipment |
CN112749516A (en) * | 2021-02-03 | 2021-05-04 | 江南机电设计研究所 | System combination model reliability intelligent evaluation method suitable for multi-type data characteristics |
CN112749516B (en) * | 2021-02-03 | 2023-08-25 | 江南机电设计研究所 | Intelligent evaluation method for credibility of system combination model adapting to multi-type data characteristics |
CN113360643A (en) * | 2021-05-27 | 2021-09-07 | 重庆南鹏人工智能科技研究院有限公司 | Electronic medical record data quality evaluation method based on short text classification |
CN114417987A (en) * | 2022-01-11 | 2022-04-29 | 支付宝(杭州)信息技术有限公司 | Model training method, data identification method, device and equipment |
Also Published As
Publication number | Publication date |
---|---|
CN109189767B (en) | 2021-07-23 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN109189767A (en) | Data processing method, device, electronic equipment and storage medium | |
CN103455545B (en) | Method and system for location estimation of social network users | |
CN107403198B (en) | Official website identification method based on cascade classifier | |
CN111581385B (en) | Unbalanced data sampling Chinese text category recognition system and method | |
Bochinski et al. | Deep active learning for in situ plankton classification | |
CN111475615B (en) | Fine-grained emotion prediction method, device and system for emotion enhancement and storage medium | |
CN104615730B (en) | Multi-label classification method and device | |
CN110489578A (en) | Image processing method, device and computer equipment | |
CN109213859A (en) | Text detection method, apparatus and system | |
US20230177626A1 (en) | Systems and methods for determining structured proceeding outcomes | |
CN110032859A (en) | Abnormal account identification method, device and medium | |
CN106537387B (en) | Retrieval/storage image associated with event | |
CN108229527A (en) | Training and video analysis method and apparatus, electronic equipment, storage medium, program | |
CN108733644A (en) | Text emotion analysis method, computer readable storage medium and terminal device | |
CN106445908A (en) | Text identification method and apparatus | |
Hare et al. | Verifying FAD-association in purse seine catches on the basis of catch sampling | |
CN109800309A (en) | Classroom discourse genre classification method and device | |
CN112749330B (en) | Information pushing method, device, computer equipment and storage medium | |
CN110796171A (en) | Unclassified sample processing method and device of machine learning model and electronic equipment | |
CN111160959A (en) | User click conversion estimation method and device | |
CN111859967A (en) | Entity identification method and device and electronic equipment | |
CN116648698A (en) | Dynamic facet ordering | |
CN108241867A (en) | Classification method and device | |
CN110909768B (en) | Method and device for acquiring marked data | |
CN108021565A (en) | User satisfaction analysis method and device based on linguistic level |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||