CN109189767A - Data processing method, device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN109189767A
Authority
CN
China
Prior art keywords
training data
data
prediction
label
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810866737.XA
Other languages
Chinese (zh)
Other versions
CN109189767B (en)
Inventor
康丽萍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sankuai Online Technology Co Ltd
Original Assignee
Beijing Sankuai Online Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sankuai Online Technology Co Ltd
Priority to CN201810866737.XA
Publication of CN109189767A
Application granted
Publication of CN109189767B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06395 Quality analysis or management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/12 Hotels or restaurants

Abstract

The data processing method of the present disclosure belongs to the field of computer technology and addresses the high cost and low efficiency of manual data processing in the prior art. The data processing method of the embodiments of the present disclosure includes: training a target model based on training data; predicting test data with the target model and determining the prediction accuracy of the target model; predicting the training data with the target model and determining the predicted label and prediction confidence of each piece of training data; and processing the training data according to its preset labels, predicted labels, and prediction confidences, together with the prediction accuracy. By determining the prediction accuracy of the target model from test data and combining that accuracy with the prediction confidence of the training data when processing the training data, the method helps improve the efficiency and accuracy of data processing and reduce its cost.

Description

Data processing method, device, electronic equipment and storage medium
Technical field
The present disclosure relates to the field of computer technology, and in particular to a data processing method, device, electronic equipment and storage medium.
Background
Classifying and recognizing objects with a trained model is a conventional approach to object classification, where the objects include, but are not limited to, images, user behaviors, and merchants. Taking the classification of hotel image quality on a hotel and travel platform as an example, a hotel image quality classification model is usually first trained on hotel images that have been manually annotated with quality grade labels; the trained model then classifies a target hotel image to determine its quality grade. In prior-art applications that train a classification model on training data and then use the trained model to classify objects, the quality of the training data directly affects the classification accuracy of the trained model. A scheme for improving training data is therefore needed.
Summary of the invention
The present disclosure provides a data processing method that helps improve the efficiency and accuracy of data processing and reduce its cost.
In a first aspect, an embodiment of the present disclosure provides a data processing method, including:
training a target model based on training data, wherein the training data includes preset labels;
predicting test data with the target model and determining the prediction accuracy of the target model;
predicting the training data with the target model and determining the predicted label and prediction confidence of each piece of training data;
processing the training data according to the preset labels, predicted labels, and prediction confidences of the training data, together with the prediction accuracy.
In a second aspect, an embodiment of the present disclosure provides a data processing device, including:
a target model training module, configured to train a target model based on training data, wherein the training data includes preset labels;
a model prediction accuracy determination module, configured to predict test data with the target model and determine the prediction accuracy of the target model;
a training data prediction module, configured to predict the training data with the target model trained by the target model training module and determine the predicted label and prediction confidence of each piece of training data;
a data processing module, configured to process the training data according to the preset labels, predicted labels, and prediction confidences of the training data, together with the prediction accuracy.
In a third aspect, an embodiment of the present disclosure further provides electronic equipment, including a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor implements the data processing method described in the embodiments of the present disclosure when executing the computer program.
In a fourth aspect, an embodiment of the present disclosure provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the data processing method described in the embodiments of the present disclosure.
In the data processing method provided by the embodiments of the present disclosure, a target model is trained based on training data that includes preset labels; test data is then predicted with the target model to determine the prediction accuracy of the target model; the training data is predicted with the target model to determine the predicted label and prediction confidence of each piece of training data; finally, the training data is processed according to its preset labels, predicted labels, and prediction confidences, together with the prediction accuracy. This addresses the high cost and low efficiency of manual data processing in the prior art, as well as the low reliability of processing results caused by human subjectivity. By determining the prediction accuracy of the target model from test data, and by combining that accuracy with the prediction confidence of each piece of training data to denoise and classify the training data, the method helps improve the efficiency and accuracy of data processing and reduce its cost.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present disclosure more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present disclosure; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of the data processing method of Embodiment one of the present disclosure;
Fig. 2 is a flowchart of the data processing method of Embodiment two of the present disclosure;
Fig. 3 is a schematic diagram of the confusion matrix constructed by the data processing method of Embodiment two of the present disclosure;
Fig. 4 is a flowchart of the data processing method of Embodiment three of the present disclosure;
Fig. 5 is a first structural schematic diagram of the data processing device of Embodiment four of the present disclosure;
Fig. 6 is a second structural schematic diagram of the data processing device of Embodiment four of the present disclosure;
Fig. 7 is a first structural schematic diagram of the data processing device of Embodiment five of the present disclosure.
Detailed description
The technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present disclosure. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present disclosure without creative effort fall within the scope of protection of the present disclosure.
Embodiment one
An embodiment of the present disclosure provides a data processing method. As shown in Fig. 1, the method includes steps 110 to 140.
Step 110: train a target model based on training data.
The training data includes preset labels.
Supervised model training first requires collecting a large number of training samples as training data. Each training sample is one piece of training data, and each piece of training data is usually given a sample label in advance. Taking the training of an image quality grading model as an example, the training data consists of individual images. Before the image quality grading model is trained, a sample label is set in advance for each piece of training data, that is, for each image, and the sample label indicates the quality grade of the image. Taking the training of a three-class model as an example, the sample label of each piece of training data can be set in advance to any one of the quality grades superior, normal, and poor.
In a specific implementation, the preset label of each piece of training data may be set manually, or it may be set through data analysis and processing.
After the training data is collected, the training data is used as the input of the target model and the preset labels of the training data are used as the output of the target model, and the target model is trained by supervised training.
In some embodiments of the present disclosure, the target model may be a three-class classification network based on MobileNet (a lightweight deep convolutional neural network proposed by Google for mobile phones and other embedded devices), or another supervised network. The present disclosure does not limit the structure of the target model, as long as it is a supervised network. For the specific method of training the target model on the training data, refer to prior-art methods for training supervised network models; the present disclosure does not limit this.
In a specific implementation, the present disclosure is not limited to image quality grading models; the target model may also be another classification model, such as an image classification model, a user classification model, or a product classification model. The target model is also not limited to a three-class model; it may be a two-class model, a four-class model, and so on. The value range of the results output by the target model matches the value range of the preset labels of the training data.
Step 120: predict test data with the target model and determine the prediction accuracy of the target model.
In a specific implementation, test samples also need to be obtained in advance as test data, and a sample label is set for each test sample, that is, for each piece of test data. The sample label is the preset label of the test data and indicates the true attribute information of the test data. Taking images as the test data, the preset label may be attribute information such as the true category or true grade of the image.
After the target model is trained, the preset test data is used as the input of the target model to determine the prediction result of each piece of test data. The prediction result includes the predicted label and the prediction confidence of the input test data. Taking images as the test data and a three-class image quality grading model as the target model, after an image serving as one piece of test data is input into the image quality grading model, the model predicts the image quality grade of the input image and outputs the image quality grade (for example, any one of superior, normal, and poor) together with the confidence score that the image belongs to that grade.
After each piece of test data is predicted by the target model, a corresponding predicted label and prediction confidence are obtained. The predicted labels of the test data input into the target model are then compared with their preset labels, and the ratio of the number of test data whose predicted label equals the preset label to the total number of test data is taken as the prediction accuracy of the target model.
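The accuracy computation described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `prediction_accuracy` and the sample labels are assumptions introduced here.

```python
from typing import Sequence


def prediction_accuracy(preset_labels: Sequence[str],
                        predicted_labels: Sequence[str]) -> float:
    """Fraction of test items whose predicted label equals the preset label."""
    if len(preset_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not preset_labels:
        raise ValueError("at least one test item is required")
    matches = sum(p == q for p, q in zip(preset_labels, predicted_labels))
    return matches / len(preset_labels)


# Hypothetical test set with grades S, A and BC: 3 of 5 predictions agree.
preset = ["S", "A", "BC", "A", "S"]
predicted = ["S", "A", "A", "A", "BC"]
accuracy = prediction_accuracy(preset, predicted)
```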
Step 130: predict the training data with the target model and determine the predicted label and prediction confidence of each piece of training data.
After the target model is trained, the preset training data is used as the input of the target model to determine the prediction result of each piece of training data. The prediction result includes the predicted label and the prediction confidence of the input training data. Taking images as the training data and a three-class image quality grading model as the target model, after an image serving as one piece of training data is input into the image quality grading model, the model predicts the image quality grade of the input image and outputs the image quality grade (for example, any one of superior, normal, and poor) together with the confidence score that the image belongs to that grade.
After each piece of training data is predicted by the target model, a corresponding predicted label and prediction confidence are obtained.
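One common way to obtain a predicted label and its confidence is sketched below. It assumes, hypothetically, that the classifier returns one probability per class; the patent does not specify the output format, and `label_and_confidence` is a name introduced only for this illustration.

```python
from typing import Dict, Tuple


def label_and_confidence(class_probs: Dict[str, float]) -> Tuple[str, float]:
    """Return the most probable class and its probability as the confidence."""
    if not class_probs:
        raise ValueError("class_probs must not be empty")
    label = max(class_probs, key=class_probs.get)
    return label, class_probs[label]


# Hypothetical per-class probabilities for one hotel image.
label, confidence = label_and_confidence({"S": 0.1, "A": 0.2, "BC": 0.7})
```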
Step 140: process the training data according to the preset labels, predicted labels, and prediction confidences of the training data, together with the prediction accuracy.
Further analysis of the preset label and predicted label of each piece of training data shows that for some training data the two differ. For example, the preset label of an image is the normal quality grade, but after prediction by the target model the predicted label of the image is the poor grade. Training data whose preset label and predicted label are inconsistent is defined in the present disclosure as training data with abnormal prediction results. Such training data may make the trained model inaccurate, so it needs to be processed according to the situation. The present application first sets a data processing condition according to the prediction accuracy, and then processes the training data according to whether the preset label and the predicted label are the same and according to the relationship between the prediction confidence and the data processing condition.
In one embodiment of the present disclosure, because the preset labels of training data are usually annotated manually, noise is very likely to exist, for example cases where the true class label of a piece of training data is inconsistent with its preset label. Training the target model on mislabeled training data lowers the prediction accuracy of the model. A common prior-art approach is to remove such noise manually. The inventor of the present application found that manual data processing is costly and inefficient, and that human subjectivity makes the processing results unreliable. The inventor further found that when the preset label and the predicted label of a piece of training data differ, that is, when the prediction result is abnormal, and the prediction confidence of the predicted label is also high enough to meet a preset confidence condition, the training data can be regarded as training data whose preset label was annotated incorrectly and can be treated as noise data. The noise in the data can therefore be removed by the data processing method disclosed in the present application. The preset confidence condition can be determined according to the prediction accuracy of the target model.
In another embodiment of the present disclosure, it is assumed that there is no noise in the training data. In that case, when the preset label and the predicted label of a piece of training data differ, that is, when the prediction result is abnormal, and the prediction confidence of the predicted label is also high enough to meet a preset confidence condition, the training data can be regarded as difficult to predict: the target model has difficulty distinguishing whether it belongs to the class of the preset label or to the class of the predicted label. In other words, this training data is similar to training data whose preset label is the predicted label, so the training data with the abnormal prediction result is determined to be easily confused training data. The preset confidence condition can be determined according to the prediction accuracy of the target model.
In some embodiments of the present disclosure, after the target model is trained, the step of predicting the training data with the target model and determining the predicted label and prediction confidence of each piece of training data may be performed first, followed by the step of predicting the test data with the target model and determining the prediction accuracy of the target model.
In the data processing method provided by the embodiments of the present disclosure, a target model is trained based on training data that includes preset labels; test data is then predicted with the target model to determine the prediction accuracy of the target model; the training data is predicted with the target model to determine the predicted label and prediction confidence of each piece of training data; finally, the training data is processed according to its preset labels, predicted labels, and prediction confidences, together with the prediction accuracy. This addresses the high cost and low efficiency of manual data processing in the prior art, as well as the low reliability of processing results caused by human subjectivity. By determining the prediction accuracy of the target model from test data, and by combining that accuracy with the prediction confidence of each piece of training data to denoise and classify the training data, the method helps improve the efficiency and accuracy of data processing and reduce its cost.
Embodiment two
An embodiment of the present disclosure provides a data processing method. As shown in Fig. 2, the method includes steps 210 to 250.
Step 210: train a target model based on training data.
The training data includes preset labels.
For the specific implementation of training the target model based on training data, refer to Embodiment one; it is not repeated in this embodiment.
Step 220: predict test data with the target model and determine the prediction accuracy of the target model.
For the specific implementation of predicting test data with the target model and determining the prediction accuracy of the target model, refer to Embodiment one; it is not repeated in this embodiment.
Step 230: predict the training data with the target model and determine the predicted label and prediction confidence of each piece of training data.
For the specific implementation of predicting the training data with the target model and determining the predicted label and prediction confidence of each piece of training data, refer to Embodiment one; it is not repeated in this embodiment.
After each piece of training data is predicted by the target model, a corresponding predicted label and prediction confidence are obtained.
Step 240: process the training data according to the preset labels, predicted labels, and prediction confidences of the training data, together with the prediction accuracy.
In some embodiments of the present disclosure, processing the training data according to the preset labels, predicted labels, and prediction confidences of the training data, together with the prediction accuracy, includes: classifying the training data with abnormal prediction results according to the pairwise combination of preset label and predicted label to determine several groups of abnormal training data, where the training data with abnormal prediction results is the training data whose preset label differs from its predicted label; and, for each group of abnormal training data, determining that the abnormal training data whose prediction confidence meets a preset first data processing condition is noise data. The preset first data processing condition is determined according to the prediction accuracy. For example, the condition may select the A% of training data with the highest prediction confidence, where A% is determined according to the prediction accuracy, for instance A% equal to the prediction accuracy.
Suppose the target model trained on the training data is a MobileNet three-class network model, the training data consists of hotel images, and the preset labels of the hotel images cover three quality grades: S, A, and BC. If predicting the preset test data with the target model yields a prediction accuracy of 60%, the first data processing condition can be determined from the prediction accuracy as follows: in each group of training data with abnormal prediction results, the 60% of training data with the highest prediction confidence is noise data.
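The first data processing condition from this example can be sketched for a single abnormal group as follows. The function name `split_noise`, the `(item_id, confidence)` record layout, and the use of `round` to turn A% into an item count are assumptions made for illustration; the patent does not state how fractional counts are handled.

```python
from typing import List, Tuple

Item = Tuple[str, float]  # (item_id, prediction confidence), hypothetical layout


def split_noise(group: List[Item], accuracy: float) -> Tuple[List[Item], List[Item]]:
    """Split ONE abnormal group into (noise, kept) by confidence rank.

    The top `accuracy` fraction by confidence is treated as noise
    (A% equal to the prediction accuracy); the rest is kept.
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    ranked = sorted(group, key=lambda item: item[1], reverse=True)
    cutoff = round(len(ranked) * accuracy)  # rounding rule is an assumption
    return ranked[:cutoff], ranked[cutoff:]


# Hypothetical group: preset label S, predicted label BC.
group = [("img1", 0.95), ("img2", 0.55), ("img3", 0.80),
         ("img4", 0.99), ("img5", 0.60)]
noise, kept = split_noise(group, accuracy=0.6)
```

With a prediction accuracy of 60%, the three highest-confidence items of the five are treated as noise and the remaining two are kept.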
In some embodiments of the present disclosure, the training data with abnormal prediction results can be classified according to the pairwise combination of preset label and predicted label by constructing a confusion matrix of the training data, thereby determining several groups of abnormal training data. The training data with abnormal prediction results is the training data whose preset label differs from its predicted label.
For example, the preset labels S, A, and BC are first used as both the row index and the column index of the confusion matrix, indexing the first to third rows and the first to third columns of matrix elements respectively. Then the number of training data whose preset label is S and whose predicted label is S is used as the value of the element in the first row and first column, that is, the element indexed by row index S and column index S; the number of training data whose preset label is S and whose predicted label is A is used as the value of the element in the first row and second column, that is, the element indexed by row index S and column index A; and so on, until the confusion matrix of the training data is constructed. The constructed confusion matrix is shown in Fig. 3. The confusion matrix shows how many training data with a given preset label are predicted as each predicted label: each matrix element gives, among the training data whose preset label corresponds to the row index of the element, the number predicted as the label corresponding to the column index of the element, and the sum of the elements in each row is the total number of training data with the preset label of that row. In other words, each matrix element corresponds to one group of training data, and when the preset label of the row index differs from the predicted label of the column index, the group corresponding to that element is a group of abnormal training data. If the value of the matrix element in the first row and third column in Fig. 3 is 589, then 589 of the training data whose preset label is S have the predicted label BC. These 589 training data with abnormal prediction results are then sorted by prediction confidence from high to low. The prediction results of the top 60% can be regarded as approximately credible; because their predicted labels differ from their preset labels, the top 60% are very likely noise, that is, mislabeled training data, so the top 60% of the training data is determined to be noise data. The remaining 40% can be regarded as training data that is harder to distinguish; it is retained and is of considerable help in subsequently improving the recognition accuracy of the model.
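A confusion matrix with preset labels as rows and predicted labels as columns can be built as in the sketch below. The helper name `confusion_matrix` and the toy labels are assumptions for illustration and do not reproduce the values of Fig. 3.

```python
from collections import Counter
from typing import List, Sequence

LABELS = ["S", "A", "BC"]  # row index = preset label, column index = predicted label


def confusion_matrix(preset: Sequence[str], predicted: Sequence[str],
                     labels: Sequence[str] = LABELS) -> List[List[int]]:
    """Count training items for every (preset label, predicted label) pair."""
    counts = Counter(zip(preset, predicted))
    return [[counts[(row, col)] for col in labels] for row in labels]


# Hypothetical training-set labels.
preset = ["S", "S", "S", "A", "A", "BC", "BC", "BC"]
predicted = ["S", "A", "BC", "A", "A", "BC", "S", "BC"]
matrix = confusion_matrix(preset, predicted)
row_totals = [sum(row) for row in matrix]  # items per preset label
```

Off-diagonal elements count abnormal training data; for instance, `matrix[0][2]` is the number of items with preset label S that were predicted as BC.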
Following the above method, the training data with each preset label can be grouped by predicted label to obtain several groups of training data. For example, the training data whose preset label is S can be divided into three groups: one whose predicted label is S, one whose predicted label is A, and one whose predicted label is BC. Training data whose preset label differs from its predicted label is determined to be abnormal training data in this embodiment. That is, the group with preset label S and predicted label A and the group with preset label S and predicted label BC are determined to be two groups of abnormal training data. In this way, six groups of abnormal training data are determined in this embodiment. The noise data in each of these six groups is then determined based on the first data processing condition.
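The grouping into abnormal groups can be sketched as follows. The record layout `(item_id, preset_label, predicted_label, confidence)` and the name `group_abnormal` are hypothetical. With three labels there are 3 × 2 = 6 ordered pairs of distinct labels, which matches the six groups mentioned above.

```python
from collections import defaultdict
from itertools import permutations
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, str, str, float]  # (item_id, preset, predicted, confidence)


def group_abnormal(records: Iterable[Record]) -> Dict[Tuple[str, str], List[Tuple[str, float]]]:
    """Group items whose preset and predicted labels differ by (preset, predicted)."""
    groups = defaultdict(list)
    for item_id, preset, predicted, confidence in records:
        if preset != predicted:  # abnormal prediction result
            groups[(preset, predicted)].append((item_id, confidence))
    return dict(groups)


# All possible abnormal groups for three labels: 6 ordered pairs.
possible_groups = list(permutations(["S", "A", "BC"], 2))

records = [("img1", "S", "S", 0.9), ("img2", "S", "A", 0.8),
           ("img3", "S", "BC", 0.7), ("img4", "A", "S", 0.6),
           ("img5", "S", "A", 0.5)]
groups = group_abnormal(records)
```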
Step 250: optimize the target model based on the training data other than the noise data.
In other embodiments of the present disclosure, after the step of processing the training data according to the preset labels, predicted labels, and prediction confidences of the training data, together with the prediction accuracy, the method further includes: optimizing the target model based on the training data other than the noise data.
Based on the description of step 240 in this embodiment, the lower-ranked 40% of the training data in each group of abnormal training data, together with the training data whose preset label and predicted label are consistent, is used to further optimize the training of the target model.
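Selecting the data used for this optimization step amounts to dropping the identified noise items and keeping everything else. The sketch below assumes hypothetical `(item_id, preset_label)` records and a set of noise identifiers produced by the earlier step; the retraining call itself depends on the chosen model and is not shown.

```python
from typing import Iterable, List, Set, Tuple


def retained_training_data(records: Iterable[Tuple[str, str]],
                           noise_ids: Set[str]) -> List[Tuple[str, str]]:
    """Keep every (item_id, preset_label) record not flagged as noise.

    This keeps both the consistent items and the lower-confidence
    abnormal items, since only flagged noise is removed.
    """
    return [record for record in records if record[0] not in noise_ids]


records = [("img1", "S"), ("img2", "S"), ("img3", "A")]
retained = retained_training_data(records, noise_ids={"img2"})
```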
In the data processing method provided by the embodiments of the present disclosure, a target model is trained based on training data that includes preset labels; test data is then predicted with the target model to determine the prediction accuracy of the target model; the training data is predicted with the target model to determine the predicted label and prediction confidence of each piece of training data; finally, the training data is processed according to its preset labels, predicted labels, and prediction confidences, together with the prediction accuracy. This addresses the high cost and low efficiency of manual data processing in the prior art, as well as the low reliability of processing results caused by human subjectivity. By determining the prediction accuracy of the target model from test data, and by combining that accuracy with the prediction confidence of each piece of training data to denoise and classify the training data, the method helps improve the efficiency and accuracy of data processing and reduce its cost.
Further, because the preset labels of training data are usually annotated manually, mislabeled preset labels are very likely, and training the target model on mislabeled training data lowers the prediction accuracy of the model. Therefore, when the preset label and predicted label of a piece of training data differ, that is, when the prediction result is abnormal, noise data is determined by combining the prediction accuracy of the target model with the distribution of prediction confidences of the training data, which effectively identifies training data with incorrect preset labels. Further optimizing the training of the target model on the denoised training data further improves its prediction accuracy.
Through research on the prior art, the inventors of the present application found that when data is denoised using a sentiment consistency discrimination method, whether a data item is noise is determined by whether the integrated sentiment polarity value is consistent with the sentiment polarity of the adjective-noun pair: consistent items are kept and inconsistent items are deleted. This method is tied to sentiment polarity tasks and is not general. In addition, a probabilistic sampling model based on a multimodal deep convolutional neural network removes noise by deleting, with probability P, instances whose sentiment scores are similar across categories. Its core idea is that the larger the difference between the predicted positive and negative sentiment scores of a training instance, the more likely the instance is to be kept in the training set; otherwise, the more likely it is to be deleted from the training set. This model decides whether to retain an instance directly from the difference between the absolute predicted values of different categories, which is not a sound criterion for noise data caused by label errors. Take hotel quality rating as an example: suppose an image whose true grade is S is mislabeled as grade BC. Because instances of the two grades are inherently distinguishable, the difference between the predicted values of this image for grades S and BC is still large; the only discrepancy is that the predicted label S differs from the preset label BC. Such noise data caused by a label error would still be retained under sentiment-polarity-based denoising, which lowers the prediction accuracy of the trained model.
Moreover, in supervised learning the noise data used during model training carries labels, so judging noise directly from absolute predicted values is inappropriate. The present disclosure makes full use of the probability distribution of the samples in each predicted category and, based on the accuracy of the model, strikes a good trade-off between noise data and hard cases (samples that are difficult to distinguish). The disclosure first trains the target model on the original training data, then determines the prediction accuracy A% of the target model using test data, then determines the categories of prediction errors and removes the A% of data with the highest confidence to obtain cleaned training data, and finally retrains the model, which can effectively improve the prediction accuracy of the trained model.
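The cleaning step described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `Sample` record, the `remove_noise` function, and the choice to round the cutoff down are all assumptions made for the example, and the model's accuracy is passed in as a fraction between 0 and 1.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Sample:
    # Hypothetical record type for one piece of training data.
    sample_id: str
    preset_label: str      # label attached to the training data
    predicted_label: str   # label predicted by the trained model
    confidence: float      # confidence of that prediction


def remove_noise(samples, accuracy):
    """Return (cleaned, noise) after dropping the most confident mismatches.

    Mismatched samples are grouped by (preset, predicted) label pair; in each
    group the `accuracy` fraction with the highest confidence is treated as
    noise. Rounding the cutoff down is an assumption of this sketch.
    """
    groups = defaultdict(list)
    for s in samples:
        if s.preset_label != s.predicted_label:
            groups[(s.preset_label, s.predicted_label)].append(s)

    noise_ids = set()
    for group in groups.values():
        ranked = sorted(group, key=lambda s: s.confidence, reverse=True)
        cutoff = int(len(ranked) * accuracy)
        noise_ids.update(s.sample_id for s in ranked[:cutoff])

    cleaned = [s for s in samples if s.sample_id not in noise_ids]
    noise = [s for s in samples if s.sample_id in noise_ids]
    return cleaned, noise
```

The cleaned list would then be used to retrain the model; samples whose preset and predicted labels agree are never removed by this sketch.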
Embodiment Three
This embodiment of the present disclosure provides a data processing method. As shown in FIG. 4, the method includes steps 410 to 450.
Step 410: train a target model based on training data.
The training data includes preset labels.
For the specific implementation of training the target model based on the training data, see Embodiment One; details are not repeated in this embodiment.
Step 420: predict test data by the target model and determine the prediction accuracy of the target model.
For the specific implementation of predicting the test data by the target model and determining the prediction accuracy of the target model, see Embodiment One; details are not repeated in this embodiment.
Step 430: predict the training data by the target model and determine the predicted label and prediction confidence of each piece of training data.
For the specific implementation of predicting the training data by the target model and determining the predicted label and prediction confidence of each piece of training data, see Embodiment One; details are not repeated in this embodiment.
After each piece of training data is predicted by the target model, a corresponding predicted label and prediction confidence are obtained.
Step 440: process the training data according to the preset label, predicted label, and prediction confidence of the training data, together with the prediction accuracy.
In some embodiments of the present disclosure, processing the training data according to the preset label, predicted label, and prediction confidence of the training data, together with the prediction accuracy, includes: classifying the training data with abnormal prediction results according to the pairwise combinations of preset label and predicted label to determine several groups of abnormal training data, where the training data with abnormal prediction results includes the training data whose preset label differs from its predicted label; and, for each group of abnormal training data, determining the abnormal training data whose prediction confidence satisfies a preset second data processing condition to be easily confused training data, where the preset second data processing condition is determined according to the prediction accuracy. For example, the second data processing condition selects the B% of training data with the highest prediction confidence, where B% is determined according to the prediction accuracy, for instance B% equals the prediction accuracy.
For example, the training data is first classified by preset label. In this embodiment the preset labels are S, A, and BC, so the training data can be divided into 3 classes. Each class of training data is then further divided into groups by predicted label; in this embodiment each class can be further divided into 3 groups. With this classification, the training data in this embodiment is divided into 9 groups, whose combinations of preset label and predicted label are: S and S, S and A, S and BC, A and S, A and A, A and BC, BC and S, BC and A, BC and BC. The training data corresponding to the combinations in which the preset label differs from the predicted label is then determined to be training data with abnormal prediction results. In this embodiment, that is the training data corresponding to the combinations S and A, S and BC, A and S, A and BC, BC and S, and BC and A.
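The grouping in this example can be sketched in a few lines. The function names `group_by_label_pair` and `abnormal_pairs` and the tuple layout of the records are assumptions made for illustration; only the labels S, A, and BC come from the embodiment.

```python
from collections import defaultdict
from itertools import product

LABELS = ["S", "A", "BC"]  # preset labels used in this embodiment


def group_by_label_pair(records):
    """Group sample ids by (preset_label, predicted_label).

    `records` is an iterable of (sample_id, preset_label, predicted_label)
    tuples; this layout is a hypothetical choice for the sketch.
    """
    groups = defaultdict(list)
    for sample_id, preset, predicted in records:
        groups[(preset, predicted)].append(sample_id)
    return groups


def abnormal_pairs(labels):
    """List the label combinations whose preset and predicted labels differ."""
    return [(p, q) for p, q in product(labels, repeat=2) if p != q]
```

With three labels there are nine combinations in total, and `abnormal_pairs(LABELS)` yields the six in which the two labels differ; looking those keys up in the result of `group_by_label_pair` gives the groups of abnormal training data.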
Further, for each group of abnormal training data, the training data in the group may be sorted in descending order of prediction confidence, and the training data in each group whose prediction confidence satisfies the preset second data processing condition, such as the top B% of the training data, is determined to be easily confused training data. The preset second data processing condition is determined according to the prediction accuracy. For example, the preset second data processing condition selects the B% of training data with the highest confidence, where B% equals the prediction accuracy of the target model; alternatively, according to specific business requirements, B% is set to 90% of the prediction accuracy of the target model.
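A minimal sketch of this selection for a single group is given below. The function name `select_easily_confused`, the `scale` parameter (1.0 for B% equal to the accuracy, 0.9 for 90% of it), and rounding the cutoff down are assumptions of the example rather than details stated in the text.

```python
def select_easily_confused(group, accuracy, scale=1.0):
    """Pick the highest-confidence share of one abnormal group.

    `group` is a list of (sample_id, confidence) pairs for a single
    (preset, predicted) label combination. `accuracy` is the model's
    prediction accuracy as a fraction in [0, 1]; `scale` adjusts B% relative
    to it. Both the pair layout and the floor rounding are hypothetical.
    """
    ranked = sorted(group, key=lambda item: item[1], reverse=True)
    fraction = accuracy * scale
    cutoff = int(len(ranked) * fraction)
    return ranked[:cutoff]
```

Calling the function once per abnormal group and concatenating the results gives the full set of selected training data.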
Step 450: optimize the target model based on the easily confused training data.
In other embodiments of the present disclosure, after processing the training data according to the preset label, predicted label, and prediction confidence of the training data, together with the prediction accuracy, the method further includes: optimizing the target model based on the easily confused training data.
When the training data contains no noise, that is, no mislabeled training data, a difference between the preset label and the predicted label of a piece of training data, i.e. an abnormal prediction result, indicates that this piece of training data is relatively difficult to predict: the target model has difficulty telling whether it belongs to the class of the preset label or to the class of the predicted label. The target model can therefore be further optimized using the easily confused training data.
In some embodiments of the present disclosure, optimizing the target model based on the easily confused training data includes: determining, according to the prediction confidence, the prediction difficulty matched by the easily confused training data in each group of abnormal training data; and iteratively optimizing the target model in order of prediction difficulty from easy to hard, based on the easily confused training data matching each prediction difficulty.
In some embodiments of the present disclosure, for each group of abnormal training data, the prediction difficulty matched by the easily confused training data in the group is determined according to the prediction confidence. The higher the prediction confidence, the more confident the model is that this piece of training data belongs to the predicted label, that is, the harder it is to tell whether this piece of training data belongs to the preset label grade or to the predicted label grade. In a specific implementation, prediction difficulty may be divided into multiple levels, such as high, medium, and low, according to the number of abnormal training data items or the value range of the prediction confidence. Then, for each group of abnormal training data, the 30% of easily confused data with the highest prediction confidence is matched to the high prediction difficulty level, the 30% of easily confused data with the lowest prediction confidence is matched to the low prediction difficulty level, and the remaining easily confused data in the group is matched to the medium prediction difficulty level.
Further, the target model is first optimized based on the training data matching the low prediction difficulty level in the easily confused data of all groups, yielding target model M1; target model M1 is then optimized based on the training data matching the medium prediction difficulty level in the easily confused data of all groups, yielding target model M2; finally, target model M2 is optimized based on the training data matching the high prediction difficulty level in the easily confused data of all groups, yielding target model M3. Target model M3 is taken as the optimized target model.
By training from easy to hard, the feature learning ability of the model is improved step by step, which can improve model training efficiency.
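The easy-to-hard schedule above can be sketched as follows. The helper names `split_by_difficulty` and `curriculum_optimize`, the `fine_tune` callback standing in for whatever optimization routine is used, and the floor rounding of the 30% cutoffs are all assumptions of this illustration.

```python
def split_by_difficulty(confused, top=0.3, bottom=0.3):
    """Split one group's selected data into low/medium/high difficulty.

    `confused` is a list of (sample_id, confidence) pairs. The most confident
    `top` share is treated as hardest and the least confident `bottom` share
    as easiest; cutoffs are rounded down (an assumption of this sketch).
    """
    ranked = sorted(confused, key=lambda item: item[1], reverse=True)
    n = len(ranked)
    n_high = int(n * top)
    n_low = int(n * bottom)
    return {
        "high": ranked[:n_high],
        "medium": ranked[n_high:n - n_low],
        "low": ranked[n - n_low:],
    }


def curriculum_optimize(model, groups, fine_tune):
    """Optimize `model` on low, then medium, then high difficulty data.

    `groups` is a list of per-group lists of (sample_id, confidence) pairs.
    `fine_tune(model, batch)` is a hypothetical callback that returns the
    updated model; the three successive results play the roles of M1, M2, M3.
    """
    tiers = [split_by_difficulty(group) for group in groups]
    for level in ("low", "medium", "high"):
        batch = [item for tier in tiers for item in tier[level]]
        model = fine_tune(model, batch)
    return model
```

Each difficulty level pools data from all groups before the model is updated, which mirrors the three-stage M1, M2, M3 sequence in the text.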
In other embodiments of the present disclosure, optimizing the target model based on the easily confused training data includes: determining similar training data for the easily confused training data, where the preset label of the similar training data is the same as the predicted label of the easily confused data; constructing similar training data pairs based on the similar training data and the easily confused data; and optimizing the target model based on the similar training data pairs.
In this embodiment, the training data whose preset label is the same as the predicted label of the easily confused training data is first taken as its similar training data; for example, training data Data1 whose preset label is S is taken as similar training data of easily confused training data Data2 whose predicted label is S. A similar training data pair is then constructed from Data1 and Data2. Taking images as the training data, if the preset label of an image Picture1 is grade S and its predicted label is grade BC, then Picture1 has a certain similarity to the images Picture2, Picture3, ... whose preset label is grade BC. A similar training data pair can then be constructed from Picture1 and Picture2, another from Picture1 and Picture3, and so on. The target model can then be optimized based on the constructed similar image pairs.
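The pairing rule in the image example can be sketched as below. The function name `build_similar_pairs` and the tuple layouts are assumptions for illustration, and the sketch pairs each selected sample with every training sample carrying the matching preset label, which is one possible reading since the text does not say how many partners to use.

```python
def build_similar_pairs(confused, training):
    """Pair each confused sample with training data of its predicted class.

    `confused` is a list of (sample_id, preset_label, predicted_label) tuples
    and `training` is a list of (sample_id, preset_label) tuples; both
    layouts are hypothetical. A sample is never paired with itself.
    """
    by_preset = {}
    for sample_id, preset in training:
        by_preset.setdefault(preset, []).append(sample_id)

    pairs = []
    for sample_id, _preset, predicted in confused:
        for other_id in by_preset.get(predicted, []):
            if other_id != sample_id:
                pairs.append((sample_id, other_id))
    return pairs
```

For an image with preset label S and predicted label BC, the function returns one pair for each training image whose preset label is BC.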
For the specific implementation of optimizing the target model based on the constructed similar image pairs, see the prior art; details are not repeated in this embodiment.
In the field of image retrieval, using the data processing method of the present disclosure to select hard cases (examples that are difficult to distinguish) can further improve algorithm performance. In image retrieval, a classification model is generally used for pre-training, and image pairs are then constructed to further improve the discriminability of image features; the image pairs include pairs within the same category and pairs across different categories. The cross-category image pairs are selected by the data processing method of the present disclosure: a base model predicts the data in the training data set to obtain the different kinds of confused training data, for example grade S mispredicted as grade BC. Assuming the data itself is clean and contains no noise, this indicates that the training data predicted as grade BC has a certain similarity to the training data originally labeled grade BC. Constructing image pairs from the training data predicted as grade BC and the training data originally labeled grade BC, compared with constructing similar image pairs from training data randomly selected from grades S and BC, can further improve the feature representation ability of the images and improve the prediction accuracy of the trained model.
In the data processing method provided by this embodiment of the present disclosure, a target model is trained based on training data, where the training data includes preset labels; test data is then predicted by the target model to determine the prediction accuracy of the target model; the training data is predicted by the target model to determine the predicted label and prediction confidence of each piece of training data; finally, the training data is processed according to its preset labels, predicted labels, and prediction confidences, together with the prediction accuracy. This solves the prior-art problems of high cost and low efficiency when data is processed manually, and of low reliability of the processing results caused by human subjectivity. By determining the prediction accuracy of the target model from test data, and combining that accuracy with the prediction confidence of each piece of training data to denoise and classify the training data, the method helps improve the efficiency and accuracy of data processing and reduces its cost.
Further, when the preset label and the predicted label of a piece of training data differ, that is, when the prediction result is abnormal, the training data that is harder to distinguish is determined by combining the prediction accuracy of the target model with the distribution of prediction confidences of the training data, and the trained target model is further optimized based on that hard-to-distinguish training data, further improving the prediction accuracy of the target model.
Embodiment Four
This embodiment of the present disclosure provides a data processing device. As shown in FIG. 5, the device includes:
a target model training module 510, configured to train a target model based on training data, where the training data includes preset labels;
a model prediction accuracy determining module 520, configured to predict test data by the target model and determine the prediction accuracy of the target model;
a training data prediction module 530, configured to predict the training data by the target model trained by the target model training module 510 and determine the predicted label and prediction confidence of each piece of training data; and
a data processing module 540, configured to process the training data according to the preset label, predicted label, and prediction confidence of the training data, together with the prediction accuracy.
Optionally, as shown in FIG. 6, the data processing module 540 further includes:
a first data grouping submodule 5401, configured to classify the training data with abnormal prediction results according to the pairwise combinations of the preset label and the predicted label to determine several groups of abnormal training data, where the training data with abnormal prediction results includes the training data whose preset label differs from its predicted label; and
a noise data determining submodule 5402, configured to determine, for each group of abnormal training data, the abnormal training data whose prediction confidence satisfies a preset first data processing condition to be noise data, where the preset first data processing condition is determined according to the prediction accuracy.
Optionally, as shown in FIG. 6, the device further includes:
a first model optimization module 550, configured to optimize the target model based on the training data other than the noise data.
In the data processing device provided by this embodiment of the present disclosure, a target model is trained based on training data, where the training data includes preset labels; test data is then predicted by the target model to determine the prediction accuracy of the target model; the training data is predicted by the target model to determine the predicted label and prediction confidence of each piece of training data; finally, the training data is processed according to its preset labels, predicted labels, and prediction confidences, together with the prediction accuracy. This solves the prior-art problems of high cost and low efficiency when data is processed manually, and of low reliability of the processing results caused by human subjectivity. By determining the prediction accuracy of the target model from test data, and combining that accuracy with the prediction confidence of each piece of training data to denoise and classify the training data, the device helps improve the efficiency and accuracy of data processing and reduces its cost.
Further, because the preset labels of training data are usually annotated manually, some preset labels are likely to be wrong, and training the target model on mislabeled training data lowers its prediction accuracy. Therefore, when the preset label and the predicted label of a piece of training data differ, that is, when the prediction result is abnormal, noise data is determined by combining the prediction accuracy of the target model with the distribution of prediction confidences of the training data, which effectively identifies training data whose preset labels are wrong. Further optimizing the trained target model on the training data that remains after noise removal further improves the prediction accuracy of the target model.
Embodiment Five
Referring to Embodiment Four, in another embodiment of the present disclosure, as shown in FIG. 7, the data processing module 540 further includes:
a first data grouping submodule 5401, configured to classify the training data with abnormal prediction results according to the pairwise combinations of the preset label and the predicted label to determine several groups of abnormal training data, where the training data with abnormal prediction results includes the training data whose preset label differs from its predicted label; and
an easily confused training data determining submodule 5403, configured to determine, for each group of abnormal training data, the abnormal training data whose prediction confidence satisfies a preset second data processing condition to be easily confused training data, where the preset second data processing condition is determined according to the prediction accuracy.
Optionally, as shown in FIG. 7, the device further includes:
a second model optimization module 560, configured to optimize the target model based on the easily confused training data.
In one embodiment of the present disclosure, the second model optimization module 560 is further configured to:
determine, according to the prediction confidence, the prediction difficulty matched by the easily confused training data in each group of abnormal training data; and
iteratively optimize the target model in order of prediction difficulty from easy to hard, based on the easily confused training data matching each prediction difficulty.
In another embodiment of the present disclosure, the second model optimization module 560 is further configured to:
determine similar training data for the easily confused training data, where the preset label of the similar training data is the same as the predicted label of the easily confused data;
construct similar training data pairs based on the similar training data and the easily confused data; and
optimize the target model based on the similar training data pairs.
The data processing device provided by this embodiment of the present disclosure is used to implement the steps of the data processing method described in Embodiments One to Three of the present disclosure. For the specific implementation of each module of the device, see the corresponding steps; details are not repeated here.
In the data processing device provided by this embodiment of the present disclosure, a target model is trained based on training data, where the training data includes preset labels; test data is then predicted by the target model to determine the prediction accuracy of the target model; the training data is predicted by the target model to determine the predicted label and prediction confidence of each piece of training data; finally, the training data is processed according to its preset labels, predicted labels, and prediction confidences, together with the prediction accuracy. This solves the prior-art problems of high cost and low efficiency when data is processed manually, and of low reliability of the processing results caused by human subjectivity. By determining the prediction accuracy of the target model from test data, and combining that accuracy with the prediction confidence of each piece of training data to classify the training data, the device helps improve the efficiency and accuracy of data processing and reduces its cost.
Further, when the preset label and the predicted label of a piece of training data differ, that is, when the prediction result is abnormal, the training data that is harder to distinguish is determined by combining the prediction accuracy of the target model with the distribution of prediction confidences of the training data, and the trained target model is further optimized based on that hard-to-distinguish training data, further improving the prediction accuracy of the target model.
Correspondingly, the present disclosure further provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the data processing method described in any one of Embodiments One to Three of the present disclosure. The electronic device may be a PC, a mobile terminal, a personal digital assistant, a tablet computer, or the like.
The present disclosure further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the data processing method described in any one of Embodiments One to Three of the present disclosure.
The embodiments in this specification are described progressively. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be cross-referenced. Because the device embodiments are basically similar to the method embodiments, they are described relatively briefly; for related details, see the corresponding description of the method embodiments.
The data processing method and device provided by the present disclosure have been described in detail above. Specific examples are used herein to explain the principles and implementations of the disclosure, and the description of the above embodiments is only intended to help in understanding the method of the disclosure and its core idea. Meanwhile, those of ordinary skill in the art may, following the idea of the disclosure, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present disclosure.
From the above description of the embodiments, those skilled in the art can clearly understand that each embodiment may be implemented by software together with a necessary general-purpose hardware platform, or of course by hardware. Based on this understanding, the above technical solutions, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments or in certain parts of the embodiments.

Claims (16)

1. A data processing method, characterized by comprising:
training a target model based on training data, wherein the training data includes preset labels;
predicting test data by the target model to determine the prediction accuracy of the target model;
predicting the training data by the target model to determine the predicted label and prediction confidence of each piece of training data; and
processing the training data according to the preset label, predicted label, and prediction confidence of the training data, and the prediction accuracy.
2. The method according to claim 1, characterized in that the step of processing the training data according to the preset label, predicted label, and prediction confidence of the training data, and the prediction accuracy, comprises:
classifying the training data with abnormal prediction results according to the pairwise combinations of the preset label and the predicted label to determine several groups of abnormal training data, wherein the training data with abnormal prediction results includes the training data whose preset label differs from its predicted label; and
for each group of abnormal training data, determining the abnormal training data whose prediction confidence satisfies a preset first data processing condition to be noise data, wherein the preset first data processing condition is determined according to the prediction accuracy.
3. The method according to claim 2, characterized in that, after the step of processing the training data according to the preset label, predicted label, and prediction confidence of the training data, and the prediction accuracy, the method further comprises:
optimizing the target model based on the training data other than the noise data.
4. The method according to claim 1, characterized in that the step of processing the training data according to the preset label, predicted label, and prediction confidence of the training data, and the prediction accuracy, comprises:
classifying the training data with abnormal prediction results according to the pairwise combinations of the preset label and the predicted label to determine several groups of abnormal training data, wherein the training data with abnormal prediction results includes the training data whose preset label differs from its predicted label; and
for each group of abnormal training data, determining the abnormal training data whose prediction confidence satisfies a preset second data processing condition to be easily confused training data, wherein the preset second data processing condition is determined according to the prediction accuracy.
5. The method according to claim 4, characterized in that, after the step of processing the training data according to the preset label, predicted label, and prediction confidence of the training data, and the prediction accuracy, the method further comprises:
optimizing the target model based on the easily confused training data.
6. according to the method described in claim 5, it is characterized in that, described easily obscure training data based on described, described in optimization The step of object module, comprising:
According to the prediction result confidence level, determine every group respectively described in easily obscure training data described in exception training data The prediction complexity matched;
According to the prediction complexity sequence from the easier to the more advanced, based on the prediction complexity is matched described easily obscures Training data, object module described in iteration optimization.
7. The method according to claim 5, characterized in that the step of optimizing the target model based on the easily confused training data comprises:
Determining similar training data for the easily confused training data, wherein the preset label of the similar training data is identical to the prediction label of the easily confused training data;
Constructing similar training data pairs from the similar training data and the easily confused training data;
Optimizing the target model based on the similar training data pairs.
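The pair construction in this claim can be sketched as below. The only rule taken from the claim is that a similar sample's preset label equals the prediction label of the easily confused sample; the `per_sample` limit, the field names, and the choice to take the first matching samples are assumptions for illustration. How the pairs are then used to optimize the model (for example with a pairwise loss) is not specified in the claim and is left out.

```python
from collections import defaultdict


def build_similar_pairs(confused, training_data, per_sample=1):
    """Pair each easily confused sample with similar training samples.

    A "similar" sample is one whose preset label equals the prediction
    label of the easily confused sample (as stated in the claim).
    ASSUMPTION: at most `per_sample` partners are taken, in input order.
    """
    by_preset = defaultdict(list)
    for s in training_data:
        by_preset[s["preset"]].append(s)

    pairs = []
    for c in confused:
        candidates = [s for s in by_preset.get(c["predicted"], []) if s is not c]
        for s in candidates[:per_sample]:
            pairs.append((c, s))
    return pairs
```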
8. A data processing device, characterized by comprising:
A target model training module, configured to train a target model based on training data, wherein the training data includes preset labels;
A model prediction accuracy determination module, configured to determine the prediction accuracy of the target model by using the target model to predict test data;
A training data prediction module, configured to use the target model trained by the target model training module to predict the training data, and to determine the prediction label and the prediction result confidence of each piece of training data;
A data processing module, configured to process the training data according to the preset label, the prediction label and the prediction result confidence of the training data, and the prediction accuracy.
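To make the data flow between the first three modules concrete, here is a minimal Python sketch. The toy one-dimensional threshold classifier, its confidence formula, and all function names are assumptions chosen only to keep the example self-contained; the claim does not prescribe any particular model. The sketch also assumes binary labels 0 and 1, both present in the training data, with class 1 having the larger mean.

```python
def train(training_data):
    """Toy target model: threshold halfway between the two class means.

    training_data is a list of (x, preset_label) with labels 0 and 1.
    Returns a predict(x) -> (prediction_label, confidence) function.
    """
    means = {}
    for label in (0, 1):
        xs = [x for x, y in training_data if y == label]
        means[label] = sum(xs) / len(xs)
    threshold = (means[0] + means[1]) / 2
    scale = abs(means[1] - means[0]) or 1.0

    def predict(x):
        label = int(x > threshold)
        # Hypothetical confidence: grows with distance from the threshold.
        confidence = min(1.0, 0.5 + abs(x - threshold) / scale)
        return label, confidence

    return predict


def prediction_accuracy(predict, test_data):
    """Fraction of test samples whose prediction matches the label."""
    correct = sum(1 for x, y in test_data if predict(x)[0] == y)
    return correct / len(test_data)


def annotate(predict, training_data):
    """Attach the prediction label and confidence to every training sample."""
    records = []
    for x, y in training_data:
        label, confidence = predict(x)
        records.append({"x": x, "preset": y,
                        "predicted": label, "confidence": confidence})
    return records
```

The data processing module would then consume the records from `annotate` together with the value from `prediction_accuracy`, for example by keeping only the records where `preset` and `predicted` differ.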
9. The device according to claim 8, characterized in that the data processing module further comprises:
A first data grouping submodule, configured to classify the training data whose prediction results are abnormal according to pairwise combinations of the preset label and the prediction label, to determine several groups of abnormal training data, wherein the training data whose prediction results are abnormal comprises: training data whose preset label differs from its prediction label;
A noise data determination submodule, configured to determine, for each group of abnormal training data, that the abnormal training data whose prediction result confidence meets a preset first data processing condition is noise data, wherein the preset first data processing condition is determined according to the prediction accuracy.
10. The device according to claim 9, characterized by further comprising:
A first model optimization module, configured to optimize the target model based on the training data other than the noise data.
11. The device according to claim 8, characterized in that the data processing module further comprises:
A first data grouping submodule, configured to classify the training data whose prediction results are abnormal according to pairwise combinations of the preset label and the prediction label, to determine several groups of abnormal training data, wherein the training data whose prediction results are abnormal comprises: training data whose preset label differs from its prediction label;
An easily confused training data determination submodule, configured to determine, for each group of abnormal training data, that the abnormal training data whose prediction result confidence meets a preset second data processing condition is easily confused training data, wherein the preset second data processing condition is determined according to the prediction accuracy.
12. The device according to claim 11, characterized by further comprising:
A second model optimization module, configured to optimize the target model based on the easily confused training data.
13. The device according to claim 12, characterized in that the second model optimization module is further configured to:
According to the prediction result confidence, determine, for each group of abnormal training data, the prediction complexity matched by the easily confused training data in that group;
In order of prediction complexity from easy to difficult, iteratively optimize the target model based on the easily confused training data matching each prediction complexity.
14. The device according to claim 12, characterized in that the second model optimization module is further configured to:
Determine similar training data for the easily confused training data, wherein the preset label of the similar training data is identical to the prediction label of the easily confused training data;
Construct similar training data pairs from the similar training data and the easily confused training data;
Optimize the target model based on the similar training data pairs.
15. An electronic device, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the data processing method according to any one of claims 1 to 7.
16. A computer-readable storage medium having a computer program stored thereon, characterized in that the program, when executed by a processor, implements the steps of the data processing method according to any one of claims 1 to 7.
CN201810866737.XA 2018-08-01 2018-08-01 Data processing method and device, electronic equipment and storage medium Active CN109189767B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810866737.XA CN109189767B (en) 2018-08-01 2018-08-01 Data processing method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN109189767A (en) 2019-01-11
CN109189767B (en) 2021-07-23

Family

ID=64920386

Country Status (1)

Country Link
CN (1) CN109189767B (en)

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070112701A1 (en) * 2005-08-15 2007-05-17 Microsoft Corporation Optimization of cascaded classifiers
CN101378519A (en) * 2008-09-28 2009-03-04 宁波大学 Method for evaluating quality-lose referrence image quality base on Contourlet transformation
CN101540048A (en) * 2009-04-21 2009-09-23 北京航空航天大学 Image quality evaluating method based on support vector machine
CN102567744A (en) * 2011-12-29 2012-07-11 中国科学院自动化研究所 Method for determining quality of iris image based on machine learning
CN104834898A (en) * 2015-04-09 2015-08-12 华南理工大学 Quality classification method for portrait photography image
CN105046277A (en) * 2015-07-15 2015-11-11 华南农业大学 Robust mechanism research method of characteristic significance in image quality evaluation
CN105426826A (en) * 2015-11-09 2016-03-23 张静 Tag noise correction based crowd-sourced tagging data quality improvement method
CN106709511A (en) * 2016-12-08 2017-05-24 华中师范大学 Urban rail transit panoramic monitoring video fault detection method based on depth learning
CN107463953A (en) * 2017-07-21 2017-12-12 上海交通大学 Image classification method and system based on quality insertion in the case of label is noisy
CN107562859A (en) * 2017-08-29 2018-01-09 武汉斗鱼网络科技有限公司 A kind of disaggregated model training system and its implementation
CN107688823A (en) * 2017-07-20 2018-02-13 北京三快在线科技有限公司 A kind of characteristics of image acquisition methods and device, electronic equipment
CN107704806A (en) * 2017-09-01 2018-02-16 深圳市唯特视科技有限公司 A kind of method that quality of human face image prediction is carried out based on depth convolutional neural networks
CN108122002A (en) * 2017-12-18 2018-06-05 东软集团股份有限公司 Training sample acquisition methods and device
CN108345846A (en) * 2018-01-29 2018-07-31 华东师范大学 A kind of Human bodys' response method and identifying system based on convolutional neural networks

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
GUOQING GUI 等: ""Data-Driven Support Vector Machine with Optimization Techniques for Structural Health Monitoring and Damage Detection"", 《KSCE JOURNAL OF CIVIL ENGINEERING》 *
夏战国 等: ""类不均衡的半监督高斯过程分类算法"", 《通信学报》 *
范媛媛 等: ""基于BP神经网络的图像质量评价参数优化"", 《应用光学》 *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2596438A (en) * 2019-03-07 2021-12-29 Ibm Computer model machine learning based on correlations of training data with performance trends
US11809966B2 (en) 2019-03-07 2023-11-07 International Business Machines Corporation Computer model machine learning based on correlations of training data with performance trends
WO2020178687A1 (en) * 2019-03-07 2020-09-10 International Business Machines Corporation Computer model machine learning based on correlations of training data with performance trends
CN110705596A (en) * 2019-09-04 2020-01-17 北京三快在线科技有限公司 White screen detection method and device, electronic equipment and storage medium
CN110929785A (en) * 2019-11-21 2020-03-27 中国科学院深圳先进技术研究院 Data classification method and device, terminal equipment and readable storage medium
CN110929785B (en) * 2019-11-21 2023-12-05 中国科学院深圳先进技术研究院 Data classification method, device, terminal equipment and readable storage medium
CN110909688A (en) * 2019-11-26 2020-03-24 南京甄视智能科技有限公司 Face detection small model optimization training method, face detection method and computer system
CN111144216A (en) * 2019-11-27 2020-05-12 北京三快在线科技有限公司 Picture label generation method and device, electronic equipment and readable storage medium
CN111078877B (en) * 2019-12-05 2023-03-21 支付宝(杭州)信息技术有限公司 Data processing method, training method of text classification model, and text classification method and device
CN111078877A (en) * 2019-12-05 2020-04-28 支付宝(杭州)信息技术有限公司 Data processing method, training method of text classification model, and text classification method and device
CN111325278B (en) * 2020-02-26 2023-08-29 重庆金山医疗技术研究院有限公司 Image processing method, device and storage medium
CN111325278A (en) * 2020-02-26 2020-06-23 重庆金山医疗技术研究院有限公司 Image processing method, device and storage medium
CN111724136A (en) * 2020-06-23 2020-09-29 平安医疗健康管理股份有限公司 Method and device for entering information of first page of medical record and computer equipment
CN112749516A (en) * 2021-02-03 2021-05-04 江南机电设计研究所 System combination model reliability intelligent evaluation method suitable for multi-type data characteristics
CN112749516B (en) * 2021-02-03 2023-08-25 江南机电设计研究所 Intelligent evaluation method for credibility of system combination model adapting to multi-type data characteristics
CN113360643A (en) * 2021-05-27 2021-09-07 重庆南鹏人工智能科技研究院有限公司 Electronic medical record data quality evaluation method based on short text classification
CN114417987A (en) * 2022-01-11 2022-04-29 支付宝(杭州)信息技术有限公司 Model training method, data identification method, device and equipment

Also Published As

Publication number Publication date
CN109189767B (en) 2021-07-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant