CN108416364A - Subpackage fusion ensemble learning data classification method - Google Patents
- Publication number
- CN108416364A CN108416364A CN201810097334.3A CN201810097334A CN108416364A CN 108416364 A CN108416364 A CN 108416364A CN 201810097334 A CN201810097334 A CN 201810097334A CN 108416364 A CN108416364 A CN 108416364A
- Authority
- CN
- China
- Prior art keywords
- sample
- subset
- weight
- classifier model
- ensemble learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/2148—Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the process organisation or structure, e.g. boosting cascade
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
Abstract
The present invention discloses a subpackage fusion ensemble learning data classification method comprising the following steps. S1: obtain data and form a training set and a test set; S2: divide the training set into K subsets using a subspace partition module; S3: train one classifier model on each subset; S4: calculate the weight factor corresponding to each classifier model; S5: input the test data into each classifier model, multiply the sample labels output by each classifier model by the corresponding weight factors, and sum the weighted outputs to obtain the final classification result. Its effect: by partitioning the data and learning the samples within each subspace, the influence of overlapping regions of the sample space on the classifier models is weakened; the misclassified samples of each subset are then enhanced and transferred to the next subset to be learned again, which increases sample utilization. A multi-space weighted fusion ensemble learning module performs weighted integration of the predictions of all subsets, further reducing the influence of overlap-region samples on the classifier models and improving classification accuracy.
Description
Technical field
The invention belongs to data classification and recognition technology in the big data field, and specifically relates to a subpackage fusion ensemble learning data classification method.
Background technology
In the big data field, data classification is widely applied, for example in medical diagnosis, emotion judgment, semantic recognition and image recognition. Commonly used classifiers include the random forest (RF) algorithm, the K-nearest-neighbor (KNN) algorithm, the support vector machine (SVM) model and the extreme learning machine (ELM) model. Although existing research has made great progress in feature extraction, feature learning and classifier design, sample learning is often neglected.
Taking the diagnosis of Parkinson's disease from voice data as an example, during speech sampling and preprocessing, factors such as the acquisition equipment and noise may introduce large errors between the final numerical samples and the actual samples, forming abnormal samples. Abnormal samples usually cause samples of different classes to mix in the sample space and form overlapping regions, and samples in the overlapping regions may mislead the classifier model. No existing research has proven whether this part of the samples is beneficial or harmful to the established classifier model. Existing methods either delete these samples or treat them as being as important as the other samples, without considering weakening their influence on the classifier through the algorithm itself.
Summary of the invention
In view of the above drawbacks, the present invention provides a subpackage fusion ensemble learning data classification method that reduces the influence of overlap-region samples on the classification model by learning the sample space. First, the class-center distance metric ratio of each sample in the training set is calculated as the sample weight, and the training samples are sorted in descending order of weight. The sorted training set is then divided into several subsets in order. Second, the misclassified samples and the error rate of each subset are obtained by leave-one-out (LOO) cross-validation, and a sub-classifier model is trained on each subset. A penalty factor is calculated from the sample weights of each subset, and the weight factor of each subset is calculated from its post-LOO error rate. During the learning of the subsets, the misclassified samples of the previous subset are enhanced and transferred to the next subset, which learns them again. Finally, the weight of each subset is calculated from its weight factor and penalty factor, and the test results of the sub-classifiers are weighted by the subset weights. By learning the samples within each subspace, enhancing the misclassified samples of each subset and transferring them to the next subset for re-learning, the existing samples are fully used and sample utilization increases. A multi-space weighted fusion ensemble learning module performs weighted integration of the predictions of all subsets, further reducing the influence of overlap-region samples on the classifier models and improving classification accuracy.
To achieve the above object, the specific technical solution of the present invention is as follows:
A subpackage fusion ensemble learning data classification method, characterized by comprising the following steps:
S1: obtain data and form a training set and a test set;
S2: divide the training set into K subsets using a subspace partition module, K being an integer greater than or equal to 2;
S3: train one classifier model on each subset;
S4: calculate the weight factor corresponding to each classifier model;
S5: input the test data into each classifier model, multiply the sample label output by each classifier model by the corresponding weight factor, and sum the weighted results to obtain the final classification result.
Further, the subspace partition module in step S2 uses the class-center distance metric ratio as the sample weight: the class-center distance metric ratio of each sample in the training set is calculated, the samples are arranged in descending order of this ratio, and the sorted training set is finally divided into K subsets in order.
Further, step S3 trains the classifier models using a subspace sample transfer training method, specifically:
S31: let the true label set of subset Tk be Yk=[y1,y2,…,yj,…,ys]; leave-one-out cross-validation is used to obtain the predicted label set Lk;
S32: count the misclassified samples in subset Tk and the classification error rate error_rate of the subset;
S33: calculate the weight factor of each of the K trained classifier models from its error rate.
Further, the classification error rate in step S32 is error_ratek = Σ(j=1..s) weight(j)·I(Yk(j)≠Lk(j)) / Σ(j=1..s) wj, where wj denotes the class-center distance metric ratio of the j-th sample, Σ(j=1..s) wj is the sum of the class-center distance metric ratios of the s samples in subset Tk, weight(j) represents the initialization weight of the j-th sample, and I(Yk(j)≠Lk(j)) indicates that the j-th sample is misclassified.
Further, the samples of subset Tk misclassified after leave-one-out cross-validation are enhanced and then transferred to the next subset Tk+1 to be learned again.
Further, the enhancement of a misclassified sample is w_new = w_old/exp(αk), where w_old is the original weight of the misclassified sample and w_new is its weight after enhancement.
Further, a multi-space weighted ensemble learning module performs the weighting, specifically:
S41: calculate the penalty factor βk of each of the K subsets separately;
S42: calculate the weight of each subset classifier as weightk = βk·αk;
S43: calculate the weighted sample prediction labels output by each subset classifier.
The remarkable effects of the present invention are as follows. The subspace partition module proposed by this method is based on the concept of bags in the bagging algorithm, but directly divides the training set into several subsets according to a fixed criterion instead of repeating random sampling as the bagging algorithm does; this saves the repeated sampling process, reduces time complexity, and, by dividing subsets according to the sample distribution characteristics of the sample space, weakens the influence of overlap-region samples on the other samples when training the classifier models. The subspace sample transfer training module draws on the sample-enhancement concept and the classifier weight calculation of the AdaBoost algorithm: the samples within each subspace are learned, and the misclassified samples of each subset are enhanced and transferred to the next subset to be learned again, so the existing samples are fully used and sample utilization increases. Finally, the multi-space weighted fusion ensemble learning module performs weighted integration of the predictions of all subsets, further reducing the influence of overlap-region samples on the classifier models and improving classification accuracy.
Description of the drawings
Fig. 1 is the control flow chart of the present invention;
Fig. 2 is the data subpackage flow chart of the subspace partition module;
Fig. 3 is a schematic diagram of the class-center distance calculation;
Fig. 4 is the subspace sample transfer training flow chart;
Fig. 5 is the flow chart of the multi-space weighted ensemble learning;
Fig. 6 shows the average classification accuracy for randomly drawn samples under different subset numbers;
Fig. 7 shows the subset weights and test-set prediction results under different conditions;
Fig. 8 shows the performance impact of different algorithms in the specific embodiment.
Specific embodiments
In order to make the technical problem to be solved by the present invention, the technical solution and the advantages clearer, a detailed description is given below in conjunction with the accompanying drawings and specific embodiments.
As shown in Fig. 1, this embodiment provides a subpackage fusion ensemble learning data classification method comprising the following steps:
S1: obtain data and form a training set and a test set;
S2: divide the training set into K subsets using a subspace partition module, K being an integer greater than or equal to 2;
S3: train one classifier model on each subset;
S4: calculate the weight factor corresponding to each classifier model;
S5: input the test data into each classifier model, multiply the sample label output by each classifier model by the corresponding weight factor, and sum the weighted results to obtain the final classification result.
This embodiment applies the method to the diagnosis of Parkinson's disease: voice data are classified to realize early diagnosis and prediction of Parkinson's disease. The data set used is the "Training set" provided by Sakar et al. and downloaded from the machine learning repository of the University of California, Irvine (UCI). The data set is divided into two parts: subjects with Parkinson's disease and healthy subjects. Among the subjects with Parkinson's disease there are 14 males and 6 females; among the healthy subjects there are 10 males and 10 females. The data set therefore contains 40 subjects in total. The entire data set contains 1040 samples, and each sample has 26 features. It is worth noting that each subject has 26 samples, which represent 26 different semantic tasks.
In a specific implementation, the above method can be divided into three parts: a subspace partition module (SP), a subspace sample transfer training module (TST), and a multi-space weighted ensemble learning module (MWEL). The SP module divides the training set into subsets. The TST module trains a sub-classifier model on each subset and calculates the relevant parameters of the subset. The MWEL module performs weighted fusion of the prediction labels of all subsets to obtain the final classification result.
As shown in Fig. 2, the bagging algorithm generates multiple new training sets from the original training set by sampling with replacement; each new training set has the same number of samples as the original training set. Each new training set then trains a classifier model, which is verified on the test set. Finally, the prediction labels of the classifier models of the new training sets are combined by voting to obtain the final result. Obviously, the training sets in the bagging algorithm are obtained by random sampling, which makes the result uncertain. When classification experiments are performed with the bagging algorithm, the experiment is therefore usually repeated many times and the average of the results is taken as the final result. Such an experimental process undoubtedly increases the time complexity of the model. The subspace partition module (SP) proposed by the present invention is based on the concept of bags in the bagging algorithm, but directly divides the training set into several subsets according to a fixed criterion instead of repeating random sampling as the bagging algorithm does. In this process, the training-set samples are weighted by the sample class-center distance ratio.
Assume the training set T containing K classes of samples is expressed as T=[S1; S2; …; St; …; SK], t = 1, 2, …, K, where the sample set of class t is St=[s1; s2; …; si; …; sm], i = 1, 2, …, m, and m is the number of samples in the class. The i-th sample in St is expressed as si=[f1, …, fj, …, fn], j = 1, 2, …, n, where fj denotes the j-th feature of the sample.
As shown in Fig. 3, point B is the sample center point of class St, with coordinates B = (1/m)·Σ(i=1..m) si, that is, the mean of each feature over the class. Point C is the coordinate of the i-th sample in class St. Point A is the center point of the samples of all other classes, computed in the same way over those samples. Let D be the midpoint of segment AB, so that AD = DB, and let α = ∠CDB, β = ∠CDA. Then the distance between sample si and its own class center is d0 = ‖C − B‖ (1), and its distance to the center of the foreign-class samples is d1 = ‖C − A‖ (2). The class-center distance metric ratio of sample i is therefore wi = d0/d1 (3).
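As a minimal sketch (plain Python; the helper names are ours, and, following the geometry above, d0 is taken as the distance to the own-class center B and d1 as the distance to the foreign-class center A), the class-center distance metric ratio wi = d0/d1 of Eq. (3) can be computed as:

```python
from math import dist  # Euclidean distance, Python 3.8+

def class_center(samples):
    """Mean feature vector of a list of samples (point B for the own class,
    point A when applied to all samples of the other classes)."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]

def center_distance_ratio(sample, own_class, other_classes):
    """wi = d0/d1 of Eq. (3): distance to the own-class center over the
    distance to the center of the remaining classes. A small wi means the
    sample sits close to its own class and far from the others."""
    b = class_center(own_class)                              # own-class center B
    a = class_center([s for c in other_classes for s in c])  # foreign center A
    d0 = dist(sample, b)                                     # Eq. (1)
    d1 = dist(sample, a)                                     # Eq. (2)
    return d0 / d1                                           # Eq. (3)
```

A sample lying exactly on its own class center gives w = 0; samples drifting toward the foreign center push w past 1, matching the three cases discussed below.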
As seen from the figure, the class-center distance metric ratio wi of a sample has three cases. From a geometric point of view, since AD = DB in the triangle, CD is shared, and α + β = 180°, the lengths of segments BC and AC (i.e. d0 and d1) are closely bound to the angles α and β. If α < β, then d0 < d1, meaning wi < 1; if α = β, triangles ADC and DBC are congruent, so d0 = d1 and wi = 1; if α > β, then d0 > d1, which means wi > 1.
Analysis shows that the larger w is, the greater the degree of aliasing between the sample and samples of other classes; with d0 unchanged, the farther the sample point is from the samples of other classes, the smaller w is. Based on this, the class-center distance metric ratio can be used to indicate the degree of aliasing between a sample and the samples of other classes. Ideally, the smaller w is, the more advantageous the sample is for constructing the classifier model; the larger w is, the more likely the sample is to mislead the classifier model in the whole sample space. Therefore, the present invention uses the class-center distance metric ratio as the sample weight. Formula (3) is used to calculate the class-center distance metric ratio w of each sample in the training set, the training-set samples are sorted by w from large to small, and the sorted training set is then divided evenly into K subsets in order; we call this process subpackaging.
Assume the original training set is divided into K subsets by the subspace partition module, expressed as:
T=[T1,T2,…,Tk,…,TK], k = 1, 2, …, K.
The sample weights of the training set are expressed as W=[W1,W2,…,Wk,…,WK], where:
Wk=[w1,w2,…,wj,…,ws], j = 1, 2, …, s, represents the weight set of the k-th subset Tk, and s denotes the number of samples in subset Tk.
Study shows that after the training set is divided into K subsets, as the subset index increases, the separability of the samples in each subset should behave as a concave function. The greater the separability of a subset, the better the performance of the trained sub-classifier, and the larger the weight of that sub-classifier should be in the whole model.
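The subpackage step itself reduces to a sort and an even split. A sketch under the stated convention (descending w, consecutive slices; `subpackage` is our own name, not from the patent):

```python
def subpackage(samples, weights, k):
    """Subspace partition (SP) sketch: sort samples by their class-center
    distance metric ratio w in descending order, then split the sorted list
    into k consecutive subsets; later subsets therefore hold the smaller-w
    (cleaner) samples. The last subset absorbs any remainder."""
    order = sorted(range(len(samples)), key=lambda i: weights[i], reverse=True)
    size = len(samples) // k
    subsets = []
    for j in range(k):
        lo = j * size
        hi = lo + size if j < k - 1 else len(samples)  # remainder goes last
        subsets.append([samples[i] for i in order[lo:hi]])
    return subsets
```

Because the split is deterministic given the weights, no repeated sampling runs are needed, which is the time-complexity saving claimed over bagging.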
Step S3 trains the classifier models using the subspace sample transfer training method, specifically:
S31: let the true label set of subset Tk be Yk=[y1,y2,…,yj,…,ys]; leave-one-out cross-validation is used to obtain the predicted label set Lk;
S32: count the misclassified samples in subset Tk and the classification error rate error_rate of the subset;
S33: calculate the weight factor αk of each of the K trained classifier models from its error rate (5).
Drawing on the sample-enhancement concept and the classifier weight calculation of the AdaBoost algorithm, the misclassified samples and the classification error rate of subset Tk are counted, and the weight factor of the trained classifier model is calculated.
The classification error rate in step S32 is error_ratek = Σ(j=1..s) weight(j)·I(Yk(j)≠Lk(j)) / Σ(j=1..s) wj, where wj denotes the class-center distance metric ratio of the j-th sample, Σ(j=1..s) wj is the sum of the class-center distance metric ratios of the s samples in subset Tk, weight(j) represents the initialization weight of the j-th sample, and I(Yk(j)≠Lk(j)) indicates that the j-th sample is misclassified.
In the TST module, the samples misclassified after leave-one-out cross-validation are enhanced and then passed to the next subset Tk+1 to be learned again. Transferred in this way, a misclassified sample can always be retrained in the next subset, which increases the utilization rate of the samples. The flow of the TST module is shown in Fig. 4.
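The error-rate and weight-factor computations of steps S32 and S33 can be sketched as follows. The error rate follows the weighted formula above (taking the initialization weight weight(j) equal to the sample weight wj); the patent's own weight-factor formula is not legible in this text, so the AdaBoost-style αk = ½·ln((1 − error_rate)/error_rate) is assumed here, which is consistent with the later requirement αk ≥ 0 when the error rate stays below ½:

```python
from math import log

def subset_error_rate(weights, y_true, y_pred):
    """error_ratek = sum_j weight(j)*I(y_j != l_j) / sum_j w_j, with the
    initialization weight weight(j) taken equal to the sample weight w_j
    (an assumption; the patent defines them as separate symbols)."""
    wrong = sum(w for w, y, l in zip(weights, y_true, y_pred) if y != l)
    return wrong / sum(weights)

def weight_factor(error_rate):
    """ASSUMED AdaBoost-style classifier weight alpha_k; the patent's own
    formula is an image in the source and is not reproduced here."""
    return 0.5 * log((1.0 - error_rate) / error_rate)
```

In practice y_pred would come from leave-one-out cross-validation over the subset, as step S31 prescribes.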
As shown in Fig. 4, before the misclassified samples of a subset are transferred to the next subset, they must be enhanced. In the subspace partition module the subsets are divided in descending order of sample weight, so the sample weights decrease from subset to subset; enhancing a sample therefore means reducing the class-center distance metric ratio of the misclassified sample. It is, however, worth considering that a misclassified sample from the previous subset may be misclassified again while the next subset is learned. It is therefore necessary to suppress the influence of the misclassified samples of the previous subset on the classifier weight of the next trained model. At the same time, the larger αk is, the greater the separability of the samples of subset Tk, and the more likely the misclassified samples placed into the next subset are to affect the weight of that subset; these misclassified samples should therefore interfere with the model as little as possible. The sample enhancement of the present invention is expressed as w_new = w_old/exp(αk) (8), where w_old is the original weight of the misclassified sample and w_new is its weight after enhancement (the sample class-center distance metric ratio is referred to as the sample weight). Since αk always satisfies αk ≥ 0, exp(αk) ≥ 1 always holds, so formula (8) reduces the class-center distance metric ratio of the misclassified sample and thereby realizes the sample enhancement. Moreover, exp(αk) is a monotonically increasing function: the larger αk is, the smaller the new weight of the misclassified sample produced by formula (8), and the smaller its influence on the parameter αk+1 of the next subset. In this way, samples from a subset with large separability are prevented from increasing the weight of a subset with smaller separability.
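Formula (8) itself is a one-liner; since αk ≥ 0 implies exp(αk) ≥ 1, dividing by exp(αk) can only shrink the transferred sample's weight:

```python
from math import exp

def enhance(weight_old, alpha_k):
    """Formula (8): w_new = w_old / exp(alpha_k). A larger alpha_k (a more
    separable source subset) shrinks the transferred sample's weight more,
    limiting its influence on the next subset's parameter alpha_{k+1}."""
    return weight_old / exp(alpha_k)
```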
Next, as shown in Fig. 5, the MWEL module performs weighted integration of the prediction labels of all subsets, further weakening the influence of overlap-region samples on the classifier models; the final classification accuracy is obtained through ensemble learning. The weight in the whole model of the classifier model trained on each subset can be calculated using equation (5). However, the distribution of the test samples in the sample space is completely unknown, so formula (5) alone cannot express the weight of each trained classifier in the final model. To improve the robustness of the model on the test set, a penalty factor is needed to constrain the subset weights. This embodiment therefore uses the multi-space weighted ensemble learning module for the weighting, specifically:
S41: calculate the penalty factor βk of each of the K subsets separately;
S42: calculate the weight of each subset classifier as weightk = βk·αk;
S43: calculate the weighted sample prediction labels output by each subset classifier.
With the above design, αk is constrained by βk, which can improve the generalization ability of the model. Assume the weight set of the K subsets is expressed as Weight=[weight1,weight2,…,weightK]; the weight of each subset in the whole model is calculated from Weight. If λk denotes the weight of the k-th subset in the whole model, then to ensure Σ(k=1..K) λk = 1, λk is calculated as λk = weightk / Σ(k=1..K) weightk.
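A hedged sketch of the MWEL combination (the βk formula is not reproduced in this text, so the penalty factors are taken here as given inputs; `mwel_fuse` is our own name): weightk = βk·αk, λk = weightk/Σ weightk, and the final label is the λ-weighted vote over the sub-classifier predictions:

```python
def mwel_fuse(alphas, betas, predictions, labels=(0, 1)):
    """Multi-space weighted ensemble learning (MWEL) sketch.
    alphas[k], betas[k]: weight factor and penalty factor of subset k
    (the beta formula is an image in the source and is assumed given).
    predictions[k]: label predicted for the test sample by sub-classifier k.
    Returns the label with the largest summed normalized weight lambda_k."""
    weights = [b * a for a, b in zip(alphas, betas)]  # weight_k = beta_k * alpha_k
    total = sum(weights)
    lambdas = [w / total for w in weights]            # ensures sum(lambdas) == 1
    score = {lab: 0.0 for lab in labels}
    for lam, pred in zip(lambdas, predictions):
        score[pred] += lam
    return max(score, key=score.get)
```

The normalization step mirrors the Σλk = 1 constraint above: a high-α subset only dominates relative to the other subsets, not absolutely.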
Further, to verify the feasibility of the above method, the present invention also designed the following experiments.
(1) Because the sample distribution of the sample space differs between data sets, the optimal number of subsets must be determined for each data set. Moreover, the number of packages can be neither too large nor too small: if too large, the number of samples in each subset is too small and training is insufficient; if too small, samples of different classes are deeply aliased within a subset, which is unfavorable for classification. Therefore, after training the model with the proposed subpackage fusion ensemble learning classification method, 26 samples were randomly drawn from the training set for verification and the prediction accuracy was counted. For the data set used in this embodiment, the number of packages was selected between 5 and 9; the average prediction accuracy over 20 experiments for the different package numbers is shown in Fig. 6.
Fig. 6 shows that the best number of packages for the data set used in this embodiment is 7. To verify the separability of the samples of each subset, the leave-one-out classification accuracy of the samples of the 7 subsets was counted in one experiment; the accuracies of the seven subsets essentially form a concave function, which confirms the analysis of each subset's performance in the method section. It can be seen that a subset with high classification accuracy has large sample separability and should have a larger weight in the whole model, while a subset with low classification accuracy has small sample separability and should have a smaller weight.
The transfer of misclassified samples between subsets increases sample utilization. In one experiment, the number of misclassified samples each subset passed to the next subset was recorded as Ni, and the number of those transferred samples misclassified again in the next subset as Mi+1; the statistics are shown in Table 1:
Table 1: Misclassified samples transferred to the next subset versus transferred samples misclassified again
Number of subsets | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
Ni | 2 | 8 | 57 | 64 | 26 | 18 | 1 |
Mi+1 | — | 0 | 4 | 13 | 4 | 1 | 1 |
The results in Table 1 show that most of the misclassified samples transferred each time are classified correctly in the next subset, which proves that the transfer of misclassified samples increases the utilization ratio of the samples and realizes full use of the available samples.
Because the transfer of misclassified samples affects the weight parameters of the next subset, the method was evaluated separately in three settings: no transfer of misclassified samples; transfer of misclassified samples without sample enhancement; and transfer of misclassified samples with enhancement. Fig. 7 shows the subset weights and the test-set prediction results in these three settings, verifying the influence of misclassified-sample transfer and of sample enhancement on the experimental results.
Comparing the subset weights without transfer against those with transfer but without enhancement shows that after sample transfer the weight curve of the subsets is closer to a concave function: on the whole, the weights of subsets with large separability increase and the weights of subsets with small separability decrease. Comparing sample enhancement against no enhancement shows that the enhancement has little influence on the subset weights as a whole.
To compare the performance of the present invention, four methods were evaluated: method 1 is the conventional approach, obtained by classifying the data set directly; method 2 is the bagging algorithm; method 3 uses the SP and TST modules of this method and then combines the subset predictions by voting; method 4 classifies fully according to the method proposed by the present invention. The accuracy comparison curves of the four methods are shown in Fig. 8.
As can be seen from Fig. 8, the method proposed by the present invention significantly improves the classification performance of classifiers such as RF, SVM (linear) and SVM (RBF).
Finally, it should be noted that this embodiment describes only a preferred embodiment of the present invention. Those of ordinary skill in the art, under the inspiration of the present invention and without departing from the purpose of the present invention and the claims, may make many kinds of variations, and all such variations fall within the protection scope of the present invention.
Claims (7)
1. A subpackage fusion ensemble learning data classification method, characterized by comprising the following steps:
S1: obtain data and form a training set and a test set;
S2: divide the training set into K subsets using a subspace partition module, K being an integer greater than or equal to 2;
S3: train one classifier model on each subset;
S4: calculate the weight factor corresponding to each classifier model;
S5: input the test data into each classifier model, multiply the sample label output by each classifier model by the corresponding weight factor, and sum the weighted results to obtain the final classification result.
2. The subpackage fusion ensemble learning data classification method according to claim 1, characterized in that: the subspace partition module in step S2 uses the class-center distance metric ratio as the sample weight, calculates the class-center distance metric ratio of each sample in the training set, arranges the samples in descending order of this ratio, and finally divides them into K subsets in order.
3. The subpackage fusion ensemble learning data classification method according to claim 1 or 2, characterized in that: step S3 trains the classifier models using a subspace sample transfer training method, specifically:
S31: let the true label set of subset Tk be Yk=[y1,y2,…,yj,…,ys]; leave-one-out cross-validation is used to obtain the predicted label set Lk;
S32: count the misclassified samples in subset Tk and the classification error rate error_rate of the subset;
S33: calculate the weight factor of each of the K trained classifier models.
4. The subpackage fusion ensemble learning data classification method according to claim 3, characterized in that: the classification error rate in step S32 is error_ratek = Σ(j=1..s) weight(j)·I(Yk(j)≠Lk(j)) / Σ(j=1..s) wj, where wj denotes the class-center distance metric ratio of the j-th sample, Σ(j=1..s) wj is the sum of the class-center distance metric ratios of the s samples in subset Tk, weight(j) represents the initialization weight of the j-th sample, and I(Yk(j)≠Lk(j)) indicates that the j-th sample is misclassified.
5. The subpackage fusion ensemble learning data classification method according to claim 3, characterized in that: the samples of subset Tk misclassified after leave-one-out cross-validation are enhanced and then transferred to the next subset Tk+1 to be learned again.
6. The subpackage fusion ensemble learning data classification method according to claim 5, characterized in that: the enhancement of a misclassified sample is w_new = w_old/exp(αk), where w_old is the original weight of the misclassified sample and w_new is its weight after enhancement.
7. The subpackage fusion ensemble learning data classification method according to claim 3, characterized in that: a multi-space weighted ensemble learning module performs the weighting, specifically:
S41: calculate the penalty factor βk of each of the K subsets separately;
S42: calculate the weight of each subset classifier as weightk = βk·αk;
S43: calculate the weighted sample prediction labels output by each subset classifier.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810097334.3A CN108416364A (en) | 2018-01-31 | 2018-01-31 | Subpackage fusion ensemble learning data classification method
Publications (1)
Publication Number | Publication Date |
---|---|
CN108416364A true CN108416364A (en) | 2018-08-17 |
Family
ID=63127486
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810097334.3A Pending CN108416364A (en) | 2018-01-31 | 2018-01-31 | Subpackage fusion ensemble learning data classification method
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108416364A (en) |
History: 2018-01-31, application CN201810097334.3A filed in China (published as CN108416364A); status: Pending.
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111382758B (en) * | 2018-12-28 | 2023-12-26 | 杭州海康威视数字技术股份有限公司 | Training image classification model, image classification method, device, equipment and medium |
CN111382758A (en) * | 2018-12-28 | 2020-07-07 | 杭州海康威视数字技术股份有限公司 | Training image classification model, image classification method, device, equipment and medium |
CN109765341A (en) * | 2019-01-25 | 2019-05-17 | 重庆水利电力职业技术学院 | A kind of structure monitoring system for civil engineering |
CN110222762A (en) * | 2019-06-04 | 2019-09-10 | 恒安嘉新(北京)科技股份公司 | Object prediction method, apparatus, equipment and medium |
US11507882B2 (en) | 2019-09-12 | 2022-11-22 | Beijing Xiaomi Intelligent Technology Co., Ltd. | Method and device for optimizing training set for text classification and storage medium |
CN111709488A (en) * | 2020-06-22 | 2020-09-25 | 电子科技大学 | Dynamic label deep learning algorithm |
CN111783093A (en) * | 2020-06-28 | 2020-10-16 | 南京航空航天大学 | Malicious software classification and detection method based on soft dependence |
CN111882003A (en) * | 2020-08-06 | 2020-11-03 | 北京邮电大学 | Data classification method, device and equipment |
CN111882003B (en) * | 2020-08-06 | 2024-01-23 | 北京邮电大学 | Data classification method, device and equipment |
CN112183582A (en) * | 2020-09-07 | 2021-01-05 | 中国海洋大学 | Multi-feature fusion underwater target identification method |
CN113393932B (en) * | 2021-07-06 | 2022-11-25 | 重庆大学 | Parkinson's disease voice sample segment multi-type reconstruction transformation method |
CN113393932A (en) * | 2021-07-06 | 2021-09-14 | 重庆大学 | Parkinson's disease voice sample segment multi-type reconstruction transformation method |
CN116843998A (en) * | 2023-08-29 | 2023-10-03 | 四川省分析测试服务中心 | Spectrum sample weighting method and system |
CN116843998B (en) * | 2023-08-29 | 2023-11-14 | 四川省分析测试服务中心 | Spectrum sample weighting method and system |
CN118098623A (en) * | 2024-04-26 | 2024-05-28 | 菏泽医学专科学校 | Medical information data intelligent management method and system based on big data |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108416364A (en) | Integrated study data classification method is merged in subpackage | |
CN105589806B (en) | A kind of software defect tendency Forecasting Methodology based on SMOTE+Boosting algorithms | |
CN111832608A (en) | Multi-abrasive-particle identification method for ferrographic image based on single-stage detection model yolov3 | |
CN111090764B (en) | Image classification method and device based on multitask learning and graph convolution neural network | |
CN103605711B (en) | Construction method and device, classification method and device of support vector machine | |
CN110111885B (en) | Attribute prediction method, attribute prediction device, computer equipment and computer readable storage medium | |
CN109948740A (en) | A kind of classification method based on tranquillization state brain image | |
CN110175697A (en) | A kind of adverse events Risk Forecast System and method | |
CN104966106B (en) | A kind of biological age substep Forecasting Methodology based on support vector machines | |
CN107247954A (en) | A kind of image outlier detection method based on deep neural network | |
CN115908255A (en) | Improved light-weight YOLOX-nano model for target detection and detection method | |
US20150242676A1 (en) | Method for the Supervised Classification of Cells Included in Microscopy Images | |
CN112690774B (en) | Magnetic resonance image-based stroke recurrence prediction method and system | |
CN112382382B (en) | Cost-sensitive integrated learning classification method and system | |
CN113486202A (en) | Method for classifying small sample images | |
US20220319002A1 (en) | Tumor cell isolines | |
US20230409926A1 (en) | Index for risk of non-adherence in geographic region with patient-level projection | |
Sridhar et al. | Multi-lane capsule network architecture for detection of COVID-19 | |
Isnanto et al. | Classification of Heart Disease Using Linear Discriminant Analysis Algorithm | |
Mahendra et al. | Optimizing convolutional neural network by using genetic algorithm for COVID-19 detection in chest X-ray image | |
Ahmed et al. | Improving prediction of plant disease using k-efficient clustering and classification algorithms | |
CN115204475A (en) | Drug rehabilitation place security incident risk assessment method | |
CN109800854A (en) | A kind of Hydrophobicity of Composite Insulator grade determination method based on probabilistic neural network | |
TWI599896B (en) | Multiple decision attribute selection and data discretization classification method | |
Cahyani et al. | COVID-19 classification using CNN-BiLSTM based on chest X-ray images |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180817 |