CN103593470B - A two-degree ensemble classification algorithm for imbalanced data streams - Google Patents

A two-degree ensemble classification algorithm for imbalanced data streams

Info

Publication number
CN103593470B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310624425.5A
Other languages
Chinese (zh)
Other versions
CN103593470A (en)
Inventor
张重生
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Henan University
Original Assignee
Henan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Henan University
Priority to CN201310624425.5A
Publication of CN103593470A
Application granted
Publication of CN103593470B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24568 Data stream processing; Continuous queries

Abstract

The invention discloses a two-degree ensemble classification algorithm for imbalanced data streams, comprising three stages: prediction by balanced data-stream classification models, classification-confidence assessment, and prediction by imbalanced data-stream classification models. First, the balanced data-stream classification models predict the class of each data record. The confidence assessment then evaluates the reliability of the results from the balanced-model prediction stage: results for records with high confidence are returned to the user directly, while records with low confidence are classified again in the imbalanced-model prediction stage. The method can be widely used in applications such as computer-aided clinical diagnosis and real-time intrusion detection, and belongs to the field of artificial-intelligence applications.

Description

A two-degree ensemble classification algorithm for imbalanced data streams
Technical field
The present invention relates to a data stream classification algorithm, and in particular to a two-degree ensemble classification algorithm for imbalanced data streams.
Background
In recent years, data mining technology has been used more and more in practical applications across all industries, including computer-aided clinical diagnosis, Internet-based recommendation and advertising systems, customer segmentation, financial data analysis, and abnormal-transaction monitoring. Such industry-oriented intelligent analysis and decision systems have become widely accepted.
In many practical applications the data distribution is imbalanced, also called skewed. For example, 90% of the data records belong to class A, which is called the majority class, while only 10% belong to class B, which is called the minority class. In financial data analysis, for instance, most transactions are normal and only a very small number are abnormal. When classification techniques are used to discover the patterns of abnormal transactions, finding those patterns from a small number of abnormal transaction records and building an abnormal-transaction classification model is an extremely challenging task: the model must identify abnormal transactions fairly accurately while not misjudging normal transactions as abnormal. In other words, the model should classify both abnormal and normal transactions fairly accurately.
Many practical data mining applications must process not only static data but also large volumes of flowing data, i.e., data streams. Examples include social media mining, website click-stream analysis, stock trading analysis, event detection, and sensor data processing. In these applications, data streams with imbalanced (skewed) distributions are common. Existing classification algorithms can improve the classification accuracy of the minority class in an imbalanced data stream, but they reduce the classification accuracy of the majority class. A better classification algorithm for imbalanced data streams is therefore needed, one that predicts minority-class records fairly accurately while still ensuring the classification accuracy for majority-class records.
Summary of the invention
The object of the present invention is to provide a two-degree ensemble classification algorithm for imbalanced data streams that predicts the minority class in an imbalanced data stream fairly accurately while still ensuring the classification accuracy for majority-class records.
The present invention adopts the following technical solution:
A two-degree ensemble classification algorithm for imbalanced data streams, comprising the following steps:
A: Training stage for the balanced and imbalanced data-stream classification models: each newest data stream record block in the training data set is divided into a training set and a validation set; one balanced data-stream classification model and one imbalanced data-stream classification model are trained on the training set; the n balanced models and the n imbalanced models with the highest classification accuracy on the validation set are retained;
B: The n balanced data-stream classification models and the n imbalanced data-stream classification models from step A classify the data records in the validation set, the confidence of the results is assessed, and the optimized confidence threshold δ is finally obtained;
C: The n balanced data-stream classification models and the n imbalanced data-stream classification models from step A classify each data record in the test data set, and the final classification results are output.
In step B, a data-driven method determines the optimized confidence threshold δ on the validation set, as follows:
Let m1 denote the classification accuracy and m2 the geometric mean of the classification sensitivity and specificity. Initialize the variables d = 1.0 and t = 0. On the validation data set, loop as follows: starting from 0, increase the value of δ by 0.02 each time, and compute the distance l between the point (m1, m2) corresponding to this δ and the point (1, 1); if l is less than d, set d = l and t = δ. The loop ends when δ = 1. After the loop ends, the current value of t is assigned to δ; this δ is the optimized confidence threshold.
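The quantities in this search can be written compactly. The source does not name the distance measure, so the Euclidean distance below is an assumption, as are the symbols Se and Sp for sensitivity and specificity and δ* for the selected threshold.

```latex
\begin{align}
m_2(\delta) &= \sqrt{\mathrm{Se}(\delta)\,\mathrm{Sp}(\delta)}, \\
l(\delta)   &= \sqrt{\bigl(1 - m_1(\delta)\bigr)^2 + \bigl(1 - m_2(\delta)\bigr)^2}, \\
\delta^{*}  &= \operatorname*{arg\,min}_{\delta \in \{0,\ 0.02,\ \dots,\ 1\}} l(\delta).
\end{align}
```

Because d starts at 1.0 and is only replaced when l is strictly smaller, a threshold is selected only if its point (m1, m2) lies within distance 1 of (1, 1); otherwise t keeps its initial value 0.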
In step C, classifying each data record u in the test data set comprises the following steps:
C1: first, the ensemble of the n retained balanced data-stream classification models classifies u;
C2: the confidence r(u) of the classification result for u is computed; if r(u) is greater than the optimized confidence threshold δ, the classification result is returned to the user directly;
C3: if the classification confidence r(u) for u is lower than the optimized confidence threshold δ, the ensemble of the n imbalanced data-stream classification models classifies u again, and the final classification result is returned.
In step A, training a balanced data-stream classification model comprises the following steps:
A11: draw a simple random sample of size s from the training set without distinguishing the classes of the records; this sample is denoted T1;
A12: using a classification algorithm, train a classification model on T1; this model is called one balanced data-stream classification model;
A13: test the existing balanced data-stream classification models: if their total number exceeds n, test each existing balanced model on the validation set and eliminate the one with the worst classification accuracy, until the number of remaining balanced data-stream classification models equals n;
In step A, training one imbalanced data-stream classification model comprises the following steps:
A21: collect the minority-class data records in the training set of each data stream record block and put them into the minority-class record container; if the total number of data records in the container exceeds the specified number s, eliminate the oldest data records in it until the number of remaining data records equals s;
A22: when sampling, first draw a simple random sample of size s/2 from the training set Tr without distinguishing the classes of the records; then draw a simple random sample, also of size s/2, from the data records in the minority-class record container; combine the two samples to form the newest sample, denoted T2;
A23: using a classification algorithm, train a classification model on T2; this model is called one imbalanced data-stream classification model;
A24: test the existing imbalanced data-stream classification models: if their total number exceeds n, test each existing imbalanced model on the validation set Va and eliminate the one with the worst classification accuracy, until the number of remaining imbalanced data-stream classification models equals n.
Through three stages (balanced-model prediction, classification-confidence assessment, and imbalanced-model prediction), the present invention classifies new, unclassified data in an imbalanced data stream in real time with fairly high accuracy. It can predict minority-class records fairly accurately while greatly reducing the probability that the classifier misjudges majority-class records as minority-class records. The method is therefore of considerable significance for classifying data streams with imbalanced distributions. It addresses the problem, in data-stream applications, of discovering classification rules from an imbalanced real-time data stream and classifying new, unclassified data in real time with fairly high accuracy. The method belongs to the field of artificial-intelligence applications and can be widely used in applications such as computer-aided clinical diagnosis and real-time intrusion detection.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the present invention.
Detailed description of the invention
As shown in Fig. 1, the two-degree ensemble classification algorithm for imbalanced data streams uses a pane (window) data stream model: the data stream is cut in order into equal-sized data stream record blocks, each containing the same number of data records. The main parameters used in this patent are: b, the number of data records in a data stream record block; s, the sampling size, with s < b, which is also the size of the minority-class record container; and n, the number of data stream record blocks that the pane data stream model can hold. The algorithm comprises the following steps:
A: Training stage for the balanced and imbalanced data-stream classification models: each newest data stream record block is divided, in proportions of 90% and 10%, into a training data set Tr and a validation data set Va; one balanced data-stream classification model and one imbalanced data-stream classification model are trained on Tr;
In step A, training one balanced data-stream classification model comprises the following steps:
A11: draw a simple random sample of size s from Tr without distinguishing the classes of the records; this sample is denoted T1;
A12: using a classification algorithm, train a classification model on T1; this model is called one balanced data-stream classification model;
A13: test the existing balanced data-stream classification models: if their total number exceeds n, test each existing balanced model on Va and eliminate the one with the worst classification accuracy, until the number of remaining balanced data-stream classification models equals n;
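Steps A11 to A13 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the names `add_balanced_model`, `accuracy`, and `train_majority_label` are hypothetical, records are assumed to be `(features, label)` pairs, and the stand-in trainer simply predicts the most common label because the source leaves the classification algorithm unspecified.

```python
import random


def accuracy(model, records):
    """Fraction of (features, label) records that the model labels correctly."""
    return sum(model(x) == y for x, y in records) / len(records)


def train_majority_label(sample):
    """Stand-in 'classification algorithm' (an assumption): predict the most common label."""
    labels = [y for _, y in sample]
    top = max(set(labels), key=labels.count)
    return lambda x: top


def add_balanced_model(models, tr, va, s, n, train=train_majority_label, rng=random):
    """Steps A11-A13: sample T1 from Tr, train on it, then prune the pool to n models."""
    t1 = rng.sample(tr, min(s, len(tr)))   # A11: class-blind simple random sample of size s
    models.append(train(t1))               # A12: train one model on T1
    while len(models) > n:                 # A13: drop the least accurate model on Va
        worst = min(models, key=lambda m: accuracy(m, va))
        models.remove(worst)
    return models
```

Passing the trainer as a parameter keeps the sketch independent of any particular learner; any callable that maps a sample to a prediction function could be substituted.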
In step A, training one imbalanced data-stream classification model comprises the following steps:
A21: collect the minority-class data records in the training set of each data stream record block and put them into the minority-class record container. If the total number of data records in the container exceeds the specified number s, eliminate the oldest data records in it until the number of remaining data records equals s;
A22: when sampling, first draw a simple random sample of size s/2 from Tr without distinguishing the classes of the records; then draw a simple random sample, also of size s/2, from the data records in the minority-class record container; combine the two samples to form the newest sample, denoted T2;
A23: using a classification algorithm, train a classification model on T2. This model is called one imbalanced data-stream classification model;
A24: test the existing imbalanced data-stream classification models: if their total number exceeds n, test each existing imbalanced model on Va and eliminate the one with the worst classification accuracy, until the number of remaining imbalanced data-stream classification models equals n.
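The container maintenance in A21 and the sampling in A22 can be sketched as below. The function names and the `(features, label)` record layout are assumptions made for illustration; a bounded `collections.deque` is one convenient way to evict the oldest records, not necessarily the structure the inventors used.

```python
import random
from collections import deque


def update_minority_container(container, tr, minority_label):
    """Step A21: add this block's minority records; a bounded deque evicts the oldest."""
    for record in tr:
        if record[1] == minority_label:
            container.append(record)
    return container


def sample_t2(tr, container, s, rng=random):
    """Step A22: s/2 class-blind records from Tr plus s/2 records from the container."""
    half = s // 2
    from_tr = rng.sample(tr, min(half, len(tr)))
    from_minority = rng.sample(list(container), min(half, len(container)))
    return from_tr + from_minority


# The container holds at most s records, so it is created once as deque(maxlen=s)
# and reused across data stream record blocks.
```

Steps A23 and A24 then train a model on T2 and prune the pool of imbalanced models by validation accuracy, as the text describes.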
B: The n balanced data-stream classification models and the n imbalanced data-stream classification models from step A classify the data records in Va, the confidence of the results is assessed, and the optimized confidence threshold δ is obtained.
E1 is the classifier formed as the ensemble of the n balanced data-stream classification models from step A, and E2 is the classifier formed as the ensemble of the n imbalanced data-stream classification models from step A. E1 and E2 predict the class of a data record by majority voting among their member models.
In step B, the confidence of a classification result is computed as follows:
B1: for a binary classifier, r(x) is defined as the absolute value of the difference between the probabilities that the classifier assigns to a data record x belonging to each of the two classes. Let P(x ∈ A) denote the probability that x belongs to class A and P(x ∈ B) the probability that x belongs to class B; then r(x) = |P(x ∈ A) - P(x ∈ B)|, where P(x ∈ A) + P(x ∈ B) = 1. The larger the value of r(x), the higher the confidence of the binary classifier's result; conversely, the smaller the value of r(x), the lower the confidence of the result.
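The confidence measure can be illustrated with a short sketch. The source does not say how the ensemble obtains P(x ∈ A) and P(x ∈ B); estimating them as the share of member votes for each class is an assumption here, as are the function names and the tie-breaking rule in favor of the first label.

```python
def vote_fractions(models, x, labels=("A", "B")):
    """Estimate P(x in A) and P(x in B) as the share of member votes (an assumption)."""
    votes = [model(x) for model in models]
    return tuple(votes.count(label) / len(votes) for label in labels)


def confidence(p_a, p_b):
    """r(x) = |P(x in A) - P(x in B)|, where P(x in A) + P(x in B) = 1."""
    return abs(p_a - p_b)


def ensemble_predict(models, x, labels=("A", "B")):
    """Majority vote of the member models plus the confidence r(x) of that vote."""
    p_a, p_b = vote_fractions(models, x, labels)
    label = labels[0] if p_a >= p_b else labels[1]   # ties go to the first label
    return label, confidence(p_a, p_b)
```

Under this reading, a unanimous vote gives r(x) = 1 and an evenly split vote gives r(x) = 0, which matches the statement that larger r(x) means a more trustworthy result.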
In step B, the confidence threshold δ is computed as follows:
B2: a data-driven method determines the optimized confidence threshold. Let m1 denote the classification accuracy and m2 the geometric mean of the classification sensitivity and specificity; initialize the variables d = 1.0 and t = 0. On the validation data set Va from step A, loop as follows: starting from 0, increase the value of δ by 0.02 each time and compute the distance between the point (m1, m2) corresponding to this δ and the point (1, 1); in each iteration, retain the δ whose point (m1, m2) is closest to (1, 1). The loop ends when δ = 1. The specific procedure is as follows:
Data-driven method:
Input: validation data set Va; classifier E1, the ensemble of the n balanced data-stream classification models from step A; classifier E2, the ensemble of the n imbalanced data-stream classification models from step A
Output: the optimal value of the parameter δ
begin
  δ ← 0, d ← 1.0;
  for t = 0 : 0.02 : 1 {          // t ranges over [0, 1] in increments of 0.02
    foreach u in Va {             // u is one data record in Va
      first classify u with classifier E1;
      then compute r(u) from the classification result;
      if (r(u) < t) {
        reclassify u with classifier E2; }
    }
    compute (m1, m2) and its distance l to the point (1, 1);
    if (l < d) {
      δ = t;
      d = l; }
  }
end
When the loop ends, δ holds the value of t whose point (m1, m2) was closest to (1, 1); this δ is the optimized confidence threshold.
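A runnable sketch of this grid search is given below. It is illustrative only: `search_threshold` and `accuracy_and_gmean` are hypothetical names, E1 is assumed to be a callable returning `(label, r)`, E2 a callable returning a label, the minority class is treated as the positive class for sensitivity, and the distance is taken to be Euclidean, which the source does not state.

```python
import math


def accuracy_and_gmean(y_true, y_pred, minority):
    """Return (m1, m2): accuracy and the geometric mean of sensitivity and specificity."""
    tp = sum(t == minority and p == minority for t, p in zip(y_true, y_pred))
    tn = sum(t != minority and p != minority for t, p in zip(y_true, y_pred))
    pos = sum(t == minority for t in y_true)
    neg = len(y_true) - pos
    sensitivity = tp / pos if pos else 0.0
    specificity = tn / neg if neg else 0.0
    return (tp + tn) / len(y_true), math.sqrt(sensitivity * specificity)


def search_threshold(va, e1, e2, minority, step=0.02):
    """Grid-search delta on Va; e1(x) returns (label, r(x)) and e2(x) returns a label."""
    best_delta, d = 0.0, 1.0                     # initial values as given in the source
    y_true = [y for _, y in va]
    for k in range(round(1 / step) + 1):         # delta = 0, 0.02, ..., 1
        delta = k * step
        y_pred = []
        for x, _ in va:
            label, r = e1(x)
            if r < delta:                        # low confidence: defer to E2
                label = e2(x)
            y_pred.append(label)
        m1, m2 = accuracy_and_gmean(y_true, y_pred, minority)
        dist = math.hypot(1 - m1, 1 - m2)        # Euclidean distance (an assumption)
        if dist < d:                             # strict '<' keeps the smallest best delta
            best_delta, d = delta, dist
    return best_delta
```

Computing each candidate as `k * step` rather than by repeated addition avoids accumulating floating-point error over the 51 grid points.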
C: Classify each data record in the test data set Test.
In step C, classifying any data record u in the test data set Test comprises the following steps:
C1: first, classify u using the classifier E1 from step B;
C2: compute r(u) using the confidence computation in step B1;
C3: if r(u) >= δ, output the classification result; if r(u) < δ, classify u again using the classifier E2 from step B and output the classification result of E2.
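Steps C1 to C3 amount to a confidence-gated cascade, sketched below. The function name `classify_record` and the callable signatures (E1 returning `(label, r)`, E2 returning a label) are assumptions made for illustration.

```python
def classify_record(u, e1, e2, delta):
    """Steps C1-C3: accept E1's answer when r(u) >= delta, otherwise defer to E2."""
    label, r = e1(u)      # C1 and C2: E1's prediction and its confidence r(u)
    if r >= delta:        # C3: confident enough, output E1's result
        return label
    return e2(u)          # C3: low confidence, reclassify with E2 and output its result
```

Only records that fail the confidence test incur the cost of the second ensemble, so the cascade does no more work than E1 alone on confidently classified records.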
The present invention divides the classification of an imbalanced data stream into three stages: balanced-model prediction, classification-confidence assessment, and imbalanced-model prediction. In the balanced-model prediction stage, the classifier E1 from step B, the ensemble of the n balanced classifiers, predicts the class of each data record. The classification-confidence assessment evaluates the reliability of E1's classification result: results for records with high confidence are returned to the user directly and do not pass through the imbalanced-model prediction stage, whereas records with low confidence are classified again by the classifier E2 from step B, the ensemble of the n imbalanced classifiers, and the classification result of E2 is output.
The overall flow of the present invention is as follows:
Overall algorithm:
Input: the training set Train of the data stream; the test set Test of the data stream
Output: the classification results for the Test data set
begin
  divide Train into n data stream record blocks D1, D2, ..., Dn, each of size b;
  for i = 1 : 1 : n {             // i ranges over [1, n] in increments of 1
    divide data stream record block Di into a training set Tr and a validation set Va;
    train 1 balanced classifier and 1 imbalanced classifier on Tr using the method in step A;
    retain the n balanced classifiers and the n imbalanced classifiers that classify best on Va;
    solve for the optimal threshold δ on Va using algorithm 1 (the data-driven method of step B2);
  }
  foreach u in Test {             // u is one data record in Test
    first classify u with E1, the ensemble of the n balanced classifiers;
    then compute r(u) from the classification result of E1;
    if (r(u) < δ) {
      reclassify u with E2, the ensemble of the n imbalanced classifiers; }
  }
end
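The first steps of the overall algorithm, cutting the stream into blocks of size b and splitting each block 90%/10%, can be sketched as follows. The function names are hypothetical, and two details are assumptions because the source does not specify them: a trailing partial block is dropped so that all blocks have the same size, and the split is made after shuffling the block.

```python
import random


def split_into_blocks(records, b):
    """Cut the stream into consecutive blocks of exactly b records (partial tail dropped)."""
    return [records[i:i + b] for i in range(0, len(records) - b + 1, b)]


def split_block(block, train_fraction=0.9, rng=random):
    """Split one block into (Tr, Va); shuffling before the split is an assumption."""
    shuffled = block[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical usage: 25 records with b = 10 give two full blocks.
blocks = split_into_blocks(list(range(25)), 10)
tr, va = split_block(blocks[0], rng=random.Random(0))
```

If temporal order within a block matters for the application, the shuffle can be removed so that the last 10% of each block serves as Va instead.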

Claims (5)

1. A two-degree ensemble classification algorithm for imbalanced data streams, characterized by comprising the following steps:
A: Training stage for the balanced and imbalanced data-stream classification models: each newest data stream record block in the training data set is divided into a training set and a validation set; one balanced data-stream classification model and one imbalanced data-stream classification model are trained on the training set; the n balanced models and the n imbalanced models with the highest classification accuracy on the validation set are retained;
B: The n balanced data-stream classification models and the n imbalanced data-stream classification models from step A classify the data records in the validation set, the confidence of the results is assessed, and the optimized confidence threshold δ is finally obtained;
C: The n balanced data-stream classification models and the n imbalanced data-stream classification models from step A classify each data record in the test data set, and the final classification results are output.
2. The two-degree ensemble classification algorithm for imbalanced data streams according to claim 1, characterized in that in step B a data-driven method determines the optimized confidence threshold δ on the validation set, as follows:
Let m1 denote the classification accuracy and m2 the geometric mean of the classification sensitivity and specificity. Initialize the variables d = 1.0 and t = 0. On the validation data set, loop as follows: starting from 0, increase the value of δ by 0.02 each time, and compute the distance l between the point (m1, m2) corresponding to this δ and the point (1, 1); if l is less than d, set d = l and t = δ. The loop ends when δ = 1. After the loop ends, the current value of t is assigned to δ; this δ is the optimized confidence threshold.
3. The two-degree ensemble classification algorithm for imbalanced data streams according to claim 1, characterized in that classifying each data record u in the test data set in step C comprises the following steps:
C1: first, the ensemble of the n retained balanced data-stream classification models classifies u;
C2: the confidence r(u) of the classification result for u is computed; if r(u) is greater than the optimized confidence threshold δ, the classification result is returned to the user directly;
C3: if the classification confidence r(u) for u is lower than the optimized confidence threshold δ, the ensemble of the n imbalanced data-stream classification models classifies u again, and the final classification result is returned.
4. The two-degree ensemble classification algorithm for imbalanced data streams according to any one of claims 1-3, characterized in that training a balanced data-stream classification model in step A comprises the following steps:
A11: draw a simple random sample of size s from the training set without distinguishing the classes of the records; this sample is denoted T1;
A12: using a classification algorithm, train a classification model on T1; this model is called one balanced data-stream classification model;
A13: test the existing balanced data-stream classification models: if their total number exceeds n, test each existing balanced model on the validation set and eliminate the one with the worst classification accuracy, until the number of remaining balanced data-stream classification models equals n.
5. The two-degree ensemble classification algorithm for imbalanced data streams according to claim 4, characterized in that training one imbalanced data-stream classification model in step A comprises the following steps:
A21: collect the minority-class data records in the training set of each data stream record block and put them into the minority-class record container; if the total number of data records in the container exceeds the specified number s, eliminate the oldest data records in it until the number of remaining data records equals s;
A22: when sampling, first draw a simple random sample of size s/2 from Tr without distinguishing the classes of the records; then draw a simple random sample, also of size s/2, from the data records in the minority-class record container; combine the two samples to form the newest sample, denoted T2;
A23: using a classification algorithm, train a classification model on T2; this model is called one imbalanced data-stream classification model;
A24: test the existing imbalanced data-stream classification models: if their total number exceeds n, test each existing imbalanced model on Va and eliminate the one with the worst classification accuracy, until the number of remaining imbalanced data-stream classification models equals n.
CN201310624425.5A 2013-11-29 2013-11-29 A two-degree ensemble classification algorithm for imbalanced data streams Active CN103593470B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310624425.5A CN103593470B (en) 2013-11-29 2013-11-29 A two-degree ensemble classification algorithm for imbalanced data streams

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310624425.5A CN103593470B (en) 2013-11-29 2013-11-29 A two-degree ensemble classification algorithm for imbalanced data streams

Publications (2)

Publication Number Publication Date
CN103593470A CN103593470A (en) 2014-02-19
CN103593470B true CN103593470B (en) 2016-05-18

Family

ID=50083611

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310624425.5A Active CN103593470B (en) 2013-11-29 2013-11-29 A two-degree ensemble classification algorithm for imbalanced data streams

Country Status (1)

Country Link
CN (1) CN103593470B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104462301B (en) * 2014-11-28 2018-05-04 北京奇虎科技有限公司 A kind for the treatment of method and apparatus of network data
CN106294490B (en) * 2015-06-08 2019-12-24 富士通株式会社 Feature enhancement method and device for data sample and classifier training method and device
CN108141377B (en) * 2015-10-12 2020-08-07 华为技术有限公司 Early classification of network flows
CN107423156A (en) * 2017-07-29 2017-12-01 合肥千奴信息科技有限公司 Fault pre-alarming algorithm based on taxonomic clustering
CN113692589A (en) * 2019-04-29 2021-11-23 西门子(中国)有限公司 Classification model training method and device and computer readable medium
CN110245232B (en) * 2019-06-03 2022-02-18 网易传媒科技(北京)有限公司 Text classification method, device, medium and computing equipment
CN111915559B (en) * 2020-06-30 2022-09-20 电子科技大学 Airborne SAR image quality evaluation method based on SVM classification credibility
CN112017634B (en) * 2020-08-06 2023-05-26 Oppo(重庆)智能科技有限公司 Data processing method, device, equipment and storage medium
CN112989207B (en) * 2021-04-27 2021-08-27 武汉卓尔数字传媒科技有限公司 Information recommendation method and device, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101587493A (en) * 2009-06-29 2009-11-25 中国科学技术大学 Text classification method
CN101763466A (en) * 2010-01-20 2010-06-30 西安电子科技大学 Biological information recognition method based on dynamic sample selection integration
CN102945280A (en) * 2012-11-15 2013-02-27 翟云 Unbalanced data distribution-based multi-heterogeneous base classifier fusion classification method
CN103309953A (en) * 2013-05-24 2013-09-18 合肥工业大学 Method for labeling and searching for diversified pictures based on integration of multiple RBFNN classifiers

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
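An ensemble classification model for imbalanced data streams; 欧阳震诤, 罗建书, 胡东敏, 吴泉源; Acta Electronica Sinica (《电子学报》); 2010-01-15 (2010, No. 01); full text *
Research on classification methods for imbalanced data sets; 王和勇, 樊泓坤, 姚正安, 李成安; Application Research of Computers (《计算机应用研究》); 2008-05-15 (2008, No. 05); full text *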

Also Published As

Publication number Publication date
CN103593470A (en) 2014-02-19

Similar Documents

Publication Publication Date Title
CN103593470B (en) A two-degree ensemble classification algorithm for imbalanced data streams
Wilson et al. Predictive inequity in object detection
Umadevi et al. A survey on data mining classification algorithms
Li et al. Localizing and quantifying damage in social media images
CN111882446B (en) Abnormal account detection method based on graph convolution network
WO2019218699A1 (en) Fraud transaction determining method and apparatus, computer device, and storage medium
WO2018014610A1 (en) C4.5 decision tree algorithm-based specific user mining system and method therefor
WO2017143932A1 (en) Fraudulent transaction detection method based on sample clustering
CN103632168B (en) Classifier integration method for machine learning
CN110852395A (en) Ore granularity detection method and system based on autonomous learning and deep learning
CN109034194B (en) Transaction fraud behavior deep detection method based on feature differentiation
WO2020220758A1 (en) Method for detecting abnormal transaction node, and device
CN107679734A (en) It is a kind of to be used for the method and system without label data classification prediction
CN106503086A (en) The detection method of distributed local outlier
WO2021254027A1 (en) Method and apparatus for identifying suspicious community, and storage medium and computer device
CN104239553A (en) Entity recognition method based on Map-Reduce framework
US20220180369A1 (en) Fraud detection device, fraud detection method, and fraud detection program
WO2023056723A1 (en) Fault diagnosis method and apparatus, and electronic device and storage medium
CN110737641A (en) Construction method, device and system of confidence and audit models
Velden et al. Resolving author name homonymy to improve resolution of structures in co-author networks
CN114254146A (en) Image data classification method, device and system
Shoohi et al. DCGAN for Handling Imbalanced Malaria Dataset based on Over-Sampling Technique and using CNN.
CN111738290B (en) Image detection method, model construction and training method, device, equipment and medium
WO2020259391A1 (en) Database script performance testing method and device
CN104537392A (en) Object detection method based on distinguishing semantic component learning

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CP02 Change in the address of a patent holder

Address after: 475001 Henan province city Minglun Street No. 85

Patentee after: Henan University

Address before: 475004 Jinming Avenue, Kaifeng City, Henan Province

Patentee before: Henan University