CN114254698A - Unbalanced data and image processing method and system and computer equipment - Google Patents
- Publication number
- CN114254698A (application number CN202111485510.9A)
- Authority
- CN
- China
- Prior art keywords
- data
- data set
- samples
- unbalanced
- attribute
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a method, a system and computer equipment for processing unbalanced data and images. The method comprises the following steps: 1) preprocessing the unbalanced data set O; 2) determining the parameters of the RBF neural network data generation model by using a maximum distribution algorithm based on the Hausdorff distance; 3) constructing the RBF neural network data generation model; 4) generating a sample set S by combining the constructed RBF neural network data generation model with the mvnrnd function; 5) filling the generated sample set S into the original unbalanced data set O to obtain a processed balanced data set $O_s$, where $O_s = O \cup S$. The unbalanced data and image processing method provided by the invention can process missing values and attributes of different types, adaptively learns the intra-class and inter-class distribution of the original unbalanced data, and automatically generates data class by class to expand the minority classes in the original data, thereby effectively reducing the imbalance of the data and improving the accuracy of data analysis.
Description
Technical Field
The invention relates to the field of data analysis and processing, in particular to an unbalanced data and image processing method, system and computer equipment.
Background
When, within the same data set, one class or some classes contain few samples (the positive or minority classes) while the remaining class or classes contain comparatively many samples (the negative or majority classes), and the two groups differ greatly in size, the data set is called an unbalanced data set. In an unbalanced data set, the minority classes have too few samples to give the classifier sufficient information during classification learning, whereas the majority classes have many samples and give it ample information. As a result, the classifier identifies the majority classes more easily, and the recognition rate of the minority classes is low.
Many real-world fields require knowledge modeling and analysis under data imbalance, for example medical information assisted diagnosis, bulk advertising spam handling, multimedia information retrieval, credit card fraud detection, and text information classification. In many of these fields the identification and classification of the minority classes is what matters, and correctly identifying a minority-class sample is far more significant to the overall classification task than correctly identifying a majority-class sample. For example, in medical information assisted diagnosis, a doctor's diagnosis falls into one of four cases: a normal person is correctly diagnosed as normal, a patient is correctly diagnosed as diseased, a normal person is misdiagnosed as diseased, and a patient is misdiagnosed as normal. If a normal person is misdiagnosed as a patient, that person may suffer serious psychological and financial pressure. If, however, a patient is misdiagnosed as healthy by the assisted diagnosis system, the patient is very likely to miss timely treatment. Of the four cases, misdiagnosing a patient as normal is the least common in practice and can be regarded as the minority class, while the other three cases occur frequently and can be regarded as the majority classes. Most existing classification methods, however, have a high recognition rate for the majority classes but a low recognition rate for the minority classes, so the classifier does not fulfil its real purpose.
Existing methods for handling unbalanced data mainly undersample or oversample the samples with a resampling technique in order to adjust the degree of imbalance of the sample set. Common methods that adjust unbalanced data from the minority-class side include random oversampling, SMOTE, borderline-SMOTE, and the like. These methods do not take the distribution characteristics of the actual data set into account well and involve a degree of randomness and blindness, which affects the classification result.
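To make the randomness mentioned above concrete, the following is a minimal Python sketch of SMOTE-style interpolation. It is background only and not part of the invention. It assumes purely numerical attributes and Euclidean distance; the function name `smote_like` and its parameters are illustrative.

```python
import math
import random


def smote_like(minority, n_new, k, rng):
    """Create n_new synthetic points by interpolating between minority samples."""
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)                      # random seed sample
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: math.dist(x, p))[:k]
        nb = rng.choice(neighbours)                   # random one of the k nearest
        lam = rng.random()                            # random position on the segment
        synthetic.append([a + lam * (b - a) for a, b in zip(x, nb)])
    return synthetic


minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_points = smote_like(minority, n_new=5, k=2, rng=random.Random(0))
```

Every synthetic point lies on a segment between two existing minority samples, chosen by three random draws; no estimate of the class distribution is involved.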
Therefore, there is a need to provide a more reliable solution.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide an unbalanced data and image processing method, system and computer device for overcoming the above-mentioned shortcomings in the prior art.
In order to solve the technical problems, the invention adopts the technical scheme that: an unbalanced data and image processing method is provided, which comprises the following steps:
1) preprocessing the unbalanced data set O;
2) processing the preprocessed unbalanced data set O by using a maximum distribution algorithm based on the Hausdorff distance, and determining parameters of an RBF neural network data generation model to be constructed; the parameters comprise hidden layer neurons of an RBF neural network data generation model, a category, an output weight and a diagonal distribution matrix corresponding to each hidden layer neuron, and a connection weight between each hidden layer neuron and a corresponding output neuron;
3) constructing an RBF neural network data generation model based on the result of the step 2);
4) generating data by combining the constructed RBF neural network data generation model with the mvnrnd function to obtain a generated sample set S;
5) filling the generated sample set S into the original unbalanced data set O to obtain a processed balanced data set $O_s$, where $O_s = O \cup S$.
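For reference, the standard definitions of the directed and the symmetric Hausdorff distance between two finite sample sets $A$ and $B$ under a metric $d$ are given below. The text does not state which variant or which metric the maximum distribution algorithm uses, so this is background under the usual definitions rather than the exact formula of the method.

```latex
\vec{h}(A, B) = \max_{a \in A} \, \min_{b \in B} \, d(a, b),
\qquad
H(A, B) = \max\bigl\{ \vec{h}(A, B),\; \vec{h}(B, A) \bigr\}
```

In the directed form, the sample $a \in A$ that attains the maximum is the point of $A$ lying farthest from $B$.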
Preferably, the step 1) is specifically:
filling missing values of numerical attributes in the unbalanced data set O with the mean value of that attribute over samples of the same class; filling missing values of ordinal attributes and nominal attributes with the most frequent value of that attribute among samples of the same class;
after data completion, sequentially encoding the ordinal attributes and the nominal attributes;
converting image data in the unbalanced data set O into numerical data with a PyRadiomics-based toolkit, adding the numerical data to the data set O, and standardizing all attributes with the z-score method to obtain the preprocessed data set D;
using the vectors $L_{mean}$ and $L_{std}$ to store the mean and the standard deviation of each attribute respectively, and saving the sequential encoding of the ordinal attributes and the nominal attributes.
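The tabular part of this preprocessing step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: missing values are represented by `None`, codes are assigned in sorted order, the population standard deviation is used, and image feature extraction is left out. All function names are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def impute_by_class(rows, labels, numeric_cols, categorical_cols):
    """Fill None entries using statistics of samples in the same class."""
    filled = [list(r) for r in rows]
    for c in set(labels):
        idx = [i for i, y in enumerate(labels) if y == c]
        for j in list(numeric_cols) + list(categorical_cols):
            # Assumes each class has at least one known value per attribute.
            known = [rows[i][j] for i in idx if rows[i][j] is not None]
            if j in numeric_cols:
                fill = mean(known)                          # class-wise mean
            else:
                fill = Counter(known).most_common(1)[0][0]  # class-wise mode
            for i in idx:
                if filled[i][j] is None:
                    filled[i][j] = fill
    return filled


def sequential_encode(rows, categorical_cols):
    """Replace categorical values by integer codes in place; return the codebooks."""
    codebooks = {}
    for j in categorical_cols:
        # Assumption: sorted order defines the codes; a real ordinal attribute
        # would need its domain-specific order supplied explicitly.
        codebooks[j] = {v: k for k, v in enumerate(sorted({r[j] for r in rows}))}
        for r in rows:
            r[j] = codebooks[j][r[j]]
    return codebooks


def zscore(rows):
    """Standardize every column; return the data plus L_mean and L_std."""
    cols = list(zip(*rows))
    l_mean = [mean(col) for col in cols]
    l_std = [pstdev(col) or 1.0 for col in cols]   # guard against constant columns
    data = [[(v - m) / s for v, m, s in zip(r, l_mean, l_std)] for r in rows]
    return data, l_mean, l_std


rows = [[1.0, "low"], [None, "low"], [3.0, None], [10.0, "high"], [12.0, "high"]]
labels = [0, 0, 0, 1, 1]
filled = impute_by_class(rows, labels, numeric_cols=[0], categorical_cols=[1])
codebooks = sequential_encode(filled, categorical_cols=[1])
D, L_mean, L_std = zscore(filled)
```

Keeping `L_mean`, `L_std` and `codebooks` is what later allows generated samples to be mapped back to the original attribute scales and categories.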
Preferably, the step 2) specifically includes:
2-1) assume that the data set D contains N input samples $\{x_n\}$, $n = 1, 2, \dots, N$, that each sample has M attributes and belongs to one of C classes, and that the number of samples in class c is $N_c$, $c = 1, 2, \dots, C$;
2-2) divide the samples in the data set by class to obtain the data subsets $D_c$, $c = 1, 2, \dots, C$, each consisting of the samples belonging to class c; initialize the current class index $c = 0$ and the current number of hidden layer neurons $P = 0$;
2-3) let $c = c + 1$;
2-4) let $P = P + 1$; calculate the Hausdorff distance $h_P$ between $D_c$ and the other samples, and take the corresponding sample as the newly added hidden layer neuron center $k_P$ of class c; calculate the distance from every sample in $D_c$ to $k_P$, record the subset $d_c$ of all samples whose distance is less than $h_P$, and delete $d_c$ from $D_c$; take the number of samples in $d_c$ as the connection weight $w_P$ between $k_P$ and the output neuron of the corresponding class, the connection weights between $k_P$ and the other output neurons being 0; calculate the variance $v_m$ of each attribute dimension in $d_c$ to form the diagonal distribution matrix corresponding to $k_P$, $V_P = \mathrm{diag}(v_1, v_2, \dots, v_M)$;
2-5) if the number of samples remaining in $D_c$ is not 0, return to step 2-4); otherwise, check whether c equals C: if $c < C$, return to step 2-3); if $c = C$, the algorithm terminates.
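Steps 2-1) to 2-5) can be sketched in Python as below. This is one possible reading, not a definitive implementation: it assumes Euclidean distance, interprets "the other samples" as all samples of other classes, takes $h_P$ as the directed Hausdorff distance from the remaining class-c samples to those samples, and uses the population variance. The function name and the dictionary keys are hypothetical.

```python
import math
from statistics import pvariance


def max_distribution(samples, labels):
    """Greedy per-class partition; returns one dict per hidden-layer neuron."""
    neurons = []
    for c in sorted(set(labels)):
        remaining = [x for x, y in zip(samples, labels) if y == c]   # D_c
        others = [x for x, y in zip(samples, labels) if y != c]
        while remaining:
            # Distance from each remaining sample to its nearest other-class sample.
            nearest = [min((math.dist(x, o) for o in others), default=math.inf)
                       for x in remaining]
            h = max(nearest)                        # assumed meaning of h_P
            center = remaining[nearest.index(h)]    # sample attaining it: k_P
            # d_c: samples strictly closer than h_P; the center is always kept
            # so that the loop makes progress even when h_P is 0.
            inside = [math.dist(x, center) < h or x is center for x in remaining]
            covered = [x for x, f in zip(remaining, inside) if f]
            remaining = [x for x, f in zip(remaining, inside) if not f]
            neurons.append({
                "center": center,
                "cls": c,
                "weight": len(covered),                               # w_P
                "variances": [pvariance(col) for col in zip(*covered)],  # diag of V_P
            })
    return neurons


# Class 0 is split by the class-1 sample in the middle, so it needs two neurons.
neurons = max_distribution([(0.0, 0.0), (10.0, 0.0), (5.0, 0.0)], [0, 0, 1])
```

Because every covered sample is removed from the class subset exactly once, the weights of the neurons of a class add up to the number of samples in that class.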
Preferably, the step 3) specifically includes:
3-1) determining that an input layer of the RBF neural network data generation model has M input neurons according to M attributes of each sample in the data set D, wherein each neuron corresponds to one attribute;
3-2) determining that an output layer of the RBF neural network data generation model has C output neurons according to C categories of the data set D, wherein each neuron corresponds to one category;
3-3) obtaining, according to the result of step 2), the P hidden layer neurons $\{k_1, k_2, \dots, k_{P-1}, k_P\}$, their corresponding classes and output weights $\{w_1, w_2, \dots, w_{P-1}, w_P\}$, and the corresponding P diagonal distribution matrices $\{V_1, V_2, \dots, V_{P-1}, V_P\}$; determining the parameters of the P hidden layer neurons $\{(k_1, V_1), (k_2, V_2), \dots, (k_{P-1}, V_{P-1}), (k_P, V_P)\}$ and the connection weights $\{w_1, w_2, \dots, w_{P-1}, w_P\}$ between each hidden layer neuron and its corresponding output neuron.
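The resulting model is fully described by its hidden layer parameters, which can be held in a small data structure. The sketch below is illustrative only; the class name `HiddenNeuron`, its field names and the zero-based class index are assumptions.

```python
from dataclasses import dataclass


@dataclass
class HiddenNeuron:
    center: list      # k_p, one value per attribute (length M)
    variances: list   # diagonal of V_p (length M)
    cls: int          # index of the class this neuron belongs to, 0..C-1
    weight: float     # w_p, connection weight to the output neuron of cls


def output_weight_matrix(neurons, num_classes):
    """P x C matrix: w_p in the column of the neuron's class, 0 elsewhere."""
    return [[n.weight if c == n.cls else 0.0 for c in range(num_classes)]
            for n in neurons]


hidden = [HiddenNeuron([0.0, 0.0], [1.0, 1.0], cls=0, weight=3),
          HiddenNeuron([5.0, 5.0], [1.0, 1.0], cls=1, weight=2)]
W = output_weight_matrix(hidden, num_classes=2)
```

Each row of the matrix has a single non-zero entry, which reflects that a hidden layer neuron is connected with weight 0 to every output neuron other than that of its own class.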
Preferably, the step 4) specifically includes:
4-1) set the number of samples $S_c$ to be generated for each class, $c = 1, 2, \dots, C$; initialize the current hidden layer neuron center index $p = 0$ and the generated sample set $S = \varnothing$, where $\varnothing$ denotes the empty set;
4-2) let $p = p + 1$; assuming the current hidden layer neuron center $k_p$ belongs to class c, the number of samples to be generated for $k_p$ is [formula not reproduced];
4-3) generate the sample matrix [formula not reproduced], in which every sample belongs to class c; merge the generated samples into the generated sample set S; check whether p equals P, and if $p < P$, return to step 4-2); if $p = P$, the complete generated sample set S is obtained and the next step is executed;
4-4) de-standardize S using the mean vector $L_{mean}$ and the standard deviation vector $L_{std}$ of all attributes saved during preprocessing; then convert the corresponding numerical values in S back to the original values of the ordinal attributes and the nominal attributes according to the saved sequential encoding.
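The generation loop can be sketched in Python as follows. Two points are assumptions rather than facts taken from the text: the per-neuron sample count is allocated in proportion to the weight $w_p$ within its class, and MATLAB's `mvnrnd` is replaced by independent normal draws per attribute, which is equivalent for a diagonal covariance matrix. All function names are hypothetical.

```python
import random


def allocate(weights, total):
    """Split `total` samples over neurons in proportion to their integer weights."""
    w_sum = sum(weights)
    counts = [total * w // w_sum for w in weights]
    # Give the leftover samples to the neurons with the largest weights.
    order = sorted(range(len(weights)), key=lambda i: -weights[i])
    for i in order[: total - sum(counts)]:
        counts[i] += 1
    return counts


def sample_diag_gaussian(center, variances, n, rng):
    """Draw n samples from N(center, diag(variances))."""
    return [[rng.gauss(m, v ** 0.5) for m, v in zip(center, variances)]
            for _ in range(n)]


def generate(neurons, per_class, seed=0):
    """neurons: dicts with center, variances, cls, weight; per_class: {cls: S_c}."""
    rng = random.Random(seed)
    samples, labels = [], []
    for c, total in per_class.items():
        group = [n for n in neurons if n["cls"] == c]
        counts = allocate([n["weight"] for n in group], total)
        for n, s_p in zip(group, counts):
            samples += sample_diag_gaussian(n["center"], n["variances"], s_p, rng)
            labels += [c] * s_p
    return samples, labels


def inverse_standardize(rows, l_mean, l_std):
    """Undo the z-score transform: x = z * std + mean."""
    return [[z * s + m for z, m, s in zip(r, l_mean, l_std)] for r in rows]
```

Under the proportional rule, a neuron that covered more original samples contributes more generated samples, so the generated data follows the local density of its class.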
The present invention also provides an unbalanced data and image processing system, which uses the method as described above to process unbalanced data, the system comprising:
the data preprocessing module is used for preprocessing the unbalanced data set O according to the method in the step 1) to obtain a data set D;
the maximum distribution algorithm module is used for determining parameters of the RBF neural network data generation model to be constructed according to the method in the step 2);
the network model building module is used for building an RBF neural network data generation model according to the method in the step 3);
the data generation module is used for combining the RBF neural network data generation model with the mvnrnd function and adaptively generating a new data set S according to the distribution of the original unbalanced data set by the method in the step 4);
and the data post-processing module is used for filling the generated sample set S into the original unbalanced data set O to obtain a processed balanced data set $O_s$.
The invention also provides a storage medium having stored thereon a computer program which, when executed, is adapted to carry out the method as described above.
The invention also provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method as described above when executing the computer program.
The invention has the beneficial effects that: the unbalanced data and image processing method provided by the invention can process missing values and attributes of different types, adaptively learn the intra-class and inter-class distribution of the original unbalanced data, automatically generate data according to classes and expand a few classes in the original data, thereby effectively improving the unbalance of the data and improving the accuracy of data analysis.
Drawings
FIG. 1 is a flow chart of an unbalanced data and image processing method of the present invention;
FIG. 2 is a schematic diagram of the principle structure of the RBF neural network data generation model of the present invention.
Detailed Description
The present invention is further described in detail below with reference to examples so that those skilled in the art can practice the invention with reference to the description.
It will be understood that terms such as "having," "including," and "comprising," as used herein, do not preclude the presence or addition of one or more other elements or groups thereof.
Example 1
Referring to fig. 1, the unbalanced data and image processing method of the present embodiment includes the following steps:
s1, preprocessing the unbalanced data set O:
filling missing values of numerical attributes in the unbalanced data set O with the mean value of that attribute over samples of the same class; filling missing values of ordinal attributes and nominal attributes with the most frequent value of that attribute among samples of the same class;
after data completion, sequentially encoding the ordinal attributes and the nominal attributes;
converting image data in the unbalanced data set O into numerical data with a PyRadiomics-based toolkit, adding the numerical data to the data set O, and standardizing all attributes with the z-score method to obtain the preprocessed data set D; the data types in the unbalanced data set O include numerical data, image data and the like;
using the vectors $L_{mean}$ and $L_{std}$ to store the mean and the standard deviation of each attribute respectively, and saving the sequential encoding of the ordinal attributes and the nominal attributes.
S2, processing the preprocessed unbalanced data set O by using a maximum distribution algorithm based on the Hausdorff distance, and determining parameters of an RBF neural network data generation model to be constructed; the parameters comprise hidden layer neurons of an RBF neural network data generation model, a category, an output weight and a diagonal distribution matrix corresponding to each hidden layer neuron, and a connection weight between each hidden layer neuron and the corresponding output neuron; the method specifically comprises the following steps:
S2-1) assume that the data set D contains N input samples $\{x_n\}$, $n = 1, 2, \dots, N$, that each sample has M attributes and belongs to one of C classes, and that the number of samples in class c is $N_c$, $c = 1, 2, \dots, C$;
S2-2) divide the samples in the data set by class to obtain the data subsets $D_c$, $c = 1, 2, \dots, C$, each consisting of the samples belonging to class c; initialize the current class index $c = 0$ and the current number of hidden layer neurons $P = 0$;
S2-3) let $c = c + 1$;
S2-4) let $P = P + 1$; calculate the Hausdorff distance $h_P$ between $D_c$ and the other samples, and take the corresponding sample as the newly added hidden layer neuron center $k_P$ of class c; calculate the distance from every sample in $D_c$ to $k_P$, record the subset $d_c$ of all samples whose distance is less than $h_P$, and delete $d_c$ from $D_c$; take the number of samples in $d_c$ as the connection weight $w_P$ between $k_P$ and the output neuron of the corresponding class, the connection weights between $k_P$ and the other output neurons being 0; calculate the variance $v_m$ of each attribute dimension in $d_c$ to form the diagonal distribution matrix corresponding to $k_P$, $V_P = \mathrm{diag}(v_1, v_2, \dots, v_M)$;
S2-5) if the number of samples remaining in $D_c$ is not 0, return to step S2-4); otherwise, check whether c equals C: if $c < C$, return to step S2-3); if $c = C$, the algorithm terminates.
S3, constructing an RBF neural network data generation model based on the result of the step S2), specifically comprising the following steps:
S3-1) determining that an input layer of the RBF neural network data generation model has M input neurons according to M attributes of each sample in the data set D, wherein each neuron corresponds to one attribute;
S3-2) determining that an output layer of the RBF neural network data generation model has C output neurons according to C categories of the data set D, wherein each neuron corresponds to one category;
S3-3) obtaining, according to the result of step S2), the P hidden layer neurons $\{k_1, k_2, \dots, k_{P-1}, k_P\}$, their corresponding classes and output weights $\{w_1, w_2, \dots, w_{P-1}, w_P\}$, and the corresponding P diagonal distribution matrices $\{V_1, V_2, \dots, V_{P-1}, V_P\}$; determining the parameters of the P hidden layer neurons $\{(k_1, V_1), (k_2, V_2), \dots, (k_{P-1}, V_{P-1}), (k_P, V_P)\}$ and the connection weights $\{w_1, w_2, \dots, w_{P-1}, w_P\}$ between each hidden layer neuron and its corresponding output neuron.
Where, it is assumed that the 1 st and 2 nd hidden layer neurons belong to class 1 and that the P-1 st and P-th hidden layer neurons belong to class C.
The principle structure of the constructed RBF neural network data generation model is shown in FIG. 2.
S4, generating data by combining the constructed RBF neural network data generation model with the mvnrnd function to obtain a generated sample set S, which specifically comprises the following steps:
S4-1) set the number of samples $S_c$ to be generated for each class, $c = 1, 2, \dots, C$; initialize the current hidden layer neuron center index $p = 0$ and the generated sample set $S = \varnothing$, where $\varnothing$ denotes the empty set;
S4-2) let $p = p + 1$; assuming the current hidden layer neuron center $k_p$ belongs to class c, the number of samples to be generated for $k_p$ is [formula not reproduced];
S4-3) generate the sample matrix [formula not reproduced], in which every sample belongs to class c; merge the generated samples into the generated sample set S; check whether p equals P, and if $p < P$, return to step S4-2); if $p = P$, the complete generated sample set S is obtained and the next step is executed;
S4-4) de-standardize S using the mean vector $L_{mean}$ and the standard deviation vector $L_{std}$ of all attributes saved during preprocessing; then convert the corresponding numerical values in S back to the original values of the ordinal attributes and the nominal attributes according to the saved sequential encoding.
S5, filling the generated sample set S into the original unbalanced data set O to obtain a processed balanced data set $O_s$, where $O_s = O \cup S$.
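The final decoding and merging can be illustrated with a short Python sketch. The embodiment only says that the per-class counts $S_c$ are set; raising every class to the size of the largest class is an assumption made here for illustration, as is decoding by rounding to the nearest valid code. Function names are hypothetical.

```python
from collections import Counter


def samples_needed(labels):
    """Per-class S_c that lifts every class to the size of the largest one."""
    counts = Counter(labels)
    target = max(counts.values())
    return {c: target - n for c, n in counts.items()}


def decode_column(values, codebook):
    """Map generated numeric codes back to the nearest original category."""
    inverse = {code: value for value, code in codebook.items()}
    lo, hi = min(inverse), max(inverse)
    # Round to the nearest integer code and clamp into the valid code range.
    return [inverse[min(hi, max(lo, round(v)))] for v in values]


def merge(original, original_labels, generated, generated_labels):
    """O_s = O ∪ S, kept as parallel lists of samples and labels."""
    return original + generated, original_labels + generated_labels


needed = samples_needed([0, 0, 0, 1])
decoded = decode_column([0.2, 0.9, 2.7, -1.3], {"low": 0, "mid": 1, "high": 2})
```

Clamping matters because values drawn from a normal distribution can fall outside the range of codes observed in the original data.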
Example 2
The present embodiment provides an unbalanced data and image processing system, which performs unbalanced data processing by using the method of embodiment 1, and the system includes:
the data preprocessing module is used for preprocessing the unbalanced data set O according to the method in the step 1) to obtain a data set D;
the maximum distribution algorithm module is used for determining parameters of the RBF neural network data generation model to be constructed according to the method in the step 2);
the network model building module is used for building an RBF neural network data generation model according to the method in the step 3);
the data generation module is used for combining the RBF neural network data generation model with the mvnrnd function and adaptively generating a new data set S according to the distribution of the original unbalanced data set by the method in the step 4);
and the data post-processing module is used for filling the generated sample set S into the original unbalanced data set O to obtain a processed balanced data set $O_s$.
The present embodiment also provides a storage medium having stored thereon a computer program for implementing the method of embodiment 1 when executed.
The present embodiment also provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of embodiment 1 when executing the computer program.
While embodiments of the invention have been disclosed above, it is not limited to the applications listed in the description and the embodiments, which are fully applicable in all kinds of fields of application of the invention, and further modifications may readily be effected by those skilled in the art, so that the invention is not limited to the specific details without departing from the general concept defined by the claims and the scope of equivalents.
Claims (8)
1. An unbalanced data and image processing method, comprising the steps of:
1) preprocessing the unbalanced data set O;
2) processing the preprocessed unbalanced data set O by using a maximum distribution algorithm based on the Hausdorff distance, and determining parameters of an RBF neural network data generation model to be constructed; the parameters comprise hidden layer neurons of an RBF neural network data generation model, a category, an output weight and a diagonal distribution matrix corresponding to each hidden layer neuron, and a connection weight between each hidden layer neuron and a corresponding output neuron;
3) constructing an RBF neural network data generation model based on the result of the step 2);
4) generating data by combining the constructed RBF neural network data generation model with the mvnrnd function to obtain a generated sample set S;
5) filling the generated sample set S into the original unbalanced data set O to obtain a processed balanced data set $O_s$, where $O_s = O \cup S$.
2. The unbalanced data and image processing method according to claim 1, wherein the step 1) is specifically:
filling missing values of numerical attributes in the unbalanced data set O with the mean value of that attribute over samples of the same class; filling missing values of ordinal attributes and nominal attributes with the most frequent value of that attribute among samples of the same class;
after data completion, sequentially encoding the ordinal attributes and the nominal attributes;
converting image data in the unbalanced data set O into numerical data with a PyRadiomics-based toolkit, adding the numerical data to the data set O, and standardizing all attributes with the z-score method to obtain the preprocessed data set D;
using the vectors $L_{mean}$ and $L_{std}$ to store the mean and the standard deviation of each attribute respectively, and saving the sequential encoding of the ordinal attributes and the nominal attributes.
3. The unbalanced data and image processing method of claim 2, wherein the step 2) specifically comprises:
2-1) assume that the data set D contains N input samples $\{x_n\}$, $n = 1, 2, \dots, N$, that each sample has M attributes and belongs to one of C classes, and that the number of samples in class c is $N_c$, $c = 1, 2, \dots, C$;
2-2) divide the samples in the data set by class to obtain the data subsets $D_c$, $c = 1, 2, \dots, C$, each consisting of the samples belonging to class c; initialize the current class index $c = 0$ and the current number of hidden layer neurons $P = 0$;
2-3) let $c = c + 1$;
2-4) let $P = P + 1$; calculate the Hausdorff distance $h_P$ between $D_c$ and the other samples, and take the corresponding sample as the newly added hidden layer neuron center $k_P$ of class c; calculate the distance from every sample in $D_c$ to $k_P$, record the subset $d_c$ of all samples whose distance is less than $h_P$, and delete $d_c$ from $D_c$; take the number of samples in $d_c$ as the connection weight $w_P$ between $k_P$ and the output neuron of the corresponding class, the connection weights between $k_P$ and the other output neurons being 0; calculate the variance $v_m$ of each attribute dimension in $d_c$ to form the diagonal distribution matrix corresponding to $k_P$, $V_P = \mathrm{diag}(v_1, v_2, \dots, v_M)$;
2-5) if the number of samples remaining in $D_c$ is not 0, return to step 2-4); otherwise, check whether c equals C: if $c < C$, return to step 2-3); if $c = C$, the algorithm terminates.
4. The unbalanced data and image processing method of claim 3, wherein the step 3) specifically comprises:
3-1) determining that an input layer of the RBF neural network data generation model has M input neurons according to M attributes of each sample in the data set D, wherein each neuron corresponds to one attribute;
3-2) determining that an output layer of the RBF neural network data generation model has C output neurons according to C categories of the data set D, wherein each neuron corresponds to one category;
3-3) obtaining, according to the result of step 2), the P hidden layer neurons $\{k_1, k_2, \dots, k_{P-1}, k_P\}$, their corresponding classes and output weights $\{w_1, w_2, \dots, w_{P-1}, w_P\}$, and the corresponding P diagonal distribution matrices $\{V_1, V_2, \dots, V_{P-1}, V_P\}$; determining the parameters of the P hidden layer neurons $\{(k_1, V_1), (k_2, V_2), \dots, (k_{P-1}, V_{P-1}), (k_P, V_P)\}$ and the connection weights $\{w_1, w_2, \dots, w_{P-1}, w_P\}$ between each hidden layer neuron and its corresponding output neuron.
5. The unbalanced data and image processing method of claim 4, wherein the step 4) specifically comprises:
4-1) set the number of samples $S_c$ to be generated for each class, $c = 1, 2, \dots, C$; initialize the current hidden layer neuron center index $p = 0$ and the generated sample set $S = \varnothing$, where $\varnothing$ denotes the empty set;
4-2) let $p = p + 1$; assuming the current hidden layer neuron center $k_p$ belongs to class c, the number of samples to be generated for $k_p$ is [formula not reproduced];
4-3) generate the sample matrix [formula not reproduced], in which every sample belongs to class c; merge the generated samples into the generated sample set S; check whether p equals P, and if $p < P$, return to step 4-2); if $p = P$, the complete generated sample set S is obtained and the next step is executed;
4-4) de-standardize S using the mean vector $L_{mean}$ and the standard deviation vector $L_{std}$ of all attributes saved during preprocessing; then convert the corresponding numerical values in S back to the original values of the ordinal attributes and the nominal attributes according to the saved sequential encoding.
6. An unbalanced data and image processing system for processing unbalanced data using a method as claimed in any one of claims 1 to 5, the system comprising:
the data preprocessing module is used for preprocessing the unbalanced data set O according to the method in the step 1) to obtain a data set D;
the maximum distribution algorithm module is used for determining parameters of the RBF neural network data generation model to be constructed according to the method in the step 2);
the network model building module is used for building an RBF neural network data generation model according to the method in the step 3);
the data generation module is used for combining the RBF neural network data generation model with the mvnrnd function and adaptively generating a new data set S according to the distribution of the original unbalanced data set by the method in the step 4);
and the data post-processing module is used for filling the generated sample set S into the original unbalanced data set O to obtain a processed balanced data set $O_s$.
7. A storage medium on which a computer program is stored, characterized in that the program is adapted to carry out the method of any one of claims 1-5 when executed.
8. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1-5 when executing the computer program.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111485510.9A CN114254698B (en) | 2021-12-07 | Unbalanced data and image processing method, system and computer equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114254698A (en) | 2022-03-29 |
CN114254698B CN114254698B (en) | 2024-10-22 |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019041629A1 (en) * | 2017-08-30 | 2019-03-07 | 哈尔滨工业大学深圳研究生院 | Method for classifying high-dimensional imbalanced data based on svm |
CN109993229A (en) * | 2019-04-02 | 2019-07-09 | 广东石油化工学院 | A kind of serious unbalanced data classification method |
KR20200027834A (en) * | 2018-09-05 | 2020-03-13 | 성균관대학교산학협력단 | Methods and apparatuses for processing data based on representation model for unbalanced data |
Non-Patent Citations (1)
Title |
---|
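Li Jinxin (李金鑫): "Web page classification method based on multi-instance multi-label radial basis function neural network", China Master's Theses Full-text Database, Information Science and Technology (monthly), no. 07, 15 July 2018 (2018-07-15), pages 1 - 74 * |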
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |