CN114298191A - Classification method and system based on label subset - Google Patents


Info

Publication number
CN114298191A
Authority
CN
China
Prior art keywords
label
sample
subset
calculating
labels
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111566217.5A
Other languages
Chinese (zh)
Inventor
彭黎文 (Peng Liwen)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Police College
Original Assignee
Sichuan Police College
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Police College filed Critical Sichuan Police College
Priority to CN202111566217.5A
Publication of CN114298191A
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a classification method and system based on label subsets, belonging to the field of computer technology. The method comprises the following steps: converting a multi-label data sample set into a single-label data set; calculating a label subset for every sample in the single-label data set and constructing a new sample data set based on the label subsets, where calculating a label subset includes calculating the importance of all features in the sample, calculating the correlation and redundancy between features, and selecting the top-ranked labels by combining the correlation and redundancy to construct the label subset; appending the label subset of each sample to the original sample to obtain a new single-label data set; and constructing a single-label classification model based on the new single-label data set, then collecting the labels of each sample across the single-label classification models to obtain the final multi-label classification result. By analysing the correlation and redundancy among features (with labels also treated as features), the method selects a high-quality label subset and effectively improves the performance of the classification model.

Description

Classification method and system based on label subset
Technical Field
The invention relates to the technical field of computers, in particular to a classification method and a classification system based on a label subset.
Background
Multi-label classification deals with the situation in which one piece of data belongs to several categories at the same time. Such situations are common in practice; for example, an article may simultaneously belong to the news, economy and culture categories. To accurately classify such ambiguous objects in real scenes, many researchers have studied multi-label classification methods in depth.
Many multi-label classification algorithms exist in the field of machine learning. Existing multi-label classification schemes are generally built from several single-label classification models: each single-label model classifies the multi-label task separately, and the set of predictions of all single-label classifiers is taken as the final prediction of the multi-label task, so the accuracy of each single-label classifier directly affects the accuracy of the multi-label classification. In practical applications the number of samples available to each single-label classifier is small, so the prediction of an individual single-label classifier is poor, which degrades the final prediction of the multi-label task; moreover, conventional multi-label classification algorithms do not consider the correlation between labels.
In addition, research shows that a multi-label classification method that only considers the correlation between labels does not necessarily achieve good classification performance. That is, in feature selection, combining individually good features does not always improve the classification performance of the method, because the features may be highly correlated with one another, which introduces redundancy among the features and harms the classification performance of the method.
Disclosure of Invention
The invention aims to overcome the above problems of prior-art multi-label classification methods and provides a classification method and a classification system based on label subsets.
The purpose of the invention is realized by the following technical scheme:
A classification method based on label subsets is provided, the method comprising:
acquiring a multi-label data sample set, and converting the multi-label data sample set into a single-label data set;
calculating a label subset for every sample in the single-label data set, and constructing a new sample data set based on the label subsets; calculating a label subset comprises: calculating the importance of all features in the sample; calculating the correlation between features; calculating the redundancy between features; and selecting the top-ranked labels by combining the correlation and redundancy between features to construct the label subset;
appending the label subset of each sample to the corresponding sample of the original single-label data set to obtain a new single-label data set;
and constructing a single-label classification model based on the new single-label data set, then collecting the labels of each sample across the single-label classification models to obtain the final multi-label classification result.
As an option, the converting of the multi-label data sample set into the single-label data set comprises:
assigning each single label in the multi-label data sample set to every sample, thereby decomposing the set into as many data subsets as there are labels.
As an option, the method further comprises:
pre-processing the multi-labeled data sample set, the pre-processing comprising:
and deleting the samples with missing data characteristic values, keeping the samples with complete data characteristics, and then randomly dividing the multi-label data sample set into a training set and a testing set.
As an option, the importance of each feature is calculated by the F-score formula:
$$F_i=\frac{\left(\bar{x}_i^{(+)}-\bar{x}_i\right)^2+\left(\bar{x}_i^{(-)}-\bar{x}_i\right)^2}{\frac{1}{n^{+}-1}\sum_{k=1}^{n^{+}}\left(x_{k,i}^{(+)}-\bar{x}_i^{(+)}\right)^2+\frac{1}{n^{-}-1}\sum_{k=1}^{n^{-}}\left(x_{k,i}^{(-)}-\bar{x}_i^{(-)}\right)^2}$$
where the larger F_i is, the stronger the class-discriminating ability of feature x_i.
As an option, the correlation between features is calculated using mutual information, and the calculation formula is as follows:
$$I(X;Y)=\sum_{i}\sum_{j}p(x_i,y_j)\log\frac{p(x_i,y_j)}{p(x_i)\,p(y_j)}$$
where X denotes a feature variable, Y denotes a label variable, p(x_i) and p(y_j) are the marginal probabilities of X and Y, and p(x_i, y_j) is their joint probability distribution.
As an option, the selecting of the top-ranked labels by combining the correlation and redundancy between features comprises:
calculating the mean mutual information between all features and the target variable:
$$D(S,c)=\frac{1}{|S|}\sum_{x_i\in S}I(x_i;c)$$
where S denotes the set of selected features and c denotes the target variable, i.e. the class label variable;
and calculating the redundant information quantity between the characteristics according to the following calculation formula:
Figure BDA0003422058380000034
and selecting labels with low redundancy and high correlation according to the mean mutual information and the amount of redundant information, using the selection criterion
$$\max_{x_j\in W}\left[I(x_j;c)-\frac{1}{m-1}\sum_{x_i\in S}I(x_j;x_i)\right]$$
where m is the number of features; the labels ranked in the top 70% by this criterion are selected.
As an option, the ratio of the training set to the test set is 1:1.
As an option, a bayesian algorithm is used to build the single label classification model.
As an option, the multi-label data sample set contains a plurality of different labels and a plurality of different features.
The invention also provides a classification system based on the label subset, which comprises:
the sample acquisition module is used for acquiring a multi-label data sample set and converting the multi-label data sample set into a single-label data set;
the label subset calculation module is used for calculating a label subset for every sample in the single-label data set and constructing a new sample data set based on the label subsets; calculating a label subset comprises: calculating the importance of all features in the sample; calculating the correlation between features; calculating the redundancy between features; and selecting the top-ranked labels by combining the correlation and redundancy between features to construct the label subset;
the sample recombination module is used for appending the label subset of each sample to the original sample to obtain a new single-label data set;
and the modeling and classifying module is used for constructing a single-label classification model based on the new single-label data set and then collecting the labels of each sample across the single-label classification models to obtain the final multi-label classification result.
It should be further noted that the technical features corresponding to the above options can be combined with each other or replaced to form a new technical solution without conflict.
Compared with the prior art, the invention has the beneficial effects that:
the method comprises the steps of converting a multi-label data sample set into a single-label data set, effectively selecting an excellent label subset by calculating the importance of all features in the single-label data set, calculating the correlation among the features, calculating the redundancy among the features and combining the correlation and the redundancy among the features, wherein the obtained features have the redundancy as small as possible, the correlation among the features is large as possible, the influence of the redundancy among the features on the classification performance of the model is avoided, and the classification performance of the model is improved.
Drawings
Fig. 1 is a schematic flow chart of the classification method based on label subsets according to the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In addition, the technical features involved in the different embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
The invention mainly converts a multi-label data sample set into a single-label data set and effectively selects a high-quality label subset by calculating, within the single-label data set, the importance of all features, the correlation between features and the redundancy between features, and then combining the correlation and the redundancy. The selected features have as little mutual redundancy and as much correlation as possible, the correlation between labels is taken into account, and the adverse effect of feature redundancy on classification performance is avoided, thereby achieving the aim of improving the classification performance of the model.
Example 1
In an exemplary embodiment, a classification method based on label subsets is provided, as shown in Fig. 1. The method comprises the following steps:
acquiring a multi-label data sample set, and converting the multi-label data sample set into a single-label data set;
calculating a label subset for every sample in the single-label data set, and constructing a new sample data set based on the label subsets; calculating a label subset comprises: calculating the importance of all features in the sample; calculating the correlation between features; calculating the redundancy between features; and selecting the top-ranked labels by combining the correlation and redundancy between features to construct the label subset;
appending the label subset of each sample to the corresponding sample of the original single-label data set to obtain a new single-label data set;
and constructing a single-label classification model based on the new single-label data set, then collecting the labels of each sample across the single-label classification models to obtain the final multi-label classification result.
Specifically, by calculating the importance of all features in the single-label data set, the correlation between features and the redundancy between features, and then combining the correlation and the redundancy, a high-quality label subset is effectively selected. The selected features have as little redundancy and as much correlation as possible, the correlation between labels is taken into account, and the adverse effect of feature redundancy on the classification performance of the model is avoided, so that the classification performance of the model is improved.
Further, after the label subset of each sample in the single-label data set has been calculated, it is put into each sample that has been converted to single-label form; the label subset is inserted directly into the data sample as features, yielding a new single-label data set that reflects the relationships between the labels.
Further, the labels of each sample are collected across the single-label classification models: for example, a data sample to be predicted is classified by every single-label model, and whenever the sample belongs to a model's class, that label is recorded for the sample.
Example 2
Based on Example 1, a classification method based on label subsets is provided, in which converting the multi-label data sample set into the single-label data set comprises:
assigning each single label in the multi-label data sample set to every sample, thereby decomposing the set into as many data subsets as there are labels.
Further, the method further comprises:
pre-processing the multi-labeled data sample set, the pre-processing comprising:
and deleting the samples with missing data characteristic values, keeping the samples with complete data characteristics, and then randomly dividing the multi-label data sample set into a training set and a testing set.
Specifically, assume that the multi-label data sample set D contains L labels in total and that each sample has q features. The given multi-label data sample set D is preprocessed; the preprocessing includes handling missing values by deleting samples with missing feature values and keeping samples whose features are complete. The set D is then randomly divided into a training set Train and a test set Test in a 1:1 ratio. Let S be the set of selected features, which is empty at the start, and let W be the candidate feature set, which initially contains all q features.
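A minimal sketch of this preprocessing step, assuming (purely for illustration, since the patent does not specify an implementation) that D is held in a pandas DataFrame with one column per feature and a Labels column; the column names and the fixed random seed are hypothetical choices:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

def preprocess(D: pd.DataFrame, feature_cols: list) -> tuple:
    """Drop samples with missing feature values, then split the rest 1:1 into Train/Test."""
    # Keep only samples whose feature values are all present.
    complete = D.dropna(subset=feature_cols)
    # Random 1:1 split into a training set Train and a test set Test.
    train, test = train_test_split(complete, test_size=0.5, random_state=0)
    return train, test
```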
Further, assume that the multi-label data set D contains 8 samples and L = 5, the labels being {L1, L2, L3, L4, L5}. The original multi-label data set D is shown in the following table:
TABLE 1
id F1 F2 F3 F4 F5 F6 F7 Labels
1 G1 G2 G3 G4 G5 G6 G7 L1,L2,L3,L5
2 R1 R2 R3 R4 R5 R6 R7 L2,L4,L5
3 M1 M2 M3 M4 M5 M6 M7 L1,L3,L4,L5
4 Y1 Y2 Y3 Y4 Y5 Y6 Y7 L2,L3,L5
5 V1 V2 V3 V4 V5 V6 V7 L1,L2,L4
6 H1 H2 H3 H4 H5 H6 H7 L3,L4
7 X1 X2 X3 X4 X5 X6 X7 L2,L5
8 N1 N1 N1 N1 N1 N1 N7 L1,L2,L4
Here F1, F2, etc. denote features. Each single label in the label set is assigned to every sample, decomposing the set into data subsets equal in number to the labels.
In particular, label L1 corresponds to the following table:
TABLE 2
id F1 F2 F3 F4 F5 F6 F7 Labels
1 G1 G2 G3 G4 G5 G6 G7 L1
2 R1 R2 R3 R4 R5 R6 R7
3 M1 M2 M3 M4 M5 M6 M7
4 Y1 Y2 Y3 Y4 Y5 Y6 Y7
5 V1 V2 V3 V4 V5 V6 V7 L1
6 H1 H2 H3 H4 H5 H6 H7
7 X1 X2 X3 X4 X5 X6 X7
8 N1 N1 N1 N1 N1 N1 N7 L1
Label L2 corresponds to the following table:
TABLE 3
id F1 F2 F3 F4 F5 F6 F7 Labels
1 G1 G2 G3 G4 G5 G6 G7 L2
2 R1 R2 R3 R4 R5 R6 R7 L2
3 M1 M2 M3 M4 M5 M6 M7
4 Y1 Y2 Y3 Y4 Y5 Y6 Y7 L2
5 V1 V2 V3 V4 V5 V6 V7 L2
6 H1 H2 H3 H4 H5 H6 H7
7 X1 X2 X3 X4 X5 X6 X7 L2
8 N1 N1 N1 N1 N1 N1 N7 L2
Label L3 corresponds to the following table:
TABLE 4
id F1 F2 F3 F4 F5 F6 F7 Labels
1 G1 G2 G3 G4 G5 G6 G7 L3
2 R1 R2 R3 R4 R5 R6 R7
3 M1 M2 M3 M4 M5 M6 M7 L3
4 Y1 Y2 Y3 Y4 Y5 Y6 Y7 L3
5 V1 V2 V3 V4 V5 V6 V7
6 H1 H2 H3 H4 H5 H6 H7 L3
7 X1 X2 X3 X4 X5 X6 X7
8 N1 N1 N1 N1 N1 N1 N7
Label L4 corresponds to the following table:
TABLE 5
id F1 F2 F3 F4 F5 F6 F7 Labels
1 G1 G2 G3 G4 G5 G6 G7
2 R1 R2 R3 R4 R5 R6 R7 L4
3 M1 M2 M3 M4 M5 M6 M7 L4
4 Y1 Y2 Y3 Y4 Y5 Y6 Y7
5 V1 V2 V3 V4 V5 V6 V7 L4
6 H1 H2 H3 H4 H5 H6 H7 L4
7 X1 X2 X3 X4 X5 X6 X7
8 N1 N1 N1 N1 N1 N1 N7 L4
Label L5 corresponds to the following table:
TABLE 6
id F1 F2 F3 F4 F5 F6 F7 Labels
1 G1 G2 G3 G4 G5 G6 G7 L5
2 R1 R2 R3 R4 R5 R6 R7 L5
3 M1 M2 M3 M4 M5 M6 M7 L5
4 Y1 Y2 Y3 Y4 Y5 Y6 Y7 L5
5 V1 V2 V3 V4 V5 V6 V7
6 H1 H2 H3 H4 H5 H6 H7
7 X1 X2 X3 X4 X5 X6 X7 L5
8 N1 N1 N1 N1 N1 N1 N7
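The decomposition illustrated by Tables 2 to 6 could be sketched as follows; it assumes, for illustration only, that each sample's labels are stored as a Python set in a Labels column:

```python
import pandas as pd

def decompose(D: pd.DataFrame, all_labels: list) -> dict:
    """Build one single-label data set per label, mirroring Tables 2-6."""
    per_label = {}
    for lab in all_labels:
        Dl = D.copy()
        # A sample keeps the label if it carries it in the multi-label set,
        # otherwise its Labels cell is left empty.
        Dl["Labels"] = D["Labels"].apply(lambda s, lab=lab: lab if lab in s else "")
        per_label[lab] = Dl
    return per_label
```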
Further, a single-label classification algorithm could be applied directly to the converted data, but at that point the associations between the labels of the multi-label data would not be considered, even though the labels do have certain associations with one another. A new way of constructing a label subset for each sample is therefore proposed, in which the associations between the labels of a sample are captured by the label subset. The F-score is a measure of how well a feature distinguishes between two classes, so it enables effective feature selection.
Specifically, the importance of each feature is calculated by the F-score formula:
$$F_i=\frac{\left(\bar{x}_i^{(+)}-\bar{x}_i\right)^2+\left(\bar{x}_i^{(-)}-\bar{x}_i\right)^2}{\frac{1}{n^{+}-1}\sum_{k=1}^{n^{+}}\left(x_{k,i}^{(+)}-\bar{x}_i^{(+)}\right)^2+\frac{1}{n^{-}-1}\sum_{k=1}^{n^{-}}\left(x_{k,i}^{(-)}-\bar{x}_i^{(-)}\right)^2}$$
where $n^{+}$ is the number of positive-class samples and $n^{-}$ the number of negative-class samples; $\bar{x}_i$, $\bar{x}_i^{(+)}$ and $\bar{x}_i^{(-)}$ are the means of the i-th feature over the whole data set, over the positive-class samples and over the negative-class samples, respectively; $x_{k,i}^{(+)}$ is the value of the i-th feature of the k-th positive-class sample and $x_{k,i}^{(-)}$ the value of the i-th feature of the k-th negative-class sample.
The larger F_i is, the stronger the class-discriminating ability of feature x_i: the farther apart the two classes and the more compact each class, the better the classification effect, i.e. the stronger the discriminating power of the feature.
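A small sketch of the F-score of a single feature, written directly from the formula above; the 0/1 encoding of positive and negative classes is an assumption made for illustration:

```python
import numpy as np

def f_score(x: np.ndarray, y: np.ndarray) -> float:
    """F-score of one feature column x against a binary label vector y (1 = positive class)."""
    pos, neg = x[y == 1], x[y == 0]
    # Between-class scatter: squared distances of the class means from the overall mean.
    numerator = (pos.mean() - x.mean()) ** 2 + (neg.mean() - x.mean()) ** 2
    # Within-class scatter: unbiased variances (1/(n-1) normalisation) of the two classes.
    denominator = pos.var(ddof=1) + neg.var(ddof=1)
    return numerator / denominator
```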
However, the F-score cannot accurately measure the mutual information between features, and mutual information is an expression of the correlation between features; if it is not known, the strength of the correlation between features cannot be measured. The correlation between features is therefore calculated with mutual information:
$$I(X;Y)=\sum_{i}\sum_{j}p(x_i,y_j)\log\frac{p(x_i,y_j)}{p(x_i)\,p(y_j)}$$
where X denotes a feature variable, Y denotes a label variable, p(x_i) and p(y_j) are the marginal probabilities of X and Y, and p(x_i, y_j) is their joint probability distribution. Here the labels are also treated as features so that the relationships between labels are reflected.
Further, a model built by considering only the features that are most correlated with the class variable does not necessarily achieve good classification performance; that is, in feature selection, combining individually good features does not always improve the classification performance of the model, because the features may be highly correlated with one another, which introduces redundancy among them. Therefore, to effectively select a high-quality feature subset, the correlation and the redundancy between features must be combined when selecting the top-ranked labels, which includes:
calculating the mean mutual information between all features and the target variable:
$$D(S,c)=\frac{1}{|S|}\sum_{x_i\in S}I(x_i;c)$$
where S denotes the set of selected features and c denotes the target variable, i.e. the class label variable;
and calculating the redundant information quantity between the characteristics according to the following calculation formula:
Figure BDA0003422058380000112
and selecting labels with low redundancy and high correlation according to the mean mutual information and the amount of redundant information, using the selection criterion
$$\max_{x_j\in W}\left[I(x_j;c)-\frac{1}{m-1}\sum_{x_i\in S}I(x_j;x_i)\right]$$
where m is the number of features; the labels ranked in the top 70% by this criterion are selected to construct the new label subset of the sample.
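A greedy ranking loop in the spirit of the criterion above, reusing the mutual_information helper sketched earlier; the incremental form of the score and the handling of the 70% cut-off are assumptions where the text leaves the details open:

```python
import numpy as np

def select_top_features(X: np.ndarray, c: np.ndarray, keep_ratio: float = 0.7) -> list:
    """Rank the columns of X against target c by relevance minus redundancy,
    then keep the top keep_ratio fraction (70% in the text)."""
    q = X.shape[1]
    selected = []                 # S: features already chosen, empty at the start
    candidates = list(range(q))   # W: candidate features, initially all q of them
    while candidates:
        best, best_score = None, -np.inf
        for j in candidates:
            relevance = mutual_information(X[:, j], c)
            redundancy = (np.mean([mutual_information(X[:, j], X[:, i]) for i in selected])
                          if selected else 0.0)
            score = relevance - redundancy
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
        candidates.remove(best)
    return selected[: int(np.ceil(keep_ratio * q))]  # indices of the top-ranked 70%
```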
Finally, the label subset of each sample is computed iteratively in a loop, a new single-label data set is constructed from the new sample data set by putting the label subset of each sample into the sample, and a classification model is built on the new data samples.
Taking label L5 as an example, the new single label data set after adding the label subset is shown in the following table:
TABLE 7
id F1 F2 F3 F4 F5 F6 F7 L1 L2 L3 L4 Labels
1 G1 G2 G3 G4 G5 G6 G7 G8 G9 G10 G11 L5
2 R1 R2 R3 R4 R5 R6 R7 R8 R9 R10 R11 L5
3 M1 M2 M3 M4 M5 M6 M7 M8 M9 M10 M11 L5
4 Y1 Y2 Y3 Y4 Y5 Y6 Y7 Y8 Y9 Y10 Y11 L5
5 V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11
6 H1 H2 H3 H4 H5 H6 H7 H8 H9 H10 H11
7 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 L5
8 N1 N1 N1 N1 N1 N1 N7 N8 N9 N10 N11
As can be seen from Table 7, F1 through F7 are the original features of the data. Assuming the multi-label data set contains 5 labels in total, the original labels L1, L2, L3 and L4 are then added to the single-label data as features. G8, G9, G10, G11 and so on are simply the values of these features and may be, for example, 0 or 1, indicating the absence or presence of the corresponding label. In this way the relationships between label L5 and the other labels L1, L2, L3 and L4 are established; likewise, the relationships between L1 and the other labels, between L2 and the other labels, between L3 and the other labels, and between L4 and the other labels are established.
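A sketch of how the augmented data set of Table 7 could be assembled for one target label; it assumes, as above, that labels are stored as sets in a Labels column and that the other labels are encoded as the 0/1 indicator features mentioned in the text:

```python
import pandas as pd

def augment_for_label(D: pd.DataFrame, target: str, label_subset: list) -> pd.DataFrame:
    """Add the labels of label_subset as 0/1 feature columns and keep target as the class."""
    out = D.copy()
    for lab in label_subset:                                   # e.g. L1..L4 when the target is L5
        out[lab] = D["Labels"].apply(lambda s, lab=lab: int(lab in s))
    out["Labels"] = D["Labels"].apply(lambda s: target if target in s else "")
    return out
```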
Further, the ratio of the training set to the test set is 1:1.
Further, a Bayesian algorithm is used to construct the single-label classification model.
Further, the multi-label data sample set contains a plurality of different labels and a plurality of different features.
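Putting the pieces together, one Bayesian classifier can be trained per augmented single-label data set and the predicted labels of a new sample collected across all models; the choice of scikit-learn's GaussianNB is an assumption, since the text only specifies "a Bayesian algorithm":

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def train_and_predict(train_sets: dict, x_new: np.ndarray) -> set:
    """train_sets maps each label to its (features, 0/1 class) training arrays.
    Returns the set of labels whose single-label model accepts x_new."""
    predicted = set()
    for lab, (X, y) in train_sets.items():
        model = GaussianNB().fit(X, y)                 # one single-label classifier per label
        if model.predict(x_new.reshape(1, -1))[0] == 1:
            predicted.add(lab)                         # record this label for the sample
    return predicted
```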
Example 3
A classification system based on label subsets is provided, the system comprising:
the sample acquisition module is used for acquiring a multi-label data sample set and converting the multi-label data sample set into a single-label data set;
the label subset calculation module is used for calculating a label subset for every sample in the single-label data set and constructing a new sample data set based on the label subsets; calculating a label subset comprises: calculating the importance of all features in the sample; calculating the correlation between features; calculating the redundancy between features; and selecting the top-ranked labels by combining the correlation and redundancy between features to construct the label subset;
the sample recombination module is used for appending the label subset of each sample to the original sample to obtain a new single-label data set;
and the modeling and classifying module is used for constructing a single-label classification model based on the new single-label data set and then collecting the labels of each sample across the single-label classification models to obtain the final multi-label classification result.
Example 4
This embodiment shares the inventive concept of Example 1. On the basis of Example 1, a storage medium is provided on which computer instructions are stored; when the computer instructions are executed, the steps of the label-subset-based classification method of Example 1 are performed.
Based on such understanding, the technical solution of the present embodiment or parts of the technical solution may be essentially implemented in the form of a software product, which is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method of the embodiments of the present invention. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Example 5
This embodiment, which shares the inventive concept of Example 1, also provides a terminal comprising a memory and a processor, the memory storing computer instructions executable on the processor; when the processor executes the computer instructions, the steps of the label-subset-based classification method of Example 1 are performed. The processor may be a single-core or multi-core central processing unit, an application-specific integrated circuit, or one or more integrated circuits configured to implement the present invention.
Each functional unit in the embodiments provided by the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The above detailed description is intended to describe the invention in detail and should not be construed as limiting the invention; it will be apparent to those skilled in the art that various modifications and substitutions can be made without departing from the spirit of the invention.

Claims (10)

1. A classification method based on label subsets, the method comprising:
acquiring a multi-label data sample set, and converting the multi-label data sample set into a single-label data set;
calculating a label subset for every sample in the single-label data set, and constructing a new sample data set based on the label subsets; calculating a label subset comprises: calculating the importance of all features in the sample; calculating the correlation between features; calculating the redundancy between features; and selecting the top-ranked labels by combining the correlation and redundancy between features to construct the label subset;
appending the label subset of each sample to the corresponding sample of the original single-label data set to obtain a new single-label data set;
and constructing a single-label classification model based on the new single-label data set, then collecting the labels of each sample across the single-label classification models to obtain the final multi-label classification result.
2. The method of claim 1,
the converting of the multi-label data sample set into the single-label data set comprises:
assigning each single label in the multi-label data sample set to every sample, thereby decomposing the set into as many data subsets as there are labels.
3. The method of claim 1, wherein the method further comprises:
pre-processing the multi-labeled data sample set, the pre-processing comprising:
and deleting the samples with missing data characteristic values, keeping the samples with complete data characteristics, and then randomly dividing the multi-label data sample set into a training set and a testing set.
4. The method of claim 1, wherein the importance of each feature is calculated by the F-score formula:
$$F_i=\frac{\left(\bar{x}_i^{(+)}-\bar{x}_i\right)^2+\left(\bar{x}_i^{(-)}-\bar{x}_i\right)^2}{\frac{1}{n^{+}-1}\sum_{k=1}^{n^{+}}\left(x_{k,i}^{(+)}-\bar{x}_i^{(+)}\right)^2+\frac{1}{n^{-}-1}\sum_{k=1}^{n^{-}}\left(x_{k,i}^{(-)}-\bar{x}_i^{(-)}\right)^2}$$
where the larger F_i is, the stronger the class-discriminating ability of feature x_i.
5. The method of claim 4, wherein the correlation between features is calculated using mutual information, and the calculation formula is as follows:
$$I(X;Y)=\sum_{i}\sum_{j}p(x_i,y_j)\log\frac{p(x_i,y_j)}{p(x_i)\,p(y_j)}$$
where X denotes a feature variable, Y denotes a label variable, p(x_i) and p(y_j) are the marginal probabilities of X and Y, and p(x_i, y_j) is their joint probability distribution.
6. The method according to claim 5, wherein the selecting of the top-ranked labels by combining the correlation and redundancy between features comprises:
calculating the mean mutual information between all features and the target variable:
$$D(S,c)=\frac{1}{|S|}\sum_{x_i\in S}I(x_i;c)$$
where S denotes the set of selected features and c denotes the target variable;
and calculating the redundant information quantity between the characteristics according to the following calculation formula:
Figure RE-FDA0003510723360000024
and selecting labels with low redundancy and high correlation according to the mean mutual information and the amount of redundant information, using the selection criterion
$$\max_{x_j\in W}\left[I(x_j;c)-\frac{1}{m-1}\sum_{x_i\in S}I(x_j;x_i)\right]$$
where m is the number of features; the labels ranked in the top 70% by this criterion are selected.
7. The method of claim 3, wherein the ratio of the training set to the test set is 1:1.
8. The method of claim 1, wherein a Bayesian algorithm is used to construct the single label classification model.
9. The method of claim 1, wherein the multi-labeled data sample set comprises a plurality of different labels and a plurality of different features.
10. A classification system based on label subsets, the system comprising:
the sample acquisition module is used for acquiring a multi-label data sample set and converting the multi-label data sample set into a single-label data set;
the label subset calculation module is used for calculating a label subset for every sample in the single-label data set and constructing a new sample data set based on the label subsets; calculating a label subset comprises: calculating the importance of all features in the sample; calculating the correlation between features; calculating the redundancy between features; and selecting the top-ranked labels by combining the correlation and redundancy between features to construct the label subset;
the sample recombination module is used for appending the label subset of each sample to the original sample to obtain a new single-label data set;
and the modeling and classifying module is used for constructing a single-label classification model based on the new single-label data set and then collecting the labels of each sample across the single-label classification models to obtain the final multi-label classification result.
CN202111566217.5A 2021-12-20 2021-12-20 Classification method and system based on label subset Pending CN114298191A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111566217.5A CN114298191A (en) 2021-12-20 2021-12-20 Classification method and system based on label subset

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111566217.5A CN114298191A (en) 2021-12-20 2021-12-20 Classification method and system based on label subset

Publications (1)

Publication Number Publication Date
CN114298191A true CN114298191A (en) 2022-04-08

Family

ID=80967444

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111566217.5A Pending CN114298191A (en) 2021-12-20 2021-12-20 Classification method and system based on label subset

Country Status (1)

Country Link
CN (1) CN114298191A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination