CN117851860A - Method for automatically generating a data classification and grading template - Google Patents
Method for automatically generating a data classification and grading template
- Publication number: CN117851860A
- Application number: CN202311712760.0A
- Authority: CN (China)
- Prior art keywords: classification, data, template, class, subclasses
- Prior art date: 2023-12-13
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F 18/24: Pattern recognition; Analysing; Classification techniques
- G06F 18/23211: Pattern recognition; Analysing; Clustering techniques; Non-hierarchical techniques using statistics or function optimisation (e.g. modelling of probability density functions) with adaptive number of clusters
- G06N 20/00: Machine learning
Abstract
The invention provides a method for automatically generating a data classification and grading template, comprising the following steps: S1, reading a preset data source; S2, performing constrained hierarchical clustering; S3, forming a classification catalog with a multi-layer tree structure; S4, training machine learning models as the recognition rules of the leaf classes; and S5, applying the template to a target data source. The method extracts metadata information and content sampling data from the preset data source and considers both jointly, combines hierarchical clustering with natural language processing to generate a classification catalog with descriptive and accurate class names, and trains machine learning models as leaf-class recognition rules, thereby forming a complete classification and grading template.
Description
Technical Field
The invention relates to the technical field of data classification and grading, in particular to a method for automatically generating a data classification and grading template.
Background
In the current information age, the volume and complexity of data grow exponentially, and data management and security face increasing challenges; classifying and grading data so that it can be protected more effectively has therefore become an important task. Classification assigns similar data to the same category, so that data resources can be organized and managed more effectively. Grading allows appropriate access rights and security controls to be determined according to a data item's category, sensitivity, and importance.
To implement data classification and grading, it is generally necessary to construct a classification and grading template. Such a template typically contains a hierarchical, tree-structured classification catalog in which each leaf class carries specific recognition rules for determining whether a piece of data belongs to that class. Once constructed, the template can be applied to recognize, classify, and grade the data in a data source.
At present, classification and grading templates are generally constructed manually: professionals define the classification standards and recognition rules based on domain knowledge and experience. As data scale and complexity increase, however, manually designing templates and recognition rules becomes difficult and time-consuming, and is subject to subjective bias and error.
Disclosure of Invention
To address the shortcomings of the prior art, the invention aims to provide a method for automatically generating a data classification and grading template that improves both the efficiency of template construction and the accuracy of recognition.
To achieve the above object, the invention is realized by the following technical scheme. A method for automatically generating a data classification and grading template comprises the following steps:
S1, reading a preset data source;
S2, performing constrained hierarchical clustering;
S3, forming a classification catalog with a multi-layer tree structure;
S4, training machine learning models as recognition rules for the leaf classes;
and S5, applying the classification and grading template to a target data source.
Further, step S1 includes: the data source comprises fields or files; when the data source is read, metadata information and content sampling data are acquired from it, providing the data basis for subsequent generation of the classification and grading template.
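By way of illustration, the following is a minimal sketch of how such metadata and content samples might be collected when the preset data source is a relational database. The use of SQLAlchemy, the function name read_data_source, and the ORDER BY RANDOM() sampling clause (PostgreSQL/SQLite syntax) are assumptions made for this example only and are not details specified by the method itself.

```python
from sqlalchemy import create_engine, inspect, text

def read_data_source(db_url, sample_size=100):
    """Collect field metadata (table/column names, comments, types) and
    randomly sampled content values from every column of a relational database."""
    engine = create_engine(db_url)
    inspector = inspect(engine)
    fields = []
    with engine.connect() as conn:
        for table in inspector.get_table_names():
            for col in inspector.get_columns(table):
                # Random sampling of content values; ORDER BY RANDOM() is
                # PostgreSQL/SQLite syntax and would differ on other databases.
                rows = conn.execute(
                    text(f'SELECT "{col["name"]}" FROM "{table}" ORDER BY RANDOM() LIMIT :n'),
                    {"n": sample_size},
                ).fetchall()
                fields.append({
                    "table": table,
                    "name": col["name"],
                    "comment": col.get("comment"),
                    "type": str(col["type"]),
                    "samples": [r[0] for r in rows],
                })
    return fields
```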
Further, step S2 comprises the following steps:
Step S21, taking each field or file in the preset data source as an initial class cluster, and iteratively merging the two most similar class clusters to form a hierarchical clustering structure, until the total number of class clusters is less than or equal to a preset value N;
Step S22, calculating the similarity between class clusters by combining metadata information and content sampling data, where the content sampling data consists of concrete content samples extracted from the fields or files; the similarity, computed by jointly considering this information, is used to determine the two most similar class clusters to merge;
Step S23, balancing the cluster tree formed by the clustering.
Further, step S21 also terminates the iteration when the maximum similarity between class clusters is less than or equal to a preset value θ1, and in step S22 the metadata information includes the attributes, types, and association relations of the fields or files.
Further, a constraint must be followed during clustering: once a class cluster contains more than M minimum subclasses, it can no longer participate in merging.
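As an illustration of steps S21-S22 and the above constraint, a minimal sketch of the constrained merging loop is given below. It assumes a user-supplied similarity(a, b) measure over class clusters, and the parameter names N, theta1, and M simply mirror the preset values mentioned above; this is a sketch under those assumptions, not the claimed implementation.

```python
def constrained_hierarchical_clustering(items, similarity, N=5, theta1=0.3, M=20):
    """Agglomerative clustering sketch: merge the two most similar clusters,
    stopping when the cluster count is <= N or the best similarity <= theta1.
    A cluster holding more than M original items (minimum subclasses) is frozen."""
    # Each class cluster is a dict holding its original items and child clusters.
    clusters = [{"items": [it], "children": []} for it in items]

    while len(clusters) > N:
        best_pair, best_sim = None, -1.0
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Constraint: clusters with more than M minimum subclasses do not merge.
                if len(clusters[i]["items"]) > M or len(clusters[j]["items"]) > M:
                    continue
                sim = similarity(clusters[i], clusters[j])
                if sim > best_sim:
                    best_pair, best_sim = (i, j), sim
        # Stop if nothing can merge or the best similarity falls to theta1 or below.
        if best_pair is None or best_sim <= theta1:
            break
        i, j = best_pair
        merged = {
            "items": clusters[i]["items"] + clusters[j]["items"],
            "children": [clusters[i], clusters[j]],
        }
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters  # roots of the cluster tree
```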
Further, step S3 comprises the following steps:
Step S31, pruning the cluster tree formed during clustering to obtain a classification catalog with a multi-layer tree structure;
Step S32, for each class cluster in the pruned cluster tree, performing semantic analysis and feature extraction on the text content of its subclasses using natural language processing to obtain keyword, topic, or concept information; the metadata of the fields or files is also considered, the information common to the subclasses is filtered out, and from it a descriptive and accurate parent class name is generated, used to identify and organize the corresponding class cluster, so that the classification catalog has a clear and easily understood hierarchy.
Further, step S31 includes a pruning process: the class clusters in the cluster tree are traversed from bottom to top; if the similarity between the minimum subclasses within a class cluster is greater than θ2, those minimum subclasses are flattened, that is, the intermediate class clusters formed by merging them are removed and the minimum subclasses become direct children of the class cluster.
Further, step S4 comprises:
Step S41, for each leaf class in the classification catalog, training a machine learning model on the metadata information and content data it contains, to serve as that class's recognition rule; the resulting model judges which leaf class a new piece of data belongs to based on the input data;
Step S42, when training the model for one leaf class, randomly selecting a certain proportion of samples from that class as positive samples and a certain proportion of samples from the other leaf classes as negative samples, so that the model learns the characteristics and rules of each leaf class and correctly distinguishes new data belonging to other leaf classes;
Step S43, after training, using the trained model to predict samples from the other classes, marking the mispredicted samples as negative samples, and retraining; this is repeated until the misprediction rate on the other classes reaches a satisfactory level or a preset stopping condition is met, so that easily confused classes or newly appearing classes are trained iteratively by continuously introducing mispredicted samples.
Further, step S5 includes: combining the classification catalog and the recognition rules generated by the above process into a complete classification and grading template, and applying the template to a new data source to classify and grade it.
Further, the fields or files in step S21 include: fields in database tables, and text documents and spreadsheets in a file system.
The beneficial effects of the invention are as follows:
Automatic generation of the data classification and grading template: by extracting metadata information and content sampling data from a preset data source and considering both jointly, the invention combines hierarchical clustering with natural language processing to generate a classification catalog with descriptive and accurate class names, and trains machine learning models as leaf-class recognition rules, thereby forming a complete classification and grading template.
Improved recognition accuracy: while training the machine learning models that serve as leaf-class recognition rules, each model is used after an iteration to predict samples from the other classes, and mispredicted samples are marked as negative samples for retraining. Continuously introducing mispredicted samples as negatives improves classification accuracy and makes the results more reliable.
Flexibility: a separate machine learning model is trained as the recognition rule for each leaf class, so adding or deleting classes does not affect the existing ones. The template can therefore be adjusted and optimized to suit the classification requirements of different domains and scenarios.
High efficiency: by reading a preset data source and combining hierarchical clustering with machine learning, the method quickly generates a classification and grading template that can then be applied to target data sources for efficient classification and grading.
Drawings
Other features, objects, and advantages of the present invention will become more apparent upon reading the following detailed description of non-limiting embodiments, given with reference to the accompanying drawings, in which:
FIG. 1 is a flow chart of the present invention for generating a classification hierarchical template;
FIG. 2 is a flow chart of the constrained hierarchical clustering of the present invention;
FIG. 3 is a flow chart of training recognition rules according to the present invention.
Detailed Description
The invention is further described below in connection with specific embodiments, so that its technical means, creative features, objectives, and effects are easy to understand.
Referring to FIGS. 1-3, a method for automatically generating a data classification and grading template comprises the following steps:
S1, reading a preset data source;
S2, performing constrained hierarchical clustering;
S3, forming a classification catalog with a multi-layer tree structure;
S4, training machine learning models as recognition rules for the leaf classes;
and S5, applying the classification and grading template to a target data source.
Step S1 includes: the data source comprises fields or files; when the data source is read, metadata information and content sampling data are acquired from it, providing the data basis for subsequent generation of the classification and grading template.
Step S2 comprises the following steps:
Step S21, taking each field or file in the preset data source as an initial class cluster, and iteratively merging the two most similar class clusters to form a hierarchical clustering structure, until the total number of class clusters is less than or equal to a preset value N;
Step S22, calculating the similarity between class clusters by combining metadata information and content sampling data, where the content sampling data consists of concrete content samples extracted from the fields or files; the similarity, computed by jointly considering this information, is used to determine the two most similar class clusters to merge;
Step S23, balancing the cluster tree formed by the clustering.
Step S21 also terminates the iteration when the maximum similarity between class clusters is less than or equal to a preset value θ1, and in step S22 the metadata information includes the attributes, types, and association relations of the fields or files.
During clustering, a constraint must be followed: once a class cluster contains more than M minimum subclasses, it can no longer participate in merging.
Step S3 comprises the following steps:
Step S31, pruning the cluster tree formed during clustering to obtain a classification catalog with a multi-layer tree structure;
Step S32, for each class cluster in the pruned cluster tree, performing semantic analysis and feature extraction on the text content of its subclasses (fields or files) using natural language processing to obtain keyword, topic, or concept information; the metadata of the fields or files, such as labels, descriptions, and types, is also considered, the information common to the subclasses is filtered out, and from it a descriptive and accurate parent class name is generated, used to identify and organize the corresponding class cluster, so that the classification catalog has a clear and easily understood hierarchy.
Step S31 includes a pruning process: the class clusters in the cluster tree are traversed from bottom to top; if the similarity between the minimum subclasses within a class cluster is greater than θ2, those minimum subclasses are flattened, that is, the intermediate class clusters formed by merging them are removed and the minimum subclasses become direct children of the class cluster.
Step S4 comprises:
Step S41, for each leaf class in the classification catalog, training a machine learning model on the metadata information and content data it contains, to serve as that class's recognition rule; the resulting model judges which leaf class a new piece of data belongs to based on the input data;
Step S42, when training the model for one leaf class, randomly selecting a certain proportion of samples from that class as positive samples and a certain proportion of samples from the other leaf classes as negative samples, so that the model learns the characteristics and rules of each leaf class and correctly distinguishes new data belonging to other leaf classes;
Step S43, after training, using the trained model to predict samples from the other classes, marking the mispredicted samples as negative samples, and retraining; this is repeated until the misprediction rate on the other classes reaches a satisfactory level or a preset stopping condition is met. By repeatedly training and adjusting the models in this way, especially for easily confused classes or newly appearing classes, and continuously introducing mispredicted samples for iterative training, the accuracy and generalization ability of the models are improved.
Step S5 includes: combining the classification catalog and the recognition rules generated by the above process into a complete classification and grading template, and applying the template to a new data source to classify and grade it.
The fields or files in step S21 include: fields in database tables, text documents and spreadsheets in file systems, and the like.
Working principle: the method comprises the following steps:
Step 1: taking a well-structured relational database as the preset data source, the system connects to the database and reads the metadata information and content sampling data of the fields it contains. The metadata information includes the database name and comments, table names and comments, and the names, comments, and types of the fields; the content sampling data is obtained by randomly sampling values from the fields.
Step 2: taking each field as an initial class cluster, the two most similar class clusters are merged iteratively until the total number of class clusters is less than or equal to 5 or the maximum similarity between class clusters is less than or equal to 30%. To compute similarity, text data such as database/table/field names and comments, together with the content sampling data, is first preprocessed (word segmentation, stop-word removal, and so on); a pre-trained Word2Vec model is then used to take a weighted average of the word vectors of the preprocessed words, the resulting vector is taken as the class cluster vector, and the cosine similarity between these vectors gives the similarity between class clusters. The clustering also follows a constraint: a class cluster containing more than 20 minimum subclasses can no longer participate in merging.
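A minimal sketch of this similarity computation follows, assuming a pre-trained Word2Vec model loaded with gensim and jieba for Chinese word segmentation; the file name word2vec.bin, the tiny stop-word list, and the use of a plain (unweighted) average instead of the weighted average mentioned above are simplifying assumptions for the example.

```python
import numpy as np
import jieba
from gensim.models import KeyedVectors

# Hypothetical resources: a pre-trained word-vector file and a stop-word list.
word_vectors = KeyedVectors.load_word2vec_format("word2vec.bin", binary=True)
STOP_WORDS = {"的", "了", "和"}  # illustrative only

def cluster_vector(texts):
    """Average the word vectors of all preprocessed tokens in a cluster's texts."""
    tokens = [w for t in texts for w in jieba.lcut(str(t))
              if w not in STOP_WORDS and w in word_vectors]
    if not tokens:
        return np.zeros(word_vectors.vector_size)
    return np.mean([word_vectors[w] for w in tokens], axis=0)

def cluster_similarity(texts_a, texts_b):
    """Cosine similarity between the two clusters' average vectors."""
    va, vb = cluster_vector(texts_a), cluster_vector(texts_b)
    denom = np.linalg.norm(va) * np.linalg.norm(vb)
    return float(np.dot(va, vb) / denom) if denom else 0.0
```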
Step 3: the cluster tree formed in step 2 is pruned. The class clusters in the tree are traversed from bottom to top; if the similarity between the minimum subclasses within a class cluster is greater than 60%, those minimum subclasses are flattened, that is, the intermediate class clusters formed by merging them are removed and all the minimum subclasses become direct children of that class cluster.
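The bottom-up flattening might be sketched as follows, reusing the cluster dictionaries from the earlier clustering sketch and interpreting "the similarity between the minimum subclasses" as the minimum pairwise similarity; the 0.6 threshold corresponds to the 60% value above, and the structure is an assumption for illustration.

```python
def prune(cluster, similarity, theta2=0.6):
    """Bottom-up pruning: if all minimum subclasses under a cluster are mutually
    similar above theta2, drop the intermediate clusters and attach the
    minimum subclasses as direct children."""
    if not cluster["children"]:
        return cluster
    # Recurse first so pruning proceeds from the bottom of the tree upward.
    cluster["children"] = [prune(c, similarity, theta2) for c in cluster["children"]]

    leaves = [{"items": [it], "children": []} for it in cluster["items"]]
    pairwise = [similarity(a, b) for i, a in enumerate(leaves) for b in leaves[i + 1:]]
    if pairwise and min(pairwise) > theta2:
        cluster["children"] = leaves  # flatten: minimum subclasses become direct children
    return cluster
```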
Step 4: for each class cluster in the cluster tree pruned in step 3, named entity recognition (NER) is applied to the text content of the fields or files it contains, term frequencies are counted, and a high-frequency named entity is chosen as the name of the class cluster and as one class of the classification catalog. All class clusters are organized according to their hierarchical relationships to form the complete classification catalog.
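As one possible realization of this naming step, the sketch below uses spaCy for named entity recognition; the embodiment only specifies "NER technology", so the choice of spaCy and the zh_core_web_sm model name are assumptions for this example.

```python
from collections import Counter
import spacy

nlp = spacy.load("zh_core_web_sm")  # hypothetical choice of Chinese NER model

def name_cluster(texts):
    """Pick the most frequent named entity in the cluster's texts as the class name."""
    counts = Counter()
    for t in texts:
        for ent in nlp(str(t)).ents:
            counts[ent.text] += 1
    return counts.most_common(1)[0][0] if counts else "Unnamed class"
```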
Step 5: for each minimum subclass in the classification catalog generated in step 4, a certain proportion of samples is randomly selected from that class as positive samples and a certain proportion of samples from the other classes as negative samples, and a TextCNN model is trained; the trained model serves as that class's recognition rule. Furthermore, easily confused classes are repeatedly trained on each other's samples as negative examples to improve recognition accuracy.
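The iterative introduction of mispredicted negatives (see also step S43) could be sketched as follows. For brevity, a scikit-learn TF-IDF plus logistic regression pipeline stands in for the TextCNN named in the embodiment, and the sampling ratio, round limit, and error threshold are illustrative assumptions.

```python
import random
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def train_leaf_recognizer(pos_samples, other_samples, neg_ratio=0.5,
                          max_rounds=5, target_error=0.05):
    """Train a binary recognizer for one leaf class, repeatedly folding
    mispredicted samples from other classes back in as negative samples."""
    negatives = random.sample(other_samples,
                              max(1, int(len(other_samples) * neg_ratio)))
    model = None
    for _ in range(max_rounds):
        texts = pos_samples + negatives
        labels = [1] * len(pos_samples) + [0] * len(negatives)
        model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
        model.fit(texts, labels)

        # Predict the remaining other-class samples; those predicted as positive
        # are mispredictions and become new negative samples for the next round.
        remaining = [s for s in other_samples if s not in negatives]
        if not remaining:
            break
        preds = model.predict(remaining)
        mispredicted = [s for s, p in zip(remaining, preds) if p == 1]
        if len(mispredicted) / len(remaining) <= target_error:
            break
        negatives.extend(mispredicted)
    return model
```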
Step 6: the classification catalog and the recognition rules generated above are combined into a complete classification and grading template. The template can then be applied to a new data source to classify and grade it.
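Applying the finished template might then look like the sketch below, assuming each leaf class stores its trained recognizer from the previous step and that a field's metadata and sampled content are joined into one text, as during training; the structure of the template dictionary and the scoring by predicted probability are assumptions for this example.

```python
def classify_field(field_text, template):
    """Assign a field (its metadata and sampled content joined as text) to the
    leaf class whose recognizer gives the highest positive probability."""
    best_class, best_score = None, 0.0
    for leaf in template["leaves"]:  # each leaf: {"name": ..., "model": ...}
        score = leaf["model"].predict_proba([field_text])[0][1]
        if score > best_score:
            best_class, best_score = leaf["name"], score
    return best_class, best_score
```

In practice, the grade (sensitivity level) attached to each leaf class in the template would then determine the access rights and protection measures applied to the matched field.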
While the fundamental features and principal advantages of the invention have been shown and described, it will be apparent to those skilled in the art that the invention is not limited to the details of the foregoing exemplary embodiments and may be embodied in other specific forms without departing from its spirit or essential characteristics. The embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description; all changes that come within the meaning and range of equivalency of the claims are intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Furthermore, it should be understood that although this specification is described in terms of embodiments, not every embodiment contains only a single independent technical solution; this manner of description is adopted for clarity only. The specification should be taken as a whole, and the technical solutions of the various embodiments may be combined as appropriate to form other embodiments apparent to those skilled in the art.
Claims (10)
1. A method for automatically generating a data classification and grading template, characterized in that the method comprises the following steps:
S1, reading a preset data source;
S2, performing constrained hierarchical clustering;
S3, forming a classification catalog with a multi-layer tree structure;
S4, training machine learning models as recognition rules for the leaf classes;
and S5, applying the classification and grading template to a target data source.
2. The method for automatically generating a data classification and grading template according to claim 1, characterized in that step S1 includes: the data source comprises fields or files; when the data source is read, metadata information and content sampling data are acquired from it, providing the data basis for subsequent generation of the classification and grading template.
3. The method for automatically generating a data classification and grading template according to claim 2, characterized in that step S2 comprises the following steps:
Step S21, taking each field or file in the preset data source as an initial class cluster, and iteratively merging the two most similar class clusters to form a hierarchical clustering structure, until the total number of class clusters is less than or equal to a preset value N;
Step S22, calculating the similarity between class clusters by combining metadata information and content sampling data, where the content sampling data consists of concrete content samples extracted from the fields or files; the similarity, computed by jointly considering this information, is used to determine the two most similar class clusters to merge;
Step S23, balancing the cluster tree formed by the clustering.
4. The method for automatically generating a data classification and grading template according to claim 3, characterized in that step S21 also terminates the iteration when the maximum similarity between class clusters is less than or equal to a preset value θ1, and in step S22 the metadata information includes the attributes, types, and association relations of the fields or files.
5. The method for automatically generating a data classification and grading template according to claim 4, characterized in that a constraint is followed during clustering: once a class cluster contains more than M minimum subclasses, it can no longer participate in merging.
6. The method for automatically generating a data classification and grading template according to claim 5, characterized in that step S3 comprises the following steps:
Step S31, pruning the cluster tree formed during clustering to obtain a classification catalog with a multi-layer tree structure;
Step S32, for each class cluster in the pruned cluster tree, performing semantic analysis and feature extraction on the text content of its subclasses using natural language processing to obtain keyword, topic, or concept information; the metadata of the fields or files is also considered, the information common to the subclasses is filtered out, and from it a descriptive and accurate parent class name is generated, used to identify and organize the corresponding class cluster, so that the classification catalog has a clear and easily understood hierarchy.
7. The method for automatically generating a data classification and grading template according to claim 6, characterized in that step S31 includes a pruning process: the class clusters in the cluster tree are traversed from bottom to top; if the similarity between the minimum subclasses within a class cluster is greater than θ2, those minimum subclasses are flattened, that is, the intermediate class clusters formed by merging them are removed and the minimum subclasses become direct children of the class cluster.
8. The method for automatically generating a data classification and grading template according to claim 7, characterized in that step S4 comprises:
Step S41, for each leaf class in the classification catalog, training a machine learning model on the metadata information and content data it contains, to serve as that class's recognition rule; the resulting model judges which leaf class a new piece of data belongs to based on the input data;
Step S42, when training the model for one leaf class, randomly selecting a certain proportion of samples from that class as positive samples and a certain proportion of samples from the other leaf classes as negative samples, so that the model learns the characteristics and rules of each leaf class and correctly distinguishes new data belonging to other leaf classes;
Step S43, after training, using the trained model to predict samples from the other classes, marking the mispredicted samples as negative samples, and retraining; this is repeated until the misprediction rate on the other classes reaches a satisfactory level or a preset stopping condition is met, so that easily confused classes or newly appearing classes are trained iteratively by continuously introducing mispredicted samples.
9. The method for automatically generating a data classification and grading template according to claim 8, characterized in that step S5 includes: combining the classification catalog and the recognition rules generated by the above process into a complete classification and grading template, and applying the template to a new data source to classify and grade it.
10. The method for automatically generating a data classification and grading template according to claim 3, characterized in that the fields or files in step S21 include: fields in database tables, and text documents and spreadsheets in a file system.
Priority Applications (1)
- CN202311712760.0A (published as CN117851860A), priority date 2023-12-13, filing date 2023-12-13: Method for automatically generating a data classification and grading template
Publications (1)
- CN117851860A (application), publication date 2024-04-09
Family
- ID: 90537296
Family Applications (1)
- CN202311712760.0A (CN117851860A), priority and filing date 2023-12-13, status: Pending
Country Status (1)
- CN: CN117851860A
Cited By (2)
- CN118276793A (priority date 2024-06-04, published 2024-07-02, 江苏达海智能系统股份有限公司): Method and system for collecting facility heterogeneous data for building intellectualization
- CN118276793B (priority date 2024-06-04, published 2024-08-13, 江苏达海智能系统股份有限公司): Method and system for collecting facility heterogeneous data for building intellectualization
Legal Events
- PB01: Publication
- SE01: Entry into force of request for substantive examination