CN113837240A - Classification system and classification method for education department - Google Patents
Classification system and classification method for education department
- Publication number
- CN113837240A
- Authority
- CN
- China
- Prior art keywords
- classification
- subject
- training
- education
- articles
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/20—Education
Abstract
A classification system and a classification method for education department subjects, the method comprising: step 1: establishing an annotation data set; step 2: transcoding the annotation data set; step 3: establishing a training set and a test set for a given subject; step 4: building a model based on a convolutional neural network; step 5: training the model; step 6: classifying subjects. The invention addresses the following shortcomings of the prior art: no highly accurate subject classification system based on the education department subjects is available on the market; traditional classification methods have low accuracy and make classification difficult; and traditional article classification does not involve the education department subjects.
Description
Technical Field
Embodiments of the invention relate to the technical field of classification, specifically student classification and subject classification, and in particular to a classification system and a classification method for education department subjects, namely a periodical classification system and classification method based on a convolutional neural network and the education department subjects.
Background
At present, vendors have carried out many studies on subject classification, but they distinguish the subjects of periodicals only by conventional word-frequency analysis and keyword clustering, and no highly accurate subject classification system based on the education department subjects is available on the market.
Problem 1: traditional classification methods have low accuracy and make classification difficult.
Most traditional classification methods rely on word-frequency association: if a certain keyword occurs frequently in an article, the article is linked to the subject associated with that keyword. As article content has grown richer over time, this approach can no longer keep up. For example, a classification method may link the keyword "scalpel" to medicine, but an article that describes the manufacturing process of a scalpel clearly has little to do with medicine. In addition, collecting all keywords related to medicine is very difficult, so maintaining such a system requires a large amount of manpower and material resources.
Problem 2: traditional article classification does not involve the education department subjects.
Websites that host academic papers have no uniform standard for article classification, and each essentially goes its own way. At present, no data vendor in China has carried out research on a classification system for the education department subjects.
Disclosure of Invention
To solve the above problems, embodiments of the present invention provide a classification system and a classification method for education department subjects. They address the shortcomings of the prior art: no highly accurate subject classification system based on the education department subjects is available on the market, traditional classification methods have low accuracy and make classification difficult, and traditional article classification does not involve the education department subjects.
To overcome these shortcomings, the embodiments of the invention provide the following solution:
A classification method of a classification system for education department subjects, comprising:
Step 1: establishing an annotation data set;
the method for establishing the annotation data set comprises: establishing the annotation data set according to the correspondence between academic papers and Chinese Library Classification (CLC) numbers and the correspondence between CLC numbers and education department subjects.
Step 2: transcoding the annotation data set;
the transcoding method comprises: collecting all words that appear in the acquired articles and building an English dictionary of length 601408; converting every article into a 1 x 200 matrix according to the English dictionary.
Step 3: establishing a training set and a test set for a given subject;
the method for establishing the training set and the test set comprises: marking all articles of the subject as positive results, taking 80% of them as the positive-result training set and the remaining 20% as the positive-result test set; marking all articles not of the subject as negative results, taking 80% of them as the negative-result training set and the remaining 20% as the negative-result test set;
during training, 64 samples are drawn each time from each of the positive-result and negative-result training sets as model training samples, and 64 samples are drawn each time from each of the positive-result and negative-result test sets as model test samples.
Step 4: building a model based on a convolutional neural network;
Step 5: training the models;
the model training method comprises: training a total of 13 discipline category ("gate") models and 110 education department primary subject models, with the evaluation index of each model above 90%.
Step 6: classifying subjects;
the subject classification method comprises: for an article to be classified into an education department primary subject, it must first satisfy the discipline category ("gate") model to which that subject belongs and then satisfy the primary subject model;
for a journal to be classified into an education department primary subject, at least 60% of its articles must fall under that primary subject.
A classification system for education department subjects, comprising:
an establishing module for establishing an annotation data set;
a transcoding module for transcoding the annotation data set;
a training module for establishing a training set and a test set for a given subject;
a building module for building a model based on a convolutional neural network;
a model module for model training;
and a classification module for subject classification.
The establishing module is further configured to establish the annotation data set according to the correspondence between academic papers and Chinese Library Classification (CLC) numbers and the correspondence between CLC numbers and education department subjects.
The transcoding module is further configured to collect all words that appear in the acquired articles and build an English dictionary of length 601408, and to convert every article into a 1 x 200 matrix according to the English dictionary.
The training module is further configured to mark all articles of the subject as positive results, taking 80% of them as the positive-result training set and the remaining 20% as the positive-result test set, and to mark all articles not of the subject as negative results, taking 80% of them as the negative-result training set and the remaining 20% as the negative-result test set;
during training, 64 samples are drawn each time from each of the positive-result and negative-result training sets as model training samples, and 64 samples are drawn each time from each of the positive-result and negative-result test sets as model test samples.
The classification module is further configured so that an article to be classified into an education department primary subject must first satisfy the discipline category ("gate") model to which that subject belongs and then satisfy the primary subject model;
for a journal to be classified into an education department primary subject, at least 60% of its articles must fall under that primary subject.
The beneficial effects of the embodiments of the invention are as follows:
the method provides a highly accurate subject classification system based on the education department subjects, in which classification is straightforward and tied to those subjects. It addresses the shortcomings of the prior art: no highly accurate subject classification system based on the education department subjects is available on the market, traditional classification methods have low accuracy and make classification difficult, and traditional article classification does not involve the education department subjects.
Drawings
Fig. 1 is an overall flowchart of a classification method of a classification system for education department according to the present invention.
Detailed Description
The embodiments of the present invention will be further described with reference to the drawings and the embodiments.
As shown in Fig. 1, the classification method of the classification system for education department subjects includes the following steps:
Step 1: establishing an annotation data set;
the method for establishing the annotation data set comprises: establishing the annotation data set according to the correspondence between academic papers and Chinese Library Classification (CLC) numbers and the correspondence between CLC numbers and education department subjects (13 education department discipline categories and 110 education department primary subjects). In the final results of this step, a more accurate discrimination data set is obtained, with 10 articles (repeats allowed) for each of the 13 discipline categories and 2 articles (repeats allowed) for each of the 110 primary subjects.
For example, the CLC numbers corresponding to the primary subject Marxist philosophy are A1, A8, A84 and B0; the CLC numbers corresponding to the discipline category philosophy are B01, B02, B03, B08 and B0.
Advantages: this step replaces traditional manual annotation with machine annotation, which greatly reduces the manpower and material resources required; the articles obtained through the two correspondences are labeled with high accuracy.
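The two-step lookup described in step 1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the CLC codes are the ones given in the example above, while the paper identifiers, the helper names `invert` and `label_papers`, and the use of exact code matching (rather than prefix matching on the CLC hierarchy) are assumptions.

```python
from collections import defaultdict

# Subject -> CLC numbers, taken from the example in the text.
SUBJECT_TO_CLC = {
    "Marxist philosophy": ["A1", "A8", "A84", "B0"],
    "philosophy": ["B01", "B02", "B03", "B08", "B0"],
}


def invert(subject_to_clc):
    """Build the reverse table: CLC number -> set of subjects."""
    clc_to_subjects = defaultdict(set)
    for subject, codes in subject_to_clc.items():
        for code in codes:
            clc_to_subjects[code].add(subject)
    return clc_to_subjects


def label_papers(paper_to_clc, subject_to_clc):
    """Label each paper with every subject whose CLC list contains its code.

    Assumption: codes are matched exactly; a real system may need prefix
    matching because CLC numbers are hierarchical.
    """
    clc_to_subjects = invert(subject_to_clc)
    return {
        paper_id: sorted(clc_to_subjects.get(code, set()))
        for paper_id, code in paper_to_clc.items()
    }


# Hypothetical papers with their CLC numbers.
papers = {"paper-1": "A84", "paper-2": "B0", "paper-3": "TP18"}
labels = label_papers(papers, SUBJECT_TO_CLC)
for paper_id, subjects in labels.items():
    print(paper_id, subjects)
```

A code such as B0 appears under both subjects in the example, so a paper carrying it receives both labels, which is consistent with the note that articles may repeat across subjects.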
Step 2: transcoding the annotation data set;
the transcoding method comprises: collecting all words that appear in the acquired articles and building an English dictionary of length 601408; converting every article into a 1 x 200 matrix according to the English dictionary.
For example, an excerpt of the English dictionary is shown below:
dict
minimizingweighted
municipai
as200mw
recovery51
about9years
andmangiferin
hypsochromically
pp2c21
wakening
couldlower
educationenvironment
enogenousely
betterfamily
incomplementary
acmotor
lc50were1
saionji
controllled
progresson
enhancedgreatly
ionx
bacillary
refracive
in1890
crystalsbased
energyand
forguangzhou
libertins
Part of the implementation of the transcoding method for the annotation data set is as follows:
for example: a method for creating a software program for creating, the software code of the computer program product may include a code of code.
After conversion, the text is encoded as:
[1129794 1238442 1142221 138159 1381583 571579 1335737 617718 1326063 618069 286557 1315384 776902 1259783 90889 1165424 512814 839423 547653 1391312 237506 963132 546716 1067425 113548 354942 132381 1335737 900013 214897 1143905 964454 1315933 624879 214897 1136531 1314985 51201 445480 304242 1312112 1216493 1058571 1167438 1049619 1067425 383474 1335737 900013 214897 90889 790745 1238442 1356034 1326063 237506 306144 279336 138159 428031 299002 814090 484760 776902 1259783 90889 811154 1067425 383474 1335737 900013 214897 138159 1054269 1356034 1239053 1216493 776902 1113755 654817 912278 286557 1315384 1314985 796005 1238442 618069 1381583 237506 138159 89707 1335737 682687 218181 878963 1330000 622842 153527 571579 906748 776902 700796 90889 412721 1054269 1129940 1237833 852873 1067425 878963 586549 90889 646562 214897 1352935 1314985 618069 814090 484760 183524 811154 1067425 383474 1335737 900013 214897 214897 383474 183524 394974 181300 951935 493621 1233765 1152098 214897 1010930 1314985 714988 445480 304242 618069 814090 484760 183524 571769 618069 376024 1335737 214897 383474 98387 181300 493621 1314985 214897 1010930 831280 260478 618069 376024 1335737 214897 383474 98387 445480 304242 489797 1138128 729142 877022 275706 1211368 878963 1330000 1260399 1166217 1174398 878963 385770 4958 618069 237506 236913 637641 215509 1134332 138159 1381583 1238442 380444 776902 1259783 618069 376024 1335737 214897 383474 776902 868183]
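The transcoding step can be sketched roughly as below. The original transcoding code is not reproduced in the text, so this is only an illustration under stated assumptions: words are split on whitespace, id 0 is reserved for padding and unknown words, and short articles are padded on the right. The names `build_dictionary` and `encode` are hypothetical.

```python
import numpy as np

SEQ_LENGTH = 200  # from the text: every article becomes a 1 x 200 matrix
PAD_ID = 0        # assumption: id 0 is reserved for padding and unknown words


def build_dictionary(articles):
    """Assign an integer id (starting at 1) to every distinct word."""
    vocab = {}
    for text in articles:
        for word in text.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab) + 1
    return vocab


def encode(text, vocab, seq_length=SEQ_LENGTH):
    """Map words to ids, truncate to seq_length, and right-pad with PAD_ID."""
    ids = [vocab.get(word, PAD_ID) for word in text.lower().split()][:seq_length]
    row = np.full((1, seq_length), PAD_ID, dtype=np.int32)
    row[0, :len(ids)] = ids
    return row


# Hypothetical two-article corpus.
articles = ["education environment recovery", "education research"]
vocab = build_dictionary(articles)
encoded = encode(articles[1], vocab)
print(encoded.shape)  # (1, 200)
```

Fixing the length at 200 lets every article feed the same `[None, seq_length]` input placeholder regardless of how long the original text is.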
Step 3: establishing a training set and a test set for a given subject;
the method for establishing the training set and the test set comprises: marking all articles of the subject as positive results, taking 80% of them as the positive-result training set and the remaining 20% as the positive-result test set; marking all articles not of the subject as negative results, taking 80% of them as the negative-result training set and the remaining 20% as the negative-result test set;
during training, 64 samples are drawn each time from each of the positive-result and negative-result training sets as model training samples, and 64 samples are drawn each time from each of the positive-result and negative-result test sets as model test samples.
For example:
Marxism has three abstracts: A, B and C.
Philosophy has three abstracts: D, E and F.
Law has three abstracts: H, I and J.
For Marxism, the positive results are A, B and C, and the negative results are D, E, F, H, I and J.
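The one-subject-versus-rest labelling and the 80/20 split from this example can be sketched as follows. The function name `split_one_vs_rest`, the fixed random seed, and the rounding down of the 80% cut are assumptions made for illustration.

```python
import random


def split_one_vs_rest(articles_by_subject, target, train_ratio=0.8, seed=0):
    """Return (pos_train, pos_test, neg_train, neg_test) for one subject.

    Articles of the target subject are positives; all others are negatives.
    Each group is shuffled and cut at train_ratio (rounded down).
    """
    rng = random.Random(seed)
    positives = list(articles_by_subject[target])
    negatives = [
        article
        for subject, items in articles_by_subject.items()
        if subject != target
        for article in items
    ]

    def split(items):
        shuffled = items[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_ratio)
        return shuffled[:cut], shuffled[cut:]

    pos_train, pos_test = split(positives)
    neg_train, neg_test = split(negatives)
    return pos_train, pos_test, neg_train, neg_test


# The nine abstracts from the example above.
corpus = {
    "Marxism": ["A", "B", "C"],
    "philosophy": ["D", "E", "F"],
    "law": ["H", "I", "J"],
}
pos_train, pos_test, neg_train, neg_test = split_one_vs_rest(corpus, "Marxism")
print(len(pos_train), len(pos_test), len(neg_train), len(neg_test))
```

With only three positives the split is necessarily coarse (two for training, one for testing); on a real corpus the proportions approach 80/20.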
Part of the code for establishing the training set and the test set of a given subject is as follows:
import random
import numpy as np

# train_X_true, train_Y_true, ... are NumPy arrays prepared beforehand
# Draw 64 samples each from the positive-result and negative-result training sets
selected_index = random.sample(range(len(train_Y_true)), k=64)
batch_X_1 = train_X_true[selected_index]
batch_Y_1 = train_Y_true[selected_index]
selected_index = random.sample(range(len(train_Y_false)), k=64)
batch_X_2 = train_X_false[selected_index]
batch_Y_2 = train_Y_false[selected_index]
# Stack negatives first, then positives, into one training batch of 128
batch_X = np.vstack((batch_X_2, batch_X_1))
batch_Y = np.vstack((batch_Y_2, batch_Y_1))

# Draw 64 samples each from the positive-result and negative-result test sets
selected_index = random.sample(range(len(test_Y_true)), k=64)
batch_X_1 = test_X_true[selected_index]
batch_Y_1 = test_Y_true[selected_index]
selected_index = random.sample(range(len(test_Y_false)), k=64)
batch_X_2 = test_X_false[selected_index]
batch_Y_2 = test_Y_false[selected_index]
# Stack negatives first, then positives, into one test batch of 128
test_X = np.vstack((batch_X_2, batch_X_1))
test_Y = np.vstack((batch_Y_2, batch_Y_1))
Advantages: one model is built per subject, rather than one model for several subjects, which optimizes the classification accuracy for each subject; positive and negative results are drawn in equal proportion, so the reported accuracy remains meaningful and is not distorted by an unbalanced ratio of positive to negative results.
Step 4: building a model based on a convolutional neural network;
Part of the code for building the model based on a convolutional neural network is as follows:
# Import the required libraries (TensorFlow 1.x graph API via the compat module)
import tensorflow.compat.v1 as tf
tf.reset_default_graph()
tf.disable_v2_behavior()
from tensorflow import keras as kr
from sklearn import metrics
# seq_length, num_classes, embedding_dim, num_filters, kernel_size, hidden_dim,
# dropout_keep_prob and learning_rate are hyperparameters defined elsewhere
# Placeholders for the input X and the label Y
X_holder = tf.placeholder(tf.int32, [None, seq_length])
Y_holder = tf.placeholder(tf.float32, [None, num_classes])
# Look up the word vector of each word id to form the sentence representation
embedding = tf.get_variable('embedding', [601408, embedding_dim])
embedding_inputs = tf.nn.embedding_lookup(embedding, X_holder)
# One-dimensional convolution (tf.layers.conv1d)
conv = tf.layers.conv1d(embedding_inputs, num_filters, kernel_size)
# Max pooling over the sequence dimension
max_pooling = tf.reduce_max(conv, reduction_indices=[1])
# Fully connected layer
full_connect = tf.layers.dense(max_pooling, hidden_dim)
# Dropout: randomly drop part of the activations
full_connect_dropout = tf.nn.dropout(full_connect,
                                     keep_prob=dropout_keep_prob)
# Activation function
full_connect_activate = tf.nn.relu(full_connect_dropout)
# Fully connected output layer
softmax_before = tf.layers.dense(full_connect_activate, num_classes)
predict_Y = tf.nn.softmax(softmax_before)
# Loss and optimizer
cross_entropy = tf.nn.softmax_cross_entropy_with_logits_v2(labels=Y_holder, logits=softmax_before)
loss = tf.reduce_mean(cross_entropy)
optimizer = tf.train.AdamOptimizer(learning_rate)
# Training op
train = optimizer.minimize(loss)
# Output results
true_result = tf.argmax(Y_holder, 1)
predict_result = tf.argmax(predict_Y, 1)
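To make the tensor shapes in the convolution and pooling layers above easier to follow, here is a small NumPy sketch of a stride-1, unpadded one-dimensional convolution followed by max pooling over the sequence dimension. It assumes the convolution in the listing runs with stride 1 and no padding (which, to the best of my knowledge, are the defaults of `tf.layers.conv1d`) and omits the bias term; `embedding_dim`, `kernel_size` and `num_filters` are hypothetical values, since the text does not state them.

```python
import numpy as np


def conv1d_valid(x, kernels):
    """Stride-1 1-D convolution without padding (cross-correlation, no bias).

    x:       (seq_length, embedding_dim)
    kernels: (kernel_size, embedding_dim, num_filters)
    returns: (seq_length - kernel_size + 1, num_filters)
    """
    kernel_size, _, num_filters = kernels.shape
    steps = x.shape[0] - kernel_size + 1
    out = np.empty((steps, num_filters))
    for t in range(steps):
        window = x[t:t + kernel_size]  # (kernel_size, embedding_dim)
        out[t] = np.tensordot(window, kernels, axes=([0, 1], [0, 1]))
    return out


seq_length = 200   # from the text
embedding_dim = 8  # hypothetical
kernel_size = 5    # hypothetical
num_filters = 4    # hypothetical

rng = np.random.default_rng(0)
x = rng.normal(size=(seq_length, embedding_dim))
kernels = rng.normal(size=(kernel_size, embedding_dim, num_filters))

conv = conv1d_valid(x, kernels)
pooled = conv.max(axis=0)  # one value per filter, like reduce_max over axis 1
print(conv.shape)    # (196, 4)
print(pooled.shape)  # (4,)
```

Pooling over the whole sequence reduces each article to a fixed-size vector of `num_filters` values, which is what the fully connected layers then consume.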
Step 5: training the models;
the model training method comprises: training a total of 13 discipline category ("gate") models and 110 education department primary subject models, with the evaluation index of each model above 90%.
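The text does not say which evaluation index is used for the 90% criterion. As a hedged illustration only, the sketch below computes accuracy, precision and recall for a binary model from true and predicted class indices and checks them against a threshold. The function names, the choice of these three metrics, and the convention that class 1 is the positive result are all assumptions.

```python
def binary_metrics(true_labels, predicted_labels, positive=1):
    """Compute accuracy, precision and recall for a binary classifier."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs) if pairs else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def passes_threshold(metrics, threshold=0.9):
    """True only if every metric is strictly above the threshold."""
    return all(value > threshold for value in metrics.values())


# Hypothetical predictions for ten test samples.
true_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted_labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
scores = binary_metrics(true_labels, predicted_labels)
print(scores, passes_threshold(scores))
```

In this toy case one positive is missed and one negative is misclassified, so the model would fall short of a 90% criterion and need further training.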
Step 6: classifying subjects;
the subject classification method comprises: for an article to be classified into an education department primary subject, it must first satisfy the discipline category ("gate") model to which that subject belongs and then satisfy the primary subject model;
for a journal to be classified into an education department primary subject, at least 60% of its articles must fall under that primary subject.
For example, the primary subject public security falls under the discipline category law. If an article is classified into both the discipline category law and the primary subject public security, the article is considered to belong to the primary subject public security.
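The two-stage rule for articles and the 60% rule for journals can be sketched as follows. This is an illustration under assumptions: each trained binary model is represented only by the positive-class probability it outputs for an article, a probability above 0.5 counts as "satisfied", and the names `belongs_to_primary` and `journal_belongs` are hypothetical.

```python
def belongs_to_primary(scores, category, primary, cutoff=0.5):
    """An article belongs to a primary subject only if both the discipline
    category ("gate") model and the primary subject model accept it."""
    return scores.get(category, 0.0) > cutoff and scores.get(primary, 0.0) > cutoff


def journal_belongs(article_scores, category, primary, min_share=0.6):
    """A journal belongs to a primary subject if at least min_share of its
    articles belong to that primary subject."""
    if not article_scores:
        return False
    hits = sum(belongs_to_primary(s, category, primary) for s in article_scores)
    return hits / len(article_scores) >= min_share


# Hypothetical model outputs for the five articles of one journal.
journal = [
    {"law": 0.93, "public security": 0.88},
    {"law": 0.81, "public security": 0.77},
    {"law": 0.95, "public security": 0.64},
    {"law": 0.90, "public security": 0.20},  # passes the gate model only
    {"law": 0.10, "public security": 0.70},  # fails the gate model
]
print(journal_belongs(journal, "law", "public security"))
```

Requiring the category model to accept an article before the primary subject model is consulted filters out articles that only superficially resemble the primary subject.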
A classification system for education department subjects, comprising:
an establishing module for establishing an annotation data set;
a transcoding module for transcoding the annotation data set;
a training module for establishing a training set and a test set for a given subject;
a building module for building a model based on a convolutional neural network;
a model module for model training;
and a classification module for subject classification.
The establishing module is further configured to establish the annotation data set according to the correspondence between academic papers and Chinese Library Classification (CLC) numbers and the correspondence between CLC numbers and education department subjects (13 education department discipline categories and 110 education department primary subjects). In the final results of this step, a more accurate discrimination data set is obtained, with 10 articles (repeats allowed) for each of the 13 discipline categories and 2 articles (repeats allowed) for each of the 110 primary subjects.
The transcoding module is further configured to collect all words that appear in the acquired articles and build an English dictionary of length 601408, and to convert every article into a 1 x 200 matrix according to the English dictionary.
The training module is further configured to mark all articles of the subject as positive results, taking 80% of them as the positive-result training set and the remaining 20% as the positive-result test set, and to mark all articles not of the subject as negative results, taking 80% of them as the negative-result training set and the remaining 20% as the negative-result test set;
during training, 64 samples are drawn each time from each of the positive-result and negative-result training sets as model training samples, and 64 samples are drawn each time from each of the positive-result and negative-result test sets as model test samples.
The classification module is further configured so that an article to be classified into an education department primary subject must first satisfy the discipline category ("gate") model to which that subject belongs and then satisfy the primary subject model;
for a journal to be classified into an education department primary subject, at least 60% of its articles must fall under that primary subject.
The range of journals recorded by Scival is far smaller than that recorded by insects. In terms of disciplines, Scival distinguishes only 97 primary subjects, whereas the invention distinguishes all 110 primary subjects.
Taking foreign language and literature as an example, Scival includes 151 periodicals in total, which is clearly far fewer than the actual number. The embodiment of the invention records 2651 periodicals in total and can be said to cover most foreign language and literature periodicals. Of the 151 journals covered by Scival, only 2 are not recognized by the embodiment of the invention as foreign language and literature; on checking, these 2 journals indeed should not be classified as foreign language and literature. In addition, 100 periodicals were randomly sampled from the 2502 periodicals recorded by the embodiment but not by Scival and were judged manually; all 100 were confirmed to belong to foreign language and literature, a coverage that Scival does not provide.
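The journal counts in this comparison are consistent with one another, as the short check below shows. It assumes that the Scival journals recognized by the embodiment are all among the 2651 journals the embodiment records; the variable names are illustrative.

```python
scival_journals = 151       # journals Scival lists for the subject
not_recognized = 2          # Scival journals the embodiment rejects
embodiment_journals = 2651  # journals the embodiment records

# Journals found in both collections.
overlap = scival_journals - not_recognized
# Journals recorded by the embodiment but not by Scival.
only_in_embodiment = embodiment_journals - overlap

print(overlap)             # 149
print(only_in_embodiment)  # 2502
```

The result matches the 2502 journals from which the 100-journal manual sample was drawn.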
While the embodiments of the present invention have been described above in terms of procedures illustrated by the embodiments, it will be understood by those skilled in the art that the present disclosure is not limited to the embodiments described above, and that various changes, modifications, and substitutions can be made without departing from the scope of the embodiments of the present invention.
Claims (10)
1. A classification method of a classification system for education department subjects, comprising:
step 1: establishing an annotation data set;
step 2: transcoding the annotation data set;
step 3: establishing a training set and a test set for a given subject;
step 4: building a model based on a convolutional neural network;
step 5: training the model;
step 6: classifying subjects.
2. The classification method of a classification system for education department subjects according to claim 1, wherein the method for establishing the annotation data set comprises: establishing the annotation data set according to the correspondence between academic papers and Chinese Library Classification (CLC) numbers and the correspondence between CLC numbers and education department subjects.
3. The classification method of a classification system for education department subjects according to claim 1, wherein the method for transcoding the annotation data set comprises: collecting all words that appear in the acquired articles and building an English dictionary of length 601408; converting every article into a 1 x 200 matrix according to the English dictionary.
4. The classification method of a classification system for education department subjects according to claim 1, wherein the method for establishing a training set and a test set for a given subject comprises: marking all articles of the subject as positive results, taking 80% of them as the positive-result training set and the remaining 20% as the positive-result test set; marking all articles not of the subject as negative results, taking 80% of them as the negative-result training set and the remaining 20% as the negative-result test set;
during training, 64 samples are drawn each time from each of the positive-result and negative-result training sets as model training samples, and 64 samples are drawn each time from each of the positive-result and negative-result test sets as model test samples.
5. The classification method of a classification system for education department subjects according to claim 1, wherein the model training method comprises: training a total of 13 discipline category ("gate") models and 110 education department primary subject models, with the evaluation index of each model above 90%.
6. The classification method of a classification system for education department subjects according to claim 1,
wherein the subject classification method comprises: for an article to be classified into an education department primary subject, it must first satisfy the discipline category ("gate") model to which that subject belongs and then satisfy the primary subject model;
for a journal to be classified into an education department primary subject, at least 60% of its articles must fall under that primary subject.
7. A classification system for education, comprising:
the establishing module is used for establishing a marking data set;
the transcoding module is used for transcoding the marked data set;
the training module is used for establishing a training set and a testing set of a certain subject;
the building module is used for building a model based on a convolutional neural network;
the model module is used for model training;
and the classification module is used for subject classification.
8. The system according to claim 7, wherein the establishing module is further configured to establish the labeled data set according to the correspondence between the academic paper and the Chinese national library classification number and the correspondence between the Chinese national library classification number and the education department subject.
9. The system for classifying education sections according to claim 7, wherein the transcoding module is further configured to obtain all words appearing in all the articles according to the obtained articles, and create an English dictionary with length of 601408; all articles are converted to a1 x 200 matrix according to the english dictionary.
10. The classification system for an education department according to claim 7, wherein the training module is further configured to label all articles of the subject as positive results, extracting 80% as a positive result training set and the remaining 20% as a positive result testing set, and to label all articles not of the subject as negative results, extracting 80% as a negative result training set and the remaining 20% as a negative result testing set;
in the training process, 64 articles are drawn each time from each of the positive result training set and the negative result training set for model training, and 64 articles are drawn each time from each of the positive result testing set and the negative result testing set for model testing;
the classification module is further configured such that, to classify an article into a primary subject of the education department, the article must first satisfy the gate model to which that subject belongs and then satisfy the primary subject model of the education department;
and to classify a journal into a primary subject of the education department, at least 60% of its articles must fall under that primary subject.
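The 80/20 split and the balanced draw of 64 positive and 64 negative articles described in claim 10 can be sketched as below. This is an illustration only: the helper names `split_80_20` and `sample_batch` are hypothetical, shuffling before the split and sampling without replacement within a batch are assumptions, and each set is assumed to hold at least 64 items.

```python
import random
from typing import List, Sequence, Tuple

PER_CLASS = 64  # articles drawn from each of the positive and negative sets


def split_80_20(items: Sequence, rng: random.Random) -> Tuple[List, List]:
    """Shuffle a copy of the items and return (80% training part, 20% testing part)."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = len(shuffled) * 4 // 5
    return shuffled[:cut], shuffled[cut:]


def sample_batch(positives: Sequence, negatives: Sequence, rng: random.Random,
                 per_class: int = PER_CLASS) -> List[Tuple[object, int]]:
    """Draw per_class positives (label 1) and per_class negatives (label 0), then mix them."""
    pos = rng.sample(list(positives), per_class)
    neg = rng.sample(list(negatives), per_class)
    batch = [(x, 1) for x in pos] + [(x, 0) for x in neg]
    rng.shuffle(batch)
    return batch
```

The same `sample_batch` call serves both phases: with the two training sets it yields a training batch, and with the two testing sets it yields a testing batch.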
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111030674.2A CN113837240A (en) | 2021-09-03 | 2021-09-03 | Classification system and classification method for education department |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113837240A (en) | 2021-12-24 |
Family
ID=78962197
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111030674.2A Pending CN113837240A (en) | 2021-09-03 | 2021-09-03 | Classification system and classification method for education department |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113837240A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101819601A (en) * | 2010-05-11 | 2010-09-01 | Tongfang Knowledge Network (Beijing) Technology Co., Ltd. | Method for automatically classifying academic documents |
CN110516064A (en) * | 2019-07-11 | 2019-11-29 | Tongji University | A kind of Aeronautical R&D paper classification method based on deep learning |
CN110990376A (en) * | 2019-11-20 | 2020-04-10 | Agricultural Information Institute, Chinese Academy of Agricultural Sciences | Subject classification automatic indexing method based on multi-factor mixed sorting mechanism |
US20210012199A1 (en) * | 2019-07-04 | 2021-01-14 | Zhejiang University | Address information feature extraction method based on deep neural network model |
CN112434159A (en) * | 2020-11-17 | 2021-03-02 | Southeast University | Method for classifying thesis multiple labels by using deep neural network |
CN112989070A (en) * | 2020-06-17 | 2021-06-18 | Zhejiang University | Core periodical quantitative evaluation system and method based on computer system |
Non-Patent Citations (1)
Title |
---|
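Guo Limin: "Research on Automatic Document Classification Based on Convolutional Neural Networks" (基于卷积神经网络的文献自动分类研究), 图书与情报 (Library and Information), no. 6, pages 96 - 103 * |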
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||