WO2020147238A1 - Keyword determination method, automatic scoring method, apparatus and device, and medium

Info

Publication number: WO2020147238A1
Application number: PCT/CN2019/088544
Authority: WO
Inventors: 金戈; 徐亮
Original assignee: 平安科技（深圳）有限公司
Priority date: 2019-01-18
Filing date: 2019-05-27
Publication date: 2020-07-23
Also published as: CN109829155B; CN109829155A

Abstract

A keyword determination method, an automatic scoring method, apparatus and device, and a medium. The keyword determination method comprises: acquiring first sample answer data, and performing word segmentation processing and summarization on sample answer information in the first sample answer data to obtain a set of segmented sample words; performing feature conversion on the sample answer information to obtain a sample training feature; training a decision tree model according to the sample training feature and a first score to obtain a decision tree sample model; and extracting a sample keyword from the decision tree sample model. The automatic scoring method comprises: extracting a keyword from answer information to be scored to obtain a core keyword; and performing feature conversion on the core keyword by means of a target examination point to obtain an examination point feature to be scored, and then inputting same into a decision tree reference model to obtain an accurate score of the answer information to be scored. Thus, the keyword generalization capability and accuracy are improved; and efficient and accurate scoring of answer content of an examinee is realized.

Description

Keyword determination method, automatic scoring method, device, equipment and medium

This application is based on, and claims priority to, the Chinese invention patent application No. 201910049180.5, filed on January 18, 2019 and titled "a method for determining keywords, automatic scoring methods, devices, equipment and storage media".

Technical field

This application relates to the field of intelligent decision-making, and in particular to a method for determining keywords, an automatic scoring method, device, equipment, and storage medium.

Background

As society develops and competition intensifies, examinations have become a conventional means of measuring how much knowledge and skill a person has acquired, and systems for scoring test takers' answers have developed alongside them. With advances in computer technology, objective questions can now be scored online, fully automatically and in real time. Answers to subjective questions, however, involve a degree of randomness and recall, so scoring them by computer in the same way as objective questions easily leads to misjudgments or errors. Manual scoring, on the other hand, becomes very laborious and hard to carry out when the number of candidates is large. At present, answers to subjective questions are usually scored by manually establishing test sites and related keywords in advance from the scoring rule information, and then using regular matching against those test sites and keywords to identify the answer content and score it. However, test sites and keywords that are determined only from the scoring rule information, without considering other candidates' answers to the same subjective question, have low generalization ability and low accuracy. As a result, the final scores are biased and cannot reflect the examinees' real level.

Summary of the invention

The embodiments of the present application provide a method, apparatus, device, and storage medium for determining keywords, to solve the problem of low keyword generalization ability and low accuracy.

The embodiments of the present application provide an automatic scoring method, device, equipment, and storage medium to solve the problem that the test taker’s answer content cannot be efficiently and accurately scored.

A method for determining keywords, including:

acquiring N first sample answer data, where each first sample answer data includes sample answer information and a first score value, and N is a positive integer;

performing word segmentation processing on the sample answer information of each first sample answer data to obtain the sample word segmentation of each first sample answer data;

summarizing the sample word segmentation of each first sample answer data to obtain a sample word segmentation set;

using the sample word segmentation set to perform feature conversion on the sample answer information of each first sample answer data to obtain sample training features;

training a decision tree model according to the sample training features and the corresponding first score values to obtain a decision tree sample model; and

extracting sample keywords from the decision tree sample model.

An automatic scoring method, including:

acquiring answer information to be scored;

performing keyword extraction on the answer information to be scored to obtain core keywords;

using a target test site to perform feature conversion on the core keywords to obtain test site features to be scored, where the target test site is obtained by using the keyword determination method of claim 2; and

inputting the test site features to be scored into a preset decision tree reference model to obtain an accurate score for the answer information to be scored.

A keyword determining device includes:

The first sample answer data acquisition module is used to acquire N first sample answer data, each of the first sample answer data includes sample answer information and a first score value, and N is a positive integer;

The word segmentation processing module is configured to perform word segmentation processing on the sample answer information of each of the first sample answer data to obtain the sample word segmentation of each of the first sample answer data;

The total vocabulary segmentation module is used to summarize the sample segmentation of each of the first sample answer data to obtain a sample segmentation set;

A sample feature conversion module, configured to use the sample word segmentation set to perform feature conversion on the sample answer information of each of the first sample answer data to obtain sample training features;

The decision tree sample model training module is used to train the decision tree model according to the sample training feature and the corresponding first score value to obtain the decision tree sample model;

The sample keyword extraction module is used to extract sample keywords from the decision tree sample model.

An automatic scoring device, including:

To-be-graded answer information acquisition module, used to obtain the to-be-graded answer information;

The keyword extraction module is used to extract keywords from the answer information to be scored to obtain core keywords;

The to-be-scored test site feature conversion module is used to perform feature conversion on the core keywords by using the target test site to obtain the test site features to be scored; wherein, the target test site is obtained by using the method for determining keywords according to claim 2;

The input module is used to input the characteristics of the test site to be scored into a preset decision tree reference model to obtain an accurate score of the answer information to be scored.

A computer device, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, where the processor implements the steps of the above method for determining keywords, or the steps of the above automatic scoring method, when executing the computer-readable instructions.

A computer-readable storage medium storing computer-readable instructions, where the computer-readable instructions, when executed by a processor, implement the steps of the above method for determining keywords or the steps of the above automatic scoring method.

The details of one or more embodiments of the present application are set forth in the following drawings and description, and other features and advantages of the present application will become apparent from the description, drawings, and claims.

Brief description of the drawings

In order to explain the technical solutions of the embodiments of the present application more clearly, the drawings required in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application; those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.

FIG. 1 is a schematic diagram of an application environment of a method for determining keywords or an automatic scoring method in an embodiment of the present application;

FIG. 2 is an example diagram of a method for determining keywords in an embodiment of the present application;

FIG. 3 is another example diagram of a method for determining keywords in an embodiment of the present application;

FIG. 4 is a functional block diagram of a keyword determining device in an embodiment of the present application;

FIG. 5 is another functional block diagram of a keyword determining device in an embodiment of the present application;

FIG. 6 is an example diagram of an automatic scoring method in an embodiment of the present application;

FIG. 7 is another example diagram of an automatic scoring method in an embodiment of the present application;

FIG. 8 is another example diagram of an automatic scoring method in an embodiment of the present application;

FIG. 9 is a functional block diagram of an automatic scoring device in an embodiment of the present application;

FIG. 10 is a schematic diagram of a computer device in an embodiment of the present application.

Detailed description

The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are part of the embodiments of the present application, but not all the embodiments. Based on the embodiments in the present application, all other embodiments obtained by a person of ordinary skill in the art without creative work fall within the scope of protection of the present application.

The embodiment of the present application provides a method for determining keywords, which can be applied in the application environment shown in FIG. 1. Specifically, the method is applied in a keyword determination system that includes the client and the server shown in FIG. 1. The client and the server communicate over a network to solve the problem that test-site keywords determined from the scoring rule information have low generalization ability and low accuracy. The client, also called the user side, is the program that corresponds to the server and provides local services to the user. The client can be installed on, but is not limited to, various personal computers, notebook computers, smartphones, tablets, and portable wearable devices. The server can be implemented as an independent server or as a server cluster composed of multiple servers.

In an embodiment, as shown in FIG. 2, a method for determining keywords is provided. Taking the method as applied to the server in FIG. 1 as an example, it includes the following steps:

S11: Acquire N first sample answer data, each first sample answer data includes sample answer information and a first score value, and N is a positive integer.

Here, the first sample answer data refers to a test taker's answer data. Each first sample answer data includes sample answer information and the corresponding first score value obtained after preliminary scoring of that sample answer information. The sample answer information is a candidate's answer to a given subjective question, obtained from the answer text in the scoring system. Optionally, the first sample answer data can be obtained from a scoring system, which performs preliminary scoring on the sample answer information to obtain the first score value. Alternatively, the answer written by the examinee on a paper answer sheet can be scanned and recognized, and the resulting answer text submitted to the scoring system to obtain the sample answer information. The first score value is the score obtained after preliminary scoring of the sample answer information, either manually or by computer. The first sample answer data may also be obtained by scanning and recognizing the answer written on the paper answer sheet and scoring it manually. In addition, a first sample answer data may include a single sample answer information and its first score value, or multiple sample answer information together with the corresponding first score value obtained for each after preliminary scoring.

The number of the first sample answer data obtained is N, where N is a positive integer. The specific value of N can be set according to actual needs. The higher the value of N, the higher the accuracy of subsequent sample keyword extraction, but the extraction efficiency will decrease, and the selection of N can be comprehensively considered in terms of accuracy and efficiency.

S12: Perform word segmentation processing on sample answer information of each first sample answer data to obtain sample segmentation of each first sample answer data.

Here, the sample word segmentation refers to the individual word segments obtained after word segmentation processing is performed on the sample answer information of each first sample answer data. Specifically, a word segmentation algorithm is first used to split the sample answer information of each first sample answer data into words. Optionally, the algorithm may be based on string matching, on understanding, or on statistics. Alternatively, the sample answer information can be split automatically with the split function of the Java language, or by importing it into software with an automatic character-splitting function, such as EXCEL or PPT. The split sample answer information is then filtered with Java regular expressions to remove specific words that carry no meaning, such as auxiliary words, modal particles, or conjunctions. Finally, the sample word segmentation of each first sample answer data is obtained.
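Step S12 can be illustrated with a minimal sketch. The text mentions Java's split function and Java regular expressions; the sketch below uses Python instead, splits English text with a simple regular expression rather than a dedicated word segmentation algorithm, and relies on a hypothetical stop-word list (`STOP_WORDS`) and a hypothetical helper name (`segment`), none of which come from the application.

```python
import re

# Hypothetical stop-word list standing in for the "meaningless" auxiliary
# words, modal particles and conjunctions mentioned in the text.
STOP_WORDS = {"a", "an", "the", "is", "was", "in", "of", "and", "or"}


def segment(answer):
    """Split an answer into lowercase tokens and drop the stop words."""
    tokens = re.findall(r"[A-Za-z0-9]+", answer.lower())
    return [token for token in tokens if token not in STOP_WORDS]


# Two illustrative sample answers (assumed data, not from the application).
sample_answers = [
    "Du Fu is a great realist poet in the Tang Dynasty.",
    "Du Fu was a poet of the Tang Dynasty and a realist.",
]
sample_segmentations = [segment(answer) for answer in sample_answers]
```

For Chinese answer text, a real word segmentation algorithm would replace the regular-expression split; the filtering step would stay the same.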

S13: Summarize the sample word segmentation of each first sample answer data to obtain a sample word segmentation set.

Here, the sample word segmentation set refers to the set obtained by summarizing the sample word segmentation of every first sample answer data. Specifically, the sample word segmentation of each first sample answer data is obtained and then summarized to obtain the sample word segmentation set. Preferably, if each first sample answer data contains multiple sample answer information, the summarization takes each sample answer information as the unit, that is, each sample answer information has its corresponding sample word segmentation set.

Specifically, summarizing the sample word segmentation of each first sample answer data includes: obtaining the sample word segmentation of the sample answer information in each first sample answer data, assigning each obtained sample word segment an identification number in ascending order, and finally obtaining a sample word segmentation set ordered from the smallest to the largest identification number. For example, the sample word segmentation set is $E = \{e_1, e_2, e_3, \ldots, e_r\}$, where $e_1, e_2, e_3, \ldots, e_r$ are the sample word segments contained in the set and $1, 2, 3, \ldots, r$ are the identification numbers corresponding to each sample word segment.

Preferably, if the sample word segmentation obtained for the first sample answer data contains repeated sample word segments, the sample word segmentation of each first sample answer data is deduplicated before summarization, and the deduplicated sample word segmentation is then summarized to obtain the sample word segmentation set. Specifically, the Count function, the Editor editor, or the R language can be used for deduplication. Alternatively, the sample word segmentation of each first sample answer data can be imported directly into an EXCEL table and deduplicated automatically with the advanced filter function of EXCEL.
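The summarization and deduplication in step S13 can be sketched as follows. This is a minimal illustration under assumptions: the function name `build_segmentation_set` and the input data are hypothetical, and a Python dictionary is used for deduplication in place of the tools named in the text.

```python
def build_segmentation_set(sample_segmentations):
    """Merge per-answer word segments, skip repeats, and number them from 1."""
    word_to_id = {}
    for words in sample_segmentations:
        for word in words:
            if word not in word_to_id:
                # Identification numbers are assigned in ascending order.
                word_to_id[word] = len(word_to_id) + 1
    return word_to_id


# Illustrative input: the word segments of two sample answers.
segmentations = [
    ["du", "fu", "poet", "tang"],
    ["du", "fu", "realist", "poet"],
]
segmentation_set = build_segmentation_set(segmentations)
```

The resulting mapping plays the role of the set E, with each word segment keyed to its identification number.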

S14: Use the sample word segmentation set to perform feature transformation on the sample answer information of each first sample answer data to obtain sample training features.

Here, the sample training feature refers to the result output after feature conversion of the sample answer information of each first sample answer data. Specifically, a bag-of-words model is established, and the sample word segmentation set is used to convert the sample answer information of each first sample answer data into sample training features. In this embodiment, the bag-of-words model records which sample word segments in the sample word segmentation set appear in the sample answer information of each first sample answer data. The bag-of-words model can be established with CountVectorizer in SKLearn. CountVectorizer is a common feature-value calculation method: for each training text, it considers only the frequency of each word in that text. CountVectorizer converts a document into a vector by counting, is trained on the extracted vocabulary, and generates a CountVectorizerModel that stores the corresponding vocabulary vector space.

Specifically, using the sample word segmentation set to perform feature conversion on the sample answer information of each first sample answer data includes: first establishing a word vector whose length is the number of sample word segments in the sample word segmentation set, and then using the regular matching method to match the sample answer information of each first sample answer data against every sample word segment in the set. If the sample answer information matches a sample word segment, the corresponding element of the word vector is 1; if it does not, the corresponding element is 0. The result is a word vector composed of 1s and 0s, which is the sample training feature.

For example, suppose a sample word segmentation set contains five sample word segments $B_1$, $B_2$, $C_1$, $C_2$, and $C_3$, and there are two sample answer information B and C, where B contains the two word segments $B_1$ and $B_2$, and C contains the three word segments $C_1$, $C_2$, and $C_3$. Converting sample answer information B with this sample word segmentation set gives the sample training feature [1,1,0,0,0]; converting sample answer information C gives the sample training feature [0,0,1,1,1].

Here, the regular matching method applies regular expressions to test strings. A regular expression is a logical formula for operating on strings: predefined specific characters, and combinations of them, form a "rule string" that expresses a filtering logic for strings. In other words, a regular expression is a text pattern that describes one or more strings to be matched when searching text.
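The feature conversion of step S14 can be sketched with the example of sample answer information B and C. This is a minimal illustration, assuming the word segments are written as the plain strings "B1" to "C3" and that a match means the word segment occurs anywhere in the answer text; the function name `to_training_feature` is hypothetical.

```python
import re


def to_training_feature(answer, segmentation_set):
    """Return a 0/1 word vector: 1 where the word segment matches the answer."""
    return [
        1 if re.search(re.escape(word), answer) else 0
        for word in segmentation_set
    ]


# The five sample word segments of the example, in identification-number order.
segmentation_set = ["B1", "B2", "C1", "C2", "C3"]

feature_b = to_training_feature("B1 B2", segmentation_set)     # answer B
feature_c = to_training_feature("C1 C2 C3", segmentation_set)  # answer C
```

Because the match is a plain substring search, a word segment that is a prefix of a longer token would also match; a stricter pattern would be needed if that matters for the data at hand.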

S15: Train the decision tree model according to the sample training characteristics and the corresponding first score value to obtain the decision tree sample model.

Here, the decision tree sample model refers to the model generated by training the decision tree model, based on the bag-of-words features, with the sample training features and the corresponding first score values. Specifically, the decision tree sample model is established by inputting the sample training features and the corresponding first score values into the decision tree model and training the decision tree model with the C4.5 algorithm to generate the trained decision tree sample model. The C4.5 algorithm is a family of algorithms used for classification problems in machine learning and data mining, and it performs supervised learning. Given a data set in which each tuple is described by a set of attribute values and belongs to exactly one of a set of mutually exclusive categories, C4.5 learns a mapping from attribute values to categories, and this mapping can be used to classify new entities whose category is unknown.

Further, before the decision tree sample model is established, its size needs to be determined; the size is determined by the depth of the decision tree and the number of samples per node. Optionally, in this embodiment, to prevent the decision tree sample model from overfitting and to ensure its accuracy, the maximum depth of the decision tree is set to 5, the minimum number of samples per leaf node is set to 50, and the splitting criterion is entropy.
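The entropy-based splitting that C4.5 relies on can be illustrated with a small sketch. It computes only the gain ratio that C4.5 uses to choose a split feature; it does not build a full tree, and the stopping conditions mentioned in the text (maximum depth 5, at least 50 samples per leaf) are not implemented. The function names and the toy data are assumptions for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def gain_ratio(features, labels, index):
    """Gain ratio of splitting on the 0/1 feature in column `index`."""
    groups = {}
    for row, label in zip(features, labels):
        groups.setdefault(row[index], []).append(label)
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy([row[index] for row in features])
    return info_gain / split_info if split_info > 0 else 0.0


# Toy sample training features (0/1 word vectors) and first score values.
features = [[1, 1, 0], [1, 0, 0], [0, 1, 1], [0, 0, 1]]
scores = [2, 2, 0, 0]

ratios = [gain_ratio(features, scores, i) for i in range(3)]
best_index = max(range(3), key=lambda i: ratios[i])
```

In this toy data the first feature separates the two score values perfectly, so it has the highest gain ratio and would be chosen for the root split; the second feature carries no information about the score.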

S16: Extract sample keywords from the decision tree sample model.

Here, the sample keyword refers to the feature attribute value corresponding to each output node of the decision tree sample model. Extracting sample keywords is also called feature value extraction of the decision tree sample model. Since each feature of the decision tree sample model is a decision attribute in the model, each feature value corresponds to a branch of that decision attribute. Understandably, the output node of each branch in the decision tree sample model has a corresponding sample keyword.

Specifically, sample keywords can be extracted from the decision tree sample model by first reading the decision tree sample model as a sourcable object, then converting the decision tree sample model into code through the tosource method, and finally analyzing the code structure to obtain the sample keywords output by the decision tree sample model.
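The idea of reading keywords off a trained tree can be sketched without the tosource approach described above. The sketch below is an alternative illustration, not the method of the text: it assumes the trained tree is available as a hypothetical nested dictionary in which each internal node stores the index of the word segment it splits on, and it walks the tree to collect those word segments.

```python
def collect_keywords(node, vocabulary, found=None):
    """Depth-first walk collecting the word segment behind every split node."""
    if found is None:
        found = []
    if "feature" in node:  # internal node; leaves carry only a "score"
        word = vocabulary[node["feature"]]
        if word not in found:
            found.append(word)
        collect_keywords(node["absent"], vocabulary, found)
        collect_keywords(node["present"], vocabulary, found)
    return found


# Hypothetical vocabulary and tree structure, for illustration only.
vocabulary = ["du", "fu", "poet", "tang", "realist"]
tree = {
    "feature": 3,  # splits on "tang"
    "absent": {"score": 0},
    "present": {
        "feature": 4,  # splits on "realist"
        "absent": {"score": 1},
        "present": {
            "feature": 2,  # splits on "poet"
            "absent": {"score": 2},
            "present": {"score": 3},
        },
    },
}
sample_keywords = collect_keywords(tree, vocabulary)
```

Only word segments that the tree actually splits on are returned, which is why the extracted keywords are a subset of the sample word segmentation set.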

In this embodiment, N first sample answer data are obtained, each including sample answer information and a first score value; word segmentation is performed on the sample answer information of each first sample answer data to obtain a sample word segmentation set; the sample word segmentation set is used to perform feature conversion on the sample answer information of each first sample answer data to obtain sample training features; the decision tree model is trained on the sample training features and the corresponding first score values to obtain the decision tree sample model; and finally the sample keywords are extracted from the decision tree sample model. This improves the generalization ability and accuracy of the test-site keywords, makes the keywords more comprehensive, and improves the accuracy of subsequent scoring.

In one embodiment, as shown in FIG. 3, after the sample keywords are extracted from the decision tree sample model, the method for determining the keywords further includes the following steps:

S17: Obtain scoring rule information, where the scoring rule information includes preset test sites and preset keywords corresponding to each preset test site.

Among them, the scoring rule information refers to the basic scoring basis provided by the business party, including preset test sites and preset keywords corresponding to each preset test site. The preset test point refers to the knowledge point provided by the business side to judge whether the test taker’s answer information is correct. The preset test sites include the wrong test sites for judging candidates' wrong answers and the correct test sites for judging candidates' correct answers. Understandably, the scoring rule information is a preliminary scoring standard, and there may be a problem that the keywords are not accurate or comprehensive. Optionally, the preset test site can be a word, a sentence, or a paragraph. In addition, in this implementation, in order to facilitate the distinction between different preset test sites, each preset test site may be given a different mark in advance. Specifically, the identifier corresponding to each preset test site may be represented by at least one of Arabic numerals, English capital letters, or English lowercase letters. Each preset test site contains corresponding preset keywords. The preset keywords refer to words that are extracted from the preset test sites and can be directly used for rule quantification. Understandably, a preset test site contains at least one preset keyword. For example, the preset test site 1 is: Du Fu is a great realist poet in the Tang Dynasty; the preset keywords corresponding to the preset test site 1 can be "Du Fu", "Tang Dynasty", "realism" and "poet".

S18: Remove keywords that are repeated with preset keywords from the sample keywords to obtain target keywords.

Here, the target keywords refer to the keywords among the sample keywords that differ from the preset keywords. Specifically, the sample keywords can be compared with the preset keywords one by one using a character comparison function in C++; the sample keywords that are identical to a preset keyword are then removed according to the comparison result, and the remaining sample keywords, which differ from the preset keywords, are taken as the target keywords.
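Step S18 amounts to a filter over the sample keywords. The text mentions a character comparison function in C++; the minimal sketch below does the same comparison in Python with a set lookup. The function name `remove_preset` and the placeholder keywords are assumptions for illustration.

```python
def remove_preset(sample_keywords, preset_keywords):
    """Keep only the sample keywords that are not already preset keywords."""
    preset = set(preset_keywords)
    return [keyword for keyword in sample_keywords if keyword not in preset]


# Placeholder keywords: a1 and b2 already exist as preset keywords.
sample_keywords = ["a1", "a4", "b2", "b5"]
preset_keywords = ["a1", "a2", "a3", "b1", "b2", "b3"]

target_keywords = remove_preset(sample_keywords, preset_keywords)
```

The comparison here is exact string equality, so variants of a preset keyword (different case or spelling) would be kept as target keywords.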

S19: Send the target keywords to the client, and obtain the test site label returned by the client according to the target keywords.

Here, the test site label refers to a label that assigns each acquired target keyword an identification number according to the preset test sites. Specifically, after the target keywords are sent to the client, the user can analyze them and assign each target keyword the same identification number as the corresponding preset test site; the resulting test site label is then sent to the server. Preferably, the test site labels corresponding to all target keywords are generated together and then sent to the server.

S20: Add each target keyword to the corresponding preset test site according to the test site label to obtain the target test site.

Here, the target test site refers to a test site after the target keywords have been added. Specifically, after the server receives the test site label text sent by the client, it adds each target keyword to the preset test site with the same identification number, according to the identification number corresponding to that target keyword in the test site label text. Understandably, the keywords contained in the target test site are richer and more comprehensive than the keywords contained in the preset test site.

For example, suppose there are preset test site 1 and preset test site 2; preset test site 1 includes three preset keywords $a_1$, $a_2$, and $a_3$, and preset test site 2 includes three preset keywords $b_1$, $b_2$, and $b_3$. The target keywords obtained in step S18 are $a_4$, $a_5$, $b_4$, and $b_5$; target keywords $a_4$ and $a_5$ are assigned test site label 1, and target keywords $b_4$ and $b_5$ are assigned test site label 2. According to the test site labels, $a_4$ and $a_5$ are added to preset test site 1, and $b_4$ and $b_5$ are added to preset test site 2. Finally, target test site 1 includes the keywords $a_1$, $a_2$, $a_3$, $a_4$, and $a_5$, and target test site 2 includes the keywords $b_1$, $b_2$, $b_3$, $b_4$, and $b_5$.
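The example for step S20 can be reproduced with a short sketch. The data structures are assumptions: preset test sites are modeled as a dictionary from identification number to keyword list, and the test site labels returned by the client as a dictionary from target keyword to identification number; the function name `build_target_test_sites` is hypothetical.

```python
def build_target_test_sites(preset_test_sites, test_site_labels):
    """Add each target keyword to the preset test site named by its label."""
    # Copy the keyword lists so the preset test sites stay unchanged.
    target = {site_id: list(words) for site_id, words in preset_test_sites.items()}
    for keyword, site_id in test_site_labels.items():
        target[site_id].append(keyword)
    return target


preset_test_sites = {1: ["a1", "a2", "a3"], 2: ["b1", "b2", "b3"]}
test_site_labels = {"a4": 1, "a5": 1, "b4": 2, "b5": 2}

target_test_sites = build_target_test_sites(preset_test_sites, test_site_labels)
```

A label that points to an identification number with no preset test site raises a `KeyError` in this sketch; how such a case should be handled is not specified in the text.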

In this embodiment, scoring rule information is obtained, which includes the preset test sites and the preset keywords corresponding to each preset test site; keywords that duplicate the preset keywords are removed from the sample keywords to obtain the target keywords; the target keywords are sent to the client, and the test site label returned by the client according to the target keywords is obtained; and finally each target keyword is added to the corresponding preset test site according to the test site label to obtain the target test site. This further enriches the keywords contained in the test sites determined from the scoring rule information.

It should be understood that the size of the sequence numbers of the steps in the above embodiments does not mean the order of execution, and the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.

In one embodiment, a keyword determining device is provided, which corresponds one-to-one to the keyword determining method in the foregoing embodiment. As shown in FIG. 4, the keyword determining device includes a first sample answer data acquisition module 11, a word segmentation processing module 12, a total vocabulary module 13, a sample feature conversion module 14, a decision tree sample model training module 15, and a sample keyword extraction module 16. The detailed description of each functional module is as follows:

The first sample answer data acquisition module 11 is used to acquire N first sample answer data, each first sample answer data includes sample answer information and a first score value, and N is a positive integer;

The word segmentation processing module 12 is used to perform word segmentation processing on the sample answer information of each first sample answer data to obtain the sample word segmentation of each first sample answer data;

The total vocabulary module 13 is used to summarize the sample word segmentation of each first sample answer data to obtain a sample word segmentation set;

The sample feature conversion module 14 is used to use the sample word segmentation set to perform feature conversion on the sample answer information of each first sample answer data to obtain sample training features;

The decision tree sample model training module 15 is used to train the decision tree model according to the sample training characteristics and the corresponding first score value to obtain the decision tree sample model;

The sample keyword extraction module 16 is used to extract sample keywords from the decision tree sample model.
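The data flow through modules 12 to 14 can be sketched as below. The sketch assumes that feature conversion produces a binary presence vector over the sample word segmentation set, and it uses a whitespace split as a stand-in for a real word segmentation algorithm; both choices, and all function names, are illustrative assumptions rather than details taken from this description.

```python
def segment(text):
    # Stand-in for a real word segmentation algorithm: lowercase, split on whitespace.
    return text.lower().split()


def build_vocabulary(segmented_answers):
    # Summarize all sample word segmentations into one sorted, de-duplicated list.
    return sorted({word for words in segmented_answers for word in words})


def to_features(words, vocabulary):
    # One element per vocabulary entry: 1 if the answer contains the word, else 0.
    present = set(words)
    return [1 if word in present else 0 for word in vocabulary]


answers = ["force equals mass times acceleration", "mass times velocity"]
segmented = [segment(a) for a in answers]
vocabulary = build_vocabulary(segmented)
features = [to_features(words, vocabulary) for words in segmented]
```

Each feature vector would then be paired with the first score value of the same answer and passed to the decision tree sample model training module.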

Preferably, as shown in Fig. 5, the keyword determining device further includes:

The scoring rule information obtaining module 17 is used to obtain scoring rule information, the scoring rule information includes preset test sites and preset keywords corresponding to each preset test site;

The repetitive keyword removal module 18 is used to remove keywords that are repeated with preset keywords from the sample keywords to obtain target keywords;

The test center label obtaining module 19 is used to send the target keyword to the client, and obtain the test center label returned by the client according to the target keyword;

The target keyword adding module 20 is used to add each target keyword to the corresponding preset test site according to the test site tag to obtain the target test site.

For the specific limitations of the keyword determining device, refer to the above limitations on the keyword determining method, which will not be repeated here. The modules in the above keyword determining device can be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or independent of, the processor of the computer device in the form of hardware, or may be stored in the memory of the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.

The embodiment of the present application also provides an automatic scoring method, which can be applied in the application environment shown in FIG. 1. Specifically, the automatic scoring method is applied in an automatic scoring system. The automatic scoring system includes a client and a server as shown in FIG. 1. The client and the server communicate through a network, so that the examinee's answers can be scored efficiently and accurately. Among them, the client, also called the user terminal, refers to the program that corresponds to the server and provides local services to the user. The client can be installed on, but is not limited to, various personal computers, notebook computers, smart phones, tablets, and portable wearable devices. The server can be implemented as an independent server or as a server cluster composed of multiple servers.

In an embodiment, as shown in FIG. 6, an automatic scoring method is provided, and the method is applied to the server in FIG. 1 as an example for description, including the following steps:

S21: Obtain the answer information to be graded.

Among them, the answer information to be graded refers to the answer information obtained from the examinee's answer text. Specifically, the answer information of any examinee can be obtained directly from the answer text in the grading system, or it can be obtained by scanning and recognizing the answer that the examinee wrote on a paper answer sheet.

S22: Perform keyword extraction on the response information to be scored to obtain core keywords.

Among them, the core keywords refer to the individual keywords extracted from the obtained answer information to be scored. Specifically, performing keyword extraction on the answer information to be scored includes the following. First, a word segmentation algorithm is used to split the answer information to be scored into words. Optionally, the word segmentation algorithm may be based on string matching, on understanding, or on statistics. Preferably, the answer information to be scored can also be split automatically through the split function of the Java language, or by importing it into EXCEL or PPT software with an automatic character splitting function. Then, regular expressions of the Java language are used to filter the split answer information, removing words that carry no meaning of their own, such as auxiliary words, modal particles, and conjunctions. Finally, the words that remain after filtering are taken as the core keywords. In this embodiment, the number of core keywords should be no less than one.
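The split-then-filter procedure of step S22 can be sketched as follows. The description names the Java split function and Java regular expressions; this sketch uses the Python equivalents for brevity. The stop-word list is a hypothetical placeholder, and splitting on non-word characters only suits space-delimited text, so a real system would substitute a proper word segmentation algorithm.

```python
import re

# Hypothetical stop-word list; a real system would use a fuller, language-specific one.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "is", "to"}

# One regular expression that matches any single stop word in full.
STOP_PATTERN = re.compile("|".join(re.escape(w) for w in sorted(STOP_WORDS)))


def extract_core_keywords(answer_text):
    # Split on runs of non-word characters (whitespace and punctuation),
    # discarding the empty strings that appear at the edges.
    tokens = [t for t in re.split(r"\W+", answer_text.lower()) if t]
    # Drop tokens that are stop words; keep everything else as core keywords.
    return [t for t in tokens if not STOP_PATTERN.fullmatch(t)]


core_keywords = extract_core_keywords(
    "The decision tree is trained, and the score is predicted."
)
```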

S23: Use the target test site to perform feature transformation on the core keywords to obtain the features of the test site to be scored; wherein the target test site is obtained by using the above keyword determination method.

Among them, the feature of the test site to be scored refers to a feature that measures the similarity between the core keywords and the keywords in the target test site. In this step, the target test site is obtained by using the keyword determination method in the above embodiment.

Using the target test sites to perform feature transformation on the core keywords includes the following. First, a test site vector is established with one element per target test site. Then each core keyword is matched against the keywords in each target test site through the regular expression matching method, and the matching results determine whether the core keyword matches the target test site. Specifically, whether a core keyword matches a target test site can be judged by the degree of matching between the core keyword and the keywords contained in that target test site. The rule may be that the core keyword is considered to match the target test site as long as it matches any one keyword in that test site, or that the core keyword is considered to match only if it matches at least two keywords in that test site. The specific setting can be customized according to the actual situation. Preferably, if the core keyword matches any keyword in the target test site, the core keyword matches the target test site, and the corresponding element value in the test site vector is 1; if the core keyword matches none of the keywords in the target test site, the match fails, and the corresponding element value in the test site vector is 0. Finally, a test site vector consisting of 1s and 0s is obtained, which is the feature of the test site to be scored.
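The preferred any-match rule can be sketched as below. The function name `site_features`, the list-of-lists layout of the target test sites, and the sample physics keywords are assumptions made for illustration only.

```python
import re


def site_features(core_keywords, target_sites):
    """Return one 0/1 element per target test site.

    target_sites: list of keyword lists, one list per target test site.
    An element is 1 if any core keyword fully matches any keyword of that site.
    """
    features = []
    for site_keywords in target_sites:
        # Build one alternation pattern from the keywords of this test site.
        pattern = re.compile("|".join(re.escape(k) for k in site_keywords))
        matched = any(pattern.fullmatch(core) for core in core_keywords)
        features.append(1 if matched else 0)
    return features


target_sites = [["gravity", "gravitation"], ["mass"], ["friction"]]
core_keywords = ["gravity", "acceleration", "mass"]
feature_vector = site_features(core_keywords, target_sites)
```

The stricter variant mentioned above, which requires at least two matching keywords per test site, would count matches per site instead of stopping at the first one.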

S24: Input the characteristics of the test point to be scored into the preset decision tree reference model to obtain the accurate score of the answer information to be scored.

Among them, the accurate score refers to the score that the decision tree reference model outputs for the answer information to be scored. In this embodiment, the decision tree reference model is established in advance and stored in the backend database of the server. After step S23 is executed and the feature of the test site to be scored is obtained, the model can be retrieved directly from the database of the server.

Among them, the decision tree reference model is a decision tree. In decision analysis, a decision tree is used, on the basis of the known probabilities of various situations, to obtain the probability that the expected net present value is greater than or equal to zero. It is a tree structure in which each internal node represents a test on an attribute, each branch represents an outcome of the test, and each leaf node represents a category.
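How such a tree turns a test site feature vector into a score can be sketched with a small hand-built tree. The nested-dictionary layout, the name `predict_score`, and the score values are hypothetical; a trained decision tree reference model would supply its own structure.

```python
# Hypothetical hand-built tree: each internal node tests one element of the
# feature vector, each branch is one outcome of that test (0 or 1), and each
# leaf holds a score.
TREE = {
    "feature": 0,
    "branches": {
        0: {"score": 0},
        1: {
            "feature": 1,
            "branches": {0: {"score": 3}, 1: {"score": 5}},
        },
    },
}


def predict_score(tree, features):
    # Walk from the root, following the branch selected by the tested feature,
    # until a leaf (a node carrying a score) is reached.
    node = tree
    while "score" not in node:
        node = node["branches"][features[node["feature"]]]
    return node["score"]
```

With this tree, an answer that misses the first test site scores 0 regardless of the other elements, while an answer that covers the first two test sites scores 5.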

In this embodiment, the answer information to be scored is obtained, and keyword extraction is performed on it to obtain the core keywords. The target test site, which is obtained by using the above keyword determination method, is then used to perform feature transformation on the core keywords to obtain the feature of the test site to be scored. Finally, the feature of the test site to be scored is input into the preset decision tree reference model to obtain the accurate score of the answer information to be scored. In this way, efficient and accurate scoring of the examinee's answer information is realized.

In one embodiment, as shown in FIG. 7, using target test sites to perform feature transformation on core keywords to obtain the features of test sites to be scored includes the following steps:

S231: Obtain valid keywords corresponding to the target test site.

Among them, the valid keywords (also called effective keywords in this description) refer to all the keywords contained in the target test site. Specifically, the above keyword determination method has already determined the keywords corresponding to each target test site. Therefore, the valid keywords can be obtained directly from each target test site.

S232: Through the regular matching method, the effective keywords are matched with the core keywords one by one to obtain keyword matching information.

Specifically, matching the valid keywords with the core keywords one by one through the regular expression matching method means defining the valid keywords as specific characters and combining these characters into a "rule string" that expresses a filtering logic for the core keywords. The rule string is then used to find the core keywords that correspond to the valid keywords, which yields the keyword matching information.

Among them, the keyword matching information refers to the matching result obtained after matching the valid keywords with the core keywords, which is either a successful match or a failed match. Specifically, the valid keywords are matched with the core keywords one by one through the regular expression matching method, and the corresponding keyword matching information is obtained according to the matching result. For example, if 10 core keywords and 5 valid keywords are obtained, any one core keyword is taken and matched against the 5 valid keywords one by one. If the core keyword matches any one of the 5 valid keywords, the match is successful; if it matches none of the 5 valid keywords, the match fails. The remaining core keywords are taken one by one and matched in the same way until all 10 core keywords have been matched against the 5 valid keywords, and the keyword matching information is finally obtained.

S233: Assign a corresponding matching identifier to each core keyword according to the keyword matching information.

Among them, the matching identifier refers to an identifier assigned to each core keyword according to the keyword matching information, which can be an Arabic numeral, an uppercase letter, or a lowercase letter. Specifically, the matching identifier reflects the matching situation between the core keyword and the valid keywords. In addition, after a core keyword is successfully matched with a valid keyword, the test site corresponding to that valid keyword needs to be recorded. Therefore, when a matching identifier is assigned to a core keyword that successfully matches a valid keyword, the identifier also carries the test site identifier corresponding to that valid keyword. This scheme does not restrict the specific form of the matching identifier. Preferably, to make it easier to derive the feature of the test site to be scored later, a core keyword that successfully matches a valid keyword is assigned an uppercase letter together with the corresponding test site identifier, for example A1, where the uppercase letter A indicates a successful match with a valid keyword and 1 is the test site identifier corresponding to that valid keyword. A core keyword that fails to match any valid keyword is assigned only a lowercase letter, for example a, which indicates that the match failed.

S234: Obtain the feature of the test site to be scored according to the matching identifier of each core keyword.

Specifically, according to the matching identifier of each core keyword, it is determined whether the core keyword matches the corresponding target test site. If the core keyword matches the target test site successfully, the corresponding element value in the test site vector is 1; if the core keyword fails to match the target test site, the corresponding element value in the test site vector is 0. Finally, a test site vector composed of 1s and 0s is obtained, which is the feature of the test site to be scored.

Exemplarily, suppose there are 6 target test sites, each containing at least 1 valid keyword, and 5 core keywords. After the 5 core keywords are matched one by one with the valid keywords of the target test sites through the regular expression matching method, only the first three core keywords are successfully matched, and they match the first three target test sites. The feature of the test site to be scored is then [1,1,1,0,0,0].
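Steps S232 to S234 can be sketched together as follows, using the preferred identifier scheme (an uppercase A plus the test site identifier for a match, a lowercase a for a failure). The function names, the dictionary layout, and the choice to stop at the first matching test site are assumptions made for this sketch.

```python
import re


def assign_match_ids(core_keywords, site_keywords):
    """Map each core keyword to a matching identifier.

    site_keywords: dict mapping test site identifier -> list of valid keywords.
    Returns e.g. "A2" for a keyword that matched test site 2, or "a" for no match.
    """
    ids = {}
    for core in core_keywords:
        ids[core] = "a"  # default: match failed
        for site_id, keywords in site_keywords.items():
            # "Rule string" built from the valid keywords of this test site.
            rule = re.compile("|".join(re.escape(k) for k in keywords))
            if rule.fullmatch(core):
                ids[core] = f"A{site_id}"
                break  # assumption: the first matching test site wins
    return ids


def features_from_ids(match_ids, site_ids):
    # A test site's element is 1 if some core keyword carries its identifier.
    matched_sites = {m[1:] for m in match_ids.values() if m.startswith("A")}
    return [1 if str(s) in matched_sites else 0 for s in site_ids]


site_keywords = {1: ["gravity"], 2: ["mass", "weight"], 3: ["friction"]}
match_ids = assign_match_ids(["mass", "speed", "gravity"], site_keywords)
feature_vector = features_from_ids(match_ids, [1, 2, 3])
```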

In this embodiment, the valid keywords corresponding to the target test site are obtained, and the valid keywords are matched with the core keywords one by one through the regular expression matching method to obtain keyword matching information. A corresponding matching identifier is then assigned to each core keyword according to the keyword matching information, and finally the feature of the test site to be scored is obtained according to the matching identifier of each core keyword. This further ensures the accuracy and effectiveness of the newly added test site keywords.

In one embodiment, as shown in FIG. 8, before the features of the test points to be scored are input into the preset decision tree reference model to obtain the output score of the answer information to be scored, the automatic scoring method further includes:

S241: Acquire M second sample answer data, each second sample answer data includes original answer information and a second score value, and M is a positive integer.

Among them, the second sample answer data refers to the test taker's answer data. Each second sample answer data includes original answer information and a second score value; that is, the second sample answer data includes original answer information and a second score value obtained after preliminary grading of the original answer information. Optionally, the second sample answer data can be obtained from a scoring system. The scoring system can perform preliminary scoring on the original answer information and obtain the second scoring value. Among them, the original answer information refers to the candidate's answer information of a certain subjective question obtained from the answer text of the scoring system. The second scoring value refers to the scoring value obtained by preliminary scoring the original answer information in advance by means of manual scoring or computer scoring.

The number of the second sample answer data obtained is M, where M is a positive integer. The specific value of M can be set according to actual needs. The higher the value of M, the higher the accuracy of the subsequent decision tree reference model, but the extraction efficiency will decrease. The accuracy and efficiency can be comprehensively considered to select M.

S242: Use the target test site to perform feature transformation on the original answer information of each second sample of answer data to obtain training features of the test site.

Among them, the test site training feature refers to a feature that measures the similarity between the target test site and the original answer information of each second sample of answer data. The target test site is obtained by using the above-mentioned keyword determination method.

Specifically, using the target test site to perform feature transformation on the original answer information of each second sample answer data includes the following. First, an empty test site vector is established with one element per target test site. Then the synonym word forest semantic code is used to compare the original answer information of each second sample answer data with each target test site. If the original answer information matches a target test site successfully, the corresponding element value in the test site vector is 1; if the original answer information does not match that target test site, the corresponding element value is 0. Finally, a test site vector consisting of 1s and 0s is obtained, which is the test site training feature. Among them, the synonym word forest semantic code is a method used to calculate the similarity between words.

S243: Combine the test site training features and the corresponding second score value into a test site sample set.

Among them, the test site sample set refers to the sample data to be input into the decision tree model for training; it includes the test site training features and the corresponding second score value. Specifically, the test site sample set is a data set composed of several test site samples, and the test site samples include test site training features and a second score value corresponding to the test site training features. Understandably, each test site training feature is associated with the corresponding second score value.

S244: Train the decision tree model according to the test site sample set to obtain the decision tree reference model.

Among them, the decision tree reference model is a predictive model that represents a mapping between object attributes and object values. Each node in the decision tree represents an object, each bifurcation path represents a possible attribute value, and each leaf node corresponds to the value of the object represented by the path from the root node to that leaf node. Specifically, training the decision tree model according to the test site sample set means that, after the test site training features and the corresponding second score values are input into the decision tree model, the decision tree model is trained by using the C4.5 algorithm to generate the trained decision tree reference model.
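C4.5 chooses the attribute to split on by its gain ratio, that is, the information gain divided by the split information. The sketch below computes that criterion for 0/1 test site training features, treating each second score value as a class label; it illustrates the selection step only and is not a full C4.5 implementation (no recursion, pruning, or handling of continuous attributes). All names and the sample data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    # Shannon entropy of a list of class labels, in bits.
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, scores, feature_index):
    """Gain ratio of splitting the samples on one feature."""
    total = len(rows)
    groups = {}
    for row, score in zip(rows, scores):
        groups.setdefault(row[feature_index], []).append(score)
    # Expected entropy of the scores after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(scores) - remainder
    # Split information: entropy of the feature's own value distribution.
    split_info = entropy([row[feature_index] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


def best_feature(rows, scores):
    # Index of the feature with the highest gain ratio.
    return max(range(len(rows[0])), key=lambda i: gain_ratio(rows, scores, i))


rows = [[1, 0], [1, 1], [0, 0], [0, 1]]  # test site training features
scores = [5, 5, 0, 0]                    # corresponding second score values
root_feature = best_feature(rows, scores)
```

In this sample the first test site separates the two score values perfectly, so it is chosen as the root test, while the second test site carries no information about the score.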

Preferably, in order to further verify the accuracy of the decision tree reference model, the test site sample set is divided into a training set for modeling and a test set for verifying the effect of the model. Among them, the training set refers to the data set used to build the decision tree reference model, and the test set refers to the data set used to verify the effect of the established decision tree reference model. The test site sample set can be divided into a training set and a test set by random division or by cross-validation. The ratio of training set to test set after division can be, for example, 6:4, 7:3, or 75:25. Preferably, in order to improve the accuracy of the decision tree reference model, in this step 75% of the acquired test site sample set is used as the training set and 25% is used as the test set.
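The random 75:25 division can be sketched as follows. The function name `split_samples`, the fixed seed (used only to make the shuffle repeatable), and the generated sample data are assumptions made for illustration.

```python
import random


def split_samples(samples, train_fraction=0.75, seed=0):
    """Randomly divide the test site sample set into a training set and a test set."""
    shuffled = list(samples)               # copy so the input order is preserved
    random.Random(seed).shuffle(shuffled)  # seeded for a repeatable division
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical test site samples: (test site training feature, second score value).
samples = [([i % 2, (i // 2) % 2], i % 5) for i in range(8)]
training_set, test_set = split_samples(samples)
```

Changing `train_fraction` to 0.6 or 0.7 gives the other ratios mentioned above.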

In this embodiment, M second sample answer data are acquired, each including original answer information and a second score value. The target test site is then used to perform feature transformation on the original answer information of each second sample answer data to obtain the test site training features. Finally, the decision tree model is trained according to the test site training features and the corresponding second score values to obtain the decision tree reference model. This further ensures the accuracy of scoring the examinee's answer information through the decision tree reference model.

In one embodiment, an automatic scoring device is provided, and the automatic scoring device corresponds to the automatic scoring method in the above-mentioned embodiment one-to-one. As shown in FIG. 9, the automatic scoring device includes a module 21 for obtaining answer information to be scored, a keyword extraction module 22, a feature conversion module 23 for the test site to be scored, and an input module 24. The detailed description of each functional module is as follows:

The module 21 for obtaining answer information to be scored is used to obtain the answer information to be graded;

The keyword extraction module 22 is used for keyword extraction on the answer information to be scored to obtain core keywords;

The feature conversion module 23 of the test point to be scored is used to transform the core keywords by using the target test point to obtain the feature of the test point to be scored; wherein the target test point is obtained by the keyword determination method;

The input module 24 is used to input the characteristics of the test site to be scored into the preset decision tree reference model to obtain the accurate score of the answer information to be scored.

Preferably, the feature conversion module 23 of the test point to be scored includes:

The effective keyword acquisition unit is used to obtain the effective keywords corresponding to the target test site;

The matching unit is used to match the effective keywords with the core keywords one by one through the regular matching method to obtain keyword matching information;

The allocation unit is used to allocate a corresponding matching identifier for each core keyword according to the keyword matching information;

The obtaining unit is used to obtain the feature of the test point to be scored according to the matching identifier of each core keyword.

Preferably, the input module 24 includes:

The second sample answer data acquisition unit is used to acquire M second sample answer data, each second sample answer data includes original answer information and a second score value, and M is a positive integer;

The test site feature transformation unit is used to use the target test site to perform feature transformation on the original answer information of each second sample answer data to obtain the test site training features;

The constituent unit is used to form the test site sample set by the test site training features and the corresponding second score value;

The decision tree reference model training unit is used to train the decision tree model according to the test site sample set to obtain the decision tree reference model.

For the specific limitation of the automatic scoring device, please refer to the above limitation on the automatic scoring method, which will not be repeated here. Each module in the above-mentioned automatic scoring device can be implemented in whole or in part by software, hardware, and a combination thereof. The above-mentioned modules may be embedded in the form of hardware or independent of the processor in the computer equipment, or may be stored in the memory of the computer equipment in the form of software, so that the processor can call and execute the operations corresponding to the above-mentioned modules.

In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure diagram may be as shown in FIG. 10. The computer device includes a processor, memory, network interface, and database connected by a system bus. Among them, the processor of the computer device is used to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions, and a database. The internal memory provides an environment for the operation of the operating system and computer-readable instructions in the non-volatile storage medium. The database of the computer device is used to store the data used in the above-mentioned keyword determination method and the above-mentioned automatic scoring method. The network interface of the computer device is used to communicate with external terminals through a network connection. The computer-readable instruction is executed by the processor to implement a method for determining keywords, or the computer-readable instruction is executed by the processor to implement an automatic scoring method.

In one embodiment, a computer device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, and the processor implements the following steps when executing the computer-readable instructions:

Extract sample keywords from the decision tree sample model.

Get information about the answer to be graded;

In one embodiment, one or more non-volatile readable storage media storing computer-readable instructions are provided. When the computer-readable instructions are executed by one or more processors, they cause the one or more processors to perform the following steps:

Extract sample keywords from the decision tree sample model.

Get information about the answer to be graded;

A person of ordinary skill in the art may understand that all or part of the processes in the methods of the foregoing embodiments may be completed by instructing relevant hardware through computer-readable instructions. The computer-readable instructions may be stored in a non-volatile computer-readable storage medium, and when executed, they may include the processes of the foregoing method embodiments. Any reference to the memory, storage, database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous chain (Synchlink) DRAM (SLDRAM), memory bus (Rambus) direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM), etc.

Those skilled in the art can clearly understand that, for convenience and conciseness of description, only the above division of functional units and modules is used as an example. In practical applications, the above functions may be allocated to different functional units and modules as needed; that is, the internal structure of the device may be divided into different functional units or modules to complete all or part of the functions described above.

The above-mentioned embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, persons of ordinary skill in the art should understand that they can still modify the technical solutions described in the foregoing embodiments, or equivalently replace some of the technical features. Such modifications or replacements do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present application, and they shall all fall within the scope of protection of this application.

Claims

A method for determining keywords is characterized by including:

Acquiring N first sample answer data, each of the first sample answer data includes sample answer information and a first score value, and N is a positive integer;

Performing word segmentation processing on the sample answer information of each of the first sample answer data to obtain sample word segmentation of each of the first sample answer data;

Summarize the sample word segmentation of each of the first sample answer data to obtain a sample word segmentation set;

Using the sample word segmentation set to perform feature conversion on the sample answer information of each of the first sample answer data to obtain sample training features;

Training the decision tree model according to the sample training feature and the corresponding first score value to obtain the decision tree sample model;

Extract sample keywords from the decision tree sample model.
The method for determining keywords according to claim 1, wherein after said extracting the sample keywords from the decision tree sample model, the method for determining keywords further comprises:

Acquiring scoring rule information, where the scoring rule information includes preset test sites and preset keywords corresponding to each of the preset test sites;

Removing keywords that overlap with the preset keywords from the sample keywords to obtain the target keywords;

Sending the target keyword to the client, and obtaining the test center label returned by the client according to the target keyword;

Add each of the target keywords to the corresponding preset test center according to the test center tag to obtain the target test center.
An automatic scoring method, characterized in that it comprises:

Get information about the answer to be graded;

Perform keyword extraction on the answer information to be scored to obtain core keywords;

Use the target test site to perform feature transformation on the core keywords to obtain the features of the test site to be scored; wherein, the target test site is obtained by using the keyword determination method of claim 2;

The characteristics of the test point to be scored are input into a preset decision tree reference model to obtain the accurate score of the answer information to be scored.
The automatic scoring method according to claim 3, wherein said adopting the target test site to perform feature conversion on the core keywords to obtain the features of the test site to be scored comprises:

Obtain valid keywords corresponding to the target test site;

Through the regular matching method, the effective keywords are matched with the core keywords one by one to obtain keyword matching information;

According to the keyword matching information, assign a corresponding matching identifier to each of the core keywords;

According to the matching identifier of each of the core keywords, the characteristics of the test points to be scored are obtained.
The automatic scoring method according to claim 3, characterized in that, before the feature of the test point to be scored is input into a preset decision tree reference model to obtain the output score of the answer information to be scored, the automatic scoring method further includes:

Acquiring M second sample answer data, each of the second sample answer data includes original answer information and a second score value, and M is a positive integer;

Using the target test site to perform feature transformation on the original answer information of each of the second sample answer data to obtain test site training features;

Forming the test site training feature and the corresponding second score value into a test site sample set;

The decision tree model is trained according to the test site sample set to obtain the decision tree reference model.
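The training step can be illustrated with a very small regression tree written in plain Python. The claim does not fix a splitting criterion or a tree representation, so everything here is an assumption: splits are chosen by lowest squared error (in the style of CART regression), the features are binary, the tree is a nested dictionary, and the names `train_tree` and `predict` are hypothetical.

```python
from statistics import mean

def sse(scores):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not scores:
        return 0.0
    m = mean(scores)
    return sum((s - m) ** 2 for s in scores)

def train_tree(samples, depth=0, max_depth=3):
    """samples: list of (binary feature list, score) pairs.
    Returns a nested dict: either a leaf {"score": ...} or a split node."""
    scores = [score for _, score in samples]
    leaf = {"score": mean(scores)}
    if depth >= max_depth or len(set(scores)) == 1:
        return leaf
    best = None
    for j in range(len(samples[0][0])):
        absent = [s for s in samples if s[0][j] == 0]
        present = [s for s in samples if s[0][j] == 1]
        if not absent or not present:
            continue  # this feature does not separate the samples
        cost = sse([y for _, y in absent]) + sse([y for _, y in present])
        if best is None or cost < best[0]:
            best = (cost, j, absent, present)
    if best is None:
        return leaf
    _, j, absent, present = best
    return {"feature": j,
            "absent": train_tree(absent, depth + 1, max_depth),
            "present": train_tree(present, depth + 1, max_depth)}

def predict(tree, features):
    """Walk from the root to a leaf and return the leaf score."""
    while "feature" in tree:
        tree = tree["present"] if features[tree["feature"]] else tree["absent"]
    return tree["score"]

# Test site sample set: one (training feature, second score) pair per answer.
features = [[1, 1], [1, 0], [0, 1], [0, 0]]
second_scores = [10, 6, 4, 0]
sample_set = list(zip(features, second_scores))
reference_model = train_tree(sample_set)
```

The toy data at the end is invented for illustration. Calling `predict(reference_model, [1, 0])` then plays the role of inputting the features of the test site to be scored into the decision tree reference model.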
A keyword determining device, characterized in that it comprises:

A first sample answer data acquisition module, configured to acquire N first sample answer data, where each of the first sample answer data includes sample answer information and a first score value, and N is a positive integer;

A word segmentation processing module, configured to perform word segmentation processing on the sample answer information of each of the first sample answer data to obtain the sample word segmentation of each of the first sample answer data;

A word segmentation summarizing module, configured to summarize the sample word segmentation of each of the first sample answer data to obtain a sample word segmentation set;

A sample feature conversion module, configured to use the sample word segmentation set to perform feature conversion on the sample answer information of each of the first sample answer data to obtain sample training features;

A decision tree sample model training module, configured to train a decision tree model according to the sample training features and the corresponding first score values to obtain a decision tree sample model;

A sample keyword extraction module, configured to extract sample keywords from the decision tree sample model.
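The processing around the training step can be sketched as follows. Several points are assumptions rather than statements of the application: word segmentation is replaced by a whitespace split (Chinese answers would need a real segmenter), the feature conversion is a binary bag-of-words vector over the sample word segmentation set, the trained tree is taken to be a nested dictionary with hypothetical keys `feature`, `absent`, `present` and `score`, and the sample keywords are taken to be the words the tree splits on. Training itself is omitted here.

```python
def segment(answer):
    """Placeholder word segmentation: lowercase whitespace split."""
    return answer.lower().split()

def build_vocabulary(answers):
    """Summarize the segmented words of all answers into one sorted list."""
    return sorted({word for answer in answers for word in segment(answer)})

def to_features(answer, vocabulary):
    """Binary vector: 1 if the vocabulary word occurs in the answer."""
    words = set(segment(answer))
    return [int(word in words) for word in vocabulary]

def split_keywords(tree, vocabulary):
    """Collect, without duplicates, the words used as split features."""
    if "feature" not in tree:
        return []  # a leaf contributes no keyword
    found = [vocabulary[tree["feature"]]]
    for branch in ("absent", "present"):
        for word in split_keywords(tree[branch], vocabulary):
            if word not in found:
                found.append(word)
    return found

answers = ["diversify to reduce risk", "buy more stock", "reduce risk with bonds"]
vocabulary = build_vocabulary(answers)
# Hand-written stand-in for a trained decision tree sample model.
sample_model = {"feature": vocabulary.index("risk"),
                "absent": {"score": 2},
                "present": {"feature": vocabulary.index("diversify"),
                            "absent": {"score": 6},
                            "present": {"score": 9}}}
sample_keywords = split_keywords(sample_model, vocabulary)
```

The answers and the hand-written tree are invented for illustration. Reading the split features back out of the tree is what turns the fitted model into a keyword list in this sketch.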
The keyword determining device, characterized in that it further comprises:

A scoring rule information acquisition module, configured to acquire scoring rule information, where the scoring rule information includes preset test sites and preset keywords corresponding to each of the preset test sites;

An overlapping keyword removal module, configured to remove keywords that overlap with the preset keywords from the sample keywords to obtain target keywords;

A test site label acquisition module, configured to send the target keywords to a client, and obtain the test site labels returned by the client according to the target keywords;

A target keyword adding module, configured to add each of the target keywords to the corresponding preset test site according to the test site label to obtain a target test site.
An automatic scoring device, characterized by comprising:

A to-be-scored answer information acquisition module, configured to acquire answer information to be scored;

A keyword extraction module, configured to perform keyword extraction on the answer information to be scored to obtain core keywords;

A to-be-scored test site feature conversion module, configured to use a target test site to perform feature conversion on the core keywords to obtain the features of the test site to be scored, wherein the target test site is obtained by using the keyword determination method of claim 2;

An input module, configured to input the features of the test site to be scored into a preset decision tree reference model to obtain an accurate score of the answer information to be scored.
The automatic scoring device according to claim 8, wherein the to-be-scored test site feature conversion module comprises:

A valid keyword acquisition unit, configured to acquire the valid keywords corresponding to the target test site;

A matching unit, configured to match the valid keywords against the core keywords one by one by regular expression matching to obtain keyword matching information;

An allocation unit, configured to allocate a corresponding matching identifier to each of the core keywords according to the keyword matching information;

An obtaining unit, configured to obtain the features of the test site to be scored according to the matching identifier of each of the core keywords.
The automatic scoring device of claim 9, wherein the input module comprises:

A second sample answer data acquisition unit, configured to acquire M second sample answer data, where each of the second sample answer data includes original answer information and a second score value, and M is a positive integer;

A test site feature conversion unit, configured to use the target test site to perform feature conversion on the original answer information of each of the second sample answer data to obtain test site training features;

A composition unit, configured to form the test site training feature and the corresponding second score value into a test site sample set;

The decision tree reference model training unit is used to train the decision tree model according to the test site sample set to obtain the decision tree reference model.
A computer device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:

Acquiring N first sample answer data, where each of the first sample answer data includes sample answer information and a first score value, and N is a positive integer;

Performing word segmentation processing on the sample answer information of each of the first sample answer data to obtain sample word segmentation of each of the first sample answer data;

Summarizing the sample word segmentation of each of the first sample answer data to obtain a sample word segmentation set;

Using the sample word segmentation set to perform feature conversion on the sample answer information of each of the first sample answer data to obtain sample training features;

Training a decision tree model according to the sample training features and the corresponding first score values to obtain a decision tree sample model;

Extracting sample keywords from the decision tree sample model.
The computer device of claim 11, wherein after the sample keywords are extracted from the decision tree sample model, the processor further implements the following steps when executing the computer-readable instructions:

Acquiring scoring rule information, where the scoring rule information includes preset test sites and preset keywords corresponding to each of the preset test sites;

Removing keywords that overlap with the preset keywords from the sample keywords to obtain the target keywords;

Sending the target keywords to a client, and obtaining the test site labels returned by the client according to the target keywords;

Adding each of the target keywords to the corresponding preset test site according to the test site label to obtain a target test site.
A computer device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, characterized in that the processor implements the following steps when executing the computer-readable instructions:

Acquiring answer information to be scored;

Performing keyword extraction on the answer information to be scored to obtain core keywords;

Using a target test site to perform feature conversion on the core keywords to obtain the features of the test site to be scored, wherein the target test site is obtained by using the keyword determination method of claim 2;

Inputting the features of the test site to be scored into a preset decision tree reference model to obtain an accurate score of the answer information to be scored.
The computer device according to claim 13, wherein said using the target test site to perform feature conversion on the core keywords to obtain the features of the test site to be scored comprises:

Obtaining valid keywords corresponding to the target test site;

Matching the valid keywords against the core keywords one by one by regular expression matching to obtain keyword matching information;

Assigning a corresponding matching identifier to each of the core keywords according to the keyword matching information;

Obtaining the features of the test site to be scored according to the matching identifier of each of the core keywords.
The computer device according to claim 14, wherein before the features of the test site to be scored are input into the preset decision tree reference model to obtain the accurate score of the answer information to be scored, the processor further implements the following steps when executing the computer-readable instructions:

Acquiring M second sample answer data, where each of the second sample answer data includes original answer information and a second score value, and M is a positive integer;

Using the target test site to perform feature conversion on the original answer information of each of the second sample answer data to obtain test site training features;

Forming the test site training features and the corresponding second score values into a test site sample set;

Training a decision tree model according to the test site sample set to obtain the decision tree reference model.
One or more non-volatile readable storage media storing computer-readable instructions, wherein, when the computer-readable instructions are executed by one or more processors, the one or more processors are caused to execute the following steps:

Acquiring N first sample answer data, where each of the first sample answer data includes sample answer information and a first score value, and N is a positive integer;

Performing word segmentation processing on the sample answer information of each of the first sample answer data to obtain sample word segmentation of each of the first sample answer data;

Summarizing the sample word segmentation of each of the first sample answer data to obtain a sample word segmentation set;

Using the sample word segmentation set to perform feature conversion on the sample answer information of each of the first sample answer data to obtain sample training features;

Training a decision tree model according to the sample training features and the corresponding first score values to obtain a decision tree sample model;

Extracting sample keywords from the decision tree sample model.
The non-volatile readable storage medium of claim 16, wherein, after the sample keywords are extracted from the decision tree sample model, the computer-readable instructions, when executed by the one or more processors, further cause the one or more processors to execute the following steps:

Acquiring scoring rule information, where the scoring rule information includes preset test sites and preset keywords corresponding to each of the preset test sites;

Removing keywords that overlap with the preset keywords from the sample keywords to obtain the target keywords;

Sending the target keywords to a client, and obtaining the test site labels returned by the client according to the target keywords;

Adding each of the target keywords to the corresponding preset test site according to the test site label to obtain a target test site.
One or more non-volatile readable storage media storing computer-readable instructions, wherein, when the computer-readable instructions are executed by one or more processors, the one or more processors are caused to execute the following steps:

Acquiring answer information to be scored;

Performing keyword extraction on the answer information to be scored to obtain core keywords;

Using a target test site to perform feature conversion on the core keywords to obtain the features of the test site to be scored, wherein the target test site is obtained by using the keyword determination method of claim 2;

Inputting the features of the test site to be scored into a preset decision tree reference model to obtain an accurate score of the answer information to be scored.
The non-volatile readable storage medium according to claim 18, wherein said using the target test site to perform feature conversion on the core keywords to obtain the features of the test site to be scored comprises:

Obtaining valid keywords corresponding to the target test site;

Matching the valid keywords against the core keywords one by one by regular expression matching to obtain keyword matching information;

Assigning a corresponding matching identifier to each of the core keywords according to the keyword matching information;

Obtaining the features of the test site to be scored according to the matching identifier of each of the core keywords.
The non-volatile readable storage medium of claim 19, wherein, before the features of the test site to be scored are input into the preset decision tree reference model to obtain the accurate score of the answer information to be scored, the computer-readable instructions, when executed by the one or more processors, further cause the one or more processors to execute the following steps:

Acquiring M second sample answer data, where each of the second sample answer data includes original answer information and a second score value, and M is a positive integer;

Using the target test site to perform feature conversion on the original answer information of each of the second sample answer data to obtain test site training features;

Forming the test site training features and the corresponding second score values into a test site sample set;

Training a decision tree model according to the test site sample set to obtain the decision tree reference model.