WO2021159816A1 - Answer selection method and apparatus for idiom fill-in-the-blank questions, and computer device (成语填空题的答案选择方法、装置和计算机设备) - Google Patents


Info

Publication number
WO2021159816A1
Authority
WO
WIPO (PCT)
Prior art keywords: fill, idiom, text, answer, blank
Prior art date
Application number
PCT/CN2020/132602
Other languages
English (en)
French (fr)
Inventor
刘翔
陈秀玲
Original Assignee
平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Application filed by 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Priority to US 17/613,506 (US12008319B2)
Priority to SG 11202112291S
Priority to KR 10-2021-7036693 (KR102688187B1)
Priority to EP 20918558.6 (EP4209956A4)
Priority to JP 2021-568947 (JP7418704B2)
Published as WO2021159816A1

Classifications

    • G06F40/289 — Natural language analysis: phrasal analysis, e.g. finite state techniques or chunking
    • G06F16/27 — Information retrieval: replication, distribution or synchronisation of data between databases; distributed database system architectures
    • G06F16/3329 — Information retrieval: natural language query formulation or dialogue systems
    • G06F18/22 — Pattern recognition: matching criteria, e.g. proximity measures
    • G06F21/602 — Protecting data: providing cryptographic facilities or services
    • G06F21/64 — Protecting data: protecting data integrity, e.g. using checksums, certificates or signatures
    • G06F40/16 — Text processing: automatic learning of transformation rules, e.g. from examples
    • G06F40/216 — Natural language analysis: parsing using statistical methods
    • G06F40/242 — Lexical tools: dictionaries
    • G06F40/30 — Natural language analysis: semantic analysis
    • G06N3/045 — Neural networks: combinations of networks
    • G06N3/08 — Neural networks: learning methods
    • G06N3/0895 — Weakly supervised learning, e.g. semi-supervised or self-supervised learning
    • G06Q50/20 — ICT specially adapted for business sectors: education
    • G09B19/06 — Teaching: foreign languages
    • G09B7/02 — Electrically-operated teaching apparatus working with questions and answers, where the student constructs an answer or the machine answers a student's question
    • G09B7/06 — Electrically-operated teaching apparatus of the multiple-choice answer type

Definitions

  • This application relates to the field of artificial intelligence technology, and in particular to an answer selection method, device, computer equipment, and storage medium for idiom fill-in-the-blank questions.
  • Idioms (chengyu) are concise, long-established fixed phrases in the Chinese language. They are an epitome of traditional Chinese culture and come from a wide variety of sources. Idioms themselves have complex structure and have long been a focus, and a difficulty, of Chinese-language teaching. Idiom cloze (fill-in-the-blank) questions are a key knowledge point and a common question type in primary and secondary school Chinese. In the prior art, questions such as idiom selection and fill-in-the-blank have always been answered by human intelligence. When tutoring their children's homework, parents often find their own knowledge insufficient for difficult idiom fill-in-the-blank questions, and so turn to apps and Internet searches to find answers.
  • The main purpose of this application is to provide an answer selection method, apparatus, computer device, and storage medium for idiom fill-in-the-blank questions, aiming to overcome the defect that answers to idiom fill-in-the-blank questions cannot be given because manually collected question banks are incomplete.
  • To this end, this application provides a method for selecting answers to idiom fill-in-the-blank questions, which includes the following steps:
  • obtaining the question text of an idiom fill-in-the-blank question, wherein the question text includes a fill-in text and n candidate idioms, and the fill-in text includes m blanks to be filled with candidate idioms;
  • This application also provides an answer selection device for idiom fill-in-the-blank questions, including:
  • the first obtaining unit is configured to obtain a question text of an idiom fill-in-the-blank question; wherein the question text includes a fill-in text and n candidate idioms, and the fill-in text includes m fill-in blanks to be filled in the candidate idiom;
  • the second acquiring unit is used to acquire the interpretation texts of all the candidate idioms
  • the first output unit is configured to input the blank-filled text and the explanatory text into a pre-trained idiom selection and blank-filled model to obtain the confidence that each candidate idiom is filled in each of the blanks;
  • the selection unit is used to select m idioms from the n candidate idioms and fill them into the m blanks in random arrangements to form multiple groups of answers; wherein, within each group of answers, each candidate idiom can be selected to fill a blank at most once;
  • a calculation unit configured to calculate, based on the KM algorithm and the confidence with which each candidate idiom fills each blank, the total confidence of the candidate idioms in each group of answers;
  • the second output unit is used to obtain the group of answers with the highest total confidence as the target answer, and output the target answer as the answer to the idiom fill-in-the-blank question.
  • the present application also provides a computer device, including a memory and a processor, the memory stores a computer program, and when the processor executes the computer program, a method for selecting an answer to an idiom fill-in-the-blank question is implemented, including the following steps:
  • obtaining the question text of an idiom fill-in-the-blank question, wherein the question text includes a fill-in text and n candidate idioms, and the fill-in text includes m blanks to be filled with candidate idioms;
  • the present application also provides a computer-readable storage medium on which a computer program is stored.
  • when the computer program is executed by a processor, a method for selecting answers to idiom fill-in-the-blank questions is implemented, which includes the following steps:
  • obtaining the question text of an idiom fill-in-the-blank question, wherein the question text includes a fill-in text and n candidate idioms, and the fill-in text includes m blanks to be filled with candidate idioms;
  • The answer selection method, apparatus, computer device, and storage medium for idiom fill-in-the-blank questions provided in this application: obtain the question text of an idiom fill-in-the-blank question, wherein the question text includes a fill-in text and n candidate idioms, and the fill-in text includes m blanks to be filled with candidate idioms; obtain the explanation texts of all the candidate idioms; input the fill-in text and the explanation texts into the pre-trained idiom-selection fill-in-the-blank model to obtain the confidence with which each candidate idiom fills each blank; select m idioms from the n candidate idioms and fill them into the m blanks in random arrangements to form multiple groups of answers; and, based on the KM algorithm and the confidence with which each candidate idiom fills each blank, calculate the total confidence of the candidate idioms in each group of answers.
  • This application uses artificial intelligence to obtain the total confidence with which the candidate idioms in each group of answers fill the blanks, so that the group of answers with the highest total confidence is taken as the target answer; answers to idiom fill-in-the-blank questions can thus be obtained efficiently and accurately.
  • FIG. 1 is a schematic diagram of the steps of a method for selecting answers to idiom fill-in-the-blank questions in an embodiment of the present application;
  • FIG. 2 is a structural block diagram of a device for selecting answers to idiom fill-in-the-blank questions in an embodiment of the present application;
  • FIG. 3 is a schematic block diagram of the structure of a computer device according to an embodiment of the present application.
  • an embodiment of the present application provides an answer selection method for idiom fill-in-the-blank questions, which includes the following steps:
  • Step S1: obtaining the question text of an idiom fill-in-the-blank question, wherein the question text includes a fill-in text and n candidate idioms, and the fill-in text includes m blanks to be filled with candidate idioms;
  • Step S2: obtaining the interpretation texts of all the candidate idioms;
  • Step S3: inputting the fill-in text and the explanation texts into a pre-trained idiom-selection fill-in-the-blank model to obtain the confidence with which each candidate idiom fills each blank;
  • Step S4: selecting m idioms from the n candidate idioms and filling them into the m blanks in random arrangements to form multiple groups of answers, wherein, within each group of answers, each candidate idiom can be selected to fill a blank at most once;
  • Step S5: calculating, based on the KM algorithm and the confidence with which each candidate idiom fills each blank, the total confidence of the candidate idioms in each group of answers;
  • Step S6: obtaining the group of answers with the highest total confidence as the target answer, and outputting the target answer as the answer to the idiom fill-in-the-blank question.
  • the above method is applied to the scenario of outputting answers to idiom fill-in-the-blank questions, using artificial intelligence to output the answers to idiom fill-in-the-blank questions with high efficiency and high accuracy.
  • the solution in this application can also be applied in the field of smart education to promote the construction of smart cities.
  • the question text of the above-mentioned idiom fill-in-the-blank question is usually an electronic text in the student’s test question.
  • the above-mentioned question text includes fill-in-the-blank text composed of multiple sentences and multiple candidate idioms (ie, candidate answers). Students can choose from the above-mentioned candidate idioms to fill in the above-mentioned fill-in-the-blank text.
  • In one embodiment, the explanation texts of all idioms (that is, texts that explain the semantics of each idiom) are stored in a database (such as an idiom dictionary), and the explanation texts of all the candidate idioms can be obtained from this database.
  • In one embodiment, the pre-trained idiom-selection fill-in-the-blank model is obtained by training a natural-language neural network based on the BERT language model.
  • The idiom-selection fill-in-the-blank model is used to predict, for each idiom filled into the idiom fill-in-the-blank question, the confidence that it is the correct answer.
  • Whether an idiom filled into the question is the correct answer is judged by using the idiom-selection fill-in-the-blank model to calculate the semantic coherence between the explanation text of the idiom and the sentence in the question, so as to determine the corresponding confidence.
  • m idioms are selected from the n candidate idioms and randomly arranged and filled in the m fill-in spaces to form multiple sets of answers.
  • Random arrangement in the mathematical sense is used: m idioms are selected from the n candidate idioms and arranged into groups of combinations, each of which serves as a pre-selected answer to the above idiom fill-in-the-blank question. It is understandable that the number of such arrangements is n*(n-1)*(n-2)*...*(n-m+1).
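For illustration only (not part of the claimed method), the arrangement count above is the falling factorial n!/(n-m)!, which can be checked against direct enumeration; the values n=7, m=4 are arbitrary examples:

```python
from itertools import permutations
from math import perm  # n! / (n - m)!, available in Python 3.8+

n, m = 7, 4  # e.g. 7 candidate idioms, 4 blanks (illustrative)

# Enumerate every ordered selection of m idioms out of n, i.e. every
# candidate answer group described in the text.
groups = list(permutations(range(n), m))

# The closed form n*(n-1)*...*(n-m+1) matches the enumeration size.
closed_form = 1
for k in range(m):
    closed_form *= n - k

print(len(groups), closed_form, perm(n, m))  # all three agree: 840
```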
  • In step S5, since the confidence with which every candidate idiom fills each blank has already been computed in step S3, the sum of confidences of the candidate idioms in each group of answers can be calculated. In this embodiment, this sum is computed based on the KM (Kuhn–Munkres) algorithm; the KM algorithm is a standard algorithm and is not elaborated here.
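The KM algorithm solves maximum-weight bipartite matching (idioms to blanks) in O(n^3); for small instances an exhaustive search gives the same optimum, which makes the objective easy to illustrate. The confidence matrix below is entirely hypothetical:

```python
from itertools import permutations

# Hypothetical confidence matrix: conf[i][j] is the model's confidence
# that candidate idiom i is the correct fill for blank j (4 idioms, 3 blanks).
conf = [
    [0.9, 0.1, 0.2],
    [0.2, 0.8, 0.1],
    [0.1, 0.3, 0.7],
    [0.3, 0.2, 0.4],
]
n, m = len(conf), len(conf[0])

# Exhaustive equivalent of KM for small n: each answer group assigns a
# distinct idiom to every blank; keep the group with the highest total.
best_total, best_group = max(
    (sum(conf[idiom][blank] for blank, idiom in enumerate(group)), group)
    for group in permutations(range(n), m)
)

print(best_group, round(best_total, 2))  # (0, 1, 2) 2.4
```

For real problem sizes, a production implementation would use an O(n^3) assignment solver rather than this factorial-time search.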
  • In step S6, the group of answers with the highest sum of confidences is obtained and used as the target answer.
  • The above confidence expresses how likely it is that a candidate idiom filled into a blank is correct; obviously, the higher the confidence, the closer to the correct answer. Therefore, when the total confidence is highest, the corresponding target answer is also closest to the correct answer, and that target answer is finally output as the answer to the idiom fill-in-the-blank question.
  • In this way, the idiom-selection fill-in-the-blank model obtained by deep learning yields the confidence with which each candidate idiom fills each blank, and the target answer to the fill-in-the-blank question is then output efficiently and accurately according to those confidences.
  • The corresponding target answer can be output automatically, without manually searching for answers and without the user needing professional knowledge of idioms, which not only improves the efficiency and accuracy of finding answers but also reduces labor costs.
  • In one embodiment, before step S3 of inputting the fill-in text and the explanation texts into the pre-trained idiom-selection fill-in-the-blank model to obtain the confidence with which each candidate idiom fills each blank, the method includes:
  • Step S01: obtaining training samples, where the training samples include corpus texts of multiple idiom fill-in-the-blank questions for which answers have already been selected, and the explanation texts of all idioms in the idiom library;
  • Step S02: inputting the training samples into a natural-language neural network based on the BERT language model for training to obtain the idiom-selection fill-in-the-blank model;
  • wherein the natural-language neural network includes a network output layer and a convolutional layer, and the convolutional layer is formed by sequentially stacking multiple convolutional networks. The input layer of each later convolutional network is connected to the output layers of all preceding convolutional networks, and the input layer of each convolutional network is also connected to the output layer of the BERT language model; the output layer of the last convolutional network is connected to the network output layer. It is understandable that the feature matrix produced by the output layer of the BERT language model is fed into every convolutional network, and the output of each convolutional network is likewise fed into all subsequent convolutional networks.
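The connectivity described above resembles DenseNet-style stacking: every block consumes the BERT features concatenated with the outputs of all earlier blocks, and only the last block feeds the output layer. A shape-level sketch, with plain random linear maps standing in for real convolutions (all dimensions and the `conv_block` stand-in are illustrative assumptions, not the patent's architecture details):

```python
import random

random.seed(0)

FEAT = 8       # width of the BERT feature vector (illustrative)
BLOCK_OUT = 4  # output width of each convolutional block (illustrative)

def conv_block(inputs, out_dim):
    """Stand-in for one convolutional network: a random linear map.
    Only the connectivity pattern matters in this sketch."""
    weights = [[random.uniform(-1, 1) for _ in inputs] for _ in range(out_dim)]
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

bert_features = [random.uniform(-1, 1) for _ in range(FEAT)]

# Each block's input is the BERT output concatenated with the outputs
# of ALL previous blocks (the dense connections described above).
outputs, widths = [], []
for _ in range(3):  # three stacked "convolutional networks"
    dense_input = bert_features + [v for prev in outputs for v in prev]
    widths.append(len(dense_input))
    outputs.append(conv_block(dense_input, BLOCK_OUT))

# Only the last block feeds the network output layer.
network_output = outputs[-1]
print(widths)  # input widths grow: [8, 12, 16]
```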
  • The step S02 of inputting the training samples into a natural-language neural network based on the BERT language model for training to obtain the idiom-selection fill-in-the-blank model specifically includes:
  • The BERT language model is obtained by running a self-supervised learning method on a large amount of corpus, and provides good feature-vector representations for words; here, self-supervised learning refers to supervised learning run on data without manual annotation.
  • The BERT language model is pre-trained with two unsupervised prediction tasks: Masked LM and Next Sentence Prediction.
  • The network architecture of the BERT language model uses a multi-layer Transformer structure. Its biggest feature is that it abandons the traditional RNN and CNN and, through the attention mechanism, reduces the distance between two words at any positions to 1, which effectively addresses the thorny long-term dependency problem in NLP.
  • The Transformer structure has been widely used in the field of NLP, so it is not repeated here.
  • The vector representation of the feature matrix output by the BERT language model can be used directly as the word-embedding feature of an NLP task.
  • The BERT language model thus provides a model for transfer learning to other tasks; it can be fine-tuned or frozen according to the task and used as a feature extractor.
  • In one embodiment, before step S5 of calculating, based on the KM algorithm and the confidence with which each candidate idiom fills each blank, the total confidence of the candidate idioms in each group of answers, the method includes:
  • Step S51: determining, for each group of answers, whether there is at least one candidate idiom whose confidence of filling its blank is less than a first threshold;
  • Step S52: if there is, eliminating the corresponding group of answers and not calculating the total confidence of the candidate idioms in the eliminated answers.
  • Since the number of composed answer groups is n*(n-1)*(n-2)*...*(n-m+1), when n and m are large the number of answer groups is also large; if the total confidence of every group were calculated separately, the amount of computation would be large and efficiency would suffer. It is understandable that, among the composed answer groups, some are obviously incorrect: for example, when, in a certain group, some candidate idiom filled into a blank is clearly incoherent, it can be concluded that this group is not a completely correct answer.
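The pruning of steps S51–S52 amounts to a filter applied before the more expensive total-confidence computation. A minimal sketch; the confidence values and the threshold are illustrative assumptions:

```python
from itertools import permutations

conf = {  # confidence that idiom i fills blank j (illustrative values)
    (0, 0): 0.9, (0, 1): 0.05,
    (1, 0): 0.1, (1, 1): 0.8,
    (2, 0): 0.02, (2, 1): 0.03,
}
n, m = 3, 2
FIRST_THRESHOLD = 0.05  # the "first threshold" of step S51 (assumed value)

groups = list(permutations(range(n), m))

# Steps S51/S52: drop any answer group containing a fill whose confidence
# is below the threshold; totals are only computed for the survivors.
kept = [g for g in groups
        if all(conf[(idiom, blank)] >= FIRST_THRESHOLD
               for blank, idiom in enumerate(g))]
totals = {g: sum(conf[(idiom, blank)] for blank, idiom in enumerate(g))
          for g in kept}

print(len(groups), len(kept))  # 6 groups before pruning, 2 after
```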
  • the method further includes:
  • Step S7: filling the target answer into the fill-in text to obtain an answer text;
  • Step S8: obtaining the user's score for the answer text, and determining whether the score is greater than a second threshold, wherein the user scores the answer text against standard answers;
  • Step S9: if the score is greater than the second threshold, forming a training set from the answer text and the explanation texts of all candidate idioms in the target answer, and inputting the training set into the idiom-selection fill-in-the-blank model for retraining;
  • Step S10: saving the retrained idiom-selection fill-in-the-blank model to the blockchain.
  • In this way, users can score the answer text against the standard answer; a score for the answer text is thereby obtained, and it is then judged whether the score is greater than the second threshold. If it is, it can be concluded that the accuracy of the target answers output by the idiom-selection fill-in-the-blank model is high.
  • The corresponding answer text and the explanation texts of all candidate idioms in the target answer can then form the training set, which is input into the idiom-selection fill-in-the-blank model for retraining.
  • If the standard answer can be obtained from a teacher user, the standard answer can also be used to retrain the idiom-selection fill-in-the-blank model.
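Steps S8–S9 act as a score gate in front of the retraining set; a minimal sketch (the record fields, score scale, and threshold value are assumptions for illustration):

```python
SECOND_THRESHOLD = 80  # the "second threshold" of step S8 (assumed 0-100 scale)

# Hypothetical graded answer texts: each record carries the filled-in
# answer text, the explanation texts of its idioms, and the user's score.
graded = [
    {"answer_text": "text A", "explanations": ["expl A"], "score": 92},
    {"answer_text": "text B", "explanations": ["expl B"], "score": 55},
    {"answer_text": "text C", "explanations": ["expl C"], "score": 81},
]

# Step S9: only answers scored above the threshold are trusted enough
# to go back into the training set for retraining the model.
training_set = [(r["answer_text"], r["explanations"])
                for r in graded if r["score"] > SECOND_THRESHOLD]

print(len(training_set))  # 2 of the 3 records pass the gate
```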
  • the above-mentioned retrained idiom selection and fill-in model is saved in the blockchain.
  • Blockchain is a novel application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms.
  • A blockchain is essentially a decentralized database: a chain of data blocks linked by cryptographic methods, where each data block contains a batch of network transaction information used to verify the validity (anti-counterfeiting) of the information and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
  • the idiom fill-in-the-blank question is used to test students
  • the method further includes:
  • Step S7a: obtaining the test text uploaded by the student's terminal, wherein the test text is the fill-in text of the idiom fill-in-the-blank question with the student's answers filled in; when the student's terminal uploads the test text, it splices the student's answers to the question in order to obtain a first idiom combination, performs a hash calculation on the first idiom combination based on the blockchain to obtain the corresponding hash value, and adds the hash value to the test text as an identification code;
  • Step S8a: extracting all student answers in the test text and concatenating them in order to obtain a second idiom combination;
  • Step S9a: performing a hash calculation on the second idiom combination based on the blockchain to obtain a corresponding verification hash value;
  • Step S10a: identifying the identification code in the test text, and verifying whether the verification hash value is consistent with the identification code;
  • Step S11a: if they are consistent, comparing the student's answers with the target answer one by one to obtain the accuracy of the student's answers.
  • the above method is applied to the student test scenario.
  • When the student fills in the fill-in text of the idiom fill-in-the-blank question and uploads the test text, the terminal splices the student's answers in order to obtain the first idiom combination, performs a hash calculation on the first idiom combination to obtain the corresponding hash value, and adds the hash value to the test text as an identification code.
  • The hash calculation transforms the first idiom combination into a fixed-length hash value, and the process is irreversible: the hash value cannot be transformed back into the first idiom combination, which prevents other users from recovering the above-mentioned first idiom combination.
  • On receipt, the student's answers are extracted, combined, and hashed by the same process to obtain a verification hash value, and it is judged whether the verification hash value is consistent with the above identification code; if they are consistent, the student's test text has not been tampered with and the student's answers are valid.
  • the student's answer is compared with the target answer to obtain the accuracy of the student's answer.
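The integrity check of steps S7a–S10a can be sketched with a standard cryptographic digest; the patent does not name the hash function, so the use of SHA-256 (and the sample idioms) is an illustrative assumption:

```python
import hashlib

def identification_code(answers):
    """Splice the answers in order and hash the result (steps S7a/S9a)."""
    first_idiom_combination = "".join(answers)
    return hashlib.sha256(first_idiom_combination.encode("utf-8")).hexdigest()

# Terminal side: compute the identification code when uploading the test text.
student_answers = ["画蛇添足", "守株待兔", "亡羊补牢"]
code = identification_code(student_answers)

# Server side: re-extract the answers and recompute (steps S8a-S10a).
extracted = ["画蛇添足", "守株待兔", "亡羊补牢"]
assert identification_code(extracted) == code  # untampered: answers are valid

# Any tampering changes the verification hash, so the mismatch is detected.
tampered = ["画蛇添足", "杯弓蛇影", "亡羊补牢"]
print(identification_code(tampered) == code)  # False
```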
  • an embodiment of the present application also provides an answer selection device for idiom fill-in-the-blank questions, including:
  • the first obtaining unit 10 is configured to obtain a question text of an idiom fill-in-the-blank question; wherein the question text includes a fill-in text and n candidate idioms, and the fill-in text includes m fill-in blanks to be filled in the candidate idiom;
  • the second acquiring unit 20 is configured to acquire the interpretation texts of all the candidate idioms
  • the first output unit 30 is configured to input the blank-filled text and the explanatory text into a pre-trained idiom selection and blank-filled model to obtain the confidence that each candidate idiom is filled in each of the blanks;
  • the selection unit 40 is configured to select m idioms from the n candidate idioms and fill them into the m blanks in random arrangements to form multiple groups of answers, wherein, within each group of answers, each candidate idiom can be selected to fill a blank at most once;
  • the calculation unit 50 is configured to calculate, based on the KM algorithm and the confidence with which each candidate idiom fills each blank, the total confidence of the candidate idioms in each group of answers;
  • the second output unit 60 is configured to obtain the group of answers with the highest total confidence as the target answer, and output the target answer as the answer to the idiom fill-in-the-blank question.
  • it further includes:
  • the third acquiring unit is configured to acquire training samples, where the training samples include corpus texts of multiple idiom fill-in-the-blank questions for which answers have been selected and explanation texts of all idioms in the idiom library;
  • the training unit is used to input the training samples into a natural-language neural network based on the BERT language model for training to obtain the idiom-selection fill-in-the-blank model; wherein the natural-language neural network includes a network output layer and a convolutional layer,
  • the convolutional layer is formed by sequentially stacking multiple convolutional networks; the input layer of each later convolutional network is connected to the output layers of all preceding convolutional networks, the input layer of each convolutional network is also connected to the output layer of the BERT language model, and the output layer of the last convolutional network is connected to the network output layer.
  • the training unit is specifically used for:
  • the feature matrix is input into the natural language neural network for iterative training, and the idiom selection fill-in-the-blank model is obtained.
  • the device further includes:
  • the first judging unit is configured to judge, for each group of answers, whether there is at least one candidate idiom whose confidence of filling its blank is less than a first threshold;
  • the elimination unit is configured to, if there is, eliminate the corresponding group of answers and not calculate the total confidence of the candidate idioms in the eliminated answers.
  • the device further includes:
  • the filling unit is used to fill in the target answer into the blank text to obtain the answer text;
  • the second judgment unit is configured to obtain the user's score on the answer text, and determine whether the score is greater than a second threshold; wherein the user scores the answer text based on standard answers;
  • the retraining unit is configured to, if the score is greater than the second threshold, form a training set from the answer text and the explanation texts of all candidate idioms in the target answer, and input the training set into the idiom-selection fill-in-the-blank model for retraining;
  • the saving unit is used to save the retrained idiom selection fill-in-the-blank model to the blockchain.
  • the idiom fill-in-the-blank question is used to test students; the device further includes:
  • the fourth acquiring unit is used to acquire the test text uploaded by the student's terminal, wherein the test text is the fill-in text of the idiom fill-in-the-blank question with the student's answers filled in; when uploading the test text, the student's terminal splices the student's answers to the question in order to obtain the first idiom combination, performs a hash calculation on the first idiom combination to obtain the corresponding hash value, and adds the hash value to the test text as an identification code;
  • the extraction unit is used to extract all student answers from the test text and concatenate them in order to obtain a second idiom combination;
  • the hash calculation unit is configured to perform a hash calculation on the second idiom combination based on the blockchain to obtain a corresponding verification hash value;
  • the verification unit is used to identify the identification code in the test text and verify whether the verification hash value is consistent with the identification code;
  • the comparison unit is used to compare, if they are consistent, the student answers with the target answer one by one to obtain the accuracy of the student answers.
  • an embodiment of the present application also provides a computer device.
  • the computer device may be a server, and its internal structure may be as shown in FIG. 3.
  • the computer device includes a processor, a memory, a network interface, and a database connected through a system bus. Among them, the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, a computer program, and a database.
  • the internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium.
  • the database of the computer device is used to store idiom fill-in-the-blank questions and the like.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • when the computer program is executed by the processor, a method for selecting answers to idiom fill-in-the-blank questions is implemented, which includes: obtaining the question text of the idiom fill-in-the-blank question; wherein the question text includes a fill-in text and n candidate idioms, and the fill-in text includes m blanks to be filled with the candidate idioms;
  • FIG. 3 is only a block diagram of a part of the structure related to the solution of the present application, and does not constitute a limitation on the computer device to which the solution of the present application is applied.
  • An embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored.
  • when the computer program is executed by a processor, a method for selecting answers to idiom fill-in-the-blank questions is realized, which includes the following steps:
  • obtaining the question text of the idiom fill-in-the-blank question; wherein the question text includes a fill-in text and n candidate idioms, and the fill-in text includes m blanks to be filled with the candidate idioms;
  • the computer-readable storage medium in this embodiment may be a volatile readable storage medium or a non-volatile readable storage medium.
  • the method, apparatus, computer device, and storage medium for selecting answers to idiom fill-in-the-blank questions include: obtaining the question text of an idiom fill-in-the-blank question, wherein the question text includes a fill-in text and n candidate idioms, and the fill-in text includes m blanks to be filled with the candidate idioms; obtaining the explanation texts of all the candidate idioms; inputting the fill-in text and the explanation texts into a pre-trained idiom selection fill-in-the-blank model to obtain the confidence of each candidate idiom being filled into each blank; selecting m idioms from the n candidate idioms and filling them into the m blanks in random permutations to form multiple groups of answers; calculating, based on the KM algorithm and according to the confidence of each candidate idiom being filled into each blank, the total confidence of the candidate idioms in each group of answers being filled into the blanks; and obtaining the group of answers with the highest total confidence as the target answer and outputting the target answer as the answer to the idiom fill-in-the-blank question.
  • This application uses artificial intelligence to obtain the total confidence of the candidate idioms in each group of answers being filled into the blanks, so that the group of answers with the highest total confidence is taken as the target answer, making it possible to derive answers to idiom fill-in-the-blank questions with high efficiency and high accuracy.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Business, Economics & Management (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Security & Cryptography (AREA)
  • Tourism & Hospitality (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Educational Technology (AREA)
  • Educational Administration (AREA)
  • Databases & Information Systems (AREA)
  • Bioethics (AREA)
  • Computer Hardware Design (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Human Computer Interaction (AREA)

Abstract

This application relates to the field of artificial intelligence, and provides a method, apparatus, computer device, and storage medium for selecting answers to idiom fill-in-the-blank questions, including: obtaining the question text of an idiom fill-in-the-blank question, where the question text includes a fill-in text and n candidate idioms, and the fill-in text contains m blanks to be filled with the candidate idioms (S1); obtaining the explanation texts of all candidate idioms (S2); inputting the fill-in text and the explanation texts into a pre-trained idiom selection fill-in-the-blank model to obtain the confidence of each candidate idiom being filled into each blank (S3); selecting m idioms from the n candidate idioms to form multiple groups of answers (S4); calculating the total confidence of the candidate idioms in each group of answers being filled into the blanks (S5); and obtaining the group of answers with the highest total confidence and outputting it as the answer to the idiom fill-in-the-blank question (S6). This solution uses artificial intelligence to derive answers to idiom fill-in-the-blank questions with high efficiency and high accuracy. It can also be applied in the field of smart education to promote the construction of smart cities.

Description

Method, apparatus, and computer device for selecting answers to idiom fill-in-the-blank questions
This application claims priority to Chinese patent application No. 202010923909.X, filed with the Chinese Patent Office on September 4, 2020 and entitled "Method, apparatus, and computer device for selecting answers to idiom fill-in-the-blank questions", the entire content of which is incorporated herein by reference.
Technical Field
This application relates to the field of artificial intelligence, and in particular to a method, apparatus, computer device, and storage medium for selecting answers to idiom fill-in-the-blank questions.
Background
Idioms are fixed phrases in long-standing use in the Chinese language: concise in form, rich in meaning, and with a long history. They are a microcosm of traditional Chinese culture and come from a wide variety of sources. Idioms themselves are structurally complex and have long been a key and difficult point in Chinese language teaching, and idiom cloze (fill-in-the-blank) questions are a knowledge point that must be mastered and a common exam question in primary and secondary school Chinese learning. In the prior art, such idiom fill-in-the-blank questions have always been answered by human intelligence. When tutoring their children's homework, parents often find such difficult questions beyond their own knowledge, and therefore turn to apps and Internet searches to look up answers. However, the inventors realized that in existing apps and Internet search services, answers to idiom fill-in-the-blank questions are all collected manually and cannot be determined automatically by intelligent methods. Because the manually collected question banks are incomplete, parents or students often still cannot find an answer even after searching for a long time.
Technical Problem
The main purpose of this application is to provide a method, apparatus, computer device, and storage medium for selecting answers to idiom fill-in-the-blank questions, aiming to overcome the current defect that no answer can be given because manually collected question banks are incomplete.
Technical Solution
To achieve the above purpose, this application provides a method for selecting answers to idiom fill-in-the-blank questions, including the following steps:
obtaining the question text of an idiom fill-in-the-blank question; where the question text includes a fill-in text and n candidate idioms, and the fill-in text includes m blanks to be filled with the candidate idioms;
obtaining the explanation texts of all the candidate idioms;
inputting the fill-in text and the explanation texts into a pre-trained idiom selection fill-in-the-blank model to obtain the confidence of each candidate idiom being filled into each blank;
selecting m idioms from the n candidate idioms and filling them into the m blanks in random permutations to form multiple groups of answers; where, in each group of answers, each candidate idiom can be selected to fill a blank at most once;
calculating, based on the KM algorithm and according to the confidence of each candidate idiom being filled into each blank, the total confidence of the candidate idioms in each group of answers being filled into the blanks;
obtaining the group of answers with the highest total confidence as the target answer, and outputting the target answer as the answer to the idiom fill-in-the-blank question.
This application also provides an apparatus for selecting answers to idiom fill-in-the-blank questions, including:
a first acquiring unit, used to obtain the question text of an idiom fill-in-the-blank question; where the question text includes a fill-in text and n candidate idioms, and the fill-in text includes m blanks to be filled with the candidate idioms;
a second acquiring unit, used to obtain the explanation texts of all the candidate idioms;
a first output unit, used to input the fill-in text and the explanation texts into a pre-trained idiom selection fill-in-the-blank model to obtain the confidence of each candidate idiom being filled into each blank;
a selection unit, used to select m idioms from the n candidate idioms and fill them into the m blanks in random permutations to form multiple groups of answers; where, in each group of answers, each candidate idiom can be selected to fill a blank at most once;
a calculation unit, used to calculate, based on the KM algorithm and according to the confidence of each candidate idiom being filled into each blank, the total confidence of the candidate idioms in each group of answers being filled into the blanks;
a second output unit, used to obtain the group of answers with the highest total confidence as the target answer and output the target answer as the answer to the idiom fill-in-the-blank question.
This application also provides a computer device, including a memory and a processor. The memory stores a computer program, and when the processor executes the computer program, a method for selecting answers to idiom fill-in-the-blank questions is implemented, including the following steps:
obtaining the question text of an idiom fill-in-the-blank question; where the question text includes a fill-in text and n candidate idioms, and the fill-in text includes m blanks to be filled with the candidate idioms;
obtaining the explanation texts of all the candidate idioms;
inputting the fill-in text and the explanation texts into a pre-trained idiom selection fill-in-the-blank model to obtain the confidence of each candidate idiom being filled into each blank;
selecting m idioms from the n candidate idioms and filling them into the m blanks in random permutations to form multiple groups of answers; where, in each group of answers, each candidate idiom can be selected to fill a blank at most once;
calculating, based on the KM algorithm and according to the confidence of each candidate idiom being filled into each blank, the total confidence of the candidate idioms in each group of answers being filled into the blanks;
obtaining the group of answers with the highest total confidence as the target answer, and outputting the target answer as the answer to the idiom fill-in-the-blank question.
This application also provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, a method for selecting answers to idiom fill-in-the-blank questions is implemented, including the following steps:
obtaining the question text of an idiom fill-in-the-blank question; where the question text includes a fill-in text and n candidate idioms, and the fill-in text includes m blanks to be filled with the candidate idioms;
obtaining the explanation texts of all the candidate idioms;
inputting the fill-in text and the explanation texts into a pre-trained idiom selection fill-in-the-blank model to obtain the confidence of each candidate idiom being filled into each blank;
selecting m idioms from the n candidate idioms and filling them into the m blanks in random permutations to form multiple groups of answers; where, in each group of answers, each candidate idiom can be selected to fill a blank at most once;
calculating, based on the KM algorithm and according to the confidence of each candidate idiom being filled into each blank, the total confidence of the candidate idioms in each group of answers being filled into the blanks;
obtaining the group of answers with the highest total confidence as the target answer, and outputting the target answer as the answer to the idiom fill-in-the-blank question.
Beneficial Effects
The method, apparatus, computer device, and storage medium for selecting answers to idiom fill-in-the-blank questions provided in this application: obtain the question text of an idiom fill-in-the-blank question, where the question text includes a fill-in text and n candidate idioms, and the fill-in text includes m blanks to be filled with the candidate idioms; obtain the explanation texts of all the candidate idioms; input the fill-in text and the explanation texts into a pre-trained idiom selection fill-in-the-blank model to obtain the confidence of each candidate idiom being filled into each blank; select m idioms from the n candidate idioms and fill them into the m blanks in random permutations to form multiple groups of answers; calculate, according to the confidence of each candidate idiom being filled into each blank, the total confidence of the candidate idioms in each group of answers being filled into the blanks; and obtain the group of answers with the highest total confidence as the target answer and output the target answer as the answer to the idiom fill-in-the-blank question. This application uses artificial intelligence to obtain the total confidence of the candidate idioms in each group of answers being filled into the blanks, so that the group with the highest total confidence is taken as the target answer, making it possible to derive answers to idiom fill-in-the-blank questions with high efficiency and high accuracy.
Description of the Drawings
FIG. 1 is a schematic diagram of the steps of a method for selecting answers to idiom fill-in-the-blank questions in an embodiment of this application;
FIG. 2 is a structural block diagram of an apparatus for selecting answers to idiom fill-in-the-blank questions in an embodiment of this application;
FIG. 3 is a schematic structural block diagram of a computer device in an embodiment of this application.
The realization of the purpose, functional features, and advantages of this application will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Best Mode for Carrying Out the Invention
Referring to FIG. 1, an embodiment of this application provides a method for selecting answers to idiom fill-in-the-blank questions, including the following steps:
Step S1, obtaining the question text of an idiom fill-in-the-blank question; where the question text includes a fill-in text and n candidate idioms, and the fill-in text includes m blanks to be filled with the candidate idioms;
Step S2, obtaining the explanation texts of all the candidate idioms;
Step S3, inputting the fill-in text and the explanation texts into a pre-trained idiom selection fill-in-the-blank model to obtain the confidence of each candidate idiom being filled into each blank;
Step S4, selecting m idioms from the n candidate idioms and filling them into the m blanks in random permutations to form multiple groups of answers; where, in each group of answers, each candidate idiom can be selected to fill a blank at most once;
Step S5, calculating, based on the KM algorithm and according to the confidence of each candidate idiom being filled into each blank, the total confidence of the candidate idioms in each group of answers being filled into the blanks;
Step S6, obtaining the group of answers with the highest total confidence as the target answer, and outputting the target answer as the answer to the idiom fill-in-the-blank question.
In this embodiment, the above method is applied to the scenario of outputting answers to idiom fill-in-the-blank questions, using artificial intelligence to output those answers with high efficiency and accuracy. The solution in this application can also be applied in the field of smart education to promote the construction of smart cities.
As described in step S1 above, the question text of the idiom fill-in-the-blank question is usually electronic text from a student's test paper. The question text includes a fill-in text composed of multiple sentences as well as multiple candidate idioms (i.e., candidate answers), from which students can choose idioms to fill into the blanks of the fill-in text.
As described in step S2 above, a database (such as an idiom dictionary) stores the explanation texts of all idioms (i.e., texts explaining the semantics of the idioms), and the explanation texts of all candidate idioms can be obtained from this database.
As described in step S3 above, an idiom selection fill-in-the-blank model is pre-trained. The model is obtained by training a natural language neural network based on the BERT language model, and it is used to predict the confidence that each idiom, when filled into the idiom fill-in-the-blank question, is the correct answer. Whether an idiom filled into the question is the correct answer is judged by using the model to compute the semantic coherence between the idiom's explanation text and the sentences in the question, thereby determining the corresponding confidence.
As described in step S4 above, m idioms are selected from the n candidate idioms and filled into the m blanks in random permutations to form multiple groups of answers. In this embodiment, mathematical random permutation is used to select m idioms from the n candidate idioms and arrange them into multiple combinations, each combination serving as one pre-selected answer to be filled into the idiom fill-in-the-blank question. It can be understood that the number of such combinations is n*(n-1)*(n-2)*...*(n-m+1).
As described in step S5 above, since the confidence of every candidate idiom for every blank has already been computed in step S3, the total confidence of the candidate idioms in each group of answers being filled into the blanks can be calculated. In this embodiment, this total confidence is computed based on the KM algorithm; since the KM (Kuhn-Munkres) algorithm is a general-purpose algorithm, it is not described in detail here.
As described in step S6 above, the group of answers with the highest total confidence is obtained and taken as the target answer. It can be understood that the confidence expresses the degree of certainty that a candidate idiom filled into a blank is correct; obviously, the higher the confidence, the closer it is to the correct answer. Therefore, when the total confidence is the highest, the corresponding target answer is also closest to the correct answer, and the target answer is finally output as the answer to the idiom fill-in-the-blank question. In this embodiment, the idiom selection fill-in-the-blank model obtained through artificial-intelligence deep learning produces the confidence of each candidate idiom for each blank, and the target answer to the idiom fill-in-the-blank question is then output with high efficiency and accuracy based on those confidences. In this process, the corresponding target answer is output automatically, without manually searching for answers and without the user needing professional idiom knowledge, which not only improves the efficiency and accuracy of finding answers but also reduces labor costs.
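To make steps S4 to S6 concrete, the following Python sketch enumerates every ordered selection of m idioms for the m blanks and keeps the group with the highest total confidence. The confidence matrix here is hypothetical, and the patent finds this maximum-weight assignment with the KM (Kuhn-Munkres) algorithm rather than by brute force; the exhaustive search below only illustrates the objective being maximized:

```python
from itertools import permutations

def best_answer(conf, m):
    """Enumerate all n*(n-1)*...*(n-m+1) ordered selections of m idioms
    for the m blanks and return the group with the highest total
    confidence. conf[i][j] is the confidence of idiom i in blank j."""
    n = len(conf)
    best_score, best_group = float("-inf"), None
    for group in permutations(range(n), m):
        # group[j] is the idiom filled into blank j; permutations()
        # guarantees each idiom is used at most once per group.
        score = sum(conf[idiom][blank] for blank, idiom in enumerate(group))
        if score > best_score:
            best_score, best_group = score, group
    return best_group, best_score

# Hypothetical confidences for n=3 candidate idioms and m=2 blanks.
conf = [[0.9, 0.1],
        [0.2, 0.8],
        [0.4, 0.3]]
group, total = best_answer(conf, m=2)
print(group)  # (0, 1): idiom 0 fills blank 0, idiom 1 fills blank 1
```

The brute force grows factorially with m; the KM algorithm named in step S5 solves the same maximum-weight bipartite matching in polynomial time.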
In an embodiment, before step S3 of inputting the fill-in text and the explanation texts into the pre-trained idiom selection fill-in-the-blank model to obtain the confidence of each candidate idiom being filled into each blank, the method includes:
Step S01, obtaining training samples, where the training samples include the corpus texts of multiple idiom fill-in-the-blank questions whose answers have already been selected and the explanation texts of all idioms in an idiom database;
Step S02, inputting the training samples into a natural language neural network based on the BERT language model for training to obtain the idiom selection fill-in-the-blank model; where the natural language neural network includes a network output layer and a convolutional layer, the convolutional layer is formed by stacking multiple convolutional networks in sequence, the input layer of each subsequent convolutional network is connected to the output layers of all preceding convolutional networks, the output layer of each convolutional network is connected to the output layer of the BERT language model, and the output layer of the last convolutional network is connected to the network output layer. It can be understood that the feature matrix output by the output layer of the BERT language model is fed separately into every layer of the convolutional networks, and at the same time the output layer of each convolutional network is connected to the input layers of all subsequent convolutional networks. In this embodiment, this improvement to the model structure increases the depth of the neural network and gives it stronger feature extraction capability.
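The dense connectivity described in step S02 can be sketched as follows. This is a simplified stand-in, not the patented network: plain random linear maps with ReLU replace real convolutions, and only the wiring (layer k consumes the BERT feature matrix concatenated with the outputs of all earlier layers, and the last layer feeds the network output layer) reflects the structure described above:

```python
import numpy as np

rng = np.random.default_rng(0)

def dense_conv_stack(bert_features, layer_dims):
    """Sketch of the densely connected stack: every layer receives the
    BERT feature matrix plus the outputs of all earlier layers, and the
    final layer's output would feed the network output layer."""
    outputs = []
    for dim in layer_dims:
        # Input = BERT features concatenated with all previous outputs.
        x = np.concatenate([bert_features] + outputs, axis=-1)
        w = rng.standard_normal((x.shape[-1], dim))
        outputs.append(np.maximum(x @ w, 0.0))  # ReLU stand-in for a conv block
    return outputs[-1]

feats = rng.standard_normal((8, 16))   # 8 tokens, 16-dim BERT features
out = dense_conv_stack(feats, [32, 32, 32])
print(out.shape)  # (8, 32)
```

The concatenation pattern is the same one popularized by densely connected networks; here it additionally re-injects the BERT features at every layer, as the embodiment specifies.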
In this embodiment, step S02 of inputting the training samples into the natural language neural network based on the BERT language model for training to obtain the idiom selection fill-in-the-blank model specifically includes:
a. inputting the training samples into the BERT language model and, based on the BERT language model, extracting the feature matrix of the training samples;
b. inputting the feature matrix into the natural language neural network for iterative training to obtain the idiom selection fill-in-the-blank model.
In an embodiment, the BERT language model is obtained by running a self-supervised learning method on a massive corpus, and it learns a good feature vector representation for words; here, self-supervised learning refers to supervised learning run on data without manual annotation.
In another embodiment, the BERT language model is pre-trained using two unsupervised prediction tasks: Masked LM and Next Sentence Prediction.
The network architecture of the BERT language model uses a multi-layer Transformer structure. Its most notable characteristic is that it abandons the traditional RNN and CNN and, through the attention mechanism, reduces the distance between two words at arbitrary positions to 1, effectively solving the thorny long-term dependency problem in NLP. Since the Transformer structure has been widely applied in the NLP field, it is not described in detail here.
In NLP (natural language processing) tasks, the vector representation of the feature matrix output by the BERT language model can be used directly as word embedding features for the NLP task. The BERT language model provides a model for transfer learning to other tasks; it can be fine-tuned for a task, or frozen and used as a feature extractor.
In an embodiment, before step S5 of calculating, based on the KM algorithm and according to the confidence of each candidate idiom being filled into each blank, the total confidence of the candidate idioms in each group of answers being filled into the blanks, the method includes:
Step S51, judging, for each group of answers, whether there is at least one candidate idiom whose confidence of being filled into the blank is less than a first threshold;
Step S52, if there is, eliminating the corresponding answer, without calculating the total confidence of the candidate idioms in the eliminated answer being filled into the blanks.
In this embodiment, since the number of composed answers is n*(n-1)*(n-2)*...*(n-m+1), when n and m are large the number of answers is also very large. If the total confidence were calculated separately for every group of answers, the computation load would be heavy and computational efficiency would suffer. It can be understood that among the composed answers, some answers are obviously incorrect: for example, if in a certain group of answers a candidate idiom filled into a blank is obviously unsuitable, it can be concluded that the group is not a completely correct answer. Therefore, it is only necessary to judge whether each answer contains an obviously unsuitable candidate idiom, eliminate that group of answers, and skip its total-confidence calculation, which significantly reduces the amount of computation. In this embodiment, once the confidence of each candidate idiom for each blank has been obtained, whether a candidate idiom is obviously wrong for a certain blank can be judged by whether its confidence value is too low (less than a threshold, namely the first threshold).
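A minimal sketch of this pruning step in Python, assuming the per-blank confidences have already been produced by the model; the confidence values and the first threshold used here are illustrative only:

```python
def prune_groups(groups, conf, first_threshold):
    """Keep only answer groups in which every candidate idiom's
    per-blank confidence reaches the first threshold; pruned groups
    never have their total confidence computed."""
    return [
        group for group in groups
        if all(conf[idiom][blank] >= first_threshold
               for blank, idiom in enumerate(group))
    ]

# Illustrative confidences: conf[i][j] = confidence of idiom i in blank j.
conf = [[0.9, 0.1], [0.2, 0.8], [0.4, 0.3]]
groups = [(0, 1), (1, 0), (2, 1)]
kept = prune_groups(groups, conf, first_threshold=0.3)
print(kept)  # (1, 0) is dropped: idiom 1 scores only 0.2 in blank 0
```

Because a single low-confidence idiom disqualifies the whole group, the costly total-confidence step runs only over the surviving groups.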
In an embodiment, after step S6 of obtaining the group of answers with the highest total confidence as the target answer and outputting the target answer as the answer to the idiom fill-in-the-blank question, the method further includes:
Step S7, filling the target answer into the fill-in text to obtain an answer text;
Step S8, obtaining a user's score for the answer text and judging whether the score is greater than a second threshold; where the user scores the answer text based on a standard answer;
Step S9, if the score is greater than the second threshold, composing a training set from the answer text and the explanation texts of all candidate idioms in the target answer, and inputting the training set into the idiom selection fill-in-the-blank model for retraining;
Step S10, saving the retrained idiom selection fill-in-the-blank model to the blockchain.
In this embodiment, because the question types of idiom fill-in-the-blank questions keep changing, the specific text content of their question texts also differs from the corpus in the original training samples. Therefore, to improve the computational accuracy of the idiom selection fill-in-the-blank model in subsequent scenarios and to enhance its generalization ability, the model needs to be retrained. Before retraining, it should be confirmed that the target answer is as correct as possible, so a user (a teacher, etc.) can score the answer text based on a standard answer; the score for the answer text is obtained, and it is judged whether the score is greater than the second threshold. If it is, it can be determined that the target answers output by the idiom selection fill-in-the-blank model have a high accuracy rate, and the corresponding answer text together with the explanation texts of all candidate idioms in the target answer can be composed into a training set and input into the idiom selection fill-in-the-blank model for retraining. In other embodiments, if standard answers can be obtained from a teacher user, the standard answers can be used to retrain the idiom selection fill-in-the-blank model.
To strengthen the security of the idiom selection fill-in-the-blank model, the retrained model is saved to a blockchain. A blockchain is a new application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks generated in association using cryptographic methods, where each data block contains the information of a batch of network transactions and is used to verify the validity (anti-counterfeiting) of its information and to generate the next block. A blockchain may include a blockchain underlying platform, a platform product service layer, and an application service layer.
In an embodiment, the idiom fill-in-the-blank question is used to test students;
after step S6 of obtaining the group of answers with the highest total confidence as the target answer and outputting the target answer as the answer to the idiom fill-in-the-blank question, the method further includes:
Step S7a, obtaining the test text uploaded by the terminal where the student is located; where the test text is the fill-in text of the idiom fill-in-the-blank question with the student's answers filled in; when uploading the test text, the student's terminal concatenates, in order, the student answers filled into the idiom fill-in-the-blank question to obtain a first idiom combination, performs a hash calculation on the first idiom combination based on the blockchain to obtain a corresponding hash value, and adds the hash value to the test text as an identification code;
Step S8a, extracting all student answers from the test text and concatenating them in order to obtain a second idiom combination;
Step S9a, performing a hash calculation on the second idiom combination based on the blockchain to obtain a corresponding verification hash value;
Step S10a, identifying the identification code in the test text and verifying whether the verification hash value is consistent with the identification code;
Step S11a, if they are consistent, comparing the student answers with the target answer one by one to obtain the accuracy of the student answers.
In this embodiment, the above method is applied to a student testing scenario. To measure a student's true idiom proficiency, the student answers submitted must be guaranteed to be intact and untampered with. Therefore, after the student fills the student answers into the fill-in text of the idiom fill-in-the-blank question and submits the test text containing those answers through a terminal, the terminal concatenates the student answers filled into the idiom fill-in-the-blank question in order to obtain the first idiom combination, performs a hash calculation on the first idiom combination to obtain a corresponding hash value, and adds the hash value to the test text as an identification code. The hash calculation transforms the first idiom combination into a fixed-length hash value, and the process is irreversible: the hash value cannot be transformed back into the first idiom combination, which prevents other users who obtain the hash value from deriving the student answers from it. The identification code added to the test text then serves as the basis for verifying whether the test text has been tampered with.
When the test text uploaded by the student's terminal is obtained, the student answers are extracted and combined and hashed through the same process to obtain a verification hash value, and it is then judged whether the verification hash value is consistent with the identification code. If they are consistent, the student's test text has not been tampered with and the student answers are valid. The student answers are then compared with the target answer to obtain the accuracy of the student answers.
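The concatenate-hash-verify flow can be sketched with Python's standard hashlib. SHA-256 is an assumed choice, since the embodiment only specifies a hash calculation, and the sample idioms are placeholders:

```python
import hashlib

def tag_answers(answers):
    """Concatenate the student answers in order (the 'first idiom
    combination') and hash the result; the hex digest serves as the
    identification code attached to the test text."""
    combo = "".join(answers)
    return hashlib.sha256(combo.encode("utf-8")).hexdigest()

def verify(answers, identification_code):
    """Recompute the hash over the answers extracted from the test text
    (the 'second idiom combination') and compare it with the code."""
    return tag_answers(answers) == identification_code

code = tag_answers(["一马当先", "画龙点睛"])
assert verify(["一马当先", "画龙点睛"], code)      # untampered: accepted
assert not verify(["画龙点睛", "一马当先"], code)  # reordered: rejected
```

Because the digest is one-way, publishing the identification code does not reveal the answers, which matches the irreversibility property the embodiment relies on.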
Referring to FIG. 2, an embodiment of this application further provides an apparatus for selecting answers to idiom fill-in-the-blank questions, including:
a first acquiring unit 10, used to obtain the question text of an idiom fill-in-the-blank question; where the question text includes a fill-in text and n candidate idioms, and the fill-in text includes m blanks to be filled with the candidate idioms;
a second acquiring unit 20, used to obtain the explanation texts of all the candidate idioms;
a first output unit 30, used to input the fill-in text and the explanation texts into a pre-trained idiom selection fill-in-the-blank model to obtain the confidence of each candidate idiom being filled into each blank;
a selection unit 40, used to select m idioms from the n candidate idioms and fill them into the m blanks in random permutations to form multiple groups of answers; where, in each group of answers, each candidate idiom can be selected to fill a blank at most once;
a calculation unit 50, used to calculate, based on the KM algorithm and according to the confidence of each candidate idiom being filled into each blank, the total confidence of the candidate idioms in each group of answers being filled into the blanks;
a second output unit 60, used to obtain the group of answers with the highest total confidence as the target answer and output the target answer as the answer to the idiom fill-in-the-blank question.
In an embodiment, the apparatus further includes:
a third acquiring unit, used to obtain training samples, where the training samples include the corpus texts of multiple idiom fill-in-the-blank questions whose answers have already been selected and the explanation texts of all idioms in an idiom database;
a training unit, used to input the training samples into a natural language neural network based on the BERT language model for training to obtain the idiom selection fill-in-the-blank model; where the natural language neural network includes a network output layer and a convolutional layer, the convolutional layer is formed by stacking multiple convolutional networks in sequence, the input layer of each subsequent convolutional network is connected to the output layers of all preceding convolutional networks, the output layer of each convolutional network is connected to the output layer of the BERT language model, and the output layer of the last convolutional network is connected to the network output layer.
In an embodiment, the training unit is specifically used to:
input the training samples into the BERT language model and, based on the BERT language model, extract the feature matrix of the training samples;
input the feature matrix into the natural language neural network for iterative training to obtain the idiom selection fill-in-the-blank model.
In an embodiment, the apparatus further includes:
a first judging unit, used to judge, for each group of answers, whether there is at least one candidate idiom whose confidence of being filled into the blank is less than a first threshold;
an elimination unit, used to eliminate the corresponding answer if there is, without calculating the total confidence of the candidate idioms in the eliminated answer being filled into the blanks.
In an embodiment, the apparatus further includes:
a filling unit, used to fill the target answer into the fill-in text to obtain an answer text;
a second judging unit, used to obtain a user's score for the answer text and judge whether the score is greater than a second threshold; where the user scores the answer text based on a standard answer;
a retraining unit, used to, if the score is greater than the second threshold, compose a training set from the answer text and the explanation texts of all candidate idioms in the target answer, and input the training set into the idiom selection fill-in-the-blank model for retraining;
a saving unit, used to save the retrained idiom selection fill-in-the-blank model to the blockchain.
In an embodiment, the idiom fill-in-the-blank question is used to test students; the apparatus further includes:
a fourth acquiring unit, used to obtain the test text uploaded by the terminal where the student is located; where the test text is the fill-in text of the idiom fill-in-the-blank question with the student's answers filled in; when uploading the test text, the student's terminal concatenates, in order, the student answers filled into the idiom fill-in-the-blank question to obtain a first idiom combination, performs a hash calculation on the first idiom combination to obtain a corresponding hash value, and adds the hash value to the test text as an identification code;
an extraction unit, used to extract all student answers from the test text and concatenate them in order to obtain a second idiom combination;
a hash calculation unit, used to perform a hash calculation on the second idiom combination based on the blockchain to obtain a corresponding verification hash value;
a verification unit, used to identify the identification code in the test text and verify whether the verification hash value is consistent with the identification code;
a comparison unit, used to compare, if they are consistent, the student answers with the target answer one by one to obtain the accuracy of the student answers.
In this embodiment, for the specific implementation of each unit in the above apparatus embodiment, please refer to the description in the above method embodiment; details are not repeated here.
Referring to FIG. 3, an embodiment of this application further provides a computer device, which may be a server and whose internal structure may be as shown in FIG. 3. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus, where the processor of the computer device is used to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The database of the computer device is used to store idiom fill-in-the-blank questions and the like. The network interface of the computer device is used to communicate with an external terminal through a network connection. When the computer program is executed by the processor, a method for selecting answers to idiom fill-in-the-blank questions is implemented, including the following steps:
obtaining the question text of an idiom fill-in-the-blank question; where the question text includes a fill-in text and n candidate idioms, and the fill-in text includes m blanks to be filled with the candidate idioms;
obtaining the explanation texts of all the candidate idioms;
inputting the fill-in text and the explanation texts into a pre-trained idiom selection fill-in-the-blank model to obtain the confidence of each candidate idiom being filled into each blank;
selecting m idioms from the n candidate idioms and filling them into the m blanks in random permutations to form multiple groups of answers; where, in each group of answers, each candidate idiom can be selected to fill a blank at most once;
calculating, based on the KM algorithm and according to the confidence of each candidate idiom being filled into each blank, the total confidence of the candidate idioms in each group of answers being filled into the blanks;
obtaining the group of answers with the highest total confidence as the target answer, and outputting the target answer as the answer to the idiom fill-in-the-blank question.
Those skilled in the art can understand that the structure shown in FIG. 3 is only a block diagram of part of the structure related to the solution of this application and does not constitute a limitation on the computer device to which the solution of this application is applied.
An embodiment of this application further provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, a method for selecting answers to idiom fill-in-the-blank questions is implemented, including the following steps:
obtaining the question text of an idiom fill-in-the-blank question; where the question text includes a fill-in text and n candidate idioms, and the fill-in text includes m blanks to be filled with the candidate idioms;
obtaining the explanation texts of all the candidate idioms;
inputting the fill-in text and the explanation texts into a pre-trained idiom selection fill-in-the-blank model to obtain the confidence of each candidate idiom being filled into each blank;
selecting m idioms from the n candidate idioms and filling them into the m blanks in random permutations to form multiple groups of answers; where, in each group of answers, each candidate idiom can be selected to fill a blank at most once;
calculating, based on the KM algorithm and according to the confidence of each candidate idiom being filled into each blank, the total confidence of the candidate idioms in each group of answers being filled into the blanks;
obtaining the group of answers with the highest total confidence as the target answer, and outputting the target answer as the answer to the idiom fill-in-the-blank question.
It can be understood that the computer-readable storage medium in this embodiment may be a volatile readable storage medium or a non-volatile readable storage medium.
In summary, the method, apparatus, computer device, and storage medium for selecting answers to idiom fill-in-the-blank questions provided in the embodiments of this application include: obtaining the question text of an idiom fill-in-the-blank question, where the question text includes a fill-in text and n candidate idioms, and the fill-in text includes m blanks to be filled with the candidate idioms; obtaining the explanation texts of all the candidate idioms; inputting the fill-in text and the explanation texts into a pre-trained idiom selection fill-in-the-blank model to obtain the confidence of each candidate idiom being filled into each blank; selecting m idioms from the n candidate idioms and filling them into the m blanks in random permutations to form multiple groups of answers; calculating, based on the KM algorithm and according to the confidence of each candidate idiom being filled into each blank, the total confidence of the candidate idioms in each group of answers being filled into the blanks; and obtaining the group of answers with the highest total confidence as the target answer and outputting the target answer as the answer to the idiom fill-in-the-blank question. This application uses artificial intelligence to obtain the total confidence of the candidate idioms in each group of answers being filled into the blanks, so that the group with the highest total confidence is taken as the target answer, making it possible to derive answers to idiom fill-in-the-blank questions with high efficiency and high accuracy.

Claims (20)

  1. A method for selecting answers to idiom fill-in-the-blank questions, comprising the following steps:
    obtaining the question text of an idiom fill-in-the-blank question; wherein the question text includes a fill-in text and n candidate idioms, and the fill-in text includes m blanks to be filled with the candidate idioms;
    obtaining the explanation texts of all the candidate idioms;
    inputting the fill-in text and the explanation texts into a pre-trained idiom selection fill-in-the-blank model to obtain the confidence of each candidate idiom being filled into each blank;
    selecting m idioms from the n candidate idioms and filling them into the m blanks in random permutations to form multiple groups of answers; wherein, in each group of answers, each candidate idiom can be selected to fill a blank at most once;
    calculating, based on the KM algorithm and according to the confidence of each candidate idiom being filled into each blank, the total confidence of the candidate idioms in each group of answers being filled into the blanks;
    obtaining the group of answers with the highest total confidence as the target answer, and outputting the target answer as the answer to the idiom fill-in-the-blank question.
  2. The method for selecting answers to idiom fill-in-the-blank questions according to claim 1, wherein, before the step of inputting the fill-in text and the explanation texts into the pre-trained idiom selection fill-in-the-blank model to obtain the confidence of each candidate idiom being filled into each blank, the method comprises:
    obtaining training samples, wherein the training samples include the corpus texts of multiple idiom fill-in-the-blank questions whose answers have already been selected and the explanation texts of all idioms in an idiom database;
    inputting the training samples into a natural language neural network based on the BERT language model for training to obtain the idiom selection fill-in-the-blank model; wherein the natural language neural network includes a network output layer and a convolutional layer, the convolutional layer is formed by stacking multiple convolutional networks in sequence, the input layer of each subsequent convolutional network is connected to the output layers of all preceding convolutional networks, the output layer of each convolutional network is connected to the output layer of the BERT language model, and the output layer of the last convolutional network is connected to the network output layer.
  3. The method for selecting answers to idiom fill-in-the-blank questions according to claim 2, wherein the step of inputting the training samples into the natural language neural network based on the BERT language model for training to obtain the idiom selection fill-in-the-blank model comprises:
    inputting the training samples into the BERT language model and, based on the BERT language model, extracting the feature matrix of the training samples;
    inputting the feature matrix into the natural language neural network for iterative training to obtain the idiom selection fill-in-the-blank model.
  4. The method for selecting answers to idiom fill-in-the-blank questions according to claim 1, wherein, before the step of calculating, based on the KM algorithm and according to the confidence of each candidate idiom being filled into each blank, the total confidence of the candidate idioms in each group of answers being filled into the blanks, the method comprises:
    judging, for each group of answers, whether there is at least one candidate idiom whose confidence of being filled into the blank is less than a first threshold;
    if there is, eliminating the corresponding answer, without calculating the total confidence of the candidate idioms in the eliminated answer being filled into the blanks.
  5. The method for selecting answers to idiom fill-in-the-blank questions according to claim 1, wherein, after the step of obtaining the group of answers with the highest total confidence as the target answer and outputting the target answer as the answer to the idiom fill-in-the-blank question, the method further comprises:
    filling the target answer into the fill-in text to obtain an answer text;
    obtaining a user's score for the answer text and judging whether the score is greater than a second threshold; wherein the user scores the answer text based on a standard answer;
    if the score is greater than the second threshold, composing a training set from the answer text and the explanation texts of all candidate idioms in the target answer, and inputting the training set into the idiom selection fill-in-the-blank model for retraining;
    saving the retrained idiom selection fill-in-the-blank model to the blockchain.
  6. The method for selecting answers to idiom fill-in-the-blank questions according to claim 1, wherein the idiom fill-in-the-blank question is used to test students;
    after the step of obtaining the group of answers with the highest total confidence as the target answer and outputting the target answer as the answer to the idiom fill-in-the-blank question, the method further comprises:
    obtaining the test text uploaded by the terminal where the student is located; wherein the test text is the fill-in text of the idiom fill-in-the-blank question with the student's answers filled in; when uploading the test text, the student's terminal concatenates, in order, the student answers filled into the idiom fill-in-the-blank question to obtain a first idiom combination, performs a hash calculation on the first idiom combination based on the blockchain to obtain a corresponding hash value, and adds the hash value to the test text as an identification code;
    extracting all student answers from the test text and concatenating them in order to obtain a second idiom combination;
    performing a hash calculation on the second idiom combination based on the blockchain to obtain a corresponding verification hash value;
    identifying the identification code in the test text and verifying whether the verification hash value is consistent with the identification code;
    if they are consistent, comparing the student answers with the target answer one by one to obtain the accuracy of the student answers.
  7. An apparatus for selecting answers to idiom fill-in-the-blank questions, comprising:
    a first acquiring unit, used to obtain the question text of an idiom fill-in-the-blank question; wherein the question text includes a fill-in text and n candidate idioms, and the fill-in text includes m blanks to be filled with the candidate idioms;
    a second acquiring unit, used to obtain the explanation texts of all the candidate idioms;
    a first output unit, used to input the fill-in text and the explanation texts into a pre-trained idiom selection fill-in-the-blank model to obtain the confidence of each candidate idiom being filled into each blank;
    a selection unit, used to select m idioms from the n candidate idioms and fill them into the m blanks in random permutations to form multiple groups of answers; wherein, in each group of answers, each candidate idiom can be selected to fill a blank at most once;
    a calculation unit, used to calculate, based on the KM algorithm and according to the confidence of each candidate idiom being filled into each blank, the total confidence of the candidate idioms in each group of answers being filled into the blanks;
    a second output unit, used to obtain the group of answers with the highest total confidence as the target answer and output the target answer as the answer to the idiom fill-in-the-blank question.
  8. The apparatus for selecting answers to idiom fill-in-the-blank questions according to claim 7, further comprising:
    a third acquiring unit, used to obtain training samples, wherein the training samples include the corpus texts of multiple idiom fill-in-the-blank questions whose answers have already been selected and the explanation texts of all idioms in an idiom database;
    a training unit, used to input the training samples into a natural language neural network based on the BERT language model for training to obtain the idiom selection fill-in-the-blank model; wherein the natural language neural network includes a network output layer and a convolutional layer, the convolutional layer is formed by stacking multiple convolutional networks in sequence, the input layer of each subsequent convolutional network is connected to the output layers of all preceding convolutional networks, the output layer of each convolutional network is connected to the output layer of the BERT language model, and the output layer of the last convolutional network is connected to the network output layer.
  9. A computer device, comprising a memory and a processor, the memory storing a computer program, wherein, when the processor executes the computer program, a method for selecting answers to idiom fill-in-the-blank questions is implemented, comprising the following steps:
    obtaining the question text of an idiom fill-in-the-blank question; wherein the question text includes a fill-in text and n candidate idioms, and the fill-in text includes m blanks to be filled with the candidate idioms;
    obtaining the explanation texts of all the candidate idioms;
    inputting the fill-in text and the explanation texts into a pre-trained idiom selection fill-in-the-blank model to obtain the confidence of each candidate idiom being filled into each blank;
    selecting m idioms from the n candidate idioms and filling them into the m blanks in random permutations to form multiple groups of answers; wherein, in each group of answers, each candidate idiom can be selected to fill a blank at most once;
    calculating, based on the KM algorithm and according to the confidence of each candidate idiom being filled into each blank, the total confidence of the candidate idioms in each group of answers being filled into the blanks;
    obtaining the group of answers with the highest total confidence as the target answer, and outputting the target answer as the answer to the idiom fill-in-the-blank question.
  10. The computer device according to claim 9, wherein, before the step of inputting the fill-in text and the explanation texts into the pre-trained idiom selection fill-in-the-blank model to obtain the confidence of each candidate idiom being filled into each blank, the method comprises:
    obtaining training samples, wherein the training samples include the corpus texts of multiple idiom fill-in-the-blank questions whose answers have already been selected and the explanation texts of all idioms in an idiom database;
    inputting the training samples into a natural language neural network based on the BERT language model for training to obtain the idiom selection fill-in-the-blank model; wherein the natural language neural network includes a network output layer and a convolutional layer, the convolutional layer is formed by stacking multiple convolutional networks in sequence, the input layer of each subsequent convolutional network is connected to the output layers of all preceding convolutional networks, the output layer of each convolutional network is connected to the output layer of the BERT language model, and the output layer of the last convolutional network is connected to the network output layer.
  11. The computer device according to claim 10, wherein the step of inputting the training samples into the natural language neural network based on the BERT language model for training to obtain the idiom selection fill-in-the-blank model comprises:
    inputting the training samples into the BERT language model and, based on the BERT language model, extracting the feature matrix of the training samples;
    inputting the feature matrix into the natural language neural network for iterative training to obtain the idiom selection fill-in-the-blank model.
  12. The computer device according to claim 9, wherein, before the step of calculating, based on the KM algorithm and according to the confidence of each candidate idiom being filled into each blank, the total confidence of the candidate idioms in each group of answers being filled into the blanks, the method comprises:
    judging, for each group of answers, whether there is at least one candidate idiom whose confidence of being filled into the blank is less than a first threshold;
    if there is, eliminating the corresponding answer, without calculating the total confidence of the candidate idioms in the eliminated answer being filled into the blanks.
  13. The computer device according to claim 9, wherein, after the step of obtaining the group of answers with the highest total confidence as the target answer and outputting the target answer as the answer to the idiom fill-in-the-blank question, the method further comprises:
    filling the target answer into the fill-in text to obtain an answer text;
    obtaining a user's score for the answer text and judging whether the score is greater than a second threshold; wherein the user scores the answer text based on a standard answer;
    if the score is greater than the second threshold, composing a training set from the answer text and the explanation texts of all candidate idioms in the target answer, and inputting the training set into the idiom selection fill-in-the-blank model for retraining;
    saving the retrained idiom selection fill-in-the-blank model to the blockchain.
  14. The computer device according to claim 9, wherein the idiom fill-in-the-blank question is used to test students;
    after the step of obtaining the group of answers with the highest total confidence as the target answer and outputting the target answer as the answer to the idiom fill-in-the-blank question, the method further comprises:
    obtaining the test text uploaded by the terminal where the student is located; wherein the test text is the fill-in text of the idiom fill-in-the-blank question with the student's answers filled in; when uploading the test text, the student's terminal concatenates, in order, the student answers filled into the idiom fill-in-the-blank question to obtain a first idiom combination, performs a hash calculation on the first idiom combination based on the blockchain to obtain a corresponding hash value, and adds the hash value to the test text as an identification code;
    extracting all student answers from the test text and concatenating them in order to obtain a second idiom combination;
    performing a hash calculation on the second idiom combination based on the blockchain to obtain a corresponding verification hash value;
    identifying the identification code in the test text and verifying whether the verification hash value is consistent with the identification code;
    if they are consistent, comparing the student answers with the target answer one by one to obtain the accuracy of the student answers.
  15. A computer-readable storage medium on which a computer program is stored, wherein, when the computer program is executed by a processor, a method for selecting answers to idiom fill-in-the-blank questions is implemented, comprising the following steps:
    obtaining the question text of an idiom fill-in-the-blank question; wherein the question text includes a fill-in text and n candidate idioms, and the fill-in text includes m blanks to be filled with the candidate idioms;
    obtaining the explanation texts of all the candidate idioms;
    inputting the fill-in text and the explanation texts into a pre-trained idiom selection fill-in-the-blank model to obtain the confidence of each candidate idiom being filled into each blank;
    selecting m idioms from the n candidate idioms and filling them into the m blanks in random permutations to form multiple groups of answers; wherein, in each group of answers, each candidate idiom can be selected to fill a blank at most once;
    calculating, based on the KM algorithm and according to the confidence of each candidate idiom being filled into each blank, the total confidence of the candidate idioms in each group of answers being filled into the blanks;
    obtaining the group of answers with the highest total confidence as the target answer, and outputting the target answer as the answer to the idiom fill-in-the-blank question.
  16. The computer-readable storage medium according to claim 15, wherein, before the step of inputting the fill-in text and the explanation texts into the pre-trained idiom selection fill-in-the-blank model to obtain the confidence of each candidate idiom being filled into each blank, the method comprises:
    obtaining training samples, wherein the training samples include the corpus texts of multiple idiom fill-in-the-blank questions whose answers have already been selected and the explanation texts of all idioms in an idiom database;
    inputting the training samples into a natural language neural network based on the BERT language model for training to obtain the idiom selection fill-in-the-blank model; wherein the natural language neural network includes a network output layer and a convolutional layer, the convolutional layer is formed by stacking multiple convolutional networks in sequence, the input layer of each subsequent convolutional network is connected to the output layers of all preceding convolutional networks, the output layer of each convolutional network is connected to the output layer of the BERT language model, and the output layer of the last convolutional network is connected to the network output layer.
  17. The computer-readable storage medium according to claim 16, wherein the step of inputting the training samples into the natural language neural network based on the BERT language model for training to obtain the idiom selection fill-in-the-blank model comprises:
    inputting the training samples into the BERT language model and, based on the BERT language model, extracting the feature matrix of the training samples;
    inputting the feature matrix into the natural language neural network for iterative training to obtain the idiom selection fill-in-the-blank model.
  18. The computer-readable storage medium according to claim 15, wherein, before the step of calculating, based on the KM algorithm and according to the confidence of each candidate idiom being filled into each blank, the total confidence of the candidate idioms in each group of answers being filled into the blanks, the method comprises:
    judging, for each group of answers, whether there is at least one candidate idiom whose confidence of being filled into the blank is less than a first threshold;
    if there is, eliminating the corresponding answer, without calculating the total confidence of the candidate idioms in the eliminated answer being filled into the blanks.
  19. The computer-readable storage medium according to claim 15, wherein, after the step of obtaining the group of answers with the highest total confidence as the target answer and outputting the target answer as the answer to the idiom fill-in-the-blank question, the method further comprises:
    filling the target answer into the fill-in text to obtain an answer text;
    obtaining a user's score for the answer text and judging whether the score is greater than a second threshold; wherein the user scores the answer text based on a standard answer;
    if the score is greater than the second threshold, composing a training set from the answer text and the explanation texts of all candidate idioms in the target answer, and inputting the training set into the idiom selection fill-in-the-blank model for retraining;
    saving the retrained idiom selection fill-in-the-blank model to the blockchain.
  20. The computer-readable storage medium according to claim 15, wherein the idiom fill-in-the-blank question is used to test students;
    after the step of obtaining the group of answers with the highest total confidence as the target answer and outputting the target answer as the answer to the idiom fill-in-the-blank question, the method further comprises:
    obtaining the test text uploaded by the terminal where the student is located; wherein the test text is the fill-in text of the idiom fill-in-the-blank question with the student's answers filled in; when uploading the test text, the student's terminal concatenates, in order, the student answers filled into the idiom fill-in-the-blank question to obtain a first idiom combination, performs a hash calculation on the first idiom combination based on the blockchain to obtain a corresponding hash value, and adds the hash value to the test text as an identification code;
    extracting all student answers from the test text and concatenating them in order to obtain a second idiom combination;
    performing a hash calculation on the second idiom combination based on the blockchain to obtain a corresponding verification hash value;
    identifying the identification code in the test text and verifying whether the verification hash value is consistent with the identification code;
    if they are consistent, comparing the student answers with the target answer one by one to obtain the accuracy of the student answers.
PCT/CN2020/132602 2020-09-04 2020-11-30 成语填空题的答案选择方法、装置和计算机设备 WO2021159816A1 (zh)

Priority Applications (5)

Application Number Priority Date Filing Date Title
US17/613,506 US12008319B2 (en) 2020-09-04 2020-11-30 Method and apparatus for selecting answers to idiom fill-in-the-blank questions, and computer device
SG11202112291SA SG11202112291SA (en) 2020-09-04 2020-11-30 Method and apparatus for selecting answers to idiom fill-in-the-blank questions, and computer device
KR1020217036693A KR102688187B1 (ko) 2020-09-04 2020-11-30 성어 괄호넣기문제의 답안 선택장치와 컴퓨터장비
EP20918558.6A EP4209956A4 (en) 2020-09-04 2020-11-30 METHOD AND APPARATUS FOR SELECTING BLANK-FILLING QUESTIONS AND ANSWERS BY IDIOM AND COMPUTER DEVICE
JP2021568947A JP7418704B2 (ja) 2020-09-04 2020-11-30 穴埋め熟語問題の回答を選択する方法、装置およびコンピュータ機器

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010923909.XA CN112069815B (zh) 2020-09-04 2020-09-04 成语填空题的答案选择方法、装置和计算机设备
CN202010923909.X 2020-09-04

Publications (1)

Publication Number Publication Date
WO2021159816A1 true WO2021159816A1 (zh) 2021-08-19

Family

ID=73665904

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/132602 WO2021159816A1 (zh) 2020-09-04 2020-11-30 成语填空题的答案选择方法、装置和计算机设备

Country Status (7)

Country Link
US (1) US12008319B2 (zh)
EP (1) EP4209956A4 (zh)
JP (1) JP7418704B2 (zh)
KR (1) KR102688187B1 (zh)
CN (1) CN112069815B (zh)
SG (1) SG11202112291SA (zh)
WO (1) WO2021159816A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114896986A (zh) * 2022-06-07 2022-08-12 北京百度网讯科技有限公司 增强语义识别模型的训练数据的方法和装置
CN117149989A (zh) * 2023-11-01 2023-12-01 腾讯科技(深圳)有限公司 大语言模型训练方法、文本处理方法及装置

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11923074B2 (en) * 2021-02-12 2024-03-05 Iqvia Inc. Professional network-based identification of influential thought leaders and measurement of their influence via deep learning
CN113420134B (zh) * 2021-06-22 2022-10-14 康键信息技术(深圳)有限公司 机器阅读理解方法、装置、计算机设备和存储介质
US20240211686A1 (en) * 2022-12-23 2024-06-27 Document Crunch, Inc. Context-based natural language processing
CN117057325B (zh) * 2023-10-13 2024-01-05 湖北华中电力科技开发有限责任公司 一种应用于电网领域表单填写方法、系统和电子设备

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109086273A (zh) * 2018-08-14 2018-12-25 北京粉笔未来科技有限公司 基于神经网络解答语法填空题的方法、装置和终端设备
CN109858015A (zh) * 2018-12-12 2019-06-07 湖北工业大学 一种基于ctw和km算法的语义相似度计算方法及装置
CN110096699A (zh) * 2019-03-20 2019-08-06 华南师范大学 基于语义的机器阅读理解的候选答案筛选方法和系统
US20190371299A1 (en) * 2017-02-28 2019-12-05 Huawei Technologies Co., Ltd. Question Answering Method and Apparatus
CN110909144A (zh) * 2019-11-28 2020-03-24 中信银行股份有限公司 问答对话方法、装置、电子设备及计算机可读存储介质
CN111008702A (zh) * 2019-12-06 2020-04-14 北京金山数字娱乐科技有限公司 一种成语推荐模型的训练方法及装置

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8602793B1 (en) * 2006-07-11 2013-12-10 Erwin Ernest Sniedzins Real time learning and self improvement educational system and method
CN109478204B (zh) 2016-05-17 2023-09-15 微软技术许可有限责任公司 非结构化文本的机器理解
CN106409041B (zh) * 2016-11-22 2020-05-19 深圳市鹰硕技术有限公司 一种填空题试题的生成和判卷的方法及系统
US10055401B2 (en) * 2016-12-09 2018-08-21 International Business Machines Corporation Identification and processing of idioms in an electronic environment
US10585985B1 (en) * 2016-12-14 2020-03-10 Educational Testing Service Systems and methods for automatic detection of idiomatic expressions in written responses
CN107193798B (zh) * 2017-05-17 2019-06-04 南京大学 一种基于规则的试题类自动问答系统中的试题理解方法
CN108924167B (zh) * 2018-09-06 2020-12-01 贵阳信息技术研究院(中科院软件所贵阳分部) 一种基于区块链的无法篡改的网络出题和答题方法
CN109446483B (zh) * 2018-09-30 2022-09-30 大连海事大学 一种用于包含主观信息的客观题的机器判卷方法
CN110990556B (zh) * 2019-12-06 2023-07-25 北京金山数字娱乐科技有限公司 成语推荐方法及装置、成语推荐模型的训练方法及装置
CN111382255B (zh) * 2020-03-17 2023-08-01 北京百度网讯科技有限公司 用于问答处理的方法、装置、设备和介质
CN111428499B (zh) * 2020-04-27 2021-10-26 南京大学 一种融合近义词信息用于自动问答系统的成语压缩表示方法

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190371299A1 (en) * 2017-02-28 2019-12-05 Huawei Technologies Co., Ltd. Question Answering Method and Apparatus
CN109086273A (zh) * 2018-08-14 2018-12-25 北京粉笔未来科技有限公司 基于神经网络解答语法填空题的方法、装置和终端设备
CN109858015A (zh) * 2018-12-12 2019-06-07 湖北工业大学 一种基于ctw和km算法的语义相似度计算方法及装置
CN110096699A (zh) * 2019-03-20 2019-08-06 华南师范大学 基于语义的机器阅读理解的候选答案筛选方法和系统
CN110909144A (zh) * 2019-11-28 2020-03-24 中信银行股份有限公司 问答对话方法、装置、电子设备及计算机可读存储介质
CN111008702A (zh) * 2019-12-06 2020-04-14 北京金山数字娱乐科技有限公司 一种成语推荐模型的训练方法及装置

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4209956A4

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114896986A (zh) * 2022-06-07 2022-08-12 北京百度网讯科技有限公司 增强语义识别模型的训练数据的方法和装置
CN114896986B (zh) * 2022-06-07 2024-04-05 北京百度网讯科技有限公司 增强语义识别模型的训练数据的方法和装置
CN117149989A (zh) * 2023-11-01 2023-12-01 腾讯科技(深圳)有限公司 大语言模型训练方法、文本处理方法及装置
CN117149989B (zh) * 2023-11-01 2024-02-09 腾讯科技(深圳)有限公司 大语言模型训练方法、文本处理方法及装置

Also Published As

Publication number Publication date
CN112069815B (zh) 2023-01-17
EP4209956A1 (en) 2023-07-12
KR20220031857A (ko) 2022-03-14
JP7418704B2 (ja) 2024-01-22
JP2022530689A (ja) 2022-06-30
US20220261546A1 (en) 2022-08-18
SG11202112291SA (en) 2021-12-30
EP4209956A4 (en) 2024-10-16
CN112069815A (zh) 2020-12-11
US12008319B2 (en) 2024-06-11
KR102688187B1 (ko) 2024-07-24

Similar Documents

Publication Publication Date Title
WO2021159816A1 (zh) 成语填空题的答案选择方法、装置和计算机设备
Ansarifar et al. Phrasal complexity in academic writing: A comparison of abstracts written by graduate students and expert writers in applied linguistics
US11631338B2 (en) Deep knowledge tracing with transformers
US20080126319A1 (en) Automated short free-text scoring method and system
CN107590127A (zh) 一种题库知识点自动标注方法及系统
CN109949637B (zh) 一种客观题目的自动解答方法和装置
Bai et al. A survey of current machine learning approaches to student free-text evaluation for intelligent tutoring
CN118261163B (zh) 基于transformer结构的智能评价报告生成方法及系统
CN113705191A (zh) 样本语句的生成方法、装置、设备及存储介质
Agarwal et al. Autoeval: A nlp approach for automatic test evaluation system
CN115309910A (zh) 语篇要素和要素关系联合抽取方法、知识图谱构建方法
CN115203388A (zh) 机器阅读理解方法、装置、计算机设备和存储介质
CN114625759B (zh) 模型训练方法、智能问答方法、设备、介质及程序产品
CN111680515B (zh) 基于ai识别的答案确定方法、装置、电子设备及介质
CN117242507A (zh) 阅读以及写作能力提升的指导方法及其装置
Lefebvre-Brossard et al. Alloprof: a new french question-answer education dataset and its use in an information retrieval case study
Zhao et al. A study on the innovative model of foreign language teaching in universities using big data corpus
Li [Retracted] An English Writing Grammar Error Correction Technology Based on Similarity Algorithm
CN111428499A (zh) 一种融合近义词信息用于自动问答系统的成语压缩表示方法
Gao et al. An Investigation on the Enactment of Native-Speakerism on Social Media in China: A Critical Discourse Perspective
Alhamed et al. iGrade: an automated short answer grading system
Firoozi Using automated procedures to score written essays in Persian: An application of the multilingual BERT system
Wang et al. The Construction of a Shared Resource Base for Teaching Chinese Culture under the Architecture of Disciplinary Knowledge Mapping
Wibowo et al. Combining Multiple Text Representations for Improved Automatic Evaluation of Indonesian Essay Answers
Bhonsle et al. An Adaptive Approach for Subjective Answer Evaluation

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20918558

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021568947

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2020918558

Country of ref document: EP

Effective date: 20230404