CN111680515B - Answer determination method and device based on AI (Artificial Intelligence) recognition, electronic equipment and medium

Info

Publication number
CN111680515B
CN111680515B (application CN202010437416.5A; earlier publication CN111680515A)
Authority
CN
China
Prior art keywords
type
stem
question
word vector
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010437416.5A
Other languages
Chinese (zh)
Other versions
CN111680515A (en)
Inventor
郑喜民
喻宁
冯晶凌
柳阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An International Smart City Technology Co Ltd
Original Assignee
Ping An International Smart City Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An International Smart City Technology Co Ltd filed Critical Ping An International Smart City Technology Co Ltd
Priority to CN202010437416.5A
Publication of CN111680515A
Application granted
Publication of CN111680515B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to artificial intelligence and is applied to the field of intelligent education, providing an answer determination method based on AI recognition. The method obtains a test question and determines its question type. When the question type is a near-synonym question type, the number of phrases and the vocabulary type are determined. When the vocabulary type is the first type, the question stem and the options are input into a pre-trained Bert model to obtain the similarity between the stem and each option. When the vocabulary type is the second type, the stem and the options are converted into GloVe word vectors and into FastText word vectors, and the similarity between the stem and each option is calculated from both. When the vocabulary type is the third type, a target language and a first word vector are determined, the test question is translated into another language and a second word vector is determined, and the similarity between the stem and each option is calculated from the first and second word vectors. The option with the highest similarity is determined as the answer.

Description

Answer determination method and device based on AI (Artificial Intelligence) recognition, electronic equipment and medium
Technical Field
The present invention relates to the field of intelligent decision making technologies, and in particular, to an answer determination method and apparatus based on AI recognition, an electronic device, and a medium.
Background
With the development of artificial intelligence, intelligent robots that coach examinees have gradually appeared on the market. When such robots determine the answers to near-synonym questions, they currently rely on a single text-similarity measure, which yields low answer accuracy and, in addition, poor generalization.
Disclosure of Invention
In view of the above, it is desirable to provide an answer determination method, apparatus, electronic device, and medium based on AI recognition that can not only improve the accuracy of answer determination but also ensure the generalization capability of the similarity calculation method.
An answer determination method based on AI recognition, the answer determination method based on AI recognition comprising:
obtaining a test question to be determined, and determining the type of the test question to be determined, wherein the test question to be determined comprises a question stem and a plurality of options;
when the test question type is a similar meaning word test question type, determining the number of phrases in the test question to be determined;
determining the vocabulary types to which the test questions to be determined belong according to the phrase number, wherein the vocabulary types comprise a first type, a second type and a third type;
when the vocabulary type is the first type, inputting the question stem and the options into a pre-trained Bert model to obtain the similarity between the question stem and each option; or
when the vocabulary type is the second type, converting a target vocabulary in the question stem and the options into a GloVe word vector, converting the question stem and the options into a FastText word vector, and calculating the similarity between the question stem and each option based on the GloVe word vector and the FastText word vector; or
when the vocabulary type is the third type, determining a target language of the target test question, acquiring the first word vectors of the question stem and the multiple options based on the target language, translating the target test question into another language other than the target language, acquiring the second word vectors of the question stem and the multiple options based on the other language, and calculating the similarity between the question stem and each option by using the first word vectors and the second word vectors;
and determining the option with the highest similarity as the answer of the test question to be determined.
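To make the overall flow of the steps above concrete, the following minimal Python sketch shows how the three similarity strategies could be dispatched by vocabulary type. The function and type names are illustrative assumptions, not part of the claims, and the three strategy functions are passed in as callables rather than implemented here.

```python
def determine_answer(stem, options, vocabulary_type,
                     bert_sim, glove_fasttext_sim, cross_lingual_sim):
    """Return the option most similar to the stem; the strategy depends on the vocabulary type."""
    strategies = {
        "first": bert_sim,             # pre-trained Bert model scores each (stem, option) pair
        "second": glove_fasttext_sim,  # dual GloVe + FastText word vectors
        "third": cross_lingual_sim,    # word vectors in the target language and a translated language
    }
    scores = strategies[vocabulary_type](stem, options)
    # the option with the highest similarity is taken as the answer
    return max(zip(options, scores), key=lambda pair: pair[1])[0]
```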
According to the preferred embodiment of the present invention, the determining the test question type of the test question to be determined includes:
acquiring a preset identifier;
detecting whether the preset identification exists in the test question to be determined;
when the preset identification exists in the test question to be determined, determining the test question to be determined as the type of the test question of the similar meaning word;
and when the preset identification does not exist in the test question to be determined, determining the test question to be determined as other test question types.
According to the preferred embodiment of the present invention, the determining the number of phrases in the test question to be determined includes:
extracting information corresponding to the preset label from the question stem to serve as a target vocabulary;
calculating the number of words in the target vocabulary and each option;
determining the total word amount of the target vocabulary and the multiple options according to the word amount;
determining the number of the multiple options to obtain the number of the options;
and performing difference operation on the total word amount and the option amount, and subtracting one from the difference operation result to obtain the phrase amount.
According to the preferred embodiment of the present invention, the determining the vocabulary type to which the test question to be determined belongs according to the number of phrases comprises:
when the phrase number is a first preset threshold value, determining the vocabulary type to which the test question to be determined belongs as the first type; or
when the phrase number is larger than the first preset threshold value and smaller than or equal to a second preset threshold value, determining the vocabulary type to which the test question to be determined belongs as the second type; or
when the phrase number is larger than the second preset threshold value and smaller than or equal to a third preset threshold value, determining the vocabulary type to which the test question to be determined belongs as the third type; or
when the phrase number is larger than the third preset threshold value, determining the vocabulary type to which the test question to be determined belongs as other types.
According to a preferred embodiment of the present invention, before inputting the stem and the plurality of options into the pre-trained Bert model, the method further comprises:
obtaining sentence pairs from the QQP data set, and obtaining labels corresponding to the sentence pairs;
combining an MLM mechanism and an NSP mechanism to obtain a semantic vector network layer;
calculating the sentence pairs by utilizing the semantic vector network layer to obtain semantic vectors with context semantic information;
calculating the semantic vector through a pre-constructed similarity calculation network layer to obtain the similarity of the sentence pair;
optimizing the semantic vector network layer and the similarity calculation network layer according to the similarity of the sentence pairs and the labels to obtain a learner;
determining the source of the test questions to be determined;
and acquiring a preset number of test questions from the source, and finely adjusting the learner by using the test questions to obtain the Bert model.
According to a preferred embodiment of the present invention, the converting the target vocabulary and the plurality of options in the stem into a GloVe word vector and the converting the stem and the plurality of options into a FastText word vector, and the calculating the similarity between the stem and each option based on the GloVe word vector and the FastText word vector comprises:
obtaining a third word vector corresponding to each word in the target vocabulary from a first configuration file, and calculating an average value of the third word vectors to obtain a first GloVe word vector;
for each option, acquiring a fourth word vector corresponding to each word from the first configuration file, and calculating an average value of the fourth word vectors to obtain a second GloVe word vector of each option;
calculating the distance between the first GloVe word vector and each second GloVe word vector by using a cosine distance formula, wherein the distance is used as a first distance between the question stem and each option;
acquiring a fifth word vector corresponding to each word in the target vocabulary from a second configuration file, and calculating the average value of the fifth word vectors to obtain a first FastText word vector;
for each option, acquiring a sixth word vector corresponding to each word from the second configuration file, and calculating the average value of the sixth word vectors to obtain a second FastText word vector of each option;
calculating the distance between the first FastText word vector and each second FastText word vector by using a cosine distance formula, wherein the distance is used as a second distance between the question stem and each option;
and performing weighted sum operation on the first distance and the second distance, and taking an operation result as the similarity of the question stem and each option.
According to a preferred embodiment of the present invention, the calculating the similarity between the stem and each option using the first word vector and the second word vector comprises:
sequentially splicing the first word vector and the second word vector to obtain the target word vectors of the question stem and the multiple options;
and calculating the distance between the question stem and each option in the multiple options according to the target word vector based on a cosine distance formula to obtain the similarity between the question stem and each option.
An answer determination device based on AI recognition, the answer determination device based on AI recognition comprising:
the device comprises a determining unit, a judging unit and a judging unit, wherein the determining unit is used for acquiring a test question to be determined and determining the type of the test question to be determined, and the test question to be determined comprises a question stem and a plurality of options;
the determining unit is further configured to determine the number of phrases in the test question to be determined when the test question type is a near meaning word test question type;
the determining unit is further configured to determine the vocabulary types to which the test questions to be determined belong according to the number of the phrases, where the vocabulary types include a first type, a second type, and a third type;
the input unit is used for inputting the question stem and the options into a pre-trained Bert model when the vocabulary type is the first type, so that the similarity between the question stem and each option is obtained; or
The calculation unit is used for converting a target vocabulary in the stem and the options into GloVe word vectors, converting the stem and the options into FastText word vectors and calculating the similarity between the stem and each option based on the GloVe word vectors and the FastText word vectors when the vocabulary type is the second type; or
The calculation unit is further configured to, when the vocabulary type is the third type, determine a target language of the target test question, obtain the stem and a first word vector of the multiple options based on the target language, translate the target test question into another language except for the target language, obtain the stem and a second word vector of the multiple options based on the another language, and calculate a similarity between the stem and each option by using the first word vector and the second word vector;
the determining unit is further configured to determine the option with the highest similarity as the answer to the test question to be determined.
According to the preferred embodiment of the present invention, the determining of the test question type of the test question to be determined by the determining unit includes:
acquiring a preset identifier;
detecting whether the preset identification exists in the test question to be determined;
when the preset identification exists in the test question to be determined, determining the test question to be determined as the type of the test question of the similar meaning word;
and when the preset identification does not exist in the test question to be determined, determining the test question to be determined as other test question types.
According to a preferred embodiment of the present invention, the determining unit determining the number of phrases in the test question to be determined includes:
extracting information corresponding to the preset label from the question stem to serve as a target vocabulary;
calculating the number of words in the target vocabulary and each option;
determining the total word amount of the target vocabulary and the multiple options according to the word amount;
determining the number of the multiple options to obtain the number of the options;
and performing difference operation on the total word amount and the option amount, and subtracting one from the difference operation result to obtain the phrase amount.
According to the preferred embodiment of the present invention, the determining unit determining the vocabulary type to which the test question to be determined belongs according to the number of phrases comprises:
when the phrase number is a first preset threshold value, determining the vocabulary type to which the test question to be determined belongs as the first type; or
when the phrase number is larger than the first preset threshold value and smaller than or equal to a second preset threshold value, determining the vocabulary type to which the test question to be determined belongs as the second type; or
when the phrase number is larger than the second preset threshold value and smaller than or equal to a third preset threshold value, determining the vocabulary type to which the test question to be determined belongs as the third type; or
when the phrase number is larger than the third preset threshold value, determining the vocabulary type to which the test question to be determined belongs as other types.
According to a preferred embodiment of the invention, the apparatus further comprises:
an obtaining unit, configured to obtain sentence pairs from the QQP data set and obtain labels corresponding to the sentence pairs before the stem and the options are input into a pre-trained Bert model;
the combining unit is used for combining the MLM mechanism and the NSP mechanism to obtain a semantic vector network layer;
the computing unit is further configured to compute the sentence pair by using the semantic vector network layer to obtain a semantic vector with context semantic information;
the computing unit is further configured to compute the semantic vector through a pre-constructed similarity computing network layer to obtain the similarity of the sentence pair;
the optimizing unit is used for optimizing the semantic vector network layer and the similarity calculation network layer according to the similarity of the sentence pairs and the labels to obtain a learner;
the determining unit is further configured to determine a source of the test question to be determined;
and the adjusting unit is used for acquiring a preset number of test questions from the source and finely adjusting the learner by utilizing the test questions to obtain the Bert model.
According to a preferred embodiment of the present invention, the calculating unit converts the target vocabulary and the plurality of options in the stem into a GloVe word vector, converts the stem and the plurality of options into a FastText word vector, and calculates the similarity between the stem and each option based on the GloVe word vector and the FastText word vector includes:
obtaining a third word vector corresponding to each word in the target vocabulary from a first configuration file, and calculating an average value of the third word vectors to obtain a first GloVe word vector;
for each option, acquiring a fourth word vector corresponding to each word from the first configuration file, and calculating an average value of the fourth word vectors to obtain a second GloVe word vector of each option;
calculating the distance between the first GloVe word vector and each second GloVe word vector by using a cosine distance formula, wherein the distance is used as a first distance between the question stem and each option;
acquiring a fifth word vector corresponding to each word in the target vocabulary from a second configuration file, and calculating the average value of the fifth word vectors to obtain a first FastText word vector;
for each option, acquiring a sixth word vector corresponding to each word from the second configuration file, and calculating the average value of the sixth word vectors to obtain a second FastText word vector of each option;
calculating the distance between the first FastText word vector and each second FastText word vector by using a cosine distance formula, wherein the distance is used as a second distance between the question stem and each option;
and performing weighted sum operation on the first distance and the second distance, and taking an operation result as the similarity of the question stem and each option.
According to a preferred embodiment of the present invention, the calculating unit calculates the similarity between the stem and each option by using the first word vector and the second word vector includes:
sequentially splicing the first word vector and the second word vector to obtain the target word vectors of the question stem and the multiple options;
and calculating the distance between the question stem and each option in the multiple options according to the target word vector based on a cosine distance formula to obtain the similarity between the question stem and each option.
An electronic device, the electronic device comprising:
a memory storing at least one instruction; and
a processor that executes the instructions stored in the memory to implement the answer determination method based on AI recognition.
A computer-readable storage medium having stored therein at least one instruction, the at least one instruction being executed by a processor in an electronic device to implement the answer determination method based on AI recognition.
According to the technical scheme, the vocabulary type to which the test question belongs can be quickly determined from the number of phrases. Adding the QQP data set to the training of the Bert model improves the robustness of the Bert model and further ensures the generalization capability of the similarity calculation method. Constructing dual vectors, a GloVe word vector and a FastText word vector, to calculate the similarity between the question stem and each option improves the accuracy of the similarity calculation while also improving calculation efficiency. Because the similarity between the stem and each option is calculated in a different way depending on the vocabulary type of the test question, the similarity can be obtained more accurately, which improves the accuracy of the answer determined for the test question. In addition, when the vocabulary type is the second or third type, the similarity calculation is unsupervised, which ensures the generalization capability of the similarity calculation method.
Drawings
FIG. 1 is a flowchart illustrating an answer determination method based on AI recognition according to a preferred embodiment of the present invention.
FIG. 2 is a functional block diagram of an answer determination apparatus based on AI recognition according to a preferred embodiment of the present invention.
Fig. 3 is a schematic structural diagram of an electronic device implementing an answer determination method based on AI identification according to a preferred embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a flowchart illustrating an answer determination method based on AI recognition according to a preferred embodiment of the present invention. The order of the steps in the flow chart may be changed and some steps may be omitted according to different needs.
The invention belongs to the field of intelligent education and can promote the construction of smart cities. The answer determination method based on AI recognition is applied to one or more electronic devices. An electronic device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions; its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The electronic device may be any electronic product capable of performing human-computer interaction with a user, for example, a Personal computer, a tablet computer, a smart phone, a Personal Digital Assistant (PDA), a game machine, an interactive Internet Protocol Television (IPTV), an intelligent wearable device, and the like.
The electronic device may also include a network device and/or a user device. The network device includes, but is not limited to, a single network server, a server group consisting of a plurality of network servers, or a Cloud Computing (Cloud Computing) based Cloud consisting of a large number of hosts or network servers.
The Network where the electronic device is located includes, but is not limited to, the internet, a wide area Network, a metropolitan area Network, a local area Network, a Virtual Private Network (VPN), and the like.
In at least one embodiment of the invention, the invention relates to the field of artificial intelligence.
S10, obtaining the test questions to be determined, and determining the test question types of the test questions to be determined, wherein the test questions to be determined comprise question stems and a plurality of options.
In at least one embodiment of the present invention, the test question to be determined may be obtained from a TOEFL Delta simulation platform, or may be extracted from a PDF test paper by using OCR (Optical Character Recognition) technology.
In at least one embodiment of the present invention, the test question types include a near word test question type and other test question types.
In at least one embodiment of the present invention, the electronic device determining the test question type of the test question to be determined includes:
The electronic device obtains a preset identifier and then detects whether the preset identifier exists in the test question to be determined. When the preset identifier exists in the test question to be determined, the electronic device determines that the test question to be determined is of the near-synonym question type; when it does not, the electronic device determines that the test question to be determined is of another question type.
The preset identifier may be a double quotation mark or an underline; the specific preset identifier may be determined according to the actual scene.
The question type of the test question to be determined can thus be determined quickly by directly detecting whether the preset identifier is present, as the following sketch illustrates.
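A minimal sketch of this detection, assuming (as an example only) that the preset identifier is either a pair of double quotation marks around the target vocabulary or an underline in the stem:

```python
import re

# Assumed preset identifiers: a double-quoted target vocabulary, or an underline of two or
# more characters; the actual identifier is chosen according to the scenario.
NEAR_SYNONYM_MARKERS = re.compile(r'"[^"]+"|_{2,}')

def is_near_synonym_question(stem: str) -> bool:
    """Return True when a preset identifier is present in the question stem."""
    return bool(NEAR_SYNONYM_MARKERS.search(stem))

print(is_near_synonym_question('The word "arduous" is closest in meaning to ...'))  # True
print(is_near_synonym_question('Which city is the capital of France?'))             # False
```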
S11, when the test question type is the similar meaning word test question type, determining the number of phrases in the test question to be determined.
In at least one embodiment of the invention, the number of phrases in the test question to be determined is the sum of the number of phrases in the target vocabulary and the number of phrases in the plurality of options.
In at least one embodiment of the present invention, the preset tag refers to an identifier that marks noun or verb vocabulary.
In at least one embodiment of the present invention, the electronic device determining the number of phrases in the test question to be determined comprises:
The electronic device extracts the information corresponding to the preset tag from the question stem as the target vocabulary, counts the number of words in the target vocabulary and in each option, determines from these counts the total number of words in the target vocabulary and the options, determines the number of options, and then subtracts the number of options from the total word count and reduces the result by one to obtain the phrase count.
The phrase count of the test question to be determined can thus be determined quickly by counting the words in the target vocabulary and each option, as the following sketch illustrates.
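As an illustration of this counting rule (the function name and example questions are hypothetical), the phrase count is the total word count of the target vocabulary and the options, minus the number of options, minus one:

```python
def phrase_count(target_vocabulary: str, options: list[str]) -> int:
    """Total words in the target vocabulary and all options, minus the option count, minus one."""
    total_words = len(target_vocabulary.split()) + sum(len(o.split()) for o in options)
    return total_words - len(options) - 1

# A single-word target with single-word options gives 0, i.e. the first vocabulary type.
print(phrase_count("arduous", ["difficult", "lengthy", "dangerous", "unusual"]))   # 0
# Multi-word phrases raise the count, pushing the question into the second or third type.
print(phrase_count("carry out", ["perform", "abandon", "look after", "put off"]))  # 3
```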
S12, determining the vocabulary types to which the test questions to be determined belong according to the phrase number, wherein the vocabulary types comprise a first type, a second type and a third type.
In at least one embodiment of the present invention, the determining, by the electronic device, the vocabulary type to which the test question to be determined belongs according to the number of phrases includes:
(1) When the phrase number equals the first preset threshold, the electronic device determines the vocabulary type to which the test question to be determined belongs as the first type.
(2) When the phrase number is greater than the first preset threshold and less than or equal to the second preset threshold, the electronic device determines the vocabulary type to which the test question to be determined belongs as the second type.
(3) When the phrase number is greater than the second preset threshold and less than or equal to the third preset threshold, the electronic device determines the vocabulary type to which the test question to be determined belongs as the third type.
(4) When the phrase number is greater than the third preset threshold, the electronic device determines the vocabulary type to which the test question to be determined belongs as other types.
Calculations over large-scale data show that the results are best when the first preset threshold is 0, the second preset threshold is 3, and the third preset threshold is 5.
In at least one embodiment of the present invention, when the vocabulary type is the other type, the electronic device may calculate the similarity between the stem and each option in a deep learning manner, so as to determine the answer to the test question to be determined, which is not specifically set forth herein.
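A minimal sketch of this mapping, using the thresholds 0, 3, and 5 stated above (the function and returned type names are illustrative):

```python
def vocabulary_type(phrase_count: int, first: int = 0, second: int = 3, third: int = 5) -> str:
    """Map the phrase count to a vocabulary type using the three preset thresholds."""
    if phrase_count == first:
        return "first"    # single words: Bert-based similarity
    if phrase_count <= second:
        return "second"   # short phrases: GloVe + FastText word vectors
    if phrase_count <= third:
        return "third"    # longer phrases: cross-lingual word vectors
    return "other"        # handled separately, e.g. by a deep-learning approach
```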
S13, when the vocabulary type is the first type, inputting the stem and the options into a pre-trained Bert model to obtain the similarity between the stem and each option.
In at least one embodiment of the invention, the Bert model comprises a semantic vector network layer and a similarity calculation network layer.
In at least one embodiment of the present invention, before inputting the stem and the plurality of options into the pre-trained Bert model, the method further comprises:
the electronic device acquires Sentence pairs on a QQP (Quora Question Pairs) data set and acquires labels corresponding to the Sentence pairs, further, the electronic device combines an MLM (masked Language model) mechanism and an NSP (Next Sentensice prediction) mechanism to obtain a semantic vector network layer, the electronic device calculates the Sentence pairs by using the semantic vector network layer to obtain semantic vectors with context semantic information, further, the electronic device calculates the semantic vectors by using a pre-constructed similarity calculation network layer to obtain the similarity of the Sentence pairs, the electronic device optimizes the semantic vector network layer and the similarity calculation network layer according to the similarity of the Sentence pairs and the labels to obtain a learner, the electronic device determines the sources of the test questions to be determined, and acquires a preset number of test questions from the sources, and utilizing the test questions to finely adjust the learner to obtain the Bert model.
Modeling the sentence pairs of the QQP data set bidirectionally through the MLM and NSP mechanisms yields a semantic vector network layer carrying context semantic information; optimizing the semantic vector network layer and the similarity calculation network layer with the sentence pairs improves the precision of the learner; and fine-tuning the learner with test questions from the source makes the Bert model better suited to the similarity calculation of the test question to be determined.
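A minimal scoring sketch with the HuggingFace transformers library is given below. It assumes the Bert model has already been pre-trained with the MLM/NSP mechanisms, optimized on QQP sentence pairs, and fine-tuned on questions from the same source; the checkpoint path is hypothetical, and using the positive-class probability as the similarity is an assumption for illustration.

```python
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
# Hypothetical checkpoint: a Bert sentence-pair classifier fine-tuned on QQP and on the question source.
model = BertForSequenceClassification.from_pretrained("path/to/fine-tuned-qqp-bert", num_labels=2)
model.eval()

def bert_similarity(stem: str, options: list[str]) -> list[float]:
    """Score each (stem, option) sentence pair; the paraphrase-class probability is the similarity."""
    scores = []
    with torch.no_grad():
        for option in options:
            inputs = tokenizer(stem, option, return_tensors="pt", truncation=True)
            probs = torch.softmax(model(**inputs).logits, dim=-1)
            scores.append(probs[0, 1].item())
    return scores
```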
S14, when the vocabulary type is the second type, converting the target vocabulary in the stem and the options into GloVe word vectors, converting the stem and the options into FastText word vectors, and calculating the similarity between the stem and each option based on the GloVe word vectors and the FastText word vectors.
In at least one embodiment of the present invention, the electronic device converts the target vocabulary and the options in the stem into a GloVe word vector, converts the stem and the options into a FastText word vector, and calculates the similarity between the stem and each option based on the GloVe word vector and the FastText word vector includes:
The electronic device obtains, from a first configuration file, a third word vector corresponding to each word in the target vocabulary and averages the third word vectors to obtain a first GloVe word vector. For each option, it obtains a fourth word vector corresponding to each word from the first configuration file and averages the fourth word vectors to obtain a second GloVe word vector for that option. Using a cosine distance formula, it calculates the distance between the first GloVe word vector and each second GloVe word vector as the first distance between the stem and each option. Likewise, it obtains, from a second configuration file, a fifth word vector corresponding to each word in the target vocabulary and averages the fifth word vectors to obtain a first FastText word vector; for each option, it obtains a sixth word vector corresponding to each word from the second configuration file and averages the sixth word vectors to obtain a second FastText word vector for that option. Using the cosine distance formula, it calculates the distance between the first FastText word vector and each second FastText word vector as the second distance between the stem and each option. Finally, the electronic device computes a weighted sum of the first distance and the second distance and takes the result as the similarity between the stem and each option.
The first configuration file stores the mapping relation between a plurality of words and GloVe word vectors and may be glove.840B.300d.txt; the second configuration file stores the mapping relation between a plurality of words and FastText word vectors and may be crawl-300d-2M.vec.
In this way, the similarity between the stem and each option is calculated by constructing dual GloVe and FastText word vectors, which improves the accuracy of the similarity calculation while also improving calculation efficiency.
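A minimal sketch of the dual-vector calculation with numpy follows. The vector files are those named above (glove.840B.300d.txt and crawl-300d-2M.vec); the equal weights and the use of cosine similarity directly as the score are assumptions, and out-of-vocabulary words are simply skipped.

```python
import numpy as np

def load_vectors(path: str) -> dict:
    """Load word -> vector mappings from a GloVe/FastText text file (word followed by floats)."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            if len(parts) > 2:                      # skips the FastText header line
                vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors

def mean_vector(text: str, vectors: dict) -> np.ndarray:
    """Average the vectors of the in-vocabulary words of a phrase."""
    words = [w for w in text.split() if w in vectors]
    if not words:
        raise ValueError(f"no in-vocabulary words in: {text!r}")
    return np.mean([vectors[w] for w in words], axis=0)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def glove_fasttext_similarity(target_vocabulary: str, options: list[str],
                              glove: dict, fasttext: dict,
                              w_glove: float = 0.5, w_fasttext: float = 0.5) -> list[float]:
    """Weighted sum of the GloVe-based and FastText-based scores for each option."""
    g_stem = mean_vector(target_vocabulary, glove)
    f_stem = mean_vector(target_vocabulary, fasttext)
    return [w_glove * cosine(g_stem, mean_vector(o, glove)) +
            w_fasttext * cosine(f_stem, mean_vector(o, fasttext))
            for o in options]

# glove = load_vectors("glove.840B.300d.txt")    # first configuration file
# fasttext = load_vectors("crawl-300d-2M.vec")   # second configuration file
```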
S15, when the vocabulary type is the third type, determining a target language of the target test question, acquiring the first word vectors of the stem and the multiple options based on the target language, translating the target test question into another language other than the target language, acquiring the second word vectors of the stem and the multiple options based on the other language, and calculating the similarity between the stem and each option by using the first word vectors and the second word vectors.
In at least one embodiment of the present invention, the other languages are languages other than the target language, such as: when the target language is English, other languages may be German or French.
In at least one embodiment of the present invention, the calculating, by the electronic device, the similarity between the stem and each option using the first word vector and the second word vector includes:
The electronic device sequentially splices the first word vector and the second word vector to obtain the target word vectors of the question stem and the multiple options; then, based on a cosine distance formula, it calculates the distance between the question stem and each of the multiple options from the target word vectors to obtain the similarity between the question stem and each option.
The dimensionality of the target word vector is the sum of the dimensionalities of the first word vector and the second word vector.
By increasing the dimensionality of the target word vector, the accuracy of the similarity calculation between the question stem and each option can be improved.
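A minimal sketch of this step, assuming the per-language vectors have already been produced upstream (translation of the question plus a word-vector lookup in each language): the vectors for the original language play the role of the first word vectors and those for the translated language the second word vectors.

```python
import numpy as np

def cross_lingual_similarity(stem_vec_src: np.ndarray, option_vecs_src: list[np.ndarray],
                             stem_vec_other: np.ndarray, option_vecs_other: list[np.ndarray]) -> list[float]:
    """Concatenate the first and second word vectors, then score each option by cosine similarity."""
    stem_vec = np.concatenate([stem_vec_src, stem_vec_other])   # dimensionality = sum of both vectors
    scores = []
    for v_src, v_other in zip(option_vecs_src, option_vecs_other):
        opt_vec = np.concatenate([v_src, v_other])
        scores.append(float(np.dot(stem_vec, opt_vec) /
                            (np.linalg.norm(stem_vec) * np.linalg.norm(opt_vec))))
    return scores
```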
S16, determining the option with the highest similarity as the answer of the test question to be determined.
In at least one embodiment of the present invention, after determining the option with the highest similarity as the answer to the test question to be determined, the method further includes:
The electronic device obtains the question number of the test question to be determined, generates prompt information from the question number and the answer, and sends the prompt information to the terminal device of a designated contact.
In this way, the designated contact can be reminded to view the answer promptly.
It is emphasized that to further ensure the privacy and security of the answer, the answer may also be stored in a node of a blockchain.
According to the technical scheme, the vocabulary type to which the test question belongs can be quickly determined from the number of phrases. Adding the QQP data set to the training of the Bert model improves the robustness of the Bert model and further ensures the generalization capability of the similarity calculation method. Constructing dual vectors, a GloVe word vector and a FastText word vector, to calculate the similarity between the question stem and each option improves the accuracy of the similarity calculation while also improving calculation efficiency. Because the similarity between the stem and each option is calculated in a different way depending on the vocabulary type of the test question, the similarity can be obtained more accurately, which improves the accuracy of the answer determined for the test question. In addition, when the vocabulary type is the second or third type, the similarity calculation is unsupervised, which ensures the generalization capability of the similarity calculation method.
Fig. 2 is a functional block diagram of an answer determination apparatus based on AI recognition according to a preferred embodiment of the present invention. The answer determination apparatus 11 based on AI recognition includes a determining unit 110, an input unit 111, a calculating unit 112, an obtaining unit 113, a combining unit 114, an optimizing unit 115, an adjusting unit 116, a generating unit 117, and a transmitting unit 118. A module/unit referred to in the present invention is a series of computer program segments that are stored in the memory 12, can be executed by the processor 13, and perform a fixed function. The functions of the modules/units are described in detail in the following embodiments.
The determining unit 110 obtains a test question to be determined, and determines a test question type of the test question to be determined, where the test question to be determined includes a question stem and a plurality of options.
In at least one embodiment of the present invention, the test question to be determined may be obtained from a TOEFL Delta simulation platform, or may be extracted from a PDF test paper by using OCR (Optical Character Recognition) technology.
In at least one embodiment of the present invention, the test question types include a near word test question type and other test question types.
In at least one embodiment of the present invention, the determining unit 110 determines the test question type of the test question to be determined, including:
The determining unit 110 obtains a preset identifier and then detects whether the preset identifier exists in the test question to be determined. When the preset identifier exists in the test question to be determined, the determining unit 110 determines that the test question to be determined is of the near-synonym question type; when it does not, the determining unit 110 determines that the test question to be determined is of another question type.
The preset identifier may be a double quotation mark or an underline; the specific preset identifier may be determined according to the actual scene.
The question type of the test question to be determined can thus be determined quickly by directly detecting whether the preset identifier is present.
When the test question type is a similar meaning word test question type, the determining unit 110 determines the number of phrases in the test question to be determined.
In at least one embodiment of the invention, the number of phrases in the test question to be determined is the sum of the number of phrases in the target vocabulary and the number of phrases in the plurality of options.
In at least one embodiment of the present invention, the preset tag refers to an identifier that marks noun or verb vocabulary.
In at least one embodiment of the present invention, the determining unit 110 determines the number of phrases in the test question to be determined, including:
The determining unit 110 extracts the information corresponding to the preset tag from the question stem as the target vocabulary, counts the number of words in the target vocabulary and in each option, determines from these counts the total number of words in the target vocabulary and the options, determines the number of options, and then subtracts the number of options from the total word count and reduces the result by one to obtain the phrase count.
The phrase count of the test question to be determined can thus be determined quickly by counting the words in the target vocabulary and each option.
The determining unit 110 determines the vocabulary types to which the test questions to be determined belong according to the phrase number, wherein the vocabulary types include a first type, a second type and a third type.
In at least one embodiment of the present invention, the determining unit 110 determines the vocabulary type to which the test question to be determined belongs according to the number of phrases, including:
(1) when the number of phrases is a first preset threshold, the determining unit 110 determines the vocabulary type to which the test question to be determined belongs as the first type.
(2) When the number of phrases is greater than the first preset threshold and the number of phrases is less than or equal to a second preset threshold, the determining unit 110 determines the vocabulary type to which the test question to be determined belongs as the second type.
(3) When the number of phrases is greater than the second preset threshold and the number of phrases is less than or equal to a third preset threshold, the determining unit 110 determines the vocabulary type to which the test question to be determined belongs as the third type.
(4) When the number of phrases is greater than the third preset threshold, the determining unit 110 determines the vocabulary type to which the test question to be determined belongs as another type.
Calculations over large-scale data show that the results are best when the first preset threshold is 0, the second preset threshold is 3, and the third preset threshold is 5.
In at least one embodiment of the present invention, when the vocabulary type is the other type, the determining unit 110 may calculate the similarity between the stem and each option in a deep learning manner, and further determine the answer to the test question to be determined, which is not specifically set forth herein.
When the vocabulary type is the first type, the input unit 111 inputs the stem and the plurality of options into a pre-trained Bert model to obtain the similarity between the stem and each option.
In at least one embodiment of the invention, the Bert model comprises a semantic vector network layer and a similarity calculation network layer.
In at least one embodiment of the present invention, before the stem and the plurality of options are input into a pre-trained Bert model, the obtaining unit 113 obtains sentence pairs from the QQP (Quora Question Pairs) data set and obtains the labels corresponding to the sentence pairs. Further, the combining unit 114 combines an MLM (Masked Language Model) mechanism and an NSP (Next Sentence Prediction) mechanism to obtain a semantic vector network layer, and the calculating unit 112 calculates the sentence pairs with the semantic vector network layer to obtain semantic vectors carrying context semantic information. The calculating unit 112 then calculates the semantic vectors with a pre-constructed similarity calculation network layer to obtain the similarity of each sentence pair, and the optimizing unit 115 optimizes the semantic vector network layer and the similarity calculation network layer according to the similarities of the sentence pairs and the labels to obtain a learner. The determining unit 110 determines the source of the test question to be determined, and the adjusting unit 116 obtains a preset number of test questions from that source and fine-tunes the learner with these test questions to obtain the Bert model.
Modeling the sentence pairs of the QQP data set bidirectionally through the MLM and NSP mechanisms yields a semantic vector network layer carrying context semantic information; optimizing the semantic vector network layer and the similarity calculation network layer with the sentence pairs improves the precision of the learner; and fine-tuning the learner with test questions from the source makes the Bert model better suited to the similarity calculation of the test question to be determined.
When the vocabulary type is the second type, the calculating unit 112 converts the target vocabulary in the stem and the options into GloVe word vectors, converts the stem and the options into FastText word vectors, and calculates the similarity between the stem and each option based on the GloVe word vectors and the FastText word vectors.
In at least one embodiment of the present invention, the calculating unit 112 converts the target vocabulary and the plurality of options in the stem into a GloVe word vector, converts the stem and the plurality of options into a FastText word vector, and calculates the similarity between the stem and each option based on the GloVe word vector and the FastText word vector includes:
The calculating unit 112 obtains, from a first configuration file, a third word vector corresponding to each word in the target vocabulary and averages the third word vectors to obtain a first GloVe word vector. For each option, it obtains a fourth word vector corresponding to each word from the first configuration file and averages the fourth word vectors to obtain a second GloVe word vector for that option. Using a cosine distance formula, it calculates the distance between the first GloVe word vector and each second GloVe word vector as the first distance between the stem and each option. Likewise, it obtains, from a second configuration file, a fifth word vector corresponding to each word in the target vocabulary and averages the fifth word vectors to obtain a first FastText word vector; for each option, it obtains a sixth word vector corresponding to each word from the second configuration file and averages the sixth word vectors to obtain a second FastText word vector for that option. Using the cosine distance formula, it calculates the distance between the first FastText word vector and each second FastText word vector as the second distance between the stem and each option. Finally, the calculating unit 112 computes a weighted sum of the first distance and the second distance and takes the result as the similarity between the stem and each option.
The first configuration file stores the mapping relation between a plurality of words and GloVe word vectors and may be glove.840B.300d.txt; the second configuration file stores the mapping relation between a plurality of words and FastText word vectors and may be crawl-300d-2M.vec.
In this way, the similarity between the stem and each option is calculated by constructing dual GloVe and FastText word vectors, which improves the accuracy of the similarity calculation while also improving calculation efficiency.
When the vocabulary type is the third type, the calculating unit 112 determines a target language of the target test question, obtains the first word vectors of the stem and the plurality of options based on the target language, translates the target test question into another language other than the target language, obtains the second word vectors of the stem and the plurality of options based on the other language, and calculates the similarity between the stem and each option by using the first word vectors and the second word vectors.
In at least one embodiment of the present invention, the other languages are languages other than the target language, such as: when the target language is English, other languages may be German or French, etc.
In at least one embodiment of the present invention, the calculating unit 112 calculates the similarity between the stem and each option by using the first word vector and the second word vector includes:
The calculating unit 112 sequentially splices the first word vector and the second word vector to obtain the target word vectors of the question stem and the multiple options; then, based on a cosine distance formula, it calculates the distance between the question stem and each of the multiple options from the target word vectors to obtain the similarity between the question stem and each option.
The dimensionality of the target word vector is the sum of the dimensionalities of the first word vector and the second word vector.
By increasing the dimensionality of the target word vector, the accuracy of the similarity calculation between the question stem and each option can be improved.
The determining unit 110 determines the option with the highest similarity as the answer to the test question to be determined.
In at least one embodiment of the present invention, after determining the option with the highest similarity as the answer to the test question to be determined, the obtaining unit 113 obtains the test question number of the test question to be determined, further, the generating unit 117 generates the prompt information according to the test question number and the answer, and the sending unit 118 sends the prompt information to the terminal device of the designated contact.
In this way, the designated contact can be reminded to view the answer promptly.
It is emphasized that to further ensure the privacy and security of the answer, the answer may also be stored in a node of a blockchain.
According to the technical scheme, the vocabulary type to which the test question belongs can be quickly determined from the number of phrases. Adding the QQP data set to the training of the Bert model improves the robustness of the Bert model and further ensures the generalization capability of the similarity calculation method. Constructing dual vectors, a GloVe word vector and a FastText word vector, to calculate the similarity between the question stem and each option improves the accuracy of the similarity calculation while also improving calculation efficiency. Because the similarity between the stem and each option is calculated in a different way depending on the vocabulary type of the test question, the similarity can be obtained more accurately, which improves the accuracy of the answer determined for the test question. In addition, when the vocabulary type is the second or third type, the similarity calculation is unsupervised, which ensures the generalization capability of the similarity calculation method.
Fig. 3 is a schematic structural diagram of an electronic device implementing an answer determination method based on AI identification according to a preferred embodiment of the invention.
In one embodiment of the present invention, the electronic device 1 includes, but is not limited to, a memory 12, a processor 13, and a computer program stored in the memory 12 and executable on the processor 13, such as an answer determination program based on AI recognition.
It will be appreciated by a person skilled in the art that the schematic diagram is only an example of the electronic device 1 and does not constitute a limitation of the electronic device 1; the electronic device 1 may comprise more or fewer components than shown, combine certain components, or have different components, and may further comprise, for example, input/output devices, network access devices, buses, and the like.
The Processor 13 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The processor 13 is the operation core and control center of the electronic device 1; it connects the parts of the whole electronic device 1 through various interfaces and lines, and runs the operating system of the electronic device 1 and the various installed application programs, program code, and so on.
The processor 13 runs the operating system of the electronic device 1 and the various installed application programs. The processor 13 executes the application program to implement the steps in each of the above embodiments of the answer determination method based on AI recognition, such as the steps shown in fig. 1.
Illustratively, the computer program may be divided into one or more modules/units, which are stored in the memory 12 and executed by the processor 13 to carry out the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions; the instruction segments are used to describe the execution of the computer program in the electronic device 1. For example, the computer program may be divided into a determination unit 110, an input unit 111, a calculation unit 112, an acquisition unit 113, a combination unit 114, an optimization unit 115, an adjustment unit 116, a generation unit 117, and a transmission unit 118.
The memory 12 can be used to store the computer programs and/or modules, and the processor 13 implements the various functions of the electronic device 1 by running or executing the computer programs and/or modules stored in the memory 12 and calling the data stored in the memory 12. The memory 12 may mainly include a program storage area and a data storage area: the program storage area may store the operating system and the application programs required by at least one function (such as a sound playing function or an image playing function), and the data storage area may store data created according to the use of the electronic device. Further, the memory 12 may include a non-volatile memory, such as a hard disk, an internal memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a Flash Card, at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
The memory 12 may be an external memory and/or an internal memory of the electronic device 1. Further, the memory 12 may be a memory having a physical form, such as a memory stick, a TF Card (Trans-flash Card), or the like.
The integrated modules/units of the electronic device 1 may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, all or part of the flow of the method in the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium and, when executed by a processor, instructs the related hardware to implement the steps of the above method embodiments.
The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, or a Read-Only Memory (ROM).
Referring to fig. 1, the memory 12 of the electronic device 1 stores a plurality of instructions for implementing the answer determination method based on AI recognition, and the processor 13 executes the plurality of instructions to implement: obtaining a test question to be determined, and determining the test question type of the test question to be determined, wherein the test question to be determined comprises a question stem and a plurality of options; when the test question type is a similar meaning word test question type, determining the number of phrases in the test question to be determined; determining the vocabulary type to which the test question to be determined belongs according to the number of phrases, wherein the vocabulary types comprise a first type, a second type and a third type; when the vocabulary type is the first type, inputting the question stem and the plurality of options into a pre-trained Bert model to obtain the similarity between the question stem and each option; or, when the vocabulary type is the second type, converting a target vocabulary in the question stem and the plurality of options into a GloVe word vector, converting the question stem and the plurality of options into a FastText word vector, and calculating the similarity between the question stem and each option based on the GloVe word vector and the FastText word vector; or, when the vocabulary type is the third type, determining a target language of the target test question, acquiring a first word vector of the question stem and the plurality of options based on the target language, translating the target test question into another language other than the target language, acquiring a second word vector of the question stem and the plurality of options based on the other language, and calculating the similarity between the question stem and each option by using the first word vector and the second word vector; and determining the option with the highest similarity as the answer to the test question to be determined.
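As an illustration only, the routing of a test question to the similarity calculation selected by its vocabulary type could be sketched as follows. This is a minimal sketch under assumptions, not the patented implementation: the scorer callables passed in (standing for the Bert, GloVe/FastText and cross-lingual branches) are hypothetical placeholders.

```python
from typing import Callable, Sequence

# A scorer maps (stem, options) to one similarity score per option.
Scorer = Callable[[str, Sequence[str]], Sequence[float]]

def determine_answer(stem: str, options: Sequence[str],
                     vocab_type: str, scorers: dict[str, Scorer]) -> str:
    """Route the question to the scorer chosen by its vocabulary type and
    return the option with the highest similarity to the stem."""
    if vocab_type not in scorers:
        raise ValueError(f"unsupported vocabulary type: {vocab_type}")
    scores = scorers[vocab_type](stem, options)
    best = max(range(len(options)), key=lambda i: scores[i])
    return options[best]
```

In such a setup, `scorers` would map "first", "second" and "third" to the Bert-based, dual-vector and cross-lingual similarity functions respectively.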
Specifically, for the specific implementation of the above instructions by the processor 13, reference may be made to the description of the relevant steps in the embodiment corresponding to fig. 1, which is not repeated here.
The blockchain referred to in the present invention is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks associated with one another by cryptographic methods, each data block containing the information of a batch of network transactions, which is used to verify the validity (anti-counterfeiting) of the information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means in software or hardware. The terms first, second, and the like are used to denote names and do not denote any particular order.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit them. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (10)

1. An answer determination method based on AI recognition, comprising:
obtaining a test question to be determined, and determining the type of the test question to be determined, wherein the test question to be determined comprises a question stem and a plurality of options;
when the test question type is a similar meaning word test question type, determining the number of phrases in the test question to be determined;
determining the vocabulary types to which the test questions to be determined belong according to the phrase number, wherein the vocabulary types comprise a first type, a second type and a third type;
when the vocabulary type is the first type, inputting the question stem and the options into a pre-trained Bert model to obtain the similarity between the question stem and each option; or
when the vocabulary type is the second type, converting a target vocabulary in the question stem and the options into a GloVe word vector, converting the question stem and the plurality of options into a FastText word vector, and calculating the similarity between the question stem and each option based on the GloVe word vector and the FastText word vector; or
when the vocabulary type is the third type, determining a target language of a target test question, acquiring a first word vector of the question stem and the multiple options based on the target language, translating the target test question into another language other than the target language, acquiring a second word vector of the question stem and the multiple options based on the other language, and calculating the similarity between the question stem and each option by using the first word vector and the second word vector;
and determining the option with the highest similarity as the answer of the test question to be determined.
2. The AI recognition-based answer determination method of claim 1, wherein the determining of the test question type of the test question to be determined comprises:
acquiring a preset identifier;
detecting whether the preset identification exists in the test question to be determined;
when the preset identification exists in the test question to be determined, determining the test question to be determined as the type of the test question of the similar meaning word;
and when the preset identification does not exist in the test question to be determined, determining the test question to be determined as other test question types.
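By way of a non-authoritative sketch of the check described in claim 2 above, the preset identifier can be detected with a simple substring test. The example identifiers below ("closest in meaning", "synonym") are assumptions for illustration; the patent does not disclose the concrete markers.

```python
def is_synonym_question(question_text: str,
                        preset_identifiers: tuple = ("closest in meaning", "synonym")) -> bool:
    """Return True when any preset identifier occurs in the test question,
    i.e. the question is treated as a similar meaning word test question."""
    text = question_text.lower()
    return any(marker in text for marker in preset_identifiers)
```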
3. The AI recognition-based answer determination method of claim 1, wherein the determining the number of phrases in the test question to be determined comprises:
extracting information corresponding to a preset label from the question stem to serve as a target vocabulary;
calculating the number of words in the target vocabulary and each option;
determining the total word amount of the target vocabulary and the multiple options according to the word amount;
determining the number of the multiple options to obtain the number of the options;
and performing difference operation on the total word amount and the option amount, and subtracting one from the difference operation result to obtain the phrase amount.
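As an illustrative sketch of the arithmetic in claim 3 above (assuming simple whitespace tokenization, which the patent does not specify), the number of phrases is the total word count of the target vocabulary and all options, minus the number of options, minus one:

```python
from typing import Sequence

def count_phrases(target_vocabulary: str, options: Sequence[str]) -> int:
    """Total words in the target vocabulary and every option,
    minus the number of options, minus one."""
    total_words = len(target_vocabulary.split())
    total_words += sum(len(option.split()) for option in options)
    return total_words - len(options) - 1
```

For example, a single-word target vocabulary with four single-word options gives 1 + 4 - 4 - 1 = 0, which could correspond to the first type if the first preset threshold were zero.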
4. The AI recognition-based answer determination method of claim 1, wherein the determining the vocabulary type to which the test question to be determined belongs according to the number of phrases comprises:
when the phrase number is a first preset threshold value, determining the vocabulary type to which the test question to be determined belongs as the first type; or
when the number of phrases is greater than the first preset threshold value and less than or equal to a second preset threshold value, determining the vocabulary type to which the test question to be determined belongs as the second type; or
when the number of phrases is greater than the second preset threshold value and less than or equal to a third preset threshold value, determining the vocabulary type to which the test question to be determined belongs as the third type; or
when the number of phrases is greater than the third preset threshold value, determining the vocabulary type to which the test question to be determined belongs as another type.
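A minimal sketch of the threshold comparison in claim 4 above; the default threshold values used here (0, 2 and 5) are assumptions for illustration only, since the patent leaves the preset thresholds unspecified.

```python
def classify_vocabulary_type(phrase_count: int,
                             first_threshold: int = 0,
                             second_threshold: int = 2,
                             third_threshold: int = 5) -> str:
    """Map the phrase count onto the first/second/third/other vocabulary types."""
    if phrase_count == first_threshold:
        return "first"
    if first_threshold < phrase_count <= second_threshold:
        return "second"
    if second_threshold < phrase_count <= third_threshold:
        return "third"
    return "other"
```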
5. The AI-recognition based answer determination method of claim 1, wherein prior to inputting the stem and the plurality of options into a pre-trained Bert model, the AI-recognition based answer determination method further comprises:
obtaining sentence pairs from a QQP data set, and obtaining labels corresponding to the sentence pairs;
combining an MLM mechanism and an NSP mechanism to obtain a semantic vector network layer;
calculating the sentence pairs by utilizing the semantic vector network layer to obtain semantic vectors with context semantic information;
calculating the semantic vector through a pre-constructed similarity calculation network layer to obtain the similarity of the sentence pair;
optimizing the semantic vector network layer and the similarity calculation network layer according to the similarity of the sentence pairs and the labels to obtain a learner;
determining the source of the test questions to be determined;
and acquiring a preset number of test questions from the source, and finely adjusting the learner by using the test questions to obtain the Bert model.
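As a hedged sketch only, the pair-similarity training on QQP-style data described in claim 5 above could be approximated with the Hugging Face transformers library as below. This substitutes a standard sequence-classification head for the custom semantic-vector and similarity-calculation network layers of the claim, and the checkpoint name, batch layout and hyper-parameters are assumptions rather than details from the patent.

```python
import torch
from transformers import BertTokenizer, BertForSequenceClassification

# MLM + NSP pre-training is inherited from the published checkpoint;
# only the pair-similarity fine-tuning on QQP-style sentence pairs is shown.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

def train_step(sentence_pairs, labels):
    """One optimization step on a batch of (question1, question2) pairs with 0/1 labels."""
    batch = tokenizer([p[0] for p in sentence_pairs],
                      [p[1] for p in sentence_pairs],
                      padding=True, truncation=True, return_tensors="pt")
    batch["labels"] = torch.tensor(labels)
    model.train()
    loss = model(**batch).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```

The fine-tuning on a preset number of test questions from the determined source, recited at the end of the claim, would repeat the same loop on that smaller data set.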
6. The AI recognition-based answer determination method of claim 1, wherein the converting a target vocabulary in the question stem and the options into a GloVe word vector, the converting the question stem and the plurality of options into a FastText word vector, and the calculating the similarity between the question stem and each option based on the GloVe word vector and the FastText word vector comprise:
obtaining a third word vector corresponding to each word in the target vocabulary from a first configuration file, and calculating an average value of the third word vectors to obtain a first GloVe word vector;
for each option, acquiring a fourth word vector corresponding to each word from the first configuration file, and calculating an average value of the fourth word vectors to obtain a second GloVe word vector of each option;
calculating the distance between the first GloVe word vector and each second GloVe word vector by using a cosine distance formula, wherein the distance is used as a first distance between the question stem and each option;
acquiring a fifth word vector corresponding to each word in the target vocabulary from a second configuration file, and calculating the average value of the fifth word vectors to obtain a first FastText word vector;
for each option, acquiring a sixth word vector corresponding to each word from the second configuration file, and calculating the average value of the sixth word vectors to obtain a second FastText word vector of each option;
calculating the distance between the first FastText word vector and each second FastText word vector by using a cosine distance formula, wherein the distance is used as a second distance between the question stem and each option;
and performing weighted sum operation on the first distance and the second distance, and taking an operation result as the similarity of the question stem and each option.
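The dual-vector similarity of claim 6 above, sketched with numpy. Here `glove` and `fasttext` stand for the word-vector lookup tables read from the first and second configuration files, the equal weights in the weighted sum are an assumption, and cosine similarity is used in place of the claim's cosine distance (one is the complement of the other).

```python
import numpy as np

def average_vector(words, table, dim=300):
    """Mean of the word vectors found in a lookup table (GloVe or FastText)."""
    found = [table[w] for w in words if w in table]
    return np.mean(found, axis=0) if found else np.zeros(dim)

def cosine(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(np.dot(u, v) / denom) if denom else 0.0

def dual_vector_similarity(stem_words, option_words, glove, fasttext,
                           glove_weight=0.5, fasttext_weight=0.5):
    """Weighted sum of the GloVe-space and FastText-space similarities
    between the target vocabulary of the stem and one option."""
    glove_sim = cosine(average_vector(stem_words, glove),
                       average_vector(option_words, glove))
    fasttext_sim = cosine(average_vector(stem_words, fasttext),
                          average_vector(option_words, fasttext))
    return glove_weight * glove_sim + fasttext_weight * fasttext_sim
```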
7. The AI-recognition-based answer determination method of claim 1, wherein the calculating the similarity of the stem to each option using the first word vector and the second word vector comprises:
sequentially splicing the first word vector and the second word vector to obtain the question stem and the target word vectors of the multiple options;
and calculating the distance between the question stem and each option in the multiple options according to the target word vector based on a cosine distance formula to obtain the similarity between the question stem and each option.
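A sketch of the cross-lingual comparison in claim 7 above: the first (target-language) and second (translated-language) vectors are concatenated into one target word vector per text, and the stem is compared with each option by cosine similarity. The vectors are assumed to be pre-computed elsewhere, and the cosine helper is repeated so the snippet stands alone.

```python
import numpy as np

def cosine(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(np.dot(u, v) / denom) if denom else 0.0

def cross_lingual_similarity(stem_first_vec, stem_second_vec,
                             option_first_vec, option_second_vec):
    """Concatenate the two language-specific vectors and compare stem
    and option in the joint vector space."""
    stem_target = np.concatenate([stem_first_vec, stem_second_vec])
    option_target = np.concatenate([option_first_vec, option_second_vec])
    return cosine(stem_target, option_target)
```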
8. An answer determination device based on AI recognition, characterized in that the answer determination device based on AI recognition comprises:
the device comprises a determining unit, a judging unit and a judging unit, wherein the determining unit is used for acquiring a test question to be determined and determining the type of the test question to be determined, and the test question to be determined comprises a question stem and a plurality of options;
the determining unit is further configured to determine the number of phrases in the test question to be determined when the test question type is a similar meaning word test question type;
the determining unit is further configured to determine the vocabulary types to which the test questions to be determined belong according to the number of the phrases, where the vocabulary types include a first type, a second type, and a third type;
the input unit is used for inputting the question stem and the options into a pre-trained Bert model when the vocabulary type is the first type, so that the similarity between the question stem and each option is obtained; or
the calculation unit is used for, when the vocabulary type is the second type, converting a target vocabulary in the question stem and the options into a GloVe word vector, converting the question stem and the plurality of options into a FastText word vector, and calculating the similarity between the question stem and each option based on the GloVe word vector and the FastText word vector; or
the calculation unit is further configured to, when the vocabulary type is the third type, determine a target language of a target test question, obtain a first word vector of the question stem and the multiple options based on the target language, translate the target test question into another language other than the target language, obtain a second word vector of the question stem and the multiple options based on the other language, and calculate the similarity between the question stem and each option by using the first word vector and the second word vector;
the determining unit is further configured to determine the option with the highest similarity as the answer to the test question to be determined.
9. An electronic device, characterized in that the electronic device comprises:
a memory storing at least one instruction; and
a processor that executes the instructions stored in the memory to implement the answer determination method based on AI recognition according to any one of claims 1 to 7.
10. A computer-readable storage medium, wherein at least one instruction is stored in the computer-readable storage medium, and the at least one instruction is executed by a processor in an electronic device to implement the answer determination method based on AI recognition according to any one of claims 1 to 7.
CN202010437416.5A 2020-05-21 2020-05-21 Answer determination method and device based on AI (Artificial Intelligence) recognition, electronic equipment and medium Active CN111680515B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010437416.5A CN111680515B (en) 2020-05-21 2020-05-21 Answer determination method and device based on AI (Artificial Intelligence) recognition, electronic equipment and medium


Publications (2)

Publication Number Publication Date
CN111680515A CN111680515A (en) 2020-09-18
CN111680515B (en) 2022-05-03

Family

ID=72434245

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010437416.5A Active CN111680515B (en) 2020-05-21 2020-05-21 Answer determination method and device based on AI (Artificial Intelligence) recognition, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN111680515B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112801829B (en) * 2020-12-31 2024-04-30 科大讯飞股份有限公司 Method and device for correlation of test question prediction network model


Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9342561B2 (en) * 2014-01-08 2016-05-17 International Business Machines Corporation Creating and using titles in untitled documents to answer questions
US10248653B2 (en) * 2014-11-25 2019-04-02 Lionbridge Technologies, Inc. Information technology platform for language translation and task management
US9785252B2 (en) * 2015-07-28 2017-10-10 Fitnii Inc. Method for inputting multi-language texts
CN107220380A (en) * 2017-06-27 2017-09-29 北京百度网讯科技有限公司 Question and answer based on artificial intelligence recommend method, device and computer equipment

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106997376A (en) * 2017-02-28 2017-08-01 浙江大学 The problem of one kind is based on multi-stage characteristics and answer sentence similarity calculating method
CN108304451A (en) * 2017-12-13 2018-07-20 中国科学院自动化研究所 Multiple-choice question answers method and device
CN109344236A (en) * 2018-09-07 2019-02-15 暨南大学 One kind being based on the problem of various features similarity calculating method
CN109947836A (en) * 2019-03-21 2019-06-28 江西风向标教育科技有限公司 English paper structural method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
An Approach for Extracting Exact Answers to Question Answering (QA) System for English Sentences; Raju Barskar et al.; Procedia Engineering; 2012-12-31; Vol. 30, pp. 1187-1194 *
Method for Obtaining Answers to Prose Reading Comprehension Questions Based on Word Association; Qiao Pei et al.; Journal of Chinese Information Processing; 2018-03-31; Vol. 32, No. 3, pp. 135-142 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant