CN111680515B - Answer determination method and device based on AI (Artificial Intelligence) recognition, electronic equipment and medium

Info

Publication number
CN111680515B
CN111680515B (application CN202010437416.5A; earlier publication CN111680515A)
Authority
CN
China
Prior art keywords
type
stem
question
word vector
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010437416.5A
Other languages
Chinese (zh)
Other versions
CN111680515A (en)
Inventor
郑喜民
喻宁
冯晶凌
柳阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An International Smart City Technology Co Ltd
Original Assignee
Ping An International Smart City Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An International Smart City Technology Co Ltd filed Critical Ping An International Smart City Technology Co Ltd
Priority to CN202010437416.5A
Publication of CN111680515A
Application granted
Publication of CN111680515B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to artificial intelligence and is applied to the field of intelligent education, providing an answer determination method based on AI recognition. The method obtains a test question and determines its question type. When the question type is a near-synonym question type, the number of phrases and the vocabulary type are determined. When the vocabulary type is the first type, the question stem and the options are input into a pre-trained Bert model to obtain the similarity between the stem and each option. When the vocabulary type is the second type, the stem and the options are converted into GloVe word vectors and into FastText word vectors, and the similarity between the stem and each option is calculated from both. When the vocabulary type is the third type, a target language and a first word vector are determined, the test question is translated into another language and a second word vector is determined, and the similarity between the stem and each option is calculated from the first and second word vectors. The option with the highest similarity is determined as the answer.

Description

Answer determination method and device based on AI (Artificial Intelligence) recognition, electronic equipment and medium
Technical Field
The present invention relates to the field of intelligent decision making technologies, and in particular, to an answer determination method and apparatus based on AI recognition, an electronic device, and a medium.
Background
With the development of artificial intelligence, intelligent robots that coach examinees have gradually appeared on the market. When such robots determine the answers to near-synonym questions, they currently rely on a single text-similarity measure, which yields low answer accuracy and, in addition, poor generalization.
Disclosure of Invention
In view of the above, it is desirable to provide an answer determination method, apparatus, electronic device, and medium based on AI recognition that can not only improve the accuracy of answer determination but also ensure the generalization capability of the similarity calculation method.
An answer determination method based on AI recognition, the answer determination method based on AI recognition comprising:
obtaining a test question to be determined, and determining the type of the test question to be determined, wherein the test question to be determined comprises a question stem and a plurality of options;
when the test question type is a similar meaning word test question type, determining the number of phrases in the test question to be determined;
determining the vocabulary types to which the test questions to be determined belong according to the phrase number, wherein the vocabulary types comprise a first type, a second type and a third type;
when the vocabulary type is the first type, inputting the question stem and the options into a pre-trained Bert model to obtain the similarity between the question stem and each option; or
when the vocabulary type is the second type, converting a target vocabulary in the question stem and the options into a GloVe word vector, converting the question stem and the options into a FastText word vector, and calculating the similarity between the question stem and each option based on the GloVe word vector and the FastText word vector; or
when the vocabulary type is the third type, determining a target language of the target test question, acquiring the first word vectors of the question stem and the multiple options based on the target language, translating the target test question into another language other than the target language, acquiring the second word vectors of the question stem and the multiple options based on the other language, and calculating the similarity between the question stem and each option by using the first word vectors and the second word vectors;
and determining the option with the highest similarity as the answer of the test question to be determined.
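To make the overall flow of the steps above concrete, the following minimal Python sketch shows how the three similarity strategies could be dispatched by vocabulary type. The function and type names are illustrative assumptions, not part of the claims, and the three strategy functions are passed in as callables rather than implemented here.

```python
def determine_answer(stem, options, vocabulary_type,
                     bert_sim, glove_fasttext_sim, cross_lingual_sim):
    """Return the option most similar to the stem; the strategy depends on the vocabulary type."""
    strategies = {
        "first": bert_sim,             # pre-trained Bert model scores each (stem, option) pair
        "second": glove_fasttext_sim,  # dual GloVe + FastText word vectors
        "third": cross_lingual_sim,    # word vectors in the target language and a translated language
    }
    scores = strategies[vocabulary_type](stem, options)
    # the option with the highest similarity is taken as the answer
    return max(zip(options, scores), key=lambda pair: pair[1])[0]
```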
According to the preferred embodiment of the present invention, the determining the test question type of the test question to be determined includes:
acquiring a preset identifier;
detecting whether the preset identification exists in the test question to be determined;
when the preset identification exists in the test question to be determined, determining the test question to be determined as the type of the test question of the similar meaning word;
and when the preset identification does not exist in the test question to be determined, determining the test question to be determined as other test question types.
According to the preferred embodiment of the present invention, the determining the number of phrases in the test question to be determined includes:
extracting information corresponding to the preset label from the question stem to serve as a target vocabulary;
calculating the number of words in the target vocabulary and each option;
determining the total word amount of the target vocabulary and the multiple options according to the word amount;
determining the number of the multiple options to obtain the number of the options;
and performing difference operation on the total word amount and the option amount, and subtracting one from the difference operation result to obtain the phrase amount.
According to the preferred embodiment of the present invention, the determining the vocabulary type to which the test question to be determined belongs according to the number of phrases comprises:
when the phrase number is a first preset threshold value, determining the vocabulary type to which the test question to be determined belongs as the first type; or
when the phrase number is larger than the first preset threshold value and smaller than or equal to a second preset threshold value, determining the vocabulary type to which the test question to be determined belongs as the second type; or
when the phrase number is larger than the second preset threshold value and smaller than or equal to a third preset threshold value, determining the vocabulary type to which the test question to be determined belongs as the third type; or
when the phrase number is larger than the third preset threshold value, determining the vocabulary type to which the test question to be determined belongs as other types.
According to a preferred embodiment of the present invention, before inputting the stem and the plurality of options into the pre-trained Bert model, the method further comprises:
obtaining sentence pairs from the QQP data set, and obtaining labels corresponding to the sentence pairs;
combining an MLM mechanism and an NSP mechanism to obtain a semantic vector network layer;
calculating the sentence pairs by utilizing the semantic vector network layer to obtain semantic vectors with context semantic information;
calculating the semantic vector through a pre-constructed similarity calculation network layer to obtain the similarity of the sentence pair;
optimizing the semantic vector network layer and the similarity calculation network layer according to the similarity of the sentence pairs and the labels to obtain a learner;
determining the source of the test questions to be determined;
and acquiring a preset number of test questions from the source, and finely adjusting the learner by using the test questions to obtain the Bert model.
According to a preferred embodiment of the present invention, the converting the target vocabulary and the plurality of options in the stem into a GloVe word vector and the converting the stem and the plurality of options into a FastText word vector, and the calculating the similarity between the stem and each option based on the GloVe word vector and the FastText word vector comprises:
obtaining a third word vector corresponding to each word in the target vocabulary from a first configuration file, and calculating an average value of the third word vectors to obtain a first GloVe word vector;
for each option, acquiring a fourth word vector corresponding to each word from the first configuration file, and calculating an average value of the fourth word vectors to obtain a second GloVe word vector of each option;
calculating the distance between the first GloVe word vector and each second GloVe word vector by using a cosine distance formula, wherein the distance is used as a first distance between the question stem and each option;
acquiring a fifth word vector corresponding to each word in the target vocabulary from a second configuration file, and calculating the average value of the fifth word vectors to obtain a first FastText word vector;
for each option, acquiring a sixth word vector corresponding to each word from the second configuration file, and calculating the average value of the sixth word vectors to obtain a second FastText word vector of each option;
calculating the distance between the first FastText word vector and each second FastText word vector by using a cosine distance formula, wherein the distance is used as a second distance between the question stem and each option;
and performing weighted sum operation on the first distance and the second distance, and taking an operation result as the similarity of the question stem and each option.
According to a preferred embodiment of the present invention, the calculating the similarity between the stem and each option using the first word vector and the second word vector comprises:
sequentially splicing the first word vector and the second word vector to obtain the target word vectors of the question stem and the multiple options;
and calculating the distance between the question stem and each option in the multiple options according to the target word vector based on a cosine distance formula to obtain the similarity between the question stem and each option.
An answer determination device based on AI recognition, the answer determination device based on AI recognition comprising:
the device comprises a determining unit, a judging unit and a judging unit, wherein the determining unit is used for acquiring a test question to be determined and determining the type of the test question to be determined, and the test question to be determined comprises a question stem and a plurality of options;
the determining unit is further configured to determine the number of phrases in the test question to be determined when the test question type is a near meaning word test question type;
the determining unit is further configured to determine the vocabulary types to which the test questions to be determined belong according to the number of the phrases, where the vocabulary types include a first type, a second type, and a third type;
the input unit is used for inputting the question stem and the options into a pre-trained Bert model when the vocabulary type is the first type, so that the similarity between the question stem and each option is obtained; or
The calculation unit is used for converting a target vocabulary in the stem and the options into GloVe word vectors, converting the stem and the options into FastText word vectors and calculating the similarity between the stem and each option based on the GloVe word vectors and the FastText word vectors when the vocabulary type is the second type; or
The calculation unit is further configured to, when the vocabulary type is the third type, determine a target language of the target test question, obtain the stem and a first word vector of the multiple options based on the target language, translate the target test question into another language except for the target language, obtain the stem and a second word vector of the multiple options based on the another language, and calculate a similarity between the stem and each option by using the first word vector and the second word vector;
the determining unit is further configured to determine the option with the highest similarity as the answer to the test question to be determined.
According to the preferred embodiment of the present invention, the determining of the test question type of the test question to be determined by the determining unit includes:
acquiring a preset identifier;
detecting whether the preset identification exists in the test question to be determined;
when the preset identification exists in the test question to be determined, determining the test question to be determined as the type of the test question of the similar meaning word;
and when the preset identification does not exist in the test question to be determined, determining the test question to be determined as other test question types.
According to a preferred embodiment of the present invention, the determining unit determining the number of phrases in the test question to be determined includes:
extracting information corresponding to the preset label from the question stem to serve as a target vocabulary;
calculating the number of words in the target vocabulary and each option;
determining the total word amount of the target vocabulary and the multiple options according to the word amount;
determining the number of the multiple options to obtain the number of the options;
and performing difference operation on the total word amount and the option amount, and subtracting one from the difference operation result to obtain the phrase amount.
According to the preferred embodiment of the present invention, the determining unit determining the vocabulary type to which the test question to be determined belongs according to the number of phrases comprises:
when the phrase number is a first preset threshold value, determining the vocabulary type to which the test question to be determined belongs as the first type; or
when the phrase number is larger than the first preset threshold value and smaller than or equal to a second preset threshold value, determining the vocabulary type to which the test question to be determined belongs as the second type; or
when the phrase number is larger than the second preset threshold value and smaller than or equal to a third preset threshold value, determining the vocabulary type to which the test question to be determined belongs as the third type; or
when the phrase number is larger than the third preset threshold value, determining the vocabulary type to which the test question to be determined belongs as other types.
According to a preferred embodiment of the invention, the apparatus further comprises:
an obtaining unit, configured to obtain sentence pairs from the QQP data set and obtain labels corresponding to the sentence pairs before the stem and the options are input into a pre-trained Bert model;
the combining unit is used for combining the MLM mechanism and the NSP mechanism to obtain a semantic vector network layer;
the computing unit is further configured to compute the sentence pair by using the semantic vector network layer to obtain a semantic vector with context semantic information;
the computing unit is further configured to compute the semantic vector through a pre-constructed similarity computing network layer to obtain the similarity of the sentence pair;
the optimizing unit is used for optimizing the semantic vector network layer and the similarity calculation network layer according to the similarity of the sentence pairs and the labels to obtain a learner;
the determining unit is further configured to determine a source of the test question to be determined;
and the adjusting unit is used for acquiring a preset number of test questions from the source and finely adjusting the learner by utilizing the test questions to obtain the Bert model.
According to a preferred embodiment of the present invention, the calculating unit converts the target vocabulary and the plurality of options in the stem into a GloVe word vector, converts the stem and the plurality of options into a FastText word vector, and calculates the similarity between the stem and each option based on the GloVe word vector and the FastText word vector includes:
obtaining a third word vector corresponding to each word in the target vocabulary from a first configuration file, and calculating an average value of the third word vectors to obtain a first GloVe word vector;
for each option, acquiring a fourth word vector corresponding to each word from the first configuration file, and calculating an average value of the fourth word vectors to obtain a second GloVe word vector of each option;
calculating the distance between the first GloVe word vector and each second GloVe word vector by using a cosine distance formula, wherein the distance is used as a first distance between the question stem and each option;
acquiring a fifth word vector corresponding to each word in the target vocabulary from a second configuration file, and calculating the average value of the fifth word vectors to obtain a first FastText word vector;
for each option, acquiring a sixth word vector corresponding to each word from the second configuration file, and calculating the average value of the sixth word vectors to obtain a second FastText word vector of each option;
calculating the distance between the first FastText word vector and each second FastText word vector by using a cosine distance formula, wherein the distance is used as a second distance between the question stem and each option;
and performing weighted sum operation on the first distance and the second distance, and taking an operation result as the similarity of the question stem and each option.
According to a preferred embodiment of the present invention, the calculating unit calculates the similarity between the stem and each option by using the first word vector and the second word vector includes:
sequentially splicing the first word vector and the second word vector to obtain the target word vectors of the question stem and the multiple options;
and calculating the distance between the question stem and each option in the multiple options according to the target word vector based on a cosine distance formula to obtain the similarity between the question stem and each option.
An electronic device, the electronic device comprising:
a memory storing at least one instruction; and
a processor that executes the instructions stored in the memory to implement the answer determination method based on AI recognition.
A computer-readable storage medium having stored therein at least one instruction, the at least one instruction being executed by a processor in an electronic device to implement the answer determination method based on AI recognition.
According to the technical scheme, the vocabulary type to which the test question belongs can be quickly determined from the number of phrases. Adding the QQP data set to the training of the Bert model improves the robustness of the Bert model and further ensures the generalization capability of the similarity calculation method. Constructing dual vectors, a GloVe word vector and a FastText word vector, to calculate the similarity between the question stem and each option improves the accuracy of the similarity calculation while also improving calculation efficiency. Because the similarity between the stem and each option is calculated in a different way depending on the vocabulary type of the test question, the similarity can be obtained more accurately, which improves the accuracy of the answer determined for the test question. In addition, when the vocabulary type is the second or third type, the similarity calculation is unsupervised, which ensures the generalization capability of the similarity calculation method.
Drawings
FIG. 1 is a flowchart illustrating an answer determination method based on AI recognition according to a preferred embodiment of the present invention.
FIG. 2 is a functional block diagram of an answer determination apparatus based on AI recognition according to a preferred embodiment of the present invention.
Fig. 3 is a schematic structural diagram of an electronic device implementing an answer determination method based on AI identification according to a preferred embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a flowchart illustrating an answer determination method based on AI recognition according to a preferred embodiment of the present invention. The order of the steps in the flow chart may be changed and some steps may be omitted according to different needs.
The invention belongs to the field of intelligent education and can promote the construction of smart cities. The answer determination method based on AI recognition is applied to one or more electronic devices. An electronic device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions; its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The electronic device may be any electronic product capable of performing human-computer interaction with a user, for example, a Personal computer, a tablet computer, a smart phone, a Personal Digital Assistant (PDA), a game machine, an interactive Internet Protocol Television (IPTV), an intelligent wearable device, and the like.
The electronic device may also include a network device and/or a user device. The network device includes, but is not limited to, a single network server, a server group consisting of a plurality of network servers, or a Cloud Computing (Cloud Computing) based Cloud consisting of a large number of hosts or network servers.
The Network where the electronic device is located includes, but is not limited to, the internet, a wide area Network, a metropolitan area Network, a local area Network, a Virtual Private Network (VPN), and the like.
In at least one embodiment of the invention, the invention relates to the field of artificial intelligence.
S10, obtaining the test questions to be determined, and determining the test question types of the test questions to be determined, wherein the test questions to be determined comprise question stems and a plurality of options.
In at least one embodiment of the present invention, the test question to be determined may be obtained from a TOEFL Delta simulation platform, or may be extracted from a PDF test paper by using OCR (Optical Character Recognition) technology.
In at least one embodiment of the present invention, the test question types include a near word test question type and other test question types.
In at least one embodiment of the present invention, the electronic device determining the test question type of the test question to be determined includes:
The electronic device obtains a preset identifier and then detects whether the preset identifier exists in the test question to be determined. When the preset identifier exists in the test question to be determined, the electronic device determines that the test question to be determined is of the near-synonym question type; when it does not, the electronic device determines that the test question to be determined is of another question type.
The preset identifier may be a double quotation mark or an underline; the specific preset identifier may be determined according to the actual scene.
The question type of the test question to be determined can thus be determined quickly by directly detecting whether the preset identifier is present, as the following sketch illustrates.
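A minimal sketch of this detection, assuming (as an example only) that the preset identifier is either a pair of double quotation marks around the target vocabulary or an underline in the stem:

```python
import re

# Assumed preset identifiers: a double-quoted target vocabulary, or an underline of two or
# more characters; the actual identifier is chosen according to the scenario.
NEAR_SYNONYM_MARKERS = re.compile(r'"[^"]+"|_{2,}')

def is_near_synonym_question(stem: str) -> bool:
    """Return True when a preset identifier is present in the question stem."""
    return bool(NEAR_SYNONYM_MARKERS.search(stem))

print(is_near_synonym_question('The word "arduous" is closest in meaning to ...'))  # True
print(is_near_synonym_question('Which city is the capital of France?'))             # False
```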
S11, when the test question type is the similar meaning word test question type, determining the number of phrases in the test question to be determined.
In at least one embodiment of the invention, the number of phrases in the test question to be determined is the sum of the number of phrases in the target vocabulary and the number of phrases in the plurality of options.
In at least one embodiment of the present invention, the preset tag refers to an identifier that marks noun or verb vocabulary.
In at least one embodiment of the present invention, the electronic device determining the number of phrases in the test question to be determined comprises:
The electronic device extracts the information corresponding to the preset tag from the question stem as the target vocabulary, counts the number of words in the target vocabulary and in each option, determines from these counts the total number of words in the target vocabulary and the options, determines the number of options, and then subtracts the number of options from the total word count and reduces the result by one to obtain the phrase count.
The phrase count of the test question to be determined can thus be determined quickly by counting the words in the target vocabulary and each option, as the following sketch illustrates.
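As an illustration of this counting rule (the function name and example questions are hypothetical), the phrase count is the total word count of the target vocabulary and the options, minus the number of options, minus one:

```python
def phrase_count(target_vocabulary: str, options: list[str]) -> int:
    """Total words in the target vocabulary and all options, minus the option count, minus one."""
    total_words = len(target_vocabulary.split()) + sum(len(o.split()) for o in options)
    return total_words - len(options) - 1

# A single-word target with single-word options gives 0, i.e. the first vocabulary type.
print(phrase_count("arduous", ["difficult", "lengthy", "dangerous", "unusual"]))   # 0
# Multi-word phrases raise the count, pushing the question into the second or third type.
print(phrase_count("carry out", ["perform", "abandon", "look after", "put off"]))  # 3
```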
S12, determining the vocabulary types to which the test questions to be determined belong according to the phrase number, wherein the vocabulary types comprise a first type, a second type and a third type.
In at least one embodiment of the present invention, the determining, by the electronic device, the vocabulary type to which the test question to be determined belongs according to the number of phrases includes:
(1) When the phrase number equals the first preset threshold, the electronic device determines the vocabulary type to which the test question to be determined belongs as the first type.
(2) When the phrase number is greater than the first preset threshold and less than or equal to the second preset threshold, the electronic device determines the vocabulary type to which the test question to be determined belongs as the second type.
(3) When the phrase number is greater than the second preset threshold and less than or equal to the third preset threshold, the electronic device determines the vocabulary type to which the test question to be determined belongs as the third type.
(4) When the phrase number is greater than the third preset threshold, the electronic device determines the vocabulary type to which the test question to be determined belongs as other types.
Calculations over large-scale data show that the results are best when the first preset threshold is 0, the second preset threshold is 3, and the third preset threshold is 5.
In at least one embodiment of the present invention, when the vocabulary type is the other type, the electronic device may calculate the similarity between the stem and each option in a deep learning manner, so as to determine the answer to the test question to be determined, which is not specifically set forth herein.
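A minimal sketch of this mapping, using the thresholds 0, 3, and 5 stated above (the function and returned type names are illustrative):

```python
def vocabulary_type(phrase_count: int, first: int = 0, second: int = 3, third: int = 5) -> str:
    """Map the phrase count to a vocabulary type using the three preset thresholds."""
    if phrase_count == first:
        return "first"    # single words: Bert-based similarity
    if phrase_count <= second:
        return "second"   # short phrases: GloVe + FastText word vectors
    if phrase_count <= third:
        return "third"    # longer phrases: cross-lingual word vectors
    return "other"        # handled separately, e.g. by a deep-learning approach
```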
S13, when the vocabulary type is the first type, inputting the stem and the options into a pre-trained Bert model to obtain the similarity between the stem and each option.
In at least one embodiment of the invention, the Bert model comprises a semantic vector network layer and a similarity calculation network layer.
In at least one embodiment of the present invention, before inputting the stem and the plurality of options into the pre-trained Bert model, the method further comprises:
the electronic device acquires Sentence pairs on a QQP (Quora Question Pairs) data set and acquires labels corresponding to the Sentence pairs, further, the electronic device combines an MLM (masked Language model) mechanism and an NSP (Next Sentensice prediction) mechanism to obtain a semantic vector network layer, the electronic device calculates the Sentence pairs by using the semantic vector network layer to obtain semantic vectors with context semantic information, further, the electronic device calculates the semantic vectors by using a pre-constructed similarity calculation network layer to obtain the similarity of the Sentence pairs, the electronic device optimizes the semantic vector network layer and the similarity calculation network layer according to the similarity of the Sentence pairs and the labels to obtain a learner, the electronic device determines the sources of the test questions to be determined, and acquires a preset number of test questions from the sources, and utilizing the test questions to finely adjust the learner to obtain the Bert model.
Modeling the sentence pairs of the QQP data set bidirectionally through the MLM and NSP mechanisms yields a semantic vector network layer carrying context semantic information; optimizing the semantic vector network layer and the similarity calculation network layer with the sentence pairs improves the precision of the learner; and fine-tuning the learner with test questions from the source makes the Bert model better suited to the similarity calculation of the test question to be determined.
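A minimal scoring sketch with the HuggingFace transformers library is given below. It assumes the Bert model has already been pre-trained with the MLM/NSP mechanisms, optimized on QQP sentence pairs, and fine-tuned on questions from the same source; the checkpoint path is hypothetical, and using the positive-class probability as the similarity is an assumption for illustration.

```python
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
# Hypothetical checkpoint: a Bert sentence-pair classifier fine-tuned on QQP and on the question source.
model = BertForSequenceClassification.from_pretrained("path/to/fine-tuned-qqp-bert", num_labels=2)
model.eval()

def bert_similarity(stem: str, options: list[str]) -> list[float]:
    """Score each (stem, option) sentence pair; the paraphrase-class probability is the similarity."""
    scores = []
    with torch.no_grad():
        for option in options:
            inputs = tokenizer(stem, option, return_tensors="pt", truncation=True)
            probs = torch.softmax(model(**inputs).logits, dim=-1)
            scores.append(probs[0, 1].item())
    return scores
```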
S14, when the vocabulary type is the second type, converting the target vocabulary in the stem and the options into GloVe word vectors, converting the stem and the options into FastText word vectors, and calculating the similarity between the stem and each option based on the GloVe word vectors and the FastText word vectors.
In at least one embodiment of the present invention, the electronic device converts the target vocabulary and the options in the stem into a GloVe word vector, converts the stem and the options into a FastText word vector, and calculates the similarity between the stem and each option based on the GloVe word vector and the FastText word vector includes:
The electronic device obtains, from a first configuration file, a third word vector corresponding to each word in the target vocabulary and averages the third word vectors to obtain a first GloVe word vector. For each option, it obtains a fourth word vector corresponding to each word from the first configuration file and averages the fourth word vectors to obtain a second GloVe word vector for that option. Using a cosine distance formula, it calculates the distance between the first GloVe word vector and each second GloVe word vector as the first distance between the stem and each option. Likewise, it obtains, from a second configuration file, a fifth word vector corresponding to each word in the target vocabulary and averages the fifth word vectors to obtain a first FastText word vector; for each option, it obtains a sixth word vector corresponding to each word from the second configuration file and averages the sixth word vectors to obtain a second FastText word vector for that option. Using the cosine distance formula, it calculates the distance between the first FastText word vector and each second FastText word vector as the second distance between the stem and each option. Finally, the electronic device computes a weighted sum of the first distance and the second distance and takes the result as the similarity between the stem and each option.
The first configuration file stores the mapping relation between a plurality of words and GloVe word vectors and may be glove.840B.300d.txt; the second configuration file stores the mapping relation between a plurality of words and FastText word vectors and may be crawl-300d-2M.vec.
In this way, the similarity between the stem and each option is calculated by constructing dual GloVe and FastText word vectors, which improves the accuracy of the similarity calculation while also improving calculation efficiency.
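A minimal sketch of the dual-vector calculation with numpy follows. The vector files are those named above (glove.840B.300d.txt and crawl-300d-2M.vec); the equal weights and the use of cosine similarity directly as the score are assumptions, and out-of-vocabulary words are simply skipped.

```python
import numpy as np

def load_vectors(path: str) -> dict:
    """Load word -> vector mappings from a GloVe/FastText text file (word followed by floats)."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            if len(parts) > 2:                      # skips the FastText header line
                vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors

def mean_vector(text: str, vectors: dict) -> np.ndarray:
    """Average the vectors of the in-vocabulary words of a phrase."""
    words = [w for w in text.split() if w in vectors]
    if not words:
        raise ValueError(f"no in-vocabulary words in: {text!r}")
    return np.mean([vectors[w] for w in words], axis=0)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def glove_fasttext_similarity(target_vocabulary: str, options: list[str],
                              glove: dict, fasttext: dict,
                              w_glove: float = 0.5, w_fasttext: float = 0.5) -> list[float]:
    """Weighted sum of the GloVe-based and FastText-based scores for each option."""
    g_stem = mean_vector(target_vocabulary, glove)
    f_stem = mean_vector(target_vocabulary, fasttext)
    return [w_glove * cosine(g_stem, mean_vector(o, glove)) +
            w_fasttext * cosine(f_stem, mean_vector(o, fasttext))
            for o in options]

# glove = load_vectors("glove.840B.300d.txt")    # first configuration file
# fasttext = load_vectors("crawl-300d-2M.vec")   # second configuration file
```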
S15, when the vocabulary type is the third type, determining a target language of the target test question, acquiring the first word vectors of the stem and the multiple options based on the target language, translating the target test question into another language other than the target language, acquiring the second word vectors of the stem and the multiple options based on the other language, and calculating the similarity between the stem and each option by using the first word vectors and the second word vectors.
In at least one embodiment of the present invention, the other languages are languages other than the target language, such as: when the target language is English, other languages may be German or French.
In at least one embodiment of the present invention, the calculating, by the electronic device, the similarity between the stem and each option using the first word vector and the second word vector includes:
The electronic device sequentially splices the first word vector and the second word vector to obtain the target word vectors of the question stem and the multiple options; then, based on a cosine distance formula, it calculates the distance between the question stem and each of the multiple options from the target word vectors to obtain the similarity between the question stem and each option.
The dimensionality of the target word vector is the sum of the dimensionalities of the first word vector and the second word vector.
By increasing the dimensionality of the target word vector, the accuracy of the similarity calculation between the question stem and each option can be improved.
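A minimal sketch of this step, assuming the per-language vectors have already been produced upstream (translation of the question plus a word-vector lookup in each language): the vectors for the original language play the role of the first word vectors and those for the translated language the second word vectors.

```python
import numpy as np

def cross_lingual_similarity(stem_vec_src: np.ndarray, option_vecs_src: list[np.ndarray],
                             stem_vec_other: np.ndarray, option_vecs_other: list[np.ndarray]) -> list[float]:
    """Concatenate the first and second word vectors, then score each option by cosine similarity."""
    stem_vec = np.concatenate([stem_vec_src, stem_vec_other])   # dimensionality = sum of both vectors
    scores = []
    for v_src, v_other in zip(option_vecs_src, option_vecs_other):
        opt_vec = np.concatenate([v_src, v_other])
        scores.append(float(np.dot(stem_vec, opt_vec) /
                            (np.linalg.norm(stem_vec) * np.linalg.norm(opt_vec))))
    return scores
```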
S16, determining the option with the highest similarity as the answer of the test question to be determined.
In at least one embodiment of the present invention, after determining the option with the highest similarity as the answer to the test question to be determined, the method further includes:
The electronic device obtains the question number of the test question to be determined, generates prompt information from the question number and the answer, and sends the prompt information to the terminal device of a designated contact.
In this way, the designated contact can be reminded to view the answer promptly.
It is emphasized that to further ensure the privacy and security of the answer, the answer may also be stored in a node of a blockchain.
According to the technical scheme, the vocabulary type to which the test question belongs can be quickly determined from the number of phrases. Adding the QQP data set to the training of the Bert model improves the robustness of the Bert model and further ensures the generalization capability of the similarity calculation method. Constructing dual vectors, a GloVe word vector and a FastText word vector, to calculate the similarity between the question stem and each option improves the accuracy of the similarity calculation while also improving calculation efficiency. Because the similarity between the stem and each option is calculated in a different way depending on the vocabulary type of the test question, the similarity can be obtained more accurately, which improves the accuracy of the answer determined for the test question. In addition, when the vocabulary type is the second or third type, the similarity calculation is unsupervised, which ensures the generalization capability of the similarity calculation method.
Fig. 2 is a functional block diagram of an answer determination apparatus based on AI recognition according to a preferred embodiment of the present invention. The answer determination apparatus 11 based on AI recognition includes a determining unit 110, an input unit 111, a calculating unit 112, an obtaining unit 113, a combining unit 114, an optimizing unit 115, an adjusting unit 116, a generating unit 117, and a transmitting unit 118. A module/unit referred to in the present invention is a series of computer program segments that are stored in the memory 12, can be executed by the processor 13, and perform a fixed function. The functions of the modules/units are described in detail in the following embodiments.
The determining unit 110 obtains a test question to be determined, and determines a test question type of the test question to be determined, where the test question to be determined includes a question stem and a plurality of options.
In at least one embodiment of the present invention, the test question to be determined may be obtained from a TOEFL Delta simulation platform, or may be extracted from a PDF test paper by using OCR (Optical Character Recognition) technology.
In at least one embodiment of the present invention, the test question types include a near word test question type and other test question types.
In at least one embodiment of the present invention, the determining unit 110 determines the test question type of the test question to be determined, including:
The determining unit 110 obtains a preset identifier and then detects whether the preset identifier exists in the test question to be determined. When the preset identifier exists in the test question to be determined, the determining unit 110 determines that the test question to be determined is of the near-synonym question type; when it does not, the determining unit 110 determines that the test question to be determined is of another question type.
The preset identifier may be a double quotation mark or an underline; the specific preset identifier may be determined according to the actual scene.
The question type of the test question to be determined can thus be determined quickly by directly detecting whether the preset identifier is present.
When the test question type is a similar meaning word test question type, the determining unit 110 determines the number of phrases in the test question to be determined.
In at least one embodiment of the invention, the number of phrases in the test question to be determined is the sum of the number of phrases in the target vocabulary and the number of phrases in the plurality of options.
In at least one embodiment of the present invention, the preset tag refers to an identifier that marks noun or verb vocabulary.
In at least one embodiment of the present invention, the determining unit 110 determines the number of phrases in the test question to be determined, including:
The determining unit 110 extracts the information corresponding to the preset tag from the question stem as the target vocabulary, counts the number of words in the target vocabulary and in each option, determines from these counts the total number of words in the target vocabulary and the options, determines the number of options, and then subtracts the number of options from the total word count and reduces the result by one to obtain the phrase count.
The phrase count of the test question to be determined can thus be determined quickly by counting the words in the target vocabulary and each option.
The determining unit 110 determines the vocabulary types to which the test questions to be determined belong according to the phrase number, wherein the vocabulary types include a first type, a second type and a third type.
In at least one embodiment of the present invention, the determining unit 110 determines the vocabulary type to which the test question to be determined belongs according to the number of phrases, including:
(1) when the number of phrases is a first preset threshold, the determining unit 110 determines the vocabulary type to which the test question to be determined belongs as the first type.
(2) When the number of phrases is greater than the first preset threshold and the number of phrases is less than or equal to a second preset threshold, the determining unit 110 determines the vocabulary type to which the test question to be determined belongs as the second type.
(3) When the number of phrases is greater than the second preset threshold and the number of phrases is less than or equal to a third preset threshold, the determining unit 110 determines the vocabulary type to which the test question to be determined belongs as the third type.
(4) When the number of phrases is greater than the third preset threshold, the determining unit 110 determines the vocabulary type to which the test question to be determined belongs as another type.
Calculations over large-scale data show that the results are best when the first preset threshold is 0, the second preset threshold is 3, and the third preset threshold is 5.
In at least one embodiment of the present invention, when the vocabulary type is the other type, the determining unit 110 may calculate the similarity between the stem and each option in a deep learning manner, and further determine the answer to the test question to be determined, which is not specifically set forth herein.
When the vocabulary type is the first type, the input unit 111 inputs the stem and the plurality of options into a pre-trained Bert model to obtain the similarity between the stem and each option.
In at least one embodiment of the invention, the Bert model comprises a semantic vector network layer and a similarity calculation network layer.
In at least one embodiment of the present invention, before the stem and the plurality of options are input into a pre-trained Bert model, the obtaining unit 113 obtains sentence pairs from the QQP (Quora Question Pairs) data set and obtains the labels corresponding to the sentence pairs. Further, the combining unit 114 combines an MLM (Masked Language Model) mechanism and an NSP (Next Sentence Prediction) mechanism to obtain a semantic vector network layer, and the calculating unit 112 calculates the sentence pairs with the semantic vector network layer to obtain semantic vectors carrying context semantic information. The calculating unit 112 then calculates the semantic vectors with a pre-constructed similarity calculation network layer to obtain the similarity of each sentence pair, and the optimizing unit 115 optimizes the semantic vector network layer and the similarity calculation network layer according to the similarities of the sentence pairs and the labels to obtain a learner. The determining unit 110 determines the source of the test question to be determined, and the adjusting unit 116 obtains a preset number of test questions from that source and fine-tunes the learner with these test questions to obtain the Bert model.
Modeling the sentence pairs of the QQP data set bidirectionally through the MLM and NSP mechanisms yields a semantic vector network layer carrying context semantic information; optimizing the semantic vector network layer and the similarity calculation network layer with the sentence pairs improves the precision of the learner; and fine-tuning the learner with test questions from the source makes the Bert model better suited to the similarity calculation of the test question to be determined.
When the vocabulary type is the second type, the calculating unit 112 converts the target vocabulary in the stem and the options into GloVe word vectors, converts the stem and the options into FastText word vectors, and calculates the similarity between the stem and each option based on the GloVe word vectors and the FastText word vectors.
In at least one embodiment of the present invention, the calculating unit 112 converts the target vocabulary and the plurality of options in the stem into a GloVe word vector, converts the stem and the plurality of options into a FastText word vector, and calculates the similarity between the stem and each option based on the GloVe word vector and the FastText word vector includes:
The calculating unit 112 obtains, from a first configuration file, a third word vector corresponding to each word in the target vocabulary and averages the third word vectors to obtain a first GloVe word vector. For each option, it obtains a fourth word vector corresponding to each word from the first configuration file and averages the fourth word vectors to obtain a second GloVe word vector for that option. Using a cosine distance formula, it calculates the distance between the first GloVe word vector and each second GloVe word vector as the first distance between the stem and each option. Likewise, it obtains, from a second configuration file, a fifth word vector corresponding to each word in the target vocabulary and averages the fifth word vectors to obtain a first FastText word vector; for each option, it obtains a sixth word vector corresponding to each word from the second configuration file and averages the sixth word vectors to obtain a second FastText word vector for that option. Using the cosine distance formula, it calculates the distance between the first FastText word vector and each second FastText word vector as the second distance between the stem and each option. Finally, the calculating unit 112 computes a weighted sum of the first distance and the second distance and takes the result as the similarity between the stem and each option.
The first configuration file stores the mapping relation between a plurality of words and GloVe word vectors and may be glove.840B.300d.txt; the second configuration file stores the mapping relation between a plurality of words and FastText word vectors and may be crawl-300d-2M.vec.
In this way, the similarity between the stem and each option is calculated by constructing dual GloVe and FastText word vectors, which improves the accuracy of the similarity calculation while also improving calculation efficiency.
When the vocabulary type is the third type, the calculating unit 112 determines a target language of the target test question, obtains the first word vectors of the stem and the plurality of options based on the target language, translates the target test question into another language other than the target language, obtains the second word vectors of the stem and the plurality of options based on the other language, and calculates the similarity between the stem and each option by using the first word vectors and the second word vectors.
In at least one embodiment of the present invention, the other languages are languages other than the target language, such as: when the target language is English, other languages may be German or French, etc.
In at least one embodiment of the present invention, the calculating unit 112 calculates the similarity between the stem and each option by using the first word vector and the second word vector includes:
The calculating unit 112 sequentially splices the first word vector and the second word vector to obtain the target word vectors of the question stem and the multiple options; then, based on a cosine distance formula, it calculates the distance between the question stem and each of the multiple options from the target word vectors to obtain the similarity between the question stem and each option.
The dimensionality of the target word vector is the sum of the dimensionalities of the first word vector and the second word vector.
By increasing the dimensionality of the target word vector, the accuracy of the similarity calculation between the question stem and each option can be improved.
The determining unit 110 determines the option with the highest similarity as the answer to the test question to be determined.
In at least one embodiment of the present invention, after determining the option with the highest similarity as the answer to the test question to be determined, the obtaining unit 113 obtains the test question number of the test question to be determined, further, the generating unit 117 generates the prompt information according to the test question number and the answer, and the sending unit 118 sends the prompt information to the terminal device of the designated contact.
In this way, the designated contact can be reminded to view the answer promptly.
It is emphasized that to further ensure the privacy and security of the answer, the answer may also be stored in a node of a blockchain.
According to the technical scheme, the vocabulary type to which the test question belongs can be quickly determined from the number of phrases. Adding the QQP data set to the training of the Bert model improves the robustness of the Bert model and further ensures the generalization capability of the similarity calculation method. Constructing dual vectors, a GloVe word vector and a FastText word vector, to calculate the similarity between the question stem and each option improves the accuracy of the similarity calculation while also improving calculation efficiency. Because the similarity between the stem and each option is calculated in a different way depending on the vocabulary type of the test question, the similarity can be obtained more accurately, which improves the accuracy of the answer determined for the test question. In addition, when the vocabulary type is the second or third type, the similarity calculation is unsupervised, which ensures the generalization capability of the similarity calculation method.
Fig. 3 is a schematic structural diagram of an electronic device implementing an answer determination method based on AI identification according to a preferred embodiment of the invention.
In one embodiment of the present invention, the electronic device 1 includes, but is not limited to, a memory 12, a processor 13, and a computer program stored in the memory 12 and executable on the processor 13, such as an answer determination program based on AI recognition.
It will be appreciated by a person skilled in the art that the schematic diagram is only an example of the electronic device 1 and does not constitute a limitation of the electronic device 1; the electronic device 1 may comprise more or fewer components than shown, combine certain components, or have different components, and may further comprise, for example, input/output devices, network access devices, buses, and the like.
The Processor 13 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The processor 13 is the operation core and control center of the electronic device 1; it connects the parts of the whole electronic device 1 through various interfaces and lines, and runs the operating system of the electronic device 1 and the various installed application programs, program code, and so on.
The processor 13 runs the operating system of the electronic device 1 and the various installed application programs. The processor 13 executes the application program to implement the steps in each of the above embodiments of the answer determination method based on AI recognition, such as the steps shown in fig. 1.
Illustratively, the computer program may be divided into one or more modules/units, which are stored in the memory 12 and executed by the processor 13 to carry out the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions; the instruction segments are used to describe the execution of the computer program in the electronic device 1. For example, the computer program may be divided into a determination unit 110, an input unit 111, a calculation unit 112, an acquisition unit 113, a combination unit 114, an optimization unit 115, an adjustment unit 116, a generation unit 117, and a transmission unit 118.
The memory 12 can be used to store the computer programs and/or modules, and the processor 13 implements the various functions of the electronic device 1 by running or executing the computer programs and/or modules stored in the memory 12 and calling the data stored in the memory 12. The memory 12 may mainly include a program storage area and a data storage area: the program storage area may store the operating system and the application programs required by at least one function (such as a sound playing function or an image playing function), and the data storage area may store data created according to the use of the electronic device. Further, the memory 12 may include a non-volatile memory, such as a hard disk, an internal memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a Flash Card, at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
The memory 12 may be an external memory and/or an internal memory of the electronic device 1. Further, the memory 12 may be a memory having a physical form, such as a memory stick, a TF Card (Trans-flash Card), or the like.
The integrated modules/units of the electronic device 1 may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, all or part of the flow of the method in the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium and, when executed by a processor, instructs the related hardware to implement the steps of the above method embodiments.
The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, or a Read-Only Memory (ROM).
Referring to fig. 1, the memory 12 of the electronic device 1 stores a plurality of instructions for implementing the answer determination method based on AI recognition, and the processor 13 executes the plurality of instructions to implement: obtaining a test question to be determined, and determining the test question type of the test question to be determined, wherein the test question to be determined comprises a question stem and a plurality of options; when the test question type is a similar meaning word test question type, determining the number of phrases in the test question to be determined; determining the vocabulary type to which the test question to be determined belongs according to the number of phrases, wherein the vocabulary types comprise a first type, a second type and a third type; when the vocabulary type is the first type, inputting the question stem and the plurality of options into a pre-trained Bert model to obtain the similarity between the question stem and each option; or, when the vocabulary type is the second type, converting a target vocabulary in the question stem and the plurality of options into a GloVe word vector, converting the question stem and the plurality of options into a FastText word vector, and calculating the similarity between the question stem and each option based on the GloVe word vector and the FastText word vector; or, when the vocabulary type is the third type, determining a target language of the target test question, acquiring a first word vector of the question stem and the plurality of options based on the target language, translating the target test question into another language other than the target language, acquiring a second word vector of the question stem and the plurality of options based on the other language, and calculating the similarity between the question stem and each option by using the first word vector and the second word vector; and determining the option with the highest similarity as the answer to the test question to be determined.
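As an illustration only, the routing of a test question to the similarity calculation selected by its vocabulary type could be sketched as follows. This is a minimal sketch under assumptions, not the patented implementation: the scorer callables passed in (standing for the Bert, GloVe/FastText and cross-lingual branches) are hypothetical placeholders.

```python
from typing import Callable, Sequence

# A scorer maps (stem, options) to one similarity score per option.
Scorer = Callable[[str, Sequence[str]], Sequence[float]]

def determine_answer(stem: str, options: Sequence[str],
                     vocab_type: str, scorers: dict[str, Scorer]) -> str:
    """Route the question to the scorer chosen by its vocabulary type and
    return the option with the highest similarity to the stem."""
    if vocab_type not in scorers:
        raise ValueError(f"unsupported vocabulary type: {vocab_type}")
    scores = scorers[vocab_type](stem, options)
    best = max(range(len(options)), key=lambda i: scores[i])
    return options[best]
```

In such a setup, `scorers` would map "first", "second" and "third" to the Bert-based, dual-vector and cross-lingual similarity functions respectively.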
Specifically, for the specific implementation of the above instructions by the processor 13, reference may be made to the description of the relevant steps in the embodiment corresponding to fig. 1, which is not repeated here.
The blockchain referred to in the present invention is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks associated with one another by cryptographic methods, each data block containing the information of a batch of network transactions, which is used to verify the validity (anti-counterfeiting) of the information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means in software or hardware. The terms first, second, and the like are used to denote names and do not denote any particular order.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit them. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (10)

1. An answer determination method based on AI recognition, comprising:
obtaining a test question to be determined, and determining the type of the test question to be determined, wherein the test question to be determined comprises a question stem and a plurality of options;
when the test question type is a similar meaning word test question type, determining the number of phrases in the test question to be determined;
determining the vocabulary types to which the test questions to be determined belong according to the phrase number, wherein the vocabulary types comprise a first type, a second type and a third type;
when the vocabulary type is the first type, inputting the question stem and the options into a pre-trained Bert model to obtain the similarity between the question stem and each option; or
when the vocabulary type is the second type, converting a target vocabulary in the question stem and the options into a GloVe word vector, converting the question stem and the plurality of options into a FastText word vector, and calculating the similarity between the question stem and each option based on the GloVe word vector and the FastText word vector; or
when the vocabulary type is the third type, determining a target language of a target test question, acquiring a first word vector of the question stem and the multiple options based on the target language, translating the target test question into another language other than the target language, acquiring a second word vector of the question stem and the multiple options based on the other language, and calculating the similarity between the question stem and each option by using the first word vector and the second word vector;
and determining the option with the highest similarity as the answer of the test question to be determined.
2. The AI recognition-based answer determination method of claim 1, wherein the determining of the test question type of the test question to be determined comprises:
acquiring a preset identifier;
detecting whether the preset identification exists in the test question to be determined;
when the preset identification exists in the test question to be determined, determining the test question to be determined as the type of the test question of the similar meaning word;
and when the preset identification does not exist in the test question to be determined, determining the test question to be determined as other test question types.
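By way of a non-authoritative sketch of the check described in claim 2 above, the preset identifier can be detected with a simple substring test. The example identifiers below ("closest in meaning", "synonym") are assumptions for illustration; the patent does not disclose the concrete markers.

```python
def is_synonym_question(question_text: str,
                        preset_identifiers: tuple = ("closest in meaning", "synonym")) -> bool:
    """Return True when any preset identifier occurs in the test question,
    i.e. the question is treated as a similar meaning word test question."""
    text = question_text.lower()
    return any(marker in text for marker in preset_identifiers)
```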
3. The AI recognition-based answer determination method of claim 1, wherein the determining the number of phrases in the test question to be determined comprises:
extracting information corresponding to a preset label from the question stem to serve as a target vocabulary;
calculating the number of words in the target vocabulary and each option;
determining the total word amount of the target vocabulary and the multiple options according to the word amount;
determining the number of the multiple options to obtain the number of the options;
and performing difference operation on the total word amount and the option amount, and subtracting one from the difference operation result to obtain the phrase amount.
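As an illustrative sketch of the arithmetic in claim 3 above (assuming simple whitespace tokenization, which the patent does not specify), the number of phrases is the total word count of the target vocabulary and all options, minus the number of options, minus one:

```python
from typing import Sequence

def count_phrases(target_vocabulary: str, options: Sequence[str]) -> int:
    """Total words in the target vocabulary and every option,
    minus the number of options, minus one."""
    total_words = len(target_vocabulary.split())
    total_words += sum(len(option.split()) for option in options)
    return total_words - len(options) - 1
```

For example, a single-word target vocabulary with four single-word options gives 1 + 4 - 4 - 1 = 0, which could correspond to the first type if the first preset threshold were zero.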
4. The AI recognition-based answer determination method of claim 1, wherein the determining the vocabulary type to which the test question to be determined belongs according to the number of phrases comprises:
when the phrase number is a first preset threshold value, determining the vocabulary type to which the test question to be determined belongs as the first type; or
when the number of phrases is greater than the first preset threshold value and less than or equal to a second preset threshold value, determining the vocabulary type to which the test question to be determined belongs as the second type; or
when the number of phrases is greater than the second preset threshold value and less than or equal to a third preset threshold value, determining the vocabulary type to which the test question to be determined belongs as the third type; or
when the number of phrases is greater than the third preset threshold value, determining the vocabulary type to which the test question to be determined belongs as another type.
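A minimal sketch of the threshold comparison in claim 4 above; the default threshold values used here (0, 2 and 5) are assumptions for illustration only, since the patent leaves the preset thresholds unspecified.

```python
def classify_vocabulary_type(phrase_count: int,
                             first_threshold: int = 0,
                             second_threshold: int = 2,
                             third_threshold: int = 5) -> str:
    """Map the phrase count onto the first/second/third/other vocabulary types."""
    if phrase_count == first_threshold:
        return "first"
    if first_threshold < phrase_count <= second_threshold:
        return "second"
    if second_threshold < phrase_count <= third_threshold:
        return "third"
    return "other"
```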
5. The AI-recognition based answer determination method of claim 1, wherein prior to inputting the stem and the plurality of options into a pre-trained Bert model, the AI-recognition based answer determination method further comprises:
obtaining sentence pairs from a QQP data set, and obtaining labels corresponding to the sentence pairs;
combining an MLM mechanism and an NSP mechanism to obtain a semantic vector network layer;
calculating the sentence pairs by utilizing the semantic vector network layer to obtain semantic vectors with context semantic information;
calculating the semantic vector through a pre-constructed similarity calculation network layer to obtain the similarity of the sentence pair;
optimizing the semantic vector network layer and the similarity calculation network layer according to the similarity of the sentence pairs and the labels to obtain a learner;
determining the source of the test questions to be determined;
and acquiring a preset number of test questions from the source, and finely adjusting the learner by using the test questions to obtain the Bert model.
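As a hedged sketch only, the pair-similarity training on QQP-style data described in claim 5 above could be approximated with the Hugging Face transformers library as below. This substitutes a standard sequence-classification head for the custom semantic-vector and similarity-calculation network layers of the claim, and the checkpoint name, batch layout and hyper-parameters are assumptions rather than details from the patent.

```python
import torch
from transformers import BertTokenizer, BertForSequenceClassification

# MLM + NSP pre-training is inherited from the published checkpoint;
# only the pair-similarity fine-tuning on QQP-style sentence pairs is shown.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

def train_step(sentence_pairs, labels):
    """One optimization step on a batch of (question1, question2) pairs with 0/1 labels."""
    batch = tokenizer([p[0] for p in sentence_pairs],
                      [p[1] for p in sentence_pairs],
                      padding=True, truncation=True, return_tensors="pt")
    batch["labels"] = torch.tensor(labels)
    model.train()
    loss = model(**batch).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```

The fine-tuning on a preset number of test questions from the determined source, recited at the end of the claim, would repeat the same loop on that smaller data set.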
6. The AI recognition-based answer determination method of claim 1, wherein the converting a target vocabulary in the question stem and the options into a GloVe word vector, the converting the question stem and the plurality of options into a FastText word vector, and the calculating the similarity between the question stem and each option based on the GloVe word vector and the FastText word vector comprise:
obtaining a third word vector corresponding to each word in the target vocabulary from a first configuration file, and calculating an average value of the third word vectors to obtain a first GloVe word vector;
for each option, acquiring a fourth word vector corresponding to each word from the first configuration file, and calculating an average value of the fourth word vectors to obtain a second GloVe word vector of each option;
calculating the distance between the first GloVe word vector and each second GloVe word vector by using a cosine distance formula, wherein the distance is used as a first distance between the question stem and each option;
acquiring a fifth word vector corresponding to each word in the target vocabulary from a second configuration file, and calculating the average value of the fifth word vectors to obtain a first FastText word vector;
for each option, acquiring a sixth word vector corresponding to each word from the second configuration file, and calculating the average value of the sixth word vectors to obtain a second FastText word vector of each option;
calculating the distance between the first FastText word vector and each second FastText word vector by using a cosine distance formula, wherein the distance is used as a second distance between the question stem and each option;
and performing weighted sum operation on the first distance and the second distance, and taking an operation result as the similarity of the question stem and each option.
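The dual-vector similarity of claim 6 above, sketched with numpy. Here `glove` and `fasttext` stand for the word-vector lookup tables read from the first and second configuration files, the equal weights in the weighted sum are an assumption, and cosine similarity is used in place of the claim's cosine distance (one is the complement of the other).

```python
import numpy as np

def average_vector(words, table, dim=300):
    """Mean of the word vectors found in a lookup table (GloVe or FastText)."""
    found = [table[w] for w in words if w in table]
    return np.mean(found, axis=0) if found else np.zeros(dim)

def cosine(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(np.dot(u, v) / denom) if denom else 0.0

def dual_vector_similarity(stem_words, option_words, glove, fasttext,
                           glove_weight=0.5, fasttext_weight=0.5):
    """Weighted sum of the GloVe-space and FastText-space similarities
    between the target vocabulary of the stem and one option."""
    glove_sim = cosine(average_vector(stem_words, glove),
                       average_vector(option_words, glove))
    fasttext_sim = cosine(average_vector(stem_words, fasttext),
                          average_vector(option_words, fasttext))
    return glove_weight * glove_sim + fasttext_weight * fasttext_sim
```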
7. The AI-recognition-based answer determination method of claim 1, wherein the calculating the similarity of the stem to each option using the first word vector and the second word vector comprises:
sequentially splicing the first word vector and the second word vector to obtain the question stem and the target word vectors of the multiple options;
and calculating the distance between the question stem and each option in the multiple options according to the target word vector based on a cosine distance formula to obtain the similarity between the question stem and each option.
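A sketch of the cross-lingual comparison in claim 7 above: the first (target-language) and second (translated-language) vectors are concatenated into one target word vector per text, and the stem is compared with each option by cosine similarity. The vectors are assumed to be pre-computed elsewhere, and the cosine helper is repeated so the snippet stands alone.

```python
import numpy as np

def cosine(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(np.dot(u, v) / denom) if denom else 0.0

def cross_lingual_similarity(stem_first_vec, stem_second_vec,
                             option_first_vec, option_second_vec):
    """Concatenate the two language-specific vectors and compare stem
    and option in the joint vector space."""
    stem_target = np.concatenate([stem_first_vec, stem_second_vec])
    option_target = np.concatenate([option_first_vec, option_second_vec])
    return cosine(stem_target, option_target)
```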
8. An answer determination device based on AI recognition, characterized in that the answer determination device based on AI recognition comprises:
the device comprises a determining unit, a judging unit and a judging unit, wherein the determining unit is used for acquiring a test question to be determined and determining the type of the test question to be determined, and the test question to be determined comprises a question stem and a plurality of options;
the determining unit is further configured to determine the number of phrases in the test question to be determined when the test question type is a similar meaning word test question type;
the determining unit is further configured to determine the vocabulary types to which the test questions to be determined belong according to the number of the phrases, where the vocabulary types include a first type, a second type, and a third type;
the input unit is used for inputting the question stem and the options into a pre-trained Bert model when the vocabulary type is the first type, so that the similarity between the question stem and each option is obtained; or
the calculation unit is used for, when the vocabulary type is the second type, converting a target vocabulary in the question stem and the options into a GloVe word vector, converting the question stem and the plurality of options into a FastText word vector, and calculating the similarity between the question stem and each option based on the GloVe word vector and the FastText word vector; or
the calculation unit is further configured to, when the vocabulary type is the third type, determine a target language of a target test question, obtain a first word vector of the question stem and the multiple options based on the target language, translate the target test question into another language other than the target language, obtain a second word vector of the question stem and the multiple options based on the other language, and calculate the similarity between the question stem and each option by using the first word vector and the second word vector;
the determining unit is further configured to determine the option with the highest similarity as the answer to the test question to be determined.
9. An electronic device, characterized in that the electronic device comprises:
a memory storing at least one instruction; and
a processor that executes the instructions stored in the memory to implement the answer determination method based on AI recognition according to any one of claims 1 to 7.
10. A computer-readable storage medium, wherein at least one instruction is stored in the computer-readable storage medium, and the at least one instruction is executed by a processor in an electronic device to implement the answer determination method based on AI recognition according to any one of claims 1 to 7.
CN202010437416.5A 2020-05-21 2020-05-21 Answer determination method and device based on AI (Artificial Intelligence) recognition, electronic equipment and medium Active CN111680515B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010437416.5A CN111680515B (en) 2020-05-21 2020-05-21 Answer determination method and device based on AI (Artificial Intelligence) recognition, electronic equipment and medium


Publications (2)

Publication Number Publication Date
CN111680515A CN111680515A (en) 2020-09-18
CN111680515B (en) 2022-05-03

Family

ID=72434245

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010437416.5A Active CN111680515B (en) 2020-05-21 2020-05-21 Answer determination method and device based on AI (Artificial Intelligence) recognition, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN111680515B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112801829B (en) * 2020-12-31 2024-04-30 科大讯飞股份有限公司 Method and device for correlation of test question prediction network model


Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9342561B2 (en) * 2014-01-08 2016-05-17 International Business Machines Corporation Creating and using titles in untitled documents to answer questions
US10248653B2 (en) * 2014-11-25 2019-04-02 Lionbridge Technologies, Inc. Information technology platform for language translation and task management
US9785252B2 (en) * 2015-07-28 2017-10-10 Fitnii Inc. Method for inputting multi-language texts
CN107220380A (en) * 2017-06-27 2017-09-29 北京百度网讯科技有限公司 Question and answer based on artificial intelligence recommend method, device and computer equipment

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106997376A (en) * 2017-02-28 2017-08-01 浙江大学 The problem of one kind is based on multi-stage characteristics and answer sentence similarity calculating method
CN108304451A (en) * 2017-12-13 2018-07-20 中国科学院自动化研究所 Multiple-choice question answers method and device
CN109344236A (en) * 2018-09-07 2019-02-15 暨南大学 One kind being based on the problem of various features similarity calculating method
CN109947836A (en) * 2019-03-21 2019-06-28 江西风向标教育科技有限公司 English paper structural method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
An Approach for Extracting Exact Answers to Question Answering (QA) System for English Sentences; Raju Barskar et al.; Procedia Engineering; 2012-12-31; Vol. 30, pp. 1187-1194 *
Method for Obtaining Answers to Prose Reading Comprehension Questions Based on Word Association; Qiao Pei et al.; Journal of Chinese Information Processing; 2018-03-31; Vol. 32, No. 3, pp. 135-142 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant