CN110442675A - Question-answer matching processing, model training method, device, equipment and storage medium - Google Patents

Question-answer matching processing, model training method, device, equipment and storage medium

Info

Publication number
CN110442675A
CN110442675A (application number CN201910569979.7A)
Authority
CN
China
Prior art keywords
data
answer
question
text
word segmentation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910569979.7A
Other languages
Chinese (zh)
Inventor
金戈
徐亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201910569979.7A priority Critical patent/CN110442675A/en
Publication of CN110442675A publication Critical patent/CN110442675A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3343Query execution using phonetics
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/04Segmentation; Word boundary detection
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/26Speech to text systems
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/28Constructional details of speech recognition systems
    • G10L15/30Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/06Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
    • H04L9/0618Block ciphers, i.e. encrypting groups of characters of a plain text message using fixed encryption transformation
    • H04L9/0631Substitution permutation network [SPN], i.e. cipher composed of a number of stages or rounds each involving linear and nonlinear transformations, e.g. AES algorithms

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computer Security & Cryptography (AREA)
  • General Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

This application relates to the field of natural language processing. A model based on the self-attention mechanism extracts the attention features between a question and an answer, and the degree of match between them is obtained from those attention features. Specifically disclosed are a question-answer matching processing method, a model training method, and corresponding devices, equipment and storage media. The method comprises: obtaining a question text and an answer text; performing word segmentation on the question text and the answer text to obtain corpus segmentation data; performing embedding processing on the corpus segmentation data to obtain an embedded representation; performing feature extraction on the embedded representation based on a feature extraction sub-model to obtain a self-attention feature vector, the feature extraction sub-model being a model based on the self-attention mechanism; and, based on a matching sub-model, generating question-answer matching data from the self-attention feature vector and outputting the question-answer matching data.

Description

Question-answer matching processing, model training method, device, equipment and storage medium
Technical field
This application relates to the field of natural language processing, and in particular to a question-answer matching processing method, a model training method, and a corresponding device, computer equipment and storage medium.
Background technique
In some application scenarios, the degree of match between a question and an answer needs to be measured; for example, in intelligent interviewing, an interviewee's answer to a given interview question must be evaluated. Existing question-answer matching methods generally score answers by keyword matching and the like, and are easily misled into giving high matching scores to answers that merely pile up professional terms and phrases without any underlying logic. Moreover, existing methods usually apply a different metric policy to each preset question, so their generality is poor and the model required for a single application is large, which limits their range of application.
Summary of the invention
The embodiments of the present application provide a question-answer matching processing method, a model training method, a device, computer equipment and a storage medium, which can measure the degree of match between a wide variety of questions and answers based on a feature extraction sub-model and a matching sub-model.
In a first aspect, the present application provides a question-answer matching processing method, the method comprising:
obtaining a question text and an answer text;
performing word segmentation on the question text and the answer text to obtain corpus segmentation data;
performing embedding processing on the corpus segmentation data to obtain an embedded representation;
performing feature extraction on the embedded representation based on a feature extraction sub-model to obtain a self-attention feature vector, the feature extraction sub-model being a model based on the self-attention mechanism;
based on a matching sub-model, generating question-answer matching data from the self-attention feature vector and outputting the question-answer matching data.
In a second aspect, the present application provides a question-answer matching processing device, the device comprising:
a text obtaining module, configured to obtain a question text and an answer text;
a word segmentation module, configured to perform word segmentation on the question text and the answer text to obtain corpus segmentation data;
an embedding module, configured to perform embedding processing on the corpus segmentation data to obtain an embedded representation;
a feature extraction module, configured to perform feature extraction on the embedded representation based on a feature extraction sub-model to obtain a self-attention feature vector, the feature extraction sub-model being a model based on the self-attention mechanism;
a matching computation module, configured to generate question-answer matching data from the self-attention feature vector based on a matching sub-model, and to output the question-answer matching data.
In a third aspect, the present application provides computer equipment comprising a memory and a processor; the memory stores a computer program, and the processor, by executing the computer program, implements the above question-answer matching processing method or the above question-answer matching model training method.
In a fourth aspect, the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above question-answer matching processing method or the above question-answer matching model training method.
The present application discloses a question-answer matching processing method, a model training method, a device, computer equipment and a storage medium. The obtained question text and answer text are segmented and embedded to produce an embedded representation of the text; a model based on the self-attention mechanism then extracts the self-attention features between the question text and the answer text, and question-answer matching data is generated from those features. No separate keyword database needs to be preset for each interview question: the method obtains matching data for different questions and their corresponding texts from the large amount of question-answer matching information the model has learned, so it generalizes well and is highly accurate.
Brief description of the drawings
To illustrate the technical solutions in the embodiments of the present application more clearly, the accompanying drawings needed for the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow diagram of a question-answer matching processing method according to an embodiment of the present application;
Fig. 2 is a sub-flow diagram of one embodiment of obtaining the question text and answer text in Fig. 1;
Fig. 3 is a scenario diagram of another embodiment of obtaining the question text and answer text in Fig. 1;
Fig. 4 is a sub-flow diagram of another embodiment of obtaining the question text and answer text in Fig. 1;
Fig. 5 is a sub-flow diagram of one embodiment of the word segmentation in Fig. 1;
Fig. 6 is a sub-flow diagram of another embodiment of the word segmentation in Fig. 1;
Fig. 7 is a sub-flow diagram of the embedding processing in Fig. 1;
Fig. 8 is a schematic example of the embedding processing;
Fig. 9 is a sub-flow diagram of generating the question-answer matching data in Fig. 1;
Fig. 10 is a flow diagram of a question-answer matching model training method according to an embodiment of the present application;
Fig. 11 is a sub-flow diagram of the word segmentation in Fig. 10;
Fig. 12 is a sub-flow diagram of the embedding processing in Fig. 10;
Fig. 13 is a structural diagram of a question-answer matching processing device provided by an embodiment of the present application;
Fig. 14 is a structural diagram of another question-answer matching processing device provided by an embodiment of the present application;
Fig. 15 is a structural diagram of a question-answer matching model training device provided by an embodiment of the present application;
Fig. 16 is a structural diagram of another question-answer matching model training device provided by an embodiment of the present application;
Fig. 17 is a structural diagram of computer equipment provided by an embodiment of the present application.
Detailed description
The technical solutions in the embodiments of the present application are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the scope of protection of the present application.
The flow charts shown in the drawings are merely illustrative; they need not include every operation or step, nor must the operations be executed in the order shown. For example, some operations or steps may be decomposed, combined or partially merged, so the actual order of execution may change according to the situation. Likewise, although the device diagrams divide the devices into functional modules, in some cases the division may differ from that shown.
The embodiments of the present application provide a question-answer matching processing method, a device, computer equipment and a storage medium. The question-answer matching processing method can be applied to a terminal or a server to measure the degree of match between various questions and answers based on a feature extraction sub-model and a matching sub-model.
For example, the question-answer matching processing method can run on a server, though it can of course also run on a terminal. The terminal may be an electronic device such as a mobile phone, tablet computer, laptop, desktop computer, personal digital assistant or wearable device; the server may be an independent server or a server cluster. For ease of understanding, the following embodiments describe the method as applied to a server.
Some embodiments of the present application are described in detail below with reference to the accompanying drawings. In the absence of conflict, the following embodiments and their features may be combined with each other.
Referring to Fig. 1, Fig. 1 is a flow diagram of a question-answer matching processing method provided by an embodiment of the present application.
As shown in Fig. 1, the question-answer matching processing method includes steps S110 to S150.
Step S110: obtain a question text and an answer text.
In some embodiments, the server stores the texts of multiple questions in advance, each question corresponding to a distinct identifier such as a question ID. The server extracts a question according to preset rules and sends it to a terminal; the terminal presents the received question to an answering subject, such as an interview candidate, obtains the subject's answer text for the question, and sends it back to the server; the server then associates the returned answer text with the identifier of the question.
In other embodiments, as shown in Fig. 2, step S110 of obtaining the question text and answer text includes steps S101 and S102.
Step S101: obtain the interviewer's question text, match an interview question similar to the question text, and obtain the identifier of that interview question.
Illustratively, the interviewer questions the candidate orally; the server obtains the interviewer's speech signal and converts it into a question text. The server then matches, among multiple pre-stored interview questions, the one most similar to the question text, and can also look up the identifier (ID) of that interview question.
Specifically, the interview question similar to the question text can be matched by computing the text similarity between the text converted from the speech signal and the multiple pre-stored interview questions. First, word embedding is applied to the question text and to the interview question being compared, converting the natural-language text into the vector or matrix form a computer can process; the cosine similarity of the two is then computed. The interview question whose cosine distance to the question text is below a preset value, or is the smallest among all candidates, is taken as the matched interview question.
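A minimal Python sketch of this matching step follows, assuming a pre-computed word-embedding table and a stored-question dictionary keyed by question ID; all names here are illustrative, not taken from the patent.

    import numpy as np

    def embed(text, embedding_table, dim=128):
        # Average the word vectors of the tokens found in the table.
        vecs = [embedding_table[w] for w in text.split() if w in embedding_table]
        return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

    def cosine_similarity(a, b):
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        return float(np.dot(a, b) / denom) if denom else 0.0

    def match_interview_question(question_text, stored_questions, table):
        # Return the (id, text) of the stored interview question most
        # similar to the transcribed question (smallest cosine distance).
        q = embed(question_text, table)
        return max(stored_questions.items(),
                   key=lambda kv: cosine_similarity(q, embed(kv[1], table)))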
Step S102: obtain the candidate's answer to the interview question in a non-text format, and process the non-text answer into an answer text.
In some scenarios the candidate answers the interview question by typing, and the server can directly use the text entered through the input device as the answer text. In other scenarios the answer is speech, handwritten text, or a selected option — non-text formats the server cannot store directly. The candidate's spoken answer is then converted into an answer text; handwritten content is converted into an answer text by text recognition or similar techniques; and for multiple-choice questions, the chosen option is converted into an answer text associated with the identifier of the corresponding interview question.
In still other embodiments, as shown in Fig. 3 and Fig. 4, step S110 of obtaining the question text and answer text includes steps S111 to S114.
Step S111: obtain from a terminal the question speech encrypted by the terminal, the answer speech encrypted by the terminal, and the speech segment from which the terminal extracted the encryption key.
Illustratively, the terminal records the interviewer's question speech and the candidate's answer speech, and extracts a segment of speech from the question speech and/or the answer speech according to preset rules — for example, the end of the question speech and the beginning of the answer speech.
The terminal can generate an encryption key from the speech segment, for example by recognizing the characters in the segment to obtain a character key, or by computing a hash of the segment.
Illustratively, the terminal recognizes one or more characters from the speech segment via speech recognition, then processes the recognized characters into an encryption key of preset length, for example by zero-padding.
The terminal can encrypt the question speech and the answer speech with the encryption key, for example with the AES128 or AES256 algorithm using the character key; it then sends the encrypted question speech, the encrypted answer speech and the speech segment to the server.
Illustratively, the data format in which the server obtains the encrypted question speech, the encrypted answer speech and the speech segment from the terminal is shown in Table 1:
Table 1. Data format of the data obtained from the terminal
protocol data | encrypted question speech | speech segment | encrypted answer speech | question ID
Step S112: recognize the characters in the speech segment to obtain a decryption key.
After receiving the speech segment from the terminal, the server can generate a decryption key from it, for example by recognizing the characters in the segment to obtain a character key, or by computing a hash of the segment.
Illustratively, the server recognizes one or more characters from the speech segment via speech recognition, then processes them into a decryption key of preset length, for example by zero-padding. The server processes the recognized characters into the preset-length decryption key in exactly the same way the terminal processes its recognized characters into the preset-length encryption key.
Step S113: decrypt the encrypted question speech and the encrypted answer speech with the decryption key to obtain the question speech and the answer speech.
Illustratively, the server uses the AES128 or AES256 decryption algorithm to decrypt, with the decryption key, the encrypted question speech and the encrypted answer speech obtained from the terminal.
Step S114: perform speech recognition on the question speech to obtain the question text, and on the answer speech to obtain the answer text.
Illustratively, the server uses a built-in or externally called speech recognition engine to recognize the decrypted question speech into the question text and the decrypted answer speech into the answer text.
Encrypting and decrypting the question speech and answer speech keeps the question-answer content confidential and improves information security.
Illustratively, the question text reads: Do you have a pet; the answer text reads: my dog is cute, he likes playing.
Step S120: perform word segmentation on the question text and the answer text to obtain corpus segmentation data.
Question and answer texts in languages such as English have natural word boundaries, whereas texts in languages such as Chinese usually need to be segmented into words before natural language processing.
Illustratively, the question text and answer text are segmented with a dictionary-based segmentation algorithm or a statistics-based machine learning algorithm.
In some embodiments, as shown in Fig. 5, step S120 of segmenting the question text and answer text to obtain corpus segmentation data includes steps S121 and S122.
Step S121: segment the question text according to a preset dictionary to obtain question segmentation data.
The dictionary is a candidate set of common words, such as 我 (I), 爱 (love), 小狗 (puppy) and 贝贝 (Beibei). The text is traversed from start to end, and whenever a substring appears in the dictionary it is cut off as a word; thus the sentence 我爱小狗贝贝 ("I love puppy Beibei") is segmented into 我 / 爱 / 小狗 / 贝贝.
Step S122: segment the answer text according to the preset dictionary to obtain answer segmentation data.
As in step S121, segmenting the answer text yields the answer segmentation data corresponding to that text.
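A minimal sketch of dictionary-based segmentation using forward maximum matching, under the assumption of a tiny example dictionary; real systems use much larger lexicons or statistical segmenters.

    def segment(text, dictionary, max_word_len=4):
        words, i = [], 0
        while i < len(text):
            # Try the longest dictionary entry starting at position i;
            # fall back to a single character if nothing matches.
            for j in range(min(len(text), i + max_word_len), i, -1):
                if text[i:j] in dictionary or j == i + 1:
                    words.append(text[i:j])
                    i = j
                    break
        return words

    print(segment("我爱小狗贝贝", {"我", "爱", "小狗", "贝贝"}))
    # ['我', '爱', '小狗', '贝贝']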
In other embodiments, as shown in Fig. 6, step S120 of segmenting the question text and answer text to obtain corpus segmentation data includes steps S123 and S124.
Step S123: one-hot encode the question text according to a preset dictionary to obtain question segmentation data.
One-hot encoding is an efficient code in which a word of a given attribute is represented by as many bits as the attribute has states, with exactly one bit set to 1 and all others set to 0.
Illustratively, the preset dictionary contains the words for the attribute 'gender': male, female and other. The attribute has 3 distinct values, so 3 bits are needed to represent it: the one-hot code for male is {100}, for female {010}, and for other genders {001}.
Illustratively, the preset dictionary may also contain attributes such as person, fruit, season and mode of motion, i.e. the words for each attribute together with their one-hot codes.
When a text contains multiple words, their one-hot codes are concatenated in order: for example, if the one-hot code for male is {100} and the one-hot code for 'senior class' is {0001}, concatenating the two yields the final code {1000001}.
One-hot encoding turns the text into sparse data, and the encoded data carries the attribute information of the words in the text.
Segmenting and encoding a question text in this way yields the question segmentation data corresponding to that text.
Illustratively, the question segmentation data of a certain question text reads: 000000000001 000000000010 100000000000 010000000000 001000000000.
Step S124: one-hot encode the answer text according to the preset dictionary to obtain answer segmentation data.
Specifically, as in step S123, segmenting and encoding the answer text yields the answer segmentation data corresponding to that text.
Illustratively, the answer segmentation data of a certain answer text reads: 000000000100 000000001000 000100000000 000000010000 000010000000 000000100000 000001000000.
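A minimal sketch of the one-hot encoding step: each token becomes a vector as long as the vocabulary, with a single 1 bit. The tiny vocabulary here is illustrative only.

    def one_hot_encode(tokens, vocab):
        index = {w: i for i, w in enumerate(vocab)}
        vectors = []
        for tok in tokens:
            v = [0] * len(vocab)       # all zeros ...
            if tok in index:
                v[index[tok]] = 1      # ... except the token's own position
            vectors.append(v)
        return vectors

    vocab = ["my", "dog", "is", "cute", "he", "likes", "playing"]
    print(one_hot_encode(["my", "dog", "is", "cute"], vocab))
    # first token -> [1, 0, 0, 0, 0, 0, 0], second -> [0, 1, 0, 0, 0, 0, 0], etc.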
Step S130: perform embedding processing on the corpus segmentation data to obtain an embedded representation.
In some embodiments, after the question text and answer text have been segmented into corpus segmentation data, a start marker [CLS] is added at the beginning of each sentence in the question segmentation data and the answer segmentation data, a separator [SEP] is added between sentences, and a separator [SEP] is added at the end of each sentence.
Illustratively, a piece of question segmentation data can be processed into: [CLS] How long have you been at the company [SEP] What do you like most about the company [SEP].
Illustratively, adding the start marker and separators to the question segmentation data and answer segmentation data above yields: [CLS] 000000000001 000000000010 100000000000 010000000000 001000000000 [SEP] 000000000100 000000001000 000100000000 000000010000 [SEP] 000010000000 000000100000 000001000000 [SEP].
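A minimal sketch of inserting the special markers described above: [CLS] at the start, [SEP] between the question and answer and at the end.

    def add_special_tokens(question_tokens, answer_tokens):
        # [CLS] opens the sequence; [SEP] closes each text segment.
        return (["[CLS]"] + question_tokens + ["[SEP]"]
                + answer_tokens + ["[SEP]"])

    print(add_special_tokens(["Do", "you", "have", "a", "pet"],
                             ["my", "dog", "is", "cute"]))
    # ['[CLS]', 'Do', 'you', 'have', 'a', 'pet', '[SEP]',
    #  'my', 'dog', 'is', 'cute', '[SEP]']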
In some embodiments, as shown in Fig. 7, step S130 of performing embedding processing on the corpus segmentation data to obtain the embedded representation includes steps S131 and S132.
Step S131: perform embedding processing on the token information, paragraph information and position information of the question segmentation data and the answer segmentation data.
Specifically, the paragraph information of the question segmentation data differs from the paragraph information of the answer segmentation data.
Illustratively, Fig. 8 shows how embedding is applied to the token, paragraph and position information of the question segmentation data and answer segmentation data.
Embedding the token information of the question and answer segmentation data yields the Token Embeddings, i.e. the word vectors; the first token is the start marker [CLS], which can be used for downstream prediction tasks.
In this embodiment, each token of the segmented question text and answer text is fed into the token embedding layer, which embeds the token information and converts each word into vector form.
Illustratively, the token embedding layer converts each word into a vector of fixed dimension, for example a 768-dimensional vector.
The paragraph information of the question segmentation data differs from that of the answer segmentation data; embedding the paragraph information yields the Segment Embeddings, which distinguish the question segmentation data from the answer segmentation data.
Illustratively, as shown in Fig. 8, the paragraph information of the question segmentation data is EA and that of the answer segmentation data is EB, so the two differ.
The embedding of the position information, the Position Embeddings, is learned. For example, the BERT model can process input sequences of up to 512 tokens; by learning a vector representation for each position, it encodes the sequential order of the input. This means the Position Embeddings layer is effectively a lookup table of size (512, 768): the first row of the table represents the first position of a sequence, the second row the second position, and so on.
Specifically, the first token of each segmented text is always the special classification embedding, i.e. the start marker [CLS]; the final hidden state corresponding to this marker — the output of the Transformer — serves as the aggregate sequence representation for classification tasks.
Step S132: add together the embeddings of the token information, paragraph information and position information of the question segmentation data and answer segmentation data to obtain the embedded representation.
Illustratively, the Token Embeddings are the vector representations of the words; the Segment Embeddings help the BERT model distinguish the vector representations of the question segmentation data from those of the answer segmentation data; and the Position Embeddings let the BERT model learn the sequential properties of the input.
Illustratively, the token, paragraph and position embeddings are each vectors of shape (1, n, 768); adding them element-wise yields a combined representation of shape (1, n, 768), which serves as the interview content data and as the input representation of the feature extraction sub-model, e.g. of the encoding layer of the BERT model.
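A hedged PyTorch sketch of summing the three embeddings described above; the dimensions follow the text (768-dimensional vectors, up to 512 positions), while the class and variable names are illustrative.

    import torch
    import torch.nn as nn

    class InputEmbedding(nn.Module):
        def __init__(self, vocab_size, hidden=768, max_len=512):
            super().__init__()
            self.token = nn.Embedding(vocab_size, hidden)    # Token Embeddings
            self.segment = nn.Embedding(2, hidden)           # Segment: EA vs EB
            self.position = nn.Embedding(max_len, hidden)    # (512, 768) lookup table

        def forward(self, token_ids, segment_ids):
            n = token_ids.size(1)
            positions = torch.arange(n, device=token_ids.device).unsqueeze(0)
            # Element-wise sum gives the (1, n, 768) input representation.
            return (self.token(token_ids) + self.segment(segment_ids)
                    + self.position(positions))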
Step S140: perform feature extraction on the embedded representation based on the feature extraction sub-model to obtain a self-attention feature vector, the feature extraction sub-model being a model based on the self-attention mechanism.
In this embodiment, the question-answer matching data is computed by a question-answer matching model consisting of a feature extraction sub-model and a matching sub-model, where the feature extraction sub-model includes a model based on the self-attention mechanism, such as a BERT model.
The BERT (Bidirectional Encoder Representations from Transformers) model is the encoder of a bidirectional Transformer; it pre-trains deep bidirectional representations by jointly conditioning on context in all layers. The Transformer is a method that relies entirely on self-attention to compute representations of its input and output.
The main innovation of the BERT model lies in its pre-training method, which uses two tasks — the masked language model (MLM) and next sentence prediction — to capture word-level and sentence-level representations respectively.
The masked language model randomly masks some tokens of the model input; the goal is to predict the original vocabulary id of each masked token based only on its context. Unlike left-to-right language-model pre-training, the masked-language-model objective lets the representation fuse the left and right context, enabling the pre-training of a deep bidirectional Transformer.
15% of the tokens in the corpus are randomly selected and removed, for example replaced with the [MASK] token, and the model is trained to correctly predict the replaced tokens.
Specifically, of the 15% of tokens selected for this masking task, only 80% are actually replaced by the [MASK] token; 10% are replaced by a random other token, and the remaining 10% are left unchanged. This is the concrete procedure of the masked bidirectional language model.
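A minimal sketch of that 15% / 80-10-10 masking procedure; the function and variable names are illustrative.

    import random

    def mask_tokens(tokens, vocab, select_prob=0.15):
        masked, targets = list(tokens), {}
        for i, tok in enumerate(tokens):
            if random.random() < select_prob:
                targets[i] = tok          # the model must predict the original
                r = random.random()
                if r < 0.8:
                    masked[i] = "[MASK]"              # 80%: replace with [MASK]
                elif r < 0.9:
                    masked[i] = random.choice(vocab)  # 10%: random other token
                # remaining 10%: leave the token unchanged
        return masked, targets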
Next sentence prediction means that during language-model pre-training, two sentences are chosen in one of two ways: either two sentences that genuinely follow each other in the corpus, or a first sentence followed by a second sentence drawn at random from the corpus and spliced behind it. Besides the masked-language-model task, the model additionally performs this sentence-relationship prediction, judging whether the second sentence really follows the first. Adding this task helps downstream sentence-relationship tasks.
BERT pre-training is a multi-task process: a designed network structure performs the language-model tasks on a large — potentially unlimited — amount of unannotated natural-language text, and the pre-training tasks encode a large amount of linguistic knowledge into the network structure.
Google has open-sourced the pre-trained BERT-Base and BERT-Large models; by calling a pre-trained BERT model, feature vectors that represent the semantic features of a text can be extracted from it.
In this embodiment, a trained feature extraction sub-model such as a BERT model extracts from the embedded representation a self-attention feature vector describing the self-attention features of the question segmentation data, the answer segmentation data, and the relation between them.
The architecture of the BERT model is a multi-layer bidirectional Transformer based on the self-attention mechanism, which extracts from the input (E1, E2, ..., En) feature vectors (C, T1, T2, ..., Tn) that embody bidirectional, contextual language features. Here each token is a distinct word, E denotes an input embedding vector, and T_i denotes the contextual vector of the i-th token output after BERT processing.
Illustratively, the feature extraction sub-model is a BERT-based feature extraction model: after the pre-trained BERT model is obtained, its parameters are fine-tuned for the specific task of question-answer matching. The BERT model provides an efficient information extraction function for the question-answer matching task.
The question-answer matching task — e.g. scoring an answer text — is a sentence-relationship task. The embedded representation input to the BERT model contains two parts: the part corresponding to the question text and the part corresponding to the answer text. The part of the embedded representation corresponding to the question segmentation data is input to the BERT model as the first input, the part corresponding to the answer segmentation data as the second input, with the separator [SEP] between the two.
Based on this dual-input form, input one is the data corresponding to the question text and input two is the data corresponding to the answer text; the output of the BERT model is then the attention feature vector between the answer text and the question text. Specifically, the final hidden state output by the hidden layer corresponding to the start marker [CLS] — the C vector — is taken as the extracted self-attention feature vector.
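A hedged sketch of extracting the [CLS] self-attention feature vector with the open-source Hugging Face port of pre-trained BERT-Base; the checkpoint name is an assumption (the patent only says a pre-trained BERT model is used).

    from transformers import BertTokenizer, BertModel

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased")

    # Passing the question and answer as a pair adds [CLS]/[SEP] markers
    # and the segment (token_type) ids automatically.
    inputs = tokenizer("Do you have a pet",
                       "my dog is cute, he likes playing",
                       return_tensors="pt")
    outputs = model(**inputs)
    cls_vector = outputs.last_hidden_state[:, 0]   # final hidden state of [CLS]
    print(cls_vector.shape)                        # torch.Size([1, 768])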
A feature extraction sub-model based on the self-attention mechanism, such as a BERT model, implicitly brings linguistic knowledge into the question-answer matching model through pre-training. Because the BERT model uses a masked language model, it is robust to interference, so the question-answer matching processing method can to some extent tolerate the noise introduced by speech recognition errors. The feature extraction sub-model can also capture the relationships between sentences in the question text and the answer text, so the question-answer match can be measured at sentence scale.
Step S150: based on the matching sub-model, generate question-answer matching data from the self-attention feature vector and output the question-answer matching data.
The self-attention feature vector contains the attention features between the question and the answer; processing it with the matching sub-model yields a measure of the degree of match between the question text and the answer text.
In some embodiments, as shown in Fig. 9, step S150 includes steps S151 and S152.
Step S151: based on the trained matching sub-model, reduce the dimensionality of the self-attention feature vector to obtain a two-dimensional vector corresponding to the two classes 'match' and 'mismatch'.
Illustratively, the matching sub-model includes a linear layer connected after the BERT hidden layer corresponding to the start marker [CLS]; the hidden feature vector output by that layer, i.e. the self-attention feature vector, serves as its input.
Illustratively, the linear layer is a sequence-level classifier.
Illustratively, the linear layer of the trained matching sub-model reduces the self-attention feature vector to a two-dimensional vector (y0, y1), where y0 and y1 are the values for the two classes 'match' and 'mismatch' between the answer text and the question text.
Step S152: based on the matching sub-model, normalize the two-dimensional vector, obtain the question-answer matching data from the normalized vector, and output the question-answer matching data.
Illustratively, the matching sub-model includes a classifier connected after the linear layer, such as a softmax classifier, for normalizing the two-dimensional vector (y0, y1).
Illustratively, the classifier applies softmax normalization to (y0, y1), compressing y0 and y1 into the interval 0-1; the results can be interpreted as the class probabilities p0 and p1 of 'match' and 'mismatch' between the answer text and the question text.
Specifically, p0 and p1 sum to 1. The probability p0 of the class 'match' between the answer text and the question text can be used directly as the question-answer matching data, or first converted to a 10-point or 100-point scale; it measures the degree of match between the question text and the answer text, and can also serve as the score of the answer text relative to the question text.
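A hedged PyTorch sketch of this matching sub-model: a linear layer maps the 768-dimensional [CLS] vector to the two logits (y0, y1), and softmax turns them into the class probabilities (p0, p1) that sum to 1. The class name is illustrative.

    import torch
    import torch.nn as nn

    class MatchingHead(nn.Module):
        def __init__(self, hidden=768):
            super().__init__()
            self.linear = nn.Linear(hidden, 2)   # sequence-level classifier

        def forward(self, cls_vector):
            logits = self.linear(cls_vector)      # (batch, 2) -> (y0, y1)
            return torch.softmax(logits, dim=-1)  # (p0, p1), summing to 1

    head = MatchingHead()
    probs = head(torch.randn(1, 768))        # stand-in for a real [CLS] vector
    score = probs[0, 0].item() * 100         # 'match' probability, 100-point scale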
In the question-answer matching processing method provided by the embodiments of the present application, the obtained question text and answer text are segmented and embedded to produce an embedded representation of the text; a model based on the self-attention mechanism then extracts the self-attention features between the question text and the answer text, and question-answer matching data is generated from those features. No separate keyword database needs to be preset for each interview question: matching data for different questions and their corresponding texts is obtained from the large amount of question-answer matching information the model has learned, so the method generalizes well and is highly accurate.
Another embodiment of the present application provides a question-answer matching model training method, device, computer equipment and storage medium. The training method can be applied to a terminal or a server to train the question-answer matching model; the resulting model can measure the degree of match between various questions and answers based on the feature extraction sub-model and the matching sub-model.
For example, the question-answer matching model training method can run on a server, though it can of course also run on a terminal.
Referring to Fig. 10, Fig. 10 is a flow diagram of a question-answer matching model training method provided by an embodiment of the present application.
As shown in Fig. 10, the question-answer matching model training method includes steps S210 to S270.
Step S210: obtain a question-answer matching model, the question-answer matching model comprising a pre-trained BERT model and a matching sub-model connected to the BERT model.
The pre-trained BERT model serves as the feature extraction sub-model and implicitly brings linguistic knowledge into the question-answer matching model. Because the BERT model uses a masked language model, it is robust to interference, so the question-answer matching model can to some extent tolerate the noise introduced by speech recognition errors. The feature extraction sub-model can also capture the relationships between sentences in the question text and the answer text, so the question-answer match can be measured at sentence scale.
Illustratively, the matching sub-model includes a linear layer connected after the BERT hidden layer corresponding to the start marker [CLS]; the hidden feature vector output by that layer serves as its input.
Illustratively, the matching sub-model includes a classifier connected after the linear layer, such as a softmax classifier, for normalizing the two-dimensional vector (y0, y1).
Illustratively, the classifier applies softmax normalization to the vector output by the linear layer, compressing its elements into the interval 0-1; the results can be interpreted, for example, as the class probabilities p0 and p1 of 'match' and 'mismatch' between the answer text and the question text.
Step S220: obtain training data, the training data comprising a question text sample, an answer text sample corresponding to the question text sample, and matching degree data corresponding to the answer text sample.
Specifically, the training data consists of question texts and answer texts annotated with matching degree data; the annotation can be done manually, based on experience.
Illustratively, one training example includes the question text sample: Do you have a pet; the corresponding answer text sample: my dog is cute, he likes playing; and the matching degree data corresponding to the answer text sample, e.g. a match score of 80.
Step S230: perform word segmentation on the question text sample and answer text sample in the training data to obtain sample segmentation data.
Question and answer texts in languages such as English have natural word boundaries, whereas texts in languages such as Chinese usually need to be segmented into words before natural language processing.
Illustratively, the question text sample and answer text sample are segmented with a dictionary-based segmentation algorithm or a statistics-based machine learning algorithm.
In some embodiments, as shown in Fig. 11, step S230 of segmenting the question text sample and answer text sample in the training data to obtain sample segmentation data includes steps S231 and S232.
Step S231: segment the question text sample according to a preset dictionary to obtain question segmentation data.
The dictionary is a candidate set of common words, such as 我 (I), 爱 (love), 小狗 (puppy) and 贝贝 (Beibei). The text is traversed from start to end, and whenever a substring appears in the dictionary it is cut off as a word; thus 我爱小狗贝贝 ("I love puppy Beibei") is segmented into 我 / 爱 / 小狗 / 贝贝.
Step S232: segment the answer text sample according to the preset dictionary to obtain answer segmentation data.
As in step S231, segmenting the answer text sample yields the corresponding answer segmentation data.
Step S240: perform embedding processing on the sample segmentation data to obtain a sample representation.
In some embodiments, after the question text sample and answer text sample have been segmented into sample segmentation data, a start marker [CLS] is added at the beginning of each sentence in the question segmentation data and the answer segmentation data, a separator [SEP] is added between sentences, and a separator [SEP] is added at the end of each sentence.
In some embodiments, as shown in Fig. 12, step S240 of performing embedding processing on the sample segmentation data to obtain the sample representation includes steps S241 and S242.
Step S241: perform embedding processing on the token information, paragraph information and position information of the question segmentation data and the answer segmentation data.
Specifically, the paragraph information of the question segmentation data differs from the paragraph information of the answer segmentation data.
Illustratively, Fig. 8 shows how embedding is applied to the token, paragraph and position information of the question segmentation data and answer segmentation data.
Step S242: add together the embeddings of the token information, paragraph information and position information of the question segmentation data and answer segmentation data to obtain the sample representation.
Illustratively, the token, paragraph and position embeddings are each vectors of shape (1, n, 768); adding them element-wise yields a combined representation of shape (1, n, 768), which serves as the sample representation and as the input representation of the feature extraction sub-model, e.g. of the encoding layer of the BERT model.
Step S250: the BERT model performs feature extraction on the sample representation to obtain a self-attention feature vector.
In this embodiment, the feature extraction sub-model, e.g. the BERT model, extracts from the sample representation a self-attention feature vector describing the self-attention features of the question segmentation data, the answer segmentation data, and the relation between them.
The question-answer matching task, e.g. scoring an answer text, is a sentence-relationship task. The embedded representation input to the BERT model contains two parts: the part corresponding to the question text and the part corresponding to the answer text. The part corresponding to the question segmentation data is input to the BERT model as the first input, the part corresponding to the answer segmentation data as the second input, with the separator [SEP] between the two.
Specifically, the final hidden state output by the BERT hidden layer corresponding to the start marker [CLS] — the C vector — is taken as the extracted self-attention feature vector.
Step S260: the matching sub-model generates question-answer matching data from the self-attention feature vector.
The self-attention feature vector contains the attention features between the question and the answer; processing it with the matching sub-model yields a measure of the degree of match between the question text sample and the answer text sample.
Illustratively, based on the matching sub-model, the self-attention feature vector is reduced to a two-dimensional vector corresponding to the two classes 'match' and 'mismatch'; the matching sub-model then normalizes the two-dimensional vector, and the question-answer matching data is obtained from the normalized vector and output.
Illustratively, the class probability of 'match' between the answer text sample and the question text sample can be used as the question-answer matching data, or first converted to a 10-point or 100-point scale; it measures the degree of match between the question text sample and the answer text sample, and can also serve as the score of the answer text sample relative to the question text sample.
Step S270: based on a preset loss function, compute a loss value from the question-answer matching data and the matching degree data, and adjust the parameters of the question-answer matching model according to the loss value.
Illustratively, the difference between the predicted question-answer matching data and the annotated matching degree data is computed, a loss value is computed from this difference, and back-propagation then adjusts the parameters of the question-answer matching model according to the loss value, e.g. the parameters of the matching sub-model and/or of the BERT model.
New training data is then obtained as in step S220, and steps S230 to S270 are executed in turn, until the loss value computed in some execution of step S270 falls below a preset threshold, or the fluctuation of the loss value becomes small enough; training then stops, and the trained question-answer matching model is obtained.
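A hedged sketch of one such training step, assuming cross-entropy as the preset loss function and binary match/mismatch labels; the checkpoint name and variable names are illustrative.

    import torch
    import torch.nn as nn
    from transformers import BertModel

    model = BertModel.from_pretrained("bert-base-uncased")
    head = nn.Linear(768, 2)          # the matching sub-model's linear layer
    loss_fn = nn.CrossEntropyLoss()   # assumed preset loss function
    optimizer = torch.optim.Adam(
        list(model.parameters()) + list(head.parameters()), lr=2e-5)

    def train_step(inputs, labels):
        # labels: 1 for an annotated match, 0 for a mismatch.
        cls_vector = model(**inputs).last_hidden_state[:, 0]  # [CLS] feature
        loss = loss_fn(head(cls_vector), labels)  # loss vs. annotated matching data
        optimizer.zero_grad()
        loss.backward()       # back-propagate the loss value
        optimizer.step()      # adjust matching sub-model and/or BERT parameters
        return loss.item()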
In the question-answer matching model training method provided by the embodiments of the present application, the question texts and answer texts in the training data are segmented and embedded to produce embedded representations of the texts; a model based on the self-attention mechanism then extracts the self-attention features between the question text and the answer text, and question-answer matching data is generated from those features. A loss value is then computed from the annotated matching degree data and the predicted question-answer matching data, and the parameters of the question-answer matching model are adjusted according to the loss value, so the model can learn the attention information between a large number of questions and answers. When the trained question-answer matching model is used for the question-answer matching processing method, no separate keyword database needs to be preset for each interview question: matching data for different questions and their corresponding texts is obtained from the large amount of question-answer matching information the model has learned, so the method generalizes well and is highly accurate.
Referring to Fig. 13, Fig. 13 is a structural diagram of a question-answer matching processing device provided by an embodiment of the present application. The device can be configured in a server or a terminal to execute the question-answer matching processing method described above.
As shown in Fig. 13, the question-answer matching processing device comprises: a text obtaining module 110, a word segmentation module 120, an embedding module 130, a feature extraction module 140 and a matching computation module 150.
The text obtaining module 110 is configured to obtain a question text and an answer text.
Illustratively, as shown in Fig. 14, the text obtaining module 110 includes:
a speech obtaining sub-module 111, configured to obtain from a terminal the question speech encrypted by the terminal, the answer speech encrypted by the terminal, and the speech segment from which the terminal extracted the encryption key;
a character recognition sub-module 112, configured to recognize the characters in the speech segment to obtain a decryption key;
a speech decryption sub-module 113, configured to decrypt the encrypted question speech and the encrypted answer speech with the decryption key to obtain the question speech and the answer speech;
a speech recognition sub-module 114, configured to perform speech recognition on the question speech to obtain the question text, and on the answer speech to obtain the answer text.
The word segmentation processing module 120 is used to perform word segmentation on the question text and the answer text to obtain corpus segmentation data.
Illustratively, as shown in Figure 14, the word segmentation processing module 120 comprises:
a question segmentation submodule 121, used to perform word segmentation on the question text according to a preset dictionary, obtaining question segmentation data;
an answer segmentation submodule 122, used to perform word segmentation on the answer text according to the preset dictionary, obtaining answer segmentation data.
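One way to realize segmentation against a preset dictionary is the open-source jieba tokenizer; a sketch, in which the user dictionary file name is an assumption:

    import jieba

    # Load the preset dictionary (one word per line); the path is illustrative.
    jieba.load_userdict("preset_dict.txt")

    def segment(question_text: str, answer_text: str):
        question_tokens = jieba.lcut(question_text)   # question segmentation data
        answer_tokens = jieba.lcut(answer_text)       # answer segmentation data
        return question_tokens, answer_tokens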
The embedding processing module 130 is used to perform embedding processing on the corpus segmentation data to obtain embedding representation data.
Illustratively, as shown in Figure 14, the embedding processing module 130 comprises:
an embedding processing submodule 131, used to perform embedding processing on the word information, paragraph information, and position information of the question segmentation data and the answer segmentation data, where the paragraph information of the question segmentation data differs from the paragraph information of the answer segmentation data;
an embedding addition submodule 132, used to add up the embedding results of the word information, paragraph information, and position information of the question segmentation data and the answer segmentation data, obtaining the embedding representation data.
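Submodules 131 and 132 correspond to a BERT-style input embedding: word (token), paragraph (segment), and position embeddings are computed separately and summed, with the question and the answer given different paragraph ids. A minimal PyTorch sketch, with illustrative sizes:

    import torch
    import torch.nn as nn

    class QAEmbedding(nn.Module):
        """Sum word, paragraph (segment) and position embeddings; the
        question and the answer carry different paragraph ids (0 vs 1)."""
        def __init__(self, vocab_size=21128, hidden=768, max_len=512):
            super().__init__()
            self.word = nn.Embedding(vocab_size, hidden)   # word information
            self.para = nn.Embedding(2, hidden)            # paragraph information
            self.pos = nn.Embedding(max_len, hidden)       # position information

        def forward(self, token_ids, paragraph_ids):
            positions = torch.arange(token_ids.size(1), device=token_ids.device)
            # Embedding representation data = sum of the three embeddings.
            return self.word(token_ids) + self.para(paragraph_ids) + self.pos(positions)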
The feature extraction module 140 is used to perform feature extraction on the embedding representation data based on a feature extraction submodel, obtaining a self-attention feature vector; the feature extraction submodel is a model based on the self-attention mechanism.
The matching calculation module 150 is used to generate question-answer matching data from the self-attention feature vector based on a matching submodel, and to output the question-answer matching data.
Illustratively, as shown in Figure 14, the matching calculation module 150 comprises:
a vector dimension-reduction submodule 151, used to perform dimension reduction on the self-attention feature vector based on the trained matching submodel, obtaining a two-dimensional vector corresponding to the two categories "match" and "no match";
a normalization submodule 152, used to normalize the two-dimensional vector based on the matching submodel, obtain question-answer matching data from the processed two-dimensional vector, and output the question-answer matching data.
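Submodules 151 and 152 amount to a two-way classification head: a linear layer reduces the self-attention feature vector to two dimensions, and a softmax normalizes it. A sketch under those assumptions:

    import torch
    import torch.nn as nn

    class MatchingSubmodel(nn.Module):
        """Dimension reduction to the two categories, then normalization."""
        def __init__(self, hidden=768):
            super().__init__()
            self.reduce = nn.Linear(hidden, 2)   # submodule 151: dimension reduction

        def forward(self, features):             # features: (batch, hidden)
            logits = self.reduce(features)
            # Submodule 152: normalize into question-answer matching data.
            return torch.softmax(logits, dim=-1)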
Please refer to Figure 15, which is a structural schematic diagram of a question-answer matching model training apparatus provided by an embodiment of the application. The apparatus can be configured in a server or a terminal and is used to execute the aforementioned question-answer matching model training method.
As shown in Figure 15, the question-answer matching model training apparatus comprises:
a model acquisition module 210, used to obtain a question-answer matching model, the question-answer matching model comprising a pre-trained BERT model and a matching submodel connected to the BERT model (a sketch of this assembly follows the module list);
a data acquisition module 220, used to obtain training data, the training data comprising question text samples, answer text samples corresponding to the question text samples, and matching-degree data corresponding to the answer text samples;
a sample word segmentation module 230, used to perform word segmentation on the question text samples and answer text samples in the training data, obtaining sample segmentation data;
a sample embedding module 240, used to perform embedding processing on the sample segmentation data, obtaining sample representation data;
a feature vector extraction module 250, used for the BERT model to perform feature extraction on the sample representation data, obtaining a self-attention feature vector;
a matching module 260, used for the matching submodel to generate question-answer matching data from the self-attention feature vector;
a parameter adjustment module 270, used to calculate a loss value from the question-answer matching data and the matching-degree data based on a preset loss function, and to adjust the parameters in the question-answer matching model according to the loss value.
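A sketch of the module 210 wiring, assuming the HuggingFace transformers implementation of BERT (the patent names no library); the checkpoint "bert-base-chinese" and the use of the pooled [CLS] output as the self-attention feature vector are illustrative choices:

    import torch.nn as nn
    from transformers import BertModel

    class QAMatchingModel(nn.Module):
        """Pre-trained BERT with a matching submodel connected behind it."""
        def __init__(self, pretrained="bert-base-chinese"):
            super().__init__()
            self.bert = BertModel.from_pretrained(pretrained)          # module 210
            self.matcher = nn.Linear(self.bert.config.hidden_size, 2)  # matching submodel

        def forward(self, input_ids, token_type_ids, attention_mask):
            out = self.bert(input_ids=input_ids,
                            token_type_ids=token_type_ids,
                            attention_mask=attention_mask)
            # Pooled [CLS] output stands in for the self-attention feature vector.
            return self.matcher(out.pooler_output)   # (batch, 2) matching logits

Module 270 would then apply a loss such as nn.CrossEntropyLoss to these logits and the labeled matching-degree data and take an optimizer step, as in the training-loop sketch given earlier.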
Illustratively, as shown in Figure 16, the sample word segmentation module 230 comprises:
a question segmentation submodule 231, used to perform word segmentation on the question text samples according to a preset dictionary, obtaining question segmentation data;
an answer segmentation submodule 232, used to perform word segmentation on the answer text samples according to the preset dictionary, obtaining answer segmentation data.
Illustratively, as shown in Figure 16, the sample embedding module 240 comprises:
an embedding processing submodule 241, used to perform embedding processing on the word information, paragraph information, and position information of the question segmentation data and the answer segmentation data, where the paragraph information of the question segmentation data differs from the paragraph information of the answer segmentation data;
an embedding addition submodule 242, used to add up the embedding results of the word information, paragraph information, and position information of the question segmentation data and the answer segmentation data, obtaining the sample representation data.
It should be noted that, as will be clear to those skilled in the art, for convenience and brevity of description, the specific working processes of the apparatuses, modules, and units described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here.
The methods and apparatuses of the application can be used in numerous general-purpose or special-purpose computing system environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, and distributed computing environments including any of the above systems or devices.
Illustratively, the above methods and apparatuses can be implemented in the form of a computer program that can be run on a computer device as shown in Figure 17.
Please refer to Figure 17, which is a structural schematic diagram of a computer device provided by an embodiment of the application. The computer device can be a server or a terminal.
Referring to Figure 17, the computer device includes a processor, a memory, and a network interface connected through a system bus, where the memory may include a non-volatile storage medium and an internal memory.
The non-volatile storage medium can store an operating system and a computer program. The computer program includes program instructions that, when executed, can cause the processor to execute any one of the question-answer matching processing methods.
The processor is used to provide computing and control capability and supports the operation of the entire computer device.
The internal memory provides an environment for running the computer program stored in the non-volatile storage medium; when the computer program is executed by the processor, it can cause the processor to execute any one of the question-answer matching processing methods.
The network interface is used for network communication, such as sending assigned tasks. Those skilled in the art will understand that the structure of the computer device is only a block diagram of the partial structure relevant to the solution of the application and does not limit the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
It should be understood that the processor can be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor can be a microprocessor, or the processor can be any conventional processor.
In one embodiment, the processor is used to run the computer program stored in the memory to implement the following steps: obtaining a question text and an answer text; performing word segmentation on the question text and the answer text to obtain corpus segmentation data; performing embedding processing on the corpus segmentation data to obtain embedding representation data; performing feature extraction on the embedding representation data based on a feature extraction submodel to obtain a self-attention feature vector, the feature extraction submodel being a model based on the self-attention mechanism; and generating question-answer matching data from the self-attention feature vector based on a matching submodel, and outputting the question-answer matching data.
Specifically, when obtaining the question text and the answer text, the processor implements: obtaining from the terminal the question voice encrypted by the terminal, the answer voice encrypted by the terminal, and the voice fragment of the encryption key extracted by the terminal; recognizing the characters in the voice fragment to obtain the decryption key; decrypting the encrypted question voice and the encrypted answer voice according to the decryption key to obtain the question voice and the answer voice; and performing speech recognition on the question voice to obtain the question text and on the answer voice to obtain the answer text.
Specifically, when performing word segmentation on the question text and the answer text to obtain the corpus segmentation data, the processor implements: performing word segmentation on the question text according to a preset dictionary to obtain question segmentation data; and performing word segmentation on the answer text according to the preset dictionary to obtain answer segmentation data.
Specifically, when performing embedding processing on the corpus segmentation data to obtain the embedding representation data, the processor implements: performing embedding processing on the word information, paragraph information, and position information of the question segmentation data and the answer segmentation data, where the paragraph information of the question segmentation data differs from the paragraph information of the answer segmentation data; and adding up the embedding results of the word information, paragraph information, and position information of the question segmentation data and the answer segmentation data to obtain the embedding representation data.
Specifically, when generating question-answer matching data from the self-attention feature vector based on the matching submodel and outputting the question-answer matching data, the processor implements: performing dimension reduction on the self-attention feature vector based on the trained matching submodel to obtain a two-dimensional vector corresponding to the two categories "match" and "no match"; and normalizing the two-dimensional vector based on the matching submodel, obtaining question-answer matching data from the processed two-dimensional vector, and outputting the question-answer matching data.
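Putting the pieces together, an end-to-end inference sketch built on the QAMatchingModel from the earlier sketch; the tokenizer assigns the two segments different paragraph (token_type) ids, and treating index 1 as the "match" class is an assumption:

    import torch
    from transformers import BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")  # illustrative
    model = QAMatchingModel()   # from the earlier sketch
    model.eval()

    def match(question: str, answer: str) -> float:
        # Encode question and answer as one sequence pair.
        enc = tokenizer(question, answer, return_tensors="pt", truncation=True)
        with torch.no_grad():
            logits = model(enc["input_ids"], enc["token_type_ids"],
                           enc["attention_mask"])
        probs = torch.softmax(logits, dim=-1)  # normalized matching data
        return probs[0, 1].item()              # probability of "match"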
In another embodiment, the processor is used to run the computer program stored in the memory to implement the following steps: obtaining a question-answer matching model comprising a pre-trained BERT model and a matching submodel connected to the BERT model; obtaining training data comprising question text samples, answer text samples corresponding to the question text samples, and matching-degree data corresponding to the answer text samples; performing word segmentation on the question text samples and answer text samples in the training data to obtain sample segmentation data; performing embedding processing on the sample segmentation data to obtain sample representation data; performing feature extraction on the sample representation data with the BERT model to obtain a self-attention feature vector; generating question-answer matching data from the self-attention feature vector with the matching submodel; and calculating a loss value from the question-answer matching data and the matching-degree data based on a preset loss function, and adjusting the parameters in the question-answer matching model according to the loss value.
Specifically, when performing word segmentation on the question text samples and answer text samples in the training data to obtain the sample segmentation data, the processor implements: performing word segmentation on the question text samples according to a preset dictionary to obtain question segmentation data; and performing word segmentation on the answer text samples according to the preset dictionary to obtain answer segmentation data.
Specifically, when performing embedding processing on the sample segmentation data to obtain the sample representation data, the processor implements: performing embedding processing on the word information, paragraph information, and position information of the question segmentation data and the answer segmentation data, where the paragraph information of the question segmentation data differs from the paragraph information of the answer segmentation data; and adding up the embedding results of the word information, paragraph information, and position information of the question segmentation data and the answer segmentation data to obtain the sample representation data.
As can be seen from the above description of the embodiments, those skilled in the art can clearly understand that the application can be implemented by means of software plus the necessary general-purpose hardware platform. Based on this understanding, the technical solution of the application, or the part of it that contributes over the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium, such as a ROM/RAM, magnetic disk, or optical disc, and includes instructions for causing a computer device (which can be a personal computer, a server, a network device, etc.) to execute the methods described in the embodiments of the application or in certain parts thereof, for example:
a computer-readable storage medium storing a computer program, the computer program including program instructions; the processor executes the program instructions to implement any of the question-answer matching processing methods provided by the embodiments of the application; or
to implement any of the above question-answer matching model training methods.
The computer-readable storage medium can be an internal storage unit of the computer device described in the foregoing embodiments, such as the hard disk or memory of the computer device. The computer-readable storage medium can also be an external storage device of the computer device, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the computer device.
The above are only specific embodiments of the application, but the protection scope of the application is not limited thereto. Any person familiar with the technical field can readily conceive of various equivalent modifications or replacements within the technical scope disclosed by the application, and these modifications or replacements shall all be covered by the protection scope of the application. Therefore, the protection scope of the application shall be subject to the protection scope of the claims.

Claims (10)

1. A question-answer matching processing method, characterized by comprising:
obtaining a question text and an answer text;
performing word segmentation on the question text and the answer text to obtain corpus segmentation data;
performing embedding processing on the corpus segmentation data to obtain embedding representation data;
performing feature extraction on the embedding representation data based on a feature extraction submodel to obtain a self-attention feature vector, the feature extraction submodel being a model based on the self-attention mechanism;
generating question-answer matching data from the self-attention feature vector based on a matching submodel, and outputting the question-answer matching data.
2. The question-answer matching processing method of claim 1, characterized in that obtaining the question text and the answer text comprises:
obtaining from a terminal the question voice encrypted by the terminal, the answer voice encrypted by the terminal, and the voice fragment of the encryption key extracted by the terminal;
recognizing the characters in the voice fragment to obtain a decryption key;
decrypting the encrypted question voice and the encrypted answer voice according to the decryption key to obtain a question voice and an answer voice;
performing speech recognition on the question voice to obtain the question text, and performing speech recognition on the answer voice to obtain the answer text.
3. The question-answer matching processing method of claim 1, characterized in that performing word segmentation on the question text and the answer text to obtain the corpus segmentation data comprises:
performing word segmentation on the question text according to a preset dictionary to obtain question segmentation data;
performing word segmentation on the answer text according to the preset dictionary to obtain answer segmentation data.
4. The question-answer matching processing method of claim 3, characterized in that performing embedding processing on the corpus segmentation data to obtain the embedding representation data comprises:
performing embedding processing on the word information, paragraph information, and position information of the question segmentation data and the answer segmentation data, the paragraph information of the question segmentation data being different from the paragraph information of the answer segmentation data;
adding up the embedding results of the word information, paragraph information, and position information of the question segmentation data and the answer segmentation data to obtain the embedding representation data.
5. The question-answer matching processing method of any one of claims 1-4, characterized in that generating the question-answer matching data from the self-attention feature vector based on the matching submodel and outputting the question-answer matching data comprises:
performing dimension reduction on the self-attention feature vector based on the trained matching submodel to obtain a two-dimensional vector corresponding to the two categories "match" and "no match";
normalizing the two-dimensional vector based on the matching submodel, obtaining question-answer matching data from the processed two-dimensional vector, and outputting the question-answer matching data.
6. A question-answer matching model training method, characterized by comprising:
obtaining a question-answer matching model, the question-answer matching model comprising a pre-trained BERT model and a matching submodel connected to the BERT model;
obtaining training data, the training data comprising question text samples, answer text samples corresponding to the question text samples, and matching-degree data corresponding to the answer text samples;
performing word segmentation on the question text samples and answer text samples in the training data to obtain sample segmentation data;
performing embedding processing on the sample segmentation data to obtain sample representation data;
performing feature extraction on the sample representation data with the BERT model to obtain a self-attention feature vector;
generating question-answer matching data from the self-attention feature vector with the matching submodel;
calculating a loss value from the question-answer matching data and the matching-degree data based on a preset loss function, and adjusting the parameters in the question-answer matching model according to the loss value.
7. The question-answer matching model training method of claim 6, characterized in that performing word segmentation on the question text samples and answer text samples in the training data to obtain the sample segmentation data comprises:
performing word segmentation on the question text samples according to a preset dictionary to obtain question segmentation data;
performing word segmentation on the answer text samples according to the preset dictionary to obtain answer segmentation data;
and that performing embedding processing on the sample segmentation data to obtain the sample representation data comprises:
performing embedding processing on the word information, paragraph information, and position information of the question segmentation data and the answer segmentation data, the paragraph information of the question segmentation data being different from the paragraph information of the answer segmentation data;
adding up the embedding results of the word information, paragraph information, and position information of the question segmentation data and the answer segmentation data to obtain the sample representation data.
8. A question-answer matching processing apparatus, characterized by comprising:
a text acquisition module, used to obtain a question text and an answer text;
a word segmentation processing module, used to perform word segmentation on the question text and the answer text to obtain corpus segmentation data;
an embedding processing module, used to perform embedding processing on the corpus segmentation data to obtain embedding representation data;
a feature extraction module, used to perform feature extraction on the embedding representation data based on a feature extraction submodel to obtain a self-attention feature vector, the feature extraction submodel being a model based on the self-attention mechanism;
a matching calculation module, used to generate question-answer matching data from the self-attention feature vector based on a matching submodel and to output the question-answer matching data.
9. A computer device, characterized in that the computer device comprises a memory and a processor;
the memory is used to store a computer program;
the processor is used to execute the computer program and, when executing the computer program, to implement the question-answer matching processing method of any one of claims 1-5; or
to implement the question-answer matching model training method of claim 6 or 7.
10. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, the question-answer matching processing method of any one of claims 1-5 is implemented; or
the question-answer matching model training method of claim 6 or 7 is implemented.
CN201910569979.7A 2019-06-27 2019-06-27 Question and answer matching treatment, model training method, device, equipment and storage medium Pending CN110442675A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910569979.7A CN110442675A (en) 2019-06-27 2019-06-27 Question and answer matching treatment, model training method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN110442675A 2019-11-12

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110909144A (en) * 2019-11-28 2020-03-24 中信银行股份有限公司 Question-answer dialogue method and device, electronic equipment and computer readable storage medium
CN111046158A (en) * 2019-12-13 2020-04-21 腾讯科技(深圳)有限公司 Question-answer matching method, model training method, device, equipment and storage medium
CN111159340A (en) * 2019-12-24 2020-05-15 重庆兆光科技股份有限公司 Answer matching method and system for machine reading understanding based on random optimization prediction
CN111159367A (en) * 2019-12-11 2020-05-15 中国平安财产保险股份有限公司 Information processing method and related equipment
CN111401076A (en) * 2020-04-09 2020-07-10 支付宝(杭州)信息技术有限公司 Text similarity determination method and device and electronic equipment
CN111709244A (en) * 2019-11-20 2020-09-25 中共南通市委政法委员会 Deep learning method for identifying causal relationship of contradictory dispute events
CN111708861A (en) * 2020-04-29 2020-09-25 平安科技(深圳)有限公司 Matching set obtaining method and device based on double matching and computer equipment
CN111897829A (en) * 2020-05-26 2020-11-06 华瑞新智科技(北京)有限公司 Natural language query method and equipment for medical software
CN112000805A (en) * 2020-08-24 2020-11-27 平安国际智慧城市科技股份有限公司 Text matching method, device, terminal and storage medium based on pre-training model
CN112328767A (en) * 2020-11-11 2021-02-05 重庆邮电大学 Question-answer matching method based on BERT model and comparative aggregation framework
CN112364634A (en) * 2020-11-02 2021-02-12 成都不问科技有限公司 Synonym matching method based on question sentence
CN112489740A (en) * 2020-12-17 2021-03-12 北京惠及智医科技有限公司 Medical record detection method, training method of related model, related equipment and device
CN112507096A (en) * 2020-12-16 2021-03-16 平安银行股份有限公司 Document question-answer pair splitting method and device, electronic equipment and storage medium
CN112632216A (en) * 2020-12-10 2021-04-09 深圳得理科技有限公司 Deep learning-based long text retrieval system and method
CN112667797A (en) * 2021-01-06 2021-04-16 华南师范大学 Question-answer matching method, system and storage medium for adaptive transfer learning
CN112732879A (en) * 2020-12-23 2021-04-30 重庆理工大学 Downstream task processing method and model of question-answering task
CN113591475A (en) * 2021-08-03 2021-11-02 美的集团(上海)有限公司 Unsupervised interpretable word segmentation method and device and electronic equipment
CN116863935A (en) * 2023-09-04 2023-10-10 深圳有咖互动科技有限公司 Speech recognition method, device, electronic equipment and computer readable medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108519890A (en) * 2018-04-08 2018-09-11 武汉大学 A kind of robustness code abstraction generating method based on from attention mechanism
CN108959421A (en) * 2018-06-08 2018-12-07 三角兽(北京)科技有限公司 Candidate replys evaluating apparatus and inquiry reverting equipment and its method, storage medium
CN108845990A (en) * 2018-06-12 2018-11-20 北京慧闻科技发展有限公司 Answer selection method, device and electronic equipment based on two-way attention mechanism
CN109376222A (en) * 2018-09-27 2019-02-22 国信优易数据有限公司 Question and answer matching degree calculation method, question and answer automatic matching method and device
CN109766424A (en) * 2018-12-29 2019-05-17 安徽省泰岳祥升软件有限公司 Filtering method and device for reading understanding model training data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Jacob Devlin et al., "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding", arXiv:1810.04805v2 [cs.CL], pages 1-16. *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination