CN111563161A - Sentence recognition method, sentence recognition device and intelligent equipment - Google Patents


Info

Publication number
CN111563161A
Authority
CN
China
Prior art keywords
statement
sentence
current user
historical
result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010340047.8A
Other languages
Chinese (zh)
Other versions
CN111563161B (en)
Inventor
熊为星
马力
熊友军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ubtech Robotics Corp
Original Assignee
Ubtech Robotics Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ubtech Robotics Corp
Priority to CN202010340047.8A
Publication of CN111563161A
Application granted
Publication of CN111563161B
Active legal status
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval of unstructured textual data
    • G06F 16/35 Clustering; Classification
    • G06F 16/353 Clustering; Classification into predefined classes
    • G06F 40/00 Handling natural language data
    • G06F 40/30 Semantic analysis
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)

Abstract

The method splices the current user sentence with historical sentences and inputs the spliced sentences into a pre-training model to obtain the information mined from the spliced sentences by the pre-training model, so that more feature information is mined even when little corpus data is available. Furthermore, the character-level encoding and the encoding of the context information are separated through an RNN model, the required features are extracted, and the current user sentence is recognized based on the extracted features. The recognition process also effectively combines the context information, improving both the efficiency and the accuracy of understanding the current user sentence.

Description

Sentence recognition method, sentence recognition device and intelligent equipment
Technical Field
The present application belongs to the technical field of artificial intelligence, and in particular, relates to a sentence recognition method, a sentence recognition apparatus, an intelligent device, and a computer-readable storage medium.
Background
With the development of technology, human-machine dialogue systems are increasingly widely applied. To realize automatic human-machine dialogue, a computer needs to parse the intention and the dialogue-act category contained in the text input by the user and extract keywords from them, in order to formulate a corresponding reply strategy. In recent years, with the development of deep learning techniques and the improvement of the computing power of computers, deep learning has begun to be applied to human-machine dialogue systems. However, the current representative models for multi-task semantic analysis over multi-turn dialogue still suffer from problems such as low accuracy and cannot meet users' requirements.
Disclosure of Invention
The embodiment of the application provides a sentence identification method, a sentence identification device, an intelligent device and a computer readable storage medium, which can improve the understanding efficiency and the understanding accuracy of the current user sentence.
A first aspect of the present application provides a sentence recognition method, including:
acquiring a current user statement and a historical statement set, wherein the current user statement is a current round of user statements, the historical statement set is composed of at least one historical statement, and the historical statements comprise user statements of each historical round and system statements of each historical round;
splicing the current user statement with each historical statement in the historical statement set respectively to obtain at least one spliced statement;
respectively marking the position for at least one splicing statement;
inputting at least one spliced statement with a marked position into a pre-training model, and respectively and correspondingly obtaining at least one pre-training result;
inputting the at least one pre-training result into a trained first-class recurrent neural network (RNN) model, and respectively and correspondingly obtaining a forward output result and a backward output result which are output by the first-class recurrent neural network model and are related to each pre-training result;
and identifying the current user sentence according to the forward output result and the backward output result which are related to each pre-training result.
A second aspect of the present application provides a sentence recognition apparatus, including:
the system comprises a statement acquisition unit, a statement acquisition unit and a history statement set, wherein the statement acquisition unit is used for acquiring a current user statement and a history statement set, the current user statement is a current round of user statements, the history statement set is composed of at least one history statement, and the history statement comprises each history round of user statements and each history round of system statements;
the sentence splicing unit is used for splicing the current user sentence with each historical sentence in the historical sentence set respectively to obtain at least one spliced sentence;
the position marking unit is used for marking the position of at least one splicing statement respectively;
the pre-training result acquisition unit is used for inputting at least one spliced statement with a marked position into a pre-training model and respectively and correspondingly obtaining at least one pre-training result;
a first network output result obtaining unit, configured to input the at least one pre-training result into a trained first-class recurrent neural network model, and respectively obtain a forward output result and a backward output result, which are output by the first-class recurrent neural network model and are related to each pre-training result, in a corresponding manner;
and the sentence recognition unit is used for recognizing the current user sentence according to the forward output result and the backward output result which are related to each pre-training result.
A third aspect of the present application provides a smart device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the method according to the first aspect when executing the computer program.
A fourth aspect of the present application provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method of the first aspect as described above.
A fifth aspect of the application provides a computer program product comprising a computer program which, when executed by one or more processors, performs the steps of the method as described in the first aspect above.
As can be seen from the above, in the present application, a current user statement and a historical statement set are first obtained, where the current user statement is the current round of user statement, the historical statement set is composed of at least one historical statement, and the historical statements comprise the user statements of each historical round and the system statements of each historical round. The current user statement is then spliced with each historical statement in the historical statement set to obtain at least one spliced statement; positions are marked for the at least one spliced statement, and the at least one spliced statement with marked positions is input into a pre-training model to correspondingly obtain at least one pre-training result. The at least one pre-training result is then input into a trained first-class recurrent neural network model, and a forward output result and a backward output result related to each pre-training result, output by the first-class recurrent neural network model, are correspondingly obtained. Finally, the current user statement is identified according to the forward output result and the backward output result related to each pre-training result. According to this scheme, the current user statement is spliced with the historical statements and then input into the pre-training model to obtain the information mined from the spliced statements by the pre-training model, so that more feature information is mined even when little corpus data is available; furthermore, the character-level encoding and the encoding of the context information are separated through an RNN model, the required features are extracted, and the current user statement is recognized based on the extracted features.
The recognition process also effectively combines the context information, and improves the understanding efficiency and the understanding accuracy of the current user statement. It is understood that the beneficial effects of the second aspect to the fifth aspect can be referred to the related description of the first aspect, and are not described herein again.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments or of the prior art are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the present application, and that those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart illustrating an implementation of a sentence recognition method provided in an embodiment of the present application;
FIG. 2 is a schematic diagram of an output of a first-class recurrent neural network model in a sentence recognition method provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of a model framework adopted by the sentence recognition method provided in the embodiment of the present application;
fig. 4 is a block diagram of a sentence recognition apparatus according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of an intelligent device provided in an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.
Furthermore, in the description of the present application and the appended claims, the terms "first," "second," and the like are used for distinguishing between descriptions and not necessarily for describing or implying relative importance.
Example one
Referring to fig. 1, a sentence recognition method provided in an embodiment of the present application is described below, where the sentence recognition method in the embodiment of the present application includes:
step 101, obtaining a current user statement and a historical statement set;
in the embodiment of the application, the sentence input by the user this time, that is, the current user sentence, may be received first. The current user sentence is an object of the current sentence recognition, that is, the current sentence recognition aims to recognize the intention, the dialogue behavior and the word slot of the current user sentence. Considering that the statements of multiple rounds of conversations are often closely connected, the influence of historical statements on the recognition of the current user statement may be considered, where the historical statements include historical user statements and historical system statements, where the historical user statements refer to user statements of each historical round, the historical system statements refer to system statements of each historical round, the user statements are statements input by a user, and the system statements are statements fed back by the smart device according to the user statements. Specifically, all the historical sentences in the multiple rounds of conversations may be stored in the historical sentence set, so that all the historical sentences in the multiple rounds of conversations may be obtained only by obtaining the historical sentence set.
Assuming that a round of conversation includes a user statement u and a system statement s fed back by the smart device according to the history information, when the current round is the n-th round, the current user statement may be denoted u_n. At this time there are also n-1 historical rounds of user statements (i.e. the 1st, 2nd, 3rd, …, through the (n-1)-th) and n-1 historical rounds of system statements (i.e. the 1st, 2nd, 3rd, …, through the (n-1)-th). For better explanation, the historical statement set may be divided into a historical user statement set and a historical system statement set, denoted C_u and C_s respectively, which are composed as follows:

C_u = {u_1, u_2, …, u_{n-1}}
C_s = {s_1, s_2, …, s_{n-1}}

where C_u is the historical user statement set and C_s is the historical system statement set; together they constitute the historical statement set.
102, splicing the current user statement with each historical statement in the historical statement set respectively to obtain at least one spliced statement;
in the embodiment of the present application, in order to effectively combine context information, after a current user statement and a historical statement set are obtained, the current user statement may be respectively spliced with each historical statement in the historical statement set to obtain at least one spliced statement. In consideration of the richness of the content contained in the multiple rounds of conversations, the current user sentences can be spliced with the historical sentences respectively. Alternatively, the step 102 may be embodied as:
a1, acquiring all words forming the current user sentence;
In the embodiment of the application, word segmentation is performed on the current user statement u_n to obtain each word of u_n. Specifically, assume that the current user statement u_n contains l_{u_n} words; then u_n can be expressed as

u_n = {w_{u_n}^1, w_{u_n}^2, …, w_{u_n}^{l_{u_n}}}

where w is a word constituting a statement, the subscript of w indicates which statement the word belongs to, and the superscript of w indicates the position of the word within that statement. That is, in the above representation of u_n, w_{u_n}^1 is the 1st word forming u_n, w_{u_n}^2 is the 2nd word, and so on, up to w_{u_n}^{l_{u_n}}, the l_{u_n}-th word.
A2, obtaining all words forming the history sentence to be spliced;
In this embodiment of the present application, the history statement to be spliced is any history statement in the history statement set. Word segmentation may be performed on the history statement to be spliced to obtain each word forming it. Similar to step A1, the history statement to be spliced can be represented in the same manner: for example, assume that the history statement to be spliced is s_{n-1} and that s_{n-1} contains l_{s_{n-1}} words; then s_{n-1} can be expressed as

s_{n-1} = {w_{s_{n-1}}^1, w_{s_{n-1}}^2, …, w_{s_{n-1}}^{l_{s_{n-1}}}}
And A3, splicing the words forming the current user statement with the words forming the history statement to be spliced through a preset spacer, in the order of the history statement to be spliced followed by the current user statement, to obtain a spliced statement of the current user statement and the history statement to be spliced.
In the embodiment of the present application, each splicing operation places the current user statement u_n after a history statement; that is, splicing proceeds in the order of the history statement to be spliced followed by the current user statement, so that in the resulting spliced statement the history statement comes first and the current user statement comes after. During splicing, a spacer [SEP] is inserted between the two statements. Taking the case where the history statement to be spliced with the current user statement u_n is s_{n-1} as an example, the resulting spliced statement is:

C_{s_{n-1}, u_n} = {w_{s_{n-1}}^1, …, w_{s_{n-1}}^{l_{s_{n-1}}}, [SEP], w_{u_n}^1, …, w_{u_n}^{l_{u_n}}}

where C is the spliced statement and its subscript indicates which two statements it is formed from. As this example shows, the words forming the history statement to be spliced come first, the words forming the current user statement u_n come after, and the two are separated by the spacer [SEP].
It should be noted that, since there are n-1 historical user statements and n-1 historical system statements in the multi-turn conversation, each historical user statement and each historical system statement can serve as the history statement to be spliced with the current user statement u_n, so that k spliced statements can finally be obtained, where k = 2(n-1).
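As an illustrative sketch of steps A1 to A3 (this code and its helper names are assumptions, not from the patent), the splicing can be written in a few lines of Python, assuming the sentences have already been segmented into word lists:

```python
# Illustrative sketch of steps A1-A3 (all names are assumptions).
SEP = "[SEP]"  # the preset spacer

def splice(history_words, current_user_words):
    """Step A3: history words first, then the spacer, then the current-user words."""
    return history_words + [SEP] + current_user_words

def splice_all(history_user, history_system, current_user_words):
    """Splice the current user sentence with every history sentence,
    yielding k = 2 * (n - 1) spliced sentences."""
    return [splice(h, current_user_words)
            for h in history_user + history_system]

# Toy dialogue with n = 3 rounds, so k = 4 spliced sentences.
C_u = [["hello"], ["play", "music"]]        # historical user sentences u_1, u_2
C_s = [["hi", "there"], ["which", "song"]]  # historical system sentences s_1, s_2
u_n = ["play", "jazz"]                      # current user sentence
spliced = splice_all(C_u, C_s, u_n)
print(len(spliced))   # 4
print(spliced[0])     # ['hello', '[SEP]', 'play', 'jazz']
```

Each of the 2(n-1) history sentences yields one spliced sentence, with the history words first and the current-user words after the [SEP] spacer, matching the order described in step A3.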
103, marking the position for at least one splicing statement respectively;
in the embodiment of the present application, a position needs to be marked for each concatenation statement, specifically, which part of content in the concatenation statement comes from the historical statement and which part of content comes from the current user statement. In some embodiments, the operation of marking the position may be implemented by using a boolean vector, that is, after k concatenation statements are obtained, corresponding boolean vectors may be generated for the respective concatenation statements, so as to mark the position of the respective concatenation statements by the corresponding boolean vectors. Alternatively, the step 103 may be embodied as:
b1, initializing a Boolean vector based on the splicing statement to be processed;
In the embodiment of the present application, the splicing statement to be processed is any spliced statement. For the splicing statement to be processed, a Boolean vector may be initialized, where the number of elements of this Boolean vector is the same as the number of elements of the splicing statement to be processed, so that each element in the Boolean vector corresponds to exactly one element in the splicing statement to be processed. For example, the spliced statement obtained from the current user statement u_n and the history statement to be spliced s_{n-1} contains the following elements: the l_{s_{n-1}} words forming s_{n-1}, the l_{u_n} words forming u_n, and one spacer [SEP]. Based on this, the spliced statement obtained from u_n and s_{n-1} contains l_{s_{n-1}} + l_{u_n} + 1 elements, and the correspondingly generated Boolean vector also contains l_{s_{n-1}} + l_{u_n} + 1 elements, each of which is initialized to a default value, where the default value may be 0.
B2, setting the elements in the Boolean vector that correspond to the elements belonging to the current user statement in the splicing statement to be processed to a preset first numerical value, wherein the splicing statement to be processed is any spliced statement;
In the embodiment of the present application, marking the position is realized by assigning values to the elements of the Boolean vector. The elements in the Boolean vector corresponding to the elements of the splicing statement to be processed that belong to the current user statement are set to a preset first numerical value. Specifically, since the last l_{u_n} elements of the Boolean vector correspond to the words of the current user statement, the values of the last l_{u_n} elements of the Boolean vector can be set to the preset first numerical value. Specifically, the first numerical value may be 1.
B3, setting the elements in the Boolean vector that correspond to the elements belonging to the history statement in the splicing statement to be processed to a preset second numerical value;
b4, setting the element in the Boolean vector that corresponds to the spacer in the splicing statement to be processed to the preset second numerical value;
In the embodiment of the present application, the values of all elements of the Boolean vector other than the last l_{u_n} elements are set to a preset second numerical value. Specifically, the second numerical value may be 0. Taking the spliced statement C_{s_{n-1}, u_n} obtained from the current user statement u_n and the history statement to be spliced s_{n-1} as the splicing statement to be processed, the Boolean vector generated for it has its first l_{s_{n-1}} + 1 elements set to "0" and its last l_{u_n} elements set to "1"; that is, the Boolean vector is specifically

{0, 0, …, 0, 1, 1, …, 1}

with l_{s_{n-1}} + 1 zeros followed by l_{u_n} ones.
And B5, filling or trimming the lengths of the Boolean vectors to preset lengths, and taking the filled or trimmed Boolean vectors as the Boolean vectors corresponding to the splicing sentences to be processed.
In the embodiment of the present application, it is considered that in daily dialogue the length of the sentence input by the user is not fixed, and neither is the sentence fed back by the system of the smart device; that is, the lengths of the current user statement and of each history statement differ, so the Boolean vectors generated for the spliced statements according to steps B1 to B4 also vary in length. For convenience of subsequent operations, the generated Boolean vector is filled or trimmed to a preset length. The preset length may be 64 or another value; it may be set by a developer and is not limited here. Optionally, the step B5 may be embodied as:
b51, comparing the length of the Boolean vector with a preset length;
in this embodiment of the present application, the length of the boolean vector generated according to the to-be-processed concatenation statement may be compared with a preset length, and whether to perform the filling processing or the trimming processing may be determined according to a comparison result.
B52, if the length of the Boolean vector is longer than the preset length, deleting elements from the last element of the Boolean vector forward until the number of elements of the Boolean vector equals the preset length, to obtain the trimmed Boolean vector;
in the embodiment of the present application, when the length of the boolean vector is found to be longer than the preset length, the boolean vector is considered to be too long, and the pruning process is required at this time. The pruning treatment is specifically to delete the last element of the boolean vector and sequentially delete the elements forward until the elements of the boolean vector are equal to the preset length, so as to obtain the pruned boolean vector; that is, each time an element is deleted, the last element is deleted. For better explanation of the clipping process, only by way of example, assuming that the preset length is 16, and assuming that the generated boolean vector is {0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1}, the clipped resultant boolean vector is {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1 }.
And B53, if the length of the Boolean vector is shorter than the preset length, adding elements backward from the last element of the Boolean vector until the number of elements of the Boolean vector equals the preset length, to obtain the filled Boolean vector, wherein the value of each added element is the preset second numerical value.
In the embodiment of the present application, when the length of the Boolean vector is found to be shorter than the preset length, the Boolean vector is considered too short and filling is required. The filling adds new elements backward from the last element of the Boolean vector until the number of elements of the Boolean vector equals the preset length, thereby obtaining the filled Boolean vector; that is, each addition is performed after the current last element. To better illustrate the filling process, by way of example only, assume the preset length is 16 and the generated Boolean vector is {0, 0, 0, 0, 1, 1}; the filled Boolean vector is then {0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}. Optionally, for the k obtained spliced statements and their corresponding Boolean vectors, the embodiment of the present application successively uses two different types of recurrent neural network models for processing.
It should be noted that, if the length of the boolean vector is equal to the preset length, the generated boolean vector does not need to be filled or pruned.
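The Boolean-vector marking and length normalization of steps B1 to B5 can be sketched as follows. This is one illustrative reading (the function and variable names are assumptions), taking the default and second values as 0, the first value as 1, and a preset length of 16 for the examples:

```python
def mark_positions(spliced, n_current_words, preset_length):
    """Steps B1-B5 as sketched here: 0 for history words and the [SEP]
    spacer, 1 for the words of the current user sentence, then trim or
    pad the vector to preset_length."""
    vec = [0] * len(spliced)                  # B1: initialize every element to 0
    for i in range(len(spliced) - n_current_words, len(spliced)):
        vec[i] = 1                            # B2: current-user elements -> 1
    # B3/B4 are already satisfied by the 0 initialization.
    if len(vec) > preset_length:
        vec = vec[:preset_length]             # B52: delete elements from the end
    else:
        vec += [0] * (preset_length - len(vec))  # B53: pad with the second value
    return vec

# Trimming: an 18-element spliced sentence whose last 6 words are the
# current user sentence, normalized to a preset length of 16.
trimmed = mark_positions(["w"] * 18, 6, 16)
print(trimmed)   # 12 zeros followed by 4 ones

# Filling: a 6-element spliced sentence whose last 2 words are the
# current user sentence, padded out to 16 elements.
padded = mark_positions(["w"] * 6, 2, 16)
print(padded)    # [0, 0, 0, 0, 1, 1] followed by 10 zeros
```

Note that trimming from the end removes marks for the current user sentence first, which is why the trimmed example keeps only 4 of its 6 ones.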
104, inputting at least one spliced statement with a marked position into a pre-training model, and respectively and correspondingly obtaining at least one pre-training result;
in the embodiment of the present application, before the spliced statements with marked positions are processed by the RNN model, the spliced statements and their corresponding Boolean vectors are first processed by a pre-training model, which is specifically a pre-trained BERT (Bidirectional Encoder Representations from Transformers) model. A splicing statement to be processed with its position marked (when a Boolean vector is used to mark the position, this means the splicing statement to be processed together with its corresponding Boolean vector) may be input into the BERT model, and the output of the last hidden layer of the BERT model is then taken as the pre-training result associated with that splicing statement, where, as before, the splicing statement to be processed is any spliced statement. For convenience of description, taking Boolean-vector position marking and the BERT model as the pre-training model as an example, the pre-training result is denoted H in the embodiment of the present application, as shown below:
H_1 = BERT(C_1, B_1), H_2 = BERT(C_2, B_2), …, H_k = BERT(C_k, B_k)

In the above formula, H_1 is the output of the last hidden layer obtained after the spliced statement C_1 and its corresponding Boolean vector B_1 are input into the BERT model; that is, H_1 is obtained from the spliced statement C_1 and its corresponding Boolean vector B_1, and the remaining pre-training results H in the formula are obtained by analogy. Since there are k spliced statements, there are also k pre-training results H, denoted H_1, H_2, …, H_k respectively.
Step 105, inputting the at least one pre-training result into a trained first-class recurrent neural network model, and respectively and correspondingly obtaining a forward output result and a backward output result which are output by the first-class recurrent neural network model and are related to each pre-training result;
in the embodiment of the present application, the session information of each spliced statement can be expressed through the strong characterization capability of a pre-training model such as the BERT model; however, since the parameters in the pre-training model are fixed, it cannot express more specific relationship information of the spliced statement. Based on this, the first-type RNN, that is, the first-type recurrent neural network model, is introduced in the embodiment of the present application to obtain the correlation between each word of the current user statement and each word of the historical statements constituting the spliced statement. By encoding along the words with the first-type recurrent neural network model, the attention relationship distribution between each word of the current user statement and each historical statement can be obtained. The first-type recurrent neural network model adopts a bidirectional RNN (BiRNN), specifically a BiGRU, so that its output includes a forward output result and a backward output result and no following (right-context) information is lost. Specifically, the output results of the encoding are as follows:
(o_f^1, o_f^2, …, o_f^l), (o_b^1, o_b^2, …, o_b^l) = BiGRU(H)
in the above equation, o is an output result of the first-type recurrent neural network model (i.e., the BiGRU); the subscript f indicates an output result calculated by forward propagation (i.e., a forward output result), the subscript b indicates an output result calculated by backward propagation (i.e., a backward output result), and l is the preset length (i.e., the length of the clipped or padded Boolean vector). Taking any pre-training result as the input of the first-type recurrent neural network model as an example: since the length of the Boolean vector of the concatenated sentence is limited to l, there are only l elements in the pre-training result H, and passing these l elements through the first-type recurrent neural network model yields l forward output results of forward propagation and l backward output results of backward propagation. These l forward output results and l backward output results are the outputs of the first-type recurrent neural network model obtained for that input (a pre-training result H); that is, the l forward output results and the l backward output results are related to that pre-training result H. By analogy, each pre-training result is correspondingly related to l forward output results and l backward output results.
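A minimal sketch of the bidirectional scan, using a plain tanh recurrence with toy weights as a stand-in for the learned GRU cell (the real model's gates and dimensions are not reproduced here):

```python
import math

# Simplified stand-in for a GRU cell: a plain tanh RNN step with fixed toy
# weights. The real model would use learned GRU gates.
def step(x, h, w=0.5, u=0.3):
    return math.tanh(w * x + u * h)

def birnn(seq):
    # Forward pass produces o_f^1 .. o_f^l; a second pass over the reversed
    # input produces o_b^1 .. o_b^l, so the right context is not lost.
    fwd, bwd = [], []
    h = 0.0
    for x in seq:
        h = step(x, h)
        fwd.append(h)
    h = 0.0
    for x in reversed(seq):
        h = step(x, h)
        bwd.append(h)
    return fwd, bwd

fwd, bwd = birnn([1.0, 2.0, 3.0])   # l = 3 positions in, l outputs per direction
```

The two output lists have the same length l as the input, matching the l forward and l backward results described above.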
Step 106, identifying the current user sentence according to the forward output result and the backward output result related to each pre-training result.
In the embodiment of the present application, the output of the first-type recurrent neural network model may be used to represent the relationship between each word of the historical sentences and each word of the current user sentence, and to characterize the padded portion; based on this, the forward output result and the backward output result related to each pre-training result can be analyzed to identify the intention, the dialogue behavior and the word slot of the current user sentence. In the embodiment of the present application, a second-type recurrent neural network model is used for encoding along the context; specifically, it can encode the attention relationship distribution output by the first-type recurrent neural network model, so that the current session semantic frame information is effectively output according to the content of the context, where the semantic frame information includes the intention, the dialogue behavior, and the word slot. That is, the output of the first-type recurrent neural network model described above can be understood as an intermediate result.
In an application scenario, a second-class recurrent neural network model can be used to splice and analyze the forward output result and the backward output result related to each pre-training result to obtain the intention and the dialogue behavior of the current user statement, which are specifically expressed as follows:
C1, splicing the last forward output result and the last backward output result related to the same pre-training result to obtain a splicing result related to each pre-training result;
in the embodiment of the present application, the splicing process specifically includes: and splicing the output result of the forward last neuron and the output result of the backward last neuron of the first-class circular neural network model related to the same pre-training result, wherein the output results are used for representing the full-text understanding condition of the splicing of the current user statement and each historical statement, and the expression of at least one obtained splicing result is as follows:
O_i = [o_f^l ; o_b^l],  i = 1, 2, …, k
through the splicing process, k splicing results are obtained and are represented by the character O; these k splicing results can represent the full-text understanding of each splicing. Taking O_1 as an example, it is obtained by splicing o_f^l and o_b^l, both of which are related to the pre-training result H_1; that is, the above O_1 can be considered to be related to the pre-training result H_1. By analogy, the splicing result related to the pre-training result H_2 is O_2, …, and the splicing result related to the pre-training result H_k is O_k.
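Step C1 can be sketched as follows; the hidden-state values are toy data, and the splicing is plain vector concatenation:

```python
# Sketch of step C1: for each pre-training result, concatenate the last
# forward output o_f^l and the last backward output o_b^l into O_i.
def splice_last(fwd_outputs, bwd_outputs):
    return fwd_outputs[-1] + bwd_outputs[-1]   # vector concatenation

# Toy outputs for one pre-training result: l = 3 positions of
# 2-dimensional hidden states per direction.
fwd_1 = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
bwd_1 = [[0.9, 0.8], [0.7, 0.6], [0.5, 0.4]]
O_1 = splice_last(fwd_1, bwd_1)
```

Repeating this for each of the k pre-training results yields the k splicing results O_1, …, O_k.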
And C2, inputting the splicing results related to the pre-training results into the trained second type of cyclic neural network model, and obtaining the intention and the dialogue behavior of the current user sentence through the training processing of the second type of cyclic neural network model.
In the embodiment of the present application, recognizing the intention and the dialogue behavior generally requires more contextual information than recognizing the word slot, so the RNN model used for recognizing the intention and the dialogue behavior does not share parameters with the RNN model used for recognizing the word slot, and its input also differs from the input used when recognizing the word slot. In the embodiment of the present application, the second-type recurrent neural network model specifically includes two GRU models, denoted as a first GRU model (represented by GRU_G) and a second GRU model (represented by GRU_S); the first GRU model is used for recognizing the intention and the dialogue behavior, and the second GRU model is used for recognizing the word slot. In addition, the second-type recurrent neural network model also includes a BiGRU model, whose parameters differ from those of the BiGRU model constituting the first-type recurrent neural network model; this BiGRU is also used for recognizing the word slot. Each model included in the above second-type recurrent neural network model will be explained below.
Specifically, the second type of recurrent neural network model predicts the intention and the dialogue behavior of the current user statement through at least one concatenation result, which means that the first GRU model predicts the intention and the dialogue behavior of the current user statement through at least one concatenation result, and the process is as follows:
inputting at least one splicing result into the first GRU model yields the output of its last hidden layer, specifically expressed as:
G = GRU_G(O_1, O_2, …, O_{k-1}, O_k)
wherein O denotes the splicing results in step C1; refer to the specific expression of the splicing results in step C1, which is not repeated here. The output G of the hidden layer of the first GRU model is taken as the input of the classification layer for classifying the intention and the dialogue behavior. Considering that both the intention and the dialogue behavior may be multi-label results, the activation function used here is preferably the sigmoid, expressed as:
P_Intent = Sigmoid(U·G)
P_Action = Sigmoid(V·G)
In the above formulas, U and V are both network parameters of the first GRU model, i.e., objects learned during the training process; G is the output of the last hidden layer of the first GRU model given in step C2; P_Intent is the probability of the current user statement hitting each intention; P_Action is the probability of the current user statement hitting each dialogue behavior. Thus, the prediction of the intention and the dialogue behavior of the current user statement may be embodied as: inputting all splicing results into the first GRU model to obtain the output of the last hidden layer in the first GRU model; taking the output of the last hidden layer in the first GRU model as the input of the classification layer related to the intention and the dialogue behavior in the first GRU model, and obtaining the probability that the current user statement output by this classification layer belongs to each intention category and the probability that it belongs to each dialogue behavior category; and determining the intention of the current user sentence according to the probability that the current user sentence belongs to each intention category, and determining the dialogue behavior of the current user sentence according to the probability that the current user sentence belongs to each dialogue behavior category.
For example, a target intention category is determined as the intention of the current user sentence, where the target intention category may be an intention category with the highest probability, or an intention category with a probability higher than a preset intention probability threshold, and is not limited herein; and determining the target dialogue behavior type as the dialogue behavior of the current user statement, where the target dialogue behavior type may be the dialogue behavior type with the highest probability, or may be the dialogue behavior type with the probability higher than a preset dialogue behavior probability threshold, and is not limited herein.
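The multi-label read-out and threshold selection described above can be sketched as follows, with toy weights U and a toy hidden vector G standing in for the learned parameters:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Sketch of P_Intent = Sigmoid(U·G): one sigmoid score per category.
# U and G below are toy values, not learned parameters.
def label_probs(weight_rows, g):
    return [sigmoid(sum(w * x for w, x in zip(row, g))) for row in weight_rows]

def pick_labels(probs, threshold):
    # Keep every category whose probability clears the preset threshold
    # (intents and dialogue behaviors may both be multi-label results).
    return [i for i, p in enumerate(probs) if p > threshold]

G = [1.0, -1.0]
U = [[2.0, 0.0], [0.0, 2.0]]   # two intention categories
probs = label_probs(U, G)
hits = pick_labels(probs, 0.5)
```

The same read-out with a second weight matrix V would give the dialogue-behavior probabilities P_Action.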
In an application scenario, a second type of recurrent neural network model may be used to screen and analyze the forward output result and the backward output result related to each pre-training result to obtain the word slot of the current user sentence, which is specifically represented as:
D1, screening the forward output results and the backward output results related to each pre-training result based on a preset screening condition to obtain at least one screening result;
in this embodiment, when the intelligent device analyzes the word slot of the current user sentence, it may screen the forward output results and backward output results output by the first-type recurrent neural network model and respectively related to each pre-training result, and screen out the attention relationship distribution between the respective words of the current user sentence u_n and the historical conversation. Optionally, when the positions are marked by Boolean vectors, the step D1 specifically includes:
D11, acquiring the Boolean vector of the concatenated sentence associated with the output to be screened, recorded as the to-be-screened Boolean vector, where the output to be screened is a forward output result and a backward output result output by the first-type recurrent neural network model and related to any pre-training result;
D12, performing a cross multiplication operation based on the output to be screened and the Boolean vector;
D13, retaining the elements greater than 0 in the cross multiplication result and removing the elements less than or equal to 0, to obtain the screening result associated with the output to be screened.
In the embodiment of the present application, the output of the first-type recurrent neural network model is described in detail. The output of the first-type recurrent neural network model includes forward output results and backward output results; please refer to fig. 2, which is an output schematic of the first-type recurrent neural network model (BiGRU model). The arrangement order of the forward output results and the backward output results in their respective directions is o_1 to o_l; thus, in effect, the first forward output result o_f^1 and the last backward output result o_b^l are correlated, and by analogy the i-th forward output result is correlated with the (l-i+1)-th backward output result (i is a positive integer not greater than l). That is, the forward output results are arranged in forward order and the backward output results are arranged in reverse order, so the forward output results and the backward output results associated with the same pre-training result can be obtained. On the basis of understanding the output of the first-type recurrent neural network model, the above-mentioned screening process can be illustrated by the following formula:
h_i = B_i · [o_f^i ; o_b^{l-i+1}],  i = 1, 2, …, l, retaining only the elements greater than 0
In the above formula, B represents the Boolean vector of the concatenated sentence associated with the output to be screened, and the output to be screened is a forward output result and a backward output result related to any one pre-training result output by the first-type recurrent neural network model. As can be seen from the formula, after the outputs to be screened (i.e., the forward outputs and the backward outputs associated with the same pre-training result) are obtained, the forward output result and the backward output result associated at each position are spliced to obtain a group of spliced outputs under that output to be screened, and then a cross multiplication operation is performed between the Boolean vector of the concatenated sentence associated with the output to be screened and this group of spliced outputs; the elements greater than 0 in the result of the cross multiplication are retained, and the elements less than or equal to 0 are removed, i.e., only the parts greater than 0 actually participate in the subsequent operations.
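A minimal sketch of steps D11-D13 under the assumption that the Boolean vector uses 1 for current-user positions and 0 elsewhere, so the "cross multiplication" reduces to masking each position and keeping only the results greater than 0:

```python
# Sketch of steps D11-D13: multiply each position's spliced output by the
# matching Boolean-vector element and keep only positions whose masked
# values are greater than 0, so only current-user-sentence positions survive.
# The 1/0 marker convention is an assumption, not mandated by the text.
def screen(spliced_outputs, bool_vec):
    kept = []
    for out, mark in zip(spliced_outputs, bool_vec):
        masked = [mark * v for v in out]
        if any(v > 0 for v in masked):
            kept.append(masked)
    return kept

# Toy spliced outputs [o_f^i ; o_b^(l-i+1)] for l = 3 positions.
outputs = [[0.2, 0.4], [0.6, 0.1], [0.3, 0.9]]
bools = [0, 1, 1]          # 0: history/separator position, 1: current user word
h = screen(outputs, bools)
```

Only the surviving positions h are passed on to the second GRU model in step D2.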
D2, inputting the at least one screening result into a trained second-class recurrent neural network model to obtain a word slot of the current user sentence;
in this embodiment of the present application, predicting the word slot of the current user statement through at least one screening result by the second-type recurrent neural network model means that the second GRU model predicts the word slot of the current user statement through at least one screening result, which may be specifically expressed as:
S_j = GRU_S(h_j)
wherein GRU_G (i.e., the first GRU model) and GRU_S (i.e., the second GRU model) do not share parameters; S_j is the word slot information output by GRU_S for the j-th word of the current user statement, and j satisfies
1 ≤ j ≤ m, where m is the number of words in the current user statement;
h is the screening result in step D1, and only h greater than 0 can participate in the operation in this step (refer to the formula and related explanation used in the screening in step D1, which are not repeated here). Thus, the word slot information of all words in the current user sentence can be obtained, specifically expressed as:
S = (S_1, S_2, …, S_m)
In the above equation, S on the left side of the equation represents the word slots contained in the current user sentence as output by the second GRU model GRU_S. It should be noted that S is not the final word slot prediction result of the current user sentence. Before outputting the final word slot prediction result, the above G (the output of the last hidden layer of the first GRU model, please refer to step C2) and S (the word slot information of all words in the current user sentence obtained through the second GRU model, please refer to step D2) may also be used as the input of a new BiGRU model; the result of the output layer of this BiGRU model is put into a softmax layer, and the final word slot prediction result is output through the new BiGRU model, as follows:
P_j^Slot = Softmax(BiGRU_S(G, S_j))
Through the above P_j^Slot calculation formula, the probability value of each word in the current user sentence hitting its corresponding word slot can be calculated, and after comparing this probability value with a preset word slot probability threshold, whether the word slot corresponding to the j-th word holds can be judged. That is, the above S gives the word slots corresponding to all the words in the current user sentence, and through the P_j^Slot calculation formula it can be known whether the word slot corresponding to a certain word in the current user sentence holds. For example, the word slot corresponding to the first word w_1 in the current user sentence is denoted S_1; after P_1^Slot is calculated, P_1^Slot is compared with the word slot probability threshold. If P_1^Slot is higher than the word slot probability threshold, it can be determined that the word slot corresponding to the first word w_1 in the current user sentence is indeed S_1; if P_1^Slot is not higher than the word slot probability threshold, it can be determined that the first word w_1 in the current user sentence has no corresponding word slot. It can be considered that the word slots that may respectively correspond to each word in the current user sentence are obtained through the second GRU model, and these candidate word slots are then screened through the new BiGRU model.
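The final per-word decision, softmax over slot scores followed by the threshold comparison, can be sketched as follows; the scores, slot names and threshold are illustrative only:

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Sketch of the final slot decision for one word: softmax over toy per-slot
# scores, then comparison against the preset word-slot probability threshold.
# The slot names and threshold value are assumptions for illustration.
def decide_slot(scores, slot_names, threshold):
    probs = softmax(scores)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return slot_names[best] if probs[best] > threshold else None

slot = decide_slot([3.0, 0.5, 0.1], ["city", "date", "O"], threshold=0.6)
```

Returning None models the case where the word has no corresponding word slot because no probability clears the threshold.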
It should be noted that, in the training process of the second-type recurrent neural network model, the cross entropy can be used as the loss function for each of the three tasks, namely the intention prediction, the dialogue behavior prediction and the word slot prediction, and the sum of the losses of the tasks is used as the optimization objective.
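This training objective can be sketched as follows; the target and predicted distributions are toy values, and each task's loss is a plain cross entropy:

```python
import math

# Sketch of the multi-task objective: one cross entropy per task (intention,
# dialogue behavior, word slot), optimized as their sum. All values are toys.
def cross_entropy(target_probs, predicted_probs, eps=1e-12):
    return -sum(t * math.log(p + eps) for t, p in zip(target_probs, predicted_probs))

def multitask_loss(intent, action, slot):
    # Each argument is a (target distribution, predicted distribution) pair.
    return sum(cross_entropy(t, p) for t, p in (intent, action, slot))

intent = ([1.0, 0.0], [0.9, 0.1])
action = ([0.0, 1.0], [0.2, 0.8])
slot   = ([1.0, 0.0, 0.0], [0.7, 0.2, 0.1])
loss = multitask_loss(intent, action, slot)
```

Because the total is a simple sum, gradients from all three tasks flow into the shared parameters during training.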
Referring to fig. 3, fig. 3 illustrates the model framework adopted by the sentence recognition method provided in the embodiment of the present application, which is briefly introduced below:
after obtaining the current user statement u_n and the historical statements u_1, s_1, u_2, s_2, …, u_{n-1}, s_{n-1}, the current user statement u_n is spliced with each historical statement to obtain k spliced statements (and the Boolean vector of each spliced statement, not shown in fig. 3); the k spliced statements (and their Boolean vectors, not shown in fig. 3) are input into the BERT model, and the output of the BERT model is input into the first-type recurrent neural network model (i.e., the BiGRU_u model in the figure). The output of the BiGRU_u model is then processed by the second-type recurrent neural network model (including the first GRU model GRU_G and the second GRU model GRU_S, which do not share parameters, and the BiGRU_S model in fig. 3; it should be noted that the first GRU model GRU_G and the second GRU model GRU_S are not shown separately in fig. 3 — although only one GRU model is shown, it actually adopts two different sets of parameters), so as to obtain the intention and the dialogue behavior of the current user statement u_n (obtained through the first GRU model GRU_G) and the word slots (obtained through the BiGRU_S model).
As can be seen from the above, the sentence recognition method provided in the present application splices the current user sentence with the historical sentences and inputs the spliced sentences into the BERT model to obtain the information of the spliced sentences mined by the BERT model, so that more feature information is mined even when the corpus is small; furthermore, the word-level encoding and the encoding of the context information are split between two types of RNN models, each type of RNN being responsible for encoding one set of content, with a clear division of labor and the required features extracted separately, and finally the intention, the dialogue behavior and the word slot information of the current user sentence are jointly predicted by a single model (namely the second-type RNN model). The prediction process also effectively combines the context information, which improves the efficiency and accuracy of understanding the current user sentence.
Example two
A second embodiment of the present application provides a sentence recognition apparatus, where the sentence recognition apparatus may be integrated in an intelligent device, as shown in fig. 4, a sentence recognition apparatus 400 in the embodiment of the present application includes:
a statement obtaining unit 401, configured to obtain a current user statement and a historical statement set, where the current user statement is a current turn of user statements, the historical statement set is formed by at least one historical statement, and the historical statement includes user statements of each historical turn and system statements of each historical turn;
a sentence splicing unit 402, configured to splice the current user sentence with each historical sentence in the historical sentence set, respectively, to obtain at least one spliced sentence;
a position marking unit 403, configured to mark a position for at least one splicing statement respectively;
a pre-training result obtaining unit 404, configured to input at least one spliced statement with a marked position into a pre-training model, and obtain at least one pre-training result respectively;
a first network output result obtaining unit 405, configured to input the at least one pre-training result into a trained first-class recurrent neural network model, and respectively obtain a forward output result and a backward output result, which are output by the first-class recurrent neural network model and are related to each pre-training result, correspondingly;
and a sentence recognizing unit 406, configured to recognize the current user sentence according to the forward output result and the backward output result related to each pre-training result.
Optionally, the sentence recognizing unit 406 includes:
the output splicing subunit is used for splicing the last forward output result and the last backward output result related to the same pre-training result to obtain splicing results related to each pre-training result;
and the intention and dialogue behavior acquisition subunit is used for inputting the splicing results related to the pre-training results into the trained second type of cyclic neural network model, and obtaining the intention and dialogue behavior of the current user statement through the training processing of the second type of cyclic neural network model.
Optionally, the recurrent neural network model of the second type includes a first GRU model, and the intention and dialog behavior obtaining subunit includes:
a hidden layer output obtaining subunit, configured to input all splicing results into the first GRU model, so as to obtain an output of a last hidden layer in the first GRU model;
a probability obtaining subunit, configured to use an output of a last hidden layer in the first GRU model as an input of a classification layer related to an intention and a dialog behavior in the first GRU model, and obtain a probability that the current user statement output by the classification layer related to the intention and the dialog behavior belongs to each intention category and a probability that the current user statement belongs to each dialog behavior category;
and an intention and dialogue behavior determination subunit, configured to determine an intention of the current user sentence according to a probability that the current user sentence belongs to each intention category, and determine a dialogue behavior of the current user sentence according to a probability that the current user sentence belongs to each dialogue behavior category.
Optionally, the sentence recognizing unit 406 includes:
the output screening subunit is used for screening forward output results and backward output results related to each pre-training result based on preset screening conditions to obtain at least one screening result, wherein one screening result comprises a forward output result and a backward output result related to one pre-training result;
and the word slot obtaining subunit is used for inputting the at least one screening result into the trained second-class recurrent neural network model to obtain the word slot of the current user statement.
Optionally, the sentence splicing unit 402 includes:
a word obtaining subunit, configured to obtain words that constitute the current user statement, and obtain words that constitute to-be-spliced historical statements, where the to-be-spliced historical statement is any one of the historical statements in the historical statement set;
and the sentence splicing subunit is used for splicing each word forming the current user sentence after each word forming the historical sentence to be spliced is spliced through a preset spacer based on the sequence of the historical sentence to be spliced and the current user sentence to obtain the spliced sentence of the current user sentence and the historical sentence to be spliced.
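A minimal sketch of this splicing, assuming "[SEP]" as the preset spacer (the text does not mandate a particular spacer token):

```python
# Sketch of the sentence splicing subunit: the words of the to-be-spliced
# history sentence come first, then a preset spacer, then the words of the
# current user sentence. "[SEP]" is an assumed spacer, not mandated here.
def splice(history_words, current_words, spacer="[SEP]"):
    return history_words + [spacer] + current_words

def splice_all(history_sentences, current_words):
    # One spliced sentence per history sentence, k in total.
    return [splice(h, current_words) for h in history_sentences]

spliced = splice_all([["hello"], ["hi", "there"]], ["weather", "today"])
```

Each element of the result is one spliced sentence ready for position marking.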
Optionally, the position marking unit 403 includes:
a boolean vector initialization unit, configured to initialize a boolean vector based on the to-be-processed concatenation statement, where the to-be-processed concatenation statement is any concatenation statement, and each element in the boolean vector uniquely corresponds to one element in the to-be-processed concatenation statement;
a first value setting subunit, configured to set, as a preset first value, an element, in the to-be-processed concatenation statement, corresponding to an element in a boolean vector, where the element belongs to the current user statement, and the to-be-processed concatenation statement is any concatenation statement;
a second numerical value setting subunit, configured to set, as a preset second numerical value, an element, in the boolean vector, of an element, in the to-be-processed concatenation statement, that belongs to a history statement, and set, as a preset second numerical value, an element, in the boolean vector, of an interval symbol in the to-be-processed concatenation statement;
and the vector editing subunit is used for filling or trimming the length of the Boolean vector to a preset length, taking the filled or trimmed Boolean vector as the Boolean vector corresponding to the splicing statement to be processed, and marking the position of the corresponding splicing statement to be processed through the Boolean vector.
Optionally, the vector editing subunit includes:
a length comparison subunit, configured to compare the length of the boolean vector with a preset length;
a vector clipping subunit, configured to, if the length of the boolean vector is longer than the preset length, delete an element from a last element of the boolean vector until the element of the boolean vector equals the preset length, and obtain the clipped boolean vector;
and the vector filling subunit is used for adding an element backwards from the last element of the Boolean vector if the length of the Boolean vector is shorter than the preset length until the element of the Boolean vector is equal to the preset length to obtain the filled Boolean vector, wherein the value of the added element is a preset second value.
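The position-marking and length-editing steps can be sketched together as follows, assuming 1 as the preset first value and 0 as the preset second value (the text leaves the concrete values open):

```python
# Sketch of the position marking unit: history words and the spacer get the
# second value, current-user words get the first value, and the vector is
# trimmed or padded (with the second value) to the preset length.
# first=1 / second=0 are assumed values, not mandated by the text.
def boolean_vector(n_history, n_current, length, first=1, second=0):
    vec = [second] * (n_history + 1) + [first] * n_current  # +1 for the spacer
    if len(vec) > length:                      # trim from the last element
        vec = vec[:length]
    else:                                      # pad backwards with the second value
        vec = vec + [second] * (length - len(vec))
    return vec

b = boolean_vector(n_history=2, n_current=3, length=8)
```

With this convention, multiplying a spliced output by the vector and keeping values greater than 0 retains exactly the current-user positions, as in the screening step of the method.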
Optionally, the pre-training result obtaining unit 404 is specifically configured to input the to-be-processed joined sentence with the marked position into the pre-training model, and use an output of a last hidden layer in the pre-training model as a pre-training result associated with the to-be-processed joined sentence, where the to-be-processed joined sentence is any joined sentence.
As can be seen from the above, the sentence recognition apparatus provided in the embodiment of the present application splices the current user sentence with the historical sentences and inputs the spliced sentences into the BERT model to obtain the information of the spliced sentences mined by the BERT model, so that more feature information is mined even when the corpus is small; furthermore, the word-level encoding and the encoding of the context information are split between two types of RNN models, each type of RNN being responsible for encoding one set of content, with a clear division of labor and the required features extracted separately, and finally the intention, the dialogue behavior and the word slot information of the current user sentence are jointly predicted by a single model (namely the second-type RNN model). The prediction process also effectively combines the context information, which improves the efficiency and accuracy of understanding the current user sentence.
EXAMPLE III
An embodiment of the present application provides an intelligent device, please refer to fig. 5, where the intelligent device 5 in the embodiment of the present application includes: a memory 501, one or more processors 502 (only one shown in fig. 5), and a computer program stored on the memory 501 and executable on the processors. Wherein: the memory 501 is used for storing software programs and modules, and the processor 502 executes various functional applications and data processing by running the software programs and units stored in the memory 501, so as to acquire resources corresponding to the preset events. Specifically, the processor 502 realizes the following steps by running the above-mentioned computer program stored in the memory 501:
acquiring a current user statement and a historical statement set, wherein the current user statement is a current round of user statements, the historical statement set is composed of at least one historical statement, and the historical statements comprise user statements of each historical round and system statements of each historical round;
splicing the current user statement with each historical statement in the historical statement set respectively to obtain at least one spliced statement;
respectively marking the position for at least one splicing statement;
inputting at least one spliced statement with a marked position into a pre-training model, and respectively and correspondingly obtaining at least one pre-training result;
inputting the at least one pre-training result into a trained first-class cyclic neural network model, and respectively and correspondingly obtaining a forward output result and a backward output result which are output by the first-class cyclic neural network model and are related to each pre-training result;
and identifying the current user sentence according to the forward output result and the backward output result which are related to each pre-training result.
Assuming that the above is the first possible implementation manner, in a second possible implementation manner provided on the basis of the first possible implementation manner, the identifying the current user sentence according to the forward output result and the backward output result associated with each pre-training result includes:
splicing the last forward output result and the last backward output result related to the same pre-training result to obtain a splicing result related to each pre-training result;
and inputting the splicing result associated with each pre-training result into a trained second-class recurrent neural network model, and obtaining the intention and the dialogue behavior of the current user sentence through the processing of the second-class recurrent neural network model.
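As a simplified sketch of the concatenation step described above, plain Python lists stand in for hidden-state vectors; the shapes and variable names are illustrative assumptions, not the patent's actual tensors:

```python
def concat_final_states(forward_states, backward_states):
    """Join the last forward hidden state with the last backward hidden
    state of a bidirectional RNN into one splicing result.

    Both arguments are lists of per-token hidden-state vectors,
    represented here as plain lists of floats for illustration.
    """
    return forward_states[-1] + backward_states[-1]  # vector concatenation

# Forward / backward states for a two-token input (toy values):
fwd = [[0.1, 0.2], [0.3, 0.4]]
bwd = [[0.5, 0.6], [0.7, 0.8]]
print(concat_final_states(fwd, bwd))  # [0.3, 0.4, 0.7, 0.8]
```

One such splicing result is produced per pre-training result, so a dialogue with N historical sentences yields N inputs for the second-class model.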
In a third possible implementation manner provided on the basis of the second possible implementation manner, the second-class recurrent neural network model includes a first GRU model, and the inputting of the concatenation result associated with each pre-training result into the trained second-class recurrent neural network model and obtaining the intention and the dialogue behavior of the current user sentence through the processing of the second-class recurrent neural network model includes:
inputting all splicing results into the first GRU model to obtain the output of the last hidden layer in the first GRU model;
taking the output of the last hidden layer in the first GRU model as the input of the classification layer related to the intention and the dialogue behavior in the first GRU model, and obtaining the probability that the current user statement output by the classification layer related to the intention and the dialogue behavior belongs to each intention category and the probability that the current user statement belongs to each dialogue behavior category;
and determining the intention of the current user sentence according to the probability that the current user sentence belongs to each intention category, and determining the conversation behavior of the current user sentence according to the probability that the current user sentence belongs to each conversation behavior category.
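The two classification heads described above can be sketched as follows; the weight rows, the dot-product scoring, and the softmax normalization are illustrative assumptions standing in for the classification layer of the first GRU model:

```python
import math

def softmax(logits):
    # Normalize raw scores into a probability distribution.
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_intent_and_act(hidden, intent_weights, act_weights):
    # `hidden` stands in for the output of the last hidden layer of the
    # first GRU model; each weight row scores one category.
    score = lambda rows: [sum(w * h for w, h in zip(row, hidden)) for row in rows]
    intent_probs = softmax(score(intent_weights))
    act_probs = softmax(score(act_weights))
    # The predicted intent / dialogue behavior is the category with the
    # highest probability.
    return (intent_probs.index(max(intent_probs)),
            act_probs.index(max(act_probs)))

hidden = [1.0, 0.0]
intents = [[1.0, 0.0], [0.0, 1.0]]   # two toy intent categories
acts = [[0.0, 1.0], [2.0, 0.0]]      # two toy dialogue-behavior categories
print(predict_intent_and_act(hidden, intents, acts))  # (0, 1)
```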
In a fourth possible implementation manner provided on the basis of the first possible implementation manner, the recognizing of the current user sentence according to the forward output result and the backward output result associated with each pre-training result includes:
based on preset screening conditions, screening forward output results and backward output results related to each pre-training result to obtain at least one screening result, wherein one screening result comprises a forward output result and a backward output result related to one pre-training result;
and inputting the at least one screening result into a trained second-class recurrent neural network model to obtain the word slot of the current user sentence.
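One possible screening condition, sketched below, keeps only the positions that belong to the current user sentence; this mask-based condition is an assumption for illustration, since the patent only requires some preset screening condition:

```python
def screen_outputs(forward, backward, keep_mask):
    """Keep the (forward, backward) output pair at every position whose
    mask value is 1 (e.g. positions of the current user sentence); the
    mask-based rule is an illustrative stand-in for the preset condition."""
    return [(f, b) for f, b, m in zip(forward, backward, keep_mask) if m == 1]

fwd = ["f0", "f1", "f2"]     # per-token forward outputs (toy values)
bwd = ["b0", "b1", "b2"]     # per-token backward outputs (toy values)
mask = [0, 1, 1]             # 1 marks a position to keep
print(screen_outputs(fwd, bwd, mask))  # [('f1', 'b1'), ('f2', 'b2')]
```

The screening results are then fed to the second-class model for word-slot labeling, one pair per retained token position.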
In a fifth possible implementation manner based on the first possible implementation manner, or based on the second possible implementation manner, or based on the third possible implementation manner, or based on the fourth possible implementation manner, the concatenating the current user sentence with each of the historical sentences in the historical sentence set to obtain at least one concatenated sentence includes:
acquiring all words and phrases forming the current user sentence;
acquiring all words and phrases forming a historical statement to be spliced, wherein the historical statement to be spliced is any one of the historical statements in the historical statement set;
and splicing the words forming the current user sentence after the words forming the historical sentence to be spliced via a preset spacer, in the order of the historical sentence to be spliced followed by the current user sentence, to obtain a spliced sentence of the current user sentence and the historical sentence to be spliced.
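A minimal sketch of this splicing step follows; the `[SEP]` spacer token is an assumption in the spirit of BERT's separator, while the patent only requires some preset spacer:

```python
SEP = "[SEP]"  # assumed spacer; the patent specifies only a preset spacer

def splice(history_tokens, current_tokens, sep=SEP):
    # Historical sentence first, then the spacer, then the current user
    # sentence, preserving the order described above.
    return history_tokens + [sep] + current_tokens

history = ["what", "is", "the", "weather"]
current = ["and", "tomorrow"]
print(splice(history, current))
# ['what', 'is', 'the', 'weather', '[SEP]', 'and', 'tomorrow']
```

Repeating this for every historical sentence in the set yields the at-least-one spliced sentence of the method.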
In a sixth possible implementation manner provided on the basis of the fifth possible implementation manner, the marking of positions for the at least one spliced sentence respectively includes:
initializing a Boolean vector based on the splicing statement to be processed, wherein the splicing statement to be processed is any splicing statement, and each element in the Boolean vector uniquely corresponds to one element in the splicing statement to be processed;
setting elements in the Boolean vector that correspond to the elements belonging to the current user sentence in the to-be-processed spliced sentence to a preset first numerical value;
setting elements corresponding to the elements belonging to the history statements in the to-be-processed spliced statement in the Boolean vector as preset second numerical values;
setting the element in the Boolean vector that corresponds to the spacer in the to-be-processed spliced sentence to the preset second numerical value;
and filling or trimming the length of the Boolean vector to a preset length, taking the filled or trimmed Boolean vector as the Boolean vector corresponding to the splicing sentence to be processed, and marking the position of the corresponding splicing sentence to be processed by the Boolean vector.
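The position-marking steps above can be sketched as follows, using 1 as the first numerical value (current-user tokens) and 0 as the second numerical value (history tokens and the spacer); the concrete values and the `[SEP]` spacer are assumptions:

```python
FIRST, SECOND = 1, 0  # assumed preset first / second numerical values

def mark_positions(spliced_tokens, sep="[SEP]"):
    # Build a Boolean vector with one element per element of the spliced
    # sentence: history tokens and the spacer take the second value,
    # tokens of the current user sentence take the first value.
    vec, in_current = [], False
    for tok in spliced_tokens:
        if tok == sep:
            vec.append(SECOND)   # the spacer itself takes the second value
            in_current = True    # everything after it is the current sentence
        else:
            vec.append(FIRST if in_current else SECOND)
    return vec

print(mark_positions(["what", "weather", "[SEP]", "and", "tomorrow"]))
# [0, 0, 0, 1, 1]
```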
In a seventh possible implementation manner provided on the basis of the sixth possible implementation manner, the filling or trimming of the length of the Boolean vector to a preset length includes:
comparing the length of the Boolean vector with a preset length;
if the length of the Boolean vector is longer than the preset length, deleting elements forward from the last element of the Boolean vector until the length of the Boolean vector is equal to the preset length, to obtain the trimmed Boolean vector;
and if the length of the Boolean vector is shorter than the preset length, adding new elements backward after the last element of the Boolean vector until the length of the Boolean vector is equal to the preset length, to obtain the filled Boolean vector, wherein the newly added elements take the preset second numerical value.
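These padding and trimming rules can be sketched directly; 0 again stands in for the preset second numerical value:

```python
def pad_or_trim(vec, preset_len, pad_value=0):
    if len(vec) > preset_len:
        # Longer than the preset length: delete from the last element
        # forward until the lengths match.
        return vec[:preset_len]
    # Shorter (or equal): append pad values after the last element until
    # the lengths match; new elements take the preset second value.
    return vec + [pad_value] * (preset_len - len(vec))

print(pad_or_trim([1, 1, 0, 1], 2))  # [1, 1]
print(pad_or_trim([1, 1], 4))        # [1, 1, 0, 0]
```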
In an eighth possible implementation manner provided on the basis of the first possible implementation manner, the second possible implementation manner, the third possible implementation manner, or the fourth possible implementation manner, the inputting of the at least one concatenated sentence with a marked position into the pre-training model to obtain at least one pre-training result respectively includes:
inputting the spliced sentences to be processed with the marked positions into the pre-training model, and taking the output of the last hidden layer in the pre-training model as a pre-training result associated with the spliced sentences to be processed, wherein the spliced sentences to be processed are any spliced sentences.
It should be understood that, in the embodiments of the present application, the processor 502 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 501 may include a read-only memory and a random access memory, and provides instructions and data to the processor 502. Some or all of the memory 501 may also include a non-volatile random access memory. For example, the memory 501 may also store device class information.
As can be seen from the above, the intelligent device proposed in the present application splices the current user sentence onto each historical sentence and inputs the spliced sentences into the BERT model, so that the BERT model mines information from the spliced sentences and more feature information can be extracted even when the corpus is small; furthermore, character-level coding and context-information coding are split between two types of RNN models, each type of RNN is responsible for coding one set of content, with a clear division of labor, so that the required features are extracted separately, and finally the intention, the dialogue behavior and the word slot information of the current user sentence are jointly predicted by a single model (namely, the second type of RNN model). The prediction process also effectively incorporates context information, improving both the efficiency and the accuracy of understanding the current user sentence.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above division into functional units and modules is illustrated; in practical applications, the above functions may be allocated to different functional units and modules as needed, that is, the internal structure of the apparatus may be divided into different functional units or modules to implement all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for convenience of distinguishing them from each other, and are not used to limit the protection scope of the present application. For the specific working processes of the units and modules in the above system, reference may be made to the corresponding processes in the foregoing method embodiments, which are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative units and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or as a combination of computer software and electronic hardware. Whether such functions are performed by hardware or software depends upon the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division into the above modules or units is only one kind of logical functional division, and in actual implementation there may be another division manner, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
The integrated unit, if implemented in the form of a software functional unit and sold or used as an independent product, may be stored in a computer-readable storage medium. Based on such an understanding, all or part of the flow of the methods of the embodiments described above may be implemented by a computer program, which may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the method embodiments described above. The computer program includes computer program code, which may be in a source code form, an object code form, an executable file, some intermediate form, or the like. The computer-readable storage medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer-readable memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable storage medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, in accordance with legislation and patent practice, the computer-readable storage medium does not include electrical carrier signals and telecommunication signals.
The above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (11)

1. A sentence recognition method, comprising:
acquiring a current user statement and a historical statement set, wherein the current user statement is a current round of user statements, the historical statement set is composed of at least one historical statement, and the historical statements comprise user statements of each historical round and system statements of each historical round;
splicing the current user statement with each historical statement in the historical statement set respectively to obtain at least one spliced statement;
respectively marking the position for at least one splicing statement;
inputting at least one spliced statement with a marked position into a pre-training model, and respectively and correspondingly obtaining at least one pre-training result;
inputting the at least one pre-training result into a trained first-class cyclic neural network model, and respectively and correspondingly obtaining a forward output result and a backward output result which are output by the first-class cyclic neural network model and are related to each pre-training result;
and identifying the current user sentence according to the forward output result and the backward output result which are related to each pre-training result.
2. The sentence recognition method of claim 1 wherein the recognizing the current user sentence according to the forward output result and the backward output result associated with each pre-training result comprises:
splicing the last forward output result and the last backward output result related to the same pre-training result to obtain a splicing result related to each pre-training result;
and inputting the splicing results related to the pre-training results into a trained second type of cyclic neural network model, and processing through the second type of cyclic neural network model to obtain the intention and the dialogue behavior of the current user statement.
3. The sentence recognition method of claim 2, wherein the second type of recurrent neural network model includes a first GRU model, and the inputting the concatenation result associated with each pre-training result into the trained second type of recurrent neural network model and obtaining the intention and the dialogue behavior of the current user sentence through processing by the second type of recurrent neural network model includes:
inputting all splicing results into the first GRU model to obtain the output of the last hidden layer in the first GRU model;
taking the output of the last hidden layer in the first GRU model as the input of a classification layer related to the intention and the dialogue behavior in the first GRU model, and obtaining the probability that the current user statement output by the classification layer related to the intention and the dialogue behavior belongs to each intention category and the probability that the current user statement belongs to each dialogue behavior category;
and determining the intention of the current user statement according to the probability that the current user statement belongs to each intention category, and determining the conversation behavior of the current user statement according to the probability that the current user statement belongs to each conversation behavior category.
4. The sentence recognition method of claim 1 wherein the recognizing the current user sentence according to the forward output result and the backward output result associated with each pre-training result comprises:
based on preset screening conditions, screening forward output results and backward output results related to each pre-training result to obtain at least one screening result, wherein one screening result comprises a forward output result and a backward output result related to one pre-training result;
and inputting the at least one screening result into a trained second type of cyclic neural network model, and processing through the second type of cyclic neural network model to obtain a word slot of the current user statement.
5. The sentence identification method of any one of claims 1 to 4, wherein the splicing the current user sentence with each historical sentence in the historical sentence set to obtain at least one spliced sentence comprises:
acquiring all words and phrases forming the current user sentence;
acquiring all words and phrases forming a historical statement to be spliced, wherein the historical statement to be spliced is any one of the historical statements in the historical statement set;
splicing all words forming the current user statement after all words forming the historical sentence to be spliced through a preset spacer based on the sequence of the historical sentence to be spliced and the current user statement to obtain a spliced sentence of the current user statement and the historical sentence to be spliced.
6. The sentence recognition method of claim 5, wherein the marking positions for the at least one spliced sentence, respectively, comprises:
initializing a Boolean vector based on the splicing statement to be processed, wherein the splicing statement to be processed is any splicing statement, and each element in the Boolean vector uniquely corresponds to one element in the splicing statement to be processed;
setting elements corresponding to the elements belonging to the current user statement in the to-be-processed spliced statement in the Boolean vector as preset first numerical values;
setting elements corresponding to the elements belonging to the historical sentences in the to-be-processed spliced sentences in the Boolean vectors as preset second numerical values;
setting the corresponding elements of the spacers in the splicing statement to be processed in the Boolean vector as preset second numerical values;
and filling or trimming the length of the Boolean vector to a preset length, taking the filled or trimmed Boolean vector as the Boolean vector corresponding to the splicing statement to be processed, and marking the position of the corresponding splicing statement to be processed through the Boolean vector.
7. The sentence recognition method of claim 6 wherein the padding or pruning the length of the boolean vector to a preset length comprises:
comparing the length of the Boolean vector with a preset length;
if the length of the Boolean vector is longer than the preset length, deleting elements from the last element of the Boolean vector forward until the elements of the Boolean vector are equal to the preset length to obtain the trimmed Boolean vector;
and if the length of the Boolean vector is shorter than the preset length, newly adding elements from the last element of the Boolean vector backwards until the elements of the Boolean vector are equal to the preset length to obtain the filled Boolean vector, wherein the value of the newly added elements is a preset second numerical value.
8. The sentence recognition method of any one of claims 1 to 4, wherein the inputting of the at least one spliced sentence with the marked position into the pre-training model to obtain at least one pre-training result respectively comprises:
inputting the spliced sentences to be processed with the marked positions into the pre-training model, and taking the output of the last hidden layer in the pre-training model as a pre-training result associated with the spliced sentences to be processed, wherein the spliced sentences to be processed are any spliced sentences.
9. A sentence recognition apparatus, comprising:
the system comprises a statement acquisition unit, a statement acquisition unit and a history statement set, wherein the statement acquisition unit is used for acquiring a current user statement and a history statement set, the current user statement is a current round of user statements, the history statement set is composed of at least one history statement, and the history statement comprises each history round of user statements and each history round of system statements;
the sentence splicing unit is used for splicing the current user sentence with each historical sentence in the historical sentence set respectively to obtain at least one spliced sentence;
the position marking unit is used for marking the position of at least one splicing statement respectively;
the pre-training result acquisition unit is used for inputting at least one spliced statement with a marked position into a pre-training model and respectively and correspondingly obtaining at least one pre-training result;
a first network output result obtaining unit, configured to input the at least one pre-training result into a trained first-class recurrent neural network model, and respectively obtain a forward output result and a backward output result, which are output by the first-class recurrent neural network model and are related to each pre-training result, in a corresponding manner;
and the sentence recognition unit is used for recognizing the current user sentence according to the forward output result and the backward output result which are related to each pre-training result.
10. A smart device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1 to 8 when executing the computer program.
11. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 8.
CN202010340047.8A 2020-04-26 2020-04-26 Statement identification method, statement identification device and intelligent equipment Active CN111563161B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010340047.8A CN111563161B (en) 2020-04-26 2020-04-26 Statement identification method, statement identification device and intelligent equipment

Publications (2)

Publication Number Publication Date
CN111563161A true CN111563161A (en) 2020-08-21
CN111563161B CN111563161B (en) 2023-05-23

Family

ID=72071534

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010340047.8A Active CN111563161B (en) 2020-04-26 2020-04-26 Statement identification method, statement identification device and intelligent equipment

Country Status (1)

Country Link
CN (1) CN111563161B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112464662A (en) * 2020-12-02 2021-03-09 平安医疗健康管理股份有限公司 Medical phrase matching method, device, equipment and storage medium
CN112559715A (en) * 2020-12-24 2021-03-26 北京百度网讯科技有限公司 Attitude identification method, attitude identification device, attitude identification equipment and storage medium
CN114154488A (en) * 2021-12-10 2022-03-08 北京金山数字娱乐科技有限公司 Statement processing method and device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017024884A1 (en) * 2015-08-07 2017-02-16 广州神马移动信息科技有限公司 Search intention identification method and device
WO2018028077A1 (en) * 2016-08-11 2018-02-15 中兴通讯股份有限公司 Deep learning based method and device for chinese semantics analysis

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112464662A (en) * 2020-12-02 2021-03-09 平安医疗健康管理股份有限公司 Medical phrase matching method, device, equipment and storage medium
CN112464662B (en) * 2020-12-02 2022-09-30 深圳平安医疗健康科技服务有限公司 Medical phrase matching method, device, equipment and storage medium
CN112559715A (en) * 2020-12-24 2021-03-26 北京百度网讯科技有限公司 Attitude identification method, attitude identification device, attitude identification equipment and storage medium
CN112559715B (en) * 2020-12-24 2023-09-22 北京百度网讯科技有限公司 Attitude identification method, device, equipment and storage medium
CN114154488A (en) * 2021-12-10 2022-03-08 北京金山数字娱乐科技有限公司 Statement processing method and device

Also Published As

Publication number Publication date
CN111563161B (en) 2023-05-23

Similar Documents

Publication Publication Date Title
CN111259142B (en) Specific target emotion classification method based on attention coding and graph convolution network
CN107885756B (en) Deep learning-based dialogue method, device and equipment
CN111695352A (en) Grading method and device based on semantic analysis, terminal equipment and storage medium
CN112257449B (en) Named entity recognition method and device, computer equipment and storage medium
CN107704482A (en) Method, apparatus and program
CN111563161A (en) Sentence recognition method, sentence recognition device and intelligent equipment
JP6677419B2 (en) Voice interaction method and apparatus
CN111653275A (en) Method and device for constructing voice recognition model based on LSTM-CTC tail convolution and voice recognition method
CN110399472B (en) Interview question prompting method and device, computer equipment and storage medium
CN112528637A (en) Text processing model training method and device, computer equipment and storage medium
CN112417855A (en) Text intention recognition method and device and related equipment
CN110942774A (en) Man-machine interaction system, and dialogue method, medium and equipment thereof
WO2021082695A1 (en) Training method, feature extraction method, apparatus and electronic device
CN112632248A (en) Question answering method, device, computer equipment and storage medium
CN115393933A (en) Video face emotion recognition method based on frame attention mechanism
CN111858878A (en) Method, system and storage medium for automatically extracting answer from natural language text
CN112183106A (en) Semantic understanding method and device based on phoneme association and deep learning
CN109147868A (en) Protein function prediction technique, device, equipment and storage medium
CN109829441B (en) Facial expression recognition method and device based on course learning
CN114694255A (en) Sentence-level lip language identification method based on channel attention and time convolution network
CN115617975B (en) Intention recognition method and device for few-sample multi-turn conversation
CN113435208A (en) Student model training method and device and electronic equipment
CN113065347A (en) Criminal case judgment prediction method, system and medium based on multitask learning
CN111680514A (en) Information processing and model training method, device, equipment and storage medium
CN115482575A (en) Facial expression recognition method based on label distribution learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant