CN111563161A - Sentence recognition method, sentence recognition device and intelligent equipment - Google Patents


Info

Publication number
CN111563161A
Authority
CN
China
Prior art keywords
statement
sentence
current user
historical
result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010340047.8A
Other languages
Chinese (zh)
Other versions
CN111563161B (en)
Inventor
熊为星
马力
熊友军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ubtech Robotics Corp
Original Assignee
Ubtech Robotics Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ubtech Robotics Corp
Priority to CN202010340047.8A
Publication of CN111563161A
Application granted
Publication of CN111563161B
Active legal status
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval of unstructured textual data
    • G06F 16/35 Clustering; Classification
    • G06F 16/353 Clustering; Classification into predefined classes
    • G06F 40/00 Handling natural language data
    • G06F 40/30 Semantic analysis
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)

Abstract

The method splices the current user sentence with historical sentences and inputs the spliced sentences into a pre-training model to obtain the information mined from the spliced sentences by the pre-training model, so that more feature information is mined even when little corpus data is available. Furthermore, the character-level encoding and the encoding of the context information are separated through an RNN model, the required features are extracted, and the current user sentence is recognized based on the extracted features. The recognition process also effectively combines the context information, improving both the efficiency and the accuracy of understanding the current user sentence.

Description

Sentence recognition method, sentence recognition device and intelligent equipment
Technical Field
The present application belongs to the technical field of artificial intelligence, and in particular, relates to a sentence recognition method, a sentence recognition apparatus, an intelligent device, and a computer-readable storage medium.
Background
With the development of technology, human-machine dialogue systems are increasingly widely applied. To realize automatic human-machine dialogue, a computer needs to parse the intention and the dialogue-act category contained in the text input by the user and extract keywords from them, in order to formulate a corresponding reply strategy. In recent years, with the development of deep learning techniques and the improvement of the computing power of computers, deep learning has begun to be applied to human-machine dialogue systems. However, the current representative models for multi-task semantic analysis over multi-turn dialogue still suffer from problems such as low accuracy and cannot meet users' requirements.
Disclosure of Invention
The embodiment of the application provides a sentence identification method, a sentence identification device, an intelligent device and a computer readable storage medium, which can improve the understanding efficiency and the understanding accuracy of the current user sentence.
A first aspect of the present application provides a sentence recognition method, including:
acquiring a current user statement and a historical statement set, wherein the current user statement is a current round of user statements, the historical statement set is composed of at least one historical statement, and the historical statements comprise user statements of each historical round and system statements of each historical round;
splicing the current user statement with each historical statement in the historical statement set respectively to obtain at least one spliced statement;
respectively marking the position for at least one splicing statement;
inputting at least one spliced statement with a marked position into a pre-training model, and respectively and correspondingly obtaining at least one pre-training result;
inputting the at least one pre-training result into a trained first-class recurrent neural network (RNN) model, and respectively and correspondingly obtaining a forward output result and a backward output result which are output by the first-class recurrent neural network model and are related to each pre-training result;
and identifying the current user sentence according to the forward output result and the backward output result which are related to each pre-training result.
A second aspect of the present application provides a sentence recognition apparatus, including:
the system comprises a statement acquisition unit, a statement acquisition unit and a history statement set, wherein the statement acquisition unit is used for acquiring a current user statement and a history statement set, the current user statement is a current round of user statements, the history statement set is composed of at least one history statement, and the history statement comprises each history round of user statements and each history round of system statements;
the sentence splicing unit is used for splicing the current user sentence with each historical sentence in the historical sentence set respectively to obtain at least one spliced sentence;
the position marking unit is used for marking the position of at least one splicing statement respectively;
the pre-training result acquisition unit is used for inputting at least one spliced statement with a marked position into a pre-training model and respectively and correspondingly obtaining at least one pre-training result;
a first network output result obtaining unit, configured to input the at least one pre-training result into a trained first-class recurrent neural network model, and respectively obtain a forward output result and a backward output result, which are output by the first-class recurrent neural network model and are related to each pre-training result, in a corresponding manner;
and the sentence recognition unit is used for recognizing the current user sentence according to the forward output result and the backward output result which are related to each pre-training result.
A third aspect of the present application provides a smart device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the method according to the first aspect when executing the computer program.
A fourth aspect of the present application provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method of the first aspect as described above.
A fifth aspect of the application provides a computer program product comprising a computer program which, when executed by one or more processors, performs the steps of the method as described in the first aspect above.
As can be seen from the above, in the present application, a current user statement and a historical statement set are first obtained, where the current user statement is the current round of user statement, the historical statement set is composed of at least one historical statement, and the historical statements comprise the user statements of each historical round and the system statements of each historical round. The current user statement is then spliced with each historical statement in the historical statement set to obtain at least one spliced statement; positions are marked for the at least one spliced statement, and the at least one spliced statement with marked positions is input into a pre-training model to correspondingly obtain at least one pre-training result. The at least one pre-training result is then input into a trained first-class recurrent neural network model, and a forward output result and a backward output result related to each pre-training result, output by the first-class recurrent neural network model, are correspondingly obtained. Finally, the current user statement is identified according to the forward output result and the backward output result related to each pre-training result. According to this scheme, the current user statement is spliced with the historical statements and then input into the pre-training model to obtain the information mined from the spliced statements by the pre-training model, so that more feature information is mined even when little corpus data is available; furthermore, the character-level encoding and the encoding of the context information are separated through an RNN model, the required features are extracted, and the current user statement is recognized based on the extracted features.
The recognition process also effectively combines the context information, and improves the understanding efficiency and the understanding accuracy of the current user statement. It is understood that the beneficial effects of the second aspect to the fifth aspect can be referred to the related description of the first aspect, and are not described herein again.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments or of the prior art are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the present application, and that those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart illustrating an implementation of a sentence recognition method provided in an embodiment of the present application;
FIG. 2 is a schematic diagram of an output of a first-class recurrent neural network model in a sentence recognition method provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of a model framework adopted by the sentence recognition method provided in the embodiment of the present application;
fig. 4 is a block diagram of a sentence recognition apparatus according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of an intelligent device provided in an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.
Furthermore, in the description of the present application and the appended claims, the terms "first," "second," and the like are used for distinguishing between descriptions and not necessarily for describing or implying relative importance.
Example one
Referring to fig. 1, a sentence recognition method provided in an embodiment of the present application is described below, where the sentence recognition method in the embodiment of the present application includes:
step 101, obtaining a current user statement and a historical statement set;
in the embodiment of the application, the sentence input by the user this time, that is, the current user sentence, may be received first. The current user sentence is an object of the current sentence recognition, that is, the current sentence recognition aims to recognize the intention, the dialogue behavior and the word slot of the current user sentence. Considering that the statements of multiple rounds of conversations are often closely connected, the influence of historical statements on the recognition of the current user statement may be considered, where the historical statements include historical user statements and historical system statements, where the historical user statements refer to user statements of each historical round, the historical system statements refer to system statements of each historical round, the user statements are statements input by a user, and the system statements are statements fed back by the smart device according to the user statements. Specifically, all the historical sentences in the multiple rounds of conversations may be stored in the historical sentence set, so that all the historical sentences in the multiple rounds of conversations may be obtained only by obtaining the historical sentence set.
Assuming that a round of conversation includes a user statement u and a system statement s fed back by the smart device according to the history information, when the current round is the n-th round, the current user statement may be denoted u_n. At this time there are also n-1 historical rounds of user statements (i.e. the 1st, 2nd, 3rd, …, through the (n-1)-th) and n-1 historical rounds of system statements (i.e. the 1st, 2nd, 3rd, …, through the (n-1)-th). For better explanation, the historical statement set may be divided into a historical user statement set and a historical system statement set, denoted C_u and C_s respectively, which are composed as follows:

C_u = {u_1, u_2, …, u_{n-1}}
C_s = {s_1, s_2, …, s_{n-1}}

where C_u is the historical user statement set and C_s is the historical system statement set; together they constitute the historical statement set.
102, splicing the current user statement with each historical statement in the historical statement set respectively to obtain at least one spliced statement;
in the embodiment of the present application, in order to effectively combine context information, after a current user statement and a historical statement set are obtained, the current user statement may be respectively spliced with each historical statement in the historical statement set to obtain at least one spliced statement. In consideration of the richness of the content contained in the multiple rounds of conversations, the current user sentences can be spliced with the historical sentences respectively. Alternatively, the step 102 may be embodied as:
a1, acquiring all words forming the current user sentence;
In the embodiment of the application, word segmentation is performed on the current user statement u_n to obtain each word of u_n. Specifically, assume that the current user statement u_n contains l_{u_n} words; then u_n can be expressed as

u_n = {w_{u_n}^1, w_{u_n}^2, …, w_{u_n}^{l_{u_n}}}

where w is a word constituting a statement, the subscript of w indicates which statement the word belongs to, and the superscript of w indicates the position of the word within that statement. That is, in the above representation of u_n, w_{u_n}^1 is the 1st word forming u_n, w_{u_n}^2 is the 2nd word, and so on, up to w_{u_n}^{l_{u_n}}, the l_{u_n}-th word.
A2, obtaining all words forming the history sentence to be spliced;
In this embodiment of the present application, the history statement to be spliced is any history statement in the history statement set. Word segmentation may be performed on the history statement to be spliced to obtain each word forming it. Similar to step A1, the history statement to be spliced can be represented in the same manner: for example, assume that the history statement to be spliced is s_{n-1} and that s_{n-1} contains l_{s_{n-1}} words; then s_{n-1} can be expressed as

s_{n-1} = {w_{s_{n-1}}^1, w_{s_{n-1}}^2, …, w_{s_{n-1}}^{l_{s_{n-1}}}}
And A3, splicing the words forming the current user statement with the words forming the history statement to be spliced through a preset spacer, in the order of the history statement to be spliced followed by the current user statement, to obtain a spliced statement of the current user statement and the history statement to be spliced.
In the embodiment of the present application, each splicing operation places the current user statement u_n after a history statement; that is, splicing proceeds in the order of the history statement to be spliced followed by the current user statement, so that in the resulting spliced statement the history statement comes first and the current user statement comes after. During splicing, a spacer [SEP] is inserted between the two statements. Taking the case where the history statement to be spliced with the current user statement u_n is s_{n-1} as an example, the resulting spliced statement is:

C_{s_{n-1}, u_n} = {w_{s_{n-1}}^1, …, w_{s_{n-1}}^{l_{s_{n-1}}}, [SEP], w_{u_n}^1, …, w_{u_n}^{l_{u_n}}}

where C is the spliced statement and its subscript indicates which two statements it is formed from. As this example shows, the words forming the history statement to be spliced come first, the words forming the current user statement u_n come after, and the two are separated by the spacer [SEP].
It should be noted that, since there are n-1 historical user statements and n-1 historical system statements in the multi-turn conversation, each historical user statement and each historical system statement can serve as the history statement to be spliced with the current user statement u_n, so that k spliced statements can finally be obtained, where k = 2(n-1).
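As an illustrative sketch of steps A1 to A3 (this code and its helper names are assumptions, not from the patent), the splicing can be written in a few lines of Python, assuming the sentences have already been segmented into word lists:

```python
# Illustrative sketch of steps A1-A3 (all names are assumptions).
SEP = "[SEP]"  # the preset spacer

def splice(history_words, current_user_words):
    """Step A3: history words first, then the spacer, then the current-user words."""
    return history_words + [SEP] + current_user_words

def splice_all(history_user, history_system, current_user_words):
    """Splice the current user sentence with every history sentence,
    yielding k = 2 * (n - 1) spliced sentences."""
    return [splice(h, current_user_words)
            for h in history_user + history_system]

# Toy dialogue with n = 3 rounds, so k = 4 spliced sentences.
C_u = [["hello"], ["play", "music"]]        # historical user sentences u_1, u_2
C_s = [["hi", "there"], ["which", "song"]]  # historical system sentences s_1, s_2
u_n = ["play", "jazz"]                      # current user sentence
spliced = splice_all(C_u, C_s, u_n)
print(len(spliced))   # 4
print(spliced[0])     # ['hello', '[SEP]', 'play', 'jazz']
```

Each of the 2(n-1) history sentences yields one spliced sentence, with the history words first and the current-user words after the [SEP] spacer, matching the order described in step A3.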
103, marking the position for at least one splicing statement respectively;
in the embodiment of the present application, a position needs to be marked for each concatenation statement, specifically, which part of content in the concatenation statement comes from the historical statement and which part of content comes from the current user statement. In some embodiments, the operation of marking the position may be implemented by using a boolean vector, that is, after k concatenation statements are obtained, corresponding boolean vectors may be generated for the respective concatenation statements, so as to mark the position of the respective concatenation statements by the corresponding boolean vectors. Alternatively, the step 103 may be embodied as:
b1, initializing a Boolean vector based on the splicing statement to be processed;
In the embodiment of the present application, the splicing statement to be processed is any spliced statement. For the splicing statement to be processed, a Boolean vector may be initialized, where the number of elements of this Boolean vector is the same as the number of elements of the splicing statement to be processed, so that each element in the Boolean vector corresponds to exactly one element in the splicing statement to be processed. For example, the spliced statement obtained from the current user statement u_n and the history statement to be spliced s_{n-1} contains the following elements: the l_{s_{n-1}} words forming s_{n-1}, the l_{u_n} words forming u_n, and one spacer [SEP]. Based on this, the spliced statement obtained from u_n and s_{n-1} contains l_{s_{n-1}} + l_{u_n} + 1 elements, and the correspondingly generated Boolean vector also contains l_{s_{n-1}} + l_{u_n} + 1 elements, each of which is initialized to a default value, where the default value may be 0.
B2, setting the elements in the Boolean vector that correspond to the elements belonging to the current user statement in the splicing statement to be processed to a preset first numerical value, wherein the splicing statement to be processed is any spliced statement;
In the embodiment of the present application, marking the position is realized by assigning values to the elements of the Boolean vector. The elements in the Boolean vector corresponding to the elements of the splicing statement to be processed that belong to the current user statement are set to a preset first numerical value. Specifically, since the last l_{u_n} elements of the Boolean vector correspond to the words of the current user statement, the values of the last l_{u_n} elements of the Boolean vector can be set to the preset first numerical value. Specifically, the first numerical value may be 1.
B3, setting the elements in the Boolean vector that correspond to the elements belonging to the history statement in the splicing statement to be processed to a preset second numerical value;
b4, setting the element in the Boolean vector that corresponds to the spacer in the splicing statement to be processed to the preset second numerical value;
In the embodiment of the present application, the values of all elements of the Boolean vector other than the last l_{u_n} elements are set to a preset second numerical value. Specifically, the second numerical value may be 0. Taking the spliced statement C_{s_{n-1}, u_n} obtained from the current user statement u_n and the history statement to be spliced s_{n-1} as the splicing statement to be processed, the Boolean vector generated for it has its first l_{s_{n-1}} + 1 elements set to "0" and its last l_{u_n} elements set to "1"; that is, the Boolean vector is specifically

{0, 0, …, 0, 1, 1, …, 1}

with l_{s_{n-1}} + 1 zeros followed by l_{u_n} ones.
And B5, filling or trimming the lengths of the Boolean vectors to preset lengths, and taking the filled or trimmed Boolean vectors as the Boolean vectors corresponding to the splicing sentences to be processed.
In the embodiment of the present application, it is considered that in daily dialogue the length of the sentence input by the user is not fixed, and neither is the sentence fed back by the system of the smart device; that is, the lengths of the current user statement and of each history statement differ, so the Boolean vectors generated for the spliced statements according to steps B1 to B4 also vary in length. For convenience of subsequent operations, the generated Boolean vector is filled or trimmed to a preset length. The preset length may be 64 or another value; it may be set by a developer and is not limited here. Optionally, the step B5 may be embodied as:
b51, comparing the length of the Boolean vector with a preset length;
in this embodiment of the present application, the length of the boolean vector generated according to the to-be-processed concatenation statement may be compared with a preset length, and whether to perform the filling processing or the trimming processing may be determined according to a comparison result.
B52, if the length of the Boolean vector is longer than the preset length, deleting elements from the last element of the Boolean vector forward until the number of elements of the Boolean vector equals the preset length, to obtain the trimmed Boolean vector;
in the embodiment of the present application, when the length of the boolean vector is found to be longer than the preset length, the boolean vector is considered to be too long, and the pruning process is required at this time. The pruning treatment is specifically to delete the last element of the boolean vector and sequentially delete the elements forward until the elements of the boolean vector are equal to the preset length, so as to obtain the pruned boolean vector; that is, each time an element is deleted, the last element is deleted. For better explanation of the clipping process, only by way of example, assuming that the preset length is 16, and assuming that the generated boolean vector is {0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1}, the clipped resultant boolean vector is {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1 }.
And B53, if the length of the Boolean vector is shorter than the preset length, adding elements backward from the last element of the Boolean vector until the number of elements of the Boolean vector equals the preset length, to obtain the filled Boolean vector, wherein the value of each added element is the preset second numerical value.
In the embodiment of the present application, when the length of the Boolean vector is found to be shorter than the preset length, the Boolean vector is considered too short and filling is required. The filling adds new elements backward from the last element of the Boolean vector until the number of elements of the Boolean vector equals the preset length, thereby obtaining the filled Boolean vector; that is, each addition is performed after the current last element. To better illustrate the filling process, by way of example only, assume the preset length is 16 and the generated Boolean vector is {0, 0, 0, 0, 1, 1}; the filled Boolean vector is then {0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}. Optionally, for the k obtained spliced statements and their corresponding Boolean vectors, the embodiment of the present application successively uses two different types of recurrent neural network models for processing.
It should be noted that, if the length of the boolean vector is equal to the preset length, the generated boolean vector does not need to be filled or pruned.
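The Boolean-vector marking and length normalization of steps B1 to B5 can be sketched as follows. This is one illustrative reading (the function and variable names are assumptions), taking the default and second values as 0, the first value as 1, and a preset length of 16 for the examples:

```python
def mark_positions(spliced, n_current_words, preset_length):
    """Steps B1-B5 as sketched here: 0 for history words and the [SEP]
    spacer, 1 for the words of the current user sentence, then trim or
    pad the vector to preset_length."""
    vec = [0] * len(spliced)                  # B1: initialize every element to 0
    for i in range(len(spliced) - n_current_words, len(spliced)):
        vec[i] = 1                            # B2: current-user elements -> 1
    # B3/B4 are already satisfied by the 0 initialization.
    if len(vec) > preset_length:
        vec = vec[:preset_length]             # B52: delete elements from the end
    else:
        vec += [0] * (preset_length - len(vec))  # B53: pad with the second value
    return vec

# Trimming: an 18-element spliced sentence whose last 6 words are the
# current user sentence, normalized to a preset length of 16.
trimmed = mark_positions(["w"] * 18, 6, 16)
print(trimmed)   # 12 zeros followed by 4 ones

# Filling: a 6-element spliced sentence whose last 2 words are the
# current user sentence, padded out to 16 elements.
padded = mark_positions(["w"] * 6, 2, 16)
print(padded)    # [0, 0, 0, 0, 1, 1] followed by 10 zeros
```

Note that trimming from the end removes marks for the current user sentence first, which is why the trimmed example keeps only 4 of its 6 ones.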
104, inputting at least one spliced statement with a marked position into a pre-training model, and respectively and correspondingly obtaining at least one pre-training result;
in the embodiment of the present application, before the spliced statements with marked positions are processed by the RNN model, the spliced statements and their corresponding Boolean vectors are first processed by a pre-training model, which is specifically a pre-trained BERT (Bidirectional Encoder Representations from Transformers) model. A splicing statement to be processed with its position marked (when a Boolean vector is used to mark the position, this means the splicing statement to be processed together with its corresponding Boolean vector) may be input into the BERT model, and the output of the last hidden layer of the BERT model is then taken as the pre-training result associated with that splicing statement, where, as before, the splicing statement to be processed is any spliced statement. For convenience of description, taking Boolean-vector position marking and the BERT model as the pre-training model as an example, the pre-training result is denoted H in the embodiment of the present application, as shown below:
H_1 = BERT(C_1, B_1), H_2 = BERT(C_2, B_2), …, H_k = BERT(C_k, B_k)

In the above formula, H_1 is the output of the last hidden layer obtained after the spliced statement C_1 and its corresponding Boolean vector B_1 are input into the BERT model; that is, H_1 is obtained from the spliced statement C_1 and its corresponding Boolean vector B_1, and the remaining pre-training results H in the formula are obtained by analogy. Since there are k spliced statements, there are also k pre-training results H, denoted H_1, H_2, …, H_k respectively.
Step 105, inputting the at least one pre-training result into a trained first-class recurrent neural network model, and respectively and correspondingly obtaining a forward output result and a backward output result which are output by the first-class recurrent neural network model and are related to each pre-training result;
in the embodiment of the present application, the session information of each spliced statement can be expressed through the strong characterization capability of a pre-training model such as the BERT model; however, since the parameters in the pre-training model are fixed, it cannot express more specific relationship information of the spliced statement. Based on this, the first-type RNN, that is, the first-type recurrent neural network model, is introduced in the embodiment of the present application to obtain the correlation between each word of the current user statement and each word of the historical statements constituting the spliced statement. By encoding along the words with the first-type recurrent neural network model, the attention relationship distribution between each word of the current user statement and each historical statement can be obtained. The first-type recurrent neural network model adopts a bidirectional RNN (BiRNN), specifically a BiGRU, so that its output includes a forward output result and a backward output result and no following (right-context) information is lost. Specifically, the output results of the encoding are as follows:
(o_f^1, o_f^2, …, o_f^l), (o_b^1, o_b^2, …, o_b^l) = BiGRU(H)
in the above equation, o is an output result of the first-type recurrent neural network model (i.e., the BiGRU); the subscript f indicates an output result calculated by forward propagation (i.e., a forward output result), the subscript b indicates an output result calculated by backward propagation (i.e., a backward output result), and l is the preset length (i.e., the length of the clipped or padded Boolean vector). Taking any pre-training result as the input of the first-type recurrent neural network model as an example: since the length of the Boolean vector of the concatenated sentence is limited to l, there are only l elements in the pre-training result H, and passing these l elements through the first-type recurrent neural network model yields l forward output results of forward propagation and l backward output results of backward propagation. These l forward output results and l backward output results are the outputs of the first-type recurrent neural network model obtained for that input (a pre-training result H); that is, the l forward output results and the l backward output results are related to that pre-training result H. By analogy, each pre-training result is correspondingly related to l forward output results and l backward output results.
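A minimal sketch of the bidirectional scan, using a plain tanh recurrence with toy weights as a stand-in for the learned GRU cell (the real model's gates and dimensions are not reproduced here):

```python
import math

# Simplified stand-in for a GRU cell: a plain tanh RNN step with fixed toy
# weights. The real model would use learned GRU gates.
def step(x, h, w=0.5, u=0.3):
    return math.tanh(w * x + u * h)

def birnn(seq):
    # Forward pass produces o_f^1 .. o_f^l; a second pass over the reversed
    # input produces o_b^1 .. o_b^l, so the right context is not lost.
    fwd, bwd = [], []
    h = 0.0
    for x in seq:
        h = step(x, h)
        fwd.append(h)
    h = 0.0
    for x in reversed(seq):
        h = step(x, h)
        bwd.append(h)
    return fwd, bwd

fwd, bwd = birnn([1.0, 2.0, 3.0])   # l = 3 positions in, l outputs per direction
```

The two output lists have the same length l as the input, matching the l forward and l backward results described above.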
Step 106, identifying the current user sentence according to the forward output result and the backward output result related to each pre-training result.
In the embodiment of the present application, the output of the first-type recurrent neural network model may be used to represent the relationship between each word of the historical sentences and each word of the current user sentence, and to characterize the padded portion; based on this, the forward output result and the backward output result related to each pre-training result can be analyzed to identify the intention, the dialogue behavior and the word slot of the current user sentence. In the embodiment of the present application, a second-type recurrent neural network model is used for encoding along the context; specifically, it can encode the attention relationship distribution output by the first-type recurrent neural network model, so that the current session semantic frame information is effectively output according to the content of the context, where the semantic frame information includes the intention, the dialogue behavior, and the word slot. That is, the output of the first-type recurrent neural network model described above can be understood as an intermediate result.
In an application scenario, a second-class recurrent neural network model can be used to splice and analyze the forward output result and the backward output result related to each pre-training result to obtain the intention and the dialogue behavior of the current user statement, which are specifically expressed as follows:
C1, splicing the last forward output result and the last backward output result related to the same pre-training result to obtain a splicing result related to each pre-training result;
in the embodiment of the present application, the splicing process specifically includes: and splicing the output result of the forward last neuron and the output result of the backward last neuron of the first-class circular neural network model related to the same pre-training result, wherein the output results are used for representing the full-text understanding condition of the splicing of the current user statement and each historical statement, and the expression of at least one obtained splicing result is as follows:
O_i = [o_f^l ; o_b^l],  i = 1, 2, …, k
through the splicing process, k splicing results are obtained and are represented by the character O; these k splicing results can represent the full-text understanding of each splicing. Taking O_1 as an example, it is obtained by splicing o_f^l and o_b^l, both of which are related to the pre-training result H_1; that is, the above O_1 can be considered to be related to the pre-training result H_1. By analogy, the splicing result related to the pre-training result H_2 is O_2, …, and the splicing result related to the pre-training result H_k is O_k.
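Step C1 can be sketched as follows; the hidden-state values are toy data, and the splicing is plain vector concatenation:

```python
# Sketch of step C1: for each pre-training result, concatenate the last
# forward output o_f^l and the last backward output o_b^l into O_i.
def splice_last(fwd_outputs, bwd_outputs):
    return fwd_outputs[-1] + bwd_outputs[-1]   # vector concatenation

# Toy outputs for one pre-training result: l = 3 positions of
# 2-dimensional hidden states per direction.
fwd_1 = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
bwd_1 = [[0.9, 0.8], [0.7, 0.6], [0.5, 0.4]]
O_1 = splice_last(fwd_1, bwd_1)
```

Repeating this for each of the k pre-training results yields the k splicing results O_1, …, O_k.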
And C2, inputting the splicing results related to the pre-training results into the trained second type of cyclic neural network model, and obtaining the intention and the dialogue behavior of the current user sentence through the training processing of the second type of cyclic neural network model.
In the embodiment of the present application, recognizing the intention and the dialogue behavior generally requires more contextual information than recognizing the word slot, so the RNN model used for recognizing the intention and the dialogue behavior does not share parameters with the RNN model used for recognizing the word slot, and its input also differs from the input used when recognizing the word slot. In the embodiment of the present application, the second-type recurrent neural network model specifically includes two GRU models, denoted as a first GRU model (represented by GRU_G) and a second GRU model (represented by GRU_S); the first GRU model is used for recognizing the intention and the dialogue behavior, and the second GRU model is used for recognizing the word slot. In addition, the second-type recurrent neural network model also includes a BiGRU model, whose parameters differ from those of the BiGRU model constituting the first-type recurrent neural network model; this BiGRU is also used for recognizing the word slot. Each model included in the above second-type recurrent neural network model will be explained below.
Specifically, the second type of recurrent neural network model predicts the intention and the dialogue behavior of the current user statement through at least one concatenation result, which means that the first GRU model predicts the intention and the dialogue behavior of the current user statement through at least one concatenation result, and the process is as follows:
inputting at least one splicing result into the first GRU model yields the output of its last hidden layer, specifically expressed as:
G = GRU_G(O_1, O_2, …, O_{k-1}, O_k)
wherein O denotes the splicing results in step C1; refer to the specific expression of the splicing results in step C1, which is not repeated here. The output G of the hidden layer of the first GRU model is taken as the input of the classification layer for classifying the intention and the dialogue behavior. Considering that both the intention and the dialogue behavior may be multi-label results, the activation function used here is preferably the sigmoid, expressed as:
P_Intent = Sigmoid(U·G)
P_Action = Sigmoid(V·G)
In the above formulas, U and V are both network parameters of the first GRU model, i.e., objects learned during the training process; G is the output of the last hidden layer of the first GRU model given in step C2; P_Intent is the probability of the current user statement hitting each intention; P_Action is the probability of the current user statement hitting each dialogue behavior. Thus, the prediction of the intention and the dialogue behavior of the current user statement may be embodied as: inputting all splicing results into the first GRU model to obtain the output of the last hidden layer in the first GRU model; taking the output of the last hidden layer in the first GRU model as the input of the classification layer related to the intention and the dialogue behavior in the first GRU model, and obtaining the probability that the current user statement output by this classification layer belongs to each intention category and the probability that it belongs to each dialogue behavior category; and determining the intention of the current user sentence according to the probability that the current user sentence belongs to each intention category, and determining the dialogue behavior of the current user sentence according to the probability that the current user sentence belongs to each dialogue behavior category.
For example, a target intention category is determined as the intention of the current user sentence, where the target intention category may be an intention category with the highest probability, or an intention category with a probability higher than a preset intention probability threshold, and is not limited herein; and determining the target dialogue behavior type as the dialogue behavior of the current user statement, where the target dialogue behavior type may be the dialogue behavior type with the highest probability, or may be the dialogue behavior type with the probability higher than a preset dialogue behavior probability threshold, and is not limited herein.
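The multi-label read-out and threshold selection described above can be sketched as follows, with toy weights U and a toy hidden vector G standing in for the learned parameters:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Sketch of P_Intent = Sigmoid(U·G): one sigmoid score per category.
# U and G below are toy values, not learned parameters.
def label_probs(weight_rows, g):
    return [sigmoid(sum(w * x for w, x in zip(row, g))) for row in weight_rows]

def pick_labels(probs, threshold):
    # Keep every category whose probability clears the preset threshold
    # (intents and dialogue behaviors may both be multi-label results).
    return [i for i, p in enumerate(probs) if p > threshold]

G = [1.0, -1.0]
U = [[2.0, 0.0], [0.0, 2.0]]   # two intention categories
probs = label_probs(U, G)
hits = pick_labels(probs, 0.5)
```

The same read-out with a second weight matrix V would give the dialogue-behavior probabilities P_Action.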
In an application scenario, a second type of recurrent neural network model may be used to screen and analyze the forward output result and the backward output result related to each pre-training result to obtain the word slot of the current user sentence, which is specifically represented as:
D1, screening the forward output results and the backward output results related to each pre-training result based on a preset screening condition to obtain at least one screening result;
in this embodiment, when the intelligent device analyzes the word slot of the current user sentence, it may screen the forward output results and backward output results output by the first-type recurrent neural network model and respectively related to each pre-training result, and screen out the attention relationship distribution between the respective words of the current user sentence u_n and the historical conversation. Optionally, when the positions are marked by Boolean vectors, the step D1 specifically includes:
D11, acquiring the Boolean vector of the concatenated sentence associated with the output to be screened, recorded as the to-be-screened Boolean vector, where the output to be screened is a forward output result and a backward output result output by the first-type recurrent neural network model and related to any pre-training result;
D12, performing a cross multiplication operation based on the output to be screened and the Boolean vector;
D13, retaining the elements greater than 0 in the cross multiplication result and removing the elements less than or equal to 0, to obtain the screening result associated with the output to be screened.
In the embodiment of the present application, the output of the first-type recurrent neural network model is described in detail. The output of the first-type recurrent neural network model includes forward output results and backward output results; please refer to fig. 2, which is an output schematic of the first-type recurrent neural network model (BiGRU model). The arrangement order of the forward output results and the backward output results in their respective directions is o_1 to o_l; thus, in effect, the first forward output result o_f^1 and the last backward output result o_b^l are correlated, and by analogy the i-th forward output result is correlated with the (l-i+1)-th backward output result (i is a positive integer not greater than l). That is, the forward output results are arranged in forward order and the backward output results are arranged in reverse order, so the forward output results and the backward output results associated with the same pre-training result can be obtained. On the basis of understanding the output of the first-type recurrent neural network model, the above-mentioned screening process can be illustrated by the following formula:
h_i = B_i · [o_f^i ; o_b^{l-i+1}],  i = 1, 2, …, l, retaining only the elements greater than 0
In the above formula, B represents the Boolean vector of the concatenated sentence associated with the output to be screened, and the output to be screened is a forward output result and a backward output result related to any one pre-training result output by the first-type recurrent neural network model. As can be seen from the formula, after the outputs to be screened (i.e., the forward outputs and the backward outputs associated with the same pre-training result) are obtained, the forward output result and the backward output result associated at each position are spliced to obtain a group of spliced outputs under that output to be screened, and then a cross multiplication operation is performed between the Boolean vector of the concatenated sentence associated with the output to be screened and this group of spliced outputs; the elements greater than 0 in the result of the cross multiplication are retained, and the elements less than or equal to 0 are removed, i.e., only the parts greater than 0 actually participate in the subsequent operations.
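A minimal sketch of steps D11-D13 under the assumption that the Boolean vector uses 1 for current-user positions and 0 elsewhere, so the "cross multiplication" reduces to masking each position and keeping only the results greater than 0:

```python
# Sketch of steps D11-D13: multiply each position's spliced output by the
# matching Boolean-vector element and keep only positions whose masked
# values are greater than 0, so only current-user-sentence positions survive.
# The 1/0 marker convention is an assumption, not mandated by the text.
def screen(spliced_outputs, bool_vec):
    kept = []
    for out, mark in zip(spliced_outputs, bool_vec):
        masked = [mark * v for v in out]
        if any(v > 0 for v in masked):
            kept.append(masked)
    return kept

# Toy spliced outputs [o_f^i ; o_b^(l-i+1)] for l = 3 positions.
outputs = [[0.2, 0.4], [0.6, 0.1], [0.3, 0.9]]
bools = [0, 1, 1]          # 0: history/separator position, 1: current user word
h = screen(outputs, bools)
```

Only the surviving positions h are passed on to the second GRU model in step D2.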
D2, inputting the at least one screening result into a trained second-class recurrent neural network model to obtain a word slot of the current user sentence;
in this embodiment of the present application, predicting the word slot of the current user statement through at least one screening result by the second-type recurrent neural network model means that the second GRU model predicts the word slot of the current user statement through at least one screening result, which may be specifically expressed as:
S_j = GRU_S(h_j)
wherein GRU_G (i.e., the first GRU model) and GRU_S (i.e., the second GRU model) do not share parameters; S_j is the word slot information output by GRU_S for the j-th word of the current user statement, and j satisfies
1 ≤ j ≤ m, where m is the number of words in the current user statement;
h is the screening result in step D1, and only h greater than 0 can participate in the operation in this step (refer to the formula and related explanation used in the screening in step D1, which are not repeated here). Thus, the word slot information of all words in the current user sentence can be obtained, specifically expressed as:
S = (S_1, S_2, …, S_m)
In the above equation, S on the left side of the equation represents the word slots contained in the current user sentence as output by the second GRU model GRU_S. It should be noted that S is not the final word slot prediction result of the current user sentence. Before outputting the final word slot prediction result, the above G (the output of the last hidden layer of the first GRU model, please refer to step C2) and S (the word slot information of all words in the current user sentence obtained through the second GRU model, please refer to step D2) may also be used as the input of a new BiGRU model; the result of the output layer of this BiGRU model is put into a softmax layer, and the final word slot prediction result is output through the new BiGRU model, as follows:
P_j^Slot = Softmax(BiGRU_S(G, S_j))
Through the above P_j^Slot calculation formula, the probability value of each word in the current user sentence hitting its corresponding word slot can be calculated, and after comparing this probability value with a preset word slot probability threshold, whether the word slot corresponding to the j-th word holds can be judged. That is, the above S gives the word slots corresponding to all the words in the current user sentence, and through the P_j^Slot calculation formula it can be known whether the word slot corresponding to a certain word in the current user sentence holds. For example, the word slot corresponding to the first word w_1 in the current user sentence is denoted S_1; after P_1^Slot is calculated, P_1^Slot is compared with the word slot probability threshold. If P_1^Slot is higher than the word slot probability threshold, it can be determined that the word slot corresponding to the first word w_1 in the current user sentence is indeed S_1; if P_1^Slot is not higher than the word slot probability threshold, it can be determined that the first word w_1 in the current user sentence has no corresponding word slot. It can be considered that the word slots that may respectively correspond to each word in the current user sentence are obtained through the second GRU model, and these candidate word slots are then screened through the new BiGRU model.
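The final per-word decision, softmax over slot scores followed by the threshold comparison, can be sketched as follows; the scores, slot names and threshold are illustrative only:

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Sketch of the final slot decision for one word: softmax over toy per-slot
# scores, then comparison against the preset word-slot probability threshold.
# The slot names and threshold value are assumptions for illustration.
def decide_slot(scores, slot_names, threshold):
    probs = softmax(scores)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return slot_names[best] if probs[best] > threshold else None

slot = decide_slot([3.0, 0.5, 0.1], ["city", "date", "O"], threshold=0.6)
```

Returning None models the case where the word has no corresponding word slot because no probability clears the threshold.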
It should be noted that, in the training process of the second-type recurrent neural network model, the cross entropy can be used as the loss function for each of the three tasks, namely the intention prediction, the dialogue behavior prediction and the word slot prediction, and the sum of the losses of the tasks is used as the optimization objective.
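This training objective can be sketched as follows; the target and predicted distributions are toy values, and each task's loss is a plain cross entropy:

```python
import math

# Sketch of the multi-task objective: one cross entropy per task (intention,
# dialogue behavior, word slot), optimized as their sum. All values are toys.
def cross_entropy(target_probs, predicted_probs, eps=1e-12):
    return -sum(t * math.log(p + eps) for t, p in zip(target_probs, predicted_probs))

def multitask_loss(intent, action, slot):
    # Each argument is a (target distribution, predicted distribution) pair.
    return sum(cross_entropy(t, p) for t, p in (intent, action, slot))

intent = ([1.0, 0.0], [0.9, 0.1])
action = ([0.0, 1.0], [0.2, 0.8])
slot   = ([1.0, 0.0, 0.0], [0.7, 0.2, 0.1])
loss = multitask_loss(intent, action, slot)
```

Because the total is a simple sum, gradients from all three tasks flow into the shared parameters during training.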
Referring to fig. 3, fig. 3 illustrates the model framework adopted by the sentence recognition method provided in the embodiment of the present application, which is briefly introduced below:
after obtaining the current user statement u_n and the historical statements u_1, s_1, u_2, s_2, …, u_{n-1}, s_{n-1}, the current user statement u_n is spliced with each historical statement to obtain k spliced statements (and the Boolean vector of each spliced statement, not shown in fig. 3); the k spliced statements (and their Boolean vectors, not shown in fig. 3) are input into the BERT model, and the output of the BERT model is input into the first-type recurrent neural network model (i.e., the BiGRU_u model in the figure). The output of the BiGRU_u model is then processed by the second-type recurrent neural network model (including the first GRU model GRU_G and the second GRU model GRU_S, which do not share parameters, and the BiGRU_S model in fig. 3; it should be noted that the first GRU model GRU_G and the second GRU model GRU_S are not shown separately in fig. 3 — although only one GRU model is shown, it actually adopts two different sets of parameters), so as to obtain the intention and the dialogue behavior of the current user statement u_n (obtained through the first GRU model GRU_G) and the word slots (obtained through the BiGRU_S model).
As can be seen from the above, the sentence recognition method provided in the present application splices the current user sentence with the historical sentences and inputs the spliced sentences into the BERT model to obtain the information of the spliced sentences mined by the BERT model, so that more feature information is mined even when the corpus is small; furthermore, the word-level encoding and the encoding of the context information are split between two types of RNN models, each type of RNN being responsible for encoding one set of content, with a clear division of labor and the required features extracted separately, and finally the intention, the dialogue behavior and the word slot information of the current user sentence are jointly predicted by a single model (namely the second-type RNN model). The prediction process also effectively combines the context information, which improves the efficiency and accuracy of understanding the current user sentence.
Example two
A second embodiment of the present application provides a sentence recognition apparatus, where the sentence recognition apparatus may be integrated in an intelligent device, as shown in fig. 4, a sentence recognition apparatus 400 in the embodiment of the present application includes:
a statement obtaining unit 401, configured to obtain a current user statement and a historical statement set, where the current user statement is a current turn of user statements, the historical statement set is formed by at least one historical statement, and the historical statement includes user statements of each historical turn and system statements of each historical turn;
a sentence splicing unit 402, configured to splice the current user sentence with each historical sentence in the historical sentence set, respectively, to obtain at least one spliced sentence;
a position marking unit 403, configured to mark a position for at least one splicing statement respectively;
a pre-training result obtaining unit 404, configured to input at least one spliced statement with a marked position into a pre-training model, and obtain at least one pre-training result respectively;
a first network output result obtaining unit 405, configured to input the at least one pre-training result into a trained first-class recurrent neural network model, and respectively obtain a forward output result and a backward output result, which are output by the first-class recurrent neural network model and are related to each pre-training result, correspondingly;
and a sentence recognizing unit 406, configured to recognize the current user sentence according to the forward output result and the backward output result related to each pre-training result.
Optionally, the sentence recognizing unit 406 includes:
the output splicing subunit is used for splicing the last forward output result and the last backward output result related to the same pre-training result to obtain splicing results related to each pre-training result;
and the intention and dialogue behavior acquisition subunit is used for inputting the splicing results related to the pre-training results into the trained second type of cyclic neural network model, and obtaining the intention and dialogue behavior of the current user statement through the training processing of the second type of cyclic neural network model.
Optionally, the recurrent neural network model of the second type includes a first GRU model, and the intention and dialog behavior obtaining subunit includes:
a hidden layer output obtaining subunit, configured to input all splicing results into the first GRU model, so as to obtain an output of a last hidden layer in the first GRU model;
a probability obtaining subunit, configured to use an output of a last hidden layer in the first GRU model as an input of a classification layer related to an intention and a dialog behavior in the first GRU model, and obtain a probability that the current user statement output by the classification layer related to the intention and the dialog behavior belongs to each intention category and a probability that the current user statement belongs to each dialog behavior category;
and an intention and dialogue behavior determination subunit, configured to determine an intention of the current user sentence according to a probability that the current user sentence belongs to each intention category, and determine a dialogue behavior of the current user sentence according to a probability that the current user sentence belongs to each dialogue behavior category.
Optionally, the sentence recognizing unit 406 includes:
the output screening subunit is used for screening forward output results and backward output results related to each pre-training result based on preset screening conditions to obtain at least one screening result, wherein one screening result comprises a forward output result and a backward output result related to one pre-training result;
and the word slot obtaining subunit is used for inputting the at least one screening result into the trained second-class recurrent neural network model to obtain the word slot of the current user statement.
Optionally, the sentence splicing unit 402 includes:
a word obtaining subunit, configured to obtain words that constitute the current user statement, and obtain words that constitute to-be-spliced historical statements, where the to-be-spliced historical statement is any one of the historical statements in the historical statement set;
and the sentence splicing subunit is used for splicing each word forming the current user sentence after each word forming the historical sentence to be spliced is spliced through a preset spacer based on the sequence of the historical sentence to be spliced and the current user sentence to obtain the spliced sentence of the current user sentence and the historical sentence to be spliced.
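A minimal sketch of this splicing, assuming "[SEP]" as the preset spacer (the text does not mandate a particular spacer token):

```python
# Sketch of the sentence splicing subunit: the words of the to-be-spliced
# history sentence come first, then a preset spacer, then the words of the
# current user sentence. "[SEP]" is an assumed spacer, not mandated here.
def splice(history_words, current_words, spacer="[SEP]"):
    return history_words + [spacer] + current_words

def splice_all(history_sentences, current_words):
    # One spliced sentence per history sentence, k in total.
    return [splice(h, current_words) for h in history_sentences]

spliced = splice_all([["hello"], ["hi", "there"]], ["weather", "today"])
```

Each element of the result is one spliced sentence ready for position marking.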
Optionally, the position marking unit 403 includes:
a boolean vector initialization unit, configured to initialize a boolean vector based on the to-be-processed concatenation statement, where the to-be-processed concatenation statement is any concatenation statement, and each element in the boolean vector uniquely corresponds to one element in the to-be-processed concatenation statement;
a first value setting subunit, configured to set, as a preset first value, an element, in the to-be-processed concatenation statement, corresponding to an element in a boolean vector, where the element belongs to the current user statement, and the to-be-processed concatenation statement is any concatenation statement;
a second numerical value setting subunit, configured to set, as a preset second numerical value, an element, in the boolean vector, of an element, in the to-be-processed concatenation statement, that belongs to a history statement, and set, as a preset second numerical value, an element, in the boolean vector, of an interval symbol in the to-be-processed concatenation statement;
and the vector editing subunit is used for filling or trimming the length of the Boolean vector to a preset length, taking the filled or trimmed Boolean vector as the Boolean vector corresponding to the splicing statement to be processed, and marking the position of the corresponding splicing statement to be processed through the Boolean vector.
Optionally, the vector editing subunit includes:
a length comparison subunit, configured to compare the length of the boolean vector with a preset length;
a vector clipping subunit, configured to, if the length of the boolean vector is longer than the preset length, delete an element from a last element of the boolean vector until the element of the boolean vector equals the preset length, and obtain the clipped boolean vector;
and the vector filling subunit is used for adding an element backwards from the last element of the Boolean vector if the length of the Boolean vector is shorter than the preset length until the element of the Boolean vector is equal to the preset length to obtain the filled Boolean vector, wherein the value of the added element is a preset second value.
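The position-marking and length-editing steps can be sketched together as follows, assuming 1 as the preset first value and 0 as the preset second value (the text leaves the concrete values open):

```python
# Sketch of the position marking unit: history words and the spacer get the
# second value, current-user words get the first value, and the vector is
# trimmed or padded (with the second value) to the preset length.
# first=1 / second=0 are assumed values, not mandated by the text.
def boolean_vector(n_history, n_current, length, first=1, second=0):
    vec = [second] * (n_history + 1) + [first] * n_current  # +1 for the spacer
    if len(vec) > length:                      # trim from the last element
        vec = vec[:length]
    else:                                      # pad backwards with the second value
        vec = vec + [second] * (length - len(vec))
    return vec

b = boolean_vector(n_history=2, n_current=3, length=8)
```

With this convention, multiplying a spliced output by the vector and keeping values greater than 0 retains exactly the current-user positions, as in the screening step of the method.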
Optionally, the pre-training result obtaining unit 404 is specifically configured to input the to-be-processed joined sentence with the marked position into the pre-training model, and use an output of a last hidden layer in the pre-training model as a pre-training result associated with the to-be-processed joined sentence, where the to-be-processed joined sentence is any joined sentence.
As can be seen from the above, the sentence recognition apparatus provided in the embodiment of the present application splices the current user sentence with the historical sentences and inputs the spliced sentences into the BERT model to obtain the information of the spliced sentences mined by the BERT model, so that more feature information is mined even when the corpus is small; furthermore, the word-level encoding and the encoding of the context information are split between two types of RNN models, each type of RNN being responsible for encoding one set of content, with a clear division of labor and the required features extracted separately, and finally the intention, the dialogue behavior and the word slot information of the current user sentence are jointly predicted by a single model (namely the second-type RNN model). The prediction process also effectively combines the context information, which improves the efficiency and accuracy of understanding the current user sentence.
EXAMPLE III
An embodiment of the present application provides an intelligent device, please refer to fig. 5, where the intelligent device 5 in the embodiment of the present application includes: a memory 501, one or more processors 502 (only one shown in fig. 5), and a computer program stored on the memory 501 and executable on the processors. Wherein: the memory 501 is used for storing software programs and modules, and the processor 502 executes various functional applications and data processing by running the software programs and units stored in the memory 501, so as to acquire resources corresponding to the preset events. Specifically, the processor 502 realizes the following steps by running the above-mentioned computer program stored in the memory 501:
acquiring a current user statement and a historical statement set, wherein the current user statement is a current round of user statements, the historical statement set is composed of at least one historical statement, and the historical statements comprise user statements of each historical round and system statements of each historical round;
splicing the current user statement with each historical statement in the historical statement set respectively to obtain at least one spliced statement;
respectively marking the position for at least one splicing statement;
inputting at least one spliced statement with a marked position into a pre-training model, and respectively and correspondingly obtaining at least one pre-training result;
inputting the at least one pre-training result into a trained first-class cyclic neural network model, and respectively and correspondingly obtaining a forward output result and a backward output result which are output by the first-class cyclic neural network model and are related to each pre-training result;
and identifying the current user sentence according to the forward output result and the backward output result which are related to each pre-training result.
Assuming that the above is the first possible implementation manner, in a second possible implementation manner provided on the basis of the first possible implementation manner, the identifying the current user sentence according to the forward output result and the backward output result associated with each pre-training result includes:
splicing the last forward output result and the last backward output result related to the same pre-training result to obtain a splicing result related to each pre-training result;
and inputting the splicing result associated with each pre-training result into a trained second-class recurrent neural network model, and obtaining the intention and the dialogue behavior of the current user sentence through the processing of the second-class recurrent neural network model.
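As a simplified sketch of the concatenation step described above, plain Python lists stand in for hidden-state vectors; the shapes and variable names are illustrative assumptions, not the patent's actual tensors:

```python
def concat_final_states(forward_states, backward_states):
    """Join the last forward hidden state with the last backward hidden
    state of a bidirectional RNN into one splicing result.

    Both arguments are lists of per-token hidden-state vectors,
    represented here as plain lists of floats for illustration.
    """
    return forward_states[-1] + backward_states[-1]  # vector concatenation

# Forward / backward states for a two-token input (toy values):
fwd = [[0.1, 0.2], [0.3, 0.4]]
bwd = [[0.5, 0.6], [0.7, 0.8]]
print(concat_final_states(fwd, bwd))  # [0.3, 0.4, 0.7, 0.8]
```

One such splicing result is produced per pre-training result, so a dialogue with N historical sentences yields N inputs for the second-class model.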
In a third possible implementation manner provided on the basis of the second possible implementation manner, the second-class recurrent neural network model includes a first GRU model, and the inputting of the concatenation result associated with each pre-training result into the trained second-class recurrent neural network model and obtaining the intention and the dialogue behavior of the current user sentence through the processing of the second-class recurrent neural network model includes:
inputting all splicing results into the first GRU model to obtain the output of the last hidden layer in the first GRU model;
taking the output of the last hidden layer in the first GRU model as the input of the classification layer related to the intention and the dialogue behavior in the first GRU model, and obtaining the probability that the current user statement output by the classification layer related to the intention and the dialogue behavior belongs to each intention category and the probability that the current user statement belongs to each dialogue behavior category;
and determining the intention of the current user sentence according to the probability that the current user sentence belongs to each intention category, and determining the conversation behavior of the current user sentence according to the probability that the current user sentence belongs to each conversation behavior category.
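The two classification heads described above can be sketched as follows; the weight rows, the dot-product scoring, and the softmax normalization are illustrative assumptions standing in for the classification layer of the first GRU model:

```python
import math

def softmax(logits):
    # Normalize raw scores into a probability distribution.
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_intent_and_act(hidden, intent_weights, act_weights):
    # `hidden` stands in for the output of the last hidden layer of the
    # first GRU model; each weight row scores one category.
    score = lambda rows: [sum(w * h for w, h in zip(row, hidden)) for row in rows]
    intent_probs = softmax(score(intent_weights))
    act_probs = softmax(score(act_weights))
    # The predicted intent / dialogue behavior is the category with the
    # highest probability.
    return (intent_probs.index(max(intent_probs)),
            act_probs.index(max(act_probs)))

hidden = [1.0, 0.0]
intents = [[1.0, 0.0], [0.0, 1.0]]   # two toy intent categories
acts = [[0.0, 1.0], [2.0, 0.0]]      # two toy dialogue-behavior categories
print(predict_intent_and_act(hidden, intents, acts))  # (0, 1)
```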
In a fourth possible implementation manner provided on the basis of the first possible implementation manner, the recognizing of the current user sentence according to the forward output result and the backward output result associated with each pre-training result includes:
based on preset screening conditions, screening forward output results and backward output results related to each pre-training result to obtain at least one screening result, wherein one screening result comprises a forward output result and a backward output result related to one pre-training result;
and inputting the at least one screening result into a trained second-class recurrent neural network model to obtain the word slot of the current user sentence.
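One possible screening condition, sketched below, keeps only the positions that belong to the current user sentence; this mask-based condition is an assumption for illustration, since the patent only requires some preset screening condition:

```python
def screen_outputs(forward, backward, keep_mask):
    """Keep the (forward, backward) output pair at every position whose
    mask value is 1 (e.g. positions of the current user sentence); the
    mask-based rule is an illustrative stand-in for the preset condition."""
    return [(f, b) for f, b, m in zip(forward, backward, keep_mask) if m == 1]

fwd = ["f0", "f1", "f2"]     # per-token forward outputs (toy values)
bwd = ["b0", "b1", "b2"]     # per-token backward outputs (toy values)
mask = [0, 1, 1]             # 1 marks a position to keep
print(screen_outputs(fwd, bwd, mask))  # [('f1', 'b1'), ('f2', 'b2')]
```

The screening results are then fed to the second-class model for word-slot labeling, one pair per retained token position.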
In a fifth possible implementation manner based on the first possible implementation manner, or based on the second possible implementation manner, or based on the third possible implementation manner, or based on the fourth possible implementation manner, the concatenating the current user sentence with each of the historical sentences in the historical sentence set to obtain at least one concatenated sentence includes:
acquiring all words and phrases forming the current user sentence;
acquiring all words and phrases forming a historical statement to be spliced, wherein the historical statement to be spliced is any one of the historical statements in the historical statement set;
and splicing the words forming the current user sentence after the words forming the historical sentence to be spliced via a preset spacer, in the order of the historical sentence to be spliced followed by the current user sentence, to obtain a spliced sentence of the current user sentence and the historical sentence to be spliced.
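A minimal sketch of this splicing step follows; the `[SEP]` spacer token is an assumption in the spirit of BERT's separator, while the patent only requires some preset spacer:

```python
SEP = "[SEP]"  # assumed spacer; the patent specifies only a preset spacer

def splice(history_tokens, current_tokens, sep=SEP):
    # Historical sentence first, then the spacer, then the current user
    # sentence, preserving the order described above.
    return history_tokens + [sep] + current_tokens

history = ["what", "is", "the", "weather"]
current = ["and", "tomorrow"]
print(splice(history, current))
# ['what', 'is', 'the', 'weather', '[SEP]', 'and', 'tomorrow']
```

Repeating this for every historical sentence in the set yields the at-least-one spliced sentence of the method.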
In a sixth possible implementation manner provided on the basis of the fifth possible implementation manner, the marking of positions for the at least one spliced sentence respectively includes:
initializing a Boolean vector based on the splicing statement to be processed, wherein the splicing statement to be processed is any splicing statement, and each element in the Boolean vector uniquely corresponds to one element in the splicing statement to be processed;
setting elements in the Boolean vector that correspond to the elements belonging to the current user sentence in the to-be-processed spliced sentence to a preset first numerical value;
setting elements corresponding to the elements belonging to the history statements in the to-be-processed spliced statement in the Boolean vector as preset second numerical values;
setting the element in the Boolean vector that corresponds to the spacer in the to-be-processed spliced sentence to the preset second numerical value;
and filling or trimming the length of the Boolean vector to a preset length, taking the filled or trimmed Boolean vector as the Boolean vector corresponding to the splicing sentence to be processed, and marking the position of the corresponding splicing sentence to be processed by the Boolean vector.
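The position-marking steps above can be sketched as follows, using 1 as the first numerical value (current-user tokens) and 0 as the second numerical value (history tokens and the spacer); the concrete values and the `[SEP]` spacer are assumptions:

```python
FIRST, SECOND = 1, 0  # assumed preset first / second numerical values

def mark_positions(spliced_tokens, sep="[SEP]"):
    # Build a Boolean vector with one element per element of the spliced
    # sentence: history tokens and the spacer take the second value,
    # tokens of the current user sentence take the first value.
    vec, in_current = [], False
    for tok in spliced_tokens:
        if tok == sep:
            vec.append(SECOND)   # the spacer itself takes the second value
            in_current = True    # everything after it is the current sentence
        else:
            vec.append(FIRST if in_current else SECOND)
    return vec

print(mark_positions(["what", "weather", "[SEP]", "and", "tomorrow"]))
# [0, 0, 0, 1, 1]
```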
In a seventh possible implementation manner provided on the basis of the sixth possible implementation manner, the filling or trimming of the length of the Boolean vector to a preset length includes:
comparing the length of the Boolean vector with a preset length;
if the length of the Boolean vector is longer than the preset length, deleting elements forward from the last element of the Boolean vector until the length of the Boolean vector is equal to the preset length, to obtain the trimmed Boolean vector;
and if the length of the Boolean vector is shorter than the preset length, adding new elements backward after the last element of the Boolean vector until the length of the Boolean vector is equal to the preset length, to obtain the filled Boolean vector, wherein the newly added elements take the preset second numerical value.
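These padding and trimming rules can be sketched directly; 0 again stands in for the preset second numerical value:

```python
def pad_or_trim(vec, preset_len, pad_value=0):
    if len(vec) > preset_len:
        # Longer than the preset length: delete from the last element
        # forward until the lengths match.
        return vec[:preset_len]
    # Shorter (or equal): append pad values after the last element until
    # the lengths match; new elements take the preset second value.
    return vec + [pad_value] * (preset_len - len(vec))

print(pad_or_trim([1, 1, 0, 1], 2))  # [1, 1]
print(pad_or_trim([1, 1], 4))        # [1, 1, 0, 0]
```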
In an eighth possible implementation manner provided on the basis of the first possible implementation manner, the second possible implementation manner, the third possible implementation manner, or the fourth possible implementation manner, the inputting of the at least one concatenated sentence with a marked position into the pre-training model to obtain at least one pre-training result respectively includes:
inputting the spliced sentences to be processed with the marked positions into the pre-training model, and taking the output of the last hidden layer in the pre-training model as a pre-training result associated with the spliced sentences to be processed, wherein the spliced sentences to be processed are any spliced sentences.
It should be understood that, in the embodiments of the present application, the processor 502 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 501 may include a read-only memory and a random access memory, and provides instructions and data to the processor 502. Some or all of the memory 501 may also include a non-volatile random access memory. For example, the memory 501 may also store device class information.
As can be seen from the above, the intelligent device proposed in the present application splices the current user sentence onto each historical sentence and inputs the spliced sentences into the BERT model, so that the BERT model mines information from the spliced sentences and more feature information can be extracted even when the corpus is small; furthermore, character-level coding and context-information coding are split between two types of RNN models, each type of RNN is responsible for coding one set of content, with a clear division of labor, so that the required features are extracted separately, and finally the intention, the dialogue behavior and the word slot information of the current user sentence are jointly predicted by a single model (namely, the second type of RNN model). The prediction process also effectively incorporates context information, improving both the efficiency and the accuracy of understanding the current user sentence.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above division into functional units and modules is illustrated; in practical applications, the above functions may be allocated to different functional units and modules as needed, that is, the internal structure of the apparatus may be divided into different functional units or modules to implement all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for convenience of distinguishing them from each other, and are not used to limit the protection scope of the present application. For the specific working processes of the units and modules in the above system, reference may be made to the corresponding processes in the foregoing method embodiments, which are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative units and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or as a combination of computer software and electronic hardware. Whether such functions are performed by hardware or software depends upon the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division into the above modules or units is only one kind of logical functional division, and in actual implementation there may be another division manner, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
The integrated unit, if implemented in the form of a software functional unit and sold or used as an independent product, may be stored in a computer-readable storage medium. Based on such an understanding, all or part of the flow of the methods of the embodiments described above may be implemented by a computer program, which may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the method embodiments described above. The computer program includes computer program code, which may be in a source code form, an object code form, an executable file, some intermediate form, or the like. The computer-readable storage medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer-readable memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable storage medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, in accordance with legislation and patent practice, the computer-readable storage medium does not include electrical carrier signals and telecommunication signals.
The above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (11)

1. A sentence recognition method, comprising:
acquiring a current user statement and a historical statement set, wherein the current user statement is a current round of user statements, the historical statement set is composed of at least one historical statement, and the historical statements comprise user statements of each historical round and system statements of each historical round;
splicing the current user statement with each historical statement in the historical statement set respectively to obtain at least one spliced statement;
respectively marking the position for at least one splicing statement;
inputting at least one spliced statement with a marked position into a pre-training model, and respectively and correspondingly obtaining at least one pre-training result;
inputting the at least one pre-training result into a trained first-class cyclic neural network model, and respectively and correspondingly obtaining a forward output result and a backward output result which are output by the first-class cyclic neural network model and are related to each pre-training result;
and identifying the current user sentence according to the forward output result and the backward output result which are related to each pre-training result.
2. The sentence recognition method of claim 1 wherein the recognizing the current user sentence according to the forward output result and the backward output result associated with each pre-training result comprises:
splicing the last forward output result and the last backward output result related to the same pre-training result to obtain a splicing result related to each pre-training result;
and inputting the splicing results related to the pre-training results into a trained second type of cyclic neural network model, and processing through the second type of cyclic neural network model to obtain the intention and the dialogue behavior of the current user statement.
3. The sentence recognition method of claim 2, wherein the second type of recurrent neural network model includes a first GRU model, and the inputting the concatenation result associated with each pre-training result into the trained second type of recurrent neural network model and obtaining the intention and the dialogue behavior of the current user sentence through processing by the second type of recurrent neural network model includes:
inputting all splicing results into the first GRU model to obtain the output of the last hidden layer in the first GRU model;
taking the output of the last hidden layer in the first GRU model as the input of a classification layer related to the intention and the dialogue behavior in the first GRU model, and obtaining the probability that the current user statement output by the classification layer related to the intention and the dialogue behavior belongs to each intention category and the probability that the current user statement belongs to each dialogue behavior category;
and determining the intention of the current user statement according to the probability that the current user statement belongs to each intention category, and determining the conversation behavior of the current user statement according to the probability that the current user statement belongs to each conversation behavior category.
4. The sentence recognition method of claim 1 wherein the recognizing the current user sentence according to the forward output result and the backward output result associated with each pre-training result comprises:
based on preset screening conditions, screening forward output results and backward output results related to each pre-training result to obtain at least one screening result, wherein one screening result comprises a forward output result and a backward output result related to one pre-training result;
and inputting the at least one screening result into a trained second type of cyclic neural network model, and processing through the second type of cyclic neural network model to obtain a word slot of the current user statement.
5. The sentence identification method of any one of claims 1 to 4, wherein the splicing the current user sentence with each historical sentence in the historical sentence set to obtain at least one spliced sentence comprises:
acquiring all words and phrases forming the current user sentence;
acquiring all words and phrases forming a historical statement to be spliced, wherein the historical statement to be spliced is any one of the historical statements in the historical statement set;
splicing all words forming the current user statement after all words forming the historical sentence to be spliced through a preset spacer based on the sequence of the historical sentence to be spliced and the current user statement to obtain a spliced sentence of the current user statement and the historical sentence to be spliced.
6. The sentence recognition method of claim 5, wherein the marking positions for the at least one spliced sentence, respectively, comprises:
initializing a Boolean vector based on the splicing statement to be processed, wherein the splicing statement to be processed is any splicing statement, and each element in the Boolean vector uniquely corresponds to one element in the splicing statement to be processed;
setting elements corresponding to the elements belonging to the current user statement in the to-be-processed spliced statement in the Boolean vector as preset first numerical values;
setting elements corresponding to the elements belonging to the historical sentences in the to-be-processed spliced sentences in the Boolean vectors as preset second numerical values;
setting the corresponding elements of the spacers in the splicing statement to be processed in the Boolean vector as preset second numerical values;
and filling or trimming the length of the Boolean vector to a preset length, taking the filled or trimmed Boolean vector as the Boolean vector corresponding to the splicing statement to be processed, and marking the position of the corresponding splicing statement to be processed through the Boolean vector.
7. The sentence recognition method of claim 6 wherein the padding or pruning the length of the boolean vector to a preset length comprises:
comparing the length of the Boolean vector with a preset length;
if the length of the Boolean vector is longer than the preset length, deleting elements from the last element of the Boolean vector forward until the elements of the Boolean vector are equal to the preset length to obtain the trimmed Boolean vector;
and if the length of the Boolean vector is shorter than the preset length, newly adding elements from the last element of the Boolean vector backwards until the elements of the Boolean vector are equal to the preset length to obtain the filled Boolean vector, wherein the value of the newly added elements is a preset second numerical value.
8. The sentence recognition method of any one of claims 1 to 4, wherein the inputting of the at least one spliced sentence with the marked position into the pre-training model to obtain at least one pre-training result respectively comprises:
inputting the spliced sentences to be processed with the marked positions into the pre-training model, and taking the output of the last hidden layer in the pre-training model as a pre-training result associated with the spliced sentences to be processed, wherein the spliced sentences to be processed are any spliced sentences.
9. A sentence recognition apparatus, comprising:
the system comprises a statement acquisition unit, a statement acquisition unit and a history statement set, wherein the statement acquisition unit is used for acquiring a current user statement and a history statement set, the current user statement is a current round of user statements, the history statement set is composed of at least one history statement, and the history statement comprises each history round of user statements and each history round of system statements;
the sentence splicing unit is used for splicing the current user sentence with each historical sentence in the historical sentence set respectively to obtain at least one spliced sentence;
the position marking unit is used for marking the position of at least one splicing statement respectively;
the pre-training result acquisition unit is used for inputting at least one spliced statement with a marked position into a pre-training model and respectively and correspondingly obtaining at least one pre-training result;
a first network output result obtaining unit, configured to input the at least one pre-training result into a trained first-class recurrent neural network model, and respectively obtain a forward output result and a backward output result, which are output by the first-class recurrent neural network model and are related to each pre-training result, in a corresponding manner;
and the sentence recognition unit is used for recognizing the current user sentence according to the forward output result and the backward output result which are related to each pre-training result.
10. A smart device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1 to 8 when executing the computer program.
11. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 8.
CN202010340047.8A 2020-04-26 2020-04-26 Statement identification method, statement identification device and intelligent equipment Active CN111563161B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010340047.8A CN111563161B (en) 2020-04-26 2020-04-26 Statement identification method, statement identification device and intelligent equipment

Publications (2)

Publication Number Publication Date
CN111563161A true CN111563161A (en) 2020-08-21
CN111563161B CN111563161B (en) 2023-05-23

Family

ID=72071534

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010340047.8A Active CN111563161B (en) 2020-04-26 2020-04-26 Statement identification method, statement identification device and intelligent equipment

Country Status (1)

Country Link
CN (1) CN111563161B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112464662A (en) * 2020-12-02 2021-03-09 平安医疗健康管理股份有限公司 Medical phrase matching method, device, equipment and storage medium
CN112559715A (en) * 2020-12-24 2021-03-26 北京百度网讯科技有限公司 Attitude identification method, attitude identification device, attitude identification equipment and storage medium
CN114154488A (en) * 2021-12-10 2022-03-08 北京金山数字娱乐科技有限公司 Statement processing method and device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017024884A1 (en) * 2015-08-07 2017-02-16 广州神马移动信息科技有限公司 Search intention identification method and device
WO2018028077A1 (en) * 2016-08-11 2018-02-15 中兴通讯股份有限公司 Deep learning based method and device for chinese semantics analysis

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112464662A (en) * 2020-12-02 2021-03-09 平安医疗健康管理股份有限公司 Medical phrase matching method, device, equipment and storage medium
CN112464662B (en) * 2020-12-02 2022-09-30 深圳平安医疗健康科技服务有限公司 Medical phrase matching method, device, equipment and storage medium
CN112559715A (en) * 2020-12-24 2021-03-26 北京百度网讯科技有限公司 Attitude identification method, attitude identification device, attitude identification equipment and storage medium
CN112559715B (en) * 2020-12-24 2023-09-22 北京百度网讯科技有限公司 Attitude identification method, device, equipment and storage medium
CN114154488A (en) * 2021-12-10 2022-03-08 北京金山数字娱乐科技有限公司 Statement processing method and device

Also Published As

Publication number Publication date
CN111563161B (en) 2023-05-23

Similar Documents

Publication Publication Date Title
CN111259142B (en) Specific target emotion classification method based on attention coding and graph convolution network
CN107885756B (en) Deep learning-based dialogue method, device and equipment
CN111695352A (en) Grading method and device based on semantic analysis, terminal equipment and storage medium
CN112257449B (en) Named entity recognition method and device, computer equipment and storage medium
CN107704482A (en) Method, apparatus and program
CN111563161A (en) Sentence recognition method, sentence recognition device and intelligent equipment
JP6677419B2 (en) Voice interaction method and apparatus
CN111653275A (en) Method and device for constructing voice recognition model based on LSTM-CTC tail convolution and voice recognition method
CN110399472B (en) Interview question prompting method and device, computer equipment and storage medium
CN112528637A (en) Text processing model training method and device, computer equipment and storage medium
CN112417855A (en) Text intention recognition method and device and related equipment
CN110942774A (en) Man-machine interaction system, and dialogue method, medium and equipment thereof
WO2021082695A1 (en) Training method, feature extraction method, apparatus and electronic device
CN112632248A (en) Question answering method, device, computer equipment and storage medium
CN115393933A (en) Video face emotion recognition method based on frame attention mechanism
CN111858878A (en) Method, system and storage medium for automatically extracting answer from natural language text
CN112183106A (en) Semantic understanding method and device based on phoneme association and deep learning
CN109147868A (en) Protein function prediction technique, device, equipment and storage medium
CN109829441B (en) Facial expression recognition method and device based on course learning
CN114694255A (en) Sentence-level lip language identification method based on channel attention and time convolution network
CN115617975B (en) Intention recognition method and device for few-sample multi-turn conversation
CN113435208A (en) Student model training method and device and electronic equipment
CN113065347A (en) Criminal case judgment prediction method, system and medium based on multitask learning
CN111680514A (en) Information processing and model training method, device, equipment and storage medium
CN115482575A (en) Facial expression recognition method based on label distribution learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant