CN109388793B - Entity marking method, intention identification method, corresponding device and computer storage medium

Info

Publication number
CN109388793B
CN109388793B
Authority
CN
China
Prior art keywords
sentence
vector
words
entity
word
Prior art date
Legal status
Active
Application number
CN201710655187.2A
Other languages
Chinese (zh)
Other versions
CN109388793A (en)
Inventor
胡于响
Current Assignee
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN201710655187.2A priority Critical patent/CN109388793B/en
Priority to PCT/CN2018/096640 priority patent/WO2019024704A1/en
Publication of CN109388793A publication Critical patent/CN109388793A/en
Application granted granted Critical
Publication of CN109388793B publication Critical patent/CN109388793B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G06F 40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F 40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/237 Lexical tools

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides an entity labeling method, an intention identification method, corresponding devices and a computer storage medium. The entity labeling method comprises the following steps: performing word encoding on the attribute labels of at least part of the words in a sentence by using a knowledge graph to obtain first expression vectors of those words; performing word encoding on at least part of the words in the sentence based on the sentence structure to obtain second expression vectors of those words; and fusing the first expression vectors and the second expression vectors to obtain an entity labeling result for the sentence. The intention identification method comprises the following steps: performing combined encoding on the attribute labels of at least part of the words in a sentence by using a knowledge graph to obtain a first sentence vector of the sentence; encoding the sentence based on the sentence structure to obtain a second sentence vector of the sentence; and fusing the first sentence vector and the second sentence vector to obtain an intention recognition result for the sentence. The methods provided by the invention can improve the accuracy of entity labeling and intention identification.

Description

Entity marking method, intention identification method, corresponding device and computer storage medium
[Technical Field]
The present invention relates to the field of computer application technologies, and in particular, to an entity labeling method, an intent recognition method, corresponding apparatuses, and a computer storage medium.
[Background]
Natural language processing is an important, even core, part of artificial intelligence. It aims to understand what a sentence expresses and mainly comprises two tasks: entity labeling and intention identification. Entity labeling attaches attribute labels to the entity words in a sentence, while intention identification determines what intention or purpose a sentence is meant to achieve. For example, for the sentence "which movies has Zhou Jielun played", the task of entity labeling is to label the entity word "Zhou Jielun" with the Movie_actor tag, Movie_actor referring to a movie actor; and intention identification is to recognize that the intention of the sentence is "which movies has an actor played".
Existing entity labeling and intention identification methods rely only on sentence structure, and this structure-only approach leads to low accuracy for both tasks.
[Summary of the Invention]
In view of the above, the present invention provides an entity labeling method, an intent recognition method, corresponding apparatuses, and a computer storage medium, so as to improve the accuracy of entity labeling and intent recognition.
The specific technical scheme is as follows:
the invention provides an entity labeling method, which comprises the following steps:
carrying out word encoding on attribute labels of at least part of words in the sentence by using a knowledge graph to obtain first expression vectors of at least part of words;
performing word coding on at least part of words in the sentence based on the sentence structure to obtain second expression vectors of at least part of words;
and fusing the first expression vector and the second expression vector to obtain an entity labeling result of the sentence.
The invention also provides an intention identification method, which comprises the following steps:
performing combined coding on attribute labels of at least part of words in a sentence by using a knowledge graph to obtain a first sentence vector of the sentence;
coding the sentence based on the sentence structure to obtain a second sentence vector of the sentence;
and fusing the first sentence vector and the second sentence vector of the sentence to obtain an intention recognition result of the sentence.
The invention provides an entity labeling device, which is characterized by comprising the following components:
the first word encoding unit is used for carrying out word encoding on attribute labels of at least part of words in the sentence by using a knowledge graph to obtain first expression vectors of at least part of words;
the second word coding unit is used for carrying out word coding on at least part of words in the sentence based on the sentence structure to obtain a second expression vector of at least part of words;
and the vector fusion unit is used for fusing the first expression vector and the second expression vector to obtain an entity labeling result of the sentence.
The present invention also provides an intention recognition apparatus, including:
the first sentence coding unit is used for carrying out combined coding on attribute labels of at least part of words in a sentence by using a knowledge graph to obtain a first sentence vector of the sentence;
the second sentence coding unit is used for coding the sentences based on sentence structures to obtain second sentence vectors of the sentences;
and the vector fusion unit is used for fusing the first sentence vector and the second sentence vector of the sentence to obtain an intention identification result of the sentence.
The invention also provides an apparatus, comprising:
a memory storing one or more programs;
one or more processors coupled to the memory, which execute the one or more programs to perform the operations performed in the above-described methods.
The present invention also provides a computer storage medium encoded with a computer program that, when executed by one or more computers, causes the one or more computers to perform the operations performed in the above-described method.
According to the technical scheme above, the invention introduces the knowledge graph into entity labeling and intention recognition; that is, the attribute information of entities in the knowledge graph is fused with the sentence-structure-based approach to perform entity labeling and intention recognition, which improves accuracy compared with the prior art based on sentence structure alone.
[Description of the Drawings]
FIG. 1 is a flowchart of a method for entity tagging according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of word encoding using a knowledge-graph according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating word encoding based on sentence structure according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of entity labeling by fusing a knowledge graph and a sentence structure manner according to an embodiment of the present invention;
FIG. 5 is a flow chart of a method for intent recognition provided by an embodiment of the present invention;
FIG. 6 is a diagram illustrating sentence encoding using a knowledge-graph according to an embodiment of the present invention;
FIG. 7 is a diagram illustrating intent recognition by fusing knowledge-graphs and sentence structure patterns provided by an embodiment of the present invention;
FIG. 8 is a block diagram of an entity tagging apparatus according to an embodiment of the present invention;
FIG. 9 is a block diagram of an intention recognition apparatus according to an embodiment of the present invention;
fig. 10 is a block diagram of an exemplary device provided in an embodiment of the present invention.
[Detailed Description]
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
The core idea of the invention is to introduce the knowledge graph into entity labeling and intention recognition, i.e., to fuse the attribute information of entities in the knowledge graph with the sentence-structure-based approach, thereby improving accuracy. The method and the apparatus provided by the invention are described in detail below in combination with embodiments.
Fig. 1 is a flowchart of a method for entity tagging provided in an embodiment of the present invention, as shown in fig. 1, the method may include the following steps:
in 101, a knowledge-graph is preprocessed.
The knowledge graph stores each entity, the attribute information corresponding to each entity, and the relationships between entities. However, knowledge graphs are typically divided into domains/categories. For example, in the music domain/category the entity "Zhou Jielun" corresponds to the attribute labels "singer", "composer" and "lyricist", while the movie domain/category also contains an entity "Zhou Jielun", which corresponds to the attribute label "actor". In the embodiment of the invention, to facilitate use of the knowledge graph, it can first be preprocessed. Specifically, the following steps may be included:
s11, firstly, integrating the attribute labels of the entities in the knowledge graph in each field to obtain the attribute labels corresponding to the entities.
Still taking the entity "Zhou Jielun" as an example, after the attribute labels in each domain are integrated, all the attribute labels corresponding to the entity "Zhou Jielun" are obtained as: "singer", "composer", "lyricist", "actor".
And S12, storing the attribute labels corresponding to the entities in a key value storage engine.
After the attribute labels corresponding to the entities are obtained, the entities are respectively used as keys (keys), the attribute labels corresponding to the entities are used as values (values), and then the key-value pairs are stored in a key value storage engine.
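By way of illustration only, the following minimal Python sketch shows this preprocessing step; the domain data, the label names and the in-memory dict standing in for a real key-value engine are all assumptions for illustration, not the patented implementation itself.

```python
from collections import defaultdict

# Per-domain knowledge-graph fragments: entity -> attribute labels (illustrative data).
domain_graphs = {
    "music": {"Zhou Jielun": ["singer", "composer", "lyricist"]},
    "movie": {"Zhou Jielun": ["actor"]},
}

def build_tag_store(graphs):
    """Integrate each entity's attribute labels across domains (step S11) and
    return the entity -> label-set mapping kept in a key-value engine (step S12)."""
    store = defaultdict(set)
    for graph in graphs.values():
        for entity, labels in graph.items():
            store[entity].update(labels)
    return dict(store)

kv_store = build_tag_store(domain_graphs)
# kv_store["Zhou Jielun"] == {"singer", "composer", "lyricist", "actor"}
```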
It should be noted that preprocessing the knowledge graph serves to speed up the subsequent lookup of entities' attribute labels, but it is not a necessary step of the invention. Other ways of preprocessing the knowledge graph may of course be used.
At 102, the knowledge graph is used to perform word encoding on the attribute labels of the words in the sentence, so as to obtain a first expression vector of each word.
In this step, the first expression vector of each word obtained by the knowledge graph is used for the purpose of enabling the first expression vector of each word to include attribute information of an entity in the knowledge graph. Specifically, the method can be realized by the following steps:
and S21, identifying an entity in the sentence and the attribute tag corresponding to the entity by using the knowledge graph.
In this step, the longest matching principle can be used to match the sentence against the knowledge graph, so as to identify the entities in the sentence. Specifically, each n-gram of the sentence is obtained; in the embodiment of the present invention, an n-gram refers to a collocation of n consecutive words, where n takes each value of 1 or more. Each n-gram is matched against the knowledge graph to see which n-grams match entities in the graph; when several overlapping n-grams all match entities, the longest n-gram is taken as the identified entity.
For example, for the sentence "which movies has Zhou Jie Lun played", the n-grams include:
1-gram: "Zhou", "Jie", "Lun", …, "movie";
2-gram: "Zhou Jie", "Jie Lun", "Lun played", …, "movies";
3-gram: "Zhou Jie Lun", "Jie Lun played", "Lun played over", …, "which movies";
……
Here "Zhou Jie" can be matched to an entity in the knowledge graph and "Zhou Jie Lun" can also be matched to an entity, and the two overlap, so the longest n-gram, "Zhou Jie Lun", is taken as the identified entity.
When determining the attribute labels corresponding to each entity, the key-value storage engine may be queried with the entity as the key to find the corresponding value.
And S22, segmenting the sentence by using the recognition result, and labeling attribute labels for the obtained words.
When segmenting the sentence, the recognized entities are treated as independent words, and the remaining content of the sentence is then segmented. Still taking "which movies has Zhou Jielun played" as an example, after word segmentation we obtain: "Zhou Jielun", "played", "over", "which", "movies".
Each word is then labeled with attribute labels: "Zhou Jielun" is labeled "singer", "composer", "lyricist" and "actor", while "played", "over", "which" and "movies" are not entities of the knowledge graph, so they may all be labeled "O", indicating that there is no corresponding attribute label.
It should be noted that the embodiments of the present invention take every word in a sentence as an example for description, but processing only at least some of the words in the sentence is not excluded. For example, in this step, after segmenting the sentence, attribute labels may be attached to only part of the obtained words, such as only the entities recognized from the knowledge graph.
And S23, carrying out word coding on the attribute labels of the words, and carrying out conversion of a full connection layer on a coding result to obtain a first expression vector of each word.
In this step, word encoding is performed on the attribute labels of each word, so as to convert the attribute label set of each word into a string of codes that a computer can process. The encoding method adopted in this embodiment may include, but is not limited to, one-hot encoding.
As shown in fig. 2, an encoding result is obtained after one-hot encoding the attribute label set corresponding to each word. The length of the encoding result may be the total number of attribute labels: if there are M attribute labels in the knowledge graph, the encoding result has M bits, each bit corresponding to one attribute label. The value of each bit indicates whether the attribute label corresponding to that bit is present. For example, the word encoding result of "Zhou Jielun" has 4 bits set to 1, indicating that "Zhou Jielun" carries the attribute labels corresponding to those 4 positions.
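A minimal sketch of this encoding (the four-label inventory is an illustrative assumption; a real knowledge graph has far more attribute labels):

```python
ALL_LABELS = ["singer", "composer", "lyricist", "actor"]   # the M attribute labels

def encode_label_set(word_labels):
    """One-hot style word encoding: bit i is 1 iff the i-th attribute label is
    among the word's labels; words labeled 'O' get an all-zero vector."""
    return [1.0 if lab in word_labels else 0.0 for lab in ALL_LABELS]

encode_label_set({"singer", "composer", "lyricist", "actor"})  # -> [1.0, 1.0, 1.0, 1.0]
encode_label_set(set())                                        # 'O' word -> all zeros
```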
The one-hot encoding result is then converted by a fully connected layer, the aim being to map the encoding of each word's attribute labels to entity labels, where entity labels are the labels used for entity labeling of the words in the sentence. The first expression vector of each word is obtained after the fully connected layer conversion.
In the embodiment of the present invention, the fully connected layer may be trained in advance. The training process may include: taking sentences annotated with entity labels as training samples in advance; performing entity recognition, word segmentation, attribute labeling and one-hot encoding on the sample sentences using the knowledge graph; using the encoding as the input of the fully connected layer; and using the first expression vectors formed from the entity labels corresponding to the words in the sentence as the target output of the fully connected layer for training. The trained fully connected layer thus maps one-hot encoding results to entity labels.
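Continuing the sketch above, the fully connected conversion might look like the following (the entity-tag inventory size and the PyTorch modeling choice are assumptions; training is omitted):

```python
import torch
import torch.nn as nn

NUM_ATTR_LABELS = 4        # M from the sketch above (a real graph has far more)
NUM_ENTITY_TAGS = 5        # e.g. Actor_name, Singer_name, ..., O (assumed inventory)

# Fully connected layer mapping a word's attribute-label encoding to entity-tag
# space; its weights are learned from sentences annotated with entity labels.
tag_fc = nn.Linear(NUM_ATTR_LABELS, NUM_ENTITY_TAGS)

onehot = torch.tensor([[1.0, 1.0, 1.0, 1.0]])   # encoding of "Zhou Jielun" above
t_vec = tag_fc(onehot)                          # first expression vector (cf. T-dit1)
```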
Continuing with fig. 2, after the one-hot encoding result of each word is converted by the fully connected layer, the first expression vector of each word is obtained, denoted T-dit1, T-dit2, T-dit3, T-dit4 and T-dit5 respectively.
In 103, each word in the sentence is word-coded based on the sentence structure, and a second expression vector of each word is obtained.
In this step, the following steps may be specifically performed:
and S31, determining word vectors of all words in the sentence.
When determining the word vector of each word in the sentence, an existing word vector generation tool such as word2vec may be adopted: word2vec is pre-trained based on semantics and then used to generate a word vector for each word, with the word vectors of all words having the same length. Because this way of determining word vectors is based on semantics, the distance between word vectors reflects the degree of semantic association between words: the more closely two words are semantically related, the smaller the distance between their word vectors. Since semantics-based word vector determination can use existing technology, it is not detailed here.
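As one possible tool, gensim's Word2Vec can produce such vectors; the tiny corpus and the dimension below are illustrative assumptions (real training uses a large corpus):

```python
from gensim.models import Word2Vec

# A tiny pre-segmented corpus (illustrative only).
corpus = [
    ["Zhou Jielun", "played", "over", "which", "movies"],
    ["Zhou Jielun", "sang", "which", "songs"],
]
model = Word2Vec(corpus, vector_size=100, min_count=1)  # semantics-based training
vec = model.wv["Zhou Jielun"]                           # the word's 100-dim vector
```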
And S32, inputting the word vectors into a pre-trained neural network to respectively obtain second expression vectors of all the words.
Each word vector is input into a pre-trained neural network so as to encode the sentence at word granularity. The neural network may be, for example, a bidirectional RNN (recurrent neural network), a unidirectional RNN or a CNN (convolutional neural network). A bidirectional RNN is preferred, because it can encode the sentence recurrently in both directions. The basic idea of a bidirectional RNN is to process each training sequence with two RNNs, one forward and one backward, both connected to the same output layer. This structure provides the output layer with the complete context, both before and after, of every point in the input sequence. Specifically, in the present invention, when a segmented sentence containing n words is input, the bidirectional RNN produces n output vectors, one per word. Thanks to the RNN's memory, the i-th vector contains the information of all preceding words, so the output vector of the last word, which in theory contains the information of the whole sentence, is also called the "sentence vector".
Still taking "what movies were played by jegery" as an example, as shown in fig. 3, after the words "jegery", "play", "pass", "which" and "movies" obtained after word segmentation respectively determine corresponding word vectors, the word vectors are input into the bidirectional RNN, so as to obtain the second expression vectors of the lyrics and are respectively marked as: and the second expression vector of each word comprises context information, namely the influence of the sentence structure is considered again, and the sentence structure information is contained. Where output5 contains information of the entire sentence, it may be referred to as a sentence vector.
It should be noted that the processes performed in steps 102 and 103 based on the knowledge graph and based on the sentence structure may be performed sequentially in any order, or may be performed simultaneously. The order shown in this embodiment is only one of the execution modes.
At 104, the first expression vector and the second expression vector are fused to obtain an entity labeling result of the sentence.
The fusion of the first expression vector and the second expression vector in this step is actually the fusion of the entity label obtained based on the knowledge graph and the entity label obtained based on the sentence structure. Specifically, the following steps may be performed:
s41, splicing the first expression vector and the second expression vector of each word respectively to obtain a third expression vector of each word.
In this step, the two vectors may be spliced according to a preset order, so as to obtain a longer vector, which is a third expression vector.
It should be noted that, besides splicing, other fusion manners such as superimposing the first expression vector and the second expression vector may be adopted. However, splicing allows the knowledge-graph-based influence and the sentence-structure-based influence to be considered separately, with different parameters in the subsequent fully connected layer conversion, so splicing is preferred.
And S42, converting the third expression vector of each word into a result vector of each word through a full connection layer.
And inputting the third expression vector of each word into a pre-trained full-link layer for conversion, thereby mapping each third expression vector to an entity label, and obtaining a result vector after conversion. The length of the result vector is the total number of the corresponding entity tags, each bit of the result vector corresponds to each entity tag, and each value corresponds to the score of each entity tag.
In the embodiment of the present invention, the fully-connected layer may be trained in advance. The training process may include: the sentences marked with entity labels are used as training samples in advance, the steps in the steps 102 and 103 are respectively executed, namely, a first expression vector and a second expression vector of each word are respectively obtained for the sentences in the training samples, then, the result (namely, a third expression vector) obtained after the first expression vector and the second expression vector are spliced is used as the input of the full-link layer, and the entity labels of the sentences are used as the output of the full-link layer for training. And the trained full connection layer is used for mapping the third expression vector of each word in the sentence to the entity label.
And S43, carrying out entity annotation on the sentence according to the result vector of each word.
Each word corresponds to a result vector, and the entity tag with the highest score can be selected to perform entity labeling on each word in the sentence according to the score of each entity tag in the result vector.
Still taking "what movies were played by shepherd" as an example, as shown in fig. 4, the first expression vector and the second expression vector of each word are respectively spliced to obtain a third expression vector. In fig. 4, after the first expression vector T-dit 1 of "zhou jilun" is spliced with the second expression vector Output1, a third expression vector K1 is obtained, and other words are similar. Then, the third expression vectors K1, K2, \8230andK 5 of each word are respectively input into the full-connection layer to respectively obtain the result vector of each word. The entity label "Actor _ name" in the result vector corresponding to the word "zhou jiron" has the highest score, and the word "zhou jiron" can be labeled by using the "Actor _ name", and the entity label with the highest score in the result vectors corresponding to other words is "O", which indicates that the entity label is not an entity, so that the other words are labeled by using the entity label "O".
Fig. 5 is a flowchart of an intention recognition method provided in an embodiment of the present invention, and as shown in fig. 5, the method may include the following steps:
at 501, the attribute labels of the words in the sentence are combined and encoded by using a knowledge graph to obtain a first sentence vector of the sentence.
Similarly to the entity labeling, the knowledge graph may be preprocessed first before this step, and the process of preprocessing will not be described in detail, and reference may be made to the related description of 101 in fig. 1.
In this step, a first sentence vector of the sentence is obtained by using the knowledge graph, so that the first sentence vector includes attribute information of an entity in the knowledge graph. Specifically, the method can be realized by the following steps:
and S51, identifying an entity in the sentence and the attribute tag corresponding to the entity by using the knowledge graph.
For detailed implementation of this step, refer to step S21 in step 102 in the embodiment shown in fig. 1, which is not described herein again.
And S52, carrying out combined coding on the attribute labels of the words, and carrying out full-connection layer conversion on the coding result to obtain a first sentence vector of the sentence.
After the attribute labels of all the words in the sentence are obtained, the attribute labels of all the words are combined and coded uniformly to obtain a coding result. The encoding result is a vector, the length of the vector corresponds to the total number of the attribute tags, each bit corresponds to one attribute tag, and the value of each bit is the weight of the attribute tag in the sentence.
When determining the weight of an attribute label in the sentence, the weight can be determined according to the words that carry the attribute label and the number of attribute labels each of those words carries. Specifically, the weight $w_{label_i}$ of attribute label $label_i$ may be determined by the following formula:

$$w_{label_i} = \sum_{m=1}^{M} a_{im}$$

where $m$ indexes the $m$-th word in the sentence and $M$ is the number of words in the sentence. $a_{im}$ denotes the value of $label_i$ for the $m$-th word: if $label_i$ is not an attribute label of the $m$-th word, $a_{im}$ takes the value 0; if $label_i$ is an attribute label of the $m$-th word, $a_{im}$ takes the value

$$a_{im} = \frac{1}{count(label_m)}$$

where $count(label_m)$ is the number of attribute labels of the $m$-th word.

Still taking the sentence "which movies has Zhou Jielun played" as an example, all attribute labels corresponding to "Zhou Jielun" are "singer", "composer", "lyricist" and "actor", and the other words have no corresponding attribute labels in the knowledge graph. For the attribute label "singer", its weight in the sentence is

$$w_{singer} = \frac{1}{4} = 0.25$$

so the bit corresponding to "singer" takes the value 0.25 in the encoding result. Similarly, the bits for "composer", "lyricist" and "actor" take the value 0.25 in the encoding result, and the bits of all other attribute labels are 0.
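A minimal Python sketch of this combined encoding (the label inventory and per-word label sets are illustrative assumptions):

```python
def combined_encoding(word_label_sets, all_labels):
    """Combined encoding of a sentence: the weight of label_i is the sum over
    words of 1/count(label_m) for every word m carrying label_i (0 otherwise)."""
    weights = {lab: 0.0 for lab in all_labels}
    for labels in word_label_sets:          # one label set per word; empty for 'O'
        for lab in labels:
            weights[lab] += 1.0 / len(labels)
    return [weights[lab] for lab in all_labels]

# Five words; only "Zhou Jielun" carries labels, so each of its four labels
# gets weight 1/4 = 0.25 and every other attribute label gets 0.
combined_encoding(
    [{"singer", "composer", "lyricist", "actor"}, set(), set(), set(), set()],
    ["singer", "composer", "lyricist", "actor"],
)   # -> [0.25, 0.25, 0.25, 0.25]
```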
And after the coding result is obtained, the coding result is subjected to full-connection layer conversion, so that the coding result of the sentence based on the attribute tag is mapped to the entity tag. The entity label is the label for carrying out entity labeling on the words in the sentence. And obtaining a first sentence vector of the sentence after the full-connection layer conversion. The length of the first sentence vector corresponds to the total number of the entity tags, and each bit value of the first sentence vector is the weight value of the entity tag corresponding to the bit in the sentence.
In the embodiment of the present invention, the full connection layer may be trained in advance. The training process may include: the method comprises the steps of taking a sentence marked with an entity label as a training sample in advance, carrying out entity recognition, word segmentation, attribute label marking and combined coding on the sentence in the training sample by using a knowledge graph, taking an obtained coding result as the input of a full-connection layer, taking a first sentence vector formed by the entity labels corresponding to all words in the sentence as the target output of the full-connection layer, and training the full-connection layer. The trained full-link layer is actually used for mapping the coding result to the entity label after combined coding.
The process in this step may be as shown in fig. 6: after the attribute labels of the words in "which movies has Zhou Jielun played" are combinatorially encoded, the resulting encoding passes through the fully connected layer, finally yielding the first sentence vector, denoted S-dit.
At 502, a sentence is encoded based on the sentence structure to obtain a second sentence vector for the sentence.
In this step, the following steps may be specifically executed:
s61, determining word vectors of all words in the sentence.
And S62, inputting the word vector of each word into a pre-trained neural network to obtain a second sentence vector of the sentence.
Specifically, after the word vector of each word is input into a pre-trained neural network, a second expression vector of each word is obtained, and the second expression vector of the last word is used as a second sentence vector of the sentence.
The processing of determining the word vector of each word in the sentence and inputting the word vectors into the pre-trained neural network is the same as the corresponding implementation in step 103 of the embodiment shown in fig. 1 and is not repeated here. The only difference is that after the second expression vector of each word is obtained, the second expression vector of the last word is taken as the second sentence vector of the sentence, and the second expression vectors of the other words are not used in sentence intention recognition. That is, Output5 in fig. 3 is adopted as the second sentence vector of the sentence.
At 503, a first sentence vector and a second sentence vector of the sentence are fused to obtain an intention recognition result for the sentence.
The fusion of the first sentence vector and the second sentence vector in this step is actually the fusion of the intention information obtained based on the knowledge graph and the intention information obtained based on the sentence structure. The knowledge-graph-based entity labeling result has a great influence on intention recognition. Taking "which movies has Zhou Jielun played" as an example, correctly labeling "Zhou Jielun" as "actor" strongly favors the correct intention recognition result "which movies has an actor played"; if the entity "Zhou Jielun" were wrongly labeled as "singer", the correct intention recognition result would probably not be obtained.
Specifically, the present step may include the steps of:
and S71, splicing the first sentence vector and the second sentence vector to obtain a third sentence vector.
In this step, the two vectors may be spliced according to a preset order, so as to obtain a longer vector, which is a third sentence vector.
In addition to the method of splicing the first sentence vector and the second sentence vector, other fusion methods such as superimposing the first sentence vector and the second sentence vector may be used. However, the splicing mode can separately consider the influence based on the knowledge graph and the influence based on the sentence structure, so that different parameters are respectively adopted in the subsequent conversion process of the full-connection layer, and the splicing mode is preferred.
And S72, converting the third sentence vector into a result vector through a full connection layer.
And inputting the third sentence vector into a pre-trained full-connection layer for conversion, so that the third sentence vector is mapped to the sentence intention, and a result vector is obtained after conversion. The length of the result vector corresponds to the number of categories of sentence intentions, and each bit of the result vector corresponds to the score of each category of sentence intentions.
In the embodiment of the present invention, the fully-connected layer may be trained in advance. The training process may include: the sentences of which the sentence intentions are determined are taken as training samples in advance, the steps in the steps 501 and 502 are respectively executed, a first sentence vector and a second sentence vector are respectively obtained aiming at the sentences in the training samples, then the result (namely, a third sentence vector) obtained by splicing the first sentence vector and the second sentence vector is taken as the input of the full connection layer, and the sentence intentions of the sentences are taken as the output of the full connection layer for training. The trained full-link layer is used for mapping a third sentence vector of a sentence to the sentence intention.
And S73, determining the sentence intention according to the result vector.
In this step, the sentence intentions may be determined according to the scores of the sentence intention categories in the result vector, for example, the sentence intention with the highest score may be used as the identified intention of the sentence.
Still taking "what movies were played by shepherd" as an example, as shown in fig. 7, a first sentence vector S-dit and a second sentence vector Output5 of a sentence are spliced to obtain a third sentence vector K. And then inputting the third sentence vector K into the full-connection layer to finally obtain a result vector. The highest scoring sentence intent in the result vector is: "which movies an actor has played".
The above is a detailed description of the method provided by the present invention, and the following is a detailed description of the apparatus provided by the present invention with reference to the examples.
Fig. 8 is a structural diagram of an entity labeling apparatus according to an embodiment of the present invention. As shown in fig. 8, the apparatus may include a first word encoding unit 10, a second word encoding unit 20 and a vector fusion unit 30, and may further include a map preprocessing unit 40. The main functions of each component unit are as follows:
the first word encoding unit 10 is responsible for performing word encoding on the attribute labels of the words in the sentence by using the knowledge graph to obtain a first expression vector of each word.
Specifically, the first word encoding unit 10 may include: a matching subunit 11, a word segmentation subunit 12 and a first word encoding subunit 13.
The matching subunit 11 is responsible for identifying an entity in the sentence and an attribute tag corresponding to the entity by using the knowledge graph. Specifically, the matching subunit 11 may match the sentence in the knowledge graph by using the longest matching rule, and identify an entity in the sentence. For example, each n-gram of a sentence may be obtained, where n is each value of 1 or more. And matching each n-gram with the knowledge graph respectively to see which n-grams are matched with the entities in the knowledge graph, and taking the n-gram with the longest length as the identified entity when a plurality of overlapped n-grams are matched with the entities.
The word segmentation subunit 12 is responsible for segmenting words of the sentence according to the recognition result of the matching subunit 11, and labeling attribute labels for the obtained words. The word segmentation subunit may use the entity identified by the matching subunit 11 as an independent word when segmenting a sentence.
The first word encoding subunit 13 is responsible for performing word encoding on the attribute tags of each word, for example, one-hot encoding may be performed on the attribute tags of each word, and the encoding result is subjected to full-link layer conversion to obtain a first expression vector of each word.
And for the one-hot encoding result, performing conversion of a full connection layer, and aiming at mapping the encoding result of the attribute label of each word to an entity label, wherein the entity label is the label for performing entity labeling on the word in the sentence. And obtaining a first expression vector of each word after full-connection layer conversion.
In the embodiment of the present invention, the full connection layer may be trained in advance. The training process may include: the method comprises the steps of taking a sentence with entity labels as a training sample in advance, using a knowledge graph to carry out entity recognition, word segmentation, attribute label labeling and one-hot coding on the sentence in the training sample, using the sentence as input of a full link layer, outputting a first expression vector formed by the entity labels corresponding to all words in the sentence as target output of the full link layer, and training the full link layer. The trained full link layer is actually used for mapping the coding result to the entity label after one-hot coding.
The map preprocessing unit 40 is responsible for integrating the attribute tags of the entities in the knowledge map in each field to obtain an attribute tag set corresponding to each entity; and storing the attribute label set corresponding to each entity in a key value storage engine. Accordingly, the matching subunit 11 may match the sentences in the key-value storage engine using a longest matching algorithm.
The second word encoding unit 20 is responsible for performing word encoding on each word in the sentence based on the sentence structure to obtain a second expression vector of each word. Specifically, the second word encoding unit 20 may first determine a word vector of each word in the sentence; and then inputting the word vectors into a pre-trained neural network to respectively obtain second expression vectors of all the words.
The second word encoding unit 20 may employ an existing word vector generation tool, such as word2vec, when determining a word vector of each word in a sentence, pre-train the word2vec based on semantics, and then generate a word vector for each word using the word2vec, respectively, with the word vector length corresponding to each word being the same. The method for determining the word vectors is based on semantics, so that the distance between the word vectors can represent the association degree between word semantics, and the distance between the corresponding word vectors is smaller for words with higher association degree between semantics.
The neural network described above may employ, for example, a bidirectional RNN (recurrent neural network), a unidirectional RNN, a CNN (convolutional neural network), or the like. Among them, bidirectional RNN is preferable.
The vector fusion unit 30 is responsible for fusing the first expression vector and the second expression vector to obtain an entity labeling result of the sentence.
Specifically, the vector fusion unit 30 may respectively splice the first expression vector and the second expression vector of each word to obtain a third expression vector of each word; then converting the third expression vector of each word into a result vector of each word through a full-connection layer, wherein the length of the result vector corresponds to the total number of the entity labels, each bit of the result vector corresponds to each entity label, and the value of each bit represents the score of the corresponding entity label; and finally, carrying out entity labeling on the sentence according to the result vector of each word.
When the sentence is subjected to entity tagging according to the result vector of each word, the vector fusion unit 30 may perform entity tagging on each word in the sentence according to the entity tag with the highest score in the result vector of each word.
Fig. 9 is a block diagram of an intention identifying apparatus according to an embodiment of the present invention. As shown in fig. 9, the apparatus may include a first sentence encoding unit 50, a second sentence encoding unit 60 and a vector fusion unit 70, and may further include a map preprocessing unit 80. The main functions of each component unit are as follows:
the first sentence encoding unit 50 is responsible for performing combined encoding on the attribute tags of each word in the sentence by using the knowledge graph to obtain a first sentence vector of the sentence.
The first sentence encoding unit 50 may specifically include: a matching sub-unit 51, a participle sub-unit 52 and a combinatorial coding sub-unit 53.
The matching subunit 51 is responsible for identifying an entity in the sentence and an attribute tag corresponding to the entity by using the knowledge graph. In particular, the matching subunit 51 may match the sentence in the knowledge graph using a longest matching algorithm to identify the entity in the sentence.
The word segmentation subunit 52 is responsible for segmenting words of the sentence by using the recognition result, and labeling attribute labels to the obtained words. Wherein the entities identified by the matching subunit 51 are regarded as independent words in word segmentation.
The combined coding subunit 53 is responsible for performing combined coding on the attribute tags of each word and phrase, and performing full-connection layer conversion on the coding result to obtain a first sentence vector of the sentence, where the length of the first sentence vector corresponds to the total number of the entity tags, and each bit value of the first sentence vector is a weight value of the entity tag corresponding to the bit in the sentence.
The map preprocessing unit 80 is responsible for integrating the attribute tags of each entity in the knowledge map in each field to obtain an attribute tag set corresponding to each entity; and storing the attribute label set corresponding to each entity in a key value storage engine. Accordingly, the matching subunit 51 may match the sentences in the key value storage engine by using a longest matching algorithm.
The second sentence encoding unit 60 is responsible for encoding the sentence based on the sentence structure, resulting in a second sentence vector of the sentence. Specifically, the second sentence encoding unit 60 may first determine a word vector of each word in the sentence; and then inputting the word vector into a pre-trained neural network to obtain a second sentence vector of the sentence.
Wherein the second sentence encoding unit 60 generates a word vector for each word in the sentence, respectively, using word2vec trained in advance based on semantics, when determining the word vector for each word in the sentence.
The neural network described above may employ, for example, a bidirectional RNN (recurrent neural network), a unidirectional RNN, a CNN (convolutional neural network), or the like. Among them, bidirectional RNN is preferable.
When the word vector is input to the pre-trained neural network to obtain a second sentence vector of the sentence, the second sentence encoding unit 60 may specifically input the word vector to the pre-trained neural network to obtain a second expression vector of each word; and taking the second expression vector of the last word as a second sentence vector of the sentence.
The vector fusion unit 70 is responsible for fusing the first sentence vector and the second sentence vector of the sentence to obtain the intention recognition result of the sentence. Specifically, the first sentence vector and the second sentence vector may be spliced to obtain a third sentence vector; converting the third sentence vector into a result vector through a full connection layer, wherein the length of the result vector corresponds to the category number of the sentence intentions, each bit of the result vector corresponds to each category of sentence intentions, and the value of each bit represents the score of the corresponding sentence intentions; sentence intent is determined from the result vector.
The vector fusing unit 70 may determine the sentence intention according to the result vector, and may use the sentence intention with the highest score in the result vector as the sentence intention of the sentence.
The method for entity labeling and intention recognition can be applied to various scenes based on natural language processing, and an example of an application scene is as follows:
In the field of intelligent question answering, for example, a user inputs the question "which movies has Zhou Jielun played" in an intelligent question-answering client on a mobile phone. After the entity labeling and intention identification described above, the entity "Zhou Jielun" is labeled "Actor_name" and the intention is "which movies has an actor played". The processing logic for this intention is to look up, in a movie database, the movie names corresponding to the entity labeled "Actor_name" in the sentence. Suppose the movie names found for "Zhou Jielun" in the movie database are "Secret", "The Treasure Hunter", "The Rooftop", "Curse of the Golden Flower" and so on; the intelligent question-answering client can then directly return the answer to the user: "Secret", "The Treasure Hunter", "The Rooftop", "Curse of the Golden Flower", ….
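Such downstream processing might dispatch on the recognized intention; a sketch under assumed names (the handler table and movie database are hypothetical):

```python
# Illustrative handler table and movie database (names are assumptions).
MOVIE_DB = {"Zhou Jielun": ["Secret", "The Treasure Hunter",
                            "The Rooftop", "Curse of the Golden Flower"]}

def handle_actor_movies(words, entity_tags):
    """Handler for the intent 'which movies has an actor played': look up the
    movie names of every word labeled Actor_name."""
    actors = [w for w, tag in zip(words, entity_tags) if tag == "Actor_name"]
    return [title for actor in actors for title in MOVIE_DB.get(actor, [])]

INTENT_HANDLERS = {"which movies has an actor played": handle_actor_movies}

INTENT_HANDLERS["which movies has an actor played"](
    ["Zhou Jielun", "played", "over", "which", "movies"],
    ["Actor_name", "O", "O", "O", "O"],
)   # -> ["Secret", "The Treasure Hunter", "The Rooftop", "Curse of the Golden Flower"]
```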
Fig. 10 exemplarily shows an example apparatus 1000 in accordance with various embodiments. The apparatus 1000 may include one or more processors 1002, system control logic 1001 coupled to at least one processor 1002, non-volatile memory (NVM)/storage 1004 coupled to the system control logic 1001, and a network interface 1006 coupled to the system control logic 1001.
The processor 1002 may include one or more single-core or multi-core processors. The processor 1002 may comprise any combination of general purpose processors or dedicated processors (e.g., image processor, application processor, baseband processor, etc.).
System control logic 1001, in one embodiment, may include any suitable interface controllers to provide any suitable interface to at least one of processors 1002 and/or to any suitable device or component in communication with system control logic 1001.
The system control logic 1001, for one embodiment, may include one or more memory controllers to provide an interface to a system memory 1003. System memory 1003 is used to load and store data and/or instructions. For example, corresponding to apparatus 1000, in one embodiment, system memory 1003 may include any suitable volatile memory.
The NVM/memory 1004 may include one or more tangible, non-transitory computer-readable media for storing data and/or instructions. For example, the NVM/memory 1004 may include any suitable non-volatile storage device, such as one or more Hard Disk Drives (HDDs), one or more Compact Disks (CDs), and/or one or more Digital Versatile Disks (DVDs).
The NVM/memory 1004 may include storage resources that are physically part of a device on which the system is installed or may be accessed, but not necessarily part of a device. For example, the NVM/memory 1004 may be network accessible via the network interface 1006.
System memory 1003 and NVM/storage 1004 may include copies of temporary or persistent instructions 1010, respectively. The instructions 1010 may include instructions that, when executed by at least one of the processors 1002, cause the device 1000 to implement one or a combination of the methods described in fig. 1 or fig. 5. In various embodiments, the instructions 1010 or hardware, firmware, and/or software components may additionally/alternatively be disposed in the system control logic 1001, the network interface 1006, and/or the processor 1002.
Network interface 1006 may include a receiver to provide a wireless interface for device 1000 to communicate with one or more networks and/or any suitable devices. Network interface 1006 may include any suitable hardware and/or firmware. The network interface 1006 may include multiple antennas to provide a multiple-input multiple-output wireless interface. In one embodiment, network interface 1006 may include a network adapter, a wireless network adapter, a telephone modem, and/or a wireless modem.
In one embodiment, at least one of the processors 1002 may be packaged together with logic for one or more controllers of system control logic. In one embodiment, at least one of the processors may be packaged together with logic for one or more controllers of system control logic to form a system in a package. In one embodiment, at least one of the processors may be integrated on the same die with logic for one or more controllers of system control logic. In one embodiment, at least one of the processors may be integrated on the same die with logic for one or more controllers of system control logic to form a system chip.
The apparatus 1000 may further include an input/output device 1005. Input/output devices 1005 may include a user interface intended to enable a user to interact with device 1000, may include a peripheral component interface designed to enable peripheral components to interact with the system, and/or may include sensors intended to determine environmental conditions and/or location information about device 1000.
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described device embodiments are merely illustrative, and for example, the division of the units is only one logical functional division, and other divisions may be realized in practice.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
The integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions enabling a computer device (which may be a personal computer, a server, or a network device) or a processor to execute some steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (46)

1. An entity labeling method, characterized in that the method comprises:
carrying out word encoding on attribute labels of at least part of words in the sentence by using a knowledge graph to obtain first expression vectors of at least part of words; the knowledge graph is divided by fields, and each entity, the attribute label corresponding to each entity and the relationship between the entities are stored in the knowledge graph;
performing word coding on at least part of words in the sentence based on the sentence structure to obtain second expression vectors of at least part of words;
and fusing the first expression vector and the second expression vector to obtain an entity labeling result of the sentence.
2. The method of claim 1, wherein the word encoding attribute tags of at least some words in the sentence using the knowledge-graph comprises:
identifying an entity in the sentence and an attribute tag corresponding to the entity by using a knowledge graph;
utilizing the recognition result to perform word segmentation on the sentence, and labeling attribute labels on at least part of the obtained words;
and performing word encoding on the attribute labels of at least part of words, and performing full-connection layer conversion on the encoding result to obtain first expression vectors of at least part of words.
3. The method of claim 2, wherein identifying entities in the sentence using a knowledge graph comprises:
and matching the sentences in the knowledge graph by adopting a longest matching principle, and identifying the entities in the sentences.
4. The method of claim 3, further comprising: integrating attribute tags of each entity in each field in the knowledge graph to obtain an attribute tag set corresponding to each entity; storing the attribute label sets corresponding to the entities in a key value storage engine;
the matching the sentences in the knowledge graph by adopting the longest matching principle comprises the following steps: and matching the sentences in the key value storage engine by adopting a longest matching principle.
5. The method of claim 2, wherein tokenizing the sentence using the recognition result comprises:
the sentence is participled, wherein the identified entities are treated as independent words.
6. The method of claim 2, wherein said word encoding attribute tags for at least some words comprises:
and carrying out one-hot coding on the attribute labels of at least part of words.
7. The method of claim 1, wherein performing word encoding on at least some of the words in the sentence based on the sentence structure comprises:
determining word vectors of at least some of the words in the sentence; and
inputting the word vectors into a pre-trained neural network to obtain the second expression vector of each of those words.
8. The method of claim 7, wherein determining the word vectors comprises:
generating a word vector for each of those words by using a semantically pre-trained word2vec model.
9. The method of claim 7, wherein the neural network comprises a bidirectional recurrent neural network.
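Claims 7 to 9 describe the structure-based path. The following PyTorch sketch uses assumed sizes, an LSTM as one common choice of bidirectional recurrent network, and random tensors in place of pre-trained word2vec lookups; none of these specifics are fixed by the claims.

    import torch
    import torch.nn as nn

    class ContextEncoder(nn.Module):
        """Word vectors in, one context-aware second expression vector
        per word out (claims 7-9)."""
        def __init__(self, emb_dim=100, hidden=64):
            super().__init__()
            self.rnn = nn.LSTM(emb_dim, hidden, bidirectional=True,
                               batch_first=True)

        def forward(self, word_vectors):  # (batch, seq_len, emb_dim)
            outputs, _ = self.rnn(word_vectors)
            return outputs  # (batch, seq_len, 2 * hidden)

    word_vectors = torch.randn(1, 6, 100)  # stand-in for a 6-word sentence
    second_vectors = ContextEncoder()(word_vectors)
    print(second_vectors.shape)  # torch.Size([1, 6, 128])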
10. The method of claim 1, wherein fusing the first expression vectors and the second expression vectors to obtain the entity labeling result of the sentence comprises:
splicing the first expression vector and the second expression vector of each word to obtain a third expression vector of that word;
converting the third expression vectors into result vectors through a fully connected layer, wherein the length of each result vector equals the total number of entity labels, each position of the result vector corresponds to one entity label, and the value at each position represents the score of the corresponding entity label; and
performing entity labeling on the sentence according to the result vectors.
11. The method of claim 10, wherein performing entity labeling on the sentence according to the result vectors comprises:
labeling each of those words with the entity label that has the highest score in the word's result vector.
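Claims 10 and 11 reduce to a concatenation, a fully connected projection, and a per-word arg-max. A sketch with an invented label set, reusing the dimensions of the earlier sketches:

    import torch
    import torch.nn as nn

    ENTITY_LABELS = ["B-PER", "I-PER", "B-MOV", "I-MOV", "O"]  # invented labels

    score_layer = nn.Linear(16 + 128, len(ENTITY_LABELS))  # fully connected layer

    def label_words(first_vecs, second_vecs):
        """Splice the two expression vectors of each word into a third
        expression vector, project to per-label scores, take the arg-max
        (claims 10-11)."""
        third = torch.cat([first_vecs, second_vecs], dim=-1)
        result = score_layer(third)  # (seq_len, number of entity labels)
        return [ENTITY_LABELS[int(i)] for i in result.argmax(dim=-1)]

    print(label_words(torch.randn(6, 16), torch.randn(6, 128)))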
12. An intent recognition method, comprising:
performing combined encoding on attribute tags of at least some of the words in a sentence by using a knowledge graph, to obtain a first sentence vector of the sentence, wherein the knowledge graph is divided by domain and stores each entity, the attribute tags corresponding to each entity, and the relationships between entities;
encoding the sentence based on the sentence structure to obtain a second sentence vector of the sentence; and
fusing the first sentence vector and the second sentence vector to obtain an intent recognition result for the sentence.
13. The method of claim 12, wherein performing combined encoding on the attribute tags of at least some of the words in the sentence by using the knowledge graph comprises:
identifying entities in the sentence and the attribute tags corresponding to the entities by using the knowledge graph;
segmenting the sentence into words by using the entities and their corresponding attribute tags, and labeling at least some of the resulting words with attribute tags; and
performing combined encoding on the attribute tags of those words, and passing the encoding result through a fully connected layer to obtain the first sentence vector, wherein the length of the first sentence vector equals the total number of entity tags, and the value at each position of the first sentence vector is the weight of the corresponding entity tag in the sentence.
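Claim 13's combined encoding can be pictured as pooling the words' tag codes into one sentence-level code and re-weighting it so that each position remains the weight of one tag. The sketch below assumes an invented tag vocabulary, a random square fully connected layer, and a softmax as one plausible normalization; none of these specifics come from the claims.

    import numpy as np

    TAG_VOCAB = ["singer", "actor", "movie", "song"]  # invented tag vocabulary

    def first_sentence_vector(tag_sets):
        """Pool per-word tag codes, then re-weight through a square fully
        connected layer so each position stays the weight of one tag."""
        counts = np.zeros(len(TAG_VOCAB))
        for tags in tag_sets:  # one tag set per word of the sentence
            for t in tags:
                counts[TAG_VOCAB.index(t)] += 1.0
        rng = np.random.default_rng(1)
        W = rng.normal(size=(len(TAG_VOCAB), len(TAG_VOCAB)))  # learned in practice
        scores = counts @ W
        e = np.exp(scores - scores.max())
        return e / e.sum()  # per-tag weights summing to one

    print(first_sentence_vector([{"singer"}, set(), {"movie", "song"}]))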
14. The method of claim 13, wherein identifying entities in the sentence by using the knowledge graph comprises:
matching the sentence against the knowledge graph according to a longest-match principle to identify the entities in the sentence.
15. The method of claim 14, further comprising: integrating the attribute tags of each entity across the domains of the knowledge graph to obtain an attribute tag set corresponding to each entity, and storing the attribute tag sets in a key-value storage engine;
wherein matching the sentence against the knowledge graph according to the longest-match principle comprises: matching the sentence against the key-value storage engine according to the longest-match principle.
16. The method of claim 12, wherein encoding the sentence based on the sentence structure to obtain the second sentence vector of the sentence comprises:
determining word vectors of at least some of the words in the sentence; and
inputting the word vectors into a pre-trained neural network to obtain the second sentence vector of the sentence.
17. The method of claim 16, wherein determining the word vectors comprises:
generating a word vector for each of those words by using a semantically pre-trained word2vec model.
18. The method of claim 16, wherein the neural network comprises a bidirectional recurrent neural network.
19. The method of claim 16, wherein inputting the word vectors into the pre-trained neural network to obtain the second sentence vector comprises:
inputting the word vectors into the pre-trained neural network to obtain a second expression vector for each of those words; and
taking the second expression vector of the last word as the second sentence vector of the sentence.
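Claim 19 keeps the last word's second expression vector as the sentence vector. Continuing the PyTorch sketch from the entity labeling claims (sizes assumed):

    import torch
    import torch.nn as nn

    rnn = nn.LSTM(input_size=100, hidden_size=64, bidirectional=True,
                  batch_first=True)
    word_vectors = torch.randn(1, 6, 100)  # stand-in word2vec vectors
    outputs, _ = rnn(word_vectors)  # (1, 6, 128): one vector per word
    second_sentence_vector = outputs[:, -1, :]  # keep the last word's vector
    print(second_sentence_vector.shape)  # torch.Size([1, 128])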
20. The method of claim 12, wherein fusing the first sentence vector and the second sentence vector to obtain the intent recognition result comprises:
splicing the first sentence vector and the second sentence vector to obtain a third sentence vector;
converting the third sentence vector into a result vector through a fully connected layer, wherein the length of the result vector equals the number of sentence intent categories, each position of the result vector corresponds to one intent category, and the value at each position represents the score of the corresponding intent; and
determining the sentence intent from the result vector.
21. The method of claim 20, wherein determining the sentence intent from the result vector comprises:
taking the intent with the highest score in the result vector as the intent of the sentence.
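Claims 20 and 21 mirror the entity labeling fusion at sentence level. A sketch with an invented intent set, reusing the dimensions of the earlier sketches (a 4-dim first sentence vector and a 128-dim second sentence vector):

    import torch
    import torch.nn as nn

    INTENTS = ["play_music", "book_ticket", "ask_weather"]  # invented intents

    fuse = nn.Linear(4 + 128, len(INTENTS))  # fully connected layer of claim 20

    def classify_intent(first_vec, second_vec):
        """Splice the two sentence vectors into a third sentence vector,
        project to one score per intent category, pick the highest
        (claims 20-21)."""
        third = torch.cat([first_vec, second_vec], dim=-1)
        result = fuse(third)
        return INTENTS[int(result.argmax())]

    print(classify_intent(torch.randn(4), torch.randn(128)))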
22. An entity labeling apparatus, characterized in that the apparatus comprises:
a first word encoding unit, configured to perform word encoding on attribute tags of at least some of the words in a sentence by using a knowledge graph, to obtain first expression vectors of those words, wherein the knowledge graph is divided by domain and stores each entity, the attribute tags corresponding to each entity, and the relationships between entities;
a second word encoding unit, configured to perform word encoding on at least some of the words in the sentence based on the sentence structure, to obtain second expression vectors of those words; and
a vector fusion unit, configured to fuse the first expression vectors and the second expression vectors to obtain an entity labeling result for the sentence.
23. The apparatus of claim 22, wherein the first word encoding unit comprises:
a matching subunit, configured to identify entities in the sentence and the attribute tags corresponding to the entities by using the knowledge graph;
a word segmentation subunit, configured to segment the sentence into words by using the identification result of the matching subunit and to label at least some of the resulting words with attribute tags; and
a first word encoding subunit, configured to perform word encoding on the attribute tags of those words and to pass the encoding result through a fully connected layer to obtain the first expression vectors of those words.
24. The apparatus of claim 23, wherein the matching subunit is specifically configured to:
match the sentence against the knowledge graph according to a longest-match principle to identify the entities in the sentence.
25. The apparatus of claim 24, further comprising:
a graph preprocessing unit, configured to integrate the attribute tags of each entity across the domains of the knowledge graph to obtain an attribute tag set corresponding to each entity, and to store the attribute tag sets in a key-value storage engine;
wherein the matching subunit matches the sentence against the key-value storage engine according to the longest-match principle.
26. The apparatus of claim 23, wherein the word segmentation subunit is specifically configured to: segment the sentence into words, with each entity identified by the matching subunit treated as a single independent word.
27. The apparatus of claim 23, wherein the first word encoding subunit performs word encoding on the attribute tags of those words by one-hot encoding them.
28. The apparatus of claim 22, wherein the second word encoding unit is specifically configured to:
determine word vectors of at least some of the words in the sentence; and
input the word vectors into a pre-trained neural network to obtain the second expression vector of each of those words.
29. The apparatus of claim 28, wherein the second word encoding unit determines the word vectors by generating a word vector for each of those words using a semantically pre-trained word2vec model.
30. The apparatus of claim 28, wherein the neural network comprises a bidirectional recurrent neural network.
31. The apparatus of claim 22, wherein the vector fusion unit is specifically configured to:
splice the first expression vector and the second expression vector of each word to obtain a third expression vector of that word;
convert the third expression vectors into result vectors through a fully connected layer, wherein the length of each result vector equals the total number of entity labels, each position of the result vector corresponds to one entity label, and the value at each position represents the score of the corresponding entity label; and
perform entity labeling on the sentence according to the result vectors.
32. The apparatus of claim 31, wherein the vector fusion unit performs entity labeling on the sentence by labeling each of those words with the entity label that has the highest score in the word's result vector.
33. An intent recognition apparatus, characterized in that the apparatus comprises:
a first sentence encoding unit, configured to perform combined encoding on attribute tags of at least some of the words in a sentence by using a knowledge graph, to obtain a first sentence vector of the sentence, wherein the knowledge graph is divided by domain and stores each entity, the attribute tags corresponding to each entity, and the relationships between entities;
a second sentence encoding unit, configured to encode the sentence based on the sentence structure to obtain a second sentence vector of the sentence; and
a vector fusion unit, configured to fuse the first sentence vector and the second sentence vector to obtain an intent recognition result for the sentence.
34. The apparatus of claim 33, wherein the first sentence encoding unit comprises:
a matching subunit, configured to identify entities in the sentence and the attribute tags corresponding to the entities by using the knowledge graph;
a word segmentation subunit, configured to segment the sentence into words by using the entities and their corresponding attribute tags and to label at least some of the resulting words with attribute tags; and
a combined encoding subunit, configured to perform combined encoding on the attribute tags of those words and to pass the encoding result through a fully connected layer to obtain the first sentence vector, wherein the length of the first sentence vector equals the total number of entity tags, and the value at each position of the first sentence vector is the weight of the corresponding entity tag in the sentence.
35. The apparatus of claim 34, wherein the matching subunit is specifically configured to:
match the sentence against the knowledge graph according to a longest-match principle to identify the entities in the sentence.
36. The apparatus of claim 35, further comprising:
a graph preprocessing unit, configured to integrate the attribute tags of each entity across the domains of the knowledge graph to obtain an attribute tag set corresponding to each entity, and to store the attribute tag sets in a key-value storage engine;
wherein the matching subunit matches the sentence against the key-value storage engine according to the longest-match principle.
37. The apparatus of claim 33, wherein the second sentence encoding unit is specifically configured to:
determine word vectors of at least some of the words in the sentence; and
input the word vectors into a pre-trained neural network to obtain the second sentence vector of the sentence.
38. The apparatus of claim 37, wherein the second sentence encoding unit determines the word vectors by generating a word vector for each of those words using a semantically pre-trained word2vec model.
39. The apparatus of claim 38, wherein the neural network comprises a bidirectional recurrent neural network.
40. The apparatus of claim 37, wherein the second sentence encoding unit is specifically configured to:
input the word vectors into the pre-trained neural network to obtain a second expression vector for each of those words; and
take the second expression vector of the last word as the second sentence vector of the sentence.
41. The apparatus of claim 33, wherein the vector fusion unit is specifically configured to:
splice the first sentence vector and the second sentence vector to obtain a third sentence vector;
convert the third sentence vector into a result vector through a fully connected layer, wherein the length of the result vector equals the number of sentence intent categories, each position of the result vector corresponds to one intent category, and the value at each position represents the score of the corresponding intent; and
determine the sentence intent from the result vector.
42. The apparatus of claim 41, wherein the vector fusion unit determines the sentence intent by taking the intent with the highest score in the result vector as the intent of the sentence.
43. An apparatus, comprising:
a memory storing one or more programs; and
one or more processors coupled to the memory, which execute the one or more programs to perform the operations performed in the method of any one of claims 1 to 11.
44. An apparatus, comprising:
a memory storing one or more programs; and
one or more processors coupled to the memory, which execute the one or more programs to perform the operations performed in the method of any one of claims 12 to 21.
45. A computer storage medium encoded with a computer program which, when executed by one or more computers, causes the one or more computers to perform the operations performed in the method of any one of claims 1 to 11.
46. A computer storage medium encoded with a computer program which, when executed by one or more computers, causes the one or more computers to perform the operations performed in the method of any one of claims 12 to 21.

Priority Applications (2)

- CN201710655187.2A (priority date 2017-08-03, filed 2017-08-03): Entity marking method, intention identification method, corresponding device and computer storage medium
- PCT/CN2018/096640, published as WO2019024704A1 (priority date 2017-08-03, filed 2018-07-23): Entity annotation method, intention recognition method and corresponding devices, and computer storage medium

Publications (2)

- CN109388793A, published 2019-02-26 (application publication)
- CN109388793B, published 2023-04-07 (granted patent)

Family ID: 65233308

Family Applications (1)

- CN201710655187.2A (priority date 2017-08-03, filed 2017-08-03), status: Active

Country Status (2)

- CN: CN109388793B
- WO: WO2019024704A1



Family Cites Families (1)

- US10509889B2 (2019-12-17): Data processing system and method for computer-assisted coding of natural language medical text *

Patent Citations (12)

(* cited by examiner, † cited by third party)

- US9367608B1 (2016-06-14): System and methods for searching objects and providing answers to queries using association data *
- CN105095195A (2015-11-25): Method and system for human-machine question answering based on a knowledge graph *
- CN105117487A (2015-12-02): Book semantic retrieval method based on content structures *
- CN106649394A (2017-05-10): Fusion knowledge base processing method and device and knowledge base management system *
- WO2017076263A1 (2017-05-11): Method and device for integrating knowledge bases, knowledge base management system and storage medium *
- CN105335519A (2016-02-17): Model generation method and device, and recommendation method and device *
- CN106815192A (2017-06-09): Model training method and device, and sentence emotion identification method and device *
- CN106815252A (2017-06-09): Searching method and equipment *
- CN106776711A (2017-05-31): Chinese medical knowledge graph construction method based on deep learning *
- CN106776562A (2017-05-31): Keyword extraction method and extraction system *
- CN106897568A (2017-06-27): Method and apparatus for structuring medical records *
- CN106875940A (2017-06-20): Neural-network-based machine self-learning method for knowledge graph construction and training *

Non-Patent Citations (2)

- Liu Yujiao et al., "Named entity recognition for Chinese microblogs based on deep learning," Journal of Sichuan University (Engineering Science Edition), June 30, 2016 (entire document) *
- Liu Kang et al., "Research progress and prospects of knowledge base question answering based on representation learning," Acta Automatica Sinica, No. 06, May 17, 2016 (entire document) *



Legal Events

- PB01: Publication
- SE01: Entry into force of request for substantive examination
- GR01: Patent grant