CN109614618B - Method and device for processing out-of-set words based on multiple semantics - Google Patents
- Publication number
- CN109614618B CN109614618B CN201811498210.2A CN201811498210A CN109614618B CN 109614618 B CN109614618 B CN 109614618B CN 201811498210 A CN201811498210 A CN 201811498210A CN 109614618 B CN109614618 B CN 109614618B
- Authority
- CN
- China
- Prior art keywords
- word
- semantic
- vector
- sense
- meaning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
Abstract
The embodiments of the present application provide a method and a device for processing out-of-set words based on multiple semantics. The method comprises the following steps: obtaining the weight of each sense of the out-of-set word according to the context words of the out-of-set word in the sentence; generating a semantic vector for each sense according to the word vectors of the sememes in that sense; and weighting and summing the semantic vectors of the senses according to their weights to generate a simulated word vector. The simulated word vector generated by the technical scheme of the present application matches the sentence meaning while taking the other senses of the out-of-set word into account, so that the semantics it expresses are richer and fuller and suit a wider range of semantic environments. When the simulated word vector is used in an intelligent interaction system, responses are highly relevant to the questions, response accuracy is improved, the system adapts to a richer dialogue environment, the intelligent question-answering system becomes more intelligent, user satisfaction is greatly improved, and the out-of-vocabulary problem in the prior art is solved.
Description
The present application claims priority to Chinese patent application No. 201810556386.2, entitled "Multi-semantics-based out-of-set word processing method, intelligent question-answering method and apparatus", filed with the Chinese Patent Office on June 1, 2018, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the technical field of natural language processing, and in particular to a method and a device for processing out-of-set words based on multiple semantics.
Background
With the development of natural language processing technology, dialogue systems built on it are widely used; common dialogue systems such as chat robots can automatically generate corresponding responses according to the chat content input by the user.
In the prior art, dialogue systems can be classified, according to how they respond, into retrieval-based dialogue systems built on a knowledge base and generative dialogue systems based on deep learning models. In a dialogue system based on a deep learning model, a dialogue model based on an RNN (Recurrent Neural Network) is established and trained on a large corpus, so that the dialogue model can learn latent response patterns for unseen dialogues from question-answer pairs, and the answer content is not limited to the knowledge in the training corpus.
When a dialogue system based on a deep learning model performs corpus training and generates responses, word vectors are its objects of operation; a word vector is a mathematical representation of a word segment in the corpus. The contribution of word vectors to deep learning is that the distance between two word segments can be obtained by computing the cosine angle or the Euclidean distance between their word vectors: the smaller the distance, the higher the similarity between the two word segments. During training, a word vector space containing the word vectors of the known word segments is generated from the training corpus; during response, the answer to a question is generated by a machine learning algorithm according to the distances between the word vectors of the question's word segments and those of the known word segments.
However, a word vector space obtained from corpus training covers business terms, dialect vocabulary, foreign-language words and compound words in professional fields poorly, so in an open dialogue system whose question content is unrestricted, the dialogue system often encounters an out-of-vocabulary (OOV) word, that is, a word segment not included in the word vector space. When the dialogue system encounters a question containing an out-of-set word, the accuracy of its response decreases; this condition is known as the out-of-vocabulary (OOV) problem. The prior art currently lacks an effective solution to it.
Disclosure of Invention
The embodiments of the present application provide a method and a device for processing out-of-set words based on multiple semantics, which are used to solve the out-of-vocabulary problem in the prior art.
In a first aspect, an embodiment of the present application provides a method for processing out-of-set words based on multiple semantics, comprising:
obtaining the weight of each sense of the out-of-set word according to the context words of the out-of-set word in the sentence, the context words comprising at least one preceding word and at least one following word of the out-of-set word in the sentence;
generating a semantic vector for each sense according to the word vectors of the sememes in that sense;
and weighting and summing the semantic vectors of the senses according to their weights to generate a simulated word vector.
In a second aspect, an embodiment of the present application provides an intelligent question-answering method that applies the multi-semantics-based out-of-set word processing method provided by the embodiments of the present application, comprising:
obtaining an out-of-set word from the word segmentation result of an unknown question;
generating a simulated word vector of the out-of-set word based on its multiple senses;
and matching an answer to the question from a trained question-answering model according to the simulated word vector and the word vectors of the remaining word segments in the question.
In a third aspect, an embodiment of the present application provides a device for processing out-of-set words based on multiple semantics, comprising:
a sense weight obtaining unit, configured to obtain the weight of each sense of the out-of-set word according to the context words of the out-of-set word in the sentence, the context words comprising at least one preceding word and at least one following word of the out-of-set word in the sentence;
a semantic vector generating unit, configured to generate a semantic vector for each sense according to the word vectors of the sememes in that sense;
and a simulated word vector generating unit, configured to weight and sum the semantic vectors of the senses according to their weights to generate a simulated word vector.
In summary, the embodiments of the present application provide a multi-semantics-based out-of-set word processing method, an intelligent question-answering method, and corresponding devices, comprising: obtaining the weight of each sense of the out-of-set word according to the context words of the out-of-set word in the sentence, the context words comprising at least one preceding word and at least one following word of the out-of-set word in the sentence; generating a semantic vector for each sense according to the word vectors of the sememes in that sense; and weighting and summing the semantic vectors of the senses according to their weights to generate a simulated word vector. The simulated word vector generated by this technical scheme is based on several senses of the out-of-set word, fused with different weights according to the semantic correlation between the out-of-set word and its context words; it therefore matches the sentence meaning while still taking the other senses of the out-of-set word into account, so that the semantics it expresses are richer and fuller and suit a wider range of semantic environments. When the simulated word vector generated by the embodiments of the present application is used in an intelligent interaction system, responses are highly relevant to the questions, response accuracy is improved, the system adapts to a richer dialogue environment, the intelligent question-answering system becomes more intelligent, user satisfaction is greatly improved, and the out-of-vocabulary problem in the prior art is solved.
Drawings
In order to illustrate the technical solutions of the present application more clearly, the drawings needed in the embodiments are briefly described below; it will be obvious to those skilled in the art that other drawings can be derived from these drawings without inventive effort.
FIG. 1 is a flowchart of a multi-semantics-based out-of-set word processing method according to an embodiment of the present application;
FIG. 2 is a flowchart of step S110 of the multi-semantics-based out-of-set word processing method according to an embodiment of the present application;
FIG. 3 is a flowchart of step S111 of the multi-semantics-based out-of-set word processing method according to an embodiment of the present application;
FIG. 4 is a flowchart of step S112 of the multi-semantics-based out-of-set word processing method according to an embodiment of the present application;
FIG. 5 is a flowchart of step S120 of the multi-semantics-based out-of-set word processing method according to an embodiment of the present application;
FIG. 6 is a flowchart of an intelligent interaction method according to an embodiment of the present application;
FIG. 7 is a block diagram of a multi-semantics-based out-of-set word processing device according to an embodiment of the present application;
FIG. 8 is a block diagram of an intelligent interaction device according to an embodiment of the present application.
Detailed Description
In order to better understand the technical solutions in the present application, the technical solutions in the embodiments of the present application are described below clearly and completely with reference to the drawings in those embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments herein without inventive effort shall fall within the scope of protection of the present application.
When a dialogue system based on a deep learning model performs corpus training and generates responses, word vectors are its objects of operation; a word vector is a mathematical representation of a word segment in the corpus. The contribution of word vectors to deep learning is that the distance between two word segments can be obtained by computing the cosine angle or the Euclidean distance between their word vectors: the smaller the distance, the higher the similarity between the two word segments.
In the field of natural language processing, one type of word vector is the One-Hot Representation. Its dimensionality is determined by the number of known word segments in the segmentation dictionary, and each dimension represents one word segment, so in a one-hot word vector only one dimension has the value 1 and all other dimensions are 0. Since the number of known word segments in a segmentation dictionary is usually large, one-hot word vectors have very high dimensionality. However, high-dimensional word vectors suffer from the curse of dimensionality when applied to deep learning, and because every word independently occupies one dimension, the similarity between two words is hard to reflect; this type is therefore ill-suited to deep learning models.
Thus, in dialogue systems based on deep learning models, another type of word vector is generally used: the Distributed Representation. Through corpus training, each word is mapped to a fixed-length low-dimensional real-valued vector, and all distributed word vectors together form a word vector space in which each word vector corresponds to one point; for example, a word vector may look like [0.792, -0.177, -0.107, 0.109, ...]. In this space, the distance between two points represents the similarity between two word segments, and can be expressed by the cosine angle or the Euclidean distance between the two word vectors. Because of these characteristics, the Distributed Representation type of word vector is preferred in the present application.
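As a minimal sketch of these two similarity measures (the vector values below are arbitrary stand-ins, not embeddings from any real training run):

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two word vectors; closer to 1 means more similar."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def euclidean_distance(u, v):
    """Euclidean distance between two word vectors; smaller means more similar."""
    return float(np.linalg.norm(np.asarray(u, dtype=float) - np.asarray(v, dtype=float)))

# Two hypothetical low-dimensional distributed word vectors
w1 = [0.792, -0.177, -0.107, 0.109]
w2 = [0.801, -0.150, -0.090, 0.120]
sim = cosine_similarity(w1, w2)    # close to 1: the two word segments are similar
dist = euclidean_distance(w1, w2)  # small: the two points lie close together
```

Either measure can serve as the "distance" the description refers to; the rest of the examples here use the cosine form.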
In the prior art, because of limits on corpus size and content richness, the word vector space covers business terms, dialect vocabulary, foreign-language words and compound words in professional fields poorly, so in an open dialogue system whose question content is unrestricted, the dialogue system frequently encounters out-of-vocabulary (OOV) words. Because an out-of-set word does not exist in the word vector space, when the dialogue system encounters a question containing one, it cannot perform answer matching using the word vector space and therefore cannot answer the question.
To address this out-of-vocabulary problem, one prior scheme is as follows: when a question posed by the user contains an out-of-set word, a random word vector is generated for it, mapping to some point in the word vector space, and word vector matching is then performed with this random vector standing in for the out-of-set word, so that a response can be given. This scheme overcomes the inability of prior-art deep-learning dialogue systems to respond to out-of-set words at all; however, because the word vector is randomly generated and therefore arbitrary, the quality of the response content is not guaranteed, and the out-of-vocabulary problem is not thoroughly solved.
Embodiment 1
In order to solve the out-of-vocabulary problem in the prior art, an embodiment of the present application provides a method for processing out-of-set words based on multiple semantics. Referring to FIG. 1, a flowchart of the method, the method comprises the following steps:
Step S110: obtaining the weight of each sense of the out-of-set word according to the context words of the out-of-set word in the sentence;
the foreign word usually contains a plurality of semantics, and in this application, the semantics of the foreign word can be obtained from a knowledge network (english name is HowNet), which is a common sense knowledge base that uses concepts represented by terms of chinese and english as description objects to reveal the relationship between concepts and the attributes of the concepts as basic content. In the knowledge network, the meaning source is the minimum unit of the most basic meaning which is not easy to subdivide, one word can have a plurality of semantics, each semantic meaning can contain a plurality of meaning sources, for example, the meaning of the word and the meaning source thereof can be expressed in the following form:
wherein each row lists the semantics of a word and the source of meaning for each semantic. In each row, the first column represents the word itself, the second column represents the number of the semantic meaning of the word, and the number of the semantic meaning sources and the content of the semantic meaning sources in each semantic meaning are respectively expressed in a number+the semantic meaning source mode from the second column. For example: the "word" has 6 semantics in total; wherein, the 1 st semantic has 2 sense sources: functional words, progress; the 2 nd semantic has 1 sense origin: functional words; the 3 rd semantic has 1 sense origin: living; etc.
In a sentence, the sense of each word segment is part of the sentence meaning, so in sentences expressing different meanings the senses of the same word segment differ. For example, consider the two sentences below:
Sentence 1: I want my birthday gift to be an Apple computer.
Sentence 2: I like to eat apples.
In the two example sentences, the senses of "apple" are clearly different. Within a sentence, the word segments adjacent to a target word segment are semantically related to it, and together they express the local meaning of the sentence.
For example, in sentence 1 the word segments adjacent to "apple" are "one" and "computer": "apple computer" means a computer of the Apple brand, and "one" is the counting unit of "apple computer", so "one" and "computer" are semantically related to "apple". In sentence 2, "eat" is a verb and is semantically related to "apple".
Based on the facts that the sense of a word segment differs across different sentences and that adjacent word segments in a sentence are semantically related, step S110 obtains the weight of each sense of the out-of-set word according to its context words in the sentence; the weights thus reflect how much each sense of the out-of-set word contributes to the meaning of the particular sentence.
The present application defines the concept of context words: the context words comprise at least one preceding word and at least one following word of the out-of-set word in the sentence. Specifically, taking the out-of-set word as the center, at least one word segment is collected in sequence toward the front of the sentence, and at least one word segment is collected in sequence toward the back of the sentence.
FIG. 2 is a flowchart of step S110 of the multi-semantics-based out-of-set word processing method according to an embodiment of the present application.
In an alternative embodiment, as shown in FIG. 2, step S110 comprises the following steps:
Step S111: obtaining the context words of the out-of-set word in the sentence;
In this application, the context words may be one preceding word and one following word of the out-of-set word in the sentence, two preceding words and two following words, or several preceding words and several following words. FIG. 3 is a flowchart of step S111 of the multi-semantics-based out-of-set word processing method according to an embodiment of the present application.
In order to obtain the context words of the out-of-set word from the sentence in a quantitative and reproducible way, in an alternative embodiment, as shown in FIG. 3, step S111 may comprise the following steps:
Step S1111: setting a word window value C that constrains the number of context words, where C is an integer and C ≥ 1;
In this embodiment, the window value C constrains the number of context words: when the number of word segments both before and after the out-of-set word in the sentence is not less than C, the number of context words is 2C.
Step S1112: obtaining the context words from the word segments of the sentence containing the out-of-set word according to the window value C;
The context words comprise the C word segments immediately before the out-of-set word and the C word segments immediately after it in the sentence.
For example, set the window value C = 1, let the sentence containing the out-of-set word be "I want to buy an apple computer", and let the out-of-set word be "apple".
First, all word segments of the sentence are obtained: I / want to buy / one / apple / computer.
Since the window value C = 1, the context words are the word segment immediately before and the word segment immediately after the out-of-set word, namely "one" and "computer".
As another example, set the window value C = 2, with the same sentence "I want to buy an apple computer" and out-of-set word "apple".
Again, all word segments of the sentence are obtained: I / want to buy / one / apple / computer.
Since the window value C = 2, the context words should be the two word segments before and the two word segments after the out-of-set word. In this sentence, however, only one word segment follows the out-of-set word. For such cases, when obtaining the context words, the present application stops as soon as the beginning or the end of the sentence is reached. Therefore, with the window value C = 2, the context words of "apple" obtained from the sentence are: "want to buy", "one", "computer".
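The window-based context extraction of steps S1111 and S1112 can be sketched as follows; the token list is an English stand-in for the segmented example sentence:

```python
def get_context_words(tokens, oov_index, C):
    """Return up to C word segments before and up to C word segments after
    the out-of-set word, stopping at the beginning and end of the sentence."""
    before = tokens[max(0, oov_index - C):oov_index]
    after = tokens[oov_index + 1:oov_index + 1 + C]
    return before + after

# Segmented sentence "I want to buy an apple computer"; index 3 is the OOV word
tokens = ["I", "want to buy", "one", "apple", "computer"]
ctx_1 = get_context_words(tokens, 3, 1)  # C=1 -> ["one", "computer"]
ctx_2 = get_context_words(tokens, 3, 2)  # C=2 -> ["want to buy", "one", "computer"]
```

With C = 2 only one following word exists, so the slice simply clips at the sentence end, matching the behaviour described above.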
Step S112: obtaining the first-class distance between the context words and each sense;
In the sentence, the context words are semantically related to the out-of-set word. This step obtains the degree of correlation between the context words and each sense of the out-of-set word, so that the weight of each sense can be determined reasonably.
FIG. 4 is a flowchart of step S112 of the multi-semantics-based out-of-set word processing method according to an embodiment of the present application.
In an alternative embodiment, as shown in FIG. 4, step S112 comprises the following steps:
Step S1121: obtaining the cosine distance between each context word segment and each sememe in each sense;
For example, the senses and sememes of "apple" are, in the row format introduced above:
apple 3 | 5: carry, style value, specific brand, computer, able | 1: fruit | 3: tree, fruit, reproduce
that is, "apple" has 3 senses: the first sense has 5 sememes, the second has 1 sememe, and the third has 3 sememes.
With the window value C = 1, the context words of "apple" comprise the word segments "one" and "computer".
The cosine distances between the context word "one" and each sememe in the first sense, denoted COS(context word, sememe), are:
COS(one, carry), COS(one, style value), COS(one, specific brand), COS(one, computer), COS(one, able)
The cosine distances between the context word "computer" and each sememe in the first sense are:
COS(computer, carry), COS(computer, style value), COS(computer, specific brand), COS(computer, computer), COS(computer, able)
The cosine distance between the context word "one" and the sememe in the second sense is:
COS(one, fruit)
The cosine distance between the context word "computer" and the sememe in the second sense is:
COS(computer, fruit)
The cosine distances between the context word "one" and each sememe in the third sense are:
COS(one, tree), COS(one, fruit), COS(one, reproduce)
The cosine distances between the context word "computer" and each sememe in the third sense are:
COS(computer, tree), COS(computer, fruit), COS(computer, reproduce)
Step S1122: obtaining, from the cosine distances, the average distance between each context word segment and the sememes of each sense;
For example, denote the average distance of step S1122 by Da. "apple" has 2 context words and 3 senses, so 6 (2 × 3) average distances Da are obtained in total:
da (one, semantic 1) = [ COS (one, carry) +cos (one, style value) +cos (one, computer) +cos (one, energy) ]/5
Da (computer, semantic 1) = [ COS (computer, carry) +cos (computer, style value) +cos (computer ) +cos (computer, energy) ]/5
Da (one, semantic 2) =cos (one, fruit)
Da (computer, semantic 2) =cos (computer, fruit)
Da (one, semantic 3) = [ COS (one, tree) +cos (one, fruit) +cos (one, reproduction) ]/3
Da (computer, semantic 3) = [ COS (computer, tree) +cos (computer, fruit) +cos (computer, reproduction) ] ≡3
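Steps S1121 and S1122 can be sketched as follows; the sememe and context-word vectors are arbitrary stand-in values, not trained embeddings:

```python
import numpy as np

def cos(u, v):
    """Cosine distance COS(context word, sememe) between two word vectors."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def average_distance(context_vec, sememe_vecs):
    """Da(context word, sense): average cosine distance between one context
    word and all sememe word vectors of one sense."""
    return sum(cos(context_vec, s) for s in sememe_vecs) / len(sememe_vecs)

one_vec = [0.1, 0.9, 0.2, 0.0]       # stand-in vector for the context word "one"
sense_2 = [[0.1, 0.8, 0.3, 0.1]]     # single sememe "fruit"
sense_3 = [[0.2, 0.1, 0.9, 0.0],     # sememes "tree", "fruit", "reproduce"
           [0.1, 0.8, 0.3, 0.1],
           [0.0, 0.3, 0.5, 0.7]]

da_2 = average_distance(one_vec, sense_2)  # a one-sememe sense: Da = COS(one, fruit)
da_3 = average_distance(one_vec, sense_3)  # mean over the 3 sememes
```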
Step S1123: obtaining the first-class distance between the context words and each sense from the average distances.
In this embodiment, the context comprises several word segments, and the first-class distance between the context words and each sense is the average of the distances Da between those word segments and that sense.
For example:
first-class distance between the context words and the first sense: D1 = [Da(one, sense 1) + Da(computer, sense 1)] / 2
first-class distance between the context words and the second sense: D2 = [Da(one, sense 2) + Da(computer, sense 2)] / 2
first-class distance between the context words and the third sense: D3 = [Da(one, sense 3) + Da(computer, sense 3)] / 2
Step S113: calculating the weight of each sense from the first-class distances.
In this application, the first-class distance is calculated from cosine distances; the larger its value, the more correlated the context words are with the sense, and the correspondingly higher the weight. Thus, the value of the first-class distance is positively correlated with the value of the sense weight.
Based on this positive correlation, in an alternative embodiment, the weight of each sense of the out-of-set word is calculated using the following formula:
Wm = Dm / (D1 + D2 + … + Dn)
where n is the number of senses of the out-of-set word, Wm is the weight of the m-th sense, Dm is the first-class distance between the context words and the m-th sense, and the denominator D1 + D2 + … + Dn is the sum of the first-class distances of all senses.
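A sketch of this normalization, with hypothetical first-class distances D1 to D3 for the three senses of "apple" (the numbers are illustrative only):

```python
def sense_weights(first_class_distances):
    """Wm = Dm / (D1 + D2 + ... + Dn): each sense weight is its first-class
    distance normalized by the sum over all senses."""
    total = sum(first_class_distances)
    return [d / total for d in first_class_distances]

# Hypothetical first-class distances for the three senses of "apple"
W = sense_weights([0.6, 0.3, 0.1])  # sense 1 (computer brand) dominates
```

The weights sum to 1, and a larger first-class distance yields a larger weight, as the positive correlation requires.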
Step S120: generating the semantic vector of each sense from the word vectors of the sememes in that sense;
FIG. 5 is a flowchart of step S120 of the multi-semantics-based out-of-set word processing method according to an embodiment of the present application.
In an alternative embodiment, as shown in FIG. 5, step S120 comprises the following steps:
Step S121: obtaining the sememe word vector of each sememe in each sense of the out-of-set word;
For example, the out-of-set word "apple" has 3 senses in total, and step S121 obtains a sememe word vector for every sememe of the 3 senses: the sememe word vectors T11 to T15 of sense 1, the sememe word vector T21 of sense 2, and the sememe word vectors T31 to T33 of sense 3.
Step S122: setting a sememe weight for each sememe in each sense according to the number of sememes in that sense;
In this embodiment, the sememe weight is determined by the number of sememes in the sense: the more sememes a sense has, the smaller the weight assigned to each, so as to reflect each sememe's contribution to the sense.
In an alternative embodiment, within each sense the sememe weights may be equal and obtained using the following formula:
Wp = 1/x
where Wp is the sememe weight and x is the number of sememes in the sense.
For example, the sememe weights of the sememe word vectors T11 to T15 are each 1/5 (x = 5); the sememe weight of T21 is 1 (x = 1); and the sememe weights of T31 to T33 are each 1/3 (x = 3).
And step S123, carrying out weighted summation on the word vectors of the meaning source in each semantic according to the weight of the meaning source, and generating the semantic vector of each semantic.
Step S123 obtains the semantic vector of each sense using the following formula:
Ti = Σ(j=1..n) Tij × Wij
wherein Ti is the semantic vector of the ith sense, n is the number of sememes in the ith sense, Tij is the sememe word vector of the jth sememe in the ith sense, and Wij is the sememe weight of the jth sememe in the ith sense.
Illustratively, from the sememe word vectors T11 to T15 of the 1st sense of the out-of-set word "apple" acquired in step S121 and the sememe weights W11 to W15 set in step S122, the semantic vector of the first sense of "apple" is calculated as:
T1 = T11×W11 + T12×W12 + T13×W13 + T14×W14 + T15×W15
In this application, Tij may be a low-dimensional vector of the distributed-representation type, for example of dimension m = 50 or m = 100.
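The computation of steps S121 to S123 can be sketched as follows; this is a minimal illustration with toy vectors, and the function and variable names are our own, not from the application:

```python
import numpy as np

def semantic_vector(sememe_vectors):
    """Semantic vector of one sense: the weighted sum of its sememe word
    vectors, using equal sememe weights Wp = 1/x (steps S122 and S123)."""
    x = len(sememe_vectors)        # number of sememes in this sense
    wp = 1.0 / x                   # Wp = 1/x, the same for every sememe
    # Ti = sum_j Tij * Wij; with equal weights this equals the mean vector
    return sum(wp * np.asarray(v) for v in sememe_vectors)

# Sense 1 of "apple" with five sememe word vectors (toy dimension m = 4)
T11_to_T15 = [np.random.rand(4) for _ in range(5)]
T1 = semantic_vector(T11_to_T15)
```

With the equal weights of the formula Wp = 1/x, the weighted sum reduces to the mean of the sememe vectors; non-uniform sememe weights would simply replace `wp` by a per-sememe value.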
Step S130: weight and sum the semantic vectors of the senses according to the sense weights, generating the simulated word vector.
Steps S110 and S120 acquire, respectively, the weight of each sense of the out-of-set word and the semantic vector of each sense. In step S130, a simulated word vector that fuses the multiple senses of the out-of-set word is generated by summing the semantic vectors weighted by the sense weights.
Illustratively, the semantic vectors T1 to T3 generated for the out-of-set word "apple" in step S120 and the sense weights W1 to W3 acquired in step S110 are weighted and summed to generate the simulated word vector Tout:
Tout = T1×W1 + T2×W2 + T3×W3
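Step S130 can be sketched as follows; the names are illustrative, and the sense weights are assumed to come from step S110:

```python
import numpy as np

def simulated_word_vector(semantic_vectors, sense_weights):
    """Simulated word vector of an out-of-set word: Tout = sum_i Ti * Wi,
    i.e. the semantic vectors of the senses weighted by the sense weights."""
    return sum(w * np.asarray(t) for t, w in zip(semantic_vectors, sense_weights))

T1, T2, T3 = np.eye(3)        # toy semantic vectors of the three senses
W1, W2, W3 = 0.5, 0.3, 0.2    # sense weights from step S110 (summing to 1)
Tout = simulated_word_vector([T1, T2, T3], [W1, W2, W3])
# Tout = [0.5, 0.3, 0.2]
```

Because the sense weights sum to 1, Tout stays in the convex hull of the semantic vectors, so the dominant sense of the context contributes most to the fused vector.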
As the above formula shows, the simulated word vector Tout is generated from multiple senses of the out-of-set word, fused with weights determined by the semantic relevance between the out-of-set word and its context words. The simulated word vector therefore matches the sentence meaning while still taking the word's other senses into account, so the semantics it expresses are richer and fuller and suit a wider range of semantic environments. Consequently, when the simulated word vector generated by this embodiment is used in an intelligent question-answering system, the answers are highly relevant to the questions, answer accuracy is improved, the system adapts to richer dialogue environments and becomes more intelligent, user satisfaction is greatly improved, and the out-of-set word problem of the prior art is solved.
As can be seen from the above technical solutions, the embodiments of the present application provide a multi-semantic-based out-of-set word processing method, comprising: acquiring the weight of each sense of the out-of-set word according to its context words in the sentence, the context words comprising at least one preceding word and at least one following word of the out-of-set word; generating the semantic vector of each sense from the word vectors of the sememes in that sense; and weighting and summing the semantic vectors of the senses according to the sense weights to generate a simulated word vector. The simulated word vector is generated from multiple senses of the out-of-set word, fused with weights determined by the semantic relevance between the out-of-set word and its context words; it matches the sentence meaning while taking the word's other senses into account, so the semantics it expresses are richer and suit a wider range of semantic environments. When such a simulated word vector is used in an intelligent question-answering system, the answers are highly relevant to the questions, answer accuracy is improved, the system adapts to richer dialogue environments and becomes more intelligent, user satisfaction is greatly improved, and the out-of-set word problem of the prior art is solved.
Embodiment 2
An embodiment of the present application provides an intelligent question-answering method that applies the multi-semantic-based out-of-set word processing method described above. Fig. 6 is a flowchart of the intelligent question-answering method according to an embodiment of the present application. As shown in Fig. 6, the method comprises the following steps:
Step S210: acquire out-of-set words from the word-segmentation result of an unknown question.
An intelligent question-answering system gains its answering capability only through training on a corpus; during training, the system generates a word vector space that stores the word vectors of the known segmented words. When a user poses a question to the trained system, the system segments the unknown question according to preset word-segmentation rules and, from the segmentation result, identifies the out-of-set words that are absent from the word vector space.
Because an out-of-set word has no word vector to match in the word vector space, the intelligent question-answering system cannot answer accurately when it encounters one.
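Step S210 can be sketched as a vocabulary-membership check over the segmented question; the vocabulary and tokens below are illustrative, not from the application:

```python
def out_of_set_words(segmented_question, word_vector_vocabulary):
    """Return the segmented words that have no word vector in the trained
    word vector space, i.e. the out-of-set words of step S210."""
    return [w for w in segmented_question if w not in word_vector_vocabulary]

vocabulary = {"what", "is", "an"}                      # toy trained vocabulary
oov = out_of_set_words(["what", "is", "an", "apple"], vocabulary)
# oov == ["apple"]
```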
Step S220: generate a simulated word vector of the out-of-set word based on its multiple senses.
In step S220, the simulated word vector of each out-of-set word acquired in step S210 is generated using the multi-semantic-based out-of-set word processing method provided in Embodiment 1 of the present application.
Step S230: match the answer to the question from the trained question-answering model according to the simulated word vector and the word vectors of the remaining segmented words of the question.
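The application does not fix a particular matching model for step S230; one minimal sketch, under the assumption of a retrieval model that compares mean word vectors by cosine similarity, is:

```python
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def match_answer(question_word_vectors, candidate_answers):
    """Represent the question by the mean of its word vectors, including the
    simulated word vector of any out-of-set word, and return the candidate
    answer whose vector is most similar (an assumed retrieval model)."""
    q = np.mean(question_word_vectors, axis=0)
    return max(candidate_answers, key=lambda pair: cosine_similarity(q, pair[1]))

# Toy example: two candidate answers with 2-dimensional vectors
answers = [("answer A", np.array([1.0, 0.0])), ("answer B", np.array([0.0, 1.0]))]
best, _vec = match_answer([np.array([0.9, 0.1])], answers)
# best == "answer A"
```

A trained question-answering model would replace this cosine retrieval, but the role of the simulated word vector is the same: it stands in for the missing word vector of the out-of-set word.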
As can be seen from the above technical solutions, the embodiments of the present application provide an intelligent question-answering method, comprising: acquiring out-of-set words from the word-segmentation result of an unknown question; generating a simulated word vector of each out-of-set word based on its multiple senses; and matching the answer to the question from the trained question-answering model according to the simulated word vector and the word vectors of the remaining segmented words. When an out-of-set word is encountered in an unknown question, its simulated word vector is generated from its multiple senses by the multi-semantic-based out-of-set word processing method, so that when the intelligent question-answering system generates answers, the answers are highly relevant to the questions, answer accuracy is improved, the system adapts to richer dialogue environments and becomes more intelligent, user satisfaction is greatly improved, and the out-of-set word problem of the prior art is solved.
Embodiment 3
An embodiment of the present application provides a multi-semantic-based out-of-set word processing device. Fig. 7 is a block diagram of the device. As shown in Fig. 7, the device comprises:
a semantic weight obtaining unit 310, configured to acquire the weight of each sense of the out-of-set word according to the context words of the out-of-set word in the sentence;
a semantic vector generating unit 320, configured to generate the semantic vector of each sense from the word vectors of the sememes in that sense; and
a simulated word vector generating unit 330, configured to weight and sum the semantic vectors of the senses according to the sense weights to generate a simulated word vector.
As can be seen from the above technical solutions, the embodiments of the present application provide a multi-semantic-based out-of-set word processing device, configured to: acquire the weight of each sense of the out-of-set word according to its context words in the sentence, the context words comprising at least one preceding word and at least one following word of the out-of-set word; generate the semantic vector of each sense from the word vectors of the sememes in that sense; and weight and sum the semantic vectors of the senses according to the sense weights to generate a simulated word vector. The simulated word vector is generated from multiple senses of the out-of-set word, fused with weights determined by the semantic relevance between the out-of-set word and its context words; it matches the sentence meaning while taking the word's other senses into account, so the semantics it expresses are richer and suit a wider range of semantic environments. When such a simulated word vector is used in an intelligent question-answering system, the answers are highly relevant to the questions, answer accuracy is improved, the system adapts to richer dialogue environments and becomes more intelligent, user satisfaction is greatly improved, and the out-of-set word problem of the prior art is solved.
Embodiment 4
An embodiment of the present application provides an intelligent question-answering device that applies the multi-semantic-based out-of-set word processing method described above. Fig. 8 is a block diagram of the intelligent question-answering device according to an embodiment of the present application. As shown in Fig. 8, the device comprises:
an out-of-set word obtaining unit 410, configured to acquire out-of-set words from the word-segmentation result of an unknown question;
an out-of-set word processing unit 420, configured to generate a simulated word vector of the out-of-set word based on its multiple senses; and
an answering unit 430, configured to match the answer to the question from the trained question-answering model according to the simulated word vector and the word vectors of the remaining segmented words of the question.
As can be seen from the above technical solutions, the embodiments of the present application provide an intelligent question-answering device, configured to: acquire out-of-set words from the word-segmentation result of an unknown question; generate a simulated word vector of each out-of-set word based on its multiple senses; and match the answer to the question from the trained question-answering model according to the simulated word vector and the word vectors of the remaining segmented words. When an out-of-set word is encountered in an unknown question, the device generates its simulated word vector from its multiple senses using the multi-semantic-based out-of-set word processing method, so that when the intelligent question-answering system generates answers, the answers are highly relevant to the questions, answer accuracy is improved, the system adapts to richer dialogue environments and becomes more intelligent, user satisfaction is greatly improved, and the out-of-set word problem of the prior art is solved.
The subject application is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
It should be noted that in this document, relational terms such as "first" and "second" and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the application disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It is to be understood that the present application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.
Claims (8)
1. A multi-semantic-based out-of-set word processing method, characterized by comprising the following steps:
acquiring the weight of each sense of the out-of-set word according to the context words of the out-of-set word in the sentence, comprising: acquiring the context words of the out-of-set word in the sentence, acquiring the first-class distance between the context words and each sense, and calculating the weight of each sense according to the first-class distance;
the context words comprising at least one preceding word and at least one following word of the out-of-set word in the sentence;
generating the semantic vector of each sense from the word vectors of the sememes in that sense, comprising: acquiring the sememe word vector of each sememe in each sense of the out-of-set word, setting a sememe weight for each sememe in each sense according to the number of sememes in that sense, and weighting and summing the sememe word vectors in each sense according to the sememe weights to generate the semantic vector of that sense; and
weighting and summing the semantic vectors of the senses according to the sense weights to generate a simulated word vector.
2. The method of claim 1, wherein the step of acquiring the first-class distance between the context words and each sense comprises:
acquiring the cosine distance between each segmented word of the context words and each sememe in each sense;
acquiring, from the cosine distances, the average distance between each segmented word of the context words and the sememes in each sense; and
acquiring the first-class distance between the context words and each sense from the average distances.
3. The method of claim 1, wherein the step of calculating the weight of each sense according to the first-class distance uses the following formula:
Wm = Dm / (D1 + D2 + … + Dn)
wherein n is the number of senses of the out-of-set word, Wm is the weight of the mth sense, Dm is the first-class distance between the context words and the mth sense, and D1 + D2 + … + Dn is the sum of the first-class distances over all senses of the out-of-set word.
4. The method of claim 1, wherein the step of acquiring the context words of the out-of-set word in the sentence comprises:
setting a word-taking window value C that constrains the number of context words, wherein C is an integer and C ≥ 1;
acquiring the context words from the segmented words of the sentence containing the out-of-set word according to the word-taking window value C;
the context words comprising the C segmented words preceding the out-of-set word and the C segmented words following it in the sentence.
5. The method of claim 1, wherein the step of setting a sememe weight for each sememe in each sense according to the number of sememes in that sense uses the following formula:
Wp=1/x
wherein Wp is the sememe weight and x is the number of sememes in the sense.
6. The method of any one of claims 1-5, further comprising:
acquiring out-of-set words from the word-segmentation result of an unknown question;
generating a simulated word vector of the out-of-set word based on its multiple senses; and
matching the answer to the question from the trained question-answering model according to the simulated word vector and the word vectors of the remaining segmented words of the question.
7. A multi-semantic-based out-of-set word processing device, comprising:
a semantic weight obtaining unit, configured to acquire the weight of each sense of the out-of-set word according to the context words of the out-of-set word in the sentence, comprising: acquiring the context words of the out-of-set word in the sentence, acquiring the first-class distance between the context words and each sense, and calculating the weight of each sense according to the first-class distance; the context words comprising at least one preceding word and at least one following word of the out-of-set word in the sentence;
a semantic vector generating unit, configured to generate the semantic vector of each sense from the word vectors of the sememes in that sense, comprising: acquiring the sememe word vector of each sememe in each sense of the out-of-set word, setting a sememe weight for each sememe in each sense according to the number of sememes in that sense, and weighting and summing the sememe word vectors in each sense according to the sememe weights to generate the semantic vector of that sense; and
a simulated word vector generating unit, configured to weight and sum the semantic vectors of the senses according to the sense weights to generate a simulated word vector.
8. The device of claim 7, further comprising:
an out-of-set word acquiring unit, configured to acquire out-of-set words from the word-segmentation result of an unknown question;
an out-of-set word processing unit, configured to generate a simulated word vector of the out-of-set word based on its multiple senses; and
an answering unit, configured to match the answer to the question from the trained question-answering model according to the simulated word vector and the word vectors of the remaining segmented words of the question.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810556386.2A CN108763217A (en) | 2018-06-01 | 2018-06-01 | Word treatment method, intelligent answer method and device outside collection based on multi-semantic meaning |
CN2018105563862 | 2018-06-01 |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109614618A CN109614618A (en) | 2019-04-12 |
CN109614618B true CN109614618B (en) | 2023-07-14 |
Family
ID=64001970
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810556386.2A Pending CN108763217A (en) | 2018-06-01 | 2018-06-01 | Word treatment method, intelligent answer method and device outside collection based on multi-semantic meaning |
CN201811498210.2A Active CN109614618B (en) | 2018-06-01 | 2018-12-07 | Method and device for processing foreign words in set based on multiple semantics |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810556386.2A Pending CN108763217A (en) | 2018-06-01 | 2018-06-01 | Word treatment method, intelligent answer method and device outside collection based on multi-semantic meaning |
Country Status (1)
Country | Link |
---|---|
CN (2) | CN108763217A (en) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110727769B (en) | 2018-06-29 | 2024-04-19 | 阿里巴巴(中国)有限公司 | Corpus generation method and device and man-machine interaction processing method and device |
CN109740163A (en) * | 2019-01-09 | 2019-05-10 | 安徽省泰岳祥升软件有限公司 | Semantic representation resource generation method and device applied to deep learning model |
CN109740162B (en) * | 2019-01-09 | 2023-07-11 | 安徽省泰岳祥升软件有限公司 | Text representation method, device and medium |
CN110147446A (en) * | 2019-04-19 | 2019-08-20 | 中国地质大学(武汉) | A kind of word embedding grammar based on the double-deck attention mechanism, equipment and storage equipment |
CN111125333B (en) * | 2019-06-06 | 2022-05-27 | 北京理工大学 | Generation type knowledge question-answering method based on expression learning and multi-layer covering mechanism |
CN112036163A (en) * | 2020-08-28 | 2020-12-04 | 南京航空航天大学 | Method for processing out-of-set words in electric power plan text sequence labeling |
CN113486142A (en) * | 2021-04-16 | 2021-10-08 | 华为技术有限公司 | Semantic-based word semantic prediction method and computer equipment |
CN113254616B (en) * | 2021-06-07 | 2021-10-19 | 佰聆数据股份有限公司 | Intelligent question-answering system-oriented sentence vector generation method and system |
CN113468308B (en) * | 2021-06-30 | 2023-02-10 | 竹间智能科技(上海)有限公司 | Conversation behavior classification method and device and electronic equipment |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017177901A1 (en) * | 2016-04-12 | 2017-10-19 | 芋头科技(杭州)有限公司 | Semantic matching method and smart device |
CN107798140A (en) * | 2017-11-23 | 2018-03-13 | 北京神州泰岳软件股份有限公司 | A kind of conversational system construction method, semantic controlled answer method and device |
CN108038105A (en) * | 2017-12-22 | 2018-05-15 | 中科鼎富(北京)科技发展有限公司 | A kind of method and device that emulation term vector is generated to unregistered word |
Non-Patent Citations (2)
Title |
---|
Long-phrase text similarity computation based on a multi-predicate semantic framework; Wang Jingzhong et al.; Computer Engineering and Design (《计算机工程与设计》); No. 04, 2018-04-16; full text *
Unsupervised word sense disambiguation based on HowNet sememe word-vector representation; Tang Gongbo et al.; Journal of Chinese Information Processing (《中文信息学报》); No. 06, 2015-11-15; full text *
Also Published As
Publication number | Publication date |
---|---|
CN108763217A (en) | 2018-11-06 |
CN109614618A (en) | 2019-04-12 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||