CN109062902A - Text semantic representation method and device - Google Patents
Text semantic representation method and device
- Publication number
- CN109062902A CN109062902A CN201810942947.2A CN201810942947A CN109062902A CN 109062902 A CN109062902 A CN 109062902A CN 201810942947 A CN201810942947 A CN 201810942947A CN 109062902 A CN109062902 A CN 109062902A
- Authority
- CN
- China
- Prior art keywords
- word
- text
- target
- path
- dependency
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
- G06F40/35—Discourse or dialogue representation
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
Abstract
This application discloses a text semantic representation method and device. The method comprises: after a target text to be represented is obtained, performing word segmentation on the target text to obtain target words; performing dependency syntactic analysis on the target text to determine the dependency relations among the target words; and then representing the semantics of the target text according to those dependency relations. Thus, after obtaining the target text to be represented, the embodiments of this application no longer represent its semantics with the common one-hot scheme; instead, the semantics of the target text are represented according to the dependency relations among its target words. That is, the semantic relations among the words of the text are taken into account during semantic representation, which improves the accuracy of the semantic representation result.
Description
Technical field
This application relates to the field of natural language processing, and in particular to a text semantic representation method and device.
Background art
A text may be a sentence or a discourse. Semantic representation of a text means encoding a natural-language text into a specific vector so that the vector carries the semantic information of the text. A good semantic representation helps improve the effect and performance of tasks such as text similarity retrieval, sentiment classification, and domain classification.
Existing semantic representation schemes generally use one-hot encoding, i.e., 0s and 1s indicate whether each word is present in a text. Specifically, taking text A as an example, a vocabulary containing a large number of words is created in advance; each vocabulary word that appears in text A is marked with 1 and each vocabulary word that does not is marked with 0, forming a text vector of 0s and 1s that expresses the semantic information of text A, the dimension of the text vector being equal to the number of words in the vocabulary.
However, this existing one-hot way of representing text semantics ignores the semantic relations among the words of the text, which makes the semantic representation result inaccurate.
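As a hedged illustration of the one-hot scheme criticized above, the background's 0/1 text vector can be sketched as follows; the vocabulary and text are invented for the example and are not from the patent:

```python
# One-hot style text vector, as described in the background: a pre-built
# vocabulary, and one 0/1 entry per vocabulary word marking whether that
# word appears in the text.
def one_hot_text_vector(text_words, vocabulary):
    present = set(text_words)
    return [1 if w in present else 0 for w in vocabulary]

# Illustrative vocabulary; real vocabularies exceed 100,000 words,
# which is exactly the dimensionality problem the patent points out.
vocabulary = ["apple", "pear", "coat", "go", "get", "he", "told", "Tom"]
vec = one_hot_text_vector(["he", "told", "Tom"], vocabulary)
# The vector's dimension equals the vocabulary size, and related words
# such as "apple" and "pear" remain completely unrelated in this scheme.
```

The sketch makes the drawback concrete: every off-text word contributes an identical 0, so no semantic relation between words survives the encoding.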
Summary of the invention
The main purpose of the embodiments of this application is to provide a text semantic representation method and device that can improve the accuracy of the semantic representation result.
The embodiments of this application provide a text semantic representation method, comprising:
obtaining a target text to be represented;
performing word segmentation on the target text to obtain target words;
performing dependency syntactic analysis on the target text to determine the dependency relations among the target words;
representing the semantics of the target text according to the dependency relations among the target words.
Optionally, determining the dependency relations among the target words comprises:
for each target word, determining the head word that has a dependency relation with the target word, and obtaining a word pair consisting of the target word and its head word, wherein the head word is either the root-node label or another target word different from the target word, the root-node label being the label of the root node of a dependency syntax tree that describes the dependency relations among the target words;
for the word pair corresponding to each target word, determining the dependency relation between the two words of the word pair.
Optionally, representing the semantics of the target text according to the dependency relations among the target words comprises:
for each word pair, determining the word vector of each word of the word pair and the relation vector of the dependency relation between the two words of the word pair;
encoding the two word vectors and the relation vector of each word pair to obtain a text encoding vector of the target text, wherein the text encoding vector expresses the syntactic information and word-order information of the target text;
expressing the semantic information of the target text by using the text encoding vector.
Optionally, representing the semantics of the target text according to the dependency relations among the target words comprises:
representing the semantics of the target text according to the dependency relations among the target words and the dependency paths, wherein each dependency path is a sub-path in a dependency syntax tree describing the dependency relations among the target words, and the end point of each sub-path is a leaf node of the dependency syntax tree.
Optionally, representing the semantics of the target text according to the dependency relations among the target words and the dependency paths comprises:
determining the application scenario of the semantic representation result of the target text;
determining the importance of each dependency path in the application scenario;
representing the semantics of the target text according to the dependency relations among the target words and the importance of each dependency path.
Optionally, determining the importance of each dependency path in the application scenario comprises:
for each word pair, determining the word vector of each word of the word pair and the relation vector of the dependency relation between the two words of the word pair;
encoding the two word vectors and the relation vector of each word pair to obtain a text encoding vector of the target text, wherein the text encoding vector expresses the syntactic information and word-order information of the target text;
encoding each dependency path to obtain a path encoding vector corresponding to the path, the path encoding vector expressing the path information formed by the target words on the dependency path;
determining the path weight of each dependency path by using the text encoding vector and the path encoding vector, wherein the path weight characterizes the importance of the dependency path in the application scenario.
Optionally, representing the semantics of the target text according to the dependency relations among the target words and the importance of each dependency path comprises:
determining a path encoding vector corresponding to all dependency paths according to the path encoding vector and path weight of each dependency path;
expressing the semantic information of the target text by using the text encoding vector and the path encoding vector corresponding to all dependency paths.
The embodiments of this application also provide a text semantic representation device, comprising:
a target text acquiring unit, for obtaining a target text to be represented;
a target word obtaining unit, for performing word segmentation on the target text to obtain target words;
a dependency relation determination unit, for performing dependency syntactic analysis on the target text to determine the dependency relations among the target words;
a text semantic representation unit, for representing the semantics of the target text according to the dependency relations among the target words.
Optionally, the dependency relation determination unit comprises:
a word pair obtaining subunit, for determining, for each target word, the head word that has a dependency relation with the target word and obtaining a word pair consisting of the target word and its head word, wherein the head word is either the root-node label or another target word different from the target word, the root-node label being the label of the root node of a dependency syntax tree that describes the dependency relations among the target words;
a dependency relation determination subunit, for determining, for the word pair corresponding to each target word, the dependency relation between the two words of the word pair.
Optionally, the text semantic representation unit comprises:
a first relation vector determination subunit, for determining, for each word pair, the word vector of each word of the word pair and the relation vector of the dependency relation between the two words of the word pair;
a first encoding vector obtaining subunit, for encoding the two word vectors and the relation vector of each word pair to obtain a text encoding vector of the target text, wherein the text encoding vector expresses the syntactic information and word-order information of the target text;
a first semantic information expression subunit, for expressing the semantic information of the target text by using the text encoding vector.
Optionally, the text semantic representation unit is specifically configured to represent the semantics of the target text according to the dependency relations among the target words and the dependency paths, wherein each dependency path is a sub-path in a dependency syntax tree describing the dependency relations among the target words, and the end point of each sub-path is a leaf node of the dependency syntax tree.
Optionally, the text semantic representation unit comprises:
an application scenario determination subunit, for determining the application scenario of the semantic representation result of the target text;
an importance determination subunit, for determining the importance of each dependency path in the application scenario;
a text semantic representation subunit, for representing the semantics of the target text according to the dependency relations among the target words and the importance of each dependency path.
Optionally, the importance determination subunit comprises:
a second relation vector determination subunit, for determining, for each word pair, the word vector of each word of the word pair and the relation vector of the dependency relation between the two words of the word pair;
a second encoding vector obtaining subunit, for encoding the two word vectors and the relation vector of each word pair to obtain a text encoding vector of the target text, wherein the text encoding vector expresses the syntactic information and word-order information of the target text;
a path encoding vector obtaining subunit, for encoding each dependency path to obtain a path encoding vector corresponding to the path, the path encoding vector expressing the path information formed by the target words on the dependency path;
a path weight determination subunit, for determining the path weight of each dependency path by using the text encoding vector and the path encoding vector, wherein the path weight characterizes the importance of the dependency path in the application scenario.
Optionally, the text semantic representation subunit comprises:
a path encoding vector determination subunit, for determining a path encoding vector corresponding to all dependency paths according to the path encoding vector and path weight of each dependency path;
a second semantic information expression subunit, for expressing the semantic information of the target text by using the text encoding vector and the path encoding vector corresponding to all dependency paths.
The embodiments of this application also provide a text semantic representation device, comprising a processor, a memory, and a system bus;
the processor and the memory are connected through the system bus;
the memory stores one or more programs comprising instructions which, when executed by the processor, cause the processor to perform any implementation of the above text semantic representation method.
The embodiments of this application also provide a computer-readable storage medium storing instructions which, when run on a terminal device, cause the terminal device to perform any implementation of the above text semantic representation method.
The embodiments of this application also provide a computer program product which, when run on a terminal device, causes the terminal device to perform any implementation of the above text semantic representation method.
According to the text semantic representation method and device provided by the embodiments of this application, after a target text to be represented is obtained, word segmentation is performed on the target text to obtain target words; dependency syntactic analysis is then performed on the target text to determine the dependency relations among the target words; and the semantics of the target text can then be represented according to those dependency relations. Thus, after obtaining the target text to be represented, the embodiments of this application no longer use the common one-hot scheme for semantic representation; instead, the semantics of the target text are represented according to the dependency relations among its target words. That is, the semantic relations among the words of the text are taken into account during semantic representation, which improves the accuracy of the semantic representation result.
Brief description of the drawings
To explain the technical solutions in the embodiments of this application or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of this application; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow diagram of a text semantic representation method provided by an embodiment of this application;
Fig. 2 is a flow diagram of determining the dependency relations among the target words provided by an embodiment of this application;
Fig. 3 is a schematic diagram of the result of performing dependency syntactic analysis on a target text provided by an embodiment of this application;
Fig. 4 is a structural schematic diagram of a dependency syntax tree and dependency paths provided by an embodiment of this application;
Fig. 5 is the first flow diagram of performing semantic representation on a target text provided by an embodiment of this application;
Fig. 6 is a structural schematic diagram of generating the text encoding vector of a target text provided by an embodiment of this application;
Fig. 7 is the second flow diagram of performing semantic representation on a target text provided by an embodiment of this application;
Fig. 8 is a structural schematic diagram of generating the path encoding vectors of a target text provided by an embodiment of this application;
Fig. 9 is a composition schematic diagram of a text semantic representation device provided by an embodiment of this application.
Specific embodiment
In some text semantic representation methods, the semantics of a text are usually represented with a one-hot scheme. However, in this one-hot representation the dimension of the vocabulary is generally very high (there are more than 100,000 common Chinese words), which makes the computation expensive; meanwhile, this representation ignores the semantic relations among the words of the text. For example, although the words "apple" and "pear" both denote fruit, the two words are completely unrelated in a one-hot representation, each being expressed merely as 0 or 1. That is, the semantic associations among words are not taken into account, which makes the text semantic representation result inaccurate.
To overcome the above drawbacks, the embodiments of this application provide a text semantic representation method: after a text to be represented is obtained, word segmentation is first performed on the text to obtain its words; dependency syntactic analysis is then performed on the text to determine the dependency relations among the words; and the semantics of the text are then represented according to those dependency relations. Thus, the embodiments of this application no longer use the traditional one-hot scheme; instead, the semantics of the text are represented according to the dependency relations among its words. That is, the influence of the semantic relations among the words on the semantic representation result is taken into account, which improves the accuracy of the text semantic representation result.
To make the purposes, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this application. Based on the embodiments of this application, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of this application.
First embodiment
Referring to Fig. 1, which is a flow diagram of a text semantic representation method provided by this embodiment, the method comprises the following steps:
S101: Obtain a target text to be represented.
In this embodiment, any text for which semantic representation is to be realized is defined as the target text. This embodiment does not limit the language of the target text: it may, for example, be a Chinese text or an English text. Nor does it limit the length of the target text: it may be a sentence text or a discourse text. Nor does it limit the source of the target text: it may, for example, be a speech recognition result, or log data collected from the business systems of a platform. Nor does it limit the type of the target text: it may, for example, be an utterance from everyday conversation, or part of a speech draft, magazine article, literary work, and so on.
It is understood that a sentence text is a single sentence, i.e., a set of words, while a discourse text is a set of consecutive sentences. After a sentence text or discourse text is obtained as the target text to be represented, its semantics can be represented according to the subsequent steps.
S102: Perform word segmentation on the target text to obtain target words.
In this embodiment, after the target text to be represented is obtained through step S101, word segmentation may be performed on it in order to realize a more accurate semantic representation, thereby obtaining the words contained in the target text; each word obtained by segmentation is defined here as a target word.
When the target text is a sentence text, any existing or future segmentation method may be used to segment it and obtain its words as the target words. For example, assuming the target text is "He told Tom to go get the coat", segmenting it yields six target words: "he", "told", "Tom", "go", "get", "coat".
Alternatively, if the target text is a discourse text, it first needs to be split into sentences; the segmentation method is then applied to each sentence to obtain the words of the target text as the target words.
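The patent does not prescribe a particular segmenter, so as one hedged sketch, a classical dictionary-based forward-maximum-matching segmenter is shown below, run over the Chinese original of the example sentence ("He told Tom to go get the coat") with an invented toy dictionary; real systems use large lexicons or statistical/neural segmenters:

```python
def fmm_segment(text, dictionary, max_len=4):
    """Forward maximum matching: at each position, take the longest
    dictionary word that matches; fall back to a single character."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            cand = text[i:i + size]
            if size == 1 or cand in dictionary:
                words.append(cand)
                i += size
                break
    return words

# Toy dictionary covering only the example sentence.
dictionary = {"他", "叫", "汤姆", "去", "拿", "外套"}
tokens = fmm_segment("他叫汤姆去拿外套", dictionary)
# six target words: 他 / 叫 / 汤姆 / 去 / 拿 / 外套
```

The dictionary-based approach is only illustrative of what "word segmentation" produces; the subsequent steps of the method are independent of which segmenter supplies the target words.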
S103: Perform dependency syntactic analysis on the target text to determine the dependency relations among the target words.
In this embodiment, after the target words corresponding to the target text are obtained through step S102, a dependency parsing method may further be used to perform dependency syntactic analysis on the target text and determine the dependency relations among the target words, where a dependency relation between target words is the semantic association between them. For example, among the six target words "he", "told", "Tom", "go", "get", "coat" of the target text "He told Tom to go get the coat", the semantic association between "he" and "told" is a subject-verb relation.
It should be noted that the specific process of performing dependency syntactic analysis on the target text to determine the dependency relations among the target words is described in the second embodiment below.
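The dependency relations that S103 produces can be held in a simple word-pair structure. The following hedged sketch hard-codes the parse of the example sentence (rendered in English as he/told/Tom/go/get/coat); a real system would obtain these pairs from a dependency parser such as LTP rather than a literal list:

```python
from collections import namedtuple

# (target word, head word, dependency relation); "ROOT" labels the root
# node of the dependency syntax tree. Relation tags follow the LTP-style
# labels used later in the patent (SBV, HED, DBL, ADV, VOB).
WordPair = namedtuple("WordPair", ["word", "head", "relation"])

parse = [
    WordPair("he",   "told", "SBV"),  # subject-verb
    WordPair("told", "ROOT", "HED"),  # head of the whole sentence
    WordPair("Tom",  "told", "DBL"),  # double (pivot) relation
    WordPair("go",   "get",  "ADV"),  # adverbial
    WordPair("get",  "told", "VOB"),  # verb-object
    WordPair("coat", "get",  "VOB"),  # verb-object
]

# The relation of the word pair for a given target word, as each target
# word has exactly one head.
relation_of = {p.word: p.relation for p in parse}
```

Each target word appears exactly once as `word`, mirroring the patent's point that every target word forms one word pair with its head.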
S104: Represent the semantics of the target text according to the dependency relations among the target words.
In this embodiment, after the dependency relations among the target words of the target text are determined through step S103, the semantics of the target text may further be represented according to those dependency relations, such as subject-verb relations and verb-object relations.
Specifically, during the semantic representation of the target text, the dependency syntax tree corresponding to the target text may first be constructed from the result of the dependency syntactic analysis; then, according to the semantic information of each target word in the dependency syntax tree and the dependency relations between each target word and the other target words, the text encoding vector corresponding to the target text is determined; the semantics of the target text can then be expressed by using this text encoding vector. It should be noted that the specific process of representing the semantics of the target text according to the dependency relations among the target words is described in the second embodiment below.
Further, while constructing the dependency syntax tree corresponding to the target text, the dependency paths of the target text may also be obtained from the tree structure, where each dependency path plays an important role in the semantic representation of the target text. After the text encoding vector of the target text has been determined, the path encoding vector corresponding to each dependency path in the target text can further be determined on that basis; a path encoding vector reflects the parent-child relations of the target words on the corresponding dependency path. A more accurate semantic representation of the target text can then be realized according to the dependency relations among the target words and the dependency paths of the target text, i.e., according to the text encoding vector of the target text and the path encoding vector of each dependency path.
Therefore, in order to improve the accuracy of the semantic representation of the target text, after the dependency relations among the target words are determined, the dependency paths of the target text may further be combined with them to realize the semantic representation. The specific process of combining the dependency relations among the target words with the dependency paths of the target text to represent the semantics of the target text is described in the third embodiment below.
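The patent leaves the concrete encoders to the later embodiments, so the following is only a loose, hedged sketch of the data flow just described: toy 3-dimensional embeddings, mean-pooling as a stand-in for the unspecified encoder, and softmax path weights. None of these choices is fixed by the patent text:

```python
import math

# Stand-in encoder: mean-pooling over a list of toy 3-d vectors.
def mean_vectors(vectors):
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(3)]

def encode_pair(wv1, wv2, rv):
    # "encode the two word vectors and the relation vector" of a word
    # pair; mean-pooling replaces the patent's (unspecified) encoder.
    return mean_vectors([wv1, wv2, rv])

def softmax(scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Word-pair encodings -> text encoding vector (toy inputs).
pair_codes = [
    encode_pair([1, 0, 0], [0, 1, 0], [0, 0, 1]),
    encode_pair([0, 0, 3], [0, 0, 0], [0, 0, 0]),
]
text_code = mean_vectors(pair_codes)

# Path encoding vectors -> path weights (importance) -> one pooled
# vector over all dependency paths.
path_codes = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
weights = softmax([dot(text_code, p) for p in path_codes])
all_paths_code = [sum(w * p[d] for w, p in zip(weights, path_codes))
                  for d in range(3)]

# Final semantic representation: text code joined with the weighted path
# code (one plausible combination; the patent does not fix the operator).
semantic_repr = text_code + all_paths_code
```

The point of the sketch is the shape of the computation, text encoding feeding path weighting, not any particular encoder, which the third embodiment is said to detail.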
In summary, with the text semantic representation method provided by this embodiment, after a target text to be represented is obtained, word segmentation is performed on it to obtain target words; dependency syntactic analysis is then performed on the target text to determine the dependency relations among the target words; and the semantics of the target text can then be represented according to those relations. Thus, after obtaining the target text to be represented, the embodiments of this application no longer use the common one-hot scheme for semantic representation; instead, the semantics of the target text are represented according to the dependency relations among its target words. That is, the semantic relations among the words of the text are taken into account during semantic representation, which improves the accuracy of the semantic representation result.
Second embodiment
This embodiment first introduces a specific implementation of step S103 of the first embodiment, "determining the dependency relations among the target words".
Referring to Fig. 2, which shows a flow diagram of determining the dependency relations among the target words provided by this embodiment, the process comprises the following steps:
S201: Determine the head word that has a dependency relation with each target word, and obtain the word pair consisting of the target word and its head word, wherein the head word is either the root-node label or another target word different from the target word, the root-node label being the label of the root node of a dependency syntax tree that describes the dependency relations among the target words.
In this embodiment, after the target text has been segmented by the segmentation method and its target words obtained, a dependency parsing method may further be used to perform dependency syntactic analysis on the target text. For example, the Language Technology Platform (LTP) of Harbin Institute of Technology may be used to analyze the target text and obtain an analysis result. According to the analysis result, the head word that has a dependency relation with each target word in the target text can be determined, and each target word and its corresponding head word can then form a word pair. Meanwhile, the dependency syntax tree corresponding to the target text can also be constructed from the analysis result, and this tree describes the dependency relations among the target words of the target text.
For each target word in the target text, the head word that has a dependency relation with it is either the label of the root node of the dependency syntax tree or another target word in the tree different from it.
For example, continuing the example above, segmenting the target text "He told Tom to go get the coat" yields six target words: "he", "told", "Tom", "go", "get", "coat". After LTP performs dependency syntactic analysis on this target text, the obtained analysis result is as shown in Fig. 3: the box at the bottom of Fig. 3 illustrates the result of the dependency syntactic analysis, where each target word has an incoming arrow pointing to it, the word at the other end of the arrow being the head word that has a dependency relation with that target word. Each target word and its corresponding head word form a word pair, and in the semantic association the target word is governed by its head word.
As shown in Fig. 3, the head word of the target word "he" is "told", and the dependency relation between them is the subject-verb relation (SBV); the two form a word pair. The head word of the target word "told" is the root-node label "ROOT" of the dependency syntax tree, the dependency relation between them is the head relation (HED), and the two form a word pair indicating the core of the whole sentence. The head word of the target word "Tom" is "told", with the double relation (DBL); the two form a word pair. The head word of the target word "go" is "get", with the adverbial relation (ADV); the two form a word pair. The head word of the target word "get" is "told", with the verb-object relation (VOB); the two form a word pair. The head word of the target word "coat" is "get", and the dependency relation between them is also VOB; the two form a word pair.
Meanwhile based on it is shown in Fig. 3 to target text " he makes Tom go to take coat " interdependent syntactic analysis of progress as a result,
Wherein every a pair of of word centering can be dominated into word as the father node of corresponding target word, further by the interdependent syntax
Analysis result is launched into the form of tree,, should be according to as shown in the left hand view of Fig. 4 to construct the corresponding interdependent syntax tree of target text
Dependence in target text between each target word can be described by depositing syntax tree, from each leaf of the interdependent syntax tree
Target word in node starts successively to find corresponding father node upwards, the available target word it is corresponding one it is interdependent
Path starts successively to search out corresponding father node upwards to be " taking " -> " crying " -> " ROOT " by taking leaf node " going " as an example, this
The corresponding interdependent path of the available target word " going " of sample is " ROOT- is cried, and-taking-goes ", as the 3rd article of Fig. 4 right part of flg according to
It deposits shown in path, similarly, corresponding three interdependent paths such as Fig. 4 of target word in other available three leaf nodes is right
Shown in the figure of side, that is to say, that produce four interdependent paths according to the interdependent syntax tree in left side is corresponding.
S202: for the word pair corresponding to each target word, determine the dependency relation between the two words of the word pair.
In this embodiment, after each target word in the target text and the word pair formed with its head word are obtained through step S201, the dependency relation between the target word and its head word can be determined for each word pair. For example, based on the above example, for the target word "he" and its head word "makes" in the target text "he makes Tom go to take coat", the dependency relation of the word pair ("he", "makes") can be determined to be "SBV"; similarly, the dependency relation of the word pair ("Tom", "makes") can be determined to be "DBL", and so on.
After the dependency relations among the target words are determined through steps S201-S202, this embodiment next introduces, through the following steps S501-S503, a specific implementation of step S104 of the first embodiment, "performing semantic representation on the target text according to the dependency relations among the target words".
In this embodiment, the text encoding vector corresponding to the target text is determined according to the semantic information of each target word in the dependency syntax tree and the dependency relations between each target word and the other target words; the text encoding vector can then be used to realize the semantic representation of the target text.
Referring to Fig. 5, which shows a flow diagram of semantic representation of the target text provided by this embodiment, the process includes the following steps:
S501: for each word pair, determine the word vector corresponding to each word of the word pair and the relation vector corresponding to the dependency relation between the two words of the word pair.
In this embodiment, after the word pairs contained in the target text are obtained through step S201, the text encoding vector corresponding to the target text can be generated from these word pairs to realize the semantic representation of the target text; specifically, a pre-built semantic representation model can be used to generate the text encoding vector.
The pre-built semantic representation model may be a neural-network-based encoding model, for example an encoding model based on a convolutional neural network (CNN) or a recurrent neural network (RNN).
Specifically, in the course of performing semantic representation on the target text, for each word pair, the word vector corresponding to each word of the word pair and the relation vector corresponding to the dependency relation between the two words can first be determined. For example, the word vector of the target word of the word pair can be denoted x, the word vector of the head word denoted y, and the dependency relation between the target word and the head word represented by a relation vector r. The target word and the head word can be vectorized using a word-vectorization method or a model for generating word vectors — for example, open-source tools such as Word2vec or GloVe (Global Vectors for Word Representation) — to obtain the word vectors x and y, while the relation vector r between the target word and the head word can be obtained directly by random initialization. It should be understood that different dependency relations correspond to different relation vectors r, and that the initialized value of r is subsequently updated by the semantic representation model.
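A minimal sketch of this setup, under assumed toy dimensions: the word vectors would in practice come from Word2vec/GloVe (random values stand in for them here), and each dependency-relation type gets one randomly initialized relation vector that is shared by all word pairs with that relation and would later be updated during training.

```python
import random

DIM = 4            # toy vector dimension (an assumption for illustration)
random.seed(0)

def rand_vec():
    return [random.uniform(-0.1, 0.1) for _ in range(DIM)]

# Stand-ins for pretrained word vectors.
word_vec = {w: rand_vec() for w in ["he", "makes", "Tom", "go", "take", "coat"]}

rel_vec = {}       # one shared vector per relation type (SBV, HED, ...)

def relation_vector(rel):
    if rel not in rel_vec:          # random initialization on first use
        rel_vec[rel] = rand_vec()
    return rel_vec[rel]

def triple(target, rel, head):
    """Build the triple vector group p = [x, r, y] by concatenation."""
    x = word_vec[target]
    y = word_vec.get(head, [0.0] * DIM)   # ROOT gets a zero vector here
    r = relation_vector(rel)
    return x + r + y

p1 = triple("he", "SBV", "makes")
```

Because the relation vector is looked up by relation type, every SBV pair shares the same r, which is what allows the model to update one vector per dependency relation.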
S502: encode the two word vectors and the relation vector corresponding to each word pair to obtain the text encoding vector of the target text, where the text encoding vector expresses the syntactic information and word-sequence information of the target text.
In this embodiment, after the word vector x of the target word, the word vector y of the head word and the relation vector r representing the dependency relation between them are determined for each word pair through step S501, the three can be spliced into one vector group representing the corresponding word pair; for example, they can be spliced into the triple vector group p = [x, r, y] to represent one word pair of the target text.
For example, as shown in Fig. 6, for the word pair formed by the target word "he" and the head word "makes", whose dependency relation is SBV: suppose that, after the target word "he" and the head word "makes" are each vectorized using the Word2vec method, the word vector corresponding to "he" is x1 and the word vector corresponding to "makes" is y1, and that the relation vector representing the dependency relation between them, obtained by random initialization, is r1. The three vectors can then be spliced to obtain the triple vector group p1 = [x1, r1, y1], representing the word pair formed by the target word "he" and the head word "makes". Similarly, the triple vector groups p2, p3, p4, p5, p6 can be used to represent the word pairs ("makes", "ROOT"), ("Tom", "makes"), ("go", "take"), ("take", "makes") and ("coat", "take"), as shown in Fig. 6.
It should be understood that the target text I can be denoted I = [p1, p2 … pi … pN], where N is the number of target words in the target text I and pi is the triple vector group corresponding to the i-th word pair of the target text. For example, as shown in Fig. 6, the target text "he makes Tom go to take coat" can be denoted I = [p1, p2, p3, p4, p5, p6]: the target text contains the six target words "he, makes, Tom, go, take, coat", and p1, p2, p3, p4, p5, p6 are the triple vector groups corresponding to the six word pairs to which these six target words belong.
Further, the triple vector groups corresponding to the word pairs of the target text can be encoded to obtain the text encoding vector corresponding to the target text, which expresses the syntactic information and word-sequence information of the target text. Here, the syntactic information of the target text refers to the grammatical relations among the target words composing the target text, and the word-sequence information refers to the ordering of the target words within the target text.
Specifically, the triple vector groups corresponding to the word pairs of the target text can be input into the pre-built semantic representation model for encoding — for example, into a pre-built deep-neural-network encoding model (such as a CNN or RNN) — and the hidden output of the last layer of the deep neural network is taken as the encoding vector h, which can then serve as the text encoding vector corresponding to the target text for semantic representation. For example, as shown in Fig. 6, for the target text "he makes Tom go to take coat", the triple vector groups p1, p2, p3, p4, p5, p6 corresponding to its six word pairs can be input into the pre-built deep-neural-network encoding model for encoding, yielding the corresponding text encoding vector h.
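The encoder's interface (a sequence of triple vector groups in, one fixed-size vector h out) can be sketched without a real neural network. The averaging-plus-tanh below is only a runnable stand-in for the CNN/RNN encoding model the text describes, not the patent's model.

```python
import math

def encode_text(triples):
    """Toy encoder: mean-pool the triple vector groups, then squash with tanh.
    A real system would use a trained CNN/RNN here; only the interface matters."""
    dim = len(triples[0])
    mean = [sum(t[j] for t in triples) / len(triples) for j in range(dim)]
    return [math.tanh(v) for v in mean]

# Toy triples of dimension 2 standing in for I = [p1, ..., pN].
I = [[0.1, 0.2], [0.3, 0.4], [-0.2, 0.0]]
h = encode_text(I)
```

Note that all triples are consumed at once rather than walked one tree node at a time, which is the parallelism the text claims over operating directly on the syntax tree.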
S503: use the text encoding vector to express the semantic information of the target text.
In this embodiment, after the text encoding vector h of the target text is obtained through step S502, the vector h can further be used to express the semantic information of the target text; as shown in Fig. 6, the text encoding vector h can express the semantic information of the target text "he makes Tom go to take coat". In the course of obtaining the text encoding vector h, the semantic information of each target word in the target text, the syntactic information of the target text (the grammatical relations among the target words) and the word-sequence information (the ordering of the target words within the target text) are all encoded into the text encoding vector, so the resulting text encoding vector can be used to perform semantic representation on the target text.
It should be understood that if the target text is a discourse-level text, the sequence information of the sentences in the discourse can also be encoded into the text encoding vector h corresponding to the target text. Moreover, because the triple vector groups corresponding to all the word pairs of the target text are input into the semantic representation model and encoded simultaneously, parallel operation is realized, avoiding the problem that operating directly on the syntax tree cannot be parallelized; this effectively saves encoding time and improves encoding efficiency.
In summary, this embodiment determines the text encoding vector corresponding to the target text according to the semantic information of each target word in the dependency syntax tree of the target text and the dependency relations between each target word and the other target words, and then uses this text encoding vector to perform semantic representation on the target text. The semantic representation of the target text is thus realized on the basis of fully considering the semantic relations among the target words in the target text, improving the accuracy of the semantic representation result.
Third embodiment
This embodiment introduces another specific implementation of step S104 of the first embodiment, "performing semantic representation on the target text according to the dependency relations among the target words".
In this embodiment, semantic representation of the target text can use not only the dependency relations among the target words of the target text but also, further, each dependency path of the target text, combining the two to jointly perform semantic representation on the target text. Here, each dependency path is a subpath of the dependency syntax tree corresponding to the target text; as shown in the right-hand part of Fig. 4, the end point of each subpath is a leaf node of the dependency syntax tree.
Referring to Fig. 7, which shows a flow diagram of semantic representation of the target text provided by this embodiment, the process includes the following steps:
S701: determine the application scenario of the semantic representation result of the target text.
In this embodiment, in order to perform semantic representation on the target text, the application scenario of the semantic representation result of the target text must first be determined. The application scenario may be any of various application scenarios in the natural-language-processing field, such as sentence sentiment classification, sentence-similarity retrieval, or category classification.
S702: determine, respectively, the importance of each dependency path in the application scenario.
In this embodiment, after the application scenario of the semantic representation result of the target text is determined through step S701, each dependency path of the dependency syntax tree corresponding to the target text can be analyzed according to that application scenario, so as to determine the importance of each dependency path in the application scenario. The importance of each dependency path can be characterized by the weight of that path in the application scenario: the higher the importance, the larger the corresponding weight value, and vice versa. Further, the normalized weight value of each dependency path can be used to characterize the importance of that path in the semantic representation result of the target text.
In one implementation of this embodiment, step S702 can specifically include steps A-D:
Step A: for each word pair, determine the word vector corresponding to each word of the word pair and the relation vector corresponding to the dependency relation between the two words of the word pair.
Step B: encode the two word vectors and the relation vector corresponding to each word pair to obtain the text encoding vector of the target text, where the text encoding vector expresses the syntactic information and word-sequence information of the target text.
It should be noted that steps A-B are consistent with steps S501-S502 of the implementation of semantic representation of the target text in the second embodiment; for related details, refer to the introduction of steps S501-S502 above, which is not repeated here.
Step C: encode each dependency path to obtain the path encoding vector corresponding to that path, where the path encoding vector expresses the path information formed by the target words on the corresponding dependency path.
In this embodiment, each dependency path in the dependency syntax tree corresponding to the target text can be encoded to obtain its path encoding vector. The path encoding vector of each dependency path expresses the path information formed by the target words on that path, that is, the parent-child relation information of the target words along the corresponding dependency path.
Specifically, the word vector corresponding to each target word on each dependency path can first be obtained through step A, and the corresponding dependency path can then be characterized by the word vectors of all the target words on it. As shown in the right-hand part of Fig. 4, for the target text "he makes Tom go to take coat", four dependency paths are generated: the 1st, "ROOT-makes-he"; the 2nd, "ROOT-makes-Tom"; the 3rd, "ROOT-makes-take-go"; and the 4th, "ROOT-makes-take-coat". Each dependency path is then characterized jointly by the word vectors of all the target words on it; for example, for the 1st dependency path "ROOT-makes-he", the word vectors corresponding to "makes" and "he" characterize this path, and similarly the word-vector sets characterizing the other three dependency paths can be obtained.
Further, the word-vector set corresponding to each dependency path can be input into a pre-built encoding model for encoding — for example, into a pre-built deep-neural-network encoding model (such as a CNN or RNN) — to obtain the hidden output of the last layer of the deep neural network as the encoding vector, which can then serve as the path encoding vector corresponding to each dependency path for semantic representation of the target text. As shown in Fig. 8, for the target text "he makes Tom go to take coat", the word-vector sets corresponding to its four dependency paths can be separately input into the pre-built deep-neural-network encoding model for encoding, yielding the path encoding vectors S1, S2, S3 and S4 corresponding to the four dependency paths.
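Path encoding can be sketched with the same stand-in encoder idea: average the word vectors of the words on the path. The averaging below is an assumption for illustration only; the patent's actual path encoder is a CNN/RNN.

```python
# Toy 2-dimensional word vectors for words on a path; ROOT maps to zeros.
word_vec = {"makes": [0.2, 0.1], "take": [0.4, -0.1], "go": [0.0, 0.3]}

def encode_path(path_words, word_vec, dim=2):
    """Toy path encoder: mean-pool the word vectors along one dependency path."""
    vecs = [word_vec.get(w, [0.0] * dim) for w in path_words]
    return [sum(v[j] for v in vecs) / len(vecs) for j in range(dim)]

# The 3rd dependency path "ROOT-makes-take-go" of the example sentence.
S3 = encode_path(["ROOT", "makes", "take", "go"], word_vec)
```

Applying this to all four paths of the example would yield the four vectors standing in for S1 through S4.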
Step D: use the text encoding vector of the target text and the path encoding vector of each dependency path to determine the path weight of each dependency path, where the path weight characterizes the importance of the corresponding dependency path in the corresponding application scenario.
It should be noted that although every dependency path in the target text plays a significant role in semantic representation, when the semantic representation result of the target text is used in different application scenarios, only some of the dependency paths may play the main role. For example, when the semantic representation result of the target text is used in a "sentence sentiment classification" scenario, a dependency path containing "I am very glad" in the target text is more important for semantic representation than a path containing "hey you".
On this basis, in this embodiment, the text encoding vector of the target text and the path encoding vector corresponding to each dependency path are used to compute the path weight of each dependency path, which characterizes the importance of that path in the application scenario.
In the formula for computing the path weight of each dependency path, i denotes the i-th dependency path in the dependency syntax tree corresponding to the target text, vi denotes the path weight of the i-th dependency path, h denotes the text encoding vector of the target text, and Si denotes the path encoding vector of the i-th dependency path.
Further, vi is normalized. In the calculation formula for the normalized path weight, i denotes the i-th dependency path in the dependency syntax tree corresponding to the target text, vi denotes the path weight of the i-th dependency path, vk denotes the path weight of the k-th dependency path, M denotes the number of dependency paths in the dependency syntax tree, and ai denotes the normalized path weight obtained from vi. It should be noted that ai represents the weight of the current dependency path relative to all encoded dependency paths: the larger the value of ai, the more important the current dependency path is in the current application scenario relative to the other dependency paths.
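The two formulas referenced above appear only as images in the original patent and are not reproduced in this text. A plausible reconstruction, consistent with the variable definitions just given but an assumption rather than the patent's verbatim equations, is a dot-product score between the text encoding vector and each path encoding vector, followed by a softmax normalization:

```latex
v_i = h^{\top} S_i, \qquad
a_i = \frac{\exp(v_i)}{\sum_{k=1}^{M} \exp(v_k)}, \quad i = 1, \dots, M
```

Under this reading, the weights a_i sum to 1 over the M dependency paths, matching the description of a_i as the weight of one path relative to all encoded paths.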
S703: perform semantic representation on the target text according to the dependency relations among the target words and the importance of each dependency path.
In this embodiment, after the importance of each dependency path in the application scenario is determined through step S702, semantic representation can further be performed on the target text according to the dependency relations among the target words in the target text and the importance of each dependency path in the application scenario.
In one implementation of this embodiment, S703 can specifically include steps E-F:
Step E: determine the path encoding vector corresponding to all dependency paths according to the path encoding vector and path weight corresponding to each dependency path.
In this implementation, after the path encoding vector corresponding to each dependency path of the target text and the path weight of each dependency path are determined through step S702, the path encoding vectors of the dependency paths can be weighted and summed to determine the path encoding vector S corresponding to all dependency paths of the target text. In the formula computing the path encoding vector S of all dependency paths, M denotes the number of dependency paths in the dependency syntax tree corresponding to the target text, i denotes the i-th dependency path in the dependency syntax tree, Si denotes the path encoding vector of the i-th dependency path, and ai denotes the normalized path weight.
For example, as shown in Fig. 8, for the target text "he makes Tom go to take coat", after the path encoding vectors S1, S2, S3 and S4 corresponding to its four dependency paths are determined, the normalized path weights of the four dependency paths are computed separately using step D above, and the path encoding vectors of the four dependency paths are then weighted and summed to obtain the path encoding vector S corresponding to all dependency paths.
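Steps D and E together amount to an attention-style weighting, which can be sketched end to end. The sketch assumes the unshown formulas are a dot-product score followed by a softmax and a weighted sum (an assumption, not the patent's verbatim equations): v_i = h . S_i, a_i = exp(v_i) / sum_k exp(v_k), S = sum_i a_i * S_i.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def weighted_path_vector(h, path_vecs):
    """Score each path vector against h, softmax-normalize, weighted-sum."""
    v = [dot(h, s) for s in path_vecs]        # raw path weights v_i
    exp_v = [math.exp(x) for x in v]
    a = [x / sum(exp_v) for x in exp_v]       # normalized weights a_i
    dim = len(h)
    return [sum(a[i] * path_vecs[i][j] for i in range(len(path_vecs)))
            for j in range(dim)]

# Toy example: a text vector and two path encoding vectors.
h = [1.0, 0.0]
paths = [[1.0, 0.0], [0.0, 1.0]]
S = weighted_path_vector(h, paths)
```

The path more aligned with h receives the larger weight, which is the behavior the text ascribes to scenario-dependent path importance.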
Step F: use the text encoding vector and the path encoding vector corresponding to all dependency paths to express the semantic information of the target text.
In this implementation, after the path encoding vector S corresponding to all dependency paths of the target text is determined through step E, it can be combined with the text encoding vector h of the target text to realize the semantic representation of the target text.
Specifically, the text encoding vector h of the target text and the path encoding vector S corresponding to all its dependency paths can be spliced together to realize the semantic representation of the target text. Further, this semantic representation result can be used in natural-language-processing application scenarios such as sentence sentiment classification, category classification and sentence-similarity retrieval. Taking the sentence-sentiment-classification scenario as an example, the semantic representation result can serve as input data to a support vector machine (SVM) classifier or a multi-layer perceptron (MLP) for training a sentence sentiment classification model.
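The splicing in step F is plain vector concatenation, sketched below with toy values; the resulting vector is what would be handed to a downstream SVM or MLP classifier.

```python
# Toy text encoding vector h and aggregated path encoding vector S.
h = [0.5, -0.1, 0.3]
S = [0.2, 0.3, -0.4]

# Splicing h and S: the final semantic representation of the target text.
representation = h + S
```

The downstream classifier thus sees both the word-pair-level encoding and the path-level encoding in a single fixed-size input.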
In summary, this embodiment combines the dependency relations among the target words in the target text with each dependency path: it not only determines the text encoding vector corresponding to the target text according to the semantic information of each target word in the dependency syntax tree of the target text and the dependency relations between each target word and the other target words, but also encodes each dependency path to obtain the corresponding path encoding vector, combining the two to jointly perform semantic representation on the target text and further improving the accuracy of the semantic representation result.
Fourth embodiment
This embodiment introduces a text semantic representation device; for related content, refer to the method embodiments above.
Referring to Fig. 9, which shows a schematic composition diagram of a text semantic representation device provided by this embodiment, the device 900 includes:
a target-text acquiring unit 901, configured to acquire a target text to be expressed;
a target-word obtaining unit 902, configured to perform word segmentation on the target text to obtain each target word;
a dependency-relation determination unit 903, configured to perform dependency syntactic analysis on the target text to determine the dependency relations among the target words;
a text-semantic-representation unit 904, configured to perform semantic representation on the target text according to the dependency relations among the target words.
In one implementation of this embodiment, the dependency-relation determination unit 903 includes:
a word-pair obtaining subunit, configured to determine the head word having a dependency relation with the target word, and obtain the word pair composed of the target word and the head word, where the head word is either the root-node identifier or another target word different from the target word, the root-node identifier is the identifier of the root node of the dependency syntax tree, and the dependency syntax tree describes the dependency relations among the target words;
a dependency-relation determination subunit, configured to determine, for the word pair corresponding to each target word, the dependency relation between the two words of the word pair.
In one implementation of this embodiment, the text-semantic-representation unit 904 includes:
a first relation-vector determination subunit, configured to determine, for each word pair, the word vector corresponding to each word of the word pair and the relation vector corresponding to the dependency relation between the two words of the word pair;
a first encoding-vector obtaining subunit, configured to encode the two word vectors and the relation vector corresponding to each word pair to obtain the text encoding vector of the target text, where the text encoding vector expresses the syntactic information and word-sequence information of the target text;
a first semantic-information expression subunit, configured to use the text encoding vector to express the semantic information of the target text.
In one implementation of this embodiment, the text-semantic-representation unit 904 is specifically configured to perform semantic representation on the target text according to the dependency relations among the target words and each dependency path, where each dependency path is a subpath of the dependency syntax tree, the dependency syntax tree describes the dependency relations among the target words, and the end point of each subpath is a leaf node of the dependency syntax tree.
In one implementation of this embodiment, the text-semantic-representation unit 904 includes:
an application-scenario determination subunit, configured to determine the application scenario of the semantic representation result of the target text;
an importance determination subunit, configured to determine, respectively, the importance of each dependency path in the application scenario;
a text-semantic-representation subunit, configured to perform semantic representation on the target text according to the dependency relations among the target words and the importance of each dependency path.
In one implementation of this embodiment, the importance determination subunit includes:
a second relation-vector determination subunit, configured to determine, for each word pair, the word vector corresponding to each word of the word pair and the relation vector corresponding to the dependency relation between the two words of the word pair;
a second encoding-vector obtaining subunit, configured to encode the two word vectors and the relation vector corresponding to each word pair to obtain the text encoding vector of the target text, where the text encoding vector expresses the syntactic information and word-sequence information of the target text;
a path-encoding-vector obtaining subunit, configured to encode each dependency path to obtain the path encoding vector corresponding to that path, where the path encoding vector expresses the path information formed by the target words on the dependency path;
a path-weight determination subunit, configured to determine the path weight of the dependency path using the text encoding vector and the path encoding vector, where the path weight characterizes the importance of the dependency path in the application scenario.
In one implementation of this embodiment, the text-semantic-representation subunit includes:
a path-encoding-vector determination subunit, configured to determine the path encoding vector corresponding to all dependency paths according to the path encoding vector and path weight corresponding to each dependency path;
a second semantic-information expression subunit, configured to use the text encoding vector and the path encoding vector corresponding to all dependency paths to express the semantic information of the target text.
Further, an embodiment of the present application also provides a text semantic representation device, comprising: a processor, a memory and a system bus;
the processor and the memory are connected via the system bus;
the memory is configured to store one or more programs, the one or more programs comprising instructions that, when executed by the processor, cause the processor to execute any implementation of the text semantic representation method above.
Further, an embodiment of the present application also provides a computer-readable storage medium having instructions stored therein which, when run on a terminal device, cause the terminal device to execute any implementation of the text semantic representation method above.
Further, an embodiment of the present application also provides a computer program product which, when run on a terminal device, causes the terminal device to execute any implementation of the text semantic representation method above.
From the above description of the embodiments, those skilled in the art can clearly understand that all or part of the steps of the methods of the above embodiments can be realized by means of software plus a necessary general hardware platform. On this understanding, the technical solution of the present application, or the part of it that contributes over the prior art, can essentially be embodied in the form of a software product, which can be stored in a storage medium such as ROM/RAM, a magnetic disk or an optical disc, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network communication device such as a media gateway, etc.) to execute the methods described in the embodiments of the present application or in certain parts of the embodiments.
It should be noted that the embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and identical or similar parts of the embodiments may refer to one another. As for the devices disclosed in the embodiments, since they correspond to the methods disclosed in the embodiments, their description is relatively brief; for related details, refer to the description of the methods.
It should also be noted that, herein, relational terms such as "first" and "second" are used merely to distinguish one entity or operation from another, without necessarily requiring or implying any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device including a series of elements includes not only those elements but also other elements not explicitly listed, or further includes elements intrinsic to such a process, method, article or device. In the absence of further restrictions, an element limited by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or device including that element.
The foregoing description of the disclosed embodiments enables those skilled in the art to realize or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein can be realized in other embodiments without departing from the spirit or scope of the present application. Therefore, the present application is not intended to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (17)
1. A text semantic expression method, characterized by comprising:
obtaining a target text to be expressed;
performing word segmentation on the target text to obtain target words;
performing dependency syntactic analysis on the target text to determine dependency relations among the target words;
performing semantic expression on the target text according to the dependency relations among the target words.
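The pipeline claimed above (segment, parse dependencies, then express semantics from the relations) can be sketched in a few lines. This is an illustrative toy only: the `segment` and `parse_dependencies` stand-ins below are hypothetical hard-coded stubs, whereas a real system would call a trained segmenter and dependency parser.

```python
def segment(text):
    # Stand-in for word segmentation: the toy text is pre-tokenized with spaces.
    return text.split()

def parse_dependencies(words):
    # Stand-in for dependency parsing: returns (head_index, relation) per word,
    # with head_index -1 denoting the root node. Hard-coded for the toy sentence.
    toy = {"I": (1, "nsubj"), "like": (-1, "root"), "apples": (1, "dobj")}
    return [toy[w] for w in words]

words = segment("I like apples")
deps = parse_dependencies(words)
# Each target word together with its governing word's index and dependency label,
# the raw material the later claims encode into vectors:
pairs = [(w, head, rel) for w, (head, rel) in zip(words, deps)]
```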
2. The method according to claim 1, wherein determining the dependency relations among the target words comprises:
for each target word, determining a head word having a dependency relation with the target word, and obtaining a word pair composed of the target word and the head word, wherein the head word is either a root-node marker or another target word different from the target word, the root-node marker identifies the root node of a dependency syntax tree, and the dependency syntax tree describes the dependency relations among the target words;
for the word pair corresponding to each target word, determining the dependency relation between the two words of the word pair.
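A minimal sketch of claim 2's word-pair construction, assuming the parse is given as a list of head indices (toy data; `-1` marks the word governed by the root, and the `"<ROOT>"` marker name is our own choice, not from the patent):

```python
ROOT = "<ROOT>"  # illustrative root-node marker

def word_pairs(words, heads):
    # Pair each word with its governing (head) word; the word governed by the
    # tree root is paired with the root-node marker instead.
    return [(w, ROOT if h == -1 else words[h]) for w, h in zip(words, heads)]

pairs = word_pairs(["I", "like", "apples"], [1, -1, 1])
# → [("I", "like"), ("like", "<ROOT>"), ("apples", "like")]
```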
3. The method according to claim 2, wherein performing semantic expression on the target text according to the dependency relations among the target words comprises:
for each word pair, determining the word vector corresponding to each word of the word pair and the relation vector corresponding to the dependency relation between the two words of the word pair;
encoding the two word vectors and the relation vector corresponding to each word pair to obtain a text encoding vector of the target text, wherein the text encoding vector expresses the syntactic information and the word-order information of the target text;
expressing the semantic information of the target text by using the text encoding vector.
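One way to read claim 3's encoding step: concatenate the two word vectors and the relation vector of each pair, then combine the pair representations into one text encoding vector. The tiny embedding tables and the mean-pooling "encoder" below are illustrative assumptions; the claim leaves the encoder architecture open (an RNN or similar sequence encoder would be typical).

```python
# Toy 2-d word vectors and 1-d relation vectors (purely illustrative).
word_vec = {"I": [1.0, 0.0], "like": [0.0, 1.0], "apples": [1.0, 1.0], "<ROOT>": [0.0, 0.0]}
rel_vec = {"nsubj": [1.0], "root": [0.0], "dobj": [0.5]}

def pair_repr(word, head, rel):
    # Concatenate: word vector ++ head word vector ++ relation vector.
    return word_vec[word] + word_vec[head] + rel_vec[rel]

def encode_text(pairs):
    # Stand-in encoder: mean-pool the pair representations into one vector.
    reprs = [pair_repr(*p) for p in pairs]
    n = len(reprs)
    return [sum(col) / n for col in zip(*reprs)]

text_vec = encode_text([("I", "like", "nsubj"),
                        ("like", "<ROOT>", "root"),
                        ("apples", "like", "dobj")])
```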
4. The method according to claim 1, wherein performing semantic expression on the target text according to the dependency relations among the target words comprises:
performing semantic expression on the target text according to the dependency relations among the target words and each dependency path, wherein each dependency path is a sub-path in a dependency syntax tree, the dependency syntax tree describes the dependency relations among the target words, and the end point of the sub-path is a leaf node of the dependency syntax tree.
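The dependency paths of claim 4, paths through the dependency syntax tree ending at a leaf node, can be enumerated with a simple depth-first walk. The toy tree below (children given as an adjacency map) is our own example, not from the patent:

```python
def root_to_leaf_paths(children, node):
    # Depth-first enumeration of every path from `node` down to a leaf
    # of the dependency tree.
    if not children.get(node):
        return [[node]]
    paths = []
    for child in children[node]:
        for tail in root_to_leaf_paths(children, child):
            paths.append([node] + tail)
    return paths

# "like" governs "I" and "apples"; both dependents are leaves.
tree = {"like": ["I", "apples"], "I": [], "apples": []}
paths = root_to_leaf_paths(tree, "like")
# → [["like", "I"], ["like", "apples"]]
```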
5. The method according to claim 4, wherein performing semantic expression on the target text according to the dependency relations among the target words and each dependency path comprises:
determining the application scenario of the semantic expression result of the target text;
determining, for each dependency path, the importance of that path in the application scenario;
performing semantic expression on the target text according to the dependency relations among the target words and the importance of each dependency path.
6. The method according to claim 5, wherein determining, for each dependency path, the importance of that path in the application scenario comprises:
for each word pair, determining the word vector corresponding to each word of the word pair and the relation vector corresponding to the dependency relation between the two words of the word pair;
encoding the two word vectors and the relation vector corresponding to each word pair to obtain a text encoding vector of the target text, wherein the text encoding vector expresses the syntactic information and the word-order information of the target text;
encoding each dependency path to obtain a path encoding vector corresponding to that path, the path encoding vector expressing the path information formed by the target words on the dependency path;
determining a path weight of the dependency path by using the text encoding vector and the path encoding vector, wherein the path weight characterizes the importance of the dependency path in the application scenario.
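One plausible realization of the path weighting above: score each path encoding vector against the text encoding vector with a dot product and normalize the scores with a softmax. The scoring function is our assumption; the claim only requires that the weight be computed from the two vectors.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def path_weights(text_vec, path_vecs):
    # Attention-style weighting: dot-product scores, then a numerically
    # stable softmax so the weights are positive and sum to 1.
    scores = [dot(text_vec, p) for p in path_vecs]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

weights = path_weights([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
# The path whose encoding aligns better with the text encoding receives
# the larger weight, i.e. higher importance in the application scenario.
```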
7. The method according to claim 6, wherein performing semantic expression on the target text according to the dependency relations among the target words and the importance of each dependency path comprises:
determining a path encoding vector corresponding to all the dependency paths according to the path encoding vector and the path weight corresponding to each dependency path;
expressing the semantic information of the target text by using the text encoding vector and the path encoding vector corresponding to all the dependency paths.
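Claim 7 collapses the per-path encodings into a single vector using the path weights; a weighted sum is the natural reading, and combining the result with the text encoding vector by concatenation is our assumption rather than anything the claim fixes:

```python
def aggregate_paths(path_vecs, weights):
    # Weighted sum of the path encoding vectors, dimension by dimension.
    dim = len(path_vecs[0])
    return [sum(w * p[i] for w, p in zip(weights, path_vecs)) for i in range(dim)]

paths_vec = aggregate_paths([[1.0, 0.0], [0.0, 1.0]], [0.75, 0.25])
# Final semantic representation: text encoding ++ aggregated path encoding.
semantic_repr = [0.5, 0.5] + paths_vec
# → [0.5, 0.5, 0.75, 0.25]
```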
8. A text semantic expression apparatus, characterized by comprising:
a target text acquiring unit, configured to obtain a target text to be expressed;
a target word obtaining unit, configured to perform word segmentation on the target text to obtain target words;
a dependency relation determining unit, configured to perform dependency syntactic analysis on the target text to determine the dependency relations among the target words;
a text semantic expression unit, configured to perform semantic expression on the target text according to the dependency relations among the target words.
9. The apparatus according to claim 8, wherein the dependency relation determining unit comprises:
a word pair obtaining subunit, configured to determine, for each target word, a head word having a dependency relation with the target word and obtain a word pair composed of the target word and the head word, wherein the head word is either a root-node marker or another target word different from the target word, the root-node marker identifies the root node of a dependency syntax tree, and the dependency syntax tree describes the dependency relations among the target words;
a dependency relation determining subunit, configured to determine, for the word pair corresponding to each target word, the dependency relation between the two words of the word pair.
10. The apparatus according to claim 9, wherein the text semantic expression unit comprises:
a first relation vector determining subunit, configured to determine, for each word pair, the word vector corresponding to each word of the word pair and the relation vector corresponding to the dependency relation between the two words of the word pair;
a first encoding vector obtaining subunit, configured to encode the two word vectors and the relation vector corresponding to each word pair to obtain a text encoding vector of the target text, wherein the text encoding vector expresses the syntactic information and the word-order information of the target text;
a first semantic information expression subunit, configured to express the semantic information of the target text by using the text encoding vector.
11. The apparatus according to claim 8, wherein the text semantic expression unit is specifically configured to perform semantic expression on the target text according to the dependency relations among the target words and each dependency path, wherein each dependency path is a sub-path in a dependency syntax tree, the dependency syntax tree describes the dependency relations among the target words, and the end point of the sub-path is a leaf node of the dependency syntax tree.
12. The apparatus according to claim 11, wherein the text semantic expression unit comprises:
an application scenario determining subunit, configured to determine the application scenario of the semantic expression result of the target text;
an importance determining subunit, configured to determine, for each dependency path, the importance of that path in the application scenario;
a text semantic expression subunit, configured to perform semantic expression on the target text according to the dependency relations among the target words and the importance of each dependency path.
13. The apparatus according to claim 12, wherein the importance determining subunit comprises:
a second relation vector determining subunit, configured to determine, for each word pair, the word vector corresponding to each word of the word pair and the relation vector corresponding to the dependency relation between the two words of the word pair;
a second encoding vector obtaining subunit, configured to encode the two word vectors and the relation vector corresponding to each word pair to obtain a text encoding vector of the target text, wherein the text encoding vector expresses the syntactic information and the word-order information of the target text;
a path encoding vector obtaining subunit, configured to encode each dependency path to obtain a path encoding vector corresponding to that path, the path encoding vector expressing the path information formed by the target words on the dependency path;
a path weight determining subunit, configured to determine a path weight of the dependency path by using the text encoding vector and the path encoding vector, wherein the path weight characterizes the importance of the dependency path in the application scenario.
14. The apparatus according to claim 13, wherein the text semantic expression subunit comprises:
a path encoding vector determining subunit, configured to determine a path encoding vector corresponding to all the dependency paths according to the path encoding vector and the path weight corresponding to each dependency path;
a second semantic information expression subunit, configured to express the semantic information of the target text by using the text encoding vector and the path encoding vector corresponding to all the dependency paths.
15. A text semantic expression apparatus, characterized by comprising: a processor, a memory, and a system bus;
the processor and the memory being connected via the system bus;
the memory being configured to store one or more programs, the one or more programs comprising instructions which, when executed by the processor, cause the processor to perform the method according to any one of claims 1-7.
16. A computer-readable storage medium, wherein the computer-readable storage medium stores instructions which, when run on a terminal device, cause the terminal device to perform the method according to any one of claims 1-7.
17. A computer program product which, when run on a terminal device, causes the terminal device to perform the method according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810942947.2A CN109062902B (en) | 2018-08-17 | 2018-08-17 | Text semantic expression method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109062902A (en) | 2018-12-21 |
CN109062902B (en) | 2022-12-06 |
Family
ID=64687412
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810942947.2A Active CN109062902B (en) | 2018-08-17 | 2018-08-17 | Text semantic expression method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109062902B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080005094A1 (en) * | 2006-07-01 | 2008-01-03 | Kevin Cunnane | Method and system for finding the focus of a document |
CN106156041A (en) * | 2015-03-26 | 2016-11-23 | 科大讯飞股份有限公司 | Hot information finds method and system |
CN106155999A (en) * | 2015-04-09 | 2016-11-23 | 科大讯飞股份有限公司 | Semantics comprehension on natural language method and system |
CN106844327A (en) * | 2015-12-07 | 2017-06-13 | 科大讯飞股份有限公司 | Text code method and system |
CN107145512A (en) * | 2017-03-31 | 2017-09-08 | 北京大学 | The method and apparatus of data query |
Non-Patent Citations (3)
Title |
---|
ROTH M等: "Neural semantic role labeling with dependency path embeddings", 《ARXIV》 * |
吕愿愿等: "利用实体与依存句法结构特征的病历短文本分类方法", 《中国医疗器械杂志》 * |
曹莉丽等: "融合词向量的多特征问句相似度计算方法研究", 《现代计算机(专业版)》 * |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109992777A (en) * | 2019-03-26 | 2019-07-09 | 浙江大学 | A kind of crucial semantic information extracting method of Chinese medicine state of an illness text based on keyword |
CN111062200A (en) * | 2019-12-12 | 2020-04-24 | 北京声智科技有限公司 | Phonetics generalization method, phonetics identification method, device and electronic equipment |
CN111062200B (en) * | 2019-12-12 | 2024-03-05 | 北京声智科技有限公司 | Speaking generalization method, speaking recognition device and electronic equipment |
CN111666738A (en) * | 2020-06-09 | 2020-09-15 | 南京师范大学 | Formalized coding method for motion description natural text |
CN112115700A (en) * | 2020-08-19 | 2020-12-22 | 北京交通大学 | Dependency syntax tree and deep learning based aspect level emotion analysis method |
CN112115700B (en) * | 2020-08-19 | 2024-03-12 | 北京交通大学 | Aspect-level emotion analysis method based on dependency syntax tree and deep learning |
WO2021213155A1 (en) * | 2020-11-25 | 2021-10-28 | 平安科技(深圳)有限公司 | Method, apparatus, medium, and electronic device for adding punctuation to text |
CN112883741A (en) * | 2021-04-29 | 2021-06-01 | 华南师范大学 | Specific target emotion classification method based on dual-channel graph neural network |
CN113593557A (en) * | 2021-07-27 | 2021-11-02 | 中国平安人寿保险股份有限公司 | Distributed session method, device, computer equipment and storage medium |
CN113593557B (en) * | 2021-07-27 | 2023-09-12 | 中国平安人寿保险股份有限公司 | Distributed session method, device, computer equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111488734B (en) | Emotional feature representation learning system and method based on global interaction and syntactic dependency | |
CN109062902A (en) | A kind of text semantic expression and device | |
CN110826336B (en) | Emotion classification method, system, storage medium and equipment | |
CN109933664B (en) | Fine-grained emotion analysis improvement method based on emotion word embedding | |
CN107133224B (en) | Language generation method based on subject word | |
CN107229610B (en) | A kind of analysis method and device of affection data | |
CN111274398B (en) | Method and system for analyzing comment emotion of aspect-level user product | |
CN108519890A (en) | A kind of robustness code abstraction generating method based on from attention mechanism | |
CN111931506B (en) | Entity relationship extraction method based on graph information enhancement | |
CN108780464A (en) | Method and system for handling input inquiry | |
CN111797898B (en) | Online comment automatic reply method based on deep semantic matching | |
CN110196928B (en) | Fully parallelized end-to-end multi-turn dialogue system with domain expansibility and method | |
CN115329127A (en) | Multi-mode short video tag recommendation method integrating emotional information | |
CN111144097B (en) | Modeling method and device for emotion tendency classification model of dialogue text | |
Li et al. | Learning document embeddings by predicting n-grams for sentiment classification of long movie reviews | |
CN114756681B (en) | Evaluation and education text fine granularity suggestion mining method based on multi-attention fusion | |
CN115495568B (en) | Training method and device for dialogue model, dialogue response method and device | |
CN111400584A (en) | Association word recommendation method and device, computer equipment and storage medium | |
JP2006190229A (en) | Opinion extraction learning device and opinion extraction classifying device | |
Lee et al. | Off-Topic Spoken Response Detection Using Siamese Convolutional Neural Networks. | |
CN114528398A (en) | Emotion prediction method and system based on interactive double-graph convolutional network | |
Celikyilmaz et al. | A New Pre-Training Method for Training Deep Learning Models with Application to Spoken Language Understanding. | |
CN111382333B (en) | Case element extraction method in news text sentence based on case correlation joint learning and graph convolution | |
CN116932938A (en) | Link prediction method and system based on topological structure and attribute information | |
JP2007241881A (en) | Method, device and program for creating opinion property determination database, and method, device and program for determining opinion property, and computer readable recording medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||