CN109062902A - Text semantic representation method and device - Google Patents

Text semantic representation method and device

Info

Publication number
CN109062902A
CN109062902A (application CN201810942947.2A)
Authority
CN
China
Prior art keywords
word
text
target
path
dependency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810942947.2A
Other languages
Chinese (zh)
Other versions
CN109062902B (en)
Inventor
华磊
刘权
陈志刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
iFlytek Co Ltd
Original Assignee
iFlytek Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by iFlytek Co Ltd filed Critical iFlytek Co Ltd
Priority to CN201810942947.2A priority Critical patent/CN109062902B/en
Publication of CN109062902A publication Critical patent/CN109062902A/en
Application granted granted Critical
Publication of CN109062902B publication Critical patent/CN109062902B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • G06F40/35Discourse or dialogue representation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

This application discloses a text semantic representation method and device. The method comprises: after obtaining a target text to be represented, performing word segmentation on the target text to obtain target words; performing dependency parsing on the target text to determine the dependency relations among the target words; and then representing the semantics of the target text according to those dependency relations. Thus, after obtaining the target text to be represented, the embodiments of this application no longer represent its semantics with the common one-hot scheme; instead, the semantic representation is built from the dependency relations among the words of the text. Because the semantic relations between words are taken into account during semantic representation, the accuracy of the semantic representation result is improved.

Description

Text semantic representation method and device
Technical field
This application relates to the field of natural language processing, and in particular to a text semantic representation method and device.
Background
A text may be a sentence or a passage. Semantic representation of a text means encoding a natural-language text into a specific vector so that the vector carries the semantic information of the text. A good semantic representation helps improve the effect and performance of tasks such as text-similarity retrieval, sentiment classification, and domain classification.
Existing semantic representation typically uses the one-hot scheme, in which 0 and 1 indicate whether each word is present in a text. Specifically, a vocabulary containing a large number of words is created in advance. Taking a text A as an example, each vocabulary word that appears in text A is marked 1 and each vocabulary word that does not appear in text A is marked 0, forming a text vector of 0s and 1s that expresses the semantic information of text A; the dimension of the text vector equals the number of words in the vocabulary.
However, this one-hot way of representing a text ignores the semantic relations between the words of the text, which makes the resulting semantic representation inaccurate.
Summary of the invention
The main purpose of the embodiments of this application is to provide a text semantic representation method and device that improve the accuracy of semantic representation results.
An embodiment of this application provides a text semantic representation method, comprising:
obtaining a target text to be represented;
performing word segmentation on the target text to obtain target words;
performing dependency parsing on the target text to determine the dependency relations among the target words;
representing the semantics of the target text according to the dependency relations among the target words.
Optionally, determining the dependency relations among the target words comprises:
determining, for each target word, a governing word that has a dependency relation with the target word, and obtaining a word pair consisting of the target word and the governing word, wherein the governing word is either the root-node label or another target word different from the target word, the root-node label is the label of the root node of a dependency tree, and the dependency tree describes the dependency relations among the target words;
determining, for the word pair corresponding to each target word, the dependency relation between the two words of the pair.
Optionally, representing the semantics of the target text according to the dependency relations among the target words comprises:
determining, for each word pair, the word vector of each word of the pair and the relation vector of the dependency relation between the two words of the pair;
encoding the two word vectors and the relation vector of each word pair to obtain a text encoding vector of the target text, wherein the text encoding vector expresses the syntactic information and word-order information of the target text;
expressing the semantic information of the target text using the text encoding vector.
Optionally, representing the semantics of the target text according to the dependency relations among the target words comprises:
representing the semantics of the target text according to the dependency relations among the target words and each dependency path, wherein each dependency path is a sub-path in a dependency tree, the dependency tree describes the dependency relations among the target words, and the end point of each sub-path is a leaf node of the dependency tree.
Optionally, representing the semantics of the target text according to the dependency relations among the target words and each dependency path comprises:
determining the application scenario of the semantic representation result of the target text;
determining the importance of each dependency path in the application scenario;
representing the semantics of the target text according to the dependency relations among the target words and the importance of each dependency path.
Optionally, determining the importance of each dependency path in the application scenario comprises:
determining, for each word pair, the word vector of each word of the pair and the relation vector of the dependency relation between the two words of the pair;
encoding the two word vectors and the relation vector of each word pair to obtain a text encoding vector of the target text, wherein the text encoding vector expresses the syntactic information and word-order information of the target text;
encoding each dependency path to obtain a path encoding vector for the path, the path encoding vector expressing the path information formed by the target words on the path;
determining the path weight of each dependency path using the text encoding vector and the path encoding vector, wherein the path weight characterizes the importance of the path in the application scenario.
Optionally, representing the semantics of the target text according to the dependency relations among the target words and the importance of each dependency path comprises:
determining a path encoding vector for all dependency paths according to the path encoding vector and path weight of each dependency path;
expressing the semantic information of the target text using the text encoding vector and the path encoding vector for all dependency paths.
An embodiment of this application also provides a text semantic representation device, comprising:
a target-text acquiring unit for obtaining a target text to be represented;
a target-word obtaining unit for performing word segmentation on the target text to obtain target words;
a dependency-relation determining unit for performing dependency parsing on the target text to determine the dependency relations among the target words;
a text semantic representation unit for representing the semantics of the target text according to the dependency relations among the target words.
Optionally, the dependency-relation determining unit comprises:
a word-pair obtaining subunit for determining, for each target word, a governing word that has a dependency relation with the target word and obtaining a word pair consisting of the target word and the governing word, wherein the governing word is either the root-node label or another target word different from the target word, the root-node label is the label of the root node of a dependency tree, and the dependency tree describes the dependency relations among the target words;
a dependency-relation determining subunit for determining, for the word pair corresponding to each target word, the dependency relation between the two words of the pair.
Optionally, the text semantic representation unit comprises:
a first relation-vector determining subunit for determining, for each word pair, the word vector of each word of the pair and the relation vector of the dependency relation between the two words of the pair;
a first encoding-vector obtaining subunit for encoding the two word vectors and the relation vector of each word pair to obtain a text encoding vector of the target text, wherein the text encoding vector expresses the syntactic information and word-order information of the target text;
a first semantic-information expressing subunit for expressing the semantic information of the target text using the text encoding vector.
Optionally, the text semantic representation unit is specifically configured to represent the semantics of the target text according to the dependency relations among the target words and each dependency path, wherein each dependency path is a sub-path in a dependency tree, the dependency tree describes the dependency relations among the target words, and the end point of each sub-path is a leaf node of the dependency tree.
Optionally, the text semantic representation unit comprises:
an application-scenario determining subunit for determining the application scenario of the semantic representation result of the target text;
an importance determining subunit for determining the importance of each dependency path in the application scenario;
a text-semantics expressing subunit for representing the semantics of the target text according to the dependency relations among the target words and the importance of each dependency path.
Optionally, the importance determining subunit comprises:
a second relation-vector determining subunit for determining, for each word pair, the word vector of each word of the pair and the relation vector of the dependency relation between the two words of the pair;
a second encoding-vector obtaining subunit for encoding the two word vectors and the relation vector of each word pair to obtain a text encoding vector of the target text, wherein the text encoding vector expresses the syntactic information and word-order information of the target text;
a path-encoding-vector obtaining subunit for encoding each dependency path to obtain a path encoding vector for the path, the path encoding vector expressing the path information formed by the target words on the path;
a path-weight determining subunit for determining the path weight of each dependency path using the text encoding vector and the path encoding vector, wherein the path weight characterizes the importance of the path in the application scenario.
Optionally, the text-semantics expressing subunit comprises:
a path-encoding-vector determining subunit for determining a path encoding vector for all dependency paths according to the path encoding vector and path weight of each dependency path;
a second semantic-information expressing subunit for expressing the semantic information of the target text using the text encoding vector and the path encoding vector for all dependency paths.
An embodiment of this application also provides a text semantic representation device, comprising a processor, a memory, and a system bus;
the processor and the memory are connected by the system bus;
the memory is configured to store one or more programs comprising instructions which, when executed by the processor, cause the processor to perform any implementation of the above text semantic representation method.
An embodiment of this application also provides a computer-readable storage medium storing instructions which, when run on a terminal device, cause the terminal device to perform any implementation of the above text semantic representation method.
An embodiment of this application also provides a computer program product which, when run on a terminal device, causes the terminal device to perform any implementation of the above text semantic representation method.
With the text semantic representation method and device provided by the embodiments of this application, after a target text to be represented is obtained, word segmentation is performed on it to obtain the target words; dependency parsing is then performed on the target text to determine the dependency relations among the target words; and the semantics of the target text can then be represented according to those dependency relations. Thus, after obtaining the target text to be represented, the embodiments no longer represent its semantics with the common one-hot scheme but build the representation from the dependency relations among the words of the text, that is, the semantic relations between words are taken into account during semantic representation, improving the accuracy of the semantic representation result.
Brief description of the drawings
To describe the technical solutions in the embodiments of this application or in the prior art more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described here are only some embodiments of this application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow diagram of a text semantic representation method provided by an embodiment of this application;
Fig. 2 is a flow diagram of determining the dependency relations among target words provided by an embodiment of this application;
Fig. 3 is a schematic diagram of the result of dependency parsing of a target text provided by an embodiment of this application;
Fig. 4 is a structural diagram of a dependency tree and dependency paths provided by an embodiment of this application;
Fig. 5 is the first flow diagram of semantic representation of a target text provided by an embodiment of this application;
Fig. 6 is a structural diagram of generating the text encoding vector of a target text provided by an embodiment of this application;
Fig. 7 is the second flow diagram of semantic representation of a target text provided by an embodiment of this application;
Fig. 8 is a structural diagram of generating the path encoding vectors of a target text provided by an embodiment of this application;
Fig. 9 is a structural diagram of a text semantic representation device provided by an embodiment of this application.
Detailed description
In some text semantic representation methods, the semantics of a text are usually represented with the one-hot scheme. In such a one-hot representation, however, the dimension of the vocabulary is generally very high (common Chinese words number more than 100,000), which makes the computation expensive. Meanwhile, this representation ignores the semantic relations between the words of a text: for example, although the words "apple" and "pear" both denote fruit, the two are completely unrelated in a one-hot representation, each expressed merely as 0 or 1. That is, the semantic associations between words are not taken into account, which makes the resulting text semantic representation inaccurate.
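The orthogonality complained of here can be seen in a few lines. The sketch below uses a toy five-word vocabulary (an illustrative assumption, not the patent's vocabulary): every word becomes a standard basis vector, so the dot product of any two distinct words — "apple" and "pear" included — is zero.

```python
# Minimal one-hot sketch over a toy vocabulary.
vocab = ["apple", "pear", "coat", "go", "take"]

def one_hot(word):
    """Return the one-hot vector of `word` over `vocab`."""
    return [1 if w == word else 0 for w in vocab]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

apple, pear = one_hot("apple"), one_hot("pear")
print(apple)             # [1, 0, 0, 0, 0]
print(dot(apple, pear))  # 0 -> one-hot sees no relation between the two fruits
```

Note also that the vector length always equals the vocabulary size, which is exactly the dimensionality problem described above.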
To overcome these drawbacks, an embodiment of this application provides a text semantic representation method: after a text to be represented is obtained, word segmentation is first performed on it to obtain each word of the text; dependency parsing is then performed on the text to determine the dependency relations among its words; and the semantics of the text are then represented according to those dependency relations. Thus, the embodiments of this application no longer represent text semantics with the traditional one-hot scheme but build the representation from the dependency relations among the words of the text, that is, they take into account the influence of the semantic relations between the words on the representation result, improving the accuracy of the text semantic representation.
To make the purposes, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative effort shall fall within the protection scope of this application.
First embodiment
Referring to Fig. 1, a flow diagram of the text semantic representation method provided in this embodiment, the method comprises the following steps:
S101: obtain a target text to be represented.
In this embodiment, any text for which text semantic representation is to be performed is defined as a target text. The embodiment does not limit the language of the target text; for example, it may be a Chinese or an English text. Nor does it limit the length of the target text; for example, the target text may be a sentence or a passage. Nor does it limit the source of the target text; for example, the target text may be a speech recognition result, or log data collected from the business systems of a platform. Nor does it limit the type of the target text; for example, the target text may be a remark from everyday conversation, or part of a speech draft, magazine article, or literary work.
It can be understood that a sentence text is a single sentence, i.e. a set of words, while a passage text is a set of consecutive sentences. After a sentence text or a passage text is obtained as the target text to be represented, its semantics can be represented according to the subsequent steps.
S102: perform word segmentation on the target text to obtain target words.
In this embodiment, after the target text to be represented is obtained in step S101, word segmentation can be performed on it in order to achieve a more accurate semantic representation, so as to obtain each word contained in the target text; each word obtained by segmentation is defined here as a target word.
When the target text is a sentence, an existing or future segmentation method can be used to segment it and obtain its words as the target words. For example, suppose the target text is "He asked Tom to fetch the coat"; after segmentation, the six target words are "he, asked, Tom, go, fetch, coat".
Alternatively, if the target text is a passage, it first needs to be split into sentences, and the segmentation method is then applied to each sentence to obtain the words of the target text as the target words.
S103: perform dependency parsing on the target text and determine the dependency relations among the target words.
In this embodiment, after the target words of the target text are obtained in step S102, a dependency parsing method can further be used to parse the target text and determine the dependency relations among the target words, where a dependency relation between target words refers to the semantic association between them. For example, among the six target words "he, asked, Tom, go, fetch, coat" of the target text "He asked Tom to fetch the coat", the semantic association between "he" and "asked" is a subject-verb relation.
Note that the specific process of parsing the target text to determine the dependency relations among the target words is described in the second embodiment below.
S104: represent the semantics of the target text according to the dependency relations among the target words.
In this embodiment, after the dependency relations among the target words of the target text are determined in step S103, the semantics of the target text can be represented according to those dependency relations, such as subject-verb and verb-object relations.
Specifically, during the semantic representation of the target text, the dependency tree of the target text can first be built from its dependency parsing result; the text encoding vector of the target text is then determined from the semantic information of each target word in the dependency tree and the dependency relations between each target word and the other target words; and this text encoding vector can then be used to represent the semantics of the target text. The specific process of representing the semantics of the target text according to the dependency relations among the target words is described in the second embodiment below.
Further, while the dependency tree of the target text is being built, the dependency paths of the target text can also be obtained from the tree structure, each of which contributes to the semantic representation of the target text. After the text encoding vector of the target text has been determined, the path encoding vector of each dependency path in the target text can be further determined on its basis; the path encoding vector reflects the parent-child relations of the target words on the corresponding dependency path. More accurate semantic representation of the target text can then be achieved according to both the dependency relations among the target words and the dependency paths of the target text, that is, from the text encoding vector of the target text and the path encoding vector of each of its dependency paths.
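One way to combine the text encoding vector with the per-path encoding vectors is an attention-style weighting: score each path vector against the text vector and normalize the scores into path weights. The softmax-over-dot-products formula and the toy two-dimensional vectors below are assumptions for illustration; the patent does not fix the exact weighting function.

```python
# Hedged sketch: attention-style weighting of dependency-path vectors
# against the text encoding vector.
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

text_vec = [0.2, 0.8]                 # toy text encoding vector
path_vecs = [[0.1, 0.9], [0.9, 0.1]]  # toy path encoding vectors, one per path

# Paths whose encoding agrees with the text encoding get more weight.
weights = softmax([dot(text_vec, p) for p in path_vecs])

# Weighted sum over paths -> a single vector standing for all paths.
all_paths_vec = [sum(w * p[i] for w, p in zip(weights, path_vecs))
                 for i in range(len(text_vec))]
print([round(w, 3) for w in weights])  # the first path dominates
```

Downstream, `all_paths_vec` would be combined with the text encoding vector to express the final semantics of the target text.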
Therefore, to improve the accuracy of the semantic representation of the target text, after the dependency relations among the target words have been determined, the dependency paths of the target text can additionally be combined into the representation. The specific process of combining the dependency relations among the target words with the dependency paths of the target text to represent the semantics of the target text is described in the third embodiment below.
To sum up, with the text semantic representation method provided in this embodiment, after a target text to be represented is obtained, word segmentation is performed on it to obtain the target words; dependency parsing is then performed on the target text to determine the dependency relations among the target words; and the semantics of the target text can then be represented according to those dependency relations. Thus, after obtaining the target text to be represented, the embodiments of this application no longer represent its semantics with the common one-hot scheme but build the representation from the dependency relations among the words of the text, that is, the semantic relations between words are taken into account during semantic representation, improving the accuracy of the semantic representation result.
Second embodiment
This embodiment first describes a specific implementation of "determining the dependency relations among the target words" of step S103 in the first embodiment.
Referring to Fig. 2, which shows a flow diagram of determining the dependency relations among the target words provided in this embodiment, the process comprises the following steps:
S201: determine the governing word that has a dependency relation with each target word, and obtain the word pair consisting of the target word and the governing word, wherein the governing word is either the root-node label or another target word different from the target word, the root-node label is the label of the root node of a dependency tree, and the dependency tree describes the dependency relations among the target words.
In this embodiment, after the target text has been segmented into its target words, a dependency parsing method can further be applied to the target text; for example, the Language Technology Platform (LTP) of Harbin Institute of Technology can be used to parse the target text. From the analysis result, the governing word having a dependency relation with each target word in the target text can be determined, and each target word can then be combined with its governing word into a word pair; meanwhile, the dependency tree of the target text, which describes the dependency relations among its target words, can be built from the same result.
For each target word in the target text, the governing word having a dependency relation with it is either the root-node label of the dependency tree or another target word in the dependency tree.
For example, continuing the earlier example, segmenting the target text "He asked Tom to fetch the coat" yields the six target words "he, asked, Tom, go, fetch, coat". After the target text is parsed with LTP, the analysis result is as shown in Fig. 3: the bottom box of Fig. 3 illustrates the dependency parse of the sentence, in which each target word has an incoming arrow whose other end connects to the governing word having a dependency relation with it; each target word and its governing word constitute a word pair, and in the semantic association the target word is governed by its governing word.
As shown in Fig. 3, the governing word of the target word "he" is "asked", their dependency relation is the subject-verb relation (SBV), and the two form a word pair. The governing word of "asked" is the root-node label "ROOT" of the dependency tree, their relation is the head relation (HED), and this pair marks the core of the whole sentence. The governing word of "Tom" is "asked", their relation is the double relation (DBL). The governing word of "go" is "fetch", their relation is the adverbial relation (ADV). The governing word of "fetch" is "asked", their relation is the verb-object relation (VOB). The governing word of "coat" is "fetch", their relation is also VOB; each of these pairs likewise constitutes a word pair.
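A parse result of this shape can be held as (word, governing word, relation) triples, from which the word pairs of step S201 fall out directly. The triples below mirror Fig. 3, using English glosses of the patent's Chinese example sentence.

```python
# Word pairs and relations from a dependency parse (Fig. 3, glossed).
parse = [
    ("he",    "asked", "SBV"),   # subject-verb
    ("asked", "ROOT",  "HED"),   # head of the whole sentence
    ("Tom",   "asked", "DBL"),   # double
    ("go",    "fetch", "ADV"),   # adverbial
    ("fetch", "asked", "VOB"),   # verb-object
    ("coat",  "fetch", "VOB"),   # verb-object
]

# Each target word paired with its governing word (step S201) ...
word_pairs = [(w, head) for w, head, _ in parse]
# ... and the dependency relation of each pair (step S202).
relation_of = {(w, head): rel for w, head, rel in parse}

print(word_pairs[0], relation_of[("he", "asked")])  # ('he', 'asked') SBV
```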
Meanwhile based on it is shown in Fig. 3 to target text " he makes Tom go to take coat " interdependent syntactic analysis of progress as a result, Wherein every a pair of of word centering can be dominated into word as the father node of corresponding target word, further by the interdependent syntax Analysis result is launched into the form of tree,, should be according to as shown in the left hand view of Fig. 4 to construct the corresponding interdependent syntax tree of target text Dependence in target text between each target word can be described by depositing syntax tree, from each leaf of the interdependent syntax tree Target word in node starts successively to find corresponding father node upwards, the available target word it is corresponding one it is interdependent Path starts successively to search out corresponding father node upwards to be " taking " -> " crying " -> " ROOT " by taking leaf node " going " as an example, this The corresponding interdependent path of the available target word " going " of sample is " ROOT- is cried, and-taking-goes ", as the 3rd article of Fig. 4 right part of flg according to It deposits shown in path, similarly, corresponding three interdependent paths such as Fig. 4 of target word in other available three leaf nodes is right Shown in the figure of side, that is to say, that produce four interdependent paths according to the interdependent syntax tree in left side is corresponding.
S202: For the word pair corresponding to each target word, determine the dependency relation between the two words of the word pair.

In the present embodiment, after each target word of the target text and the word pair formed with its head word are obtained through step S201, the dependency relation between the target word and its head word can be determined for each word pair. For example, based on the example above, for the target word "he" and its head word "asks" in the target text "He asks Tom to fetch the coat", the dependency relation of the word pair ("he", "asks") can be determined to be SBV; similarly, the dependency relation of the word pair ("Tom", "asks") can be determined to be DBL, and so on.
After the dependency relations among the target words are determined through steps S201-S202, the present embodiment next introduces, through the following steps S501-S503, a specific implementation of step S104 of the first embodiment, namely "performing semantic representation on the target text according to the dependency relations among the target words".

In the present embodiment, the text encoding vector of the target text is determined according to the semantic information of each target word in the dependency syntax tree and the dependency relations between each target word and the other target words; the text encoding vector can then be used to realize the semantic representation of the target text.

Referring to Fig. 5, which shows a schematic flowchart of semantic representation of the target text provided by this embodiment, the process includes the following steps:
S501: For each word pair, determine the word vector of each word in the word pair and the relation vector corresponding to the dependency relation between the two words of the word pair.

In the present embodiment, after the word pairs contained in the target text are obtained through step S201, the text encoding vector of the target text can be generated from these word pairs to realize the semantic representation of the target text; specifically, the text encoding vector can be generated using a pre-constructed semantic representation model.

The pre-constructed semantic representation model may be a neural-network-based encoding model, such as one based on a Convolutional Neural Network (CNN) or a Recurrent Neural Network (RNN).

Specifically, in the process of performing semantic representation on the target text, for each word pair, the word vector of each word in the word pair and the relation vector of the dependency relation between the two words can first be determined. For example, the word vector of the target word in the word pair may be denoted x, the word vector of the head word denoted y, and the dependency relation between the target word and the head word represented by a relation vector r. The word vectors x and y can be obtained by applying a word-vectorization method, or a model for generating word vectors, to the target word and the head word; for example, open-source tools such as Word2vec or GloVe (Global Vectors for Word Representation) may be used. The relation vector r can be obtained directly by random initialization. It should be understood that different dependency relations correspond to different relation vectors r, and the initialized value of r is subsequently updated by the semantic representation model.
S502: Encode the two word vectors and the relation vector corresponding to each word pair to obtain the text encoding vector of the target text, where the text encoding vector expresses the syntactic information and word-sequence information of the target text.

In the present embodiment, after the word vector x of the target word in each word pair, the word vector y of the head word, and the relation vector r representing the dependency relation between them are determined through step S501, the three can be concatenated into one vector group representing the corresponding word pair; for example, they can be concatenated into a triple vector group p = [x, r, y] representing one word pair of the target text.

For example, as shown in Fig. 6, consider the word pair formed by the target word "he" and the head word "asks", whose dependency relation is SBV. Suppose "he" and "asks" are each vectorized using the Word2vec method, yielding the word vector x1 for "he" and the word vector y1 for "asks", and the relation vector r1 representing the dependency relation between them is obtained by random initialization. The three vectors can then be concatenated into the triple vector group p1 = [x1, r1, y1], representing the word pair formed by "he" and "asks". Similarly, triple vector groups p2, p3, p4, p5, p6 can represent the word pairs ("asks", "ROOT"), ("Tom", "asks"), ("go", "fetch"), ("fetch", "asks") and ("coat", "fetch"), as shown in Fig. 6.

It should be understood that the target text I can be denoted I = [p1, p2 … pi … pN], where N is the number of target words in the target text I, and pi is the triple vector group corresponding to the i-th word pair of the target text. For example, as shown in Fig. 6, the target text "He asks Tom to fetch the coat" can be denoted I = [p1, p2, p3, p4, p5, p6]: this target text contains six target words "he, asks, Tom, go, fetch, coat", and p1 to p6 are the triple vector groups corresponding to the six word pairs these target words belong to.
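As a sketch under stated assumptions (a toy embedding size, and randomly generated stand-ins for the pretrained word vectors), the triple vector groups p = [x, r, y] and the sequence I = [p1, …, pN] can be built like this; in the embodiment the word vectors would come from Word2vec or GloVe, and the relation vectors would be updated during training:

```python
import random

random.seed(0)
DIM = 4  # toy embedding size, purely for illustration

def make_vec():
    return [random.uniform(-1, 1) for _ in range(DIM)]

# Stand-ins for pretrained word vectors (in practice from Word2vec / GloVe).
word_vec = {w: make_vec()
            for w in ["he", "asks", "Tom", "go", "fetch", "coat", "ROOT"]}
# One randomly initialised relation vector per dependency relation type,
# shared by all word pairs with that relation and updated during training.
rel_vec = {r: make_vec() for r in ["SBV", "HED", "DBL", "ADV", "VOB"]}

def triple(target, head, rel):
    """Concatenate [x, r, y] into one triple vector group p for a word pair."""
    return word_vec[target] + rel_vec[rel] + word_vec[head]

pairs = [("he", "asks", "SBV"), ("asks", "ROOT", "HED"), ("Tom", "asks", "DBL"),
         ("go", "fetch", "ADV"), ("fetch", "asks", "VOB"), ("coat", "fetch", "VOB")]
I = [triple(*p) for p in pairs]  # I = [p1, ..., p6] for the whole target text
```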
Further, the triple vector groups corresponding to the word pairs of the target text can be encoded to obtain the text encoding vector of the target text; this text encoding vector expresses the syntactic information and word-sequence information of the target text. Here, the syntactic information of the target text refers to the grammatical relations among the target words composing the target text, while the word-sequence information refers to the order of the target words in the target text.

Specifically, the triple vector groups of the word pairs in the target text can be fed into the pre-constructed semantic representation model for encoding, for example into a pre-constructed deep-neural-network encoding model (such as a CNN or an RNN), so that the hidden output coding vector h of the last layer of the deep neural network is obtained; h can then serve as the text encoding vector of the target text and be used to perform semantic representation on it. For example, as shown in Fig. 6, for the target text "He asks Tom to fetch the coat", the triple vector groups p1, p2, p3, p4, p5, p6 of its six word pairs can be input into the pre-constructed deep-neural-network encoding model for encoding, yielding the corresponding text encoding vector h.
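One minimal way to realize such an encoder is a vanilla recurrent network whose final hidden state serves as the text encoding vector h. Everything below (toy sizes, untrained random weights, random stand-ins for the triples) is an illustrative assumption, not the embodiment's trained CNN/RNN model:

```python
import math
import random

random.seed(1)
IN, H = 12, 8  # input size = one triple (3 x embedding dim), hidden size

def mat(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

W_xh, W_hh = mat(H, IN), mat(H, H)  # randomly initialised (untrained) weights

def rnn_encode(triples):
    """Run a vanilla RNN over the word-pair triples; the final hidden state
    plays the role of the text encoding vector h."""
    h = [0.0] * H
    for p in triples:
        h = [math.tanh(sum(W_xh[i][j] * p[j] for j in range(IN)) +
                       sum(W_hh[i][j] * h[j] for j in range(H)))
             for i in range(H)]
    return h

# Stand-ins for the six triple vector groups p1..p6.
triples = [[random.uniform(-1, 1) for _ in range(IN)] for _ in range(6)]
h = rnn_encode(triples)
```

Because every triple already carries its relation vector, the sequence order of the triples supplies the word-sequence information while the relation vectors supply the syntactic information, matching what step S502 requires of h.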
S503: Use the text encoding vector to express the semantic information of the target text.

In the present embodiment, after the text encoding vector h of the target text is obtained through step S502, h can be used to express the semantic information of the target text; as shown in Fig. 6, the text encoding vector h can express the semantic information of the target text "He asks Tom to fetch the coat". In the process of obtaining h, the semantic information of each target word, the syntactic information of the target text (the grammatical relations among the target words), and the word-sequence information (the order of the target words in the target text) are all encoded into the text encoding vector, so the resulting text encoding vector can be used to perform semantic representation on the target text.

It should be understood that if the target text is a discourse-level text, the sequence information of the sentences in the discourse can also be encoded into the text encoding vector h of the target text. Moreover, because the triple vector groups of all word pairs in the target text are input into the semantic representation model and encoded simultaneously, the encoding is parallelized; this avoids the problem that operating directly on the syntax tree cannot be parallelized, effectively saving encoding time and improving encoding efficiency.

In summary, the present embodiment determines the text encoding vector of the target text according to the semantic information of each target word in the dependency syntax tree of the target text and the dependency relations between each target word and the other target words, and then uses the text encoding vector to perform semantic representation on the target text. The semantic representation of the target text is thus realized on the basis of fully considering the semantic relations among the target words, which improves the accuracy of the semantic representation result.
Third Embodiment

The present embodiment introduces another specific implementation of step S104 of the first embodiment, namely "performing semantic representation on the target text according to the dependency relations among the target words".

In the present embodiment, the semantic representation of the target text can use not only the dependency relations among the target words of the target text but also the dependency paths of the target text, the two being combined to jointly perform semantic representation. Here, each dependency path is a sub-path of the dependency syntax tree of the target text, as shown in the right part of Fig. 4; the endpoint of each sub-path is a leaf node of the dependency syntax tree.

Referring to Fig. 7, which shows a schematic flowchart of semantic representation of the target text provided by this embodiment, the process includes the following steps:
S701: Determine the application scenario of the semantic representation result of the target text.

In the present embodiment, to perform semantic representation on the target text, it is first necessary to determine the application scenario of the semantic representation result of the target text. This application scenario may be any of various natural-language-processing scenarios, such as sentence sentiment classification, sentence similarity retrieval, or category classification.

S702: Determine the importance degree of each dependency path in the application scenario.

In the present embodiment, after the application scenario of the semantic representation result of the target text is determined through step S701, each dependency path of the dependency syntax tree of the target text can be analyzed according to that scenario, so as to determine the importance degree of each dependency path in the scenario. The importance degree of a dependency path can be characterized by the weight of that path in the application scenario: the higher the importance, the larger the weight value, and vice versa. Further, the normalized weight value of each dependency path can be used to characterize the importance of that path in the semantic representation result of the target text.

In one implementation of the present embodiment, step S702 may specifically include steps A-D:
Step A: For each word pair, determine the word vector of each word in the word pair and the relation vector corresponding to the dependency relation between the two words of the word pair.

Step B: Encode the two word vectors and the relation vector corresponding to each word pair to obtain the text encoding vector of the target text, where the text encoding vector expresses the syntactic information and word-sequence information of the target text.

It should be noted that steps A-B are identical to steps S501-S502 of the implementation of semantic representation in the second embodiment; for related details, refer to the introduction of steps S501-S502 above, which is not repeated here.
Step C: Encode each dependency path to obtain the path encoding vector corresponding to that path, where the path encoding vector expresses the path information formed by the target words on the corresponding dependency path.

In the present embodiment, each dependency path in the dependency syntax tree of the target text can be encoded to obtain its path encoding vector. The path encoding vector of each dependency path expresses the path information formed by the target words on that path, i.e., the parent-child relation information of the target words along the path.

Specifically, the word vector of each target word on every dependency path can first be obtained through step A, and the word vectors of all target words on a path can then jointly characterize that path. As shown in the right part of Fig. 4, for the target text "He asks Tom to fetch the coat", four dependency paths are generated: the first, "ROOT - asks - he"; the second, "ROOT - asks - Tom"; the third, "ROOT - asks - fetch - go"; and the fourth, "ROOT - asks - fetch - coat". For the first path, "ROOT - asks - he", the word vectors of "asks" and "he" can be obtained to characterize the path; similarly, the word-vector sets characterizing the other three dependency paths can be obtained.

Further, the word-vector set of each dependency path can be input into a pre-constructed encoding model, for example a pre-constructed deep-neural-network encoding model (such as a CNN or an RNN), so as to obtain the hidden output coding vector of the last layer of the network; this vector can then serve as the path encoding vector of the corresponding dependency path and be used in the semantic representation of the target text. As shown in Fig. 8, for the target text "He asks Tom to fetch the coat", the word-vector sets of its four dependency paths can be separately input into the pre-constructed deep-neural-network encoding model, yielding the path encoding vectors S1, S2, S3 and S4 of the four paths.
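For illustration only, a path can be encoded by mean-pooling the word vectors of the target words on it. This stand-in is an assumption, not the embodiment's CNN/RNN path encoder; it only shows how each path yields one fixed-size vector Si:

```python
# Toy 2-dimensional word vectors, chosen by hand purely for illustration.
word_vec = {
    "asks": [0.2, 0.4], "he": [0.6, 0.0], "Tom": [0.0, 0.8],
    "go": [0.4, 0.2], "fetch": [0.2, 0.6], "coat": [0.8, 0.4],
}

def encode_path(path):
    """Mean-pool the word vectors of the path's target words (ROOT excluded)."""
    vecs = [word_vec[w] for w in path if w != "ROOT"]
    n = len(vecs)
    return [sum(v[d] for v in vecs) / n for d in range(len(vecs[0]))]

paths = [["ROOT", "asks", "he"], ["ROOT", "asks", "Tom"],
         ["ROOT", "asks", "fetch", "go"], ["ROOT", "asks", "fetch", "coat"]]
S = [encode_path(p) for p in paths]  # stand-ins for S1..S4
```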
Step D: Use the text encoding vector of the target text and the path encoding vector of each dependency path to determine the path weight of each dependency path, where the path weight characterizes the importance degree of the corresponding dependency path in the given application scenario.

It should be noted that, although every dependency path of the target text plays a role in semantic representation, under different application scenarios of the semantic representation result only some of the dependency paths may play a dominant role. For example, when the semantic representation result is applied to the sentence sentiment classification scenario, a dependency path covering "I am very happy" in the target text is more important for semantic representation than one covering "hey, you".

Based on this, in the present embodiment, the path weight of each dependency path is computed using the text encoding vector of the target text and the path encoding vector of that path, where the path weight characterizes the importance degree of the path in the application scenario.
Specifically, the path weight of each dependency path is computed by the following formula:

vi = h · Si

where i indexes the i-th dependency path in the dependency syntax tree of the target text; vi is the path weight of the i-th dependency path; h is the text encoding vector of the target text; and Si is the path encoding vector of the i-th dependency path.

Further, after vi is normalized, the normalized path weight is computed as:

ai = exp(vi) / Σ(k=1..M) exp(vk)

where vk is the path weight of the k-th dependency path; M is the number of dependency paths in the dependency syntax tree; and ai is the normalized path weight obtained from vi. It should be noted that ai represents the weight of the current dependency path relative to the encodings of all the dependency paths: the larger ai is, the more important the current dependency path is, relative to the other dependency paths, in the current application scenario.
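Assuming the normalization is a softmax, which the symbol definitions suggest, the weights vi = h · Si and their normalized values ai can be computed as follows; the vectors are toy values for illustration:

```python
import math

def path_weights(h, S):
    """vi = h . Si for each path, then softmax-normalised weights ai."""
    v = [sum(hj * sj for hj, sj in zip(h, Si)) for Si in S]
    m = max(v)                          # subtract the max for numerical stability
    e = [math.exp(vi - m) for vi in v]
    z = sum(e)
    return [ei / z for ei in e]

# Toy text encoding vector and four toy path encoding vectors.
h = [0.5, -0.2, 0.1]
S = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.5, 0.5, 0.5]]
a = path_weights(h, S)  # one normalised weight per dependency path
```

Subtracting the maximum score before exponentiating leaves the softmax result unchanged while avoiding overflow for large dot products.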
S703: Perform semantic representation on the target text according to the dependency relations among the target words and the importance degree of each dependency path.

In the present embodiment, after the importance degree of each dependency path in the application scenario is determined through step S702, semantic representation can further be performed on the target text according to the dependency relations among the target words of the target text and the importance degree of each dependency path in the application scenario.

In one implementation of the present embodiment, S703 may specifically include steps E-F:

Step E: According to the path encoding vector and path weight of each dependency path, determine the path encoding vector corresponding to all dependency paths.
In this implementation, after the path encoding vector of each dependency path of the target text and the path weight of each dependency path are determined through step S702, the path encoding vectors of the dependency paths can be weighted and summed to determine the path encoding vector S corresponding to all dependency paths of the target text. The formula for computing S is as follows:

S = Σ(i=1..M) ai Si

where M is the number of dependency paths in the dependency syntax tree of the target text; i indexes the i-th dependency path in the dependency syntax tree; Si is the path encoding vector of the i-th dependency path; and ai is the normalized path weight.

For example, as shown in Fig. 8, for the target text "He asks Tom to fetch the coat", after the path encoding vectors S1, S2, S3 and S4 of its four dependency paths are determined, the normalized path weights of the four paths are computed separately using step D above, and the path encoding vectors of the four paths are then weighted and summed to obtain the path encoding vector S corresponding to all dependency paths.
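The weighted summation of step E can be sketched directly; the weights and path vectors below are toy values for illustration:

```python
def fuse_paths(a, S):
    """S = sum_i a_i * S_i: one vector summarising all dependency paths."""
    dim = len(S[0])
    return [sum(a[i] * S[i][d] for i in range(len(S))) for d in range(dim)]

a = [0.4, 0.3, 0.2, 0.1]                 # normalised path weights (toy values)
S = [[1, 0], [0, 1], [1, 1], [0, 0]]     # toy path encoding vectors S1..S4
fused = fuse_paths(a, S)                 # weighted sum over the four paths
```

Because the ai sum to one, the fused vector stays in the same range as the individual path encodings, so paths with larger weights dominate the summary without rescaling it.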
Step F: Use the text encoding vector and the path encoding vector corresponding to all dependency paths to express the semantic information of the target text.

In this implementation, after the path encoding vector S corresponding to all dependency paths of the target text is determined through step E, the semantic representation of the target text can be realized by combining S with the text encoding vector h of the target text.

Specifically, the text encoding vector h of the target text can be concatenated with the path encoding vector S corresponding to all its dependency paths to realize the semantic representation of the target text. Further, this semantic representation result can be used in natural-language-processing application scenarios such as sentence sentiment classification, category classification, and sentence similarity retrieval. Taking the sentence sentiment classification scenario as an example, the semantic representation result can serve as the input data of a Support Vector Machine (SVM) classifier or a Multi-Layer Perceptron (MLP) for training a sentence sentiment classification model.

In summary, the present embodiment combines the dependency relations among the target words of the target text with the dependency paths: it not only determines the text encoding vector of the target text according to the semantic information of each target word in the dependency syntax tree and the dependency relations between each target word and the other target words, but also encodes each dependency path to obtain its path encoding vector, and combines the two to jointly perform semantic representation on the target text, further improving the accuracy of the semantic representation result.
Fourth Embodiment

The present embodiment introduces a text semantic representation apparatus; for related content, refer to the method embodiments above.

Referring to Fig. 9, which is a schematic diagram of the composition of a text semantic representation apparatus provided by this embodiment, the apparatus 900 includes:

a target-text acquiring unit 901, configured to acquire a target text to be represented;

a target-word obtaining unit 902, configured to perform word segmentation on the target text to obtain target words;

a dependency-relation determining unit 903, configured to perform dependency syntactic analysis on the target text and determine the dependency relations among the target words;

a text-semantic representation unit 904, configured to perform semantic representation on the target text according to the dependency relations among the target words.
In one implementation of the present embodiment, the dependency-relation determining unit 903 includes:

a word-pair obtaining subunit, configured to determine the head word that has a dependency relation with the target word, and obtain the word pair formed by the target word and the head word, where the head word is either a root-node label or another target word different from the target word, the root-node label is the label of the root node of the dependency syntax tree, and the dependency syntax tree describes the dependency relations among the target words;

a dependency-relation determining subunit, configured to determine, for the word pair corresponding to each target word, the dependency relation between the two words of the word pair.
In one implementation of the present embodiment, the text-semantic representation unit 904 includes:

a first relation-vector determining subunit, configured to determine, for each word pair, the word vector of each word in the word pair and the relation vector corresponding to the dependency relation between the two words of the word pair;

a first encoding-vector obtaining subunit, configured to encode the two word vectors and the relation vector corresponding to each word pair to obtain the text encoding vector of the target text, where the text encoding vector expresses the syntactic information and word-sequence information of the target text;

a first semantic-information expressing subunit, configured to use the text encoding vector to express the semantic information of the target text.

In one implementation of the present embodiment, the text-semantic representation unit 904 is specifically configured to perform semantic representation on the target text according to the dependency relations among the target words and each dependency path, where each dependency path is a sub-path of the dependency syntax tree, the dependency syntax tree describes the dependency relations among the target words, and the endpoint of each sub-path is a leaf node of the dependency syntax tree.
In one implementation of the present embodiment, the text-semantic representation unit 904 includes:

an application-scenario determining subunit, configured to determine the application scenario of the semantic representation result of the target text;

an importance-degree determining subunit, configured to determine the importance degree of each dependency path in the application scenario;

a text-semantic expressing subunit, configured to perform semantic representation on the target text according to the dependency relations among the target words and the importance degree of each dependency path.
In one implementation of the present embodiment, the importance-degree determining subunit includes:

a second relation-vector determining subunit, configured to determine, for each word pair, the word vector of each word in the word pair and the relation vector corresponding to the dependency relation between the two words of the word pair;

a second encoding-vector obtaining subunit, configured to encode the two word vectors and the relation vector corresponding to each word pair to obtain the text encoding vector of the target text, where the text encoding vector expresses the syntactic information and word-sequence information of the target text;

a path-encoding-vector obtaining subunit, configured to encode each dependency path to obtain the path encoding vector corresponding to that path, the path encoding vector expressing the path information formed by the target words on the dependency path;

a path-weight determining subunit, configured to determine the path weight of the dependency path using the text encoding vector and the path encoding vector, where the path weight characterizes the importance degree of the dependency path in the application scenario.
In one implementation of the present embodiment, the text-semantic expressing subunit includes:

a path-encoding-vector determining subunit, configured to determine, according to the path encoding vector and path weight of each dependency path, the path encoding vector corresponding to all dependency paths;

a second semantic-information expressing subunit, configured to use the text encoding vector and the path encoding vector corresponding to all dependency paths to express the semantic information of the target text.
Further, an embodiment of the present application also provides a text semantic representation device, including a processor, a memory, and a system bus;

the processor and the memory are connected through the system bus;

the memory is configured to store one or more programs, the one or more programs including instructions that, when executed by the processor, cause the processor to execute any implementation of the text semantic representation method above.

Further, an embodiment of the present application also provides a computer-readable storage medium storing instructions that, when run on a terminal device, cause the terminal device to execute any implementation of the text semantic representation method above.

Further, an embodiment of the present application also provides a computer program product that, when run on a terminal device, causes the terminal device to execute any implementation of the text semantic representation method above.
As seen through the above description of the embodiments, those skilled in the art can be understood that above-mentioned implementation All or part of the steps in example method can be realized by means of software and necessary general hardware platform.Based on such Understand, substantially the part that contributes to existing technology can be in the form of software products in other words for the technical solution of the application It embodies, which can store in storage medium, such as ROM/RAM, magnetic disk, CD, including several Instruction is used so that a computer equipment (can be the network communications such as personal computer, server, or Media Gateway Equipment, etc.) execute method described in certain parts of each embodiment of the application or embodiment.
It should be noted that each embodiment in this specification is described in a progressive manner, each embodiment emphasis is said Bright is the difference from other embodiments, and the same or similar parts in each embodiment may refer to each other.For reality For applying device disclosed in example, since it is corresponded to the methods disclosed in the examples, so being described relatively simple, related place Referring to method part illustration.
It should also be noted that, herein, relational terms such as first and second and the like are used merely to one Entity or operation are distinguished with another entity or operation, without necessarily requiring or implying between these entities or operation There are any actual relationship or orders.Moreover, the terms "include", "comprise" or its any other variant are intended to contain Lid non-exclusive inclusion, so that the process, method, article or equipment including a series of elements is not only wanted including those Element, but also including other elements that are not explicitly listed, or further include for this process, method, article or equipment Intrinsic element.In the absence of more restrictions, the element limited by sentence "including a ...", it is not excluded that There is also other identical elements in process, method, article or equipment including the element.
The foregoing description of the disclosed embodiments enables those skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the application. Therefore, the present application is not intended to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (17)

1. A text semantic expression method, comprising:
obtaining a target text to be expressed;
performing word segmentation on the target text to obtain target words;
performing dependency syntactic analysis on the target text to determine dependency relations among the target words; and
performing semantic expression on the target text according to the dependency relations among the target words.
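The four steps of claim 1 can be illustrated with a minimal sketch. This is not the patented implementation: the whitespace tokenizer and the hard-coded parse result are hypothetical stand-ins for a real Chinese word segmenter and dependency parser (e.g. LTP or spaCy), and the "semantic expression" here is simply the list of dependency triples.

```python
def segment(text):
    # Toy word segmentation: whitespace split stands in for a real segmenter.
    return text.split()

def parse_dependencies(words):
    # Stubbed dependency analysis, hard-coded for this example.
    # Each triple is (dependent index, head index, relation); index 0 is ROOT.
    return [(1, 2, "nsubj"), (2, 0, "ROOT"), (3, 2, "dobj")]

def semantic_representation(words, deps):
    # Minimal "semantic expression": (dependent, governing word, relation)
    # triples over the segmented words.
    lookup = {i + 1: w for i, w in enumerate(words)}
    lookup[0] = "<ROOT>"
    return [(lookup[d], lookup[h], rel) for d, h, rel in deps]

words = segment("cats chase mice")
deps = parse_dependencies(words)
rep = semantic_representation(words, deps)
# rep is [("cats", "chase", "nsubj"), ("chase", "<ROOT>", "ROOT"),
#         ("mice", "chase", "dobj")]
```

The later claims refine the last step, so only the triple structure matters here.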
2. The method according to claim 1, wherein determining the dependency relations among the target words comprises:
for each target word, determining a governing word that has a dependency relation with the target word, and obtaining a word pair composed of the target word and the governing word, wherein the governing word is a root-node mark or another target word different from the target word, the root-node mark is the mark of the root node of a dependency syntax tree, and the dependency syntax tree describes the dependency relations among the target words; and
for the word pair corresponding to each target word, determining the dependency relation between the two words of the word pair.
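The word-pair construction of claim 2 can be sketched as follows; the head-index array format is an assumed encoding of the dependency syntax tree, with `0` meaning the word is governed by the root (so it pairs with the root-node mark):

```python
def word_pairs(words, heads, root_mark="<ROOT>"):
    # heads[i] is the 1-based index of the word governing words[i];
    # 0 means words[i] hangs directly off the root of the dependency tree.
    pairs = []
    for i, h in enumerate(heads):
        governing = root_mark if h == 0 else words[h - 1]
        pairs.append((words[i], governing))  # (target word, governing word)
    return pairs

pairs = word_pairs(["cats", "chase", "mice"], [2, 0, 2])
# pairs is [("cats", "chase"), ("chase", "<ROOT>"), ("mice", "chase")]
```

Each target word yields exactly one pair, matching the claim's "word pair corresponding to each target word".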
3. The method according to claim 2, wherein performing semantic expression on the target text according to the dependency relations among the target words comprises:
for each word pair, determining the term vector corresponding to each word of the word pair and the relation vector corresponding to the dependency relation between the two words of the word pair;
encoding the two term vectors and the relation vector corresponding to each word pair to obtain a text encoding vector of the target text, wherein the text encoding vector expresses syntactic information and word-order information of the target text; and
expressing the semantic information of the target text by using the text encoding vector.
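Claim 3's encoding step can be sketched as below. The embedding tables are tiny toy values, and mean pooling over the concatenated `[term vector ; term vector ; relation vector]` triples is an assumed stand-in for the encoder, which the claim leaves unspecified (a trained network would normally be used):

```python
# Hypothetical toy embedding tables (dimensions kept tiny for illustration).
EMB = {"cats": [1.0, 0.0], "chase": [0.0, 1.0], "mice": [1.0, 1.0], "<ROOT>": [0.0, 0.0]}
REL = {"nsubj": [0.5], "ROOT": [0.0], "dobj": [1.0]}

def encode_text(pairs, relations):
    # Concatenate [dependent vector ; governing vector ; relation vector]
    # per word pair, then mean-pool over pairs into one text encoding vector.
    per_pair = [EMB[d] + EMB[g] + REL[r] for (d, g), r in zip(pairs, relations)]
    dim, n = len(per_pair[0]), len(per_pair)
    return [sum(v[i] for v in per_pair) / n for i in range(dim)]

text_vec = encode_text(
    [("cats", "chase"), ("chase", "<ROOT>"), ("mice", "chase")],
    ["nsubj", "ROOT", "dobj"],
)
```

Because the pairs are processed in sentence order, the pooled vector carries both the dependency (syntactic) information and the word-order information that the claim attributes to the text encoding vector.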
4. The method according to claim 1, wherein performing semantic expression on the target text according to the dependency relations among the target words comprises:
performing semantic expression on the target text according to the dependency relations among the target words and each dependency path, wherein each dependency path is a sub-path in a dependency syntax tree, the dependency syntax tree describes the dependency relations among the target words, and the end point of each sub-path is a leaf node of the dependency syntax tree.
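The dependency paths of claim 4 — sub-paths of the dependency syntax tree ending at leaf nodes — can be enumerated with a simple tree walk; the head-index array format is the same assumed encoding as above:

```python
def dependency_paths(heads):
    # heads[i] is the 1-based head index of word i+1; 0 denotes the root.
    children = {}
    for i, h in enumerate(heads, start=1):
        children.setdefault(h, []).append(i)

    paths = []

    def walk(node, path):
        kids = children.get(node, [])
        if not kids:
            paths.append(path)  # reached a leaf: one complete dependency path
        for k in kids:
            walk(k, path + [k])

    for root_child in children.get(0, []):
        walk(root_child, [root_child])
    return paths

paths = dependency_paths([2, 0, 2])
# paths is [[2, 1], [2, 3]]: both run from the root's child to a leaf.
```

Every path terminates at a leaf node of the tree, matching the claim's definition of a sub-path.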
5. The method according to claim 4, wherein performing semantic expression on the target text according to the dependency relations among the target words and each dependency path comprises:
determining an application scenario of the semantic expression result of the target text;
determining the importance of each dependency path in the application scenario; and
performing semantic expression on the target text according to the dependency relations among the target words and the importance of each dependency path.
6. The method according to claim 5, wherein determining the importance of each dependency path in the application scenario comprises:
for each word pair, determining the term vector corresponding to each word of the word pair and the relation vector corresponding to the dependency relation between the two words of the word pair;
encoding the two term vectors and the relation vector corresponding to each word pair to obtain a text encoding vector of the target text, wherein the text encoding vector expresses syntactic information and word-order information of the target text;
encoding each dependency path to obtain a path encoding vector corresponding to the dependency path, wherein the path encoding vector expresses path information formed by the target words in the dependency path; and
determining a path weight of each dependency path by using the text encoding vector and the path encoding vector, wherein the path weight characterizes the importance of the dependency path in the application scenario.
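The path-weight step of claim 6 can be sketched with dot-product attention: score each path encoding vector against the text encoding vector, then normalise with softmax. The claim only states that the weight is determined from the two vectors, so dot-product scoring is an assumed concrete choice, not the patented one:

```python
import math

def path_weights(text_vec, path_vecs):
    # Attention-style scoring: dot product between the text encoding vector
    # and each path encoding vector, normalised into weights via softmax.
    scores = [sum(t * p for t, p in zip(text_vec, pv)) for pv in path_vecs]
    m = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

weights = path_weights([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
# The first path aligns with the text vector, so it receives the larger weight.
```

The resulting weights sum to one, so they can directly serve as the per-scenario importance scores of claim 5.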
7. The method according to claim 6, wherein performing semantic expression on the target text according to the dependency relations among the target words and the importance of each dependency path comprises:
determining a path encoding vector corresponding to all the dependency paths according to the path encoding vector and the path weight corresponding to each dependency path; and
expressing the semantic information of the target text by using the text encoding vector and the path encoding vector corresponding to all the dependency paths.
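Claim 7's combination step can be sketched as a weighted sum of the path encoding vectors, concatenated with the text encoding vector; the weighted sum and the concatenation are assumed choices where the claim says only "according to" and "by using":

```python
def combine(text_vec, path_vecs, weights):
    # Weighted sum of the path encoding vectors gives the single vector
    # "corresponding to all dependency paths"; concatenating it with the
    # text encoding vector yields the final semantic representation.
    dim = len(path_vecs[0])
    pooled = [sum(w * pv[i] for w, pv in zip(weights, path_vecs))
              for i in range(dim)]
    return text_vec + pooled  # list concatenation = vector concatenation here

semantic = combine([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]], [0.75, 0.25])
# semantic is [0.5, 0.5, 0.75, 0.25]
```

Downstream tasks (classification, matching, dialogue) would consume this combined vector as the text's semantic expression.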
8. A text semantic expression apparatus, comprising:
a target text acquiring unit, configured to obtain a target text to be expressed;
a target word obtaining unit, configured to perform word segmentation on the target text to obtain target words;
a dependency relation determining unit, configured to perform dependency syntactic analysis on the target text to determine dependency relations among the target words; and
a text semantic expression unit, configured to perform semantic expression on the target text according to the dependency relations among the target words.
9. The apparatus according to claim 8, wherein the dependency relation determining unit comprises:
a word pair obtaining subunit, configured to determine a governing word that has a dependency relation with the target word and obtain a word pair composed of the target word and the governing word, wherein the governing word is a root-node mark or another target word different from the target word, the root-node mark is the mark of the root node of a dependency syntax tree, and the dependency syntax tree describes the dependency relations among the target words; and
a dependency relation determining subunit, configured to determine, for the word pair corresponding to each target word, the dependency relation between the two words of the word pair.
10. The apparatus according to claim 9, wherein the text semantic expression unit comprises:
a first relation vector determining subunit, configured to determine, for each word pair, the term vector corresponding to each word of the word pair and the relation vector corresponding to the dependency relation between the two words of the word pair;
a first encoding vector obtaining subunit, configured to encode the two term vectors and the relation vector corresponding to each word pair to obtain a text encoding vector of the target text, wherein the text encoding vector expresses syntactic information and word-order information of the target text; and
a first semantic information expressing subunit, configured to express the semantic information of the target text by using the text encoding vector.
11. The apparatus according to claim 8, wherein the text semantic expression unit is specifically configured to perform semantic expression on the target text according to the dependency relations among the target words and each dependency path, wherein each dependency path is a sub-path in a dependency syntax tree, the dependency syntax tree describes the dependency relations among the target words, and the end point of each sub-path is a leaf node of the dependency syntax tree.
12. The apparatus according to claim 11, wherein the text semantic expression unit comprises:
an application scenario determining subunit, configured to determine an application scenario of the semantic expression result of the target text;
an importance determining subunit, configured to determine the importance of each dependency path in the application scenario; and
a text semantic expressing subunit, configured to perform semantic expression on the target text according to the dependency relations among the target words and the importance of each dependency path.
13. The apparatus according to claim 12, wherein the importance determining subunit comprises:
a second relation vector determining subunit, configured to determine, for each word pair, the term vector corresponding to each word of the word pair and the relation vector corresponding to the dependency relation between the two words of the word pair;
a second encoding vector obtaining subunit, configured to encode the two term vectors and the relation vector corresponding to each word pair to obtain a text encoding vector of the target text, wherein the text encoding vector expresses syntactic information and word-order information of the target text;
a path encoding vector obtaining subunit, configured to encode each dependency path to obtain a path encoding vector corresponding to the dependency path, wherein the path encoding vector expresses path information formed by the target words in the dependency path; and
a path weight determining subunit, configured to determine a path weight of the dependency path by using the text encoding vector and the path encoding vector, wherein the path weight characterizes the importance of the dependency path in the application scenario.
14. The apparatus according to claim 13, wherein the text semantic expressing subunit comprises:
a path encoding vector determining subunit, configured to determine a path encoding vector corresponding to all the dependency paths according to the path encoding vector and the path weight corresponding to each dependency path; and
a second semantic information expressing subunit, configured to express the semantic information of the target text by using the text encoding vector and the path encoding vector corresponding to all the dependency paths.
15. A text semantic expression apparatus, comprising: a processor, a memory, and a system bus, wherein
the processor and the memory are connected through the system bus; and
the memory is configured to store one or more programs, the one or more programs comprising instructions which, when executed by the processor, cause the processor to perform the method according to any one of claims 1 to 7.
16. A computer-readable storage medium, wherein instructions are stored in the computer-readable storage medium, and the instructions, when run on a terminal device, cause the terminal device to perform the method according to any one of claims 1 to 7.
17. A computer program product, wherein the computer program product, when run on a terminal device, causes the terminal device to perform the method according to any one of claims 1 to 7.
CN201810942947.2A 2018-08-17 2018-08-17 Text semantic expression method and device Active CN109062902B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810942947.2A CN109062902B (en) 2018-08-17 2018-08-17 Text semantic expression method and device


Publications (2)

Publication Number Publication Date
CN109062902A true CN109062902A (en) 2018-12-21
CN109062902B CN109062902B (en) 2022-12-06

Family

ID=64687412

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810942947.2A Active CN109062902B (en) 2018-08-17 2018-08-17 Text semantic expression method and device

Country Status (1)

Country Link
CN (1) CN109062902B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109992777A (en) * 2019-03-26 2019-07-09 浙江大学 A kind of crucial semantic information extracting method of Chinese medicine state of an illness text based on keyword
CN111062200A (en) * 2019-12-12 2020-04-24 北京声智科技有限公司 Phonetics generalization method, phonetics identification method, device and electronic equipment
CN111666738A (en) * 2020-06-09 2020-09-15 南京师范大学 Formalized coding method for motion description natural text
CN112115700A (en) * 2020-08-19 2020-12-22 北京交通大学 Dependency syntax tree and deep learning based aspect level emotion analysis method
CN112883741A (en) * 2021-04-29 2021-06-01 华南师范大学 Specific target emotion classification method based on dual-channel graph neural network
WO2021213155A1 (en) * 2020-11-25 2021-10-28 平安科技(深圳)有限公司 Method, apparatus, medium, and electronic device for adding punctuation to text
CN113593557A (en) * 2021-07-27 2021-11-02 中国平安人寿保险股份有限公司 Distributed session method, device, computer equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080005094A1 (en) * 2006-07-01 2008-01-03 Kevin Cunnane Method and system for finding the focus of a document
CN106156041A (en) * 2015-03-26 2016-11-23 科大讯飞股份有限公司 Hot information finds method and system
CN106155999A (en) * 2015-04-09 2016-11-23 科大讯飞股份有限公司 Semantics comprehension on natural language method and system
CN106844327A (en) * 2015-12-07 2017-06-13 科大讯飞股份有限公司 Text code method and system
CN107145512A (en) * 2017-03-31 2017-09-08 北京大学 The method and apparatus of data query


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ROTH M. et al.: "Neural semantic role labeling with dependency path embeddings", arXiv *
Lü Yuanyuan et al.: "A short medical-record text classification method using entity and dependency syntactic structure features", Chinese Journal of Medical Instrumentation *
Cao Lili et al.: "Research on multi-feature question similarity computation incorporating word vectors", Modern Computer (Professional Edition) *


Also Published As

Publication number Publication date
CN109062902B (en) 2022-12-06

Similar Documents

Publication Publication Date Title
CN111488734B (en) Emotional feature representation learning system and method based on global interaction and syntactic dependency
CN109062902A (en) A kind of text semantic expression and device
CN110826336B (en) Emotion classification method, system, storage medium and equipment
CN109933664B (en) Fine-grained emotion analysis improvement method based on emotion word embedding
CN107133224B (en) Language generation method based on subject word
CN107229610B (en) A kind of analysis method and device of affection data
CN111274398B (en) Method and system for analyzing comment emotion of aspect-level user product
CN108519890A (en) A kind of robustness code abstraction generating method based on from attention mechanism
CN111931506B (en) Entity relationship extraction method based on graph information enhancement
CN108780464A (en) Method and system for handling input inquiry
CN111797898B (en) Online comment automatic reply method based on deep semantic matching
CN110196928B (en) Fully parallelized end-to-end multi-turn dialogue system with domain expansibility and method
CN115329127A (en) Multi-mode short video tag recommendation method integrating emotional information
CN111144097B (en) Modeling method and device for emotion tendency classification model of dialogue text
Li et al. Learning document embeddings by predicting n-grams for sentiment classification of long movie reviews
CN114756681B (en) Evaluation and education text fine granularity suggestion mining method based on multi-attention fusion
CN115495568B (en) Training method and device for dialogue model, dialogue response method and device
CN111400584A (en) Association word recommendation method and device, computer equipment and storage medium
JP2006190229A (en) Opinion extraction learning device and opinion extraction classifying device
Lee et al. Off-Topic Spoken Response Detection Using Siamese Convolutional Neural Networks.
CN114528398A (en) Emotion prediction method and system based on interactive double-graph convolutional network
Celikyilmaz et al. A New Pre-Training Method for Training Deep Learning Models with Application to Spoken Language Understanding.
CN111382333B (en) Case element extraction method in news text sentence based on case correlation joint learning and graph convolution
CN116932938A (en) Link prediction method and system based on topological structure and attribute information
JP2007241881A (en) Method, device and program for creating opinion property determination database, and method, device and program for determining opinion property, and computer readable recording medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant