CN106156110A - Text semantic understanding method and system

Publication number
CN106156110A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510159102.2A
Other languages
Chinese (zh)
Other versions
CN106156110B (en)
Inventor
吴维昊
杨溥
潘青华
王影
胡国平
胡郁
刘庆峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
iFlytek Co Ltd
Original Assignee
iFlytek Co Ltd
Current legal status
Active

Abstract

The invention discloses a text semantic understanding method and system. The method includes: building in advance a directed-graph grammar network based on a main-network/sub-network pattern, where the directed-graph grammar network includes one main network and one or more sub-networks, and each path segment (arc) of the directed-graph grammar network corresponds to one text character or one sub-network identifier; obtaining the text to be parsed; decoding the text based on the directed-graph grammar network to obtain a decoding path; and obtaining the semantics associated with the decoding path as the semantic understanding result. The invention can effectively reduce the complexity of the directed-graph grammar network, improve decoding efficiency, and reduce memory consumption.

Description

Text semantic understanding method and system
Technical field
The present invention relates to the field of natural language processing, and in particular to a text semantic understanding method and system.
Background
Natural language understanding is one of the most important directions in artificial intelligence and has long been a research focus in the related fields. In recent years in particular, with the rapid development of mobile Internet technology, the degree of informatization has risen steadily, the amount of information on networks has grown exponentially, and humanity has entered the era of big data. People increasingly want machines to understand natural language, so that valuable information can be analyzed and obtained efficiently from massive data.
Traditional semantic understanding systems mainly use a grammar to define several sets of input sentences; if the input text falls within one of these sets, understanding succeeds. In recent years, to meet the demand for deeper semantic mining of text, researchers have proposed text semantic understanding schemes based on grammar rules. Such a scheme first specifies the sentence grammar rules for each concrete application environment, so as to describe the natural-language input allowed under each application. The grammar rules are then compiled into a directed-graph grammar network that a computer can process. Finally, the received natural-language input is matched against the directed-graph grammar network, and the associated semantics are extracted from the best matching path, achieving deep semantic understanding of the input sentence or phrase.
However, for massive data, a traditional grammar-rule-based semantic understanding system needs to define thousands of grammars, and the directed-graph grammar network built from those rules is extremely large and complex. In addition, decoding of the directed-graph grammar network in traditional systems is a breadth-first search, so matching the user's text against the grammar network is computationally expensive and slow. This greatly reduces the efficiency of semantic understanding as a whole and consumes substantial hardware resources during decoding.
Summary of the invention
Embodiments of the present invention provide a text semantic understanding method and system, to solve the prior-art problems of low decoding efficiency and high hardware resource consumption during decoding.
To this end, embodiments of the present invention provide the following technical solutions:
A text semantic understanding method, including:
building in advance a directed-graph grammar network based on a main-network/sub-network pattern, where the directed-graph grammar network includes one main network and one or more sub-networks, and each path segment (arc) of the directed-graph grammar network corresponds to one text character or one sub-network identifier;
obtaining the text to be parsed;
decoding the text based on the directed-graph grammar network to obtain a decoding path; and
obtaining the semantics associated with the decoding path as the semantic understanding result.
Preferably, building the directed-graph grammar network based on the main-network/sub-network pattern includes:
establishing sentence grammar rules according to the grammatical characteristics of the natural-language input under each application;
determining the text types that correspond to the main network and to the sub-networks, respectively; and
compiling the sentence grammar rules, according to the text types corresponding to the main network and the sub-networks, to generate a main-network directed-graph grammar network that carries sub-network identifiers and sub-network directed-graph grammar networks.
Preferably, decoding the text based on the directed-graph grammar network to obtain the decoding path includes:
performing word-string matching on the text to be parsed, starting from the first node of the main network;
if a sub-network identifier appears on the matching path of the main network, recording the main-network matching information, calling the sub-network that corresponds to the sub-network identifier to perform word-string matching, and obtaining and recording the sub-network matching information; and
after the whole text to be parsed has been matched, obtaining the decoding path from the recorded main-network matching information and sub-network matching information.
Preferably, decoding the text based on the directed-graph grammar network to obtain the decoding path further includes:
when calling the sub-network that corresponds to the sub-network identifier to perform word-string matching, determining whether this is the first call to the sub-network;
if so, performing word-string matching with the sub-network and saving the resulting sub-network matching information in a sub-network matching result manager;
otherwise, obtaining the historical matching result from the sub-network matching result manager as the sub-network matching information.
Preferably, the sub-network matching information includes: the sub-network matching path, a sub-network search flag, and the number of characters of the word string matched so far; the main-network matching information includes: the main-network matching path, the sub-network identifier of the called sub-network, and the number of characters of the word string matched so far.
Determining whether this is the first call to the sub-network includes:
if the sub-network search flag indicates "not searched", determining that this is a first call to the sub-network;
if the sub-network search flag indicates "searched", and the number of matched characters in the main-network matching information equals the number of matched characters in the sub-network matching information, determining that this is a non-first call to the sub-network.
Preferably, performing word-string matching with the sub-network includes:
when performing word-string matching with the sub-network, using a fault-tolerance mechanism, where the fault-tolerance mechanism includes one or more of the following word-string matching modes: self-jump, consecutive jump, and wrong-character tolerance.
Preferably, the sub-network has one or more layers.
A text semantic understanding system, including:
a network construction module, configured to build in advance a directed-graph grammar network based on a main-network/sub-network pattern, where the directed-graph grammar network includes one main network and one or more sub-networks, and each path segment (arc) of the directed-graph grammar network corresponds to one text character or one sub-network identifier;
a receiving module, configured to obtain the text to be parsed;
a decoding module, configured to decode the text based on the directed-graph grammar network to obtain a decoding path; and
a result acquisition module, configured to obtain the semantics associated with the decoding path as the semantic understanding result.
Preferably, the network construction module includes:
a rule setting unit, configured to establish sentence grammar rules according to the grammatical characteristics of the natural-language input under each application;
a text division unit, configured to determine the text types that correspond to the main network and to the sub-networks, respectively; and
a compilation unit, configured to compile the sentence grammar rules, according to the text types corresponding to the main network and the sub-networks, to generate a main-network directed-graph grammar network that carries sub-network identifiers and sub-network directed-graph grammar networks.
Preferably, the decoding module includes:
a matching unit, configured to perform word-string matching on the text to be parsed starting from the first node of the main network and, when a sub-network identifier appears on the matching path of the main network, to record the main-network matching information, call the sub-network that corresponds to the sub-network identifier to perform word-string matching, and obtain and record the sub-network matching information; and
a decoding path acquisition unit, configured to obtain the decoding path from the main-network matching information and sub-network matching information obtained by the matching unit, after the matching unit has matched the whole text to be parsed.
Preferably, the decoding module further includes:
a judging unit, configured to determine, when the matching unit calls the sub-network that corresponds to the sub-network identifier to perform word-string matching, whether this is the first call to the sub-network, and to feed the result back to the matching unit.
When the judging unit determines that this is a first call, the matching unit performs word-string matching with the sub-network and saves the resulting sub-network matching information in a sub-network matching result manager; when the judging unit determines that this is a non-first call, the matching unit obtains the historical matching result from the sub-network matching result manager as the sub-network matching information.
Preferably, the sub-network matching information includes: the sub-network matching path, a sub-network search flag, and the number of characters of the word string matched so far; the main-network matching information includes: the main-network matching path, the sub-network identifier of the called sub-network, and the number of characters of the word string matched so far.
The judging unit is specifically configured to determine that this is a first call when the sub-network search flag indicates "not searched", and to determine that this is a non-first call when the sub-network search flag indicates "searched" and the number of matched characters in the main-network matching information equals the number of matched characters in the sub-network matching information.
Preferably, when the matching unit performs word-string matching with the sub-network, it uses a fault-tolerance mechanism, where the fault-tolerance mechanism includes one or more of the following word-string matching modes: self-jump, consecutive jump, and wrong-character tolerance.
Preferably, the sub-network has one or more layers.
Unlike the traditional large and complex directed-graph grammar network built from grammar rules, the text semantic understanding method of the embodiments of the present invention divides the directed-graph grammar network into a main network and sub-networks, which effectively reduces the complexity of the directed-graph grammar network and improves decoding efficiency. Moreover, when the text to be parsed entered by the user is decoded, a depth-first search is used to match the text against the grammar network, which reduces memory consumption.
Further, a saving mechanism is set up for the sub-networks: during the decoding of one user input, the matching information of the first call to a sub-network is saved, and when the same sub-network is called again later in decoding, the saved matching result is used directly. This reduces the number of sub-network matching runs and further improves decoding efficiency.
Further, the fault-tolerance mechanism improves the fault tolerance of the system.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present application or of the prior art more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below are merely some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them.
Fig. 1 is a flow chart of the text semantic understanding method of an embodiment of the present invention;
Fig. 2 is example one of a directed-graph grammar network based on the main-network/sub-network pattern in an embodiment of the present invention;
Fig. 3 is a flow chart of decoding text with the directed-graph grammar network based on the main-network/sub-network pattern in an embodiment of the present invention;
Fig. 4 is example two of a directed-graph grammar network based on the main-network/sub-network pattern in an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of the text semantic understanding system of an embodiment of the present invention.
Detailed description of the embodiments
To help those skilled in the art better understand the solutions of the embodiments of the present invention, the embodiments are described in further detail below with reference to the drawings.
Fig. 1 is a flow chart of the text semantic understanding method of an embodiment of the present invention, which includes the following steps:
Step 101: build in advance a directed-graph grammar network based on the main-network/sub-network pattern.
Unlike the traditional large and complex directed-graph grammar network built from grammar rules, the embodiments of the present invention divide the directed-graph grammar network into a main network and sub-networks. That is, the directed-graph grammar network includes one main network and one or more sub-networks, and each path segment (arc) of the main network corresponds to one text character or one sub-network identifier. Depending on the needs of the application, sub-networks can also be nested, so one or more layers of sub-networks can be set up. If there is only one layer of sub-networks, each path segment of a sub-network corresponds to one text character. If there are multiple layers, each path segment of a bottom-layer sub-network corresponds to one text character, while each path segment of a sub-network in any other layer corresponds to one text character or one sub-network identifier.
The process of building the directed-graph grammar network based on the main-network/sub-network pattern is as follows:
First, sentence grammar rules are established according to the grammatical characteristics of the natural-language input under each application. The sentence grammar rules can be set by the user according to actual application requirements, or determined by the system in advance from common application requirements using the system's preset grammar rules, so that they describe the possible natural-language input under each application.
Then, the text types corresponding to the main network and to the sub-networks are determined, which divides the grammar into a main network and sub-networks. Specifically, the sentence grammar rules are first analyzed, and then the text types used to build the main network and the sub-networks are determined. The text type assigned to sub-networks is mainly word strings in which user input is prone to errors or confusion, usually nouns with a fairly clear context, such as singer names, song titles, and TV series titles. The text type assigned to the main network is usually word strings with a relatively fixed pattern in which user input is unlikely to contain errors.
Once the texts corresponding to the main network and to the sub-networks have been determined, compilation generates the main-network directed-graph grammar network, which carries sub-network identifiers, and the sub-network directed-graph grammar networks.
For example, compiling the following sentence grammar rules yields the directed-graph grammar network shown in Fig. 2:
$sub = Wang Fei;
$main = I want to listen to the song of $sub;
Here, the text corresponding to the main network is "I want to listen to the song of xxx", whose pattern is relatively fixed; the text corresponding to the sub-network is "Wang Fei", a noun with a fairly clear context; and sub is the sub-network identifier. Each path segment of the directed-graph grammar network corresponds to one text character or one sub-network identifier.
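The compilation step can be illustrated with a minimal Python sketch. It is not the compiler described in the patent: the function name `compile_rules`, the arc tuple layout, and the restriction to linear rules without alternatives are assumptions made for illustration, and the Chinese rule text is an assumed back-translation of the example rules.

```python
import re

def compile_rules(source):
    """Compile '$name = body;' rules into one arc list per network.

    Each arc is (from_node, to_node, kind, label); kind is 'char' for one
    text character or 'sub' for a sub-network identifier (hypothetical layout).
    """
    networks = {}
    for rule in source.split(";"):
        rule = rule.strip()
        if not rule:
            continue
        name, body = (part.strip() for part in rule.split("=", 1))
        # Tokens are either '$identifier' references or single characters.
        tokens = re.findall(r"\$[A-Za-z_][A-Za-z0-9_]*|\S", body)
        arcs = []
        for node, token in enumerate(tokens):
            if len(token) > 1:  # only '$identifier' tokens exceed one character
                arcs.append((node, node + 1, "sub", token[1:]))
            else:
                arcs.append((node, node + 1, "char", token))
        networks[name.lstrip("$")] = arcs
    return networks

nets = compile_rules("$sub = 王菲; $main = 我想听$sub的歌;")
for arc in nets["main"]:
    print(arc)
```

In this sketch the main network gets six arcs, five character arcs and one arc labelled with the sub-network identifier `sub`, while the sub-network gets one arc per character of the name.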
Step 102: obtain the text to be parsed.
Step 103: decode the text based on the directed-graph grammar network to obtain the decoding path.
First, word-string matching is performed on the text to be parsed, starting from the first node of the main network. If a sub-network identifier appears on the matching path of the main network, the main-network matching information is recorded, and the sub-network that corresponds to the identifier is called to perform word-string matching, with the sub-network matching information obtained and recorded. After the whole text to be parsed has been matched, the decoding path is obtained from the recorded main-network and sub-network matching information.
The concrete decoding process is described in detail later.
Step 104: obtain the semantics associated with the decoding path as the semantic understanding result.
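Steps 101 to 104 can be sketched end to end as follows. This is a simplified illustration under stated assumptions, not the patented decoder: the networks are hard-coded as linear lists, `MAIN`, `SUBNETS`, `decode`, and `understand` are hypothetical names, the sub-network entries are invented sample data, and the "semantics" are simply the strings matched inside sub-networks.

```python
# Step 101 (assumed, pre-built): main network as a list of arcs; a tuple marks
# a sub-network identifier arc, a plain string marks a one-character arc.
MAIN = ["我", "想", "听", ("sub", "singer"), "的", "歌"]
SUBNETS = {"singer": ["王菲", "刘德华"]}  # sample entries, assumed

def decode(text, pos=0, arc=0, path=()):
    """Step 103: depth-first match of text against MAIN; returns a path or None."""
    if arc == len(MAIN):
        return list(path) if pos == len(text) else None
    label = MAIN[arc]
    if isinstance(label, tuple):                 # sub-network identifier arc
        for entry in SUBNETS[label[1]]:          # call the sub-network
            if text.startswith(entry, pos):
                found = decode(text, pos + len(entry), arc + 1,
                               path + ((label[1], entry),))
                if found is not None:
                    return found
        return None
    if text.startswith(label, pos):              # ordinary character arc
        return decode(text, pos + 1, arc + 1, path + (("char", label),))
    return None

def understand(text):
    """Steps 102 to 104: decode the text and read the semantics off the path."""
    path = decode(text)
    if path is None:
        return None
    return {kind: value for kind, value in path if kind != "char"}

print(understand("我想听王菲的歌"))
```

The depth-first recursion keeps only the current path in memory, which is the property the method relies on to reduce memory consumption compared with a breadth-first search.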
Fig. 3 is a flow chart of decoding text with the directed-graph grammar network based on the main-network/sub-network pattern in an embodiment of the present invention, which includes the following steps:
Step 301: match the word string against the main network.
For the text to be parsed entered by the user, word-string matching starts from the first node of the main network.
Step 302: determine whether a sub-network needs to be called; if so, perform step 303, otherwise perform step 304.
If a sub-network identifier appears on the path during main-network word-string matching, a sub-network must be called; otherwise no sub-network call is needed.
Step 303: call the sub-network to perform word-string matching.
As mentioned above, when a sub-network is called, the main-network matching information must be recorded, and the sub-network matching information is obtained and recorded while matching against the sub-network, so that after the whole text to be parsed has been matched, the decoding path can be obtained from the recorded main-network and sub-network matching information.
In practice, to simplify processing, a "call state manager" and a "sub-network matching result manager" can be set up to store the main-network matching information and the sub-network matching information, respectively. Note that each sub-network has its own "sub-network matching result manager". The "call state manager" can be created when the grammar network is built. A "sub-network matching result manager" can be created when the corresponding sub-network is built, or when the sub-network is called during decoding; the embodiments of the present invention do not limit this. Note also that the information stored in the "call state manager" and the "sub-network matching result managers" during matching must be cleared once decoding of one user input is complete, so that it does not affect the next decoding; alternatively, it can all be cleared by initialization before the next decoding starts. The embodiments of the present invention do not limit this either.
The main-network matching information includes: the main-network matching path and the sub-network identifier of the called sub-network. The sub-network matching information includes: the sub-network matching path.
To further improve decoding efficiency, the main-network matching information may also include the number of characters of the word string matched so far, and the sub-network matching information may also include a sub-network search flag and the number of characters of the word string matched so far. In this way, when the same sub-network is called again later in decoding, the matching result already saved for that sub-network can be used directly. Note that the sub-network search flag can be created when the sub-network is built and stored independently, or it can be moved into the "sub-network matching result manager" of that sub-network once the manager has been created.
The process of calling a sub-network is described in detail below.
When a sub-network is called, the main-network matching path, the sub-network identifier of the called sub-network, and the number of characters matched so far are first stored in the "call state manager". Next, it is determined whether this is the first call to the sub-network; if so, sub-network word-string matching is performed and the matching result is saved; otherwise the saved historical matching result is used.
Whether the call is a first call can be determined from the sub-network search flag and the number of matched characters described above. For example, if the sub-network search flag is 0, the call is judged to be a first call. If the flag is 1, it is further checked whether the number of characters matched before calling the current sub-network, as stored in the "call state manager", equals the number of characters matched before calling this sub-network, as stored in the "sub-network matching result manager"; if the two are equal, the call is judged to be a non-first call.
On a first call, after word-string matching is complete, the sub-network matching path, the sub-network search flag, and the number of characters matched before calling this sub-network are saved in the "sub-network matching result manager". The sub-network search flag indicates whether this sub-network has already been searched; its value can be 0 or 1, with 0 meaning "not searched" and 1 meaning "searched", or the other way round.
On a non-first call, the sub-network word-string matching path information stored in the "sub-network matching result manager" is used directly.
Step 304: continue word-string matching until the end, obtaining the matching path.
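The first-call judgment in step 303 can be sketched as a small per-sub-network store. The class and function names below (`SubnetResultManager`, `call_subnet`) are hypothetical, and one behaviour is an assumption the text does not spell out: when the flag is set but the character counts differ, this sketch searches again and overwrites the saved result.

```python
class SubnetResultManager:
    """Per-sub-network store: search flag, matching path, characters matched before the call."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Cleared after (or before) each decoding of one user input.
        self.searched = 0              # 0 = not searched, 1 = searched
        self.path = None
        self.chars_before_call = None

    def is_first_call(self, chars_before_call):
        # Non-first call only if the flag is set and the caller has consumed
        # the same number of characters as on the saved call.
        return not (self.searched == 1
                    and self.chars_before_call == chars_before_call)

    def save(self, path, chars_before_call):
        self.searched = 1
        self.path = path
        self.chars_before_call = chars_before_call


def call_subnet(manager, chars_before_call, match_fn):
    """Run match_fn only on a first call; otherwise reuse the saved path."""
    if manager.is_first_call(chars_before_call):
        manager.save(match_fn(), chars_before_call)
    return manager.path
```

A caller would pass the character count held in its call state as `chars_before_call`; two matching paths that reach the same sub-network after consuming the same number of characters then trigger only one actual search.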
It can be seen that, unlike the traditional large and complex directed-graph grammar network built from grammar rules, the text semantic understanding method of the embodiments of the present invention divides the directed-graph grammar network into a main network and sub-networks, which significantly reduces the complexity of the directed-graph grammar network and improves decoding efficiency. Moreover, when the text to be parsed entered by the user is decoded, a depth-first search is used to match the text against the grammar network, which reduces memory consumption.
Further, a saving mechanism is set up for the sub-networks: during the decoding of one user input, the matching information of the first call to a sub-network is saved, and when the same sub-network is called again later in decoding, the saved matching result is used directly, which further improves decoding efficiency.
The following example describes in more detail how the directed-graph grammar network based on the main-network/sub-network pattern decodes text in an embodiment of the present invention.
Fig. 4 shows an example of a directed-graph grammar network based on the main-network/sub-network pattern.
This directed-graph grammar network is mainly used for film search. The body structure of main network main1 is "I want to see xxx's xxx". The network has three sub-networks, sub1, sub2 and sub3: sub1 is a film-actor-name sub-network, sub2 is a TV-series-actor-name sub-network, and sub3 is a film-title sub-network. The eps in the network denotes an empty arc, which is added automatically during compilation. Empty arcs serve only to separate the logical parts of the sentence grammar formally; when the network is used to parse a natural-language sentence, empty arcs can be ignored, and the two nodes joined by an empty arc are treated as the same node.
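Treating the two nodes joined by an empty arc as the same node can be sketched with a small union-find pass. The arc format `(src, dst, label)`, the label `"eps"`, and the function name `remove_eps` are assumptions for illustration; the patent does not describe how the network is stored.

```python
def remove_eps(arcs):
    """Merge nodes joined by 'eps' arcs and drop those arcs."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]   # path halving
            node = parent[node]
        return node

    # First pass: union the endpoints of every empty arc.
    for src, dst, label in arcs:
        if label == "eps":
            parent[find(dst)] = find(src)

    # Second pass: rewrite the remaining arcs onto the merged nodes.
    return [(find(src), find(dst), label)
            for src, dst, label in arcs if label != "eps"]

arcs = [(0, 1, "看"), (1, 2, "eps"), (2, 3, "$sub1"), (3, 4, "eps"), (4, 5, "的")]
print(remove_eps(arcs))
```

After the pass, the decoder sees a chain of three arcs with no empty arcs in between, so matching never has to special-case eps during the search.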
For example, when the user enters "I want to see Liu Dehua's Infernal Affairs", the grammar network has two word-string matching paths, path A and path B. The matching proceeds as follows (character counts refer to the characters of the Chinese input):
1. Matching path A (first call to sub-network sub3):
A) Exact matching of the word string "I want to see" starts from the main network, and sub-network identifier sub1 appears on the path.
B) Sub-network sub1 is called. The number of characters of the user input matched so far is 3, and the search flag of this sub-network is "not searched". A matching result manager is created for this sub-network and the search flag is saved in it. Matching of the word string "Liu Dehua" then starts; the matching path is saved in the matching result manager of this sub-network, the search flag of this sub-network is set to "searched", and control returns to the main network.
C) The single-character word string between the actor name and the film title (the possessive particle, "的") is matched, and sub-network identifier sub3 appears on the path.
D) Sub-network sub3 is called. The number of characters of the user input matched so far is 7, and the search flag of this sub-network is "not searched". A matching result manager is created for this sub-network and the search flag is saved in it. Matching of the word string "Infernal Affairs" then starts; the matching path is saved in the matching result manager of this sub-network, the search flag of this sub-network is set to "searched", control returns to the main network, and the semantic understanding result is returned.
2. Matching path B (non-first call to sub-network sub3):
A) Exact matching of the word string "I want to see" starts from the main network, and sub-network identifier sub2 appears on the path.
B) Sub-network sub2 is called. The number of characters of the user input matched so far is 3, and the search flag of this sub-network is "not searched". A matching result manager is created for this sub-network. Matching of the word string "Liu Dehua" then starts; the matching path is saved in the matching result manager of this sub-network, the search flag of this sub-network is set to "searched", and control returns to the main network.
C) The single-character word string between the actor name and the film title (the possessive particle, "的") is matched, and sub-network identifier sub3 appears on the path.
D) Sub-network sub3 is called. The search flag of sub-network sub3 is "searched", and the number of characters of the user input matched so far is 7, the same as the number stored in the matching result manager on the first call to sub3. This call therefore needs no word-string matching: the matching path saved in the matching result manager of sub-network sub3 is used directly, and the semantic understanding result is returned.
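The path A / path B walk-through can be reproduced with a short Python sketch that counts how often each sub-network is actually searched. It is a simplification under stated assumptions: the sub-network contents are invented sample data, the Chinese strings are an assumed back-translation of the example input, and the search flag plus character count are folded into a dictionary keyed by `(sub-network, characters matched before the call)` instead of separate manager objects.

```python
SUBNETS = {
    "sub1": ["刘德华"],   # film actor names (sample entry, assumed)
    "sub2": ["刘德华"],   # TV series actor names (sample entry, assumed)
    "sub3": ["无间道"],   # film titles (sample entry, assumed)
}
# main1: 我 想 看 (sub1 | sub2) 的 sub3; a tuple lists alternative sub-networks.
MAIN = ["我", "想", "看", ("sub1", "sub2"), "的", ("sub3",)]

def decode_all(text):
    """Return every matching path and the list of sub-network searches performed."""
    cache, searches, paths = {}, [], []

    def match_subnet(name, pos):
        key = (name, pos)                  # pos = characters matched before the call
        if key not in cache:               # first call: search and save
            searches.append(name)
            cache[key] = [e for e in SUBNETS[name] if text.startswith(e, pos)]
        return cache[key]                  # non-first call: reuse the saved result

    def walk(arc, pos, path):              # depth-first search over the main network
        if arc == len(MAIN):
            if pos == len(text):
                paths.append(path)
            return
        label = MAIN[arc]
        if isinstance(label, tuple):
            for name in label:
                for entry in match_subnet(name, pos):
                    walk(arc + 1, pos + len(entry), path + [(name, entry)])
        elif text.startswith(label, pos):
            walk(arc + 1, pos + 1, path + [label])

    walk(0, 0, [])
    return paths, searches

paths, searches = decode_all("我想看刘德华的无间道")
print(len(paths), searches)
```

Both paths reach sub3 after seven matched characters, so the second arrival hits the saved result and sub3 appears only once in `searches`.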
In addition, in practical applications, the sub-network may also apply a fault-tolerance mechanism during matching, using breadth-first search for network matching and decoding. The user can decide according to actual needs whether to enable the fault-tolerance mechanism.
The fault-tolerance mechanism mainly includes one or more of the following word-string matching modes: self-jump, consecutive jump, and wrong-character tolerance. Continuing with the grammar network shown in Fig. 4, the following examples illustrate sub-network matching with the fault-tolerance mechanism.
When the text to be parsed is "I want to see Liu Liu Dehua's Infernal Affairs" or "I want to see Liu Zhang Dehua's Infernal Affairs", and the sub-network contains neither "Liu Liu Dehua" nor "Liu Zhang Dehua", the self-jump mode absorbs the surplus input character "Liu" or "Zhang". When the two resulting word-string matching paths call sub-network sub3, sub-network word-string matching needs to be performed only once; the other matching path uses the first matching result directly.
When the text to be parsed is "I want to see Liu Hua's Infernal Affairs", and sub-network sub1 or sub2 does not contain "Liu Hua" but does contain "Liu Dehua", "Liu Qinghua" and "Liu Yuhua", the consecutive-jump mode corrects "Liu Hua" into three word-string matching paths: "Liu Dehua", "Liu Qinghua" and "Liu Yuhua". When these three matching paths call sub-network sub3, sub-network word-string matching needs to be performed only once; the other two matching paths use the first matching result directly.
When the text to be parsed is "I want to see Liu Dehua's Infernal Affairs" with the middle character of the name written wrongly (刘的华), and sub-network sub1 or sub-network sub2 does not contain 刘的华 but does contain 刘德华 (Liu Dehua), 刘得华 and 刘海华, the wrong-character tolerance mode computes a penalty value for each wrong-character matching path and corrects 刘的华 into the two word-string matching paths 刘德华 and 刘得华. Because the characters 海 and 的 are close neither in pronunciation nor in glyph shape, the input is not corrected to 刘海华. When these two matching paths call sub-network sub3, sub-network word-string matching needs to be performed only once; the other matching path uses the first matching result directly.
It can be seen that the text semantic understanding method of the embodiment of the present invention improves the fault tolerance of the system by means of the fault-tolerance mechanism.
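The three matching modes above can be read, loosely, as edit operations between the input string and a sub-network entry. The following is only a minimal sketch under that assumption: the `SIMILAR` confusion table, the unit penalty per tolerant move, the `budget` limit of one, and the function names are all hypothetical, and the Chinese characters used for "Liu Qinghua" and "Liu Yuhua" in the usage note are illustrative stand-ins. The source does not specify how penalty values are computed.

```python
from typing import Dict, List, Optional, Set

# Hypothetical confusion table: input character -> characters it may stand for
# (close in pronunciation or glyph shape). Not taken from the source.
SIMILAR: Dict[str, Set[str]] = {"的": {"德", "得"}}


def tolerant_penalty(text: str, entry: str, budget: int = 1) -> Optional[int]:
    """Return the smallest penalty for matching `text` to `entry`, or None.

    Each tolerant move costs one unit of penalty (an assumption):
      self-jump        absorbs one extra input character
      consecutive jump skips one entry character missing from the input
      wrong character  accepts an input character confusable with the entry's
    """
    best: Optional[int] = None

    def walk(i: int, j: int, penalty: int) -> None:
        nonlocal best
        # Prune paths over budget or no better than the best one found so far.
        if penalty > budget or (best is not None and penalty >= best):
            return
        if i == len(text) and j == len(entry):
            best = penalty
            return
        if i < len(text) and j < len(entry):
            if text[i] == entry[j]:
                walk(i + 1, j + 1, penalty)          # exact character match
            elif entry[j] in SIMILAR.get(text[i], set()):
                walk(i + 1, j + 1, penalty + 1)      # wrong-character tolerance
        if i < len(text):
            walk(i + 1, j, penalty + 1)              # self-jump: absorb extra input
        if j < len(entry):
            walk(i, j + 1, penalty + 1)              # consecutive jump: skip a node

    walk(0, 0, 0)
    return best


def tolerant_candidates(text: str, lexicon: List[str]) -> List[str]:
    """Return the lexicon entries that `text` can be tolerated as."""
    return [entry for entry in lexicon if tolerant_penalty(text, entry) is not None]
```

With a budget of one, `tolerant_candidates("刘的华", ["刘德华", "刘得华", "刘海华"])` keeps the first two entries and rejects the third, because "海" is not listed as confusable with "的". A larger budget would let a self-jump plus a consecutive jump imitate a substitution, so a real system would need distinct penalties per mode.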
Correspondingly, an embodiment of the present invention further provides a text semantic understanding system; Fig. 5 is a schematic structural diagram of this system.
In this embodiment, the system comprises:
A network construction module 501, configured to build in advance a directed graph grammar network 500 based on a main-network/sub-network pattern, the directed graph grammar network 500 comprising one main network and one or more sub-networks, each path segment of the directed graph grammar network corresponding to one text character or one subnet identifier;
A receiving module 502, configured to obtain the text to be parsed;
A decoding module 503, configured to decode the text based on the directed graph grammar network to obtain a decoding path;
A result acquisition module 504, configured to obtain the semantics associated with the decoding path as the semantic understanding result.
The network construction module 501 may specifically build the directed graph grammar network according to configured sentence grammar rules. One specific structure of this module comprises the following units:
A rule setting unit, configured to establish sentence grammar rules according to the grammatical characteristics of natural-language input in each application;
A text division unit, configured to determine the text types corresponding to the main network and to each sub-network;
A compilation unit, configured to compile the sentence grammar rules, according to the text types corresponding to the main network and the sub-networks, into main-network and sub-network directed graph grammar networks carrying subnet identifiers.
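As a rough illustration of what the compilation unit produces, the sketch below compiles rule strings into prefix-sharing directed graphs whose edges carry either one text character or one subnet identifier. The `<name>` slot syntax, the `END` marker, and the function names are assumptions made for this example; the source does not describe the rule format or the compiler.

```python
import re
from typing import Dict, List, Tuple

Edge = Tuple[str, int]             # (label, target node)
Network = Dict[int, List[Edge]]    # node -> outgoing edges; node 0 is the first node
END = -1                           # hypothetical marker for the final node


def tokenize(rule: str) -> List[str]:
    """Split a rule into edge labels: a whole "<subnet id>" or a single character."""
    return re.findall(r"<[^>]+>|.", rule)


def add_path(net: Network, labels: List[str]) -> None:
    """Add one label sequence to the graph, reusing edges of a shared prefix."""
    node = 0
    for k, label in enumerate(labels):
        last = k == len(labels) - 1
        # Reuse an existing edge only if it has the same label and the same
        # "ends here / continues" status.
        target = next(
            (t for (l, t) in net[node] if l == label and (t == END) == last),
            None,
        )
        if target is None:
            target = END if last else len(net)
            if not last:
                net[target] = []
            net[node].append((label, target))
        node = target


def compile_grammar(
    main_rules: List[str], subnet_entries: Dict[str, List[str]]
) -> Tuple[Network, Dict[str, Network]]:
    """Compile main rules and per-subnet entry lists into separate graphs."""
    main: Network = {0: []}
    for rule in main_rules:
        add_path(main, tokenize(rule))
    subnets: Dict[str, Network] = {}
    for subnet_id, entries in subnet_entries.items():
        subnets[subnet_id] = {0: []}
        for entry in entries:
            add_path(subnets[subnet_id], tokenize(entry))
    return main, subnets
```

Because the entries of a sub-network are stored once and referenced from the main network by identifier, the main graph stays small no matter how many entries each sub-network holds.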
The decoding module 503 specifically performs string matching on the text to be parsed starting from the first node of the main network; if a subnet identifier appears on a matching path of the main network, it records the main-network match information and calls the sub-network corresponding to that subnet identifier to perform string matching, obtaining and recording the sub-network match information; after the whole text to be parsed has been matched, it obtains the decoding path from the recorded main-network match information and sub-network match information. One specific structure of this module comprises a matching unit and a decoding path acquisition unit, wherein:
The matching unit is configured to perform string matching on the text to be parsed starting from the first node of the main network and, when a subnet identifier appears on a matching path of the main network, to record the main-network match information and call the sub-network corresponding to that subnet identifier to perform string matching, obtaining and recording the sub-network match information;
The decoding path acquisition unit is configured to obtain the decoding path, after the matching unit has matched the whole text to be parsed, from the main-network match information and sub-network match information obtained by the matching unit.
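The matching procedure described above can be sketched as a recursive depth-first walk: plain characters advance through the current network, and a subnet identifier hands the remaining text to the corresponding sub-network. This is only an illustrative sketch; the graph representation (`Network`, `END`), the function names, and the choice to record each sub-network match as an `(identifier, matched text)` pair are assumptions, not the patent's data structures.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Edge = Tuple[str, int]             # (label, target node)
Network = Dict[int, List[Edge]]    # node -> outgoing edges; node 0 is the first node
END = -1                           # hypothetical marker for the final node


def match(
    net: Network,
    subnets: Dict[str, Network],
    text: str,
    pos: int,
    node: int = 0,
    path: Tuple = (),
) -> Iterator[Tuple[int, Tuple]]:
    """Depth-first search: yield (end position, path) for each way `net`
    can consume `text` starting at `pos`."""
    if node == END:
        yield pos, path
        return
    for label, target in net[node]:
        if label in subnets:
            # Subnet identifier: call the sub-network, then resume in this network.
            for sub_end, _ in match(subnets[label], subnets, text, pos):
                slot = (label, text[pos:sub_end])
                yield from match(net, subnets, text, sub_end, target, path + (slot,))
        elif pos < len(text) and text[pos] == label:
            yield from match(net, subnets, text, pos + 1, target, path + (label,))


def decode(main: Network, subnets: Dict[str, Network], text: str) -> Optional[List]:
    """Return the first decoding path that consumes the whole text, else None."""
    for end, path in match(main, subnets, text, 0):
        if end == len(text):
            return list(path)
    return None
```

The `(identifier, matched text)` pairs in the returned path play the role of the semantics associated with the decoding path: they show which sub-network matched which part of the input.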
The text semantic understanding system of the embodiment of the present invention divides the directed graph grammar network into a main network and sub-networks, which greatly reduces the complexity of the directed graph grammar network and improves decoding efficiency. Moreover, when decoding the text to be parsed entered by the user, it uses depth-first search to match the text against the grammar network, which reduces memory consumption.
Further, the decoding module 503 may also comprise a judging unit, configured to judge, when the matching unit calls the sub-network corresponding to the subnet identifier to perform string matching, whether the sub-network is being called for the first time, and to feed the result back to the matching unit. For example, the sub-network match information includes: the sub-network matching path, a sub-network search flag, and the number of characters of the matched string; the main-network match information includes: the main-network matching path, the subnet identifier of the called sub-network, and the number of characters of the matched string. By comparing this information, the judging unit can determine whether a call is a first call. Specifically, when the sub-network search flag indicates that the sub-network has not been searched, the call is determined to be a first call; when the sub-network search flag indicates that it has been searched, and the number of characters of the matched string in the main-network match information equals that in the sub-network match information, the call is determined not to be a first call.
Correspondingly, when the judging unit determines that the call is a first call, the matching unit uses the sub-network to perform string matching and saves the obtained sub-network match information in a sub-network matching result manager; when the judging unit determines that the call is not a first call, the matching unit obtains the historical matching result from the sub-network matching result manager as the sub-network match information.
It can be seen that the text semantic understanding system of the embodiment of the present invention sets up a saving mechanism for sub-networks: the match information from the first call to a sub-network during the decoding of one user input text is saved, and when the same sub-network is called again later in decoding, the saved matching result is used directly, further improving decoding efficiency.
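The saving mechanism behaves much like memoization keyed on the called sub-network and the number of characters already matched. The sketch below assumes exactly that key and folds the "search flag" into the presence of a saved entry; the class name, the `search` callback signature, and the lexicon-based search are hypothetical simplifications, and one manager instance is assumed per input text.

```python
from typing import Callable, Dict, List, Tuple

# A search function takes (text, start position) and returns the end positions
# of every sub-network entry that matches at that position.
SearchFn = Callable[[str, int], List[int]]


class SubnetMatchManager:
    """Saves sub-network match results for the decoding of one input text."""

    def __init__(self) -> None:
        self._saved: Dict[Tuple[str, int], List[int]] = {}
        self.search_count = 0  # how many real searches were performed

    def is_first_call(self, subnet_id: str, matched_chars: int) -> bool:
        # A missing entry stands in for "search flag not set"; the key also
        # requires the number of matched characters to be identical.
        return (subnet_id, matched_chars) not in self._saved

    def match(
        self, subnet_id: str, text: str, matched_chars: int, search: SearchFn
    ) -> List[int]:
        key = (subnet_id, matched_chars)
        if self.is_first_call(subnet_id, matched_chars):
            self.search_count += 1
            self._saved[key] = search(text, matched_chars)
        return self._saved[key]  # later calls reuse the saved result


def lexicon_search(lexicon: List[str]) -> SearchFn:
    """Build a toy search function over a flat list of entries."""
    def search(text: str, pos: int) -> List[int]:
        return [pos + len(word) for word in lexicon if text.startswith(word, pos)]
    return search
```

In the "Liu Hua" example, all three tolerated paths reach sub3 after the same number of matched characters, so only the first path triggers a search and the other two read the saved result.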
It should be noted that, in practical applications, the matching unit may also apply a fault-tolerance mechanism when matching with a sub-network, using breadth-first search for network matching and decoding. The fault-tolerance mechanism comprises one or more of the following string matching modes: self-jump, consecutive jump, and wrong-character tolerance. For the decoding process of each mode, refer to the description in the method embodiment above; it is not repeated here.
In addition, the system of the present invention may further include a fault-tolerance setting module that provides a setting function to the user, so that the user decides according to actual needs whether to enable the fault-tolerance mechanism. That is, if the user enables the fault-tolerance mechanism, string matching with a sub-network uses the fault-tolerance mechanism; otherwise, exact matching is used. Of course, in practical applications, the system may also preset whether the fault-tolerance mechanism is used, according to the needs of the actual application environment.
It can be seen that the text semantic understanding system of the embodiment of the present invention further improves the fault tolerance of the system by means of the fault-tolerance mechanism.
The embodiments in this specification are described progressively; for the parts that are the same or similar across embodiments, the embodiments may be referred to one another, and each embodiment focuses on how it differs from the others. In particular, since the system embodiment is substantially similar to the method embodiment, it is described relatively briefly; for the relevant parts, refer to the description of the method embodiment. The system embodiment described above is merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.
The embodiments of the present invention have been described in detail above, and specific implementations have been used herein to explain the invention; the description of the above embodiments is only intended to help in understanding the method and system of the present invention. At the same time, for those of ordinary skill in the art, both the specific implementation and the scope of application may change in accordance with the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (14)

1. A text semantic understanding method, characterized by comprising:
building in advance a directed graph grammar network based on a main-network/sub-network pattern, the directed graph grammar network comprising one main network and one or more sub-networks, each path segment of the directed graph grammar network corresponding to one text character or one subnet identifier;
obtaining a text to be parsed;
decoding the text based on the directed graph grammar network to obtain a decoding path;
obtaining the semantics associated with the decoding path as a semantic understanding result.
2. The method according to claim 1, characterized in that building the directed graph grammar network based on the main-network/sub-network pattern comprises:
establishing sentence grammar rules according to the grammatical characteristics of natural-language input in each application;
determining the text types corresponding to the main network and to each sub-network;
compiling the sentence grammar rules, according to the text types corresponding to the main network and the sub-networks, into main-network and sub-network directed graph grammar networks carrying subnet identifiers.
3. The method according to claim 1, characterized in that decoding the text based on the directed graph grammar network to obtain the decoding path comprises:
performing string matching on the text to be parsed starting from the first node of the main network;
if a subnet identifier appears on a matching path of the main network, recording main-network match information, and calling the sub-network corresponding to the subnet identifier to perform string matching, obtaining and recording sub-network match information;
after the whole text to be parsed has been matched, obtaining the decoding path from the obtained main-network match information and sub-network match information.
4. The method according to claim 3, characterized in that decoding the text based on the directed graph grammar network to obtain the decoding path further comprises:
when calling the sub-network corresponding to the subnet identifier to perform string matching, judging whether the sub-network is being called for the first time;
if so, using the sub-network to perform string matching, and saving the obtained sub-network match information in a sub-network matching result manager;
otherwise, obtaining the historical matching result from the sub-network matching result manager as the sub-network match information.
5. The method according to claim 4, characterized in that the sub-network match information includes: the sub-network matching path, a sub-network search flag, and the number of characters of the matched string; and the main-network match information includes: the main-network matching path, the subnet identifier of the called sub-network, and the number of characters of the matched string;
judging whether the sub-network is being called for the first time comprises:
if the sub-network search flag indicates that the sub-network has not been searched, determining that the call is a first call;
if the sub-network search flag indicates that the sub-network has been searched, and the number of characters of the matched string in the main-network match information equals that in the sub-network match information, determining that the call is not a first call.
6. The method according to claim 3, characterized in that using the sub-network to perform string matching comprises:
when using the sub-network to perform string matching, performing the string matching with a fault-tolerance mechanism, the fault-tolerance mechanism comprising one or more of the following string matching modes: self-jump, consecutive jump, and wrong-character tolerance.
7. The method according to any one of claims 1 to 6, characterized in that the sub-networks have one or more layers.
8. A text semantic understanding system, characterized by comprising:
a network construction module, configured to build in advance a directed graph grammar network based on a main-network/sub-network pattern, the directed graph grammar network comprising one main network and one or more sub-networks, each path segment of the directed graph grammar network corresponding to one text character or one subnet identifier;
a receiving module, configured to obtain a text to be parsed;
a decoding module, configured to decode the text based on the directed graph grammar network to obtain a decoding path;
a result acquisition module, configured to obtain the semantics associated with the decoding path as a semantic understanding result.
9. The system according to claim 8, characterized in that the network construction module comprises:
a rule setting unit, configured to establish sentence grammar rules according to the grammatical characteristics of natural-language input in each application;
a text division unit, configured to determine the text types corresponding to the main network and to each sub-network;
a compilation unit, configured to compile the sentence grammar rules, according to the text types corresponding to the main network and the sub-networks, into main-network and sub-network directed graph grammar networks carrying subnet identifiers.
10. The system according to claim 8, characterized in that the decoding module comprises:
a matching unit, configured to perform string matching on the text to be parsed starting from the first node of the main network and, when a subnet identifier appears on a matching path of the main network, to record main-network match information and call the sub-network corresponding to the subnet identifier to perform string matching, obtaining and recording sub-network match information;
a decoding path acquisition unit, configured to obtain the decoding path, after the matching unit has matched the whole text to be parsed, from the main-network match information and sub-network match information obtained by the matching unit.
11. The system according to claim 10, characterized in that the decoding module further comprises:
a judging unit, configured to judge, when the matching unit calls the sub-network corresponding to the subnet identifier to perform string matching, whether the sub-network is being called for the first time, and to feed the result back to the matching unit;
the matching unit, when the judging unit determines that the call is a first call, uses the sub-network to perform string matching and saves the obtained sub-network match information in a sub-network matching result manager, and, when the judging unit determines that the call is not a first call, obtains the historical matching result from the sub-network matching result manager as the sub-network match information.
12. The system according to claim 11, characterized in that the sub-network match information includes: the sub-network matching path, a sub-network search flag, and the number of characters of the matched string; and the main-network match information includes: the main-network matching path, the subnet identifier of the called sub-network, and the number of characters of the matched string;
the judging unit is specifically configured to determine that the call is a first call when the sub-network search flag indicates that the sub-network has not been searched, and to determine that the call is not a first call when the sub-network search flag indicates that the sub-network has been searched and the number of characters of the matched string in the main-network match information equals that in the sub-network match information.
13. The system according to claim 10, characterized in that, when using the sub-network to perform string matching, the matching unit performs the string matching with a fault-tolerance mechanism, the fault-tolerance mechanism comprising one or more of the following string matching modes: self-jump, consecutive jump, and wrong-character tolerance.
14. The system according to any one of claims 8 to 13, characterized in that the sub-networks have one or more layers.
CN201510159102.2A 2015-04-03 2015-04-03 Text semantic understanding method and system Active CN106156110B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510159102.2A CN106156110B (en) 2015-04-03 2015-04-03 Text semantic understanding method and system

Publications (2)

Publication Number Publication Date
CN106156110A true CN106156110A (en) 2016-11-23
CN106156110B CN106156110B (en) 2019-07-30

Family

ID=57338433

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510159102.2A Active CN106156110B (en) 2015-04-03 2015-04-03 Text semantic understanding method and system

Country Status (1)

Country Link
CN (1) CN106156110B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106897268A (en) * 2017-02-28 2017-06-27 科大讯飞股份有限公司 Text semantic understanding method, device and system
CN114219876A (en) * 2022-02-18 2022-03-22 阿里巴巴达摩院(杭州)科技有限公司 Text merging method, device, equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1338721A (en) * 2000-08-16 2002-03-06 财团法人工业技术研究院 Probability-guide fault-tolerant method for understanding natural languages
US20110071819A1 (en) * 2009-09-22 2011-03-24 Tanya Miller Apparatus, system, and method for natural language processing
CN102789464A (en) * 2011-05-20 2012-11-21 陈伯妤 Natural language processing method, device and system based on semanteme recognition
CN103440234A (en) * 2013-07-25 2013-12-11 清华大学 Natural language understanding system and method
CN103500160A (en) * 2013-10-18 2014-01-08 大连理工大学 Syntactic analysis method based on sliding semantic string matching
US20140188835A1 (en) * 2012-12-31 2014-07-03 Via Technologies, Inc. Search method, search system, and natural language comprehension system
CN104252533A (en) * 2014-09-12 2014-12-31 百度在线网络技术(北京)有限公司 Search method and search device

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106897268A (en) * 2017-02-28 2017-06-27 科大讯飞股份有限公司 Text semantic understanding method, device and system
CN106897268B (en) * 2017-02-28 2020-06-02 科大讯飞股份有限公司 Text semantic understanding method, device and system
CN114219876A (en) * 2022-02-18 2022-03-22 阿里巴巴达摩院(杭州)科技有限公司 Text merging method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN106156110B (en) 2019-07-30

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant