CN107247613A - Sentence analytic method and sentence resolver - Google Patents

Sentence analytic method and sentence resolver

Info

Publication number
CN107247613A
Authority
CN
China
Prior art keywords
sentence
resolved
morpheme
syntax tree
algorithm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710276537.4A
Other languages
Chinese (zh)
Inventor
邢锦江
李剑
朱华
邹雪梅
陈险峰
朱峰登
史可华
董扬威
李亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Aerospace Control Center
Original Assignee
Beijing Aerospace Control Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Aerospace Control Center
Priority to CN201710276537.4A
Publication of CN107247613A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/42 Syntactic analysis
    • G06F8/427 Parsing

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a sentence parsing method and a sentence parsing apparatus. The method includes: obtaining a sentence to be parsed; and parsing the sentence according to the grammar of a Chinese domain-specific language, where both the sentence to be parsed and the Chinese domain-specific language are described in Chinese. The invention addresses the technical problem in the related art that English-based domain-specific languages are complex to process and do not match Chinese language habits. It improves the readability of both the sentence to be parsed and the Chinese domain-specific language, and thereby improves the user experience.

Description

Sentence analytic method and sentence resolver
Technical field
The present invention relates to the field of domain-specific languages, and in particular to a sentence parsing method and a sentence parsing apparatus.
Background
A domain-specific language (DSL) is a computer language designed for a specific application area. Using an agreed grammar, it expresses the intent of domain professionals and helps them solve problems in that field efficiently.
In the related art, a computer language is typically described using Extended Backus-Naur Form (EBNF). Traditional tools for describing domain-specific languages, such as ANTLR (ANother Tool for Language Recognition), can simplify the design of a domain-specific language to some extent. However, traditional EBNF-based description methods, and existing language description and parsing tools such as ANTLR, still have problems. For example, general methods for describing a domain-specific language require English as the basic lexical element and keyword. Because these tools have difficulty handling the complex segmentation logic of Chinese correctly, even when Chinese keywords are allowed, spaces must still be inserted between words as in English. Such methods therefore do not match Chinese language habits.
Therefore, in the related art, English-based domain-specific languages are complex to process and do not match Chinese language habits.
Summary of the invention
Embodiments of the present invention provide a sentence parsing method and a sentence parsing apparatus, to at least solve the technical problem in the related art that English-based domain-specific languages are complex to process and do not match Chinese language habits.
According to one aspect of the embodiments of the present invention, a sentence parsing method is provided, including: obtaining a sentence to be parsed; and parsing the sentence according to the grammar of a Chinese domain-specific language, where both the sentence to be parsed and the Chinese domain-specific language are described in Chinese.
Optionally, the grammar is described using dynamically changeable data. The grammar includes: symbols for describing the types of morphemes of the Chinese domain-specific language; and, in addition to the symbols, a dictionary that supplements the symbols.
Optionally, parsing the sentence according to the grammar of the Chinese domain-specific language includes: decomposing the sentence to be parsed into basic morphemes; tagging the decomposed basic morphemes with parts of speech; and parsing the tagged basic morphemes into a syntax tree according to the grammar of the Chinese domain-specific language.
Optionally, before the sentence to be parsed is decomposed into basic morphemes, the method further includes: determining, using a predetermined ambiguity determination algorithm, whether the sentence to be parsed is ambiguous; and, if so, avoiding the ambiguity in the sentence using a predetermined avoidance method.
Optionally, decomposing the sentence to be parsed into basic morphemes includes: decomposing the sentence into basic morphemes using the longest-match principle, where the longest-match principle is to match as long a stretch of the sentence as possible.
Optionally, parsing the tagged basic morphemes into a syntax tree according to the grammar of the Chinese domain-specific language includes one of the following: parsing the tagged basic morphemes into a syntax tree using a top-down (descending) syntax tree parsing algorithm, where the top-down algorithm starts at a predetermined morpheme position and searches forward for matches in turn, and, when a matched morpheme references other symbols beyond the symbol the morpheme itself cites, matches those other symbols as well; parsing the tagged basic morphemes into a syntax tree using a bottom-up (ascending) syntax tree parsing algorithm, where the bottom-up algorithm builds the parent nodes of the basic morphemes produced by decomposing the sentence, then builds the parent nodes of those parent nodes in the same way, until a unique root node is produced; or parsing the tagged basic morphemes into a syntax tree using a combination of the top-down and bottom-up algorithms.
Optionally, before the sentence to be parsed is decomposed into basic morphemes, the method further includes: inferring omitted content in the sentence using a predetermined ellipsis inference algorithm, and restoring the sentence to a sentence with complete information, where the predetermined ellipsis inference algorithm includes at least one of: a context inference algorithm that supplements omitted content according to the preceding basic morphemes; a time inference algorithm that calculates a time according to a basic morpheme that references a time; and a business object inference algorithm that locates the object of a basic morpheme whose information is not fully specified.
Optionally, after the tagged basic morphemes are parsed into a syntax tree according to the grammar of the Chinese domain-specific language, the method further includes: each leaf node of the syntax tree passes its content to its parent node; the parent node processes the content passed by all the leaf nodes it contains to obtain the content of the parent node; and the passing and processing operations are performed in turn up to the root node, with the content of the root node taken as the result value of the syntax tree, where the result value is used to execute an application programming interface.
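The passing-and-processing step described above amounts to a post-order evaluation of the syntax tree. A minimal Python sketch follows; the `Node` class, the `HANDLERS` table, and the DATE/TIME/DATETIME example are illustrative assumptions rather than anything specified in this document.

```python
from dataclasses import dataclass, field
from datetime import date, datetime, time
from typing import Any, Callable, Dict, List


@dataclass
class Node:
    """Hypothetical syntax tree node: a symbol code, content, and children."""
    symbol: str
    content: Any = None
    children: List["Node"] = field(default_factory=list)


# Hypothetical per-symbol processing functions used by parent nodes.
HANDLERS: Dict[str, Callable[[List[Any]], Any]] = {
    # Combine a date value and a time value into one date-time value.
    "DATETIME": lambda values: datetime.combine(values[0], values[1]),
}


def evaluate(node: Node) -> Any:
    """Pass leaf content upward and process it at each parent, up to the root."""
    if not node.children:
        return node.content
    child_values = [evaluate(child) for child in node.children]
    node.content = HANDLERS[node.symbol](child_values)
    return node.content


tree = Node("DATETIME", children=[
    Node("DATE", date(2016, 11, 13)),
    Node("TIME", time(8, 30)),
])
result = evaluate(tree)  # content of the root node, i.e. the tree's result value
```

In this sketch the root's content would then be handed to whatever application programming interface the statement is meant to drive.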
According to another aspect of the embodiments of the present invention, a sentence parsing apparatus is further provided, including: an acquisition module, configured to obtain a sentence to be parsed; and a parsing module, configured to parse the sentence according to the grammar of a Chinese domain-specific language, where both the sentence to be parsed and the Chinese domain-specific language are described in Chinese.
Optionally, the parsing module includes: a segmentation unit, configured to decompose the sentence to be parsed into basic morphemes; a tagging unit, configured to tag the decomposed basic morphemes with parts of speech; and a parsing unit, configured to parse the tagged basic morphemes into a syntax tree according to the grammar of the Chinese domain-specific language.
Optionally, the parsing module further includes: a determination unit, configured to determine, using a predetermined ambiguity determination algorithm, whether the sentence to be parsed is ambiguous; and an avoidance unit, configured to avoid, if the sentence is ambiguous, the ambiguity in the sentence using a predetermined avoidance method.
Optionally, the segmentation unit includes: a decomposition subunit, configured to decompose the sentence to be parsed into basic morphemes using the longest-match principle, where the longest-match principle is to match as long a stretch of the sentence as possible.
Optionally, the parsing unit includes one of the following: a first parsing subunit, configured to parse the tagged basic morphemes into a syntax tree using a top-down (descending) syntax tree parsing algorithm, where the top-down algorithm starts at a predetermined morpheme position and searches forward for matches in turn, and, when a matched morpheme references other symbols beyond the symbol the morpheme itself cites, matches those other symbols as well; a second parsing subunit, configured to parse the tagged basic morphemes into a syntax tree using a bottom-up (ascending) syntax tree parsing algorithm, where the bottom-up algorithm builds the parent nodes of the basic morphemes produced by decomposing the sentence, then builds the parent nodes of those parent nodes in the same way, until a unique root node is produced; or a third parsing subunit, configured to parse the tagged basic morphemes into a syntax tree using a combination of the top-down and bottom-up algorithms.
Optionally, the parsing module further includes: an inference unit, configured to infer omitted content in the sentence to be parsed using a predetermined ellipsis inference algorithm and to restore the sentence to a sentence with complete information, where the predetermined ellipsis inference algorithm includes at least one of: a context inference algorithm that supplements omitted content according to the preceding basic morphemes; a time inference algorithm that calculates a time according to a basic morpheme that references a time; and a business object inference algorithm that locates the object of a basic morpheme whose information is not fully specified.
Optionally, the parsing module further includes: a passing unit, configured so that each leaf node of the syntax tree passes its content to its parent node; a processing unit, configured so that the parent node processes the content passed by all the leaf nodes it contains to obtain the content of the parent node; and an execution module, configured to perform the passing and processing operations in turn up to the root node, with the content of the root node taken as the result value of the syntax tree.
According to another aspect of the embodiments of the present invention, a storage medium is further provided. The storage medium includes a stored program, and when the program runs, the device on which the storage medium resides is controlled to perform the following operations: obtaining a sentence to be parsed; and parsing the sentence according to the grammar of a Chinese domain-specific language, where both the sentence to be parsed and the Chinese domain-specific language are described in Chinese.
According to another aspect of the embodiments of the present invention, a processor is further provided. The processor is configured to run a program, and the following operations are performed when the program runs: obtaining a sentence to be parsed; and parsing the sentence according to the grammar of a Chinese domain-specific language, where both the sentence to be parsed and the Chinese domain-specific language are described in Chinese.
In the embodiments of the present invention, a sentence to be parsed is obtained and then parsed according to the grammar of a Chinese domain-specific language. Because both the sentence to be parsed and the Chinese domain-specific language are described in Chinese, the readability of both is improved. This solves the technical problem in the related art that English-based domain-specific languages are complex to process and do not match Chinese language habits, and thereby improves the user experience.
Brief description of the drawings
The accompanying drawings described herein provide a further understanding of the present invention and form part of this application. The exemplary embodiments of the invention and their description are used to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is the flow chart of sentence analytic method according to embodiments of the present invention;
Fig. 2 is the logical model figure of symbol according to embodiments of the present invention;
Fig. 3 is the logical model figure of dictionary according to embodiments of the present invention;
Fig. 4 is the logical model figure of grammer tree node according to embodiments of the present invention;
Fig. 5 is word segmentation result exemplary plot according to embodiments of the present invention;
Fig. 6 is a flow chart of the top-down (descending) forward-matching algorithm for syntax tree parsing according to embodiments of the present invention;
Fig. 7 is the frame diagram of definition and the parsing of Chinese field language-specific according to embodiments of the present invention;
Fig. 8 is the flow chart of the resolving of Chinese field language-specific according to embodiments of the present invention;And
Fig. 9 is the schematic diagram of sentence resolver according to embodiments of the present invention.
Detailed description of the embodiments
To help those skilled in the art better understand the solution of the present invention, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the scope of protection of the invention.
It should be noted that the terms "first", "second", and the like in the description, the claims, and the drawings are used to distinguish similar objects and do not necessarily describe a specific order or sequence. Data used in this way may be interchanged where appropriate, so that the embodiments described herein can be implemented in orders other than those illustrated or described. In addition, the terms "include" and "have" and any variants of them are intended to cover non-exclusive inclusion: a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, and may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
According to an embodiment of the present invention, a method embodiment of a sentence parsing method is provided. It should be noted that the steps shown in the flow charts of the drawings may be executed in a computer system, such as one running a set of computer-executable instructions, and that although a logical order is shown in the flow charts, in some cases the steps shown or described may be performed in a different order.
This embodiment provides a sentence parsing method. Fig. 1 is a flow chart of the sentence parsing method according to an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps:
Step S102: obtain a sentence to be parsed.
Step S104: parse the sentence according to the grammar of a Chinese domain-specific language, where both the sentence to be parsed and the Chinese domain-specific language are described in Chinese.
Through the above steps, a sentence to be parsed is obtained and then parsed according to the grammar of a Chinese domain-specific language, where both are described in Chinese. This solves the problem in the related art that English-based domain-specific languages are complex to process and do not match Chinese language habits. Because the description is in Chinese and matches Chinese language habits, the domain-specific language becomes more convenient to use, which in turn improves the user experience.
It should be noted that, in the related art, EBNF can describe only a static lexicon and cannot extend the lexicon for lexical element objects that keep changing. As a result, the syntax of general domain-specific languages (such as the cmd command line under Windows) departs from natural-language habits, and personnel in the field need the necessary training to master one. For example, suppose an application software platform dynamically creates a new event according to business needs and names it "event X37". This should be a morpheme of the "event" class, but because that lexical item cannot be predefined in the language description, it cannot be recognized correctly. In the Chinese domain-specific language of the embodiment, by contrast, the grammar can be predefined in Chinese.
In addition, to increase flexibility, and unlike tools such as ANTLR that use extended Backus-Naur expressions, the embodiment describes the grammar using dynamically changeable data. When the data in the symbol table and the dictionary changes, the parsing logic is affected immediately. With a description tool such as ANTLR, the described grammar must first be processed to generate the related Java interpreter code, which must then be compiled before it can be deployed in the application business system for final language execution. Maintaining, extending, and updating the domain-specific language after the business system is released is therefore inconvenient. Describing the grammar with dynamically changeable data, as in the embodiment, requires no parser code generation, compilation, or re-release.
The description of the grammar using dynamically changeable data in the embodiment is explained below.
The grammar may include symbols that describe the types of morphemes of the Chinese domain-specific language, where a symbol is the basic type of a lexical element in a grammar system. A symbol in the grammar system is described by the attributes code, name, pattern, terminal flag, and priority. Fig. 2 is the logical model diagram of a symbol according to an embodiment of the present invention. Table 1 shows the correspondence among code, name, pattern, terminal flag, and priority in the symbol definitions; a set of example grammar symbols described with this logical model is shown in Table 1 below:
Table 1
In the table above, "date" and "time" are two independently defined symbols whose patterns are defined with regular expressions. The pattern of the symbol "date-time" is defined by referencing the two symbols "date" and "time" (the code of each referenced symbol is marked with square brackets). Some symbols cannot be defined by simple pattern combinations. The "ground station" symbol, for example, is defined with the built-in function IsStation(). This function is a query against application business data that determines whether a text denotes a space tracking and control ground station. Given a target text, if a record of that ground station is found in the business data, the function returns true, indicating that the target text represents a ground station. Judging by function in this way suits lexical elements that change constantly during dynamic computation and cannot be described by a simple regular expression. Functions support not only data queries but also complex logical operations, so they can decide comprehensively whether a target text belongs to a particular lexical type. As a basic convention, a terminal symbol can be defined only by a pure regular expression or a built-in function, while a non-terminal symbol can be defined by a combination of other symbols (using symbol codes or symbol names enclosed in square brackets). "Priority" is a numeric attribute that specifies the order in which matches are attempted; smaller numbers are tried first. For each symbol, the parsing framework defines a default internal processing function named "sCODE", where s is a common prefix and CODE is the code of the corresponding symbol. For example, the function sDAY() is the default internal function that handles the "day reference" symbol; it determines which specific day is referred to in the target text being analyzed.
In addition to the symbols, the grammar also includes a dictionary that supplements the symbols. The dictionary is an additional way of describing the lexicon outside the symbol table. By tagging actual text, it provides further evidence for symbol recognition and matching, and so supplements the symbol table. The symbol table is usually aimed at language developers and system developers, whereas the dictionary is aimed at ordinary users. Fig. 3 is the logical model diagram of the dictionary according to an embodiment of the present invention; as shown in Fig. 3, it includes the word, the symbol, and the parameters. Table 2 is an example of a dictionary and shows the relationship among the word, the symbol code, and the parameters:
Table 2
Word                      Symbol code    Parameters
Tomorrow                  DAY            now, 1
The day after tomorrow    DAY            now, 2
In the table above, two new words, "tomorrow" and "the day after tomorrow", are defined for the "DAY" (day reference) symbol, each with its parameters. The parameters are passed to sDAY(), the default processing function of the DAY symbol, which calculates the specific day each word denotes. Given the two parameters "now, 1", "tomorrow" is calculated as the current date plus 1 day, and "the day after tomorrow" is calculated as the current date plus 2 days. In this way, language users can extend the DAY symbol in a lightweight manner, without having to write a regular expression.
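The symbol table and dictionary described above can be sketched as plain data, which is what lets them change at run time without regenerating a parser. The following Python sketch is illustrative only: the `Symbol` fields mirror the attributes named in the text, while the concrete regular expressions, the station names, the Chinese dictionary words, and the helper names (`is_station`, `s_day`, `resolve_day`) are assumptions, since the contents of Table 1 are not reproduced here.

```python
import re
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Callable, Optional


@dataclass
class Symbol:
    code: str
    name: str
    pattern: str      # regex for terminals; "[CODE]" references for non-terminals
    terminal: bool
    priority: int     # smaller number = tried first
    predicate: Optional[Callable[[str], bool]] = None  # built-in function, if any


STATIONS = {"一号站", "二号站"}  # hypothetical business data


def is_station(text: str) -> bool:
    """Stand-in for IsStation(): a lookup in application business data."""
    return text in STATIONS


# Assumed example entries; the patterns are guesses, not the patent's Table 1.
SYMBOLS = {
    "DATE": Symbol("DATE", "date", r"\d{4}年\d{1,2}月\d{1,2}日", True, 2),
    "TIME": Symbol("TIME", "time", r"\d{1,2}时\d{1,2}分", True, 2),
    "DATETIME": Symbol("DATETIME", "date-time", "[DATE][TIME]", False, 1),
    "STATION": Symbol("STATION", "ground station", "", True, 3, is_station),
}


def referenced_codes(symbol: Symbol) -> list:
    """Symbol codes that a non-terminal pattern cites in square brackets."""
    return re.findall(r"\[([A-Z]+)\]", symbol.pattern)


# Dictionary: word -> (symbol code, parameters). Extending it needs no regex.
DICTIONARY = {
    "明天": ("DAY", ("now", 1)),   # "tomorrow"
    "后天": ("DAY", ("now", 2)),   # "the day after tomorrow"
}


def s_day(base: str, offset_days: int, today: date) -> date:
    """Stand-in for sDAY(): resolve a day reference to a concrete date."""
    if base != "now":
        raise ValueError(f"unsupported base: {base}")
    return today + timedelta(days=offset_days)


def resolve_day(word: str, today: date) -> date:
    """Look the word up in the dictionary and apply the DAY processing function."""
    code, params = DICTIONARY[word]
    if code != "DAY":
        raise ValueError(f"{word} is not a DAY word")
    return s_day(*params, today=today)
```

Adding a new day word in this sketch is a one-line change to `DICTIONARY`, which is the kind of lightweight extension the text attributes to the dictionary.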
After the sentence to be parsed is obtained, it must be parsed. Parsing the sentence includes: decomposing the sentence to be parsed into basic morphemes; tagging the decomposed basic morphemes with parts of speech; and parsing the tagged basic morphemes into a syntax tree according to the grammar of the Chinese domain-specific language.
The most important data structure in the grammar definition and parsing system is the syntax tree node; the embodiment uses a logical model centered on syntax tree nodes (rather than syntax subtrees). The node data structure can describe many kinds of nodes and hold various data. Fig. 4 is the logical model diagram of a syntax tree node according to an embodiment of the present invention, and its main attributes are shown in Fig. 4. In this logical model, the attribute "original text" is the input text that generated the node. "Symbol code" refers to the symbol type of the current node; the corresponding symbol object and its processing functions can be accessed through this attribute. "Content" is a reference to the actual data of the node, which may be a simple value, a class instance, or any data structure that provides information for computing the node. "Parent node" points to the node one level above the current node. "Child nodes" is a variable-length list that points to each next-level node of the current node. "Display text" is usually a subset of the content. It provides the text shown on the tree node when the "domain-specific language manager" of the embodiment draws the syntax tree, and is typically used to test and verify the domain-specific language text entered by the user and the related parsing algorithms.
Nodes connected to one another form a syntax tree. The unique root node of the tree has no parent node, and every other node has exactly one parent node. The bottom-level nodes are leaf nodes, each representing an element of the sentence, and each node represents an operation.
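The node model described above maps naturally onto a small record type. The Python sketch below is one possible rendering; the field names are translations of the attributes in the text, and the `add_child`, `is_leaf`, and `root` helpers are assumptions added for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass(eq=False)  # identity comparison; avoids recursing through parent links
class SyntaxNode:
    original_text: str                  # input text that generated the node
    symbol_code: str                    # symbol type of the node, e.g. "DATE"
    content: Any = None                 # value, class instance, or other data
    display_text: str = ""              # text shown when the tree is drawn
    parent: Optional["SyntaxNode"] = field(default=None, repr=False)
    children: List["SyntaxNode"] = field(default_factory=list)

    def add_child(self, child: "SyntaxNode") -> None:
        """Attach a child and keep the parent link consistent."""
        child.parent = self
        self.children.append(child)

    def is_leaf(self) -> bool:
        return not self.children

    def root(self) -> "SyntaxNode":
        """Follow parent links up to the unique root node."""
        node = self
        while node.parent is not None:
            node = node.parent
        return node


# Hypothetical example: a date-time node with a date leaf and a time leaf.
root = SyntaxNode("2016年11月13日8时30分", "DATETIME")
date_leaf = SyntaxNode("2016年11月13日", "DATE")
time_leaf = SyntaxNode("8时30分", "TIME")
root.add_child(date_leaf)
root.add_child(time_leaf)
```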
In addition, the part-of-speech tagging algorithm runs in step with segmentation: each morpheme is tagged as soon as it is separated. Tagging must record not only the symbol type corresponding to the lexical element but also its original content, and must extract the key parameter values from it. Part-of-speech tagging is carried out by the default processing function of the symbol.
For example, the symbol type of "November 13, 2016" is date (DATE), so the text is passed to sDATE, the default processing function of that symbol.
In simplified form, this part-of-speech tagging function extracts the specific year, month, and day values and builds a date-type variable from them. After the symbol's default function has run, a syntax tree node object has been created. The symbol of the node is the symbol identified during segmentation; its original text and display text are both "November 13, 2016"; its symbol code is DATE; its content is a date-type object whose value is November 13, 2016; and its parent node and child nodes are empty, indicating that the syntax tree has not yet been built.
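The tagging step for a date can be sketched as follows. This is a hedged reconstruction rather than the original pseudo-code: the function name `s_date`, the assumed Chinese date format `YYYY年M月D日`, and the use of a plain dictionary for the node are all illustrative choices.

```python
import re
from datetime import date

# Assumed surface form of a date morpheme, e.g. "2016年11月13日".
DATE_PATTERN = re.compile(r"(\d{4})年(\d{1,2})月(\d{1,2})日")


def s_date(text: str) -> dict:
    """Sketch of a default processing function for the DATE symbol.

    Extracts year, month, and day, builds a date value, and returns a
    freshly tagged syntax tree node with no parent or children yet.
    """
    match = DATE_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a DATE morpheme: {text!r}")
    year, month, day = (int(group) for group in match.groups())
    return {
        "original_text": text,
        "display_text": text,
        "symbol_code": "DATE",
        "content": date(year, month, day),
        "parent": None,     # the syntax tree has not been built yet
        "children": [],
    }


node = s_date("2016年11月13日")
```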
Before the sentence to be parsed is decomposed into basic morphemes, a predetermined ambiguity determination algorithm is used to determine whether the sentence is ambiguous; if it is, a predetermined avoidance method is used to avoid the ambiguity.
Traditionally, recognizing and processing Chinese natural language is a problem within computational linguistics. For the description and parsing of the Chinese domain-specific language, the embodiment designs and implements the framework and functions of the parser, using the following key algorithms: an ambiguity determination algorithm and an avoidance method.
It should be noted that Chinese natural language grammar has only a small number of parts of speech (noun, verb, adjective, numeral, measure word, pronoun, adverb, preposition, conjunction, particle, interjection, and onomatopoeia, 12 in total), yet these must cover a large practical vocabulary (56,008 common words according to the Lexicon of Common Words in Contemporary Chinese (draft)). Syntax tree parsing can therefore produce a large number of ambiguities.
Ambiguity should be avoided as far as possible in a Chinese domain-specific language. In the embodiment, a predetermined ambiguity determination algorithm is used to determine whether the sentence to be parsed is ambiguous. Specifically, whether an input sentence is ambiguous can be determined as follows:
(1) During segmentation, starting from the left end of the target sentence and without limiting the maximum match length, search forward for all symbol patterns that can match; each match forms an initial segmentation scheme. For each initial segmentation scheme, scan progressively to the right; each newly matched symbol is combined with every corresponding existing scheme. Continue in this way until all of the text has been matched, yielding x segmentation schemes. If x > 1, the input sentence has lexical ambiguity.
(2) For each segmentation scheme, apply the top-down (descending) syntax tree parsing algorithm without restricting the match position, so that a higher-level node may be built from matching symbols at any position, until a unique root node is constructed. Across all segmentation schemes, y tree structures are formed in total (identical structures count as one). If y > 1, the input sentence has syntactic ambiguity.
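Step (1) above can be illustrated by enumerating every way a sentence can be covered by terminal patterns and counting the results. The Python sketch below is a simplified illustration; the terminal set (including a stand-alone "者" symbol) is a made-up example, and the exhaustive recursion is only suitable for short inputs.

```python
import re
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (matched text, symbol code)


def segmentations(text: str, terminals: Dict[str, str]) -> List[List[Token]]:
    """Enumerate all segmentation schemes, with no limit on match length."""
    compiled = {code: re.compile(pattern) for code, pattern in terminals.items()}

    def walk(start: int) -> List[List[Token]]:
        if start == len(text):
            return [[]]  # one way to segment the empty remainder
        schemes = []
        for end in range(start + 1, len(text) + 1):
            piece = text[start:end]
            for code, regex in compiled.items():
                if regex.fullmatch(piece):
                    for rest in walk(end):
                        schemes.append([(piece, code)] + rest)
        return schemes

    return walk(0)


def has_lexical_ambiguity(text: str, terminals: Dict[str, str]) -> bool:
    """x > 1 segmentation schemes means the sentence is lexically ambiguous."""
    return len(segmentations(text, terminals)) > 1


# Hypothetical terminal symbols: "者" on its own makes "或者" splittable.
AMBIGUOUS = {"OR": "或者|或", "SUFFIX": "者", "NAME": "[甲乙]"}
UNAMBIGUOUS = {"OR": "或者|或", "NAME": "[甲乙]"}
```

With the first terminal set, "甲或者乙" can be split as 甲/或者/乙 or 甲/或/者/乙, so x = 2; with the second, only one scheme exists. Step (2) would apply the same counting idea to the tree structures built from each scheme.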
In the case where the judgment result is yes, in order to evade ambiguity, the present invention is except using maximum in segmentation methods Ensure with principle and using outside descending manner syntax tree analytical algorithm, should also try one's best from the definition of grammer system, thus, this Inventive embodiments are evaded using predetermined workaround to the ambiguity that sentence to be resolved is present, and propose following principle:
(1) For the professional domain that the language targets, subdivide morphemes as finely as possible and define a richer set of symbol types, raising the ratio of symbol count to word count.
(2) If the pattern of symbol A contains the pattern of symbol B, the priority value of A should be set lower than that of B, so that A is tried first. This is the maximum matching length principle.
(3) If the pattern of symbol A contains alternatives, the longer alternatives should be placed first so that the longer pattern is matched preferentially. For example, for the OR symbol, which expresses the "or" relation, the pattern should be defined as "或者|或" wherever possible, and "或|或者" should be avoided, so that "或者" is not recognized as an OR symbol ("或") followed by a separately split-off "者". This principle is also consistent with the maximum matching length principle.
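Principle (3) matters in practice because many regular-expression engines, Python's `re` included, try alternatives from left to right and stop at the first one that matches. The sketch below assumes that the two spellings of the OR symbol are the Chinese words 或者 and 或; the helper `longest_first` is a hypothetical name introduced only for illustration.

```python
import re

# Assumption: the OR symbol can be written either as 或者 or as 或.
LONG_FIRST = re.compile("或者|或")   # longer alternative listed first
SHORT_FIRST = re.compile("或|或者")  # shorter alternative listed first

def longest_first(alternatives):
    """Build an alternation pattern with the longest alternatives first."""
    ordered = sorted(alternatives, key=len, reverse=True)
    return "|".join(re.escape(alt) for alt in ordered)

text = "甲或者乙"
print(LONG_FIRST.search(text).group())   # matches the two-character word
print(SHORT_FIRST.search(text).group())  # stops after the first character
print(re.search(longest_first(["或", "或者"]), text).group())
```

With the short alternative first, the stray "者" is left behind as an unmatched morpheme, which is exactly the failure the principle is meant to prevent.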
To parse the sentence more effectively, the embodiment of the present invention decomposes the sentence to be parsed into basic morphemes. Specifically, the sentence to be parsed is decomposed into basic morphemes using the longest match principle, where the longest match principle means matching as long a stretch of the sentence as possible.
The word segmentation algorithm decomposes a Chinese sentence into basic morphemes and determines the terminal symbol corresponding to each morpheme. This process is also called lexical analysis. Only the patterns of terminal symbols take part in the forward search and matching, because only terminal symbols can appear directly in a sentence. Simplified pseudocode for the segmentation algorithm is as follows.
Here, the "longest match principle" is the basic segmentation criterion used by the present invention. At each search position in the sentence, the above algorithm matches the longest possible text. The Lexer() function searches forward in the current target text for the patterns of Tsymbol (terminal symbol) objects. After a match has been attempted for every terminal symbol, the symbol with the longest match length is taken as the final symbol. If this principle fails to match all of the text in the sentence, a randomized lexical search is attempted to find the most probable segmentation. If neither approach yields a feasible segmentation, "input sentence is erroneous" is reported, indicating that the sentence does not conform to the currently defined grammar. Fig. 5 is an example of a word segmentation result according to an embodiment of the present invention.
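The longest-match segmentation described above can be sketched as follows. This is an illustrative Python sketch, not the patent's pseudocode: the function name `lexer` only echoes the Lexer() function named in the text, the terminal symbols are hypothetical, and the randomized fallback search is deliberately omitted.

```python
import re

# Hypothetical terminal symbols as (name, pattern) pairs, in priority order.
TERMINALS = [
    ("NUMBER", r"\d+"),
    ("MONTH", r"\d+月"),
    ("WORD", "执行|工作"),
]

def lexer(text, terminals):
    """Split `text` into (symbol, morpheme) pairs using longest match."""
    tokens, pos = [], 0
    while pos < len(text):
        best = None  # (symbol name, end position) of the longest match so far
        for name, pattern in terminals:
            m = re.compile(pattern).match(text, pos)
            # Keep only non-empty matches that are strictly longer than the best;
            # on a tie the earlier (higher-priority) symbol wins.
            if m and m.end() > pos and (best is None or m.end() > best[1]):
                best = (name, m.end())
        if best is None:
            raise ValueError(f"input sentence is erroneous at position {pos}")
        tokens.append((best[0], text[pos:best[1]]))
        pos = best[1]
    return tokens

print(lexer("3月执行工作", TERMINALS))
```

At position 0 both NUMBER and MONTH match, and MONTH wins because its match is longer.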
In view of the characteristics of Chinese domain languages, the embodiment of the present invention also considers a hybrid syntax tree parsing algorithm, which comprises two sub-algorithms: top-down (descending) and bottom-up (ascending). Parsing the part-of-speech-tagged basic morphemes into a syntax tree according to the grammar of the Chinese domain-specific language includes the following. Using the top-down syntax tree parsing algorithm, the tagged basic morphemes are parsed into a syntax tree, where the top-down syntax tree parsing algorithm is: at a predetermined morpheme position, search forward and match in turn; when the pattern being matched references other symbols, match those referenced symbols.
Specifically, in theory, for a grammar system whose lexicon, syntax, and symbol definitions are complete and reasonable (that is, no symbol refers to itself without limit, directly or indirectly), a top-down parsing process can always build the syntax tree for a correct sentence. Fig. 6 is a flowchart of the forward matching algorithm of top-down syntax tree parsing according to an embodiment of the present invention. The "forward match()" function in Fig. 6 is given a position among the current morphemes and searches forward in turn. This is a recursive process: when the pattern being matched references other symbols, the function calls itself to match the referenced symbols.
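The recursive forward matching described above can be sketched as a small recursive-descent matcher. The grammar, the symbol names, and the tuple-based tree shape are all assumptions made for illustration; the sketch also commits to the first alternative that succeeds rather than backtracking across alternatives, which the patent text does not specify.

```python
# Hypothetical grammar: non-terminal -> list of alternatives (symbol sequences).
GRAMMAR = {
    "SENTENCE": [["TIME", "ACTION"]],
    "TIME": [["YEAR", "MONTH"], ["MONTH"]],
    "ACTION": [["VERB", "NOUN"]],
}

def forward_match(symbol, tokens, pos, grammar):
    """Match `symbol` starting at tokens[pos]; return (tree, next_pos) or None."""
    if symbol not in grammar:  # terminal symbol: compare with the tagged morpheme
        if pos < len(tokens) and tokens[pos][0] == symbol:
            return (symbol, tokens[pos][1]), pos + 1
        return None
    for alternative in grammar[symbol]:
        children, cur = [], pos
        for ref in alternative:
            # The pattern references another symbol: recurse to match it.
            result = forward_match(ref, tokens, cur, grammar)
            if result is None:
                break
            child, cur = result
            children.append(child)
        else:
            return (symbol, children), cur
    return None

def parse_top_down(tokens, grammar, start="SENTENCE"):
    result = forward_match(start, tokens, 0, grammar)
    if result is None or result[1] != len(tokens):
        raise ValueError("input sentence is erroneous")
    return result[0]

tokens = [("YEAR", "2017年"), ("MONTH", "3月"), ("VERB", "执行"), ("NOUN", "工作一")]
print(parse_top_down(tokens, GRAMMAR))
```

Because no symbol in this grammar refers to itself, the recursion is guaranteed to terminate, in line with the completeness condition stated in the text.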
Although the top-down algorithm is effective and complete, it does not always deliver the best performance. When the symbol system contains a large number of symbols whose patterns reference one another, the search complexity tends toward O(n^d), where n is the number of symbols and d is the maximum reference depth. Therefore, for a large domain-specific language with many symbols, a bottom-up parsing algorithm should be considered in order to obtain better performance.
Using the bottom-up syntax tree parsing algorithm, the part-of-speech-tagged basic morphemes are parsed into a syntax tree, where the bottom-up syntax tree parsing algorithm is: build the parent nodes of the basic morphemes obtained by decomposing the sentence to be parsed, then build the parents of those parent nodes in the same way, until a unique root node is produced. Alternatively, the tagged basic morphemes are parsed into a syntax tree using a combination of the top-down and bottom-up syntax tree parsing algorithms.
For a given domain-specific language, the bottom-up syntax tree parsing algorithm starts from the morphemes produced by word segmentation and attempts to build their parent nodes. By applying assertion rules, the algorithm greatly reduces the search space. According to symbol priority (or frequency statistics from historical data), the most common combinations of terminal symbols are separated out first and their parent nodes are built.
As an optional embodiment, bottom-up syntax tree parsing need not be restricted to the current position: as soon as the pattern currently being tried matches at any position in the target sentence, the match can be separated out immediately and a higher-level node built. The unmatched morphemes of the target sentence continue to be tried until all leaf nodes have been matched, forming a complete second level of nodes (one level above the leaf nodes). The same bottom-up matching is then applied to the second-level nodes, and so on, until a unique root node is formed. If a unique root node cannot be formed, an input sentence error is reported.
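The bottom-up procedure described above can be sketched as repeated reduction of adjacent nodes. The rule list, its priority order, and the symbol names are hypothetical; the sketch reduces greedily without backtracking and does not model the assertion rules or frequency statistics mentioned in the text.

```python
# Hypothetical rules as (parent, child symbols), highest priority first.
RULES = [
    ("TIME", ["YEAR", "MONTH"]),
    ("ACTION", ["VERB", "NOUN"]),
    ("SENTENCE", ["TIME", "ACTION"]),
]

def parse_bottom_up(tokens, rules):
    """Repeatedly replace matching runs of nodes with their parent node."""
    nodes = list(tokens)  # leaf nodes: (symbol, morpheme)
    reduced = True
    while len(nodes) > 1 and reduced:
        reduced = False
        for parent, pattern in rules:
            n = len(pattern)
            # A pattern may match at any position, not only at the current one.
            for i in range(len(nodes) - n + 1):
                if [node[0] for node in nodes[i:i + n]] == pattern:
                    nodes[i:i + n] = [(parent, nodes[i:i + n])]
                    reduced = True
                    break
            if reduced:
                break  # restart from the highest-priority rule
    if len(nodes) != 1:
        raise ValueError("input sentence is erroneous")  # no unique root node
    return nodes[0]

tokens = [("YEAR", "2017年"), ("MONTH", "3月"), ("VERB", "执行"), ("NOUN", "工作一")]
print(parse_bottom_up(tokens, RULES))
```

Each reduction shortens the node list, so the loop ends either with a single root node or with a list that no rule can reduce further.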
Combining the top-down and bottom-up algorithms described above yields the hybrid parsing algorithm, whose process is as follows:
(1) In the input sentence, first perform a bottom-up search for the symbol patterns that have high priority and occur frequently. After each successful match, immediately build the local syntax tree, forming a subtree.
(2) Whenever a new subtree has been built successfully, its root node, together with the rest of the sentence, continues through bottom-up parsing until the high-frequency patterns in the sentence have all been processed.
(3) The remainder of the sentence, together with the root nodes of the subtrees already built, is parsed top-down until the syntax tree of the whole sentence has been constructed.
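The three steps above can be sketched by running a bottom-up pass over a small set of high-frequency rules and then handing the resulting nodes to a top-down matcher. Everything named here (the rule set, the grammar, the function names) is a hypothetical illustration; which patterns count as high-frequency would in practice come from the grammar's priorities or usage statistics.

```python
# Hypothetical split: frequent patterns reduced bottom-up, the rest matched top-down.
HIGH_FREQ_RULES = [("TIME", ["YEAR", "MONTH"])]
GRAMMAR = {"SENTENCE": [["TIME", "ACTION"]], "ACTION": [["VERB", "NOUN"]]}

def reduce_high_frequency(nodes, rules):
    """Steps (1) and (2): build subtrees for high-frequency patterns first."""
    changed = True
    while changed:
        changed = False
        for parent, pattern in rules:
            n = len(pattern)
            for i in range(len(nodes) - n + 1):
                if [node[0] for node in nodes[i:i + n]] == pattern:
                    nodes = nodes[:i] + [(parent, nodes[i:i + n])] + nodes[i + n:]
                    changed = True
                    break
            if changed:
                break
    return nodes

def forward_match(symbol, nodes, pos, grammar):
    """Step (3): top-down matching over leaves and already-built subtree roots."""
    if pos < len(nodes) and nodes[pos][0] == symbol:
        return nodes[pos], pos + 1  # a leaf or a pre-built subtree matches directly
    for alternative in grammar.get(symbol, []):
        children, cur = [], pos
        for ref in alternative:
            result = forward_match(ref, nodes, cur, grammar)
            if result is None:
                break
            child, cur = result
            children.append(child)
        else:
            return (symbol, children), cur
    return None

def parse_hybrid(tokens, rules, grammar, start="SENTENCE"):
    nodes = reduce_high_frequency(list(tokens), rules)
    result = forward_match(start, nodes, 0, grammar)
    if result is None or result[1] != len(nodes):
        raise ValueError("input sentence is erroneous")
    return result[0]

tokens = [("YEAR", "2017年"), ("MONTH", "3月"), ("VERB", "执行"), ("NOUN", "工作一")]
print(parse_hybrid(tokens, HIGH_FREQ_RULES, GRAMMAR))
```

The top-down pass never has to expand TIME here, because the bottom-up pass has already supplied it as a finished subtree; that saved search is the source of the performance gain.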
The hybrid syntax tree parsing algorithm combines the characteristics of the bottom-up and top-down algorithms, ensuring efficient parsing while retaining completeness. The performance gain of the hybrid algorithm depends on the syntactic properties of the language itself: the fewer the kinds of commonly used high-frequency symbols and the larger the share of words they account for, the easier it is to obtain a performance gain.
Before the sentence to be parsed is decomposed into basic morphemes, the following may also be applied: a predetermined elliptical-sentence inference algorithm is used to perform inference on the sentence to be parsed and restore it to a sentence with complete information, where the predetermined elliptical-sentence inference algorithm includes at least one of the following: a preceding-context inference algorithm, which supplements the elliptical sentence according to earlier basic morphemes; a time inference algorithm, which computes the time from basic morphemes that refer to a time; and a business object inference algorithm, which locates basic morphemes whose information is not fully specified.
Chinese domain-specific languages also allow elliptical sentences. The embodiment of the present invention uses simple inference algorithms to process elliptical sentences and restore them to sentences with complete information. Elliptical-sentence inference is divided into three kinds: preceding-context inference, time inference, and business object inference.
Preceding-context inference: the elliptical sentence is supplemented according to previously mentioned morphemes. For example, in the sentences "Perform task one in March 2017. Perform task two in April.", "April" is elliptical: it lacks year information and cannot be used in business calculations. Searching backward from the elliptical sentence for a sentence that contains a time yields 2017, so "April" is expanded to "April 2017".
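Preceding-context inference for the year can be sketched as below. The sketch assumes Chinese-style time expressions such as 2017年3月 and handles only one bare month per sentence; the function name and the regular expressions are illustrative assumptions.

```python
import re

def fill_missing_year(sentences):
    """Give a month-only time expression the year last seen in earlier sentences."""
    last_year = None
    completed = []
    for sentence in sentences:
        year = re.search(r"(\d{4})年", sentence)
        if year:
            last_year = year.group(1)  # remember the most recent explicit year
        elif last_year is not None:
            # Prefix the first bare month with the remembered year.
            sentence = re.sub(
                r"(\d{1,2}月)",
                lambda m: f"{last_year}年{m.group(1)}",
                sentence,
                count=1,
            )
        completed.append(sentence)
    return completed

print(fill_missing_year(["2017年3月执行工作一", "4月执行工作二"]))
```

A sentence with no earlier year to borrow from is left unchanged, so a later stage can still report it as incomplete.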
Time inference: morphemes that refer to a time, such as "tomorrow", "next month", "the 1st", and "March", are resolved by calculation from the current time. For example, "tomorrow" means the current date plus 1 day. In the absence of contextual information, "the 1st" generally refers to the 1st that is nearest to the current date. Other expressions are handled by analogy.
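Time inference can be sketched with Python's `datetime` module. The English keys stand in for the corresponding morphemes, and two interpretations are assumptions of this sketch rather than statements from the text: "next month" is resolved to the first day of the following month, and "nearest" is taken to mean nearest in either direction, past or future.

```python
from datetime import date, timedelta

def nearest_day_of_month(day, today):
    """Return the date with the given day-of-month that is closest to `today`."""
    candidates = []
    for offset in (-1, 0, 1):  # previous, current, and next month
        month_index = today.year * 12 + (today.month - 1) + offset
        year, month0 = divmod(month_index, 12)
        try:
            candidates.append(date(year, month0 + 1, day))
        except ValueError:  # e.g. the 31st in a 30-day month
            pass
    return min(candidates, key=lambda d: abs((d - today).days))

def infer_time(word, today):
    """Resolve a relative time morpheme against the current date."""
    if word == "tomorrow":
        return today + timedelta(days=1)
    if word == "next month":
        if today.month == 12:
            return date(today.year + 1, 1, 1)
        return date(today.year, today.month + 1, 1)
    if word == "the 1st":
        return nearest_day_of_month(1, today)
    raise ValueError(f"unsupported time morpheme: {word}")

today = date(2017, 3, 20)  # hypothetical current date
print(infer_time("tomorrow", today), infer_time("the 1st", today))
```

Passing `today` explicitly, instead of calling `date.today()` inside the function, keeps the inference reproducible and easy to test.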
Business object inference: morphemes whose information is not fully specified are located by searching the business data of the application platform. The method traverses the data tables and every record to find the path of the target morpheme in the business data. For example, in a teaching coursework hosting system, given the sentence "Submit 《Exercise industry on-site investigation》 to Zhang San", searching each business data table shows that 《Exercise industry on-site investigation》 has a record in the paper data table as the title of a paper by the student Li Si, and that Zhang San is a teacher in the teacher information table. The sentence can therefore be expanded to "Submit Li Si's paper 《Exercise industry on-site investigation》 to teacher Zhang San". In this way, the necessary modifiers are added to morphemes that lack information, and the elliptical sentence is resolved.
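The table traversal described above can be sketched with in-memory data. The table names, column names, and records below are hypothetical stand-ins for the application's business data, and `describe` hard-codes just the two cases from the example.

```python
# Hypothetical business data: table name -> list of records.
TABLES = {
    "papers": [{"title": "Exercise industry on-site investigation", "student": "Li Si"}],
    "teachers": [{"name": "Zhang San"}],
}

def locate(value, tables):
    """Traverse every table and record; return the path of the first hit."""
    for table, records in tables.items():
        for record in records:
            for column, cell in record.items():
                if cell == value:
                    return table, column, record
    return None

def describe(value, tables):
    """Add the modifiers implied by where `value` was found."""
    hit = locate(value, tables)
    if hit is None:
        return value  # nothing known: leave the morpheme unchanged
    table, column, record = hit
    if table == "papers" and column == "title":
        return f"{record['student']}'s paper 《{value}》"
    if table == "teachers" and column == "name":
        return f"teacher {value}"
    return value

title = "Exercise industry on-site investigation"
print(f"Submit {describe(title, TABLES)} to {describe('Zhang San', TABLES)}")
```

A production system would query indexed database tables rather than scan every record, but the path it returns (table, column, record) plays the same role.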
After the part-of-speech-tagged basic morphemes have been parsed into a syntax tree according to the grammar of the Chinese domain-specific language, the method further includes: each leaf node on the syntax tree passes its content to its parent node; the parent node processes (computes on) the content passed by all of its leaf nodes to obtain the content of the parent node; the above passing and processing operations are performed in turn up to the root node, and the content of the root node is taken as the result value of the syntax tree, where the result value is used to call an application programming interface (API) and so realize the user's intention. This process is the execution of the syntax tree, as follows:
1) The content of each leaf node (computed during part-of-speech tagging) is passed to its parent node as a parameter.
2) The parent node receives the contents passed by all of its child nodes as the parameters of its preset function, executes the function, and updates its own content with the function's return value.
3) The same steps are repeated until the root node has been computed; the content of the root node is the result value of the syntax tree.
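Steps 1) to 3) amount to a post-order evaluation of the tree. The sketch below assumes that a leaf is a `(symbol, content)` pair, that an inner node is a `(symbol, [children])` pair, and that each inner symbol has a preset function in a lookup table; the symbols, functions, and sample values are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical preset functions, one per non-leaf symbol.
FUNCTIONS = {
    "OFFSET": lambda minutes: timedelta(minutes=minutes),
    "MOMENT": lambda base, offset: base + offset,
}

def execute(node, functions):
    """Evaluate a syntax tree from the leaves up to the root."""
    symbol, payload = node
    if not isinstance(payload, list):
        return payload  # leaf: content was computed during part-of-speech tagging
    # Children are evaluated from left to right, and their contents become
    # the parameters of the parent's preset function.
    args = [execute(child, functions) for child in payload]
    return functions[symbol](*args)

# "3 minutes after <event time>", with a made-up event time as leaf content.
tree = ("MOMENT", [
    ("EVENT_TIME", datetime(2017, 4, 25, 8, 0)),
    ("OFFSET", [("NUMBER", 3)]),
])
print(execute(tree, FUNCTIONS))
```

In a real system the root's result value would then be handed to an application API call; here it is simply printed.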
In general, the computation of each node interacts with the application's business data, and every computation may affect that data. An appropriate execution order should therefore be chosen. Depending on the nature of the parent node, either the left-first principle (nodes are computed from left to right) or the right-first principle can be used. Because modifiers in modern Chinese (attributives and adverbials) generally precede the word they modify, the left-first principle matches the conventions of most Chinese domain-specific languages and is therefore the preferred option of the algorithm of the present invention.
For ease of display, the syntax tree is represented in the data table as a prefix expression in text form. When a specific test sentence is selected, for example "3 minutes after TIN at Xiamen station on orbit 27: experiment arrangement 3-008" (where the orbit number is the count of revolutions the spacecraft has flown around the Earth, and TIN is an aerospace technical term meaning the moment at which a tracking station begins tracking a spacecraft), the software parses the sentence and generates a syntax tree, and the "Chinese Domain-Specific Language Manager" software then draws the syntax tree parsing result for the Chinese sentence.
The embodiment of the present invention combines programming-language compiler (interpreter) technology with natural language processing technology to build a general hybrid technical framework for describing and parsing Chinese domain-specific languages. The framework allows the grammar to be defined flexibly by means such as regular expressions and built-in functions, and can automatically recognize, parse, and execute target sentences according to that grammar. In coordination with the built-in functions of the business application system, it completes customized operations on business data and allows the business to be extended flexibly on demand. Specifically, Fig. 7 is a framework diagram of the definition and parsing of a Chinese domain-specific language according to an embodiment of the present invention. As shown in Fig. 7, it includes the technical framework for defining and parsing Chinese domain-specific languages and the application business system. Within the technical framework, a target sentence is obtained, Chinese domain language parsing is performed on the basis of the grammar description, and the sentence is then executed. The technical framework connects to the application business system through sentence execution; specifically, business processing operations are carried out with the business data and business logic.
Fig. 8 is a flowchart of the parsing process of a Chinese domain-specific language according to an embodiment of the present invention. Specifically, Chinese domain language parsing follows the process shown in Fig. 8: the target sentence passes through preprocessing, word segmentation, part-of-speech tagging, and syntax tree parsing to generate a syntax tree. Word segmentation uses the segmentation training results in the grammar system, and syntax tree parsing uses the syntax parsing training results. In addition, word segmentation and part-of-speech tagging require the dictionary and the symbol table.
The above embodiments of the present invention fully support the definition and parsing of domain-specific languages in Chinese (and in any other language). With a reasonable grammar design, they can even automatically recognize and process natural-language text that follows certain regularities, such as financial news and sports news. The hybrid description of lexicon and syntax, based on regular expressions and discriminant functions, is more flexible than extended BNF. The grammar of a domain-specific language designed using the present invention can be modified and extended dynamically and takes effect without code generation, compilation, or release of a language interpreter.
According to another aspect of the embodiments of the present invention, a sentence parsing device is also provided. Fig. 9 is a schematic diagram of the sentence parsing device according to an embodiment of the present invention; as shown in Fig. 9, it includes an acquisition module 91 and a parsing module 93. The device is described below.
Acquisition module 91, configured to obtain the sentence to be parsed.
Parsing module 93, configured to parse the sentence to be parsed according to the grammar of a Chinese domain-specific language, where the sentence to be parsed and the Chinese domain-specific language are both described in Chinese.
Optionally, the parsing module includes: a word segmentation unit, configured to decompose the sentence to be parsed into basic morphemes; a tagging unit, configured to tag the decomposed basic morphemes with parts of speech; and a parsing unit, configured to parse the part-of-speech-tagged basic morphemes into a syntax tree according to the grammar of the Chinese domain-specific language.
Optionally, the parsing module further includes: a judging unit, configured to judge whether the sentence to be parsed is ambiguous using a predetermined ambiguity judgment algorithm; and an avoidance unit, configured to avoid the ambiguity present in the sentence to be parsed using a predetermined avoidance method when the judgment result is yes.
Optionally, the word segmentation unit includes: a decomposition subunit, configured to decompose the sentence to be parsed into basic morphemes using the longest match principle, where the longest match principle means matching as long a stretch of the sentence as possible.
Optionally, the parsing unit includes one of the following: a first parsing subunit, configured to parse the part-of-speech-tagged basic morphemes into a syntax tree using the top-down syntax tree parsing algorithm, where the top-down syntax tree parsing algorithm is: at a predetermined morpheme position, search forward and match in turn, and when the pattern being matched references other symbols, match those referenced symbols; a second parsing subunit, configured to parse the part-of-speech-tagged basic morphemes into a syntax tree using the bottom-up syntax tree parsing algorithm, where the bottom-up syntax tree parsing algorithm is: build the parent nodes of the basic morphemes obtained by decomposing the sentence to be parsed, then build the parents of those parent nodes in the same way, until a unique root node is produced; and a third parsing subunit, configured to parse the part-of-speech-tagged basic morphemes into a syntax tree using a combination of the top-down and bottom-up syntax tree parsing algorithms.
Optionally, the parsing module further includes: an inference unit, configured to perform inference on the sentence to be parsed using a predetermined elliptical-sentence inference algorithm and restore the sentence to be parsed to a sentence with complete information, where the predetermined elliptical-sentence inference algorithm includes at least one of the following: a preceding-context inference algorithm, which supplements the elliptical sentence according to earlier basic morphemes; a time inference algorithm, which computes the time from basic morphemes that refer to a time; and a business object inference algorithm, which locates basic morphemes whose information is not fully specified.
Optionally, the parsing module further includes: a transfer unit, configured so that each leaf node on the syntax tree passes its content to its parent node; a processing unit, configured so that the parent node processes the content passed by all of its leaf nodes to obtain the content of the parent node; and an execution module, configured to perform the above passing and processing operations in turn up to the root node, taking the content of the root node as the result value of the syntax tree.
According to another aspect of the embodiments of the present invention, a storage medium is also provided. The storage medium includes a stored program, where, when the program runs, the device on which the storage medium resides is controlled to perform the following operations: obtaining a sentence to be parsed; and parsing the sentence to be parsed according to the grammar of a Chinese domain-specific language, where the sentence to be parsed and the Chinese domain-specific language are both described in Chinese.
According to another aspect of the embodiments of the present invention, a processor is also provided. The processor is configured to run a program, where the following operations are performed when the program runs: obtaining a sentence to be parsed; and parsing the sentence to be parsed according to the grammar of a Chinese domain-specific language, where the sentence to be parsed and the Chinese domain-specific language are both described in Chinese.
The above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed technical content may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units may be a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection between units or modules through some interfaces, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and is sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
The above are only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present invention, and these improvements and refinements should also be regarded as falling within the scope of protection of the present invention.

Claims (17)

1. A sentence parsing method, characterized by comprising:
Obtaining a sentence to be parsed;
Parsing the sentence to be parsed according to the grammar of a Chinese domain-specific language, wherein the sentence to be parsed and the Chinese domain-specific language are both described in Chinese.
2. The method according to claim 1, characterized in that
The grammar is described using dynamically changeable data; the grammar includes symbols for describing the types of morphemes of the Chinese domain-specific language, and further includes a dictionary for supplementing the symbols.
3. The method according to claim 1, characterized in that parsing the sentence to be parsed according to the grammar of the Chinese domain-specific language comprises:
Decomposing the sentence to be parsed into basic morphemes;
Tagging the decomposed basic morphemes with parts of speech;
Parsing the part-of-speech-tagged basic morphemes into a syntax tree according to the grammar of the Chinese domain-specific language.
4. The method according to claim 3, characterized in that, before the sentence to be parsed is decomposed into the basic morphemes, the method further comprises:
Judging whether the sentence to be parsed is ambiguous using a predetermined ambiguity judgment algorithm;
When the judgment result is yes, avoiding the ambiguity present in the sentence to be parsed using a predetermined avoidance method.
5. The method according to claim 3, characterized in that decomposing the sentence to be parsed into the basic morphemes comprises:
Decomposing the sentence to be parsed into the basic morphemes using the longest match principle, wherein the longest match principle means matching as long a stretch of the sentence as possible.
6. The method according to claim 3, characterized in that parsing the part-of-speech-tagged basic morphemes into the syntax tree according to the grammar of the Chinese domain-specific language comprises one of the following:
Parsing the part-of-speech-tagged basic morphemes into the syntax tree using a top-down syntax tree parsing algorithm, wherein the top-down syntax tree parsing algorithm is: at a predetermined morpheme position, search forward and match in turn; when the matched morpheme references other symbols beyond the symbol referenced by the morpheme, match the other symbols;
Parsing the part-of-speech-tagged basic morphemes into the syntax tree using a bottom-up syntax tree parsing algorithm, wherein the bottom-up syntax tree parsing algorithm is: build the parent nodes of the basic morphemes obtained by decomposing the sentence to be parsed, then build the parents of those parent nodes in the same way, until a unique root node is produced;
Parsing the part-of-speech-tagged basic morphemes into the syntax tree using a combination of the top-down syntax tree parsing algorithm and the bottom-up syntax tree parsing algorithm.
7. The method according to claim 3, characterized in that, before the sentence to be parsed is decomposed into the basic morphemes, the method further comprises:
Performing inference on the sentence to be parsed using a predetermined elliptical-sentence inference algorithm, and restoring the sentence to be parsed to a sentence with complete information, wherein the predetermined elliptical-sentence inference algorithm includes at least one of the following: a preceding-context inference algorithm, which supplements the elliptical sentence according to earlier basic morphemes; a time inference algorithm, which computes the time from basic morphemes that refer to a time; and a business object inference algorithm, which locates basic morphemes whose information is not fully specified.
8. The method according to any one of claims 3 to 7, characterized in that, after the part-of-speech-tagged basic morphemes are parsed into the syntax tree according to the grammar of the Chinese domain-specific language, the method further comprises:
Each leaf node on the syntax tree passing the content of the leaf node to the parent node of the leaf node;
The parent node processing the content passed by all of the leaf nodes it contains, to obtain the content of the parent node;
Performing the above passing and processing operations in turn up to the root node, and taking the content of the root node as the result value of the syntax tree, wherein the result value is used to call an application programming interface.
9. A sentence parsing device, characterized by comprising:
An acquisition module, configured to obtain a sentence to be parsed;
A parsing module, configured to parse the sentence to be parsed according to the grammar of a Chinese domain-specific language, wherein the sentence to be parsed and the Chinese domain-specific language are both described in Chinese.
10. The device according to claim 9, characterized in that the parsing module comprises:
A word segmentation unit, configured to decompose the sentence to be parsed into basic morphemes;
A tagging unit, configured to tag the decomposed basic morphemes with parts of speech;
A parsing unit, configured to parse the part-of-speech-tagged basic morphemes into a syntax tree according to the grammar of the Chinese domain-specific language.
11. The device according to claim 10, characterized in that the parsing module further comprises:
A judging unit, configured to judge whether the sentence to be parsed is ambiguous using a predetermined ambiguity judgment algorithm;
An avoidance unit, configured to avoid the ambiguity present in the sentence to be parsed using a predetermined avoidance method when the judgment result is yes.
12. The device according to claim 10, characterized in that the word segmentation unit comprises:
A decomposition subunit, configured to decompose the sentence to be parsed into the basic morphemes using the longest match principle, wherein the longest match principle means matching as long a stretch of the sentence as possible.
13. The device according to claim 10, characterized in that the parsing unit comprises one of the following:
A first parsing subunit, configured to parse the part-of-speech-tagged basic morphemes into the syntax tree using a top-down syntax tree parsing algorithm, wherein the top-down syntax tree parsing algorithm is: at a predetermined morpheme position, search forward and match in turn; when the matched morpheme references other symbols beyond the symbol referenced by the morpheme, match the other symbols;
A second parsing subunit, configured to parse the part-of-speech-tagged basic morphemes into the syntax tree using a bottom-up syntax tree parsing algorithm, wherein the bottom-up syntax tree parsing algorithm is: build the parent nodes of the basic morphemes obtained by decomposing the sentence to be parsed, then build the parents of those parent nodes in the same way, until a unique root node is produced;
A third parsing subunit, configured to parse the part-of-speech-tagged basic morphemes into the syntax tree using a combination of the top-down syntax tree parsing algorithm and the bottom-up syntax tree parsing algorithm.
14. The device according to claim 10, characterized in that the parsing module further comprises:
An inference unit, configured to use a predetermined ellipsis inference algorithm to perform inference on the sentence to be parsed and restore the sentence to be parsed to a sentence with complete information, wherein the predetermined ellipsis inference algorithm comprises at least one of the following: a preceding-context inference algorithm, which supplements the omitted content according to the preceding basic morphemes; a time inference algorithm, which calculates a time according to the basic morpheme of a referenced time; and a business object inference algorithm, which locates the object for a basic morpheme that does not specify complete information.
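Of the three inference algorithms named in claim 14, time inference is the easiest to sketch. The example below is only an assumption about how such a step might look: the table `RELATIVE_DAYS`, the function `infer_time`, and the choice of ISO date strings as output are hypothetical and do not come from the patent.

```python
from datetime import date, timedelta

# Hypothetical table of relative time morphemes and their day offsets.
RELATIVE_DAYS = {"昨天": -1, "今天": 0, "明天": 1}


def infer_time(morphemes, reference):
    """Replace each relative time morpheme with an absolute ISO date
    computed from `reference`; other morphemes pass through unchanged."""
    resolved = []
    for morpheme in morphemes:
        if morpheme in RELATIVE_DAYS:
            absolute = reference + timedelta(days=RELATIVE_DAYS[morpheme])
            resolved.append(absolute.isoformat())
        else:
            resolved.append(morpheme)
    return resolved


print(infer_time(["查询", "明天", "参数"], date(2017, 4, 25)))
```

The other two algorithms (supplementing omitted content from preceding morphemes and locating an incompletely specified business object) would need access to the dialogue history and a domain object catalogue, which this sketch leaves out.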
15. The device according to any one of claims 10 to 14, characterized in that the parsing module further comprises:
A transfer unit, configured to, for a leaf node on the syntax tree, pass the content of the leaf node to the parent node of the leaf node;
A processing unit, configured for the parent node to process the content passed by all the leaf nodes it contains, obtaining the content of the parent node;
An execution module, configured to perform the above transfer and processing operations in sequence up to the root node, taking the content of the root node as the final value of the syntax tree, wherein the final value is used to execute an application program interface.
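The transfer and processing steps in claim 15 amount to a post-order evaluation of the syntax tree. The following is a minimal sketch, assuming a hypothetical `Node` class and a hypothetical `HANDLERS` table that tells each kind of parent node how to combine what its children pass up; the patent does not specify either.

```python
class Node:
    """A syntax tree node: leaves carry content, inner nodes carry children."""

    def __init__(self, label, content=None, children=()):
        self.label = label
        self.content = content
        self.children = list(children)


def evaluate(node, handlers):
    """Pass leaf content upward and let each parent process what it receives,
    repeating until the root; the root's content is the final value."""
    if not node.children:
        return node.content
    passed_up = [evaluate(child, handlers) for child in node.children]
    return handlers[node.label](passed_up)


# Hypothetical per-label processing rules.
HANDLERS = {
    "object": lambda parts: "".join(parts),
    "command": lambda parts: {"action": parts[0], "target": parts[1]},
}

tree = Node("command", children=[
    Node("verb", "查询"),
    Node("object", children=[Node("noun", "卫星轨道"), Node("noun", "参数")]),
])

final_value = evaluate(tree, HANDLERS)
print(final_value)
```

In a full system, `final_value` would then be handed to whatever application program interface carries out the request; that call is omitted here because the patent text in this section does not name one.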
16. A storage medium, characterized in that the storage medium comprises a stored program, wherein, when the program runs, the device on which the storage medium resides is controlled to perform the following operations:
Obtaining a sentence to be parsed;
Parsing the sentence to be parsed according to the grammar of a Chinese domain-specific language, wherein the sentence to be parsed and the Chinese domain-specific language are both described in Chinese.
17. A processor, characterized in that the processor is configured to run a program, wherein the following operations are performed when the program runs:
Obtaining a sentence to be parsed;
Parsing the sentence to be parsed according to the grammar of a Chinese domain-specific language, wherein the sentence to be parsed and the Chinese domain-specific language are both described in Chinese.
CN201710276537.4A 2017-04-25 2017-04-25 Sentence analytic method and sentence resolver Pending CN107247613A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710276537.4A CN107247613A (en) 2017-04-25 2017-04-25 Sentence analytic method and sentence resolver

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710276537.4A CN107247613A (en) 2017-04-25 2017-04-25 Sentence analytic method and sentence resolver

Publications (1)

Publication Number Publication Date
CN107247613A true CN107247613A (en) 2017-10-13

Family

ID=60016573

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710276537.4A Pending CN107247613A (en) 2017-04-25 2017-04-25 Sentence analytic method and sentence resolver

Country Status (1)

Country Link
CN (1) CN107247613A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108874917A (en) * 2018-05-30 2018-11-23 北京五八信息技术有限公司 Intension recognizing method, device, equipment and storage medium
CN109298857A (en) * 2018-10-09 2019-02-01 杭州朗和科技有限公司 Method for building up, medium, device and the calculating equipment of DSL statement model
CN109558590A (en) * 2018-11-23 2019-04-02 中国人民解放军63789部队 A kind of critical failure device localization method based on spacecraft telemetry parameter participle
CN109841210A (en) * 2017-11-27 2019-06-04 西安中兴新软件有限责任公司 A kind of Intelligent control implementation method and device, computer readable storage medium
CN111178052A (en) * 2019-12-20 2020-05-19 中国建设银行股份有限公司 Method and device for constructing robot process automation application
CN112380848A (en) * 2020-11-19 2021-02-19 平安科技(深圳)有限公司 Text generation method, device, equipment and storage medium
CN112579093A (en) * 2020-12-11 2021-03-30 杭州安恒信息技术股份有限公司 Information pushing method and device and related equipment
CN118132375A (en) * 2024-03-01 2024-06-04 北京开运联合信息技术集团股份有限公司 Novel intelligent space survey operation and control language monitoring system

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110119047A1 (en) * 2009-11-19 2011-05-19 Tatu Ylonen Oy Ltd Joint disambiguation of the meaning of a natural language expression
CN103365834A (en) * 2012-03-29 2013-10-23 富泰华工业(深圳)有限公司 System and method for eliminating language ambiguity
US20140059417A1 (en) * 2012-08-23 2014-02-27 International Business Machines Corporation Logical contingency analysis for domain-specific languages
CN103902521A (en) * 2012-12-24 2014-07-02 高德软件有限公司 Chinese statement identification method and device
CN104050151A (en) * 2014-06-05 2014-09-17 北京江南天安科技有限公司 Security incident feature analysis method and system based on predicate deduction
US20150142443A1 (en) * 2012-10-31 2015-05-21 SK PLANET CO., LTD. a corporation Syntax parsing apparatus based on syntax preprocessing and method thereof
US20160132304A1 (en) * 2014-11-12 2016-05-12 International Business Machines Corporation Contraction aware parsing system for domain-specific languages
CN105701253A (en) * 2016-03-04 2016-06-22 南京大学 Chinese natural language interrogative sentence semantization knowledge base automatic question-answering method
CN106095398A (en) * 2016-05-10 2016-11-09 深圳前海信息技术有限公司 Big data mining application process based on DSL and device
CN106202010A (en) * 2016-07-12 2016-12-07 重庆兆光科技股份有限公司 The method and apparatus building Law Text syntax tree based on deep neural network
CN106227719A (en) * 2016-07-26 2016-12-14 北京智能管家科技有限公司 Chinese word segmentation disambiguation method and system
CN106250104A (en) * 2015-06-09 2016-12-21 阿里巴巴集团控股有限公司 A kind of remote operating system for server, method and device
CN106411626A (en) * 2015-08-03 2017-02-15 中兴通讯股份有限公司 Test method and device based on DSL network element simulator
CN106446163A (en) * 2016-09-26 2017-02-22 福建省知识产权信息公共服务中心 Retrieval method based on advanced assertion decision algorithm and LL recursive descent method

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109841210A (en) * 2017-11-27 2019-06-04 西安中兴新软件有限责任公司 A kind of Intelligent control implementation method and device, computer readable storage medium
CN109841210B (en) * 2017-11-27 2024-02-20 西安中兴新软件有限责任公司 Intelligent control implementation method and device and computer readable storage medium
CN108874917A (en) * 2018-05-30 2018-11-23 北京五八信息技术有限公司 Intension recognizing method, device, equipment and storage medium
CN109298857A (en) * 2018-10-09 2019-02-01 杭州朗和科技有限公司 Method for building up, medium, device and the calculating equipment of DSL statement model
CN109558590A (en) * 2018-11-23 2019-04-02 中国人民解放军63789部队 A kind of critical failure device localization method based on spacecraft telemetry parameter participle
CN109558590B (en) * 2018-11-23 2022-11-15 中国人民解放军63789部队 Method for positioning key fault device based on spacecraft remote measurement parameter word segmentation
CN111178052A (en) * 2019-12-20 2020-05-19 中国建设银行股份有限公司 Method and device for constructing robot process automation application
CN112380848A (en) * 2020-11-19 2021-02-19 平安科技(深圳)有限公司 Text generation method, device, equipment and storage medium
CN112380848B (en) * 2020-11-19 2022-04-26 平安科技(深圳)有限公司 Text generation method, device, equipment and storage medium
CN112579093A (en) * 2020-12-11 2021-03-30 杭州安恒信息技术股份有限公司 Information pushing method and device and related equipment
CN118132375A (en) * 2024-03-01 2024-06-04 北京开运联合信息技术集团股份有限公司 Novel intelligent space survey operation and control language monitoring system

Similar Documents

Publication Publication Date Title
CN107247613A (en) Sentence analytic method and sentence resolver
CN108363790B (en) Method, device, equipment and storage medium for evaluating comments
CN108304468B (en) Text classification method and text classification device
US10120861B2 (en) Hybrid classifier for assigning natural language processing (NLP) inputs to domains in real-time
CN111611810B (en) Multi-tone word pronunciation disambiguation device and method
CN107818164A (en) A kind of intelligent answer method and its system
CN101261623A (en) Word splitting method and device for word border-free mark language based on search
US20030046078A1 (en) Supervised automatic text generation based on word classes for language modeling
CN103324621B (en) A kind of Thai text spelling correcting method and device
WO2020233386A1 (en) Intelligent question-answering method and device employing aiml, computer apparatus, and storage medium
CN103314369B (en) Machine translation apparatus and method
CN113704416B (en) Word sense disambiguation method and device, electronic equipment and computer-readable storage medium
CN109614620B (en) HowNet-based graph model word sense disambiguation method and system
US20200311345A1 (en) System and method for language-independent contextual embedding
US20180341646A1 (en) Translated-clause generating method, translated-clause generating apparatus, and recording medium
KR100481580B1 (en) Apparatus for extracting event sentences in documents and method thereof
CN111444704A (en) Network security keyword extraction method based on deep neural network
Grif et al. Development of computer sign language translation technology for deaf people
Araujo Part-of-speech tagging with evolutionary algorithms
CN116561275A (en) Object understanding method, device, equipment and storage medium
CN110929518A (en) Text sequence labeling algorithm using overlapping splitting rule
CN110750967B (en) Pronunciation labeling method and device, computer equipment and storage medium
CN112434513A (en) Word pair up-down relation training method based on dependency semantic attention mechanism
CN109960782A (en) A kind of Tibetan language segmenting method and device based on deep neural network
CN111859910B (en) Word feature representation method for semantic role recognition and fusing position information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination