CN108170679B - Semantic matching method and system based on computer recognizable natural language description - Google Patents

Publication number
CN108170679B
Authority
CN
China
Prior art keywords
word set
natural language
word
words
description
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711460123.3A
Other languages
Chinese (zh)
Other versions
CN108170679A (en
Inventor
杨学红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China United Network Communications Group Co Ltd
Original Assignee
China United Network Communications Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China United Network Communications Group Co Ltd filed Critical China United Network Communications Group Co Ltd
Priority to CN201711460123.3A priority Critical patent/CN108170679B/en
Publication of CN108170679A publication Critical patent/CN108170679A/en
Application granted granted Critical
Publication of CN108170679B publication Critical patent/CN108170679B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/237Lexical tools
    • G06F40/247Thesauruses; Synonyms

Abstract

The invention belongs to the technical field of programming, and particularly relates to a semantic matching method based on computer recognizable natural language description and a corresponding semantic matching system. The semantic matching method comprises the following steps: step S1): taking the logic and steps defined by the grammatical rules of the target language as reference, constraining the natural language requirement description into a structure with logical steps; step S2): for the fixed sentence patterns in the constrained natural language requirement description, obtaining a candidate word set comprising the root words in the natural language requirement description; step S3): segmenting the message name/operation name in the target language to obtain a standby word set comprising the root words in the message name/operation name; step S4): calculating the matching degree of the candidate word set and the standby word set. The semantic matching method and the semantic matching system can reconcile the differences between users and developers in their use of natural language and realize automatic programming in a machine language.

Description

Semantic matching method and system based on computer recognizable natural language description
Technical Field
The invention belongs to the technical field of programming, and particularly relates to a semantic matching method based on computer recognizable natural language description and a corresponding semantic matching system based on computer recognizable natural language description.
Background
Natural language is still the description language of the current software requirements document. The automatic generation of the flow from the functional requirements described by the natural language can not only help users and developers to quickly achieve consensus on the requirements, but also accelerate the development of the flow.
However, because users and developers have different concerns, their descriptions of requirements often differ. When functional requirements are described in natural language, users care about the functions the software provides, the performance level it achieves and the like, whereas developers describe the requirements from a technical point of view. Moreover, when writing requirements they do not know the naming conventions of the specific messages and operations used in the development language, and the content words they use to describe requirements are not necessarily the same as the words used in the message names and operation names of the development language. In addition, in most cases users are unfamiliar with specialized terms and technical issues.
Nevertheless, most current software requirement documents are still written in natural language, for two reasons: first, most users and developers lack the ability to describe requirements formally; second, natural language has a rich vocabulary and strong expressive power. Natural language also has unavoidable shortcomings, however, including ambiguity and inconsistency.
In order to make up for the deficiency of natural language, a method is needed that can constrain and formalize the flow requirement description expressed by natural language so that the computer can understand the requirement. How to coordinate the divergence of the natural language application between users and developers becomes a technical problem to be solved urgently at present.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide a semantic matching method based on computer recognizable natural language description and a corresponding semantic matching system based on computer recognizable natural language description, which can effectively eliminate the divergence of users and developers in natural language application and realize the automatic programming of machine language.
The technical scheme adopted for solving the technical problem of the invention is that the semantic matching method based on the computer recognizable natural language description comprises the following steps:
step S1): taking the logic and steps defined by the grammatical rules of the target language as reference, constraining the natural language requirement description into a structure with logical steps;
step S2): for the fixed sentence patterns in the constrained natural language requirement description, obtaining a candidate word set comprising the root words in the natural language requirement description;
step S3): segmenting the message name/operation name in the target language to obtain a standby word set comprising the root words in the message name/operation name;
step S4): calculating the matching degree of the candidate word set and the standby word set.
Preferably, the step S2) includes:
step S21): acquiring a requirement statement described in natural language according to a set qualifier, and segmenting the requirement statement into words to form a primary word set;
step S22): removing the stop words from the primary word set to form an applicable word set;
step S23): performing synonym expansion on each word in the applicable word set to form an expanded word set;
step S24): performing root reduction on the expanded word set to obtain a candidate word set comprising the root words in the natural language requirement description.
Preferably, in step S21), the qualifier set for converting the requirement statement into the target language is identified by a prefix;
in step S22), auxiliary words, prepositions and conjunctions are pre-stored as stop words to form a stop word lexicon;
in step S23), synonym expansion is performed on each word in the applicable word set according to a synonym thesaurus;
in step S24), the root reduction algorithm is the Porter algorithm or the Lucene algorithm.
Preferably, the step S4) includes the steps of:
step S41): traversing the words of the standby word set, and screening the words which have intersection with the candidate word set;
step S42): and calculating the matching degree of the words meeting the intersection.
Preferably, in step S4), the matching degree between the candidate word set and the spare word set is represented by the following formula:
[Formula: matching degree of wordset_A and wordset_B; formula image not reproduced]
where count is the number of words found to be semantically similar, |wordset_A| is the number of segmented words in the requirement description sentence, and |wordset_B| is the number of segmented words in the message name/operation name.
A semantic matching system based on computer recognizable natural language description comprises a constraint module, a candidate word set forming module, a standby word set forming module and a matching module, wherein:
the constraint module is used for describing and constraining natural language requirements into a structure with logical steps by taking the logic and the steps defined by the grammatical rules of the target language as reference;
the candidate word set composition module is used for obtaining a candidate word set comprising a root word in the natural language requirement description for a sentence pattern fixed in the constrained natural language requirement description;
the standby word set forming module is used for segmenting the message name/the operation name in the target language to obtain a standby word set comprising the root word in the message name/the operation name;
and the matching module is used for calculating the matching degree of the candidate word set and the standby word set.
Preferably, the candidate word set composing module includes a primary word set unit, an applicable word set unit, a synonym expansion unit, and a root restoring unit, where:
the primary word set unit is used for acquiring a requirement statement described in natural language according to a set qualifier, and segmenting the requirement statement into words to form a primary word set;
the applicable word set unit is used for removing stop words in the primary word set to form an applicable word set;
the synonym expansion unit is used for carrying out synonym expansion on each term in the applicable term set;
and the root reduction unit is used for carrying out root reduction on the expansion word set to obtain a candidate word set comprising the root in the natural language requirement description.
Preferably, in the primary word set unit, the qualifier set for converting the requirement statement into the target language is identified by a prefix;
in the applicable word set unit, auxiliary words, prepositions and conjunctions are pre-stored as stop words and are used as a stop word bank;
in the synonym expansion unit, synonym expansion is carried out on each term in the applicable term set according to the synonym thesaurus;
in the root reduction unit, the root reduction algorithm is Porter algorithm or Lucene algorithm.
Preferably, the matching module includes an intersection unit and a matching unit, wherein:
the intersection unit is used for traversing the words of the standby word set and screening the words which have intersection with the candidate word set;
and the matching unit is used for calculating the matching degree of the words meeting the intersection.
Preferably, in the matching unit, a formula of a matching degree between the candidate word set and the spare word set is as follows:
[Formula: matching degree of wordset_A and wordset_B; formula image not reproduced]
where count is the number of words found to be semantically similar, |wordset_A| is the number of segmented words in the requirement description sentence, and |wordset_B| is the number of segmented words in the message name/operation name.
The invention has the following beneficial effects: on the basis of word segmentation, stop word removal, root reduction and similarity calculation, the semantic matching method based on computer recognizable natural language description and the corresponding semantic matching system add synonym expansion and modify the similarity calculation so as to suit the matching of requirement descriptions against message names/operation names; they can reconcile the differences between users and developers in their use of natural language and realize automatic programming in a machine language.
Drawings
FIG. 1 is a flow chart of a semantic matching method based on computer recognizable natural language description according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating the steps of obtaining a set of candidate words including a root word in a requirement description according to an embodiment of the present invention;
FIG. 3 is a block diagram of a semantic matching system based on computer recognizable natural language description according to an embodiment of the present invention;
in the figure:
1-a constraint module; 2-candidate word set composition module; 3-a standby word set forming module; 4-matching module.
Detailed Description
In order to make those skilled in the art better understand the technical solution of the present invention, the following describes the computer recognizable natural language description based semantic matching method and the corresponding computer recognizable natural language description based semantic matching system in further detail with reference to the accompanying drawings and the detailed description.
In order to build a bridge between the requirement description and the development language, the invention provides a semantic matching method based on computer recognizable natural language description. Approaching the problem from the perspective of semantic matching, the method relies on a hierarchical lexicon built on root words and synonyms (which can be understood as an English dictionary lexicon). It can reconcile the differences between users and developers in their use of natural language, realize automatic programming in a machine language, and greatly accelerate project progress.
As shown in fig. 1, the semantic matching method based on computer recognizable natural language description in the present invention includes the following steps:
step S1): the natural language requirement description is constrained to the structure of the logical steps with reference to the logic and steps defined by the grammatical rules of the target language.
A flow function requirement described in natural language has a certain step-wise character, which is expressed by connective words in the sentences, such as after, if, then, or else, at the same time and the like. However, logical step relationships that are simple for humans are not easily recognized and understood by computers. A constraint rule therefore has to be specified as preparation for the conversion from natural language to the target language, so that users and developers describe requirements according to the rule and the step-wise structure of the description is directly apparent to the computer. The target language may be any computer language selected for programming.
In this step, the natural language requirement description is constrained so that it presents a structure with logical steps, and the words forming this structure can be collected into a lexicon, the word set wordset_A.
Step S2): for the fixed sentence patterns in the constrained natural language requirement description, obtaining a candidate word set comprising the root words in the natural language requirement description.
When describing functional requirements in natural language, users and developers do not know the naming of the specific messages and operations in the program files, and the content words they use to describe requirements are not necessarily the same as the words used in the message names and operation names in the program files. In this step, the computer automatically formalizes the fixed sentence patterns in the constrained requirement description, taking the logic and steps defined by the grammatical rules of the target language as reference. Formalization is typically carried out in a target development language (for example, the Business Process Execution Language, BPEL), i.e. the requirement description is converted into a language that a computer can understand. The invention targets a process composition language: the requirement description, which has a step-wise character after constraint processing, is formally converted into the corresponding statements of the process composition language. Formalization thus bridges the requirement description of the flow and the target language.
Therefore, in this step, from the perspective of semantic matching, a synonym thesaurus is used, a matching algorithm is performed on the natural language requirement description based on the root word and the synonym, and a candidate term set including the root word in the natural language requirement description is obtained for the sentence pattern fixed in the natural language requirement description after constraint.
In the following, with reference to fig. 2, the process of obtaining the final word set wordset_A from a requirement statement A described in natural language is explained in detail. The process specifically comprises the following steps:
Step S21): acquiring a requirement statement described in natural language according to the set qualifier, and segmenting the requirement statement to form a primary word set.
Here, after the requirement statement A described in natural language has been constrained and formalized, a constraint statement A' is obtained. If the constraint statement A' contains a preset qualifier, the natural language requirement description sentence of A' is extracted and segmented to obtain the primary word set wordset'_A. In general, the qualifiers are determined by the target language into which the requirement description is to be converted; they can therefore be set in advance and retrieved from the lexicon of the target language.
Preferably, the qualifier set for converting the requirement statement A into the target language is identified by a prefix. Taking BPEL as an example, if the prefix of the constraint statement A' is [RECEIVE] or [INVOKE], the natural language requirement description sentence extracted from A' is denoted A'', and the primary word set obtained by segmenting A'' is wordset'_A. Here, [RECEIVE] indicates that a message is received, and [INVOKE] indicates that a service is invoked.
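Step S21) can be illustrated with a minimal Python sketch. It assumes that a constrained statement starts with a bracketed qualifier such as [RECEIVE] or [INVOKE] and that segmentation is a simple split into alphabetic tokens; the function name `primary_word_set`, the regular expression and the sample sentence are hypothetical and do not come from the patent.

```python
import re

# Qualifier prefixes named in the text for BPEL; the tuple itself is illustrative.
QUALIFIERS = ("RECEIVE", "INVOKE")

# Matches an optional-space bracketed prefix followed by the sentence body.
_PREFIX = re.compile(r"^\s*\[\s*([A-Z]+)\s*\]\s*(.*)$")


def primary_word_set(constraint_statement):
    """Return (qualifier, words) for a constrained statement.

    Returns None when the statement carries no known qualifier prefix.
    Segmentation here is an assumption: lowercase alphabetic tokens.
    """
    m = _PREFIX.match(constraint_statement)
    if m is None or m.group(1) not in QUALIFIERS:
        return None
    qualifier, sentence = m.group(1), m.group(2)
    words = re.findall(r"[a-z]+", sentence.lower())
    return qualifier, words
```

For instance, `primary_word_set("[RECEIVE] the customer sends an order request")` yields the qualifier together with the six words of the sentence, while a statement without a recognized prefix is skipped.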
Step S22): removing the stop words from the primary word set to form an applicable word set.
In general, besides content words such as nouns, adjectives and verbs, a requirement sentence may also contain function words with no actual meaning of their own, such as auxiliary words, prepositions and conjunctions. Since the aim is to find, among all target documents, the message and operation that best match the requirement sentence semantically, these semantically irrelevant words would interfere with the matching and need to be eliminated before the matching degree is calculated. Therefore, further preferably, in order to keep the word set clean, auxiliary words, prepositions, conjunctions and the like are stored in advance as stop words, forming the stop word lexicon D. According to D, the stop words are removed from the primary word set wordset'_A to obtain the applicable word set: for any word w in wordset'_A, if w ∈ D, then w is removed from wordset'_A, and the words that remain form the applicable word set.
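The stop word removal of step S22) amounts to a set-membership filter. The sketch below is only an illustration: the contents of `STOP_WORDS` (standing in for the lexicon D) and the function name are assumptions, and a real lexicon would be much larger.

```python
# Illustrative stand-in for the stop word lexicon D (assumed contents).
STOP_WORDS = frozenset(
    {"a", "an", "the", "of", "to", "in", "on", "for", "and", "or", "with", "by"}
)


def remove_stop_words(primary_words, stop_words=STOP_WORDS):
    """Drop every word w of the primary word set with w in D, keeping order."""
    return [w for w in primary_words if w not in stop_words]
```

Keeping the result as a list preserves word order, which is convenient for inspection; converting to a set afterwards is equally valid because the later steps only test membership.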
Step S23): performing synonym expansion on each word in the applicable word set.
In this step, each word in the applicable word set is expanded with its synonyms according to the synonym thesaurus C (which can be understood as a general English dictionary). For any word w in the applicable word set, the synonym set Synonyms(w) of w is looked up in the synonym thesaurus C and all synonyms of w are added to the applicable word set, i.e. the set is replaced by its union with Synonyms(w). After every word has been processed, the expanded word set wordset''_A is obtained.
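A minimal sketch of the synonym expansion in step S23), assuming the thesaurus C is available as a dictionary that maps a word to its synonym set. The entries of `THESAURUS` and the function name are illustrative assumptions.

```python
# Toy stand-in for the synonym thesaurus C (assumed entries).
THESAURUS = {
    "receive": {"get", "obtain"},
    "order": {"purchase"},
}


def expand_synonyms(applicable_words, thesaurus=THESAURUS):
    """Return the applicable word set united with Synonyms(w) for each word w."""
    expanded = set(applicable_words)
    for w in applicable_words:
        # Words missing from the thesaurus contribute no synonyms.
        expanded |= thesaurus.get(w, set())
    return expanded
```

Only the original words are looked up, so synonyms of synonyms are not added; whether the patent intends a transitive expansion is not stated, and this sketch assumes it does not.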
Step S24): performing root reduction on the expanded word set.
In this step, for any word w in the expanded word set wordset''_A, the root w' of w is computed by a root reduction algorithm and w is replaced by w' in wordset''_A, yielding the candidate word set wordset_A that comprises the root words in the natural language requirement description, i.e. wordset_A = wordset''_A - w + w'. Here w' is denoted Porter(w). The specific root reduction algorithm may be the Porter algorithm or the Lucene algorithm, and is not limited here.
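The replacement wordset_A = wordset''_A - w + w' can be sketched as a set comprehension over a stemming function. To stay self-contained, the example uses a deliberately crude suffix stripper, `toy_stem`, which is NOT the Porter algorithm and is only a stand-in; both function names are hypothetical.

```python
def toy_stem(word):
    """Strip a few common English suffixes.

    This is a simplified stand-in for a real root reduction algorithm
    such as Porter; it keeps at least three characters of the word.
    """
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def reduce_to_roots(expanded_words, stem=toy_stem):
    """Replace every word w by its root w' and collapse duplicates."""
    return {stem(w) for w in expanded_words}
```

If NLTK is installed, passing `nltk.stem.PorterStemmer().stem` as the `stem` argument would apply an actual Porter stemmer instead of the toy rule.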
Through the above steps, the requirement sentence described in natural language is processed in sequence by word segmentation, stop word removal, synonym expansion and root reduction. The roots of the words in the sentence, together with the roots of their synonyms at the same level, are obtained free of interference from stop words, so that the semantic variation between users and developers during communication is accommodated to the greatest extent and a richer basis of candidates is provided for conversion into the computer language.
Step S3): segmenting the message name/operation name in the target language to obtain a standby word set comprising the root words in the message name/operation name.
In this step, the message name/operation name is segmented, and the word set so formed corresponds to the fixed sentence pattern in the constrained requirement description. The standby word set of the message name/operation name B after word segmentation is wordset_B.
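How a message name or operation name is segmented depends on the naming style of the target language. The following sketch assumes identifiers written in camelCase or with underscore, hyphen, dot or space separators; the function name `standby_word_set` and the sample names are hypothetical.

```python
import re


def standby_word_set(name):
    """Split an identifier such as 'receiveOrderRequest' into lowercase words."""
    words = []
    # First split on explicit separators, then on case boundaries.
    for part in re.split(r"[_\-\s.]+", name):
        # Acronym runs (HTTP), capitalized or lowercase words, digit runs.
        words += re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part)
    return [w.lower() for w in words]
```

The words returned here are not yet reduced to roots; applying the same root reduction as for the candidate word set would be the natural next step so that both sets are compared on roots.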
It should be understood here that the definition of the message name/action name B requires a specific language specific analysis, since each computer language has specificity.
Step S4): calculating the matching degree of the candidate word set and the standby word set.
The invention automatically converts flow function requirements described in natural language into a description in the development language, and adds a matching algorithm based on root words and synonyms to the semantic processing in order to improve accuracy. Therefore, the matching degree is calculated between the candidate word set obtained in step S2) and the standby word set obtained in step S3), to ensure that the words in the candidate word set are matched with maximum similarity in the standby word set.
Currently, matching degree calculation methods include the Dice-Euclidean similarity calculation method. In this embodiment, in order to find the flow corresponding to the natural language description more accurately, the Dice similarity algorithm is improved by taking word roots and synonyms into account, and the resulting DicePlus algorithm is used to calculate the matching degree of wordset_A and wordset_B.
The improved extended similarity calculation algorithm DicePlus comprises the steps of:
Step S41): traversing the words of the standby word set, and screening out the words that intersect with the candidate word set.
In this step, each word in the standby word set wordset_B is traversed. If a word w of wordset_B exists in wordset_A, or a synonym of w exists in wordset_A, the word is regarded as intersecting; in this way it is determined whether the words in the standby word set wordset_B intersect with the words in the candidate word set wordset_A.
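The traversal in step S41) can be sketched as a counting loop. The sketch assumes the synonym thesaurus is a dictionary from a word to its synonym set; `count_semantic_matches` and its parameters are illustrative names, not taken from the patent.

```python
def count_semantic_matches(standby_words, candidate_words, synonyms):
    """Count words of wordset_B found in wordset_A directly or via a synonym."""
    candidate = set(candidate_words)
    count = 0
    for w in standby_words:
        # A word intersects if it, or any of its synonyms, is in wordset_A.
        if w in candidate or candidate & synonyms.get(w, set()):
            count += 1
    return count
```

The returned value plays the role of count in the matching degree formula: the number of words found to be semantically similar.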
Step S42): calculating the matching degree of the words that satisfy the intersection.
The matching degree is calculated in order to find a program statement whose matching degree is sufficient to replace the corresponding requirement description statement; if no such program statement can be found, developers need to write the corresponding statement themselves. In this step, the matching degree of the candidate word set wordset_A and the standby word set wordset_B is calculated using the following formula:
[Formula: matching degree of wordset_A and wordset_B; formula images not reproduced]
where count is the number of words found to be semantically similar, |wordset_A| is the number of segmented words in the requirement description sentence, and |wordset_B| is the number of segmented words in the message name/operation name.
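The formula itself is an image that is not available in this text. As an assumption only, if DicePlus keeps the shape of the classic Dice coefficient and replaces the size of the exact intersection with count, the matching degree would read as follows; this is a reconstruction from the surrounding definitions, not the patent's verified formula.

```latex
\mathrm{DicePlus}(\mathit{wordset}_A, \mathit{wordset}_B)
  = \frac{2 \times \mathit{count}}
         {\lvert \mathit{wordset}_A \rvert + \lvert \mathit{wordset}_B \rvert}
```

Under this assumption, a requirement sentence with 4 segmented words and an operation name with 2 segmented words, of which 2 are semantically similar, would score 2 × 2 / (4 + 2) = 2/3 ≈ 0.67.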
Based on the matching degree algorithm and the similarity algorithm, requirements described in natural language can be converted into a description language that a computer can recognize, enabling automatic programming from statements described in natural language. Even if the content words used in the user's requirement description are not identical to the words used by the developer (for example, both receive and get indicate receiving a message), accurate matching can still be performed.
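The receive/get case can be made concrete with a small end-to-end sketch. It assumes a Dice-shaped score, 2 × count / (|A| + |B|), which is a reconstruction rather than the patent's confirmed formula, and it uses the sizes of the word sets passed in as |wordset_A| and |wordset_B|. The names `dice_plus` and `best_match` and the sample data are hypothetical.

```python
def dice_plus(candidate_words, standby_words, synonyms):
    """Assumed DicePlus score: 2 * count / (|A| + |B|)."""
    a, b = set(candidate_words), set(standby_words)
    if not a and not b:
        return 0.0
    # count: words of B present in A directly or through a synonym.
    count = sum(1 for w in b if w in a or a & synonyms.get(w, set()))
    return 2 * count / (len(a) + len(b))


def best_match(candidate_words, names_to_words, synonyms):
    """Return the message/operation name whose word set scores highest."""
    return max(
        names_to_words,
        key=lambda name: dice_plus(candidate_words, names_to_words[name], synonyms),
    )
```

With a thesaurus that lists receive as a synonym of get, the requirement words receive and order match an operation segmented into get and order with a score of 1.0; without the thesaurus the same pair scores 0.5, which is the plain Dice result.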
Natural language and computer languages are continuously developing and being updated, so the semantic matching method of the invention cannot be exhaustive; in later use the lexicon can be extended through self-learning and gradually accumulated, so that matching is continuously enriched and improved.
On the basis of word segmentation, stop word removal, root reduction and similarity calculation, the semantic matching method based on computer recognizable natural language description adds synonym expansion and modifies the similarity calculation so as to suit the matching of requirement descriptions against message names/operation names; it can reconcile the differences between users and developers in their use of natural language and realize automatic programming in a machine language.
Correspondingly, this embodiment also provides a semantic matching system based on computer recognizable natural language description, which can reconcile the differences between users and developers in their use of natural language and realize automatic programming in a machine language.
As shown in fig. 3, the semantic matching system based on computer recognizable natural language description comprises a constraint module 1, a candidate word set composition module 2, a standby word set composition module 3 and a matching module 4, wherein:
a constraint module 1, which is used for describing and constraining natural language requirements into a structure with logical steps by taking the logic and steps defined by the grammatical rules of the target language as reference;
a candidate word set composition module 2, configured to obtain a candidate word set including a root in the natural language requirement description for a sentence pattern fixed in the constrained natural language requirement description;
a standby word set composition module 3, configured to perform word segmentation on the message name/operation name in the target language, and obtain a standby word set including a root word in the message name/operation name;
and the matching module 4 is used for calculating the matching degree of the candidate word set and the standby word set.
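The four modules can be pictured as four pluggable functions wired together. The sketch below is one possible arrangement under that assumption; the class name, field names and the `score` method are hypothetical and are not part of the patent.

```python
from dataclasses import dataclass
from typing import Callable, Set


@dataclass
class SemanticMatchingSystem:
    """Illustrative wiring of modules 1 to 4 as callables."""

    constrain: Callable[[str], str]              # constraint module 1
    candidate_words: Callable[[str], Set[str]]   # candidate word set module 2
    standby_words: Callable[[str], Set[str]]     # standby word set module 3
    match: Callable[[Set[str], Set[str]], float]  # matching module 4

    def score(self, requirement: str, name: str) -> float:
        """Run a requirement and a message/operation name through all modules."""
        constrained = self.constrain(requirement)
        return self.match(
            self.candidate_words(constrained), self.standby_words(name)
        )
```

Keeping each module behind a plain function signature makes it easy to swap, for example, the stemmer or the thesaurus without touching the matching step.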
The candidate word set composition module 2 includes a primary word set unit, an applicable word set unit, a synonym expansion unit and a root reduction unit, wherein:
the primary word set unit is used for acquiring a requirement statement described in natural language according to a set qualifier, and segmenting the requirement statement into words to form a primary word set. In the primary word set unit, the qualifier set for converting the requirement statement into the target language is identified by a prefix;
the applicable word set unit is used for removing the stop words from the primary word set to form an applicable word set. In the applicable word set unit, auxiliary words, prepositions and conjunctions are pre-stored as stop words to form a stop word lexicon;
the synonym expansion unit is used for performing synonym expansion on each word in the applicable word set. In the synonym expansion unit, synonym expansion is performed on each word in the applicable word set according to the synonym thesaurus;
the root reduction unit is used for performing root reduction on the expanded word set to obtain a candidate word set comprising the root words in the natural language requirement description. In the root reduction unit, the root reduction algorithm is the Porter algorithm or the Lucene algorithm.
The matching module 4 comprises an intersection unit and a matching unit, wherein:
the intersection unit is used for traversing the words of the standby word set and screening the words which have intersection with the candidate word set;
and the matching unit is used for calculating the matching degree of the words meeting the intersection.
In the matching unit, the formula of the matching degree of the candidate word set and the spare word set is as follows:
[Formula: matching degree of wordset_A and wordset_B; formula image not reproduced]
where count is the number of words found to be semantically similar, |wordset_A| is the number of segmented words in the requirement description sentence, and |wordset_B| is the number of segmented words in the message name/operation name.
On the basis of word segmentation, stop word removal, root reduction and similarity calculation, the semantic matching system based on computer recognizable natural language description adds synonym expansion and modifies the similarity calculation so as to suit the matching of requirement descriptions against message names/operation names; it can reconcile the differences between users and developers in their use of natural language and realize automatic programming in a machine language.
It will be understood that the above embodiments are merely exemplary embodiments taken to illustrate the principles of the present invention, which is not limited thereto. It will be apparent to those skilled in the art that various modifications and improvements can be made without departing from the spirit and substance of the invention, and these modifications and improvements are also considered to be within the scope of the invention.

Claims (8)

1. A semantic matching method based on computer recognizable natural language description is characterized by comprising the following steps:
step S1): taking the logic and steps defined by the grammatical rules of the target language as reference, and restricting the natural language requirement description into a structure with the logical steps;
step S2): obtaining a candidate word set comprising a root word in the natural language requirement description for a fixed sentence pattern in the constrained natural language requirement description;
step S3): segmenting the message name/operation name in the target language to obtain a standby word set comprising the root word in the message name/operation name;
step S4): calculating the matching degree of the candidate word set and the standby word set so as to replace corresponding sentences in the natural language requirement description with target languages meeting the matching degree, wherein the target languages are computer languages selected for programming;
step S4) includes the steps of:
step S41): traversing the words of the standby word set, and screening the words which have intersection with the candidate word set;
step S42): and calculating the matching degree of the words meeting the intersection.
2. The semantic matching method based on computer recognizable natural language description according to claim 1, wherein step S2) includes:
step S21): acquiring a requirement sentence described in natural language according to a set qualifier, and segmenting the requirement sentence into words to form a primary word set;
step S22): removing the stop words from the primary word set to form an applicable word set;
step S23): carrying out synonym expansion on each word in the applicable word set to form an expanded word set;
step S24): carrying out root reduction on the expanded word set to obtain a candidate word set comprising the roots of the words in the natural language requirement description.
3. The semantic matching method based on computer recognizable natural language description according to claim 2, wherein:
in step S21), the prefix of the qualifier set for converting the requirement sentence into the target language is used as an identifier;
in step S22), auxiliary words, prepositions and conjunctions are pre-stored as stop words to form a stop word bank;
in step S23), synonym expansion is carried out on each word in the applicable word set according to a synonym thesaurus;
in step S24), the root reduction algorithm is the Porter algorithm or the Lucene algorithm.
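The word-set pipeline of steps S21) to S24) can be sketched as below, assuming English requirement sentences. The stop word bank, the synonym thesaurus and the sample sentence are hypothetical, and `toy_stem` is a deliberately crude suffix stripper standing in for the Porter or Lucene algorithm named in the claim; it is not an implementation of either. Extraction of the sentence by its qualifier prefix is omitted.

```python
import re
from typing import Dict, Set

# Hypothetical stop word bank (auxiliary words, prepositions, conjunctions).
STOP_WORDS: Set[str] = {"the", "a", "an", "to", "of", "and", "or", "in", "on", "for"}

# Hypothetical synonym thesaurus.
SYNONYMS: Dict[str, Set[str]] = {"send": {"transmit", "deliver"}}

SUFFIXES = ("ing", "ed", "es", "s")


def toy_stem(word: str) -> str:
    """Strip one common suffix; a stand-in for a real stemmer such as Porter."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def candidate_word_set(sentence: str) -> Set[str]:
    primary = re.findall(r"[a-z]+", sentence.lower())         # S21: primary word set
    applicable = [w for w in primary if w not in STOP_WORDS]  # S22: remove stop words
    expanded = set(applicable)
    for word in applicable:
        expanded |= SYNONYMS.get(word, set())                 # S23: synonym expansion
    return {toy_stem(w) for w in expanded}                    # S24: root reduction


words = candidate_word_set("Send the orders to users")
print(sorted(words))
```

Expanding synonyms before root reduction follows the order given in the claim, so the thesaurus is looked up with surface forms rather than roots.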
4. The semantic matching method based on computer recognizable natural language description according to claim 1, wherein in step S4), the formula for the matching degree between the candidate word set and the standby word set is:
[Formula image FDA0003169491630000021; not reproduced in the text]
wherein count is the number of words found to have similar semantics, |wordset_A| is the number of segmented words in the requirement description sentence, and |wordset_B| is the number of segmented words in the message name/operation name.
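The claimed formula is available here only as an image reference, so it cannot be restated. The sketch below merely shows how the three named quantities (count, the size of word set A and the size of word set B) could be combined into a score between 0 and 1. The Dice-style normalization `2 * count / (|A| + |B|)` is an assumption chosen for illustration and should not be read as the patented formula; exact equality of roots is likewise used as a simple stand-in for "similar semantics".

```python
from typing import Set


def matching_degree(word_set_a: Set[str], word_set_b: Set[str]) -> float:
    """Illustrative matching degree; the normalization is an assumption.

    word_set_a: words from the requirement description sentence.
    word_set_b: words from the message name/operation name.
    """
    if not word_set_a or not word_set_b:
        return 0.0
    # count: words regarded as semantically similar (here: identical roots).
    count = len(word_set_a & word_set_b)
    return 2 * count / (len(word_set_a) + len(word_set_b))


degree = matching_degree({"send", "order", "messag"}, {"send", "order"})
print(degree)
```

Whatever the actual normalization is, a threshold on the resulting value would decide whether a message name/operation name is accepted as the replacement for the requirement sentence.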
5. A semantic matching system based on computer recognizable natural language description, characterized by comprising a constraint module, a candidate word set forming module, a standby word set forming module and a matching module, wherein:
the constraint module is used for constraining the natural language requirement description into a structure with logical steps, taking the logic and steps defined by the grammar rules of the target language as a reference;
the candidate word set forming module is used for obtaining, for the fixed sentence patterns in the constrained natural language requirement description, a candidate word set comprising the roots of the words in the natural language requirement description;
the standby word set forming module is used for segmenting the message names/operation names in the target language to obtain a standby word set comprising the roots of the words in the message names/operation names;
the matching module is used for calculating the matching degree between the candidate word set and the standby word set, so as to replace the corresponding sentence in the natural language requirement description with the target-language expression that satisfies the matching degree, wherein the target language is the computer language selected for programming;
the matching module comprises an intersection unit and a matching unit, wherein:
the intersection unit is used for traversing the words of the standby word set and screening out the words that intersect with the candidate word set;
the matching unit is used for calculating the matching degree for the words that satisfy the intersection.
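The standby word set forming module segments message names/operation names into words. A minimal sketch of one way to do this is given below, assuming the names follow camelCase or underscore conventions; the claim does not state a naming convention, and the function names and sample identifiers are hypothetical. Root reduction is passed in as an optional function so that any stemmer can be plugged in.

```python
import re
from typing import Callable, List, Set


def split_identifier(name: str) -> List[str]:
    """Split a camelCase or snake_case identifier into lowercase words."""
    words: List[str] = []
    for part in re.split(r"[_\W]+", name):
        # Acronym runs (e.g. "HTTP"), capitalized or lowercase words, digit runs.
        words.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [w.lower() for w in words]


def standby_word_set(name: str, stem: Callable[[str], str] = lambda w: w) -> Set[str]:
    """Segment a message/operation name and reduce each word to its root."""
    return {stem(word) for word in split_identifier(name)}


print(split_identifier("sendOrderMessage"))
print(standby_word_set("get_HTTPResponse"))
```

Applying the same root reduction function to both the requirement sentence and the names keeps the two word sets directly comparable.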
6. The semantic matching system based on computer recognizable natural language description according to claim 5, wherein the candidate word set forming module comprises a primary word set unit, an applicable word set unit, a synonym expansion unit and a root reduction unit, wherein:
the primary word set unit is used for acquiring a requirement sentence described in natural language according to a set qualifier, and segmenting the requirement sentence into words to form a primary word set;
the applicable word set unit is used for removing the stop words from the primary word set to form an applicable word set;
the synonym expansion unit is used for carrying out synonym expansion on each word in the applicable word set to form an expanded word set;
the root reduction unit is used for carrying out root reduction on the expanded word set to obtain a candidate word set comprising the roots of the words in the natural language requirement description.
7. The semantic matching system based on computer recognizable natural language description according to claim 6, wherein:
in the primary word set unit, the prefix of the qualifier set for converting the requirement sentence into the target language is used as an identifier;
in the applicable word set unit, auxiliary words, prepositions and conjunctions are pre-stored as stop words to form a stop word bank;
in the synonym expansion unit, synonym expansion is carried out on each word in the applicable word set according to a synonym thesaurus;
in the root reduction unit, the root reduction algorithm is the Porter algorithm or the Lucene algorithm.
8. The semantic matching system based on computer recognizable natural language description according to claim 5, wherein in the matching unit, the formula for the matching degree between the candidate word set and the standby word set is:
[Formula image FDA0003169491630000041; not reproduced in the text]
wherein count is the number of words found to have similar semantics, |wordset_A| is the number of segmented words in the requirement description sentence, and |wordset_B| is the number of segmented words in the message name/operation name.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711460123.3A CN108170679B (en) 2017-12-28 2017-12-28 Semantic matching method and system based on computer recognizable natural language description

Publications (2)

Publication Number Publication Date
CN108170679A (en) 2018-06-15
CN108170679B (en) 2021-09-03

Family

ID=62519156

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711460123.3A Active CN108170679B (en) 2017-12-28 2017-12-28 Semantic matching method and system based on computer recognizable natural language description

Country Status (1)

Country Link
CN (1) CN108170679B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110413267B (en) * 2019-08-08 2023-05-26 四川爱创科技有限公司 Self-adaptive business process modeling method based on business rules
CN114238619B (en) * 2022-02-23 2022-04-29 成都数联云算科技有限公司 Method, system, device and medium for screening Chinese nouns based on edit distance

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101595474A (en) * 2007-01-04 2009-12-02 思解私人有限公司 Language analysis
CN103186574A (en) * 2011-12-29 2013-07-03 北京百度网讯科技有限公司 Method and device for generating searching result
CN103699667A (en) * 2013-12-24 2014-04-02 天津大学 Web service multi-dimensional semantic model building method
CN103902652A (en) * 2014-02-27 2014-07-02 深圳市智搜信息技术有限公司 Automatic question-answering system
CN106339366A (en) * 2016-08-08 2017-01-18 北京百度网讯科技有限公司 Method and device for requirement identification based on artificial intelligence (AI)
CN106407196A (en) * 2015-07-29 2017-02-15 成都诺铱科技有限公司 Semantic analysis intelligent instruction robot applied to logistics management software
CN107391111A (en) * 2017-06-22 2017-11-24 刘武丰 Artificial intelligence co-development framework and implementation method
CN107463683A (en) * 2017-08-09 2017-12-12 上海壹账通金融科技有限公司 The naming method and terminal device of code element

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1417707A (en) * 2002-12-02 2003-05-14 刘莎 Natural language semantic information united-coding method
US8326775B2 (en) * 2005-10-26 2012-12-04 Cortica Ltd. Signature generation for multimedia deep-content-classification by a large-scale matching system and method thereof
US7735068B2 (en) * 2005-12-01 2010-06-08 Infosys Technologies Ltd. Automated relationship traceability between software design artifacts
CN103309852A (en) * 2013-06-14 2013-09-18 瑞达信息安全产业股份有限公司 Method for discovering compound words in specific field based on statistics and rules
CN104133812B (en) * 2014-07-17 2017-03-08 北京信息科技大学 A kind of Chinese sentence similarity layered calculation method of user oriented query intention and device
CN106776532B (en) * 2015-11-25 2020-07-07 中国移动通信集团公司 Knowledge question-answering method and device
CN105930452A (en) * 2016-04-21 2016-09-07 北京紫平方信息技术股份有限公司 Smart answering method capable of identifying natural language
CN106055537B (en) * 2016-05-23 2019-03-12 王立山 A kind of natural language machine identification method and system
CN106372117B (en) * 2016-08-23 2019-06-14 电子科技大学 A kind of file classification method and its device based on Term co-occurrence

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
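Automatic analysis and modeling method based on structured description of domain requirements; 欧阳柳波 (Ouyang Liubo) et al.; 《计算机工程与应用》 (Computer Engineering and Applications); 20150605; Vol. 52, No. 20; 52-57 *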

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant