CN115186671A - Method for mapping noun phrases to description logic concepts based on extension

Info

Publication number: CN115186671A
Application number: CN202210530158.4A
Authority: CN
Inventors: 瞿裕忠; 宋鼎; 丁文韬
Original assignee: Nanjing University
Current assignee: Nanjing University
Priority date: 2022-05-16
Filing date: 2022-05-16
Publication date: 2022-10-14

Abstract

A method for mapping noun phrases to description logic concepts based on extension. The method first enumerates all text segments of a noun phrase and generates a mapping table from the text segments to resources in a knowledge base; it then generates a parsing order from the word segmentation, part-of-speech tags and syntax tree of the noun phrase; finally, following the parsing order and starting from the EL++ top concept ⊤, it continuously refines the concept using basic concepts generated from the indexed resources until all words have been parsed, yielding the description logic concept to which the noun phrase is mapped. By analyzing the syntax tree, the invention can automatically process complex noun phrases with implicit relations and generate high-quality description logic concepts.

Description

Method for mapping noun phrases to description logic concepts based on extension

Technical Field

The invention belongs to the field of computer technology, relates to natural language processing and knowledge graph technology, and provides a method for mapping noun phrases to description logic concepts based on extension.

Background

Enabling computers to understand natural language has long been a goal of researchers in natural language processing. The semantic parsing task aims to convert natural language text into a meaning representation language that a computer can understand, and is one of the most difficult problems in the field. Owing to the complexity and ambiguity of natural language, the task has attracted many researchers since it was introduced. The rise of knowledge graphs has made the work of connecting natural language to knowledge graphs all the more significant.

In natural language, a noun phrase is a phrase whose grammatical function is equivalent to that of a noun. Noun phrases appear widely in all kinds of corpora, so understanding them is important, and a good noun phrase parser can also serve as a component of other natural language processing work. At present, however, semantic parsing work, and knowledge base question answering (KBQA) work implemented through semantic parsing, usually takes sentences or passages as the unit of study, and little research targets noun phrases specifically. Relation information in noun phrases is often implicit: for example, the semantics of "American songwriters" is "songwriters who were born in the United States" or "songwriters whose citizenship is the United States". Humans understand this easily, but a computer cannot obtain the nationality or place-of-birth information directly from the phrase text.

Some work, in order to save the labor of annotating training data, uses the extension as training data for weakly supervised learning. Extension is the counterpart of intension and consists of the things to which the phrase applies; for the question answering task, the extension is the set of answer entities of the question. In other work the extension supplements the training data, and statistical indicators based on extension information are used as training features, providing a reference for determining implicit relations during training. However, semantic parsing based on supervised or weakly supervised learning needs a training data set of a certain size to train a model, and no authoritative, public supervised-learning data set aimed specifically at noun phrases has yet emerged. How to achieve phrase-specific understanding with a more lightweight approach therefore deserves discussion and research.

On the other hand, some related work has emerged on the task of mapping noun phrases to knowledge graphs using extension. These works take Wikipedia categories as their object of study: because the set of entities described by a Wikipedia category is readily available, statistical indicators can be used to find characteristics shared by those entities. Cat2Ax extracts matching patterns from the hierarchical structure of Wikipedia categories, selects the highest-scoring axiom according to a combination of statistical indicators and a lexical score, and then generates new triples to complete the knowledge base. Pasca et al. treat a complex noun phrase as a combination of a head type and modifiers: they first identify the head of the phrase, then divide the rest of the phrase into several modifiers, and, given the known interpretation of the head, select interpretations of the modifiers by statistical indicators. In general, existing methods regard a noun phrase as a combination of modifiers that are interpreted separately and then simply concatenated, so they cannot process complex noun phrases containing nested relations.

Since noun phrases may be complex, a semantic representation with stronger expressive power is needed to describe them. Description logic mainly describes the concepts and attributes of an ontology, provides a convenient form of expression for constructing knowledge graphs, and is widely applied in ontology reasoning. The description logic language EL++ has polynomial-time reasoning complexity, so it is lightweight while retaining good expressive power. The EL++ concept form can be defined recursively as:

$$C ::= \top \mid A \mid \{o\} \mid \exists r.C \mid C_1 \sqcap C_2$$

where ⊤ is the top concept; A denotes an atomic concept, i.e. a concept name, such as Film; r denotes an atomic role, i.e. a role name, such as basedOn; o is an individual name, such as Alice Munro; and C_1 and C_2 are general concepts. That is, in EL++ a concept C is generated from atomic concepts A and atomic roles r using conjunction ⊓ and existential restriction ∃r.C as constructors. For ease of understanding, a concept in the description logic EL++ is referred to below as a description logic concept.
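To make the recursive definition concrete, the following Python sketch models the constructors named above as small immutable classes and evaluates the set of entities a concept describes over a toy knowledge base. The class names, the `extension` helper, the role name `author`, the entities `Story_A`, `Story_B` and `Other_Film`, and all toy facts are illustrative assumptions; they are not taken from the invention or from DBpedia.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple, Union


@dataclass(frozen=True)
class Top:
    """The top concept: describes every entity."""


@dataclass(frozen=True)
class Atomic:
    name: str                      # concept name, e.g. "Film"


@dataclass(frozen=True)
class Nominal:
    individual: str                # individual name, e.g. "Alice_Munro"


@dataclass(frozen=True)
class Exists:
    role: str                      # role name, e.g. "basedOn"
    filler: "Concept"


@dataclass(frozen=True)
class And:
    left: "Concept"
    right: "Concept"


Concept = Union[Top, Atomic, Nominal, Exists, And]
Triple = Tuple[str, str, str]      # (subject, role, object)


def extension(c: Concept, types: Dict[str, Set[str]], triples: Set[Triple]) -> Set[str]:
    """Return the set of entities described by concept c in the toy knowledge base."""
    entities = set(types)
    if isinstance(c, Top):
        return entities
    if isinstance(c, Atomic):
        return {e for e in entities if c.name in types[e]}
    if isinstance(c, Nominal):
        return {c.individual} & entities
    if isinstance(c, Exists):
        fillers = extension(c.filler, types, triples)
        return {s for (s, r, o) in triples if r == c.role and o in fillers}
    if isinstance(c, And):
        return extension(c.left, types, triples) & extension(c.right, types, triples)
    raise TypeError(f"not an EL++ concept: {c!r}")


# Toy knowledge base (hypothetical facts, for illustration only).
TYPES = {
    "Away_from_Her": {"Film"},
    "Edge_of_Madness": {"Film"},
    "Other_Film": {"Film"},
    "Story_A": {"Work"},
    "Story_B": {"Work"},
    "Alice_Munro": {"Person"},
}
TRIPLES = {
    ("Away_from_Her", "basedOn", "Story_A"),
    ("Edge_of_Madness", "basedOn", "Story_B"),
    ("Story_A", "author", "Alice_Munro"),
    ("Story_B", "author", "Alice_Munro"),
}

# Film ⊓ ∃basedOn.(Work ⊓ ∃author.{Alice_Munro})
concept = And(
    Atomic("Film"),
    Exists("basedOn", And(Atomic("Work"), Exists("author", Nominal("Alice_Munro")))),
)
print(sorted(extension(concept, TYPES, TRIPLES)))
```

Nesting `Exists` inside `And` is what allows a single concept to express a nested relation, which a flat list of modifiers cannot.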

In summary, an efficient and effective method that uses extension to map phrases to logical forms over a specific knowledge graph is of great significance.

Disclosure of Invention

The problems to be solved by the invention are as follows: existing semantic parsing work needs a large amount of training data, and because of the lack of extension information and of data sets, noun phrases are parsed poorly in the prediction stage; and existing methods that map noun phrases to a knowledge graph using extension cannot process complex noun phrases with nested relations. The invention aims to provide a method for understanding noun phrases quickly and comprehensively through extension, specifically a method for mapping noun phrases to EL++ description logic concepts.

The technical scheme of the invention is as follows: a method for mapping noun phrases to description logic concepts based on extension, in which a noun phrase is mapped, through its extension, to a concept expressed in the description logic language EL++, thereby generating an understanding of the noun phrase over a given knowledge base, comprising the following steps:

Step 1: perform word segmentation and lemmatization on the noun phrase; enumerate all text segments T over the segmented word sequence, i.e. the segments formed by all n-grams of the noun phrase, together with the corresponding lemmatized text segments T_lemma; index the text segments T and T_lemma to resources of the knowledge base, and generate a mapping table from the text segments to the resources in the knowledge base;

Step 2: perform part-of-speech tagging on the segmented words of the noun phrase and generate a syntax tree; recursively traverse the whole tree from the root, and take the order in which the leaf nodes, i.e. the individual words, are visited as the parsing order;

Step 3: following the parsing order and starting from the EL++ top concept ⊤, continuously refine the concept using basic concepts generated from the indexed resources, parsing each parsable word in turn until all words have been parsed, to obtain the description logic concept to which the noun phrase is mapped:

Step 3.1: for the current parsable word, list all candidate text segments that contain it;

Step 3.2: according to the mapping table obtained in step 1, index the candidate text segments to the corresponding resources and generate candidate refinement operations from those resources;

Step 3.3: perform consistency screening on the newly generated candidate refinement operations, screening out those that are inconsistent with the syntax;

Step 3.4: apply the refinement operations obtained in step 3.3 to generate refined description logic concepts for the current parsable word, score the resulting concepts and retain the top k by score; then check whether parsing is complete, i.e. whether the current parsable word is the last one in the parsing order; if not, return to step 3.1 to parse the next parsable word; if so, go to step 3.5;

The scoring function of a description logic concept is:

$$S_{score}(NP, C) = w_{sup} \cdot S_{sup}(NP, C) + w_{match} \cdot S_{match}(NP, C) + w_{sim} \cdot S_{sim}(NP, C)$$

where S_sup is the support score, S_match is the matching score, S_sim is the simplicity score, and w_sup, w_match and w_sim are the corresponding weights.

The support score S_sup of a description logic concept is defined as the mean of the smoothed support degrees of the support sets of the refinement operations performed while generating the concept. Let NP be the given noun phrase and d a refinement operation. NP^I is the set of entities described by the noun phrase, i.e. the extension of the phrase; for a concept C, C^I is the set of entities described by C; for a basic concept B, B^I is the set of entities described by B. The refinement operation d modifies a part A of the concept C with the basic concept B. Its support set Set_sup(NP, d) is defined by two cases (the defining formula is not reproduced in this text):

in the first case, the part A modified by B describes the extension NP^I itself;

in the second case, the part A modified by B describes a set of entities related to the extension.

S_sup is then calculated from the support degrees of the support sets Set_sup(NP, d) of the refinement operations d that generate the concept C (the formula is not reproduced in this text).

S_match is defined as the proportion of words in the noun phrase NP that can be matched by the concept C:

$$S_{match}(NP, C) = \frac{|\{w \in NP \mid w \text{ is matched by } C\}|}{|NP|}$$

S_sim is defined as the negative of the number of refinement operations in the concept:

$$S_{sim}(C) = -|\{d \mid d \in C\}|$$

Step 3.5: among the description logic concepts obtained after all words have been parsed in the parsing order, retain the one with the highest score as the output C_best, i.e. the description logic concept to which the noun phrase is mapped, which is used for semantic understanding of the noun phrase over the knowledge base.

Compared with the prior art, the invention has the beneficial effects that:

(1) Few existing semantic parsing works use a syntax tree for semantic parsing. Some methods jointly train the syntax tree and the semantic parsing result, or constrain the decoding process of a generative model with syntactic information; these are mainly supervised or semi-supervised learning methods, depend on training data sets, and give unsatisfactory parsing results. In the absence of a training data set for noun phrases, the invention uses the extension of noun phrases to automatically process complex noun phrases containing implicit relations with a lightweight method, thereby realizing an unsupervised lightweight algorithm;

(2) Existing methods that use extension do not consider relation parsing for complex noun phrases. The core reason lies in their purpose: existing unsupervised methods that use extension basically aim to extract triples to supplement a knowledge base, so they do not consider complex phrases. The invention aims to understand noun phrases, in particular complex noun phrases with nested relations, using the resources of a given knowledge base. It exploits the grammatical information of noun phrases, improves the quality and efficiency of the generated description logic concepts by constraining grammatical consistency and the parsing order, and is capable of processing noun phrases with nested relations;

(3) The invention uses extension-based statistical indicators and index matching indicators to select high-quality description logic concepts by multidimensional scoring, thereby improving the accuracy of mapping noun phrases to description logic concepts. Using randomly sampled high-quality Wikipedia categories, manually annotated with interpretations, as the data set, with the validation set and test set divided by 5, the following results are obtained: the generated EL++ description logic concepts have a complete match rate of 0.53 and a partial match rate of 0.71.

Drawings

FIG. 1 is a schematic flow diagram of the process of the present invention.

Detailed Description

The invention relates to a method for mapping noun phrases to EL++ description logic concepts through extension: a noun phrase is mapped, through its extension, to a concept expressed in the description logic language EL++, thereby generating an understanding of the noun phrase over a given knowledge base, DBpedia, so that a computer can better understand the noun phrase.

Step 1: use a natural language processing tool to perform word segmentation and lemmatization on the noun phrase, and enumerate all text segments T over the segmented word sequence. A text segment is a run of consecutive words in the noun phrase, i.e. the segments formed by all n-grams. The lemmatized text segments T_lemma are needed because the keys used when building the index dictionary, i.e. the text aliases, may be in base form: for example, "French" cannot be indexed to the entity "dbr:France", whereas the base-form segment "France", i.e. the text segment after lemmatization, can. Index the text segments T and the corresponding lemmatized text segments T_lemma to the resources of the given knowledge base, and generate a mapping table from the text segments to the resources in the knowledge base. Resources include entities, literals, attributes and types.
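Step 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the helper names `enumerate_segments` and `build_mapping_table` are hypothetical, keys are lower-cased for lookup (a choice made here, not stated in the text), and the small `INDEX` dictionary stands in for the offline index; of its entries only dbo:Film and dbr:Film are mentioned in this document, the rest are invented for the example.

```python
from typing import Dict, List, Set, Tuple

Span = Tuple[int, int]  # (start, end) word positions, end exclusive


def enumerate_segments(words: List[str]) -> List[Span]:
    """Every run of consecutive words, i.e. every n-gram of the phrase."""
    n = len(words)
    return [(s, e) for s in range(n) for e in range(s + 1, n + 1)]


def build_mapping_table(
    words: List[str], lemmas: List[str], index: Dict[str, Set[str]]
) -> Dict[Span, Set[str]]:
    """Map each segment to the resources indexed by its surface or lemmatized text."""
    table: Dict[Span, Set[str]] = {}
    for s, e in enumerate_segments(words):
        surface = " ".join(words[s:e]).lower()
        lemma = " ".join(lemmas[s:e]).lower()
        resources = index.get(surface, set()) | index.get(lemma, set())
        if resources:
            table[(s, e)] = resources
    return table


words = ["Films", "based", "on", "works", "by", "Alice", "Munro"]
lemmas = ["film", "base", "on", "work", "by", "alice", "munro"]

# Stand-in for the offline index dictionary (hypothetical entries).
INDEX = {
    "film": {"dbo:Film", "dbr:Film"},
    "work": {"dbo:Work"},
    "based on": {"dbo:basedOn"},
    "alice munro": {"dbr:Alice_Munro"},
}

table = build_mapping_table(words, lemmas, INDEX)
for (s, e), resources in sorted(table.items()):
    print(" ".join(words[s:e]), "->", sorted(resources))
```

A phrase of n words has n(n+1)/2 segments, so the enumeration stays small for typical noun phrases; "Films" only reaches its resources through the lemmatized key "film", which is the case the T_lemma segments are meant to cover.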

Step 2: perform part-of-speech tagging on the segmented words of the noun phrase and generate a syntax tree; recursively traverse the whole tree from the root, and take the order in which the leaf nodes, i.e. the individual words, are visited as the parsing order, specifically as follows.

Step 2.1: generate the syntax tree of the noun phrase using a natural language processing tool;

Step 2.2: recursively traverse the whole tree from the root and take the traversal order of the leaf nodes as the parsing order. In the syntax tree every leaf node is a word, so the generated parsing sequence is a permutation of the words of the phrase. The current parsable word is the word currently to be parsed, and the words are parsed one after another in the parsing order.

Furthermore, when the parsing order is generated, the head of a noun phrase is defined as the last word of its first noun group, where a noun group is a compound nominal formed by nouns modifying other nouns; all noun groups of the noun phrase are obtained through part-of-speech analysis. The traversal rules are as follows. For a noun phrase node in the syntax tree, the child node containing the head of the current noun phrase is parsed first, as a new noun phrase node; then the child nodes to the left of the head are parsed from right to left; finally the child nodes to the right of the head are parsed from left to right; that is, parsing proceeds outward from the head, from near to far. For a node that begins with a verb or adverb, the verb or adverb is parsed first, and then the remaining parts are parsed in the original order, i.e. the left-to-right or right-to-left order in effect at the parent node. For a node that begins with an adjective, since the adjective necessarily modifies the head, the parts of the phrase other than the adjective are parsed first in the original order, and the adjective is parsed last.
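The traversal rules can be sketched as a short recursive function. This is one possible reading, not the inventors' implementation: the tree encoding, the function names, the use of Penn Treebank style tags (`NN*`, `VB*`, `RB*`, `JJ*`) and the choice to treat the root `TOP` as a noun phrase node are all assumptions made for the example.

```python
NOUN_PHRASE_LABELS = {"TOP", "NP"}  # assumption: the root is treated as a noun phrase node


def number_leaves(node, counter=None):
    """Attach a running index to every leaf so that repeated words stay distinct."""
    counter = [0] if counter is None else counter
    label, body = node
    if isinstance(body, str):              # leaf: (POS tag, word)
        leaf = (label, body, counter[0])
        counter[0] += 1
        return leaf
    return (label, [number_leaves(child, counter) for child in body])


def leaves(node):
    """Leaves of a numbered tree in surface (left-to-right) order."""
    if len(node) == 3:
        return [node]
    return [leaf for child in node[1] for leaf in leaves(child)]


def head_leaf(node):
    """Last word of the first noun group (first maximal run of noun-tagged words)."""
    head = None
    for leaf in leaves(node):
        if leaf[0].startswith("NN"):
            head = leaf
        elif head is not None:
            break
    return head


def parse_order(node, left_to_right=True):
    """Return the leaves of a numbered tree in parsing order."""
    if len(node) == 3:
        return [node]
    label, children = node
    head = head_leaf(node) if label in NOUN_PHRASE_LABELS else None
    if head is not None:
        # Head child first, then left siblings right-to-left, then right siblings left-to-right.
        idx = next(i for i, c in enumerate(children) if head in leaves(c))
        out = parse_order(children[idx])
        for child in reversed(children[:idx]):
            out += parse_order(child, left_to_right=False)
        for child in children[idx + 1:]:
            out += parse_order(child, left_to_right=True)
        return out
    ordered = children if left_to_right else children[::-1]
    out = [leaf for child in ordered for leaf in parse_order(child, left_to_right)]
    first = leaves(node)[0]
    rest = [leaf for leaf in out if leaf != first]
    if first[0].startswith(("VB", "RB")):  # verb or adverb first
        return [first] + rest
    if first[0].startswith("JJ"):          # adjective last
        return rest + [first]
    return out


raw_tree = ("TOP", [
    ("NP", [("NNS", "Films")]),
    ("VP", [
        ("VBN", "based"),
        ("PP", [
            ("IN", "on"),
            ("NP", [("NNS", "works")]),
            ("PP", [("IN", "by"), ("NP", [("NNP", "Alice"), ("NNP", "Munro")])]),
        ]),
    ]),
])
order = [leaf[1] for leaf in parse_order(number_leaves(raw_tree))]
print(order)
```

For the phrase "Films based on works by Alice Munro" the sketch visits the head "Films" first and reaches "Munro" before "Alice", because within the noun group "Alice Munro" the head is the last noun and its left neighbours are visited afterwards.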

Step 3: for a parsable word, resources are indexed from all text segments containing it to generate refinement operations, and the corresponding description logic concepts are obtained by applying those operations. Following the parsing order and starting from the EL++ top concept ⊤, the concept is refined using the basic concepts generated from the indexed resources, and each parsable word is parsed in turn by refinement operations until all words have been parsed, giving the description logic concept to which the noun phrase is mapped, specifically as follows.

Step 3.1: for the current parsable word, list all candidate text segments, i.e. all text segments that contain the parsable word.
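Listing the candidate segments for one word amounts to choosing a start position at or before the word and an end position after it. A minimal sketch follows; the function name `segments_containing` is hypothetical.

```python
from typing import List


def segments_containing(words: List[str], i: int) -> List[str]:
    """All runs of consecutive words that include the word at position i."""
    n = len(words)
    return [
        " ".join(words[s:e])
        for s in range(i + 1)          # start at or before position i
        for e in range(i + 1, n + 1)   # end (exclusive) after position i
    ]


words = ["Films", "based", "on", "works", "by", "Alice", "Munro"]
for segment in segments_containing(words, 0):
    print(segment)
```

The word at position i in an n-word phrase belongs to (i + 1)(n - i) segments, so the first word of the seven-word example has seven candidates, from "Films" up to the whole phrase.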

Step 3.2: index the text segments to the corresponding resources according to the resource mapping table obtained in step 1, and generate all candidate refinement operations from those resources.

The basic concepts in the description logic language EL++ take 5 forms: the individual {o}, the atomic concept A, the form corresponding to a role, ∃r.⊤, and the forms with a hidden role name, ∃r.{o} and ∃r.A.

The resources comprise entities, literals, attributes and types. For an indexed entity or literal, the corresponding forms are generated, including {o} and ∃r.{o}; for an indexed type, the corresponding forms are generated, including A and ∃r.A; for an indexed attribute, the corresponding form ∃r.⊤ is generated.

A refinement operation d is defined as follows: for a concept C, a part A of C is modified with a basic concept B. For a known basic concept B and a known concept C, all possible refinement operations are generated by enumerating the part A of C to be refined. For an indexed entity or literal o, the corresponding basic concept {o} and all basic concepts ∃r.{o} containing a hidden role are generated, and all refinement operations whose support degree is not 0 are generated. For an indexed type A, the corresponding basic concept A and all basic concepts ∃r.A containing a hidden role are generated, and all refinement operations whose support degree is not 0 are generated. For an indexed attribute p corresponding to a role r, the corresponding basic concept ∃r.⊤ is generated, and all refinement operations whose support degree is not 0 are generated. The support degree here is the support degree of the support set of the refinement operation: for the noun phrase NP with extension NP^I and the refinement operation d, it is the support degree of the support set Set_sup(NP, d).

Step 3.3: perform consistency screening on the newly generated candidate refinement operations and screen out those that are inconsistent with the syntax:

if the word currently being parsed is the head, refinement operations of the form in which the refining basic concept B_atomic is an atomic concept are preferred; conversely, if the word currently being parsed is not the phrase head, refinement operations not of that form are preferred.

Step 3.4: apply the refinement operations obtained in step 3.3 to generate refined description logic concepts for the current parsable word, score the resulting concepts and retain the top k by score; then check whether parsing is complete, i.e. whether the word just parsed is the last word in the parsing order; if not, return to step 3.1 to parse the next word; if so, go to step 3.5.

The scoring function of a description logic concept is:

$$S_{score}(NP, C) = w_{sup} \cdot S_{sup}(NP, C) + w_{match} \cdot S_{match}(NP, C) + w_{sim} \cdot S_{sim}(NP, C)$$

where S_sup is the support score, S_match is the matching score, S_sim is the simplicity score, and w_sup, w_match and w_sim are the corresponding weights.

The support score S_sup of a description logic concept is defined as the mean of the smoothed support degrees of the support sets of the refinement operations performed while generating the concept. Let NP be the given noun phrase and d a refinement operation. NP^I is the set of entities described by the noun phrase, i.e. the extension of the phrase; for a concept C, C^I is the set of entities described by C; for a basic concept B, B^I is the set of entities described by B. The refinement operation d modifies a part A of the concept C with the basic concept B, and its support set Set_sup(NP, d) is defined by two cases (the defining formula and its example concepts are not reproduced in this text):

in the first case, the part A modified by B describes the extension NP^I itself;

in the second case, the part A modified by B describes a set of entities related to the extension; for example, when the refined part is Work, it does not describe the extension NP^I itself but a set of entities with which the extension has a "basedOn" relationship.

S_sup is calculated from the support degrees of the support sets Set_sup(NP, d) of the refinement operations d that generate C, where ε is an empirical parameter, generally set to 1 (the formula is not reproduced in this text).

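S_match is defined as the proportion of words in the phrase that can be matched by the concept:

$$S_{match}(NP, C) = \frac{|\{w \in NP \mid w \text{ is matched by } C\}|}{|NP|}$$

S_sim is defined as the negative of the number of refinement operations in the concept:

$$S_{sim}(C) = -|\{d \mid d \in C\}|$$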
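The matching and simplicity scores can be computed directly once each refinement operation records which words of the phrase it accounts for. The sketch below assumes such a record; the `Refinement` class, its fields and the four example operations are hypothetical and only illustrate the two definitions.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Refinement:
    description: str                 # human-readable label of the operation
    matched_words: FrozenSet[int]    # positions of the phrase words it accounts for


def s_match(num_words: int, refinements: List[Refinement]) -> float:
    """Proportion of phrase words matched by at least one refinement operation."""
    covered = set().union(*(r.matched_words for r in refinements))
    return len(covered) / num_words


def s_sim(refinements: List[Refinement]) -> int:
    """Negative count of refinement operations: fewer operations score higher."""
    return -len(refinements)


# "Films based on works by Alice Munro": positions 0..6 (illustrative assignment).
operations = [
    Refinement("Film", frozenset({0})),
    Refinement("basedOn", frozenset({1, 2})),
    Refinement("Work", frozenset({3})),
    Refinement("{Alice_Munro}", frozenset({5, 6})),
]
print(s_match(7, operations), s_sim(operations))
```

With this assignment six of the seven words are covered, the word "by" being left unmatched, and four operations are used.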

Step 3.5: among the description logic concepts obtained after all words have been parsed in the parsing order, retain the one with the highest score S_score as the output C_best, i.e. the description logic concept to which the noun phrase is mapped, which is used for semantic understanding of the noun phrase over the knowledge base.
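Steps 3.1 to 3.5 together behave like a beam search over the parsing order: candidates are expanded word by word, only the k best are kept, and the best survivor is returned. The following sketch shows that control flow only. The `refine` and `score` callables, the toy `OPTIONS` table and its gains are hypothetical placeholders for the refinement generation and scoring function, and carrying the beam forward unchanged when a word yields no refinement is an assumption made here.

```python
import heapq
from typing import Callable, List, Sequence, Tuple

# Toy concept: (labels of the refinements applied so far, accumulated toy score).
Concept = Tuple[Tuple[str, ...], float]


def beam_refine(
    parse_sequence: Sequence[str],
    start: Concept,
    refine: Callable[[Concept, str], List[Concept]],
    score: Callable[[Concept], float],
    k: int = 5,
) -> Concept:
    """Refine along the parsing order, keeping the k best concepts after each word."""
    beam = [start]
    for word in parse_sequence:
        candidates = [new for concept in beam for new in refine(concept, word)]
        if candidates:  # assumption: a word with no refinement leaves the beam as is
            beam = heapq.nlargest(k, candidates, key=score)
    return max(beam, key=score)


# Hypothetical candidate refinements per word, with made-up score gains.
OPTIONS = {
    "Films": [("Film", 0.9), ("{Film}", 0.1)],
    "works": [("Work", 0.8), ("{Work}", 0.2)],
}


def refine(concept: Concept, word: str) -> List[Concept]:
    labels, total = concept
    return [(labels + (label,), total + gain) for label, gain in OPTIONS.get(word, [])]


def score(concept: Concept) -> float:
    return concept[1]


best = beam_refine(["Films", "by", "works"], ((), 0.0), refine, score, k=2)
print(best[0])
```

Keeping k candidates rather than one lets a refinement that looks weaker early on survive long enough to be rescued by later words.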

The invention is described in further detail below with reference to the figure and a specific embodiment. In the embodiment, the weight parameters are set to w_sup = 0.3, w_match = 0.5 and w_sim = 0.2, k is set to 5, and the 2016-10 version of DBpedia is used as the knowledge base in the experiments.

Examples

The input noun phrase is "Films based on works by Alice Munro", and the entity set described by the phrase is dbr:Away_from_Her, dbr:Edge_of_Madness.

The invention is described in further detail below with reference to an example, so that those skilled in the art can implement it by following the description.

With reference to fig. 1, the present invention specifically comprises the following steps:

Step 1: enumerate all text segments of the noun phrase and generate a mapping table from the text segments to resources in the knowledge base, specifically as follows:

Use a natural language processing tool to perform word segmentation and lemmatization on the noun phrase, obtaining the word sequence "[Films, based, on, works, by, Alice, Munro]" and the lemma sequence "[film, base, on, work, by, alice, munro]". Enumerate all text segments of the segmented word sequence together with the corresponding lemmatized text segments. Index the text segments T and the corresponding lemmatized text segments T_lemma to the corresponding knowledge base resources, where resources include entities, literals, attributes and types. The index dictionary used for indexing is built offline from the anchor texts, label attribute values and redirects in DBpedia, and is stored as (natural language text, resource) pairs for fast lookup during indexing.

Some of the text segments and the resources they are indexed to are shown in Table 1.

TABLE 1

Step 2: generate the parsing order from the word segmentation, part-of-speech tags and syntax tree of the noun phrase, specifically as follows:

Step 2.1: generate the syntax tree of the noun phrase using a natural language processing tool. The syntax tree of "Films based on works by Alice Munro" is "(TOP (NP (_ Films)) (VP (_ based) (PP (_ on) (NP (_ works)) (PP (_ by) (NP (_ Alice) (_ Munro))))))".

Step 2.2: recursively traverse the whole tree from the root, and take the traversal order of the leaf nodes (i.e. the individual words) as the parsing order.

The head of a noun phrase is defined as the last word of its first noun group. For a noun phrase node, the child node containing the head of the current noun phrase is parsed first, as a new noun phrase node; then the child nodes to the left of the head are parsed from right to left; finally the child nodes to the right of the head are parsed from left to right. That is, parsing proceeds outward from the head. For a node that begins with a verb or adverb, the verb or adverb is parsed first, and then the remaining parts are parsed in the original order. For a node that begins with an adjective, since the adjective necessarily modifies the head, the other parts of the phrase are parsed first in the original order, and the adjective is parsed last.

For "Films based on works by Alice Munro", the head "Films" is processed first, and then the part to the right of the head is processed from left to right. Since the first word of that node, "based", is a verb, "based" is processed first, followed by "on" in the original order. A new noun phrase, "works by Alice Munro", then appears. For this part, its head "works" is processed first, then "by" in order, and finally the new noun phrase "Alice Munro". The parsing order is therefore "Films", "based", "on", "works", "by", "Munro", "Alice".

Step 3: following the parsing order and starting from the EL++ top concept ⊤, the concept is continuously refined using the basic concepts generated from the indexed resources until all words have been parsed, giving the description logic concept to which the noun phrase is mapped, specifically as follows:

Step 3.1: for the current parsable word, generate all candidate text segments that contain it. For example, for the parsable word "Films", the candidate text segments include "Films", "Films based on", and so on.

Step 3.2: index the text segments to the corresponding resources according to the resource index obtained in step 1, and generate all candidate refinement operations from those resources. For the text segment "Films", the indexed resources include the type "dbo:Film", the entity "dbr:Film", the attribute "dbo:openingFilm" and so on; from these, basic concepts such as Film and {Film} are generated and used to refine the known concept, producing the candidate refinement operations (the example formulas are not reproduced in this text).

Step 3.3: perform consistency screening on the newly generated candidate refinement operations and screen out those inconsistent with the syntax. For the current head "Films", the refinement operations inconsistent with a head are screened out and the consistent ones are retained (the example formulas are not reproduced in this text).

Step 3.4: rank the refined concepts and keep the top k by score. For each concept, check whether parsing is complete; if not, go to step 3.1; if so, go to step 3.5.

At this point only one concept, Film, exists; since parsing is not yet complete, the process returns directly to step 3.1 to search for possible refinement operations again. After several iterations, a number of refined candidate concepts are obtained (their formulas are not reproduced in this text) and scored. For example, for one candidate the matching score is 6/7 = 0.857, the support score is 0.83 and the simplicity score is -4, giving a calculated score of -0.1225. The two highest-scoring concepts are retained; parsing is then found to be complete, and the process proceeds to step 3.5.
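The arithmetic of this worked score can be checked in a few lines. The weights w_sup = 0.3 and w_match = 0.5 are the values stated for the embodiment; w_sim = 0.2 is an assumption, chosen because it is the value that reproduces the reported score of -0.1225 from the three component scores. The function name `total_score` is hypothetical.

```python
def total_score(
    s_sup: float,
    s_match: float,
    s_sim: float,
    w_sup: float = 0.3,
    w_match: float = 0.5,
    w_sim: float = 0.2,  # assumed value, inferred from the worked example
) -> float:
    """Weighted sum of the support, matching and simplicity scores."""
    return w_sup * s_sup + w_match * s_match + w_sim * s_sim


# Component scores from the example: support 0.83, matching 0.857, simplicity -4.
print(round(total_score(0.83, 0.857, -4), 4))
```

Because the simplicity score is a negative count, the total can be negative even for a good candidate; only the ranking among candidates matters.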

Step 3.5: of all the concepts obtained so far, the one with the highest score is retained as the output C_best. In this embodiment, the highest-scoring concept is output (its formula is not reproduced in this text).

Compared with the work of Cat2Ax and Pasca et al. (i.e. H-M decompose), the invention achieves higher accuracy and gives more complete and better description logic concept mappings, as shown in Table 2.

TABLE 2

Method	Partial match rate	Complete match rate
The invention	0.71	0.53
Cat2Ax	0.42	0.21
H-M decompose*	0.36	0.29

Claims

1. A method for mapping noun phrases to description logic concepts based on extension, characterized in that a noun phrase is mapped, through its extension, to a concept expressed in the description logic language EL++, thereby generating an understanding of the noun phrase over a given knowledge base, comprising the following steps:

Step 1: perform word segmentation and lemmatization on the noun phrase; enumerate all text segments T over the segmented word sequence, i.e. the segments formed by all n-grams of the noun phrase, together with the corresponding lemmatized text segments T_lemma; index the text segments to resources of the knowledge base, and generate a mapping table from the text segments to the resources in the knowledge base;

Step 3: following the parsing order and starting from the EL++ top concept ⊤, continuously refine the concept using basic concepts generated from the indexed resources, parsing each parsable word in turn until all words have been parsed, to obtain the description logic concept to which the noun phrase is mapped:

Step 3.4: apply the refinement operations obtained in step 3.3 to generate refined description logic concepts for the current parsable word, score the resulting concepts and retain the top k by score; then check whether parsing is complete, i.e. whether the parsable word just parsed is the last one in the parsing order; if not, return to step 3.1 to parse the next parsable word; if so, go to step 3.5;

the scoring function of a description logic concept is:

$$S_{score}(NP, C) = w_{sup} \cdot S_{sup}(NP, C) + w_{match} \cdot S_{match}(NP, C) + w_{sim} \cdot S_{sim}(NP, C)$$

where S_sup is the support score, S_match is the matching score, S_sim is the simplicity score, and w_sup, w_match and w_sim are the corresponding weights,

NP^I is the set of entities described by the noun phrase, i.e. the extension of the phrase; for a concept C, C^I is the set of entities described by C; for a basic concept B, B^I is the set of entities described by B; a refinement operation d modifies a part A of the concept C with the basic concept B, and its support set Set_sup(NP, d) is defined by two cases (the defining formula is not reproduced in this text):

in the first case, the part A modified by B describes the extension NP^I itself;

in the second case, the part A modified by B describes a set of entities related to the extension;

S_sup is calculated from the support degrees of the support sets Set_sup(NP, d) of the refinement operations d on the concept C (the formula is not reproduced in this text); and the simplicity score is:

$$S_{sim}(C) = -|\{d \mid d \in C\}|$$

2. The method for mapping noun phrases to description logic concepts based on extension according to claim 1, characterized in that the resources include entities, literals, attributes and types.

3. The method for mapping noun phrases to description logic concepts based on extension according to claim 1, characterized in that, when the parsing order is generated, all noun groups of the noun phrase are obtained through part-of-speech analysis, and the head of the noun phrase is defined as the last word of its first noun group; for a noun phrase node in the syntax tree, the child node containing the head of the current noun phrase is parsed first, as a new noun phrase node, then the child nodes to the left of the head are parsed from right to left, and finally the child nodes to the right of the head are parsed from left to right, i.e. parsing proceeds outward from the head, from near to far; for a node that begins with a verb or adverb, the verb or adverb is parsed first, and then the remaining parts are parsed in the left-to-right or right-to-left order in effect at the parent node; for a node that begins with an adjective, the parts of the phrase other than the adjective are parsed first in the order in effect at the parent node, and the adjective is parsed last.

4. The method for mapping noun phrases to description logic concepts based on extension according to claim 1, characterized in that step 3.2 generates all candidate refinement operations from the corresponding resources, as follows:

the basic concepts in the description logic are defined to take 5 forms: the individual {o}, the atomic concept A, the form corresponding to a role, ∃r.⊤, and the forms with a hidden role name, ∃r.{o} and ∃r.A;

for an indexed entity or literal, the corresponding forms are generated, including {o} and ∃r.{o}; for an indexed type, the corresponding forms are generated, including A and ∃r.A; for an indexed attribute, the corresponding form ∃r.⊤ is generated;

a refinement operation is defined, and all refinement operations whose support degree is not 0 are generated; for an indexed attribute p corresponding to a role r, the corresponding basic concept ∃r.⊤ is generated, and all refinement operations whose support degree is not 0 are generated.

5. The method for mapping noun phrases to description logic concepts based on extension according to claim 1, characterized in that, in step 3.3, if the word currently being parsed is the head, refinement operations of the form in which the refining basic concept is an atomic concept are preferred.