CN101404033A - Automatic generation method and system for noumenon hierarchical structure - Google Patents

Automatic generation method and system for ontology hierarchical structure

Info

Publication number
CN101404033A
Authority
CN
China
Prior art keywords
property value
sentence
hierarchical structure
noumenon
automatic generation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA2008102263909A
Other languages
Chinese (zh)
Inventor
穗志方
赵庆亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2008-11-14
Filing date
2008-11-14
Publication date
2009-04-08
Application filed by Peking University
Priority to CNA2008102263909A
Publication of CN101404033A

Abstract

The invention relates to a method for automatically generating an ontology hierarchical structure. The method comprises the following steps: S1. an attribute value list is extracted for each concept from the Internet; S2. similar attribute values in the attribute value list are merged; S3. the attribute values in the attribute value list are filtered according to the domain characteristics of the concept; and S4. the concept hierarchy of the ontology is automatically generated from the merged and filtered attribute values. The invention also relates to a corresponding system. The method uses weighted attribute values as the feature vectors of the concepts and clusters the concepts with a clustering algorithm, which can greatly improve the accuracy of the results and makes the automatic generation of large-scale, practical ontologies feasible.

Description

Automatic generation method and system for an ontology hierarchical structure
Technical field
The present invention relates to the field of Internet technology, and in particular to a method and system for automatically generating the hierarchical structure of an ontology.
Background
An ontology is a semantic basis for communication (dialogue, interoperability, sharing, etc.) between different agents (people, machines, software systems, etc.) within a domain, which may be a specific application or a broader scope. Early ontologies were built by hand, which consumed large amounts of manpower, material and financial resources and took a long time, and this greatly limited the application of ontologies. Over the past 30 years or so, researchers have focused on automatic and semi-automatic ontology construction and have achieved many results. An important part of automatic ontology construction is the automatic generation of the concept hierarchy. The concept hierarchy is the basic framework with which an ontology organizes knowledge, and it is the core of automatic ontology construction. An efficient and accurate algorithm for generating the concept hierarchy is fundamental to the automatic construction of large-scale, practical ontologies, and it also helps make the ontology-based next-generation Internet, the Semantic Web, possible.
Current methods for automatically generating an ontology hierarchy are mainly based on patterns, formal concept analysis (FCA), or clustering. These methods suffer from low accuracy, which has become a bottleneck in automatic ontology construction.
Summary of the invention
The object of the present invention is to overcome the low accuracy of prior-art methods for automatically generating an ontology hierarchy.
To achieve this object, the technical solution of the present invention proposes a method for automatically generating an ontology hierarchical structure, comprising the following steps:
S1. extracting a list of attribute values for each concept from the Internet;
S2. merging similar attribute values in the attribute value list;
S3. filtering the attribute values in the attribute value list according to the domain characteristics of the concept;
S4. automatically generating the concept hierarchy of the ontology from the merged and filtered attribute values.
In the above method, step S1 specifically comprises:
S111. searching the Internet with "class name + attribute + seed" as the keywords and saving the result web pages;
S112. denoising the result web pages;
S113. selecting sentences from the denoised web pages according to preset conditions;
S114. extracting the phrases that appear in the same coordinate structure as the seed, and calculating their weights;
S115. determining whether a weight obtained in step S114 is higher than a preset threshold; if so, adding the phrase to the attribute value list as a new attribute value and returning to step S113; otherwise stopping.
In the above method, the preset conditions of step S113 comprise:
the sentence contains a coordinate structure;
a seed attribute value appears in the coordinate structure.
In the above method, step S1 further comprises:
S121. reading the sentences produced by step S113 and replacing the coordinate structure referred to in the preset conditions with an empty placeholder;
S122. extracting features of the sentences of step S121 according to a preset feature template;
A relatively simple feature template is used here so that the feature space does not become too sparse.
Figure A20081022639000061
where i ranges from 0 to length-1, length being the length of the sentence; word_i denotes each word in the sentence; word_{i-1}+word_i denotes the bigram formed by the previous word and the current word; pos_i denotes the part of speech of each word in the sentence; and combinations of parts of speech are defined in the same way as combinations of words.
S123. training a classifier on the sentence features generated in step S122 using a maximum entropy tool;
S124. searching the Internet with "class name + attribute" as the keywords and saving the result web pages;
S125. denoising the result web pages obtained in step S124, and classifying each sentence in the web pages with the classifier generated in step S123;
S126. counting word frequencies in the sentences classified as relevant in step S125;
S127. taking the several words with the highest frequency as seeds and repeating steps S111 to S115.
In the above method, step S2 specifically comprises:
S21. generating a global attribute value list;
S22. extracting the contexts in which each attribute value in the list occurs;
S23. extracting a feature set for each attribute value according to a preset feature template;
The feature set is as follows:
Figure A20081022639000071
Words and parts of speech are used as features here, which reduces the dimensionality of the feature space.
where i ranges from 0 to length-1, length being the length of the sentence; word_i denotes each word in the sentence; and pos_i denotes the part of speech of each word in the sentence.
S24. constructing feature vectors, and using vector distance and string similarity as the measures of similarity between two concepts;
S25. clustering with a nearest neighbor algorithm.
In the above method, step S3 specifically comprises:
filtering out common words using the term frequency/inverse document frequency algorithm.
In the above method, steps S2 and S3 may be executed in any order.
The technical solution of the present invention also proposes a system for automatically generating an ontology hierarchical structure, comprising:
an automatic attribute value extraction module, which extracts a list of attribute values for each concept from the Internet;
an attribute value clustering module, which merges similar attribute values in the attribute value list;
an attribute value filtering module, which filters the attribute values in the attribute value list according to the domain characteristics of the concept;
a concept hierarchy generation module, which automatically generates the concept hierarchy of the ontology from the merged and filtered attribute values.
The technical solution of the present invention uses weighted attribute values as the feature vectors of concepts and clusters the concepts with a clustering algorithm. This can significantly improve the accuracy of the results and thereby makes the automatic construction of large-scale, practical ontologies possible.
Description of drawings
Fig. 1 is a diagram of an embodiment of the system for automatically generating an ontology hierarchical structure according to the present invention;
Fig. 2 is a general framework diagram of the automatic attribute value extraction module in the embodiment of Fig. 1;
Fig. 3 is a schematic diagram of the weakly supervised method of the automatic attribute value extraction module in Fig. 2;
Fig. 4 is a schematic diagram of the unsupervised method of the automatic attribute value extraction module in Fig. 2;
Fig. 5 shows the tree structure obtained by the embodiment of Fig. 1 in the medical domain.
Embodiment
The following embodiment illustrates the present invention but does not limit its scope.
Fig. 1 is a diagram of an embodiment of the system for automatically generating an ontology hierarchical structure according to the present invention. As shown, the system of this embodiment comprises an automatic attribute value extraction module 101, an attribute value clustering module 102, an attribute value filtering module 103 and a concept hierarchy generation module 104, each of which is described below.
1) Automatic attribute value extraction module 101
Function: automatically extract the attribute value list of each term from the World Wide Web.
This is a Web-based method for automatically extending a large-scale ontology. Its input is the concept names, attribute names and attribute value seeds in the ontology; after relevant-sentence extraction and attribute value extraction, it outputs a set of candidate attribute values. The general framework is shown in Fig. 2. As Fig. 2 shows, the system takes the concept names in the ontology and a seed set as input. The query construction module builds queries consisting of concept name + attribute name and seeds; the Internet retrieval module then obtains a set of related web pages; the sentence classification module then obtains a set of relevant sentences; finally, the attribute extraction module extracts the attribute value set from the relevant sentences, which completes the task of automatically extracting attribute values from the Internet.
The Google API is used to obtain the original web pages. The 100 top-ranked pages in the search results are selected for information extraction. Before a page is used it must be denoised: first the DOM tree of the page is built, then the proportion of links under each leaf Table tag and leaf Div tag is computed; if the proportion of links exceeds 50%, the block is treated as noise and removed.
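The denoising rule above can be sketched in Python as follows. This is only an illustration under stated assumptions: the patent does not say how the "proportion of links" is measured, so the sketch assumes it is the share of text characters that sit inside `<a>` elements, and the names `Node`, `TreeBuilder` and `denoise` are hypothetical. It uses only the standard-library `html.parser` module and a simplified tree rather than a full DOM.

```python
from html.parser import HTMLParser

BLOCK_TAGS = {"div", "table"}
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class Node:
    """One element of a simplified DOM tree."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []  # child Node objects and text strings, in document order


class TreeBuilder(HTMLParser):
    """Builds the simplified tree while the page is parsed."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return  # void elements never contain text
        node = Node(tag, self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag; ignore stray end tags.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.current.children.append(text)


def text_lengths(node, in_link=False):
    """Return (all text characters, text characters inside links) below node."""
    in_link = in_link or node.tag == "a"
    total = linked = 0
    for child in node.children:
        if isinstance(child, str):
            total += len(child)
            linked += len(child) if in_link else 0
        else:
            sub_total, sub_linked = text_lengths(child, in_link)
            total += sub_total
            linked += sub_linked
    return total, linked


def has_nested_block(node):
    return any(
        not isinstance(c, str) and (c.tag in BLOCK_TAGS or has_nested_block(c))
        for c in node.children
    )


def is_noise(node, threshold):
    """A leaf div/table whose link share exceeds the threshold (assumed metric)."""
    if node.tag not in BLOCK_TAGS or has_nested_block(node):
        return False
    total, linked = text_lengths(node)
    return total > 0 and linked / total > threshold


def collect_text(node, threshold, out):
    for child in node.children:
        if isinstance(child, str):
            out.append(child)
        elif not is_noise(child, threshold):
            collect_text(child, threshold, out)


def denoise(html, threshold=0.5):
    """Return the page text with link-heavy leaf blocks removed."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    out = []
    collect_text(builder.root, threshold, out)
    return " ".join(out)
```

With this sketch, a navigation bar made only of links is dropped, while a content block that merely ends with a "more" link is kept.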
Attribute values of a given concept are extracted from a web page in two steps: a. find relevant sentences in the page; b. find attribute values in the relevant sentences.
For step a, a relevant sentence can be defined as one in which:
the sentence contains a coordinate structure (condition 1);
a seed attribute value appears in the coordinate structure (condition 2).
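Conditions 1 and 2 can be illustrated with a small Python sketch. The patent does not specify how coordinate structures are detected, so the sketch makes a simplifying assumption: the sentence is already tokenized, and a coordinate structure is a run of at least two items joined by separator tokens. The separator set and the function names are hypothetical.

```python
# Assumed separators that join the items of a coordinate structure.
SEPARATORS = {",", "、", "and", "or", "和"}


def coordinate_structures(tokens):
    """Return each run of two or more items joined by separator tokens."""
    runs = []
    i = 0
    n = len(tokens)
    while i < n:
        if tokens[i] in SEPARATORS:
            i += 1
            continue
        items = [tokens[i]]
        j = i + 1
        # Extend the run while the pattern "separator, item" continues.
        while j + 1 < n and tokens[j] in SEPARATORS and tokens[j + 1] not in SEPARATORS:
            items.append(tokens[j + 1])
            j += 2
        if len(items) >= 2:
            runs.append(items)
        i = j
    return runs


def is_relevant(tokens, seeds):
    """Condition 1: a coordinate structure exists; condition 2: it contains a seed."""
    return any(set(items) & set(seeds) for items in coordinate_structures(tokens))


def candidate_values(tokens, seeds):
    """Phrases that share a coordinate structure with at least one seed."""
    found = []
    for items in coordinate_structures(tokens):
        if set(items) & set(seeds):
            found.extend(item for item in items if item not in seeds)
    return found
```

For example, with the seed "cough", the tokenized sentence "symptoms include : cough , fever , headache etc ." is relevant and yields "fever" and "headache" as candidates.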
To obtain enough relevant sentences, and ultimately an accurate and complete attribute value set, from as few given attribute values as possible, this embodiment also proposes a method in which sentence selection and attribute value extraction interact. The algorithm is as follows: new attribute values found in sentences are added to the seed set, so that more sentences can be found and more attribute values can be extracted from them. The candidate attribute values are evaluated at the same time, and only candidates whose confidence is higher than a certain threshold are accepted. The whole process ends when no new candidate attribute values are produced. As shown in Fig. 3, it comprises the following steps:
S301. search the Internet with "class name + attribute + seed" as the keywords and save the related web pages;
S302. denoise the result web pages;
S303. select sentences according to conditions 1 and 2 above;
S304. extract the phrases that appear in the same coordinate structure as the seed, and calculate their weights according to the formulas given below;
S305. if new attribute values appear, return to step S303; otherwise stop.
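The loop formed by steps S303 to S305 can be sketched as follows. This is a minimal illustration, not the patented implementation: retrieval and denoising (S301, S302) are assumed to have already produced the coordinate structures as lists of phrases, and `score` is a hypothetical callback standing in for the weight calculation.

```python
def bootstrap(seeds, coordinate_runs, score, threshold):
    """Grow the seed set until no new attribute value is accepted.

    coordinate_runs: lists of phrases, one list per coordinate structure.
    score: hypothetical function phrase -> weight (stands in for S304).
    """
    accepted = set(seeds)
    changed = True
    while changed:  # S305: repeat while new values keep appearing
        changed = False
        for items in coordinate_runs:
            if not accepted.intersection(items):
                continue  # condition 2 fails: no seed in this structure
            for phrase in items:
                if phrase not in accepted and score(phrase) > threshold:
                    accepted.add(phrase)  # the new value becomes a seed
                    changed = True
    return accepted
```

Because accepted values are fed back as seeds, a phrase such as "headache" can be reached through "fever" even when it never co-occurs with the original seed "cough".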
To evaluate candidate attribute values, χ² can be used to calculate the weight of each phrase in a coordinate structure, and a phrase is added to the seed set if its weight is higher than a preset threshold. However, a phrase coordinated with a high-confidence seed should itself have high confidence, and a phrase coordinated with a low-confidence seed should have low confidence. Therefore, when a phrase and a seed attribute value appear in the same coordinate structure, the weight of the seed attribute value should also be added. The formulas are as follows:
Use χ² as the initial weight:
Figure A20081022639000101
where $m_{i,j} = \dfrac{\sum_{i} \mathrm{freq}_{i,j} \; \sum_{j} \mathrm{freq}_{i,j}}{\sum_{i,j} \mathrm{freq}_{i,j}}$
Iterative formula:
$\mathrm{weight}_{phrase} = \mathrm{weight}_{phrase} + \sum_{0}^{m} \mathrm{weight}_{phrase_m}$
where phrase is the target phrase and $\mathrm{weight}_{phrase}$ is its weight; $phrase_m$ is a phrase that appears in the same coordinate structure as phrase, and $\mathrm{weight}_{phrase_m}$ is its weight. $phrase_m$ is a seed phrase that co-occurs with the target phrase. That is, the weight of the target phrase is its initial weight plus the sum of the weights of all phrases coordinated with it.
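The weight calculation can be sketched in Python as below. The expected frequency m_{i,j} and the iterative update follow the formulas in the text. The χ² expression itself is only available as an image in the source, so the sketch assumes the standard Pearson form, the sum of (freq - m)² / m over all cells; the layout of the frequency table and the function names are likewise assumptions.

```python
def expected(freq):
    """m[i][j] = (row total i * column total j) / grand total."""
    row_totals = [sum(row) for row in freq]
    col_totals = [sum(col) for col in zip(*freq)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square(freq):
    """Pearson chi-square statistic (assumed form of the initial weight)."""
    m = expected(freq)
    return sum(
        (f - e) ** 2 / e
        for freq_row, m_row in zip(freq, m)
        for f, e in zip(freq_row, m_row)
        if e > 0
    )


def updated_weight(initial_weight, coordinated_weights):
    """Iterative formula: own weight plus the weights of coordinated phrases."""
    return initial_weight + sum(coordinated_weights)
```

For the table [[10, 20], [30, 40]] the expected frequencies are [[12, 18], [28, 42]], so a phrase whose observed counts are close to the expected ones receives a low initial weight.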
The method proposed above (hereinafter the weakly supervised method) fills in an attribute of a target class when a seed set of attribute values exists for that attribute. The difficulty is that, when a large-scale ontology is to be built automatically, it is impossible to specify an attribute value seed set for every attribute of every class. To solve this problem, an unsupervised attribute value extraction method is further proposed below. With it, when filling the whole ontology, a seed set of attribute values needs to be specified manually for a given attribute of only one class, and the values of that attribute for all other classes can then be filled in automatically.
While the weakly supervised method fills in one class, it produces a set of sentences. If certain patterns can be found in these sentences, the patterns can be used to judge whether a sentence describes a given kind of attribute, replacing conditions 1 and 2 of the weakly supervised method. To improve the correctness of the attribute value set, only candidate attribute values with high confidence are selected here; they are used as seeds, and automatic attribute value extraction is then completed with the weakly supervised method. The unsupervised attribute value extraction method is shown schematically in Fig. 4 and comprises the following steps:
S401. read the relevant sentences produced while the weakly supervised method fills in the attribute, and replace the coordinate structures in them with an empty placeholder;
S402. extract the features of the sentences from S401 according to the feature template;
S403. train a classifier on the sentence features generated in S402 using a maximum entropy tool;
S404. search the Internet with "class name + attribute" as the keywords and save the related web pages;
S405. denoise the result web pages, then classify each sentence in the pages with the classifier generated in S403;
S406. count word frequencies over the sentences classified as relevant in S405;
S407. take the 3 words with the highest frequency as the seed set and repeat the weakly supervised procedure.
Following the weakly supervised procedure yields a set of relevant sentences, denoted TrainingSentenceSet. The coordinate structure in each sentence of TrainingSentenceSet is replaced with a variable; for example, "... symptoms include: cough, fever, etc." becomes "... symptoms include: $, etc." The result is the part of the sentence that is independent of the specific case and reflects the sentence pattern of the current attribute relation. These sentences are then used as training examples to train a classifier. The feature templates considered in this embodiment are shown in Table 1.
Table 1 Feature templates for the sentence pattern classifier
Figure A20081022639000111
where i ranges from 0 to length-1, length being the length of the sentence; word_i denotes each word in the sentence; word_{i-1}+word_i denotes the bigram formed by the previous word and the current word; word_{i-1}+word_i+word_{i+1} denotes the trigram formed by the previous word, the current word and the next word; pos_i denotes the part of speech of each word in the sentence; and combinations of parts of speech are defined in the same way as combinations of words.
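The masking step and the n-gram style templates described above can be sketched in Python as follows. Table 1 is only available as an image, so the exact template numbering is unknown; the sketch simply generates word and part-of-speech n-grams up to a chosen order. The function names, the feature string format and the tag set in the example are assumptions.

```python
def mask_coordinate(tokens, start, end, placeholder="$"):
    """Replace tokens[start:end] (the coordinate structure) with one placeholder."""
    return tokens[:start] + [placeholder] + tokens[end:]


def ngram_features(tokens, tags, max_n=2):
    """Word and part-of-speech n-gram features, for n = 1 .. max_n.

    A feature "word2=a+b" is the bigram of the previous word and the
    current word; "pos2=..." is the same combination over tags.
    """
    features = []
    for name, sequence in (("word", tokens), ("pos", tags)):
        for n in range(1, max_n + 1):
            for i in range(n - 1, len(sequence)):
                gram = "+".join(sequence[i - n + 1 : i + 1])
                features.append(f"{name}{n}={gram}")
    return features
```

The resulting feature strings could then be passed to any maximum entropy (multinomial logistic regression) toolkit; the patent does not name the tool it uses, so no specific library is shown here.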
Because the sentence patterns that describe an attribute are limited, using too many templates would make the data sparse and harm the results. After a series of experiments, this embodiment finally selects templates 1, 2, 3 and 4 of Table 1. A classifier is trained with the maximum entropy algorithm. The algorithm of the weakly supervised method is still used to evaluate the candidate attribute values, which yields the seed set, and attribute filling is then carried out with the weakly supervised method.
In this embodiment, the unsupervised method is a further refinement of the weakly supervised method and must be built on it. In use, an attribute value seed set is first given for one concept, its attribute values are extracted with the weakly supervised method, and the set of sentences containing these attribute values is extracted at the same time. When the attribute values of other concepts are to be extracted, a seed set cannot be given for each concept, so the sentences obtained by the weakly supervised method are used as the training set, and the unsupervised method is used to extract the attribute value sets of the other concepts.
2) Attribute value clustering module 102
This module merges similar attribute values, which reduces dimensionality and improves accuracy. The algorithm can be expressed in pseudocode as follows:
i. Merge the attribute values of all concepts into one table to generate the global attribute value list, which is a lookup table of (attribute value, weight) pairs. It is built as follows: for each attribute value of each concept, if the attribute value does not yet appear in the global list, add an entry and set its weight to the weight of this attribute value; otherwise, increase the existing weight by the weight of this attribute value;
ii. Extract the contexts in which each attribute value in the list occurs; a context is defined as a sentence in which the attribute value occurs;
iii. Extract the feature set of each attribute value according to a feature template. The feature template is as follows:
Figure A20081022639000131
where i ranges from 0 to length-1, length being the length of the sentence; word_i denotes each word in the sentence; and pos_i denotes the part of speech of each word in the sentence.
iv. Construct feature vectors, and use vector distance and string similarity as the measures of similarity between two concepts;
v. Cluster with a nearest neighbor algorithm. Clustering algorithms are mature and are not described further here.
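Steps i, iv and v above can be sketched in Python as follows. The patent does not state how vector distance and string similarity are combined or which nearest neighbor variant is used, so the sketch makes several assumptions: cosine similarity over sparse feature vectors, `difflib.SequenceMatcher` for string similarity, an equal-weight mix of the two (`alpha=0.5`), and a single pass that attaches each value to the most similar existing cluster representative. All names and parameter values are hypothetical.

```python
import math
from difflib import SequenceMatcher


def build_global_list(concept_values):
    """Step i: merge {concept: {value: weight}} into one {value: weight} table."""
    global_weights = {}
    for values in concept_values.values():
        for value, weight in values.items():
            # Add a new entry, or increase the existing weight.
            global_weights[value] = global_weights.get(value, 0.0) + weight
    return global_weights


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity(a, b, vectors, alpha=0.5):
    """Step iv: mix of context-vector similarity and string similarity (assumed mix)."""
    string_sim = SequenceMatcher(None, a, b).ratio()
    return alpha * cosine(vectors[a], vectors[b]) + (1 - alpha) * string_sim


def merge_similar(values, vectors, threshold=0.7):
    """Step v: attach each value to its nearest cluster, or start a new one."""
    clusters = []
    for value in values:
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = similarity(value, cluster[0], vectors)
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append([value])
        else:
            best.append(value)
    return clusters
```

In this sketch "fever" and "high fever" end up in one cluster when they share context features, while "cough" stays separate.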
3) Attribute value filtering module 103
Because the ontology to be processed is domain-specific, attribute values that belong to the domain's terminology better reflect the characteristics of a concept. Whether an attribute value is domain-specific is judged mainly by the following means:
TF/IDF (term frequency/inverse document frequency) is used to filter out common words.
The weights of attribute values related to the domain are increased accordingly.
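The TF/IDF filter can be sketched in Python as follows. The patent names the technique but gives no formula or threshold, so the sketch assumes the common definition tf = count / document length and idf = log(N / document frequency), scores each term by its best value over all documents, and uses a hypothetical `min_score` cut-off.

```python
import math
from collections import Counter


def tf_idf_scores(documents):
    """Return {term: highest tf-idf over all documents}; documents are token lists."""
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    scores = {}
    for doc in documents:
        for term, count in Counter(doc).items():
            score = (count / len(doc)) * math.log(n_docs / doc_freq[term])
            scores[term] = max(scores.get(term, 0.0), score)
    return scores


def filter_common(values, documents, min_score):
    """Keep only attribute values whose tf-idf reaches min_score (assumed cut-off)."""
    scores = tf_idf_scores(documents)
    return [value for value in values if scores.get(value, 0.0) >= min_score]
```

A word that occurs in every document gets an idf of zero and is therefore always filtered out, which is the intended treatment of common words.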
4) Concept hierarchy generation module 104
The similarity of two concepts must be defined first. In this embodiment, the similarity of two concepts is defined on the basis of the following two assumptions:
a. the more similar two concepts are on their key attributes, the more similar the two concepts are;
b. the more similar attributes two concepts share, the more similar the two concepts are.
For assumption a, it is necessary to define which attributes are the key attributes of a concept. Automatic attribute value extraction and attribute value selection give the weight of each attribute value of a concept, which reflects how closely the attribute value is tied to the concept; the higher the weight of an attribute value, the more important it is.
For assumption b, the weights are uniformly amplified, so that when two concepts share an attribute value it is not ignored even if its weight is very small.
To balance the two assumptions, the following formula is used:
$\mathrm{Distance}(Concept_1, Concept_2) = \sum \sum \sqrt{\mathrm{weight}_i \cdot \mathrm{weight}_j}$
Because the weights of the attribute values have already been normalized, taking the square root achieves the amplification.
Next, the KNN algorithm is used to cluster the concepts. For a concept A, find the N concepts most similar to it (the similarity of two concepts is calculated as above) and aggregate these N+1 concepts into a cluster. The core of the cluster is the merge of the vectors in the cluster. The whole cluster is treated as one concept, and its core represents the concepts in the cluster in the next round of clustering. The algorithm can be expressed in pseudocode as follows:
i. take a concept A from the concept list;
ii. find the N concepts closest to A;
iii. add these N+1 concepts to a set S;
iv. the feature vector of S is the sum of the feature vectors of the elements of S;
v. take the merged result as input and repeat i to iv until the set S contains all the elements of the concept list.
The final result is a tree that contains all the concepts, with concepts as nodes, organized by hypernym-hyponym relations. An example from the medical domain is shown in Fig. 5.
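The clustering procedure i to v can be sketched in Python as follows. This is an illustration under assumptions: the double sum in the similarity formula is taken over the attribute values shared by both concepts, with a square root applied to the product of the two weights; the tree is represented as nested lists; and merged clusters are appended to the end of the work list. The function names are hypothetical.

```python
import math


def concept_similarity(a, b):
    """Sum of sqrt(weight_i * weight_j) over shared attribute values (assumed reading)."""
    return sum(math.sqrt(a[k] * b[k]) for k in a.keys() & b.keys())


def merge_vectors(vectors):
    """Step iv: the cluster vector is the sum of its members' vectors."""
    merged = {}
    for vector in vectors:
        for key, weight in vector.items():
            merged[key] = merged.get(key, 0.0) + weight
    return merged


def build_hierarchy(concepts, n_neighbors=1):
    """concepts: {name: {attribute_value: weight}}. Returns a nested-list tree."""
    items = [(name, dict(vector)) for name, vector in concepts.items()]
    if not items:
        return None
    while len(items) > 1:
        (tree_a, vec_a), rest = items[0], items[1:]          # step i
        rest.sort(key=lambda item: concept_similarity(vec_a, item[1]), reverse=True)
        nearest, remaining = rest[:n_neighbors], rest[n_neighbors:]  # step ii
        cluster_tree = [tree_a] + [tree for tree, _ in nearest]      # step iii
        cluster_vec = merge_vectors([vec_a] + [vec for _, vec in nearest])
        items = remaining + [(cluster_tree, cluster_vec)]            # step v
    return items[0][0]
```

With three toy concepts, "flu" and "cold" share the value "cough" and are grouped first, and "fracture" joins them only at the top level.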
The above is the preferred embodiment of the present invention. Based on the disclosure, those of ordinary skill in the art can readily conceive of equivalent or alternative solutions, all of which shall fall within the scope of protection of the present invention.

Claims (8)

1. A method for automatically generating an ontology hierarchical structure, characterized in that the method comprises the following steps:
S1. extracting a list of attribute values for each concept from the Internet;
S2. merging similar attribute values in the attribute value list;
S3. filtering the attribute values in the attribute value list according to the domain characteristics of the concept;
S4. automatically generating the concept hierarchy of the ontology from the merged and filtered attribute values.
2. The method for automatically generating an ontology hierarchical structure according to claim 1, characterized in that step S1 specifically comprises:
S111. searching the Internet with "class name + attribute + seed" as the keywords and saving the result web pages;
S112. denoising the result web pages;
S113. selecting sentences from the denoised web pages according to preset conditions;
S114. extracting the phrases that appear in the same coordinate structure as the seed, and calculating their weights;
S115. determining whether a weight obtained in step S114 is higher than a preset threshold; if so, adding the phrase to the attribute value list as a new attribute value and returning to step S113; otherwise stopping.
3. The method for automatically generating an ontology hierarchical structure according to claim 2, characterized in that the preset conditions of step S113 comprise:
the sentence contains a coordinate structure;
a seed attribute value appears in the coordinate structure.
4. The method for automatically generating an ontology hierarchical structure according to claim 3, characterized in that step S1 further comprises:
S121. reading the sentences produced by step S113 and replacing the coordinate structure referred to in the preset conditions with an empty placeholder;
S122. extracting features of the sentences of step S121 according to a preset feature template;
S123. training a classifier on the sentence features generated in step S122 using a maximum entropy tool;
S124. searching the Internet with "class name + attribute" as the keywords and saving the result web pages;
S125. denoising the result web pages obtained in step S124, and classifying each sentence in the web pages with the classifier generated in step S123;
S126. counting word frequencies in the sentences classified as relevant in step S125;
S127. taking the several words with the highest frequency as seeds and repeating steps S111 to S115.
5. The method for automatically generating an ontology hierarchical structure according to claim 1, characterized in that step S2 specifically comprises:
S21. generating a global attribute value list;
S22. extracting the contexts in which each attribute value in the list occurs;
S23. extracting a feature set for each attribute value according to a preset feature template;
S24. constructing feature vectors, and using vector distance and string similarity as the measures of similarity between two concepts;
S25. clustering with a nearest neighbor algorithm.
6. The method for automatically generating an ontology hierarchical structure according to claim 1, characterized in that step S3 specifically comprises:
filtering out common words using the term frequency/inverse document frequency algorithm.
7. The method for automatically generating an ontology hierarchical structure according to claim 5 or 6, characterized in that steps S2 and S3 may be executed in any order.
8. A system for automatically generating an ontology hierarchical structure, characterized in that the system comprises:
an automatic attribute value extraction module, which extracts a list of attribute values for each concept from the Internet;
an attribute value clustering module, which merges similar attribute values in the attribute value list;
an attribute value filtering module, which filters the attribute values in the attribute value list according to the domain characteristics of the concept;
a concept hierarchy generation module, which automatically generates the concept hierarchy of the ontology from the merged and filtered attribute values.
CNA2008102263909A 2008-11-14 2008-11-14 Automatic generation method and system for noumenon hierarchical structure Pending CN101404033A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNA2008102263909A CN101404033A (en) 2008-11-14 2008-11-14 Automatic generation method and system for noumenon hierarchical structure

Publications (1)

Publication Number Publication Date
CN101404033A true CN101404033A (en) 2009-04-08

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102236641A (en) * 2011-05-18 2011-11-09 安徽农业大学 Method for generating similarity matrix between concepts in agricultural field
CN102567314A (en) * 2010-12-07 2012-07-11 中国电信股份有限公司 Device and method for inquiring knowledge
CN102637202A (en) * 2012-03-15 2012-08-15 中国科学院计算技术研究所 Method for automatically acquiring iterative conception attribute name and system
CN102982095A (en) * 2012-10-31 2013-03-20 中国运载火箭技术研究院 Noumenon automatic generating system and method thereof based on thesaurus
CN103207856A (en) * 2013-04-03 2013-07-17 同济大学 Ontology concept and hierarchical relation generation method
CN103324628A (en) * 2012-03-21 2013-09-25 腾讯科技(深圳)有限公司 Industry classification method and system for text publishing
CN104077295A (en) * 2013-03-27 2014-10-01 百度在线网络技术(北京)有限公司 Data label mining method and data label mining system
CN102129422B (en) * 2010-01-14 2015-10-14 富士通株式会社 Template extraction method and apparatus
CN105824828A (en) * 2015-01-06 2016-08-03 深圳市腾讯计算机系统有限公司 Label excavation method and apparatus
CN108536664A (en) * 2017-03-01 2018-09-14 华东师范大学 The knowledge fusion method in commodity field
CN108694208A (en) * 2017-04-11 2018-10-23 富士通株式会社 Method and apparatus for constructs database
CN110874395A (en) * 2019-10-14 2020-03-10 中国船舶重工集团公司第七0九研究所 Abstract concept instantiation method based on context correlation
CN111813896A (en) * 2020-07-13 2020-10-23 重庆紫光华山智安科技有限公司 Text triple relation identification method and device, training method and electronic equipment

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102129422B (en) * 2010-01-14 2015-10-14 富士通株式会社 Template extraction method and apparatus
CN102567314A (en) * 2010-12-07 2012-07-11 中国电信股份有限公司 Device and method for inquiring knowledge
CN102567314B (en) * 2010-12-07 2015-03-04 中国电信股份有限公司 Device and method for inquiring knowledge
CN102236641B (en) * 2011-05-18 2015-02-04 安徽农业大学 Method for generating similarity matrix between concepts in agricultural field
CN102236641A (en) * 2011-05-18 2011-11-09 安徽农业大学 Method for generating similarity matrix between concepts in agricultural field
CN102637202A (en) * 2012-03-15 2012-08-15 中国科学院计算技术研究所 Iterative method and system for automatically acquiring concept attribute names
CN103324628A (en) * 2012-03-21 2013-09-25 腾讯科技(深圳)有限公司 Industry classification method and system for text publishing
CN103324628B (en) * 2012-03-21 2016-06-08 腾讯科技(深圳)有限公司 Industry classification method and system for text publishing
CN102982095B (en) * 2012-10-31 2015-08-19 中国运载火箭技术研究院 Ontology automatic generation system based on thesaurus and method thereof
CN102982095A (en) * 2012-10-31 2013-03-20 中国运载火箭技术研究院 Noumenon automatic generating system and method thereof based on thesaurus
CN104077295A (en) * 2013-03-27 2014-10-01 百度在线网络技术(北京)有限公司 Data label mining method and data label mining system
CN103207856A (en) * 2013-04-03 2013-07-17 同济大学 Ontology concept and hierarchical relation generation method
CN103207856B (en) * 2013-04-03 2015-10-28 同济大学 Ontology concept and hierarchical relation generation method
CN105824828A (en) * 2015-01-06 2016-08-03 深圳市腾讯计算机系统有限公司 Label mining method and apparatus
CN105824828B (en) * 2015-01-06 2020-01-10 深圳市腾讯计算机系统有限公司 Label mining method and device
CN108536664A (en) * 2017-03-01 2018-09-14 华东师范大学 Knowledge fusion method for the commodity field
CN108694208A (en) * 2017-04-11 2018-10-23 富士通株式会社 Method and apparatus for constructing a database
CN110874395A (en) * 2019-10-14 2020-03-10 中国船舶重工集团公司第七0九研究所 Abstract concept instantiation method based on context correlation
CN110874395B (en) * 2019-10-14 2022-05-31 中国船舶重工集团公司第七0九研究所 Abstract concept instantiation method based on context correlation
CN111813896A (en) * 2020-07-13 2020-10-23 重庆紫光华山智安科技有限公司 Text triple relation identification method and device, training method and electronic equipment

Similar Documents

Publication Publication Date Title
CN101404033A (en) Automatic generation method and system for noumenon hierarchical structure
CN103544255B (en) Text semantic relativity based network public opinion information analysis method
CN102591988B (en) Short text classification method based on semantic graphs
CN105045875B (en) Personalized search and device
CN111190900B (en) JSON data visualization optimization method in cloud computing mode
Chen et al. Websrc: A dataset for web-based structural reading comprehension
CN105574090A (en) Sensitive word filtering method and system
CN105378731A (en) Correlating corpus/corpora value from answered questions
CN104391942A (en) Short text characteristic expanding method based on semantic atlas
CN104765769A (en) Short text query expansion and indexing method based on word vector
CN101593200A (en) Chinese Web page classification method based on keyword frequency analysis
CN106844632A (en) Product review sentiment classification method and device based on improved SVM
CN101609450A (en) Web page classification method based on training set
CN103049569A (en) Text similarity matching method on basis of vector space model
CN103823824A (en) Method and system for automatically constructing text classification corpus by aid of internet
CN104991891A (en) Short text feature extraction method
CN106257441A (en) Training method for a skip language model based on word frequency
CN106484797A (en) Accident summary abstracting method based on sparse study
CN103646112A (en) Dependency parsing field self-adaption method based on web search
CN109522396B (en) Knowledge processing method and system for national defense science and technology field
CN102999533A (en) Textspeak identification method and system
CN105975547A (en) Approximate web document detection method based on content and position features
CN102314448A (en) Equipment for acquiring one or more key elements from document and method
CN114997288A (en) Design resource association method
KR101379128B1 (en) Dictionary generation device, dictionary generation method, and computer readable recording medium storing the dictionary generation program

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Open date: 20090408