CN102760140A - Incident body-based method for expanding searches - Google Patents

Incident body-based method for expanding searches

Info

Publication number
CN102760140A
Authority
CN
China
Prior art keywords
event class
event
importance degree
incident
class
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2011101108081A
Other languages
Chinese (zh)
Inventor
仲兆满
李存华
陈宗华
陈永江
管燕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
JIANGSU JINGE NETWORK TECHNOLOGY Co Ltd
Huaihai Institute of Technology
Original Assignee
JIANGSU JINGE NETWORK TECHNOLOGY Co Ltd
Huaihai Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by JIANGSU JINGE NETWORK TECHNOLOGY Co Ltd and Huaihai Institute of Technology
Priority to CN2011101108081A
Publication of CN102760140A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a query expansion method based on an event ontology. The method comprises the following steps: (1) inputting the trigger word, time, place, and person of the query event; (2) normalizing the format of the input time element; (3) expanding the input place element according to a place ontology; (4) determining the domain of the event ontology to which the query event belongs; (5) selecting the event ontology of that specific domain for query expansion; and (6) calculating the similarity between the query terms and each text, and outputting the retrieved texts in descending order of similarity. Because the method uses the idea of an event four-tuple and expands queries based on existing event-ontology semantic resources, it can significantly improve the accuracy of retrieval results for event-type information.

Description

A query expansion method based on an event ontology
Technical field
The invention belongs to the technical field of information retrieval, and specifically relates to a query expansion method based on an event ontology.
Background technology
In current information retrieval models and systems, a user's query request usually takes the form of keywords. Conventional information retrieval uses simple word-matching rules to compute the similarity between document features and query terms, so a document can usually be retrieved only if the query words appear in it. As a result, documents relevant to the user's query often cannot be retrieved because they use different words. Word mismatch has become one of the major factors limiting retrieval effectiveness, and query expansion is the technique commonly adopted at present to address it.
Query expansion means adding related words to the original query words to form a new, more accurate set of query words. Drawing on techniques from computational linguistics, information science, and other fields, it takes the user's original query as the basis and adds words related to it, so as to describe more completely the semantics or topic implied by the original query and to give the retrieval system more information for judging document relevance. It is an effective means of compensating for insufficient information in user queries and of improving the recall and precision of information retrieval. Its core problem is how to design and exploit the source of expansion words.
An ontology is a conceptual modeling tool that can describe an information system at the semantic and knowledge levels; it has a good concept hierarchy and supports logical reasoning. Its application in computing makes it possible to raise query expansion from the keyword level to the knowledge (or concept) level. Integrating ontologies into traditional information retrieval not only allows the information in documents to be processed at the semantic level, but also allows associative reasoning over the user's query content based on the ontology, yielding a more accurate query description.
As early as 1994, the paper "Query expansion using lexical-semantic relations" (author: Voorhees E.), published in the UK in the Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, proposed an ontology-based query expansion method. It used the concepts in an ontology to expand queries and concluded that the most effective form of expansion uses the synonym concepts and parent-child relations in the ontology.
In 2000, the article "Ontology and information retrieval" (author: Liao Minghong), published in China in the journal Computer Engineering, compared conceptualization with ontology, attempted a formal description of ontology, and on that basis discussed ontology-based information retrieval methods.
In 2003, the paper "An analysis of ontology-based query expansion strategies" (authors: Navigli R., Velardi P.), published in Canada in the Proceedings of the 1st International Workshop on Adaptive Text Extraction and Mining, proposed a query expansion method based on ontology glosses. The method assumes that similar concepts or terms in an ontology also have similar definitions, and uses WordNet to expand the glosses of the concepts in the ontology. When computing the similarity between expansion concepts, it calculates concept similarity statistically from the words or phrases that occur in the concept glosses.
In 2004, the article "Semantic distance norms computed from an electronic dictionary (WordNet)" (authors: Maki W., McKinley L., Thompson A.), published in the United States in the journal Behavior Research Methods, Instruments, & Computers, proposed an expansion method based on ontology structure. The basic idea is to use the structure graph of the ontology to expand the query. In the ontology structure graph, concept nodes are connected by paths; when the user's query content is expanded, concepts on the paths connected to the query's concept node can be selected.
In 2005, the article "Research on an ontology-based information retrieval model" (authors: Song Junfeng, Zhang Weiming, Xiao Weidong, Tang Jiuyang), published in China in the Journal of Nanjing University, proposed an ontology-based information retrieval model. The model adopts description logic as the ontology description language, annotates documents with the vocabulary defined in the ontology, and generates ontology-based logical views of the documents and of the user's information need, thereby achieving retrieval at the semantic level and improving retrieval performance to some degree.
In recent years, some scholars have begun to introduce the idea of events into ontology-based query expansion. In 2005, the paper "Event-based ontology design for retrieving digital archives on human religious self-help consulting" (authors: Lin H.F., Liang J.M.), published in Hong Kong in the Proceedings of the 2005 IEEE International Conference on e-Technology, e-Commerce and e-Service, proposed a retrieval technique called "event ontology". The top-level concepts of this ontology are the elements of an event (such as place and time); with the constituent elements of events as the main categories of the ontology, query words can be expanded through the event elements during retrieval.
In 2007, the paper "Reconstruction of people information based on an event ontology" (author: Han Y.), published in China in the Proceedings of the 2007 IEEE International Conference on Natural Language Processing and Knowledge Engineering, proposed an event-based person ontology model. The author holds that an ontology can be constructed from the relations between persons, that a person is also associated with specific events, and that events can serve as one class of attributes of a person.
Thus, ontology-based information retrieval has produced many research results, and event retrieval has also attracted the attention of some scholars. However, existing applications of ontologies to query expansion are still based on traditional concept ontologies. An event associates multiple elements such as time, place, and person, and is a larger semantic resource than a concept. An event ontology is an explicit, formal specification of a shared, objectively existing model of an event class system. Query expansion that uses an event ontology as its semantic resource is a topic that still requires study.
Summary of the invention
The technical problem to be solved by the invention is to address the deficiencies of the prior art by providing a query expansion method based on an event ontology. The method expands queries using existing event-ontology semantic resources and can improve the accuracy of event-type information queries.
To solve the above problem, the invention adopts the following technical solution:
A query expansion method based on an event ontology, with the following concrete steps:
(1) Input the four elements of the query event, namely trigger word, time, place, and person, into the respective designated query boxes;
(2) Normalize the format of the input time element, unifying it into a <year, month, day> triple;
(3) Expand the input place element according to the place ontology;
(4) Determine, from the trigger word of the input query event, the domain of the event ontology to which the query event belongs;
(5) Select the event ontology of that specific domain and perform query expansion;
(6) Calculate the similarity between the query terms and each text, and output the retrieved texts sorted in descending order of similarity.
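The time normalization of step (2) can be illustrated with a minimal Python sketch. The source does not say which input formats it accepts, so the format list `FORMATS` and the function name `normalize_time` are assumptions made purely for illustration.

```python
from datetime import datetime

# Hypothetical set of accepted input formats (an assumption, not from the source).
FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%Y.%m.%d", "%Y年%m月%d日")


def normalize_time(text):
    """Return the time element as a (year, month, day) triple."""
    cleaned = text.strip()
    for fmt in FORMATS:
        try:
            parsed = datetime.strptime(cleaned, fmt)
        except ValueError:
            continue  # this format does not match, try the next one
        return (parsed.year, parsed.month, parsed.day)
    raise ValueError(f"unrecognized time expression: {text!r}")


print(normalize_time("2008/5/12"))
print(normalize_time("2008年5月12日"))
```

Partial dates (for example, a year alone) are not handled here; how the method treats them is not stated in the source.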
The expansion of the input place element according to the place ontology in step (3) above has the following concrete steps:
(3-1) Find the input place element in the place ontology;
(3-2) Expand all sub-concepts of the input place element level by level.
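Steps (3-1) and (3-2) amount to a level-order (breadth-first) walk over the sub-concepts of the input place. The sketch below assumes the place ontology is available as a plain mapping from a place to its direct sub-concepts; the toy data in `LOCATION_ONTOLOGY` and the function name `expand_location` are hypothetical.

```python
from collections import deque

# Toy place ontology (hypothetical data): place -> direct sub-concepts.
LOCATION_ONTOLOGY = {
    "Sichuan": ["Chengdu", "Aba"],
    "Aba": ["Wenchuan", "Maoxian"],
    "Chengdu": [],
    "Wenchuan": [],
    "Maoxian": [],
}


def expand_location(place, ontology):
    """Return the place followed by all of its sub-concepts, level by level."""
    if place not in ontology:  # step (3-1): the place is not in the ontology
        return [place]
    expanded = []
    seen = {place}
    queue = deque([place])
    while queue:  # step (3-2): breadth-first order visits one level at a time
        current = queue.popleft()
        expanded.append(current)
        for child in ontology.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return expanded


print(expand_location("Sichuan", LOCATION_ONTOLOGY))
```

A place that is missing from the ontology is returned unchanged, which is one reasonable fallback; the source does not specify this case.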
The determination, in step (4) above, of the domain of the event ontology to which the query event belongs from the trigger word of the input query event has the following concrete steps:
(4-1) Sort the event classes of each domain event ontology. Suppose there are $n$ domain event ontologies, denoted $EQ_1, EQ_2, \ldots, EQ_n$; sorting the event classes of $EQ_i$ ($1 \le i \le n$) in descending order of importance yields the event class set $EC_i = \{EC_{i1}, EC_{i2}, \ldots, EC_{ij}, \ldots\}$;
(4-2) Compare the input event trigger word in turn with the event class set $EC_i$ of each domain event ontology $EQ_i$, and record the sequence number at which the trigger word occurs in $EC_i$ as $k_i$ ($1 \le i \le n$); if $EC_i$ does not contain the trigger word, set $k_i$ to the largest number the machine can represent;
(4-3) Finally, take the domain event ontology with the smallest sequence number $k_i$ as the event ontology to which the query event belongs.
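Steps (4-2) and (4-3) can be sketched as a search for the smallest rank position. The sketch assumes each domain's event classes are already sorted by importance and are represented by plain strings; the function name `select_domain` and the sample event classes are hypothetical, and `sys.maxsize` stands in for the "largest number the machine can represent".

```python
import sys


def select_domain(trigger, ranked_classes_by_domain):
    """Pick the domain in which the trigger word ranks highest.

    ranked_classes_by_domain maps a domain name to its event classes,
    most important first.
    """
    best_domain, best_rank = None, sys.maxsize
    for domain, ranked in ranked_classes_by_domain.items():
        # k_i: 1-based position of the trigger word, or the largest int if absent
        k = ranked.index(trigger) + 1 if trigger in ranked else sys.maxsize
        if k < best_rank:
            best_domain, best_rank = domain, k
    return best_domain, best_rank


# Hypothetical event classes for two of the domains named in the experiments.
domains = {
    "earthquake": ["quake", "rescue", "reconstruction"],
    "fire": ["burn", "evacuate", "rescue"],
}
print(select_domain("rescue", domains))
```

If the trigger word occurs in no domain, the sketch returns `(None, sys.maxsize)`; ties keep the first domain encountered. Neither case is specified in the source.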
The sorting of the event classes of a domain event ontology in step (4-1) above has the following concrete steps:
(4-1-1) Initialize the importance of each event class in the event ontology. The initialization formula for the importance of each event class in the event class set $EC_i$ is:
$$R(EC_{ij}) = \frac{1}{n}$$
where $R(EC_{ij})$ is the importance of event class $EC_{ij}$ and $n$ is the number of event classes in the event class set $EC_i$;
(4-1-2) Initialize the Authorities value and the Hubs value of each event class to 0;
(4-1-3) Calculate the Authorities value of each event class by the formula:
$$S_{ij} = \sum_{g \in In(EC_{ij})} R(EC_{ig})^{k-1} \times w_{gj}$$
where $S_{ij}$ is the Authorities value of event class $EC_{ij}$, $In(EC_{ij})$ is the set of event classes that link into $EC_{ij}$, $R(EC_{ig})^{k-1}$ is the importance of event class $EC_{ig}$ at iteration $k-1$, and $w_{gj}$ is the influence factor of event class $EC_{ig}$ on event class $EC_{ij}$;
(4-1-4) Calculate the Hubs value of each event class by the formula:
$$S_{oj} = \sum_{h \in Out(EC_{ij})} R(EC_{ih})^{k-1} \times w_{jh}$$
where $S_{oj}$ is the Hubs value of event class $EC_{ij}$, $Out(EC_{ij})$ is the set of event classes that $EC_{ij}$ links out to, $R(EC_{ih})^{k-1}$ is the importance of event class $EC_{ih}$ at iteration $k-1$, and $w_{jh}$ is the influence factor of event class $EC_{ij}$ on event class $EC_{ih}$;
(4-1-5) Calculate the importance of each event class by the formula:
$$R(EC_{ij})^{k} = R(EC_{ij})^{k-1} + d \times (\alpha \times S_{ij} + (1 - \alpha) \times S_{oj}) + \frac{1 - d}{n}$$
where $R(EC_{ij})^{k}$ is the importance of event class $EC_{ij}$ at iteration $k$; $d$ is the damping factor, with a value range of 0 to 1, usually $d = 0.85$; and $\alpha$ is the parameter that balances the Authorities value and the Hubs value, with $0 \le \alpha \le 1$. If $\alpha = 1$, only the Authorities value is used as the basis of the iteration; if $\alpha = 0$, only the Hubs value is used. To take both the Authorities value and the Hubs value of an event class into account, $\alpha = 0.5$ is usually chosen;
(4-1-6) Normalize the importance of each event class by the formula:
$$R(EC_{ij})^{k} = \frac{1}{\sum_{i=1}^{n} R(EC_{ij})^{k}} \times R(EC_{ij})^{k}$$
where $R(EC_{ij})^{k}$ is the importance of event class $EC_{ij}$ at iteration $k$, and $\sum_{i=1}^{n} R(EC_{ij})^{k}$ is the sum of the importance values of all event classes;
(4-1-7) Judge whether the importance of each event class satisfies the precision of iteration convergence, that is, whether event class $EC_{ij}$ satisfies the convergence precision $\varepsilon$. The test for continuing the iteration is:
$$|R(EC_{ij})^{k} - R(EC_{ij})^{k-1}| > \varepsilon$$
where $R(EC_{ij})^{k}$ is the importance of event class $EC_{ij}$ at iteration $k$, $R(EC_{ij})^{k-1}$ is its importance at iteration $k-1$, and $\varepsilon$ is the required convergence precision. If the inequality no longer holds, the importance of $EC_{ij}$ satisfies the convergence precision and the calculation of event class importance in the event ontology ends. If the inequality still holds, the convergence precision is not satisfied and execution returns to step (4-1-3); this loop repeats until the convergence precision $\varepsilon$ is satisfied, at which point the calculation of event class importance ends.
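Steps (4-1-1) to (4-1-7) can be read as the following iterative procedure. This is a sketch under stated assumptions rather than the patentees' implementation: the links between event classes are assumed to be given as a dictionary mapping `(source, target)` pairs to influence factors, and the function name `rank_event_classes`, the default `eps`, and the safety cap `max_iter` are all hypothetical.

```python
def rank_event_classes(classes, weights, d=0.85, alpha=0.5, eps=1e-8, max_iter=1000):
    """Rank event classes by iterated importance.

    classes: list of event class names.
    weights: dict mapping (g, j) to the influence factor of class g on class j.
    Returns (classes sorted by descending importance, importance dict).
    """
    n = len(classes)
    r = {c: 1.0 / n for c in classes}  # step (4-1-1)
    for _ in range(max_iter):
        auth = {c: 0.0 for c in classes}  # step (4-1-2)
        hub = {c: 0.0 for c in classes}
        for (g, j), w in weights.items():
            auth[j] += r[g] * w  # step (4-1-3): g links into j
            hub[g] += r[j] * w   # step (4-1-4): g links out to j
        new = {
            c: r[c] + d * (alpha * auth[c] + (1 - alpha) * hub[c]) + (1 - d) / n
            for c in classes
        }  # step (4-1-5)
        total = sum(new.values())
        new = {c: v / total for c, v in new.items()}  # step (4-1-6)
        # step (4-1-7): stop once no class changed by more than eps
        converged = all(abs(new[c] - r[c]) <= eps for c in classes)
        r = new
        if converged:
            break
    return sorted(classes, key=lambda c: r[c], reverse=True), r


# Hypothetical three-class chain: quake -> rescue -> rebuild.
order, scores = rank_event_classes(
    ["quake", "rescue", "rebuild"],
    {("quake", "rescue"): 0.8, ("rescue", "rebuild"): 0.6},
    alpha=1.0,
)
print(order)
```

With `alpha=1.0` only incoming links raise a class's importance, so a class with no incoming links (here `quake`) ends up ranked last.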
The selection of the event ontology of the specific domain for query expansion in step (5) above has the following concrete steps:
(5-1) Let the limit on the number of expansion terms be s and the number of expansion terms already selected be m. If m > s, stop expanding. Expand according to the elements contained in the event instances of the domain event ontology; if an element to be added is already contained in the input query terms, do not expand with that element;
(5-2) Expand according to the elements contained in the event classes of the domain event ontology; if m > s, stop expanding; if an element to be added is already contained in the input query terms, do not expand with that element;
(5-3) Expand according to the classification relations between the event classes of the domain event ontology, adding all sub-event classes under the event class; if m > s, stop expanding;
(5-4) Expand according to the association strength between the event classes of the domain event ontology; if m > s, stop expanding.
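The four stages (5-1) to (5-4) share one pattern: draw candidates from sources in a fixed priority order until the limit s is reached. The sketch below assumes the candidates of each stage have already been extracted from the ontology into lists; the function name `expand_query` and the sample terms are hypothetical. It also stops as soon as s terms have been chosen, which is one possible reading of the source's "if m > s, stop" check, and it skips duplicates in every stage rather than only in the first two.

```python
def expand_query(query_terms, stages, s):
    """Select at most s expansion terms from the stages, in priority order.

    stages: candidate lists for steps (5-1) to (5-4), i.e. instance elements,
    class elements, sub-event classes, and associated event classes.
    """
    chosen = []
    seen = set(query_terms)
    for candidates in stages:
        for term in candidates:
            if len(chosen) >= s:  # budget exhausted: stop expanding
                return chosen
            if term not in seen:  # skip terms already in the query or chosen
                seen.add(term)
                chosen.append(term)
    return chosen


# Hypothetical candidates for the query (earthquake, Wenchuan).
stages = [
    ["Wenchuan", "2008"],          # (5-1) elements of event instances
    ["magnitude", "casualties"],   # (5-2) elements of event classes
    ["aftershock"],                # (5-3) sub-event classes
    ["rescue", "reconstruction"],  # (5-4) associated event classes
]
print(expand_query(["earthquake", "Wenchuan"], stages, s=4))
```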
Compared with the prior art, the event-ontology-based query expansion method of the invention has the following technical effect: because it uses the idea of an event four-tuple and expands queries using existing event-ontology semantic resources, it can significantly improve the accuracy of event-type information query results when queries are run on the same query topics.
Description of drawings
Fig. 1 is a flowchart of the method of the invention;
Fig. 2 is a flowchart of step (4) in Fig. 1, determining the domain of the event ontology to which the query event belongs from the trigger word of the input query event;
Fig. 3 is a flowchart of step (4-1) in Fig. 2, sorting the event classes of a domain event ontology;
Fig. 4 is a flowchart of step (5) in Fig. 1, selecting the event ontology of the specific domain for query expansion.
Embodiment
The implementation of the invention is described in further detail below with reference to the drawings and embodiments.
Embodiment 1. With reference to Fig. 1, a query expansion method based on an event ontology comprises the following steps:
(1) Input the four elements of the query event: trigger word, time, place, and person, specifically:
(1-1) Enter the trigger word element of the event in the trigger word query box;
(1-2) Enter the time element of the event in the time query box;
(1-3) Enter the place element of the event in the place query box;
(1-4) Enter the person element of the event in the person query box.
(2) Normalize the format of the input time element, unifying it into a <year, month, day> triple.
(3) Expand the input place element according to the place ontology, specifically:
(3-1) Expand the place element according to its synonymy relations in the place ontology;
(3-2) Expand the place element according to its hyponymy relations in the place ontology.
(4) Determine, from the trigger word of the input query event, the domain of the event ontology to which the query event belongs. With reference to Fig. 2, the concrete steps are as follows:
(4-1) Sort the event classes of each domain event ontology. Suppose there are $n$ domain event ontologies, denoted $EQ_1, EQ_2, \ldots, EQ_n$; sorting the event classes of $EQ_i$ ($1 \le i \le n$) in descending order of importance yields the event class set $EC_i = \{EC_{i1}, EC_{i2}, \ldots, EC_{ij}, \ldots\}$;
(4-2) Compare the input event trigger word in turn with the event class set $EC_i$ of each domain event ontology $EQ_i$;
(4-3) If the event class set $EC_i$ contains the input event trigger word, record the sequence number at which the trigger word occurs in $EC_i$ as $k_i$ ($1 \le i \le n$);
(4-4) If the event class set $EC_i$ does not contain the input event trigger word, set the sequence number $k_i$ for that set to the largest number the machine can represent;
(4-5) Finally, take the domain event ontology with the smallest sequence number $k_i$ as the event ontology to which the query event belongs.
(5) Sort the event classes of the domain event ontology. With reference to Fig. 3, the concrete steps are as follows:
(4-1-1) Initialize the importance of each event class in the event ontology. The initialization formula for the importance of each event class in the event class set $EC_i$ is:
$$R(EC_{ij}) = \frac{1}{n}$$
where $R(EC_{ij})$ is the importance of event class $EC_{ij}$;
$n$ is the number of event classes in the event class set $EC_i$.
(4-1-2) Initialize the Authorities value and the Hubs value of each event class to 0;
(4-1-3) Calculate the Authorities value of each event class by the formula:
$$S_{ij} = \sum_{g \in In(EC_{ij})} R(EC_{ig})^{k-1} \times w_{gj}$$
where $S_{ij}$ is the Authorities value of event class $EC_{ij}$;
$In(EC_{ij})$ is the set of event classes that link into $EC_{ij}$;
$R(EC_{ig})^{k-1}$ is the importance of event class $EC_{ig}$ at iteration $k-1$;
$w_{gj}$ is the influence factor of event class $EC_{ig}$ on event class $EC_{ij}$.
(4-1-4) Calculate the Hubs value of each event class by the formula:
$$S_{oj} = \sum_{h \in Out(EC_{ij})} R(EC_{ih})^{k-1} \times w_{jh}$$
where $S_{oj}$ is the Hubs value of event class $EC_{ij}$;
$Out(EC_{ij})$ is the set of event classes that $EC_{ij}$ links out to;
$R(EC_{ih})^{k-1}$ is the importance of event class $EC_{ih}$ at iteration $k-1$;
$w_{jh}$ is the influence factor of event class $EC_{ij}$ on event class $EC_{ih}$.
(4-1-5) Calculate the importance of each event class by the formula:
$$R(EC_{ij})^{k} = R(EC_{ij})^{k-1} + d \times (\alpha \times S_{ij} + (1 - \alpha) \times S_{oj}) + \frac{1 - d}{n}$$
where $R(EC_{ij})^{k}$ is the importance of event class $EC_{ij}$ at iteration $k$;
$d$ is the damping factor, with a value range of 0 to 1, usually $d = 0.85$;
$\alpha$ is the parameter that balances the Authorities value and the Hubs value, with $0 \le \alpha \le 1$. If $\alpha = 1$, only the Authorities value is used as the basis of the iteration; if $\alpha = 0$, only the Hubs value is used. To take both the Authorities value and the Hubs value of an event class into account, $\alpha = 0.5$ is usually chosen.
(4-1-6) Normalize the importance of each event class by the formula:
$$R(EC_{ij})^{k} = \frac{1}{\sum_{i=1}^{n} R(EC_{ij})^{k}} \times R(EC_{ij})^{k}$$
where $R(EC_{ij})^{k}$ is the importance of event class $EC_{ij}$ at iteration $k$;
$\sum_{i=1}^{n} R(EC_{ij})^{k}$ is the sum of the importance values of all event classes.
(4-1-7) Judge whether the importance of each event class satisfies the precision of iteration convergence, that is, whether event class $EC_{ij}$ satisfies the convergence precision $\varepsilon$. The test for continuing the iteration is:
$$|R(EC_{ij})^{k} - R(EC_{ij})^{k-1}| > \varepsilon$$
where $R(EC_{ij})^{k}$ is the importance of event class $EC_{ij}$ at iteration $k$;
$R(EC_{ij})^{k-1}$ is the importance of event class $EC_{ij}$ at iteration $k-1$;
$\varepsilon$ is the required convergence precision. If the inequality no longer holds, the importance of $EC_{ij}$ satisfies the convergence precision and the calculation of event class importance in the event ontology ends. If the inequality still holds, the convergence precision is not satisfied and execution returns to step (4-1-3); this loop repeats until the convergence precision $\varepsilon$ is satisfied, at which point the calculation of event class importance ends.
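As a hedged numeric illustration of steps (4-1-1) to (4-1-6), assume a hypothetical ontology with only two event classes $EC_{i1}$ and $EC_{i2}$, a single link from $EC_{i1}$ to $EC_{i2}$ with influence factor $w_{12} = 1$, $d = 0.85$, and $\alpha = 1$ (so only the Authorities value contributes). None of these values come from the source; they are chosen only to keep the arithmetic short. One iteration then gives:

```latex
\begin{aligned}
R(EC_{i1})^{0} &= R(EC_{i2})^{0} = \tfrac{1}{2} \\
S_{i1} &= 0, \qquad S_{i2} = R(EC_{i1})^{0} \times w_{12} = 0.5 \\
R(EC_{i1})^{1} &= 0.5 + 0.85 \times (1 \times 0 + 0 \times S_{o1}) + \tfrac{1 - 0.85}{2} = 0.575 \\
R(EC_{i2})^{1} &= 0.5 + 0.85 \times (1 \times 0.5 + 0 \times S_{o2}) + \tfrac{1 - 0.85}{2} = 1.0 \\
\text{after normalization: } R(EC_{i1})^{1} &= \tfrac{0.575}{1.575} \approx 0.365, \qquad
R(EC_{i2})^{1} = \tfrac{1.0}{1.575} \approx 0.635
\end{aligned}
```

Under these assumptions the class that receives the link, $EC_{i2}$, already ranks above $EC_{i1}$ after the first iteration.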
(6) Select the event ontology of the specific domain and perform query expansion. With reference to Fig. 4, the concrete steps are as follows:
(5-1) Set the limit on the number of expansion terms to s, and count the number m of expansion terms already selected;
(5-2) Test whether m > s. If so, stop expanding; if not, continue expanding;
(5-3) Expand according to the elements contained in the event instances of the domain event ontology; if an element to be added is already contained in the input query terms, do not expand with that element;
(5-4) Test whether m > s. If so, stop expanding; if not, continue expanding;
(5-5) Expand according to the elements contained in the event classes of the domain event ontology; if an element to be added is already contained in the input query terms, do not expand with that element;
(5-6) Test whether m > s. If so, stop expanding; if not, continue expanding;
(5-7) Expand according to the classification relations between the event classes of the domain event ontology, adding all sub-event classes under the event class;
(5-8) Test whether m > s. If so, stop expanding; if not, continue expanding;
(5-9) Expand according to the association strength between the event classes of the domain event ontology.
(7) Calculate the similarity between the query terms and each text, and output the retrieved texts sorted in descending order of similarity.
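The source does not state which similarity measure the final step uses. As one possible illustration, the sketch below scores each text by cosine similarity between term-frequency vectors and sorts in descending order; the choice of cosine similarity, the whitespace tokenization, and the names `cosine` and `rank_texts` are all assumptions.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_texts(query_terms, texts):
    """Return (similarity, text) pairs sorted by descending similarity."""
    query_vec = Counter(query_terms)
    scored = [(cosine(query_vec, Counter(t.lower().split())), t) for t in texts]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical expanded query and corpus.
texts = [
    "earthquake rescue in wenchuan",
    "fire drill at school",
    "wenchuan earthquake reconstruction plan",
]
for score, text in rank_texts(["wenchuan", "earthquake", "reconstruction"], texts):
    print(round(score, 3), text)
```

A weighted variant, in which expansion terms carry lower weights than the original query terms, would fit the later remark that term ranking affects term weights, but the weighting scheme is not given in the source.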
Embodiment 2. A comparative query experiment between the method of Embodiment 1 and prior-art schemes.
Methods used in the comparative experiment:
1. The event-ontology-based query expansion method of Embodiment 1, abbreviated EOnto;
2. An event-oriented query expansion method based on local analysis, abbreviated LA-EO;
3. A local context analysis expansion method, abbreviated LCA.
Information queries were run on the same query topics, and the query accuracy of the above 3 methods was compared experimentally.
The inventors built event ontologies for 5 domains of emergency events: "earthquake", "fire", "food poisoning", "traffic accident", and "terrorist attack". The experimental corpus was therefore collected mainly around these 5 domains. Using the Google search engine with a number of query keywords, 1639 texts were collected; using a crawler tool, 2435 texts were downloaded from a number of designated websites. All texts were then deduplicated by title, leaving 4011 texts as the experimental corpus.
Query topics are set up in a way similar to the advanced search function of a search engine: the trigger word, time, place, and person of the query event are entered in separate text boxes. Ten query topics were set manually. For each query topic, two evaluation indices (Figure BDA0000058541690000121 and Figure BDA0000058541690000122) are used. The first index (Figure BDA0000058541690000123) simulates the results returned by a typical search engine; it is a human-oriented index and is widely used in current retrieval evaluation. The second index (Figure BDA0000058541690000124) considers only whether the retrieved results are relevant to the query topic, not the relevance order of the returned texts, and is easy to evaluate and implement.
The Pooling technique is used to determine the standard answers for each query topic. The concrete steps for one topic are: (1) take the top n texts returned by each of the 4 methods and merge them into a set S; (2) manually select the relevant documents from the text set S as the standard answers for the topic.
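Step (1) of the pooling procedure is a union of the top-n results of every method. A minimal sketch, assuming each method's output is a ranked list of document identifiers (the identifiers and the function name `build_pool` are hypothetical); the manual relevance judgment of step (2) is outside the scope of the code.

```python
def build_pool(runs, n):
    """Merge the top-n documents of every run into one pool, without duplicates."""
    pool = []
    seen = set()
    for ranked in runs:
        for doc_id in ranked[:n]:
            if doc_id not in seen:
                seen.add(doc_id)
                pool.append(doc_id)
    return pool


# Hypothetical ranked outputs of three methods for one query topic.
runs = [
    ["d1", "d2", "d3"],
    ["d2", "d4", "d1"],
    ["d5", "d1", "d6"],
]
print(build_pool(runs, n=2))
```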
Table 1 lists the 10 query topics used.
Table 1: The 10 query topics
Figure BDA0000058541690000131
For example, for the query topic "Wenchuan reconstruction", the top ten expansion terms obtained by the 3 expansion methods are shown in Table 2.
Table 2: The top 10 expansion terms obtained by the 3 expansion methods
Figure BDA0000058541690000132
As Table 2 shows, the expansion terms obtained by the different expansion methods differ considerably: 30% of the terms differ between EOnto and LA-EO, and 50% differ between EOnto and LCA. The rankings of the query terms obtained by the 3 methods also differ. Even where the same query terms are obtained, a different ranking changes the weights of the query terms, which in turn has a considerable effect on the computed similarity between query terms and texts.
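The percentages quoted above can be read as the share of one method's top-10 terms that do not appear in the other method's top-10 list. The sketch below shows that reading with placeholder term names; the actual terms of Table 2 are not reproduced here, and the function name `fraction_different` is hypothetical.

```python
def fraction_different(terms_a, terms_b):
    """Share of the terms in terms_a that do not occur in terms_b."""
    a, b = set(terms_a), set(terms_b)
    return len(a - b) / len(a) if a else 0.0


# Placeholder term lists: 7 of the 10 terms are shared.
method_a = [f"t{i}" for i in range(1, 11)]
method_b = [f"t{i}" for i in range(1, 8)] + ["u1", "u2", "u3"]
print(fraction_different(method_a, method_b))
```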
The number of expansion words for the 3 query expansion methods EOnto, LA-EO, and LCA was varied between 0 and 40 in the experiments. For each of the 3 methods, the best value of the result averaged over the 10 query topics was taken for comparison. Table 3 lists the comparison results.
Table 3: Comparison of the best retrieval performance obtained by the different expansion methods
Figure BDA0000058541690000141
As Table 3 shows, among the 3 query expansion methods, EOnto outperforms both LA-EO and LCA: the retrieval performance of EOnto is the best and that of LCA is the worst. On the two evaluation indices (Figure BDA0000058541690000142 and Figure BDA0000058541690000143), EOnto improves on LCA by 0.20 and 0.19 respectively, and on LA-EO by 0.06 and 0.10 respectively. The main reasons are as follows. On the one hand, LCA does not divide the query terms into the form of an event four-tuple and does not adopt an event-oriented association expansion strategy. On the other hand, EOnto is based on existing event-ontology semantic resources, whereas LA-EO is based on a local document set.

Claims (5)

1. the enquiry expanding method based on the incident body is characterized in that, its concrete steps are following:
(1), triggering speech, time, place, personage's four elements of difference input inquiry incident in the query frame of appointment;
(2), the form of element of time of input is carried out regular, unifiedly regularly be the form of < year, month, day>tlv triple;
(3), the place key element of input is expanded according to the place body;
(4), according to the triggering speech of query event of input, judge the field of the incident body under the query event;
(5), the incident body of choosing specific area carries out query expansion;
(6), calculate the similarity of query term and text, the resulting text that obtains is exported according to the big or small descending sort of similarity.
2. a kind of enquiry expanding method based on the incident body according to claim 1 is characterized in that, the place key element to input described in the above-mentioned steps (3) is expanded according to the place body, and its concrete steps are following:
(3-1), in the body of place, find the place key element of input;
(3-2), all sub-notions of the place key element of importing are expanded by level successively.
3. The query expansion method based on an event ontology according to claim 1, characterized in that the determination in step (4), from the entered trigger word of the query event, of the domain of the event ontology to which the query event belongs comprises the following specific steps:
(4-1) ordering the event classes of each domain event ontology: suppose there are $n$ domain event ontologies, denoted $EQ_1, EQ_2, \ldots, EQ_n$; sorting the event classes of $EQ_i$ ($1 \le i \le n$) in descending order of importance degree gives the event class set $EC_i = \{EC_{i1}, EC_{i2}, \ldots, EC_{ij}, \ldots\}$;
(4-2) comparing the entered event trigger word in turn with the event class set $EC_i$ of each domain event ontology $EQ_i$, and recording the sequence number at which the trigger word appears in $EC_i$ as $k_i$ ($1 \le i \le n$); if $EC_i$ does not contain the trigger word, $k_i$ is set to the machine's maximum number;
(4-3) finally, taking the domain event ontology with the smallest sequence number $k_i$ as the event ontology to which the query event belongs.
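The domain selection of steps (4-2) and (4-3) can be sketched as follows. The sketch assumes that each event class is identified by a single label compared with the trigger word by exact string match, that ties go to the first domain examined, and that `None` is returned when no domain contains the trigger word; none of these details is stated in the claims, and the function name `select_domain` is hypothetical.

```python
import sys
from typing import Dict, List, Optional


def select_domain(trigger: str, ranked: Dict[str, List[str]]) -> Optional[str]:
    """Pick the domain whose importance-ordered class list ranks the trigger highest.

    `ranked` maps a domain name to its event classes, already sorted in
    descending order of importance degree as in step (4-1).
    """
    best_domain, best_k = None, sys.maxsize
    for domain, classes in ranked.items():
        # k_i is the 1-based position of the trigger word; absent -> machine maximum.
        k = classes.index(trigger) + 1 if trigger in classes else sys.maxsize
        if k < best_k:  # strict comparison: the first domain wins ties (assumption)
            best_domain, best_k = domain, k
    return best_domain


# Hypothetical domains and event class labels.
DOMAINS = {
    "earthquake": ["quake", "rescue", "donate"],
    "traffic": ["collide", "injure", "rescue"],
}
```

Here "rescue" appears in both domains but ranks second in the earthquake ontology and third in the traffic ontology, so the earthquake ontology is chosen.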
4. The determination, according to claim 3, of the domain of the event ontology to which the query event belongs from the entered trigger word of the query event, characterized in that the ordering of the event classes of the domain event ontology in step (4-1) comprises the following specific steps:
(4-1-1) initializing the importance degree of each event class in the event ontology; the initialization formula for the importance degree of each event class in the event class set $EC_i$ is:
$$R(EC_{ij}) = \frac{1}{n}$$
where $R(EC_{ij})$ is the importance degree of event class $EC_{ij}$ and $n$ is the number of event classes in the event class set $EC_i$;
(4-1-2) initializing the Authorities value and the Hubs value of each event class to 0;
(4-1-3) computing the Authorities value of each event class, using the formula:
$$S_{ij} = \sum_{g \in In(EC_{ij})} R(EC_{ig})^{k-1} \times w_{gj}$$
where $S_{ij}$ is the Authorities value of event class $EC_{ij}$, $In(EC_{ij})$ denotes the set of event classes that link into $EC_{ij}$, $R(EC_{ig})^{k-1}$ is the importance degree of event class $EC_{ig}$ at the $(k-1)$-th iteration, and $w_{gj}$ is the influence factor of event class $EC_{ig}$ on event class $EC_{ij}$;
(4-1-4) computing the Hubs value of each event class, using the formula:
$$S_{oj} = \sum_{h \in Out(EC_{ij})} R(EC_{ih})^{k-1} \times w_{jh}$$
where $S_{oj}$ is the Hubs value of event class $EC_{ij}$, $Out(EC_{ij})$ denotes the set of event classes that $EC_{ij}$ links out to, $R(EC_{ih})^{k-1}$ is the importance degree of event class $EC_{ih}$ at the $(k-1)$-th iteration, and $w_{jh}$ is the influence factor of event class $EC_{ij}$ on event class $EC_{ih}$;
(4-1-5) computing the importance degree of each event class, using the formula:
$$R(EC_{ij})^{k} = R(EC_{ij})^{k-1} + d \times \left(\alpha \times S_{ij} + (1 - \alpha) \times S_{oj}\right) + \frac{1 - d}{n}$$
where $R(EC_{ij})^{k}$ is the importance degree of event class $EC_{ij}$ at the $k$-th iteration; $d$ is the damping coefficient, with a value between 0 and 1, usually $d = 0.85$; $\alpha$ is the parameter that balances the Authorities value and the Hubs value, with $0 \le \alpha \le 1$: if $\alpha = 1$, only the Authorities value is used as the basis of the iteration; if $\alpha = 0$, only the Hubs value is used; to take both the Authorities value and the Hubs value of an event class into account, $\alpha = 0.5$ is usually used;
(4-1-6) normalizing the importance degree of each event class, using the formula:
$$R(EC_{ij})^{k} = \frac{1}{\sum_{i=1}^{n} R(EC_{ij})^{k}} \times R(EC_{ij})^{k}$$
where $R(EC_{ij})^{k}$ is the importance degree of event class $EC_{ij}$ at the $k$-th iteration, and $\sum_{i=1}^{n} R(EC_{ij})^{k}$ is the sum of the importance degrees of all event classes;
(4-1-7) determining whether the importance degree of each event class $EC_{ij}$ satisfies the iteration convergence precision $\varepsilon$; the formula for judging iteration convergence is:
$$\left| R(EC_{ij})^{k} - R(EC_{ij})^{k-1} \right| > \varepsilon$$
where $R(EC_{ij})^{k}$ is the importance degree of event class $EC_{ij}$ at the $k$-th iteration, $R(EC_{ij})^{k-1}$ is its importance degree at the $(k-1)$-th iteration, and $\varepsilon$ is the required precision of iteration convergence; if the importance degree of event class $EC_{ij}$ satisfies the convergence precision $\varepsilon$, the computation of the importance degrees of the event classes in the event ontology ends; otherwise, execution returns to step (4-1-3), and the loop repeats until the convergence precision $\varepsilon$ is satisfied and the computation of the importance degrees ends.
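The iteration of steps (4-1-1) to (4-1-7) can be sketched in Python as below. Several points are assumptions rather than statements of the claims: the influence factors are supplied as a dictionary keyed by (source class, target class); the normalization divides by the sum over all event classes in the set; the loop stops once the change of every event class is no greater than ε; and a `max_iter` cap is added as a safeguard. The function name `rank_event_classes` is hypothetical.

```python
from typing import Dict, List, Tuple


def rank_event_classes(
    classes: List[str],
    weights: Dict[Tuple[str, str], float],  # (g, j) -> influence factor of g on j
    d: float = 0.85,
    alpha: float = 0.5,
    eps: float = 1e-8,
    max_iter: int = 1000,  # safeguard, not part of the claims
) -> Tuple[List[str], Dict[str, float]]:
    """Return the classes in descending order of importance, plus the scores."""
    n = len(classes)
    r = {c: 1.0 / n for c in classes}  # (4-1-1) uniform initialization
    for _ in range(max_iter):
        s_in = {c: 0.0 for c in classes}   # Authorities values, (4-1-2)/(4-1-3)
        s_out = {c: 0.0 for c in classes}  # Hubs values, (4-1-2)/(4-1-4)
        for (g, j), w in weights.items():
            s_in[j] += r[g] * w   # g links into j
            s_out[g] += r[j] * w  # g links out to j
        # (4-1-5) update from the previous importance degree
        new = {
            c: r[c] + d * (alpha * s_in[c] + (1 - alpha) * s_out[c]) + (1 - d) / n
            for c in classes
        }
        total = sum(new.values())
        new = {c: v / total for c, v in new.items()}  # (4-1-6) normalization
        converged = max(abs(new[c] - r[c]) for c in classes) <= eps  # (4-1-7)
        r = new
        if converged:
            break
    return sorted(classes, key=lambda c: r[c], reverse=True), r
```

In a toy graph where classes "a" and "c" both link into "b" with influence factor 1.0, "b" accumulates the largest Authorities value and is ranked first, while "a" and "c" receive equal scores.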
5. The query expansion method based on an event ontology according to claim 1, characterized in that the selection of the event ontology of the specific domain to perform query expansion in step (5) comprises the following specific steps:
(5-1) letting the number of expansion terms be limited to $s$ and the number of expansion terms already selected be $m$; expanding according to the elements contained in the event instances of the domain event ontology; if an element to be expanded is already included in the entered query terms, that element is not expanded; if $m > s$, the expansion stops;
(5-2) expanding according to the elements contained in the event classes of the domain event ontology; if an element to be expanded is already included in the entered query terms, that element is not expanded; if $m > s$, the expansion stops;
(5-3) expanding according to the classification relations between the event classes of the domain event ontology by adding all sub-event classes of the event class; if $m > s$, the expansion stops;
(5-4) expanding according to the association strength between the event classes of the domain event ontology; if $m > s$, the expansion stops.
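The four-stage expansion of steps (5-1) to (5-4) can be sketched as a single loop over the stages in order. The claims word the stop test as m > s; the sketch stops as soon as m reaches s so that at most s terms are returned, which is one interpretation. The flat-list inputs, the skipping of duplicates in every stage, the descending-strength order in the last stage, and the function name `expand_query` are likewise assumptions.

```python
from typing import List, Sequence, Tuple


def expand_query(
    query_terms: Sequence[str],
    instance_elements: Sequence[str],          # (5-1) elements of event instances
    class_elements: Sequence[str],             # (5-2) elements of event classes
    sub_classes: Sequence[str],                # (5-3) sub-event classes
    associations: Sequence[Tuple[str, float]], # (5-4) (event class, strength)
    s: int,
) -> List[str]:
    """Collect at most `s` expansion terms, visiting the four stages in order."""
    seen = set(query_terms)  # elements already in the query are never added
    expansion: List[str] = []
    by_strength = [c for c, _ in sorted(associations, key=lambda p: p[1], reverse=True)]
    for stage in (instance_elements, class_elements, sub_classes, by_strength):
        for term in stage:
            if len(expansion) >= s:  # m has reached the limit s
                return expansion
            if term not in seen:
                seen.add(term)
                expansion.append(term)
    return expansion
```

Because the stages are visited in a fixed order, a small limit s favors elements of concrete event instances over loosely associated event classes.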
CN2011101108081A 2011-04-29 2011-04-29 Incident body-based method for expanding searches Pending CN102760140A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2011101108081A CN102760140A (en) 2011-04-29 2011-04-29 Incident body-based method for expanding searches

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2011101108081A CN102760140A (en) 2011-04-29 2011-04-29 Incident body-based method for expanding searches

Publications (1)

Publication Number Publication Date
CN102760140A true CN102760140A (en) 2012-10-31

Family

ID=47054598

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2011101108081A Pending CN102760140A (en) 2011-04-29 2011-04-29 Incident body-based method for expanding searches

Country Status (1)

Country Link
CN (1) CN102760140A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102982163A (en) * 2012-11-29 2013-03-20 淮海工学院 Web news retrieval method based on event analysis
CN104424281A (en) * 2013-08-30 2015-03-18 宏碁股份有限公司 Integration method and system of event
CN105229635A (en) * 2013-03-14 2016-01-06 微软技术许可有限责任公司 Search comments and suggestion
CN105824938A (en) * 2016-03-18 2016-08-03 点击律(上海)网络科技有限公司 Search method and system based on bidirectional mapping
CN107918607A (en) * 2017-12-02 2018-04-17 北京工业大学 A kind of digital archives inquiry and sort method based on semantic information
CN110737821A (en) * 2018-07-03 2020-01-31 百度在线网络技术(北京)有限公司 Similar event query method, device, storage medium and terminal equipment
CN113139389A (en) * 2021-04-29 2021-07-20 南宁师范大学 Graph model semantic query expansion method and device based on dynamic optimization
CN117743390A (en) * 2024-02-20 2024-03-22 证通股份有限公司 Query method and system for financial information and storage medium

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102982163A (en) * 2012-11-29 2013-03-20 淮海工学院 Web news retrieval method based on event analysis
CN102982163B (en) * 2012-11-29 2015-06-03 淮海工学院 Web news retrieval method based on event analysis
CN105229635A (en) * 2013-03-14 2016-01-06 微软技术许可有限责任公司 Search comments and suggestion
CN104424281A (en) * 2013-08-30 2015-03-18 宏碁股份有限公司 Integration method and system of event
CN105824938B (en) * 2016-03-18 2019-11-08 点击律(上海)网络科技有限公司 A kind of search method and system based on biaxial stress structure
CN105824938A (en) * 2016-03-18 2016-08-03 点击律(上海)网络科技有限公司 Search method and system based on bidirectional mapping
CN107918607A (en) * 2017-12-02 2018-04-17 北京工业大学 A kind of digital archives inquiry and sort method based on semantic information
CN107918607B (en) * 2017-12-02 2020-05-08 北京工业大学 Digital archive inquiry and sorting method based on semantic information
CN110737821A (en) * 2018-07-03 2020-01-31 百度在线网络技术(北京)有限公司 Similar event query method, device, storage medium and terminal equipment
CN110737821B (en) * 2018-07-03 2022-06-07 百度在线网络技术(北京)有限公司 Similar event query method, device, storage medium and terminal equipment
CN113139389A (en) * 2021-04-29 2021-07-20 南宁师范大学 Graph model semantic query expansion method and device based on dynamic optimization
CN113139389B (en) * 2021-04-29 2023-01-13 南宁师范大学 Graph model semantic query expansion method and device based on dynamic optimization
CN117743390A (en) * 2024-02-20 2024-03-22 证通股份有限公司 Query method and system for financial information and storage medium
CN117743390B (en) * 2024-02-20 2024-05-28 证通股份有限公司 Query method and system for financial information and storage medium

Similar Documents

Publication Publication Date Title
Wei et al. A survey of faceted search
Bhagavatula et al. Methods for exploring and mining tables on wikipedia
Thakkar et al. Graph-based algorithms for text summarization
CN100416570C (en) FAQ based Chinese natural language ask and answer method
Cheng et al. Relin: relatedness and informativeness-based centrality for entity summarization
CN101364239B (en) Method for auto constructing classified catalogue and relevant system
US6965900B2 (en) Method and apparatus for electronically extracting application specific multidimensional information from documents selected from a set of documents electronically extracted from a library of electronically searchable documents
CN102760140A (en) Incident body-based method for expanding searches
CN100433007C (en) Method for providing research result
Wang et al. Knowsim: A document similarity measure on structured heterogeneous information networks
CN106156272A (en) A kind of information retrieval method based on multi-source semantic analysis
CN100538695C (en) The method and system of structure, the personalized classification tree of maintenance
CN107122413A (en) A kind of keyword extracting method and device based on graph model
US20030115188A1 (en) Method and apparatus for electronically extracting application specific multidimensional information from a library of searchable documents and for providing the application specific information to a user application
CN101650729B (en) Dynamic construction method for Web service component library and service search method thereof
CN104484380A (en) Personalized search method and personalized search device
Biancalana et al. Social tagging in query expansion: A new way for personalized web search
Minkov et al. Improving graph-walk-based similarity with reranking: Case studies for personal information management
Carrasco et al. A new model for linguistic summarization of heterogeneous data: an application to tourism web data sources
Kanapala et al. Passage-based text summarization for legal information retrieval
Wang et al. A semantic query expansion-based patent retrieval approach
Chopra et al. A survey on improving the efficiency of different web structure mining algorithms
Zhang et al. A comparative study on key phrase extraction methods in automatic web site summarization
Zhang et al. GSPSummary: a graph-based sub-topic partition algorithm for summarization
Asa et al. A comprehensive survey on extractive text summarization techniques

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20121031