CN101882259A - Method and equipment for filtering entity relationship instance - Google Patents


Publication number
CN101882259A
Authority
CN
China
Prior art keywords
entity relationship
confidence level
relationship example
entity
rule
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2009101380558A
Other languages
Chinese (zh)
Inventor
沈国阳
胡长建
许洪志
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC China Co Ltd
Renesas Electronics China Co Ltd
Original Assignee
NEC China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC China Co Ltd
Priority to CN2009101380558A
Publication of CN101882259A

Abstract

The invention provides a method and equipment for filtering an entity relationship instance. The method comprises the following steps of: marking the reliability of the entity relationship instance on the basis of reliability associated information of the entity relationship instance; and filtering the marked entity relationship instance to obtain the reliable entity relationship instance. The entity relationship instance with higher accuracy can be obtained by the method and through the equipment; and the method and the equipment provide more reliable basis for high-level analysis of the entity relationship instance so that the obtained entity relationship instance is more practical for high-level decisions.

Description

Method and apparatus for filtering entity relationship instances
Technical field
The present invention relates to the technical field of information extraction, and more specifically to a method and apparatus for filtering entity relationship instances.
Background art
With the continuing development of economic globalization, the steady expansion of markets, and the growing number of competitors, the ability to capture and process external information is becoming increasingly important for enterprises. Specifically, what is needed is a technical capability that builds a virtual business operating environment by analyzing the commercial relationships among associated enterprises, so as to help decision makers at all levels of an enterprise gain knowledge and insight and thus make decisions that are more favorable to the enterprise.
Information extraction is one of the core technologies for building this capability, and entity relationship extraction is an important research topic within the field of information extraction. Entity relationship extraction is a technique for automatically discovering relationships between entities in text. For example, given the text "AMD plans to compete with Intel atom chip", the technique can automatically determine that a "compete" relationship exists between the named entities "AMD" and "Intel". Because entity relationship extraction is one of the key techniques of information extraction, its results directly affect higher-level analysis such as the processing of enterprise business information. An efficient and accurate entity relationship extraction method is therefore very important.
From a technical point of view, entity relationship extraction aims to automatically recognize the association between two entities expressed in natural language. The methods commonly used in the prior art are rule-based extraction and machine-learning-based extraction. Rule-based extraction requires experts to construct a corresponding knowledge base for each domain. Machine-learning-based extraction converts relationship extraction into a classification problem: it constructs relationship candidates, obtains a classifier by machine learning, and uses the classifier to label each candidate with the predefined relationship it belongs to. Because relationship extraction is inherently very complex, neither rule-based methods nor machine learning methods achieve satisfactory extraction precision. In addition, using untrustworthy data sources introduces extra noise, which leaves relationship extraction far short of the requirements of practical applications.
To obtain more accurate extraction results, one feasible approach in the prior art is to analyze and filter the results after extraction so as to reject erroneous results, improve the precision of the entity relationship instances, and thereby meet the needs of practical applications. How to construct an efficient entity relationship filtering mechanism has therefore become a practical and pressing problem.
Some solutions to the entity relationship filtering problem exist in the prior art. For example, Katrin Fundel, Robert Küffner and Ralf Zimmer, "RelEx - Relation extraction using dependency parse trees", Bioinformatics, v. 23, n. 3, pp. 365-371, December 2006, discloses a rule-based relationship filtering method, which may also be called a post-processing step. That document introduces expert knowledge and builds four filtering mechanisms to further correct and filter the extracted entity relationships.
These four mechanisms correspond to four filtering steps:
1) Negation check, that is, determining whether a relationship is negated. If the node of the candidate relationship or any of its child nodes contains a word with a negative meaning, such as "no", "not", "nor", "neither", "without", "lack", "fail(s, ed)", "unable(s)", "abrogate(s, d)" or "absen(ce, t)", the relationship is considered negated. According to the method in that document, such negated relationships are rejected.
2) Agent-patient detection. The agent is the grammatical subject of the action, and the patient is the grammatical object of the action. In a relationship pair, the entity that appears first is usually the agent and the entity that appears second is the patient. If the corresponding context is detected to be in the passive voice, the agent and patient roles in the entity relationship are swapped. That document uses a series of predefined words to judge whether the context is in the passive voice.
3) Enumeration resolution. The noun phrase chunks corresponding to the detected relationship are analyzed to judge whether they contain entities listed side by side; if enumerated entities exist, multiple similar entity relationship instances are generated.
4) Domain-of-interest filtering. A series of domain-related words or phrases is predefined, and the text corresponding to the detected relationship is checked for these words or phrases; if it contains none of them, the entity relationship instance is rejected.
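The negation check in step 1) can be illustrated with a small Python sketch. It is a simplification, not the RelEx implementation: RelEx inspects the nodes of a dependency parse tree, whereas this sketch only scans a flat context string. The names `NEGATION_PATTERN`, `is_negated` and `drop_negated` are hypothetical, and only the cue words come from the list above.

```python
import re

# Negation cues taken from the list in step 1); the regex form is an assumption.
NEGATION_PATTERN = re.compile(
    r"\b(no|not|nor|neither|without|lack|fail(s|ed)?|unable|"
    r"abrogate(s|d)?|absen(ce|t))\b",
    re.IGNORECASE,
)


def is_negated(context: str) -> bool:
    """Return True if the context string contains any negation cue."""
    return NEGATION_PATTERN.search(context) is not None


def drop_negated(candidates):
    """Keep only (relation, context) pairs whose context is not negated."""
    return [(rel, ctx) for rel, ctx in candidates if not is_negated(ctx)]


candidates = [
    (("AMD", "Intel", "compete"), "AMD plans to compete with Intel atom chip"),
    (("AMD", "Intel", "compete"), "AMD does not compete with Intel"),
]
kept = drop_negated(candidates)
```

Only the first candidate survives, because the second context contains the cue "not".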
As can be seen from that document, these filtering mechanisms solve the problem of filtering out erroneous relationship instances to some extent. In practice, however, the precision of the entity relationship instances still leaves much room for improvement.
Summary of the invention
To this end, one object of the present invention is to provide a method and apparatus for filtering entity relationship instances so as to improve the precision of the entity relationship instances obtained.
According to one aspect of the present invention, a method for filtering entity relationship instances is provided. The method may comprise: marking the reliability of an entity relationship instance based on reliability-related information of the entity relationship instance; and filtering the marked entity relationship instances to obtain reliable entity relationship instances.
In one embodiment of the present invention, marking the reliability of an entity relationship instance may comprise: determining the confidence level of the entity relationship instance based on its reliability-related information; and comparing the determined confidence level with a predetermined confidence threshold so as to mark the entity relationship instance as reliable or unreliable.
According to another embodiment of the present invention, the reliability-related information may comprise at least one of the confidence level of the data source of the entity relationship instance and the confidence level of the extraction rule of the entity relationship instance, and the confidence level of the entity relationship instance is determined based on at least one of these two confidence levels.
According to yet another embodiment of the present invention, the confidence level of the data source may be obtained by calculating the proportion of reliable instances among a plurality of previously marked entity relationship instances associated with that data source.
According to another embodiment of the present invention, the confidence levels of a plurality of data sources that include this data source may be obtained by a predetermined iterative algorithm, based on the association relationships among the data sources and the known initial confidence levels of some of them.
According to another embodiment of the present invention, the confidence level of the extraction rule may be obtained by calculating the proportion of reliable instances among a plurality of previously marked entity relationship instances associated with that extraction rule.
According to yet another embodiment of the present invention, the reliability-related information may comprise wide-area context information and predetermined wide-area context decision rules, and the confidence level of the entity relationship instance is determined based on the wide-area context information and the predetermined wide-area context decision rules.
According to another embodiment of the present invention, the reliability-related information may further comprise wide-area context information and predetermined wide-area context decision rules, and the confidence level of the entity relationship instance is further determined based on the wide-area context information and the predetermined wide-area context decision rules.
According to another embodiment of the present invention, the wide-area context information may be the business type information of the entities involved in the entity relationship instance, and the predetermined wide-area context decision rules are rules relating to entity business type information.
According to yet another embodiment of the present invention, the reliability-related information may comprise relationship history decision rules, and entity relationship instances involving the same entity pair are marked based on the relationship history decision rules.
According to another embodiment of the present invention, the relationship history decision rules may comprise agent-patient relationship pairs and/or relationship change patterns.
According to another embodiment of the present invention, the reliability-related information may further comprise relationship history decision rules, and entity relationship instances involving the same entity pair are further marked based on the relationship history decision rules.
According to yet another embodiment of the present invention, the method may further comprise saving marked entity relationship instances whose confidence level lies within a predetermined threshold range to a repository.
According to a further aspect of the present invention, an apparatus for filtering entity relationship instances is provided. The apparatus comprises: a marking device for marking the reliability of an entity relationship instance based on reliability-related information of the entity relationship instance; and a filtering device for filtering the marked entity relationship instances to obtain reliable entity relationship instances.
With the present invention, entity relationship instances of higher precision can be obtained. This provides a more reliable basis for high-level analysis based on entity relationship instances and makes the obtained instances more useful for high-level decision making.
Description of drawings
The above and other features of the present invention will become more apparent from the detailed description of the embodiments shown in the accompanying drawings, in which identical reference numerals denote identical or similar parts. In the drawings:
Fig. 1 schematically shows a flowchart of a method for filtering entity relationship instances according to one embodiment of the present invention;
Fig. 2 schematically shows a process diagram of a method for marking entity relationship instances according to one embodiment of the present invention;
Fig. 3 schematically shows a network graph used for computing data source confidence levels according to the present invention;
Fig. 4 schematically shows a process diagram of a method for marking entity relationship instances according to another embodiment of the present invention;
Fig. 5 schematically shows a process diagram of a method for marking entity relationship instances according to yet another embodiment of the present invention;
Fig. 6 schematically shows an abnormal sudden change in relationship direction;
Fig. 7 schematically shows a flowchart of a method for filtering entity relationship instances according to another embodiment of the present invention;
Fig. 8 schematically shows a flowchart of a method for filtering entity relationship instances according to yet another embodiment of the present invention; and
Fig. 9 schematically shows a block diagram of an apparatus for filtering entity relationship instances according to one embodiment of the present invention.
Embodiment
Hereinafter, the method and apparatus for filtering entity relationship instances provided by the present invention will be described in detail through embodiments with reference to the accompanying drawings.
First, a method according to one embodiment of the present invention will be described with reference to Fig. 1, which shows a flowchart of a method for filtering entity relationship instances according to that embodiment.
As shown in Fig. 1, in step 101 the reliability of an entity relationship instance is marked based on the reliability-related information of the entity relationship instance.
According to one embodiment of the present invention, the confidence level of the entity relationship instance can first be determined from the information related to its reliability, and the reliability mark can then be assigned based on this confidence level and a predetermined threshold. This embodiment is described below with reference to Fig. 2 to Fig. 4.
Fig. 2 shows a process diagram of a method for marking entity relationship instances according to one embodiment of the present invention. In this embodiment, the reliability-related information is the confidence level of the data source of the entity relationship instance.
As shown in Fig. 2, the entity relationship instances to be marked are input at block 201. They can be entered manually or supplied by other programs through an interface. An entity relationship instance is typically information extracted from the text of a data source by means of an entity relationship extraction technique. Each entity relationship instance comprises at least two entities and the relationship type between them. The two entities can be an agent entity and a patient entity, for example two entities in a purchase relationship or a supply relationship; they can also be two entities in a peer relationship, for example two entities in a competitive relationship.
According to the present invention, an entity relationship instance may further comprise the data source from which it originates (such as a website, an information repository or another information source) and the rule or method used to extract it. In addition, it may comprise one or more verification marks that indicate the reliability of the entity relationship instance.
Entity relationship instances can be stored in the back end with the data structure given below:
Entity A | Entity B | Relationship type | Data source | Applied rule | Verification mark
Table 1: Data structure of an entity relationship instance
In addition, for ease of understanding, Table 2 gives several entity relationship instances as stored in the database:
Entity A | Entity B | Relationship type | Data source | Applied rule | Verification mark
Sohu | Sina | Competition | Website A | Extraction rule a |
Google | Baidu | Competition | Website C | Extraction rule b |
Microsoft | Google | Competition | Website A | Extraction rule a |
Table 2: Examples of entity relationship instances to be marked
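As an illustration of the record layout in Table 1, the following Python sketch models an entity relationship instance as a dataclass and fills it with the rows of Table 2. The class name `RelationInstance` and its field names are hypothetical; the patent only names the six columns.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RelationInstance:
    """One row of Table 1; field names are illustrative assumptions."""
    entity_a: str
    entity_b: str
    relation_type: str
    data_source: str
    applied_rule: str
    # Empty until the marking step adds one or more reliability marks.
    verification_marks: List[str] = field(default_factory=list)


# The three unmarked rows of Table 2.
pending = [
    RelationInstance("Sohu", "Sina", "Competition", "Website A", "Extraction rule a"),
    RelationInstance("Google", "Baidu", "Competition", "Website C", "Extraction rule b"),
    RelationInstance("Microsoft", "Google", "Competition", "Website A", "Extraction rule a"),
]
```

Keeping the marks in a list reflects the statement that an instance may carry one or more verification marks.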
It should be noted that the entity relationship instances to be marked can be instances obtained directly from entity relationship extraction, or instances obtained after a prior-art filtering method has been applied.
Continuing with Fig. 2, at block 202 the confidence level of the entity relationship instance is determined from the confidence level, stored in database 207, of the data source associated with that instance. Database 207 is a repository configured to store data source confidence levels. These confidence levels are computed at block 206 and stored in database 207. In the embodiment shown in Fig. 2, the confidence levels of a plurality of data sources, including the data source of this entity relationship instance, can be obtained by a predetermined iterative algorithm based on the association information among the data sources (stored in database 205) and the known initial confidence levels of some of them (obtained from the marked entity relationship instances stored in database 208). An embodiment that computes data source confidence levels from the association relationships among data sources and the known initial confidence levels is described below with reference to Fig. 3.
In this embodiment, suppose there are six data sources, namely website 1 to website 6. Based on the hyperlinks between these websites, the network graph G = (V, ε) shown in Fig. 3 can be formed, where V is the set of vertices of graph G and ε is the set of edges connecting the vertices. In the graph shown in Fig. 3, vertex 1 to vertex 6 represent website 1 to website 6 respectively.
As shown in Fig. 3, website 1 contains hyperlinks to website 3 and website 6, so vertex 1 has two edges pointing to vertex 3 and vertex 6 respectively. Similarly, website 2 contains a hyperlink to website 1, so vertex 2 has an edge pointing to vertex 1; website 3 contains no hyperlink to any website, so vertex 3 has no edge pointing to another vertex; website 4 contains a hyperlink to website 3, so vertex 4 has an edge pointing to vertex 3; website 5 contains hyperlinks to website 2 and website 4, so vertex 5 has edges pointing to vertex 2 and vertex 4; and website 6 contains hyperlinks to website 3 and website 5, so vertex 6 has edges pointing to vertex 3 and vertex 5.
Then, the trust value (TrustRank) matrix T can be calculated from the graph shown in Fig. 3 according to the following formula:
\[
T(p, q) =
\begin{cases}
0 & \text{if } (q, p) \notin \varepsilon \\
1/\omega(q) & \text{if } (q, p) \in \varepsilon
\end{cases}
\qquad \text{(Formula 1)}
\]
Here p and q denote vertex numbers in the graph, and ω(q) is the number of edges pointing outward from vertex q, that is, its out-degree. According to Formula 1, if there is an edge from vertex q to vertex p, the element T(p, q) in row p and column q of the trust value matrix T is 1/ω(q); otherwise it is 0. For example, consider the element T(1, 2) in row 1, column 2. Because there is an edge from vertex 2 to vertex 1, T(1, 2) is 1/ω(2), and ω(2), the number of outgoing edges of vertex 2, is 1, so T(1, 2) is 1. Applying Formula 1 to the graph in Fig. 3 therefore gives the trust value matrix T shown below.
\[
T = \begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1/2 & 0 \\
1/2 & 0 & 0 & 1 & 0 & 1/2 \\
0 & 0 & 0 & 0 & 1/2 & 0 \\
0 & 0 & 0 & 0 & 0 & 1/2 \\
1/2 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}
\]
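The construction of T from the link structure of Fig. 3 can be sketched in Python as follows. The dictionary `LINKS` encodes the hyperlinks described above; the function name `trust_matrix` and the use of `Fraction` for exact entries are choices made for this illustration only.

```python
from fractions import Fraction

# Hyperlinks of Fig. 3: website number -> websites it links to.
LINKS = {1: [3, 6], 2: [1], 3: [], 4: [3], 5: [2, 4], 6: [3, 5]}


def trust_matrix(links):
    """Build T with T[p][q] = 1/out_degree(q) if q links to p, else 0.

    Vertices are numbered from 1, so indices are shifted by one.
    """
    n = len(links)
    T = [[Fraction(0)] * n for _ in range(n)]
    for q, targets in links.items():
        for p in targets:
            T[p - 1][q - 1] = Fraction(1, len(targets))
    return T


T = trust_matrix(LINKS)
for row in T:
    print(" ".join(str(x) for x in row))
```

Each column q of the result sums to 1 when website q has outgoing links and to 0 otherwise (column 3 here), which is a direct consequence of Formula 1.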
In addition, suppose that the marked entity relationship instances stored in database 208 show that the initial confidence levels of website 1 and website 2 are 0.9 and 0.8 respectively. The initial confidence vector for website 1 to website 6 is then:
d = [0.9, 0.8, 0, 0, 0, 0]^T
In the initial vector d, the elements corresponding to website 1 and website 2 are set to their known confidence levels, and the elements corresponding to the remaining websites, whose confidence levels are unknown, are set to 0.
Then, the confidence values can be obtained with the following iterative algorithm.
for i = 1 to IterNum
    do R = a·T·R + (1 - a)·d
Here T is the trust value matrix calculated above, the initial value of R is the initial confidence vector d, and a is a decay factor.
Once R has stabilized after several iterations, the resulting vector R gives the confidence levels of the websites. For this example, the confidence levels obtained are:
R = [0.8, 0.7, 0.8, 0.4, 0.2, 0.3]
The confidence level obtained for each website can be stored in database 207.
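The iteration R = a·T·R + (1 - a)·d can be sketched in plain Python as below. The decay factor `a=0.85` and the iteration count `iter_num=20` are assumptions chosen for illustration, since the text does not state the values used, so this sketch is not expected to reproduce the example vector R given above. The function name `propagate_trust` is hypothetical.

```python
# Trust matrix T for the graph of Fig. 3 and the initial vector d from the text.
T = [
    [0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.5, 0.0],
    [0.5, 0.0, 0.0, 1.0, 0.0, 0.5],
    [0.0, 0.0, 0.0, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.5],
    [0.5, 0.0, 0.0, 0.0, 0.0, 0.0],
]
d = [0.9, 0.8, 0.0, 0.0, 0.0, 0.0]


def propagate_trust(T, d, a=0.85, iter_num=20):
    """Iterate R = a*T*R + (1 - a)*d, starting from R = d.

    a and iter_num are illustrative defaults, not values from the patent.
    """
    n = len(d)
    R = list(d)
    for _ in range(iter_num):
        R = [
            a * sum(T[p][q] * R[q] for q in range(n)) + (1 - a) * d[p]
            for p in range(n)
        ]
    return R


R = propagate_trust(T, d)
print([round(r, 3) for r in R])
```

Because every column of T sums to at most 1 and a is below 1, each update is a contraction, so R settles to a fixed point regardless of the starting vector.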
In this embodiment, the initial confidence levels of website 1 and website 2 can each be obtained by calculating the proportion of reliable instances among the previously marked entity relationship instances in database 208 that are associated with the respective website. Alternatively, the known initial confidence levels of some websites can be confidence values taken from a trusted source.
For more detailed information on the calculation of website confidence levels, see Zoltán Gyöngyi, Hector Garcia-Molina and Jan Pedersen, "Combating Web Spam with TrustRank", Proceedings of the 30th International Conference on Very Large Data Bases (VLDB), 2004.
Besides obtaining the confidence level of each data source through the iterative algorithm of the above embodiment, the confidence level of a data source can also be a preset value taken from a highly reliable source. It can also be obtained by calculating the proportion of reliable instances among the previously marked entity relationship instances stored in database 208 that are associated with that data source. In this case, a fairly large number of marked entity relationship instances associated with each data source must likewise be provided in advance to ensure the accuracy of the resulting data source confidence levels.
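The ratio-based estimate described above amounts to a per-source proportion of reliable labels. A minimal Python sketch, assuming the marked instances are available as (data source, is_reliable) pairs, is given below; the function name `ratio_confidence` and the sample labels are hypothetical and do not come from the patent.

```python
from collections import defaultdict


def ratio_confidence(labeled):
    """Return {key: reliable_count / total_count} for (key, is_reliable) pairs."""
    total = defaultdict(int)
    reliable = defaultdict(int)
    for key, is_reliable in labeled:
        total[key] += 1
        if is_reliable:
            reliable[key] += 1
    return {key: reliable[key] / total[key] for key in total}


# Hypothetical hand-marked instances, for illustration only.
labeled = [
    ("Website A", True), ("Website A", True),
    ("Website A", False), ("Website A", True),
    ("Website C", False), ("Website C", True),
]
confidence = ratio_confidence(labeled)
print(confidence)
```

The same function applies unchanged if the key is an extraction rule instead of a data source. With only a handful of labels per key the estimate is coarse, which is why the text calls for a fairly large number of marked instances.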
In this way, the confidence level of a data source can be obtained through any of the embodiments described above, and the confidence level of an entity relationship instance to be marked can be set to the confidence level of its data source.
Then, at block 203, the entity relationship instance can be marked according to its determined confidence level and a predetermined threshold.
For purposes of illustration, suppose the entity relationship instances to be marked that are input at block 201 are the following:
RI1 = {<British Telecom, MCI>, purchase, Rule 1, Source 1}
RI2 = {<MCI, British Telecom>, purchase, Rule 2, Source 4}
RI3 = {<British Telecom, MCI>, purchase, Rule 3, Source 3}
Given the confidence levels R = [0.8, 0.7, 0.8, 0.4, 0.2, 0.3] obtained by the iterative algorithm and a predetermined confidence threshold of 0.7, the entity relationship instances RI1, RI2 and RI3 can be marked as follows:
RI1 = {<British Telecom, MCI>, purchase, Rule 1, Source 1, data source - reliable}
RI2 = {<MCI, British Telecom>, purchase, Rule 2, Source 4, data source - unreliable}
RI3 = {<British Telecom, MCI>, purchase, Rule 3, Source 3, data source - reliable}
In this way, a reliability mark relating to the data source has been added to each entity relationship instance to be marked.
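The marking of RI1 to RI3 by data source confidence can be reproduced with a short Python sketch. The dictionary layout and the function name `mark_by_source` are hypothetical. The comparison uses "greater than or equal to", which is an assumption: the text does not say how a confidence level exactly equal to the threshold is treated in this example.

```python
# Confidence levels of Source 1..6, taken from R in the text.
SOURCE_CONFIDENCE = {
    f"Source {i}": c
    for i, c in enumerate([0.8, 0.7, 0.8, 0.4, 0.2, 0.3], start=1)
}
THRESHOLD = 0.7

instances = [
    {"id": "RI1", "pair": ("British Telecom", "MCI"), "type": "purchase",
     "rule": "Rule 1", "source": "Source 1"},
    {"id": "RI2", "pair": ("MCI", "British Telecom"), "type": "purchase",
     "rule": "Rule 2", "source": "Source 4"},
    {"id": "RI3", "pair": ("British Telecom", "MCI"), "type": "purchase",
     "rule": "Rule 3", "source": "Source 3"},
]


def mark_by_source(instance, confidence, threshold):
    """Return a copy of the instance with a data-source reliability mark."""
    # Assumption: a confidence equal to the threshold counts as reliable.
    reliable = confidence[instance["source"]] >= threshold
    mark = "data source - reliable" if reliable else "data source - unreliable"
    return {**instance, "mark": mark}


marked = [mark_by_source(ri, SOURCE_CONFIDENCE, THRESHOLD) for ri in instances]
for ri in marked:
    print(ri["id"], ri["mark"])
```

The resulting marks match the text: RI1 and RI3 are reliable, and RI2, which comes from Source 4 with confidence 0.4, is unreliable.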
The embodiment described above marks entity relationship instances based on the confidence level of the data source, but the present invention is not limited to this. In another embodiment of the present invention, the reliability-related information of an entity relationship instance comprises the confidence level of the extraction rule of that instance. In entity relationship extraction, the extraction rule also plays an important role in the reliability of the extracted relationship. The confidence level of an entity relationship instance can therefore be determined from the confidence level of its extraction rule, and the instance can be marked accordingly.
In this embodiment, the confidence level of an extraction rule is similar to that of a data source described above and can be a confidence value taken from a highly reliable source. It can also be determined from a large number of manually marked entity relationship instances stored in database 208. For example, the confidence level of an extraction rule can be obtained by calculating the proportion of reliable instances among the previously marked entity relationship instances associated with that extraction rule.
For example, if the confidence levels obtained for the extraction rules Rule 1, Rule 2 and Rule 3 are 0.9, 0.7 and 0.8 respectively, the confidence level of each entity relationship instance is set to the confidence level of the extraction rule it used. With a predetermined threshold of 0.8, the mutually contradictory instances RI1, RI2 and RI3 can then be marked as follows:
RI1 = {<British Telecom, MCI>, purchase, Rule 1, Source 1, rule - reliable}
RI2 = {<MCI, British Telecom>, purchase, Rule 2, Source 4, rule - unreliable}
RI3 = {<British Telecom, MCI>, purchase, Rule 3, Source 3, rule - reliable}
In this way, a reliability mark relating to the extraction rule has been added to the entity relationship instances RI1 to RI3.
In a preferred embodiment of the present invention, the confidence level of the data source and the confidence level of the extraction rule can be combined to mark entity relationship instances.
For example, given the data source confidence levels R = [0.8, 0.7, 0.8, 0.4, 0.2, 0.3] obtained by the iterative algorithm and the confidence levels 0.9, 0.7 and 0.8 determined for the extraction rules Rule 1, Rule 2 and Rule 3, the product of the data source confidence level and the extraction rule confidence level can be taken as the confidence level of each of the entity relationship instances RI1 to RI3:
RI[1-3] = [0.8 × 0.9, 0.4 × 0.7, 0.8 × 0.8] = [0.72, 0.28, 0.64]
Therefore, for a given threshold of 0.6, the entity relationship instances RI1 to RI3 can be marked as follows:
RI1 = {<British Telecom, MCI>, purchase, Rule 1, Source 1, data source & rule - reliable}
RI2 = {<MCI, British Telecom>, purchase, Rule 2, Source 4, data source & rule - unreliable}
RI3 = {<British Telecom, MCI>, purchase, Rule 3, Source 3, data source & rule - reliable}
After the entity relationship instances have been marked, the marked instances, that is, the instances carrying reliability marks, can be output at block 204.
It should be noted that, when the data source confidence level and the extraction rule confidence level are combined to compute the confidence level of an entity relationship instance, algorithms other than the product of the two confidence levels given above can also be used. For example, the smaller of the two confidence values can be taken as the confidence level of the entity relationship instance, or their mean can be taken, or weights can be assigned to the two confidence values and their weighted mean taken as the confidence level of the entity relationship instance.
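The four ways of combining the two confidence levels (product, minimum, mean, weighted mean) can be compared with a small Python sketch. The function name `combine`, the `how` keyword and the default weight of 0.5 are hypothetical; the confidence values and the threshold of 0.6 are those of the RI1 to RI3 example.

```python
def combine(source_conf, rule_conf, how="product", source_weight=0.5):
    """Combine a data-source and an extraction-rule confidence level."""
    if how == "product":
        return source_conf * rule_conf
    if how == "min":
        return min(source_conf, rule_conf)
    if how == "mean":
        return (source_conf + rule_conf) / 2
    if how == "weighted":
        # source_weight is an illustrative parameter, not from the patent.
        return source_weight * source_conf + (1 - source_weight) * rule_conf
    raise ValueError(f"unknown combination method: {how}")


# (data source confidence, extraction rule confidence) for RI1, RI2, RI3.
pairs = [(0.8, 0.9), (0.4, 0.7), (0.8, 0.8)]
THRESHOLD = 0.6

products = [combine(s, r) for s, r in pairs]
marks = ["reliable" if c >= THRESHOLD else "unreliable" for c in products]
print([round(c, 2) for c in products], marks)
```

The product is the strictest of the four for values below 1, since it never exceeds the minimum; the mean and weighted mean are more lenient, so the threshold generally needs to be chosen together with the combination method.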
The confidence levels of the data sources and extraction rules can be calculated in advance and stored in a database; however, those skilled in the art will understand that these confidence levels can also be calculated when needed rather than being stored in, for example, database 207.
A method of marking entity relationship instances according to another embodiment of the present invention is described below with reference to Fig. 4. Fig. 4 schematically shows a method of marking entity relationship instances based on wide-area context information. In this embodiment, the reliability-related information can comprise wide-area context information and predetermined wide-area context decision rules, and the confidence level of an entity relationship instance is determined based on the wide-area context information and the predetermined wide-area context decision rules.
Referring to Fig. 4, an entity relationship instance to be marked is input at block 401; as before, it can be entered manually or supplied by another program through an interface. The confidence level of the entity relationship instance is then determined at block 402 according to the wide-area context information and the wide-area context rules.
Database 405 is used to store the wide-area context information. Wide-area context information is information that is relevant to the entities to be extracted and their relationships but cannot be obtained from the current text from which the entity relationships are extracted. Wide-area context information can be collected manually or automatically; for example, it can be obtained from a company's reliable home page, or from the reliable information sources of other information companies.
One illustrative example of wide-area context information is the business type information of an entity. The description below uses business type information as an example, but the present invention is not limited thereto.
Business type information can be stored in the backend using the data structure given below:
Company name | Business type
Table 3: Storage data structure for company business type information
Illustrative data stored in the database is shown in Table 4:
Company name | Business type
Suning | Electrical appliance retailer
Midea | Appliance manufacturer
Guomei | Electrical appliance retailer
Table 4: Example of company business type information
Continuing with Fig. 4, database 406 stores the decision rules based on wide-area context information. These rules can be formulated manually or produced by machine learning. Each rule can include its corresponding confidence level.
Wide-area context decision rules can be stored in the backend using the data structure given in Table 5:
Decision rule | Confidence level of the decision rule
Table 5: Storage data structure for wide-area context decision rules
Illustrative rules stored in the database can be:
Figure B2009101380558D0000141
Table 6: Examples of wide-area context decision rules
Consider the following entity relationship instance to be marked:
RI4={<Suning, Midea>, competition, Rule 4, Source 4}
At block 402, the confidence level of this entity relationship instance can be determined according to the decision rule stored in the database, "if the business types of two companies do not intersect, then no competitive relationship can exist between the two companies: 0.98". Because this entity relationship instance states that two companies whose business types do not intersect, Suning and Midea, are in a competitive relationship, which is exactly the opposite of what the rule describes, the confidence level that Suning and Midea compete can be judged to be (1-0.98), i.e., 0.02.
Then, at block 403, given a predetermined threshold of, for example, 0.8 and the determined confidence level of 0.02, which is below that threshold, entity relationship instance RI4 can be marked as:
RI4={<Suning, Midea>, competition, Rule 4, Source 4, wide-area information - unreliable}
Consider another illustrative entity relationship instance, RI5:
RI5={<Suning, Guomei>, supply, Rule 4, Source 4}
Similarly, at block 402 the confidence level of this entity relationship instance can be determined according to the decision rule stored in the database, "if the business types of two companies are fully identical, then no supply relationship exists between the two companies: 0.81". Because this entity relationship instance states that two companies with fully identical business types, Suning and Guomei, are in a supply relationship, which is exactly the opposite of what the rule describes, the confidence level that a supply relationship exists between Suning and Guomei can be judged to be (1-0.81), i.e., 0.19.
Therefore, given a predetermined threshold of, for example, 0.8 and the determined confidence level of 0.19, which is below that threshold, entity relationship instance RI5 can be marked as:
RI5={<Suning, Guomei>, supply, Rule 4, Source 4, wide-area information - unreliable}
In this way, a reliability mark related to wide-area information can be added to entity relationship instances based on the wide-area context information.
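The two worked examples above can be reproduced with a short Python sketch. It is only an illustration under stated assumptions: the names `BUSINESS_TYPES`, `ContextRule`, `context_confidence` and `mark` are hypothetical, the manufacturer from Table 4 is keyed as `Midea` (an assumed rendering of the name), and the handling of instances that no rule contradicts ("undetermined") is an assumption, since the text only describes the contradicted case.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Business types per company, following the Table 4 example.
BUSINESS_TYPES = {
    "Suning": {"electrical appliance retailer"},
    "Midea": {"appliance manufacturer"},
    "Guomei": {"electrical appliance retailer"},
}

@dataclass(frozen=True)
class ContextRule:
    relation: str                          # relation the rule says should NOT exist
    condition: Callable[[set, set], bool]  # test on the two business-type sets
    confidence: float                      # confidence level of the rule itself

RULES = [
    # Business types do not intersect -> no competitive relationship (0.98).
    ContextRule("competition", lambda a, b: not (a & b), 0.98),
    # Business types fully identical -> no supply relationship (0.81).
    ContextRule("supply", lambda a, b: a == b, 0.81),
]

def context_confidence(entity1: str, entity2: str, relation: str) -> Optional[float]:
    """Return 1 - rule confidence when the instance contradicts a rule, else None."""
    a, b = BUSINESS_TYPES[entity1], BUSINESS_TYPES[entity2]
    for rule in RULES:
        if rule.relation == relation and rule.condition(a, b):
            return 1.0 - rule.confidence
    return None  # assumption: no rule speaks against the instance

def mark(entity1: str, entity2: str, relation: str, threshold: float = 0.8) -> str:
    conf = context_confidence(entity1, entity2, relation)
    if conf is None:
        return "wide-area information - undetermined"  # assumed label
    if conf >= threshold:
        return "wide-area information - reliable"
    return "wide-area information - unreliable"
```

Calling `mark("Suning", "Midea", "competition")` and `mark("Suning", "Guomei", "supply")` yields the unreliable mark in both cases, matching RI4 and RI5.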
Methods of marking entity relationship instances based on a determined confidence value have been described above with reference to Figs. 2 to 4. A method of marking entity relationship instances according to another embodiment of the present invention is described below with reference to Fig. 5.
In the embodiment shown in Fig. 5, the reliability-related information comprises relationship history decision rules, and entity relationship instances involving the same entity pair are marked based on the relationship history decision rules.
As shown in Fig. 5, an entity relationship instance to be marked is input at block 501; as in the embodiments described above, it can be entered manually or supplied by another program through an interface. The entity relationship instance to be marked is then marked at block 502 according to the relationship history decision rules, and the marked entity relationship instance is output at block 503. The relationship history decision rules are generated at block 505 from historical information about entity relationship instances; they can be produced manually based on expert knowledge or generated by machine learning. In this embodiment, the entity relationship instances to be marked are entity relationship instances carrying a time stamp.
Two fairly typical examples of relationship history decision rules are agent-patient relationship pairs and relationship change patterns. These are described in detail below as examples.
An agent-patient relationship pair is a pair of relationships in which the occurrence of one relationship affects the state of the other; the former is called the agent relationship and the latter the patient relationship. Agent-patient relationship pairs can be stored in the backend using the data structure given in Table 7:
Agent relationship | Patient relationship | Influence
Table 7: Storage structure for agent-patient relationship pairs
An illustrative agent-patient relationship pair stored in the database can be:
Agent relationship | Patient relationship | Influence
Purchase | Competition | Eliminate
Table 8: Example of an agent-patient relationship pair
This agent-patient relationship pair indicates that, once a purchase relationship occurs between two entities, the competitive relationship between those two entities is eliminated.
In addition, a relationship change pattern is the pattern that changes in an entity relationship follow.
Relationship change patterns can be stored in the backend using the data structure given below:
Entity relationship type | Change pattern
Table 9: Storage structure for relationship change patterns
Table 10 below gives an example of a relationship change pattern stored in the database:
Entity relationship type | Change pattern
Supply | No abnormal abrupt change
Table 10: Example of a relationship change pattern
The relationship change pattern given above indicates that the supply relationship between two entities does not change abruptly; that is, the abnormal change shown in Fig. 6 does not occur.
To illustrate the embodiment in which entity relationship instances are marked based on relationship history rules, several illustrative entity relationship instances are given below:
RI01=<A, B, competition, Rule1, Source3>, t1
RI02=<A, B, competition, Rule2, Source2>, t2
RI03=<A, B, purchase, Rule4, Source6>, t3
RI04=<A, B, competition, Rule3, Source2>, t4
RI05=<C, D, supply, Rule1, Source3>, t5
RI06=<C, D, supply, Rule3, Source2>, t6
RI07=<C, D, supply, Rule2, Source2>, t7
RI08=<D, C, supply, Rule7, Source5>, t8
RI09=<C, D, supply, Rule3, Source2>, t9
RI10=<C, D, supply, Rule2, Source1>, t10
RI11=<C, D, supply, Rule2, Source3>, t11
Here t1<t2<t3<t4<t5<t6<t7<t8<t9<t10<t11, i.e., the instances are listed in chronological order.
Entity relationship instances RI01 to RI04 involve the same entities A and B, and involve the agent relationship and the patient relationship of the example agent-patient relationship pair, i.e., "purchase" and "competition". From the agent-patient relationship pair <purchase, competition, eliminate> given above, it can be determined that RI04 is unreliable, because after the purchase relationship of RI03 has occurred, the competitive relationship between company A and company B should have been eliminated.
Similarly, entity relationship instances RI05 to RI11 involve the same entities C and D, and involve the supply relationship given in the example above. According to the illustrative relationship history decision rule "<supply, no abnormal abrupt change>" given above, it can be determined that RI08 is unreliable, since RI08 alone reverses the direction of the supply relationship between C and D.
Therefore, at block 502, entity relationship instances RI01 to RI11 can be marked as:
RI01=<A, B, competition, Rule1, Source3, relationship history information - reliable>
RI02=<A, B, competition, Rule2, Source2, relationship history information - reliable>
RI03=<A, B, purchase, Rule4, Source6, relationship history information - reliable>
RI04=<A, B, competition, Rule3, Source2, relationship history information - unreliable>
RI05=<C, D, supply, Rule1, Source3, relationship history information - reliable>
RI06=<C, D, supply, Rule3, Source2, relationship history information - reliable>
RI07=<C, D, supply, Rule2, Source2, relationship history information - reliable>
RI08=<D, C, supply, Rule7, Source5, relationship history information - unreliable>
RI09=<C, D, supply, Rule3, Source2, relationship history information - reliable>
RI10=<C, D, supply, Rule2, Source1, relationship history information - reliable>
RI11=<C, D, supply, Rule2, Source3, relationship history information - reliable>
In this way, a reliability mark concerning relationship history can be added to each entity relationship instance based on the relationship history decision rules.
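The marking of RI01 to RI11 can be sketched in Python as follows. This is a hedged illustration, not the patented procedure: the names `Instance` and `mark_by_history` are hypothetical, time stamps are assumed to be integers that increase from RI01 to RI11 (the ordering implied by the marking result), and "abrupt change" is interpreted, as an assumption, to mean an instance whose direction differs from both its immediate predecessor and its immediate successor of the same relationship type.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Instance:
    name: str
    head: str      # first entity of the pair
    tail: str      # second entity of the pair
    relation: str
    time: int      # assumed: a larger value means a later time

# (agent relationship, patient relationship, influence), as in Table 8.
AGENT_PATIENT_PAIRS = [("purchase", "competition", "eliminate")]
# Relationship types that must not change abruptly, as in Table 10.
NO_ABRUPT_CHANGE = {"supply"}

def mark_by_history(instances):
    """Return {instance name: True if reliable, False if unreliable}."""
    reliable = {inst.name: True for inst in instances}
    groups = {}
    for inst in instances:  # group instances by unordered entity pair
        groups.setdefault(frozenset((inst.head, inst.tail)), []).append(inst)
    for group in groups.values():
        group.sort(key=lambda inst: inst.time)
        # Agent-patient rule: a patient relationship reported after the agent
        # relationship has occurred is marked unreliable.
        for agent, patient, influence in AGENT_PATIENT_PAIRS:
            agent_times = [i.time for i in group if i.relation == agent]
            if influence != "eliminate" or not agent_times:
                continue
            first_agent = min(agent_times)
            for inst in group:
                if inst.relation == patient and inst.time > first_agent:
                    reliable[inst.name] = False
        # Change-pattern rule: an isolated reversal of direction is unreliable.
        for relation in NO_ABRUPT_CHANGE:
            seq = [i for i in group if i.relation == relation]
            for prev, cur, nxt in zip(seq, seq[1:], seq[2:]):
                direction = (cur.head, cur.tail)
                if direction != (prev.head, prev.tail) and direction != (nxt.head, nxt.tail):
                    reliable[cur.name] = False
    return reliable

SAMPLE = [
    Instance("RI01", "A", "B", "competition", 1),
    Instance("RI02", "A", "B", "competition", 2),
    Instance("RI03", "A", "B", "purchase", 3),
    Instance("RI04", "A", "B", "competition", 4),
    Instance("RI05", "C", "D", "supply", 5),
    Instance("RI06", "C", "D", "supply", 6),
    Instance("RI07", "C", "D", "supply", 7),
    Instance("RI08", "D", "C", "supply", 8),
    Instance("RI09", "C", "D", "supply", 9),
    Instance("RI10", "C", "D", "supply", 10),
    Instance("RI11", "C", "D", "supply", 11),
]
```

Under these assumptions, `mark_by_history(SAMPLE)` marks only RI04 and RI08 as unreliable, in agreement with the list above.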
Returning to Fig. 1, in step 102 the marked entity relationship instances are then filtered to obtain reliable entity relationship instances.
After the processing of step 101, each entity relationship instance carries at least one reliability mark, and the entity relationship instances can then be filtered according to these marks. For example, when an instance carries a single reliability mark, instances marked unreliable are filtered out directly. When an instance carries multiple reliability marks, different filter criteria can be adopted to filter out unreliable entity relationship instances according to the specific precision and recall requirements. For example, under a very strict precision requirement, it can be specified that an entity relationship instance is rejected as soon as any one of its reliability marks indicates that it is unreliable. Conversely, if the precision requirement is not high but good recall is desired, a looser filter criterion can be set; for example, an entity relationship instance can be filtered out when more than half of its marks are unreliable. In addition, an entity relationship instance can be filtered out only when its reliability marks satisfy a certain combination requirement; for example, the instance can be filtered out when it has two or more unreliable marks and one of them indicates that the data source is unreliable.
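The three filter criteria described above can be sketched in Python as follows. The function names, the policy labels `"strict"`, `"majority"` and `"combined"`, and the representation of marks as a dictionary keyed by mark type are assumptions made for illustration.

```python
def is_filtered(marks, policy="strict"):
    """Decide whether an instance should be filtered out.

    marks: dict mapping a mark type (e.g. "data source") to True if that
           mark says "reliable" and False if it says "unreliable".
    """
    unreliable = [kind for kind, ok in marks.items() if not ok]
    if policy == "strict":
        # High precision: a single unreliable mark is enough to reject.
        return len(unreliable) >= 1
    if policy == "majority":
        # Better recall: reject only if more than half of the marks are unreliable.
        return len(unreliable) > len(marks) / 2
    if policy == "combined":
        # Combination requirement: at least two unreliable marks,
        # one of which concerns the data source.
        return len(unreliable) >= 2 and "data source" in unreliable
    raise ValueError(f"unknown policy: {policy}")

def filter_instances(instances, policy="strict"):
    """Keep the instances (dicts with a "marks" entry) that are not filtered out."""
    return [inst for inst in instances if not is_filtered(inst["marks"], policy)]
```

An instance with a single mark behaves identically under `"strict"` and `"majority"`, which matches the single-mark case described in the text.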
Then, the entity relationship instances finally considered reliable can be further marked as "machine label - reliable" and stored for subsequent use.
In the embodiments above, marking of entity relationship instances based on the confidence level of the data source and the confidence level of the extraction rule, on wide-area context information, and on relationship history information has been described separately. However, those skilled in the art will understand that the embodiments described above can be combined in various ways.
Methods of filtering entity relationship instances according to other embodiments of the present invention are described below with reference to Fig. 7 and Fig. 8.
As shown in Fig. 7, entity relationship instances to be marked can be input manually or imported in step 701, and the same entity relationship instance can then be marked in parallel in steps 702, 703, 704 and 705, each based on one of the kinds of reliability-related information described above. Then, in step 706, the filtering described above is performed on these identical entity relationship instances according to the reliability marks they carry. When the entity relationship instance needs to be filtered out, all of its copies are filtered out; when the entity relationship instance is considered reliable and needs to be kept, a "machine label - reliable" mark can be added to one of the identical instances, which is saved, and the remaining identical entity relationship instances are filtered out.
Alternatively, as shown in Fig. 8, after entity relationship instances to be marked are input manually or imported in step 801, they can be marked serially, one step after another, in steps 802, 803, 804 and 805, and in step 806 the entity relationship instances are filtered according to the multiple reliability marks that each instance carries.
It should be noted that the above embodiments, which perform marking based on the various kinds of reliability-related information, can also be combined. For example, a comprehensive confidence level can first be determined based on the various kinds of reliability-related information, and marking can then be performed according to this comprehensive confidence level and a predetermined threshold. For example, the comprehensive confidence level can be determined based on two or more of the confidence level of the data source, the confidence level of the extraction rule, and the wide-area context confidence level.
It should also be noted that no confidence level is given for the relationship history decision rules in the embodiment described with reference to them. However, those skilled in the art will understand that, similarly to the embodiment described for wide-area context, a confidence value can be provided for each relationship history decision rule, the confidence level of an entity relationship instance can then be calculated based on the rule, and the entity relationship instance can be marked according to the calculated confidence level and a predetermined threshold.
It should further be noted that, although in the above embodiments the input entity relationship instances to be marked are described as entity relationship instances extracted from text using entity relationship extraction techniques, the present invention is not limited thereto. The entity relationship instances to be marked can also be entity relationship instances obtained after analysis and filtering according to the prior art.
In addition, although multiple databases for various data and information are shown, these databases need not be independent of one another; a single database can also be used to store the various information and data.
In a preferred embodiment according to the present invention, marked entity relationship instances whose confidence level lies within a predetermined threshold range can be saved to databases 208 and 506, so that they can be used when determining, for example, the confidence level of a data source or the confidence level of an extraction rule, and when decision rules are generated by machine learning. By way of example, it can be specified that reliable instances with a confidence level greater than or equal to 0.9 and unreliable instances with a confidence level less than or equal to 0.1 are returned to the database for subsequent use.
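The selection of instances to feed back into the database can be sketched as follows. The function name `select_for_feedback` and the dictionary layout of an instance are hypothetical; the thresholds 0.9 and 0.1 are the example values from the text.

```python
def select_for_feedback(marked_instances, high=0.9, low=0.1):
    """Return the marked instances confident enough to be saved for later use.

    marked_instances: iterable of dicts with "confidence" (float) and
                      "label" ("reliable" or "unreliable") entries.
    """
    kept = []
    for inst in marked_instances:
        conf, label = inst["confidence"], inst["label"]
        # Keep clearly reliable and clearly unreliable instances only.
        if (label == "reliable" and conf >= high) or (label == "unreliable" and conf <= low):
            kept.append(inst)
    return kept
```

Keeping only the two extremes means that borderline instances do not influence later confidence calculations or machine-learned decision rules.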
With the method for filtering entity relationship instances provided by the present invention, entity relationship instances are further marked and filtered according to reliability-related information, so that entity relationship instances of higher accuracy can be obtained. This provides a more reliable basis for high-level analysis based on entity relationship instances and makes the resulting entity relationship instances more useful for high-level decision making.
The equipment for filtering entity relationship instances according to the present invention is described below with reference to Fig. 9.
Fig. 9 shows equipment 900 for filtering entity relationship instances according to an embodiment of the present invention. As shown in Fig. 9, the equipment 900 comprises a marking device 901 for marking the reliability of entity relationship instances based on reliability-related information of the entity relationship instances, and a filtering device 902 for filtering the marked entity relationship instances to obtain reliable entity relationship instances.
In one embodiment according to the present invention, the marking device 901 can comprise: a determining device for determining the confidence level of an entity relationship instance based on the reliability-related information of that entity relationship instance; and a comparing device for comparing the determined confidence level with a predetermined confidence level threshold, so as to mark the entity relationship instance as reliable or unreliable.
In another embodiment according to the present invention, the reliability-related information can comprise at least one of the confidence level of the data source of the entity relationship instance and the confidence level of the extraction rule of the entity relationship instance, and the determining device can be configured to determine the confidence level of the entity relationship instance based on at least one of the confidence level of the data source of the entity relationship instance and the confidence level of the extraction rule of the entity relationship instance.
In a further embodiment according to the present invention, the confidence level of the data source can be obtained by calculating the proportion of reliable entity relationship instances among a plurality of previously marked entity relationship instances related to that data source.
In another embodiment according to the present invention, the confidence levels of a plurality of data sources including this data source can be obtained by a predetermined iterative algorithm, based on the association relationships among the plurality of data sources and the known initial confidence levels of some of those data sources.
In another embodiment according to the present invention, the confidence level of the extraction rule can be obtained by calculating the proportion of reliable entity relationship instances among a plurality of previously marked entity relationship instances related to that extraction rule.
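The proportion-based confidence level described above for data sources and extraction rules can be sketched in a few lines of Python. The function name `ratio_confidence` and the dictionary keys `"source"`, `"rule"` and `"reliable"` are assumptions introduced for illustration.

```python
from collections import defaultdict

def ratio_confidence(marked_instances, key):
    """Confidence per data source or extraction rule as a proportion.

    marked_instances: iterable of dicts with "source", "rule" and
                      "reliable" (bool) entries, marked in advance.
    key: "source" or "rule", selecting what the confidence is computed for.
    """
    total = defaultdict(int)
    reliable = defaultdict(int)
    for inst in marked_instances:
        total[inst[key]] += 1
        if inst["reliable"]:
            reliable[inst[key]] += 1
    # Proportion of reliable instances among all marked instances per key.
    return {name: reliable[name] / total[name] for name in total}
```

The same function serves both cases: passing `key="source"` gives the data source confidence levels, and `key="rule"` gives the extraction rule confidence levels.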
In a further embodiment according to the present invention, the reliability-related information can comprise wide-area context information and predetermined wide-area context decision rules, and the determining device can be configured to determine the confidence level of the entity relationship instance based on the wide-area context information and the predetermined wide-area context decision rules.
In another embodiment according to the present invention, the reliability-related information can further comprise wide-area context information and predetermined wide-area context decision rules, and the determining device can be configured to determine the confidence level of the entity relationship instance further based on the wide-area context information and the predetermined wide-area context decision rules.
In another embodiment according to the present invention, the wide-area context information can be the business type information of the entities involved in the entity relationship instance, and the predetermined wide-area context decision rules are rules relating to entity business type information.
In a further embodiment according to the present invention, the reliability-related information can comprise relationship history decision rules, and the marking device 901 can be configured to mark entity relationship instances involving the same entity pair based on the relationship history decision rules.
In another embodiment according to the present invention, the relationship history decision rules can comprise agent-patient relationship pairs and/or relationship change patterns.
In another embodiment according to the present invention, the reliability-related information can further comprise relationship history decision rules, and the marking device can be configured to mark entity relationship instances involving the same entity pair further based on the relationship history decision rules.
In another embodiment according to the present invention, the equipment 900 further comprises a saving device 903 for saving marked entity relationship instances whose confidence level lies within a predetermined threshold range to a database.
For the specific operations of the marking device 901, the filtering device 902, the saving device 903, the determining device, the comparing device and so on in the above embodiments, reference can be made to the description given above, in conjunction with Fig. 1 to Fig. 8, of the method for filtering entity relationship instances according to embodiments of the present invention.
Those skilled in the art will appreciate that embodiments of the present invention can be implemented in software, in hardware, or in a combination of software and hardware. The hardware portion can be implemented using dedicated logic; the software portion can be stored in a memory and executed by a suitable instruction execution system, for example a microprocessor or specially designed hardware.
Although the present invention has been described with reference to the embodiments currently contemplated, it should be understood that the invention is not limited to the disclosed embodiments. On the contrary, the invention is intended to cover the various modifications and equivalent arrangements included within the spirit and scope of the appended claims. The scope of the claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.

Claims (26)

1. A method for filtering entity relationship instances, comprising:
marking the reliability of entity relationship instances based on reliability-related information of the entity relationship instances; and
filtering the marked entity relationship instances to obtain reliable entity relationship instances.
2. The method according to claim 1, wherein marking the reliability of an entity relationship instance comprises:
determining the confidence level of the entity relationship instance based on the reliability-related information of the entity relationship instance; and
comparing the determined confidence level with a predetermined confidence level threshold, so as to mark the entity relationship instance as reliable or unreliable.
3. The method according to claim 2, wherein the reliability-related information comprises at least one of the confidence level of the data source of the entity relationship instance and the confidence level of the extraction rule of the entity relationship instance, and the confidence level of the entity relationship instance is determined based on at least one of the confidence level of the data source of the entity relationship instance and the confidence level of the extraction rule of the entity relationship instance.
4. The method according to claim 3, wherein the confidence level of the data source is obtained by calculating the proportion of reliable entity relationship instances among a plurality of previously marked entity relationship instances related to the data source.
5. The method according to claim 3, wherein the confidence levels of a plurality of data sources including the data source are obtained by a predetermined iterative algorithm, based on the association relationships among the plurality of data sources and the known initial confidence levels of some of those data sources.
6. The method according to claim 3, wherein the confidence level of the extraction rule is obtained by calculating the proportion of reliable entity relationship instances among a plurality of previously marked entity relationship instances related to the extraction rule.
7. The method according to claim 2, wherein the reliability-related information comprises wide-area context information and predetermined wide-area context decision rules, and wherein the confidence level of the entity relationship instance is determined based on the wide-area context information and the predetermined wide-area context decision rules.
8. The method according to claim 3, wherein the reliability-related information further comprises wide-area context information and predetermined wide-area context decision rules, and wherein the confidence level of the entity relationship instance is determined further based on the wide-area context information and the predetermined wide-area context decision rules.
9. The method according to claim 7 or 8, wherein the wide-area context information is the business type information of the entities involved in the entity relationship instance, and the predetermined wide-area context decision rules are rules relating to entity business type information.
10. The method according to claim 1, wherein the reliability-related information comprises relationship history decision rules, and wherein entity relationship instances involving the same entity pair are marked based on the relationship history decision rules.
11. The method according to claim 10, wherein the relationship history decision rules comprise agent-patient relationship pairs and/or relationship change patterns.
12. The method according to any one of claims 3, 7 and 8, wherein the reliability-related information further comprises relationship history decision rules, and wherein entity relationship instances involving the same entity pair are marked further based on the relationship history decision rules.
13. The method according to claim 1, further comprising saving marked entity relationship instances whose confidence level lies within a predetermined threshold range to a database.
14. Equipment for filtering entity relationship instances, comprising:
a marking device for marking the reliability of entity relationship instances based on reliability-related information of the entity relationship instances; and
a filtering device for filtering the marked entity relationship instances to obtain reliable entity relationship instances.
15. The equipment according to claim 14, wherein the marking device comprises:
a determining device for determining the confidence level of an entity relationship instance based on the reliability-related information of the entity relationship instance; and
a comparing device for comparing the determined confidence level with a predetermined confidence level threshold, so as to mark the entity relationship instance as reliable or unreliable.
16. The equipment according to claim 15, wherein the reliability-related information comprises at least one of the confidence level of the data source of the entity relationship instance and the confidence level of the extraction rule of the entity relationship instance, and wherein the determining device is configured to determine the confidence level of the entity relationship instance based on at least one of the confidence level of the data source of the entity relationship instance and the confidence level of the extraction rule of the entity relationship instance.
17. The equipment according to claim 16, wherein the confidence level of the data source is obtained by calculating the proportion of reliable entity relationship instances among a plurality of previously marked entity relationship instances related to the data source.
18. The equipment according to claim 16, wherein the confidence levels of a plurality of data sources including the data source are obtained by a predetermined iterative algorithm, based on the association relationships among the plurality of data sources and the known initial confidence levels of some of those data sources.
19. The equipment according to claim 16, wherein the confidence level of the extraction rule is obtained by calculating the proportion of reliable entity relationship instances among a plurality of previously marked entity relationship instances related to the extraction rule.
20. The equipment according to claim 15, wherein the reliability-related information comprises wide-area context information and predetermined wide-area context decision rules, and wherein the determining device is configured to determine the confidence level of the entity relationship instance based on the wide-area context information and the predetermined wide-area context decision rules.
21. The equipment according to claim 16, wherein the reliability-related information further comprises wide-area context information and predetermined wide-area context decision rules, and the determining device is configured to determine the confidence level of the entity relationship instance further based on the wide-area context information and the predetermined wide-area context decision rules.
22. The equipment according to claim 20 or 21, wherein the wide-area context information is the business type information of the entities involved in the entity relationship instance, and the predetermined wide-area context decision rules are rules relating to entity business type information.
23. equipment according to claim 14, wherein, described reliability relevant information comprises the historical decision rule of relation, and wherein said labelling apparatus is configured to, and right entity relationship example carries out mark to relating to identical entity based on the historical decision rule of relation.
24. equipment according to claim 23, wherein, the historical decision rule of described relation comprise agent-word denoting the receiver of an action relation to and/or the relationship change pattern.
25. according to claim 16,20 and 21 any one described equipment, wherein, described reliability relevant information further comprises the historical decision rule of relation, and wherein said labelling apparatus is configured to, and further right entity relationship example carries out mark to relating to identical entity based on the historical decision rule of relation.
26. equipment according to claim 14 further comprises save set, be used for through mark, the entity relationship example of confidence level in predetermined threshold range be saved in the storehouse.
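
As an illustration only, the rules named in claims 20 to 26 might be sketched as follows. This is a minimal sketch and not the implementation disclosed in the application: the class `RelationInstance`, the tables `BUSINESS_TYPE`, `PLAUSIBLE` and `IMPLAUSIBLE_CHANGES`, the adjustment sizes, and the threshold values are all hypothetical assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class RelationInstance:
    subject: str              # agent entity
    obj: str                  # patient entity
    relation: str             # relation label, e.g. "supplies"
    confidence: float = 0.5   # marked confidence, assumed to lie in [0, 1]


# Hypothetical wide-area context: the business type of each entity.
BUSINESS_TYPE = {
    "AcmeChips": "semiconductor",
    "BoltAuto": "automotive",
    "CityBank": "banking",
}

# Hypothetical wide-area context decision rule: combinations of
# (relation, agent business type, patient business type) treated as plausible.
PLAUSIBLE = {
    ("supplies", "semiconductor", "automotive"),
    ("finances", "banking", "semiconductor"),
}

# Hypothetical relationship change patterns treated as implausible
# when they occur in sequence for the same entity pair.
IMPLAUSIBLE_CHANGES = {("acquires", "competes_with")}


def apply_context_rule(inst, delta=0.2):
    """Raise or lower confidence according to the business-type rule."""
    key = (inst.relation,
           BUSINESS_TYPE.get(inst.subject),
           BUSINESS_TYPE.get(inst.obj))
    adjustment = delta if key in PLAUSIBLE else -delta
    inst.confidence = min(1.0, max(0.0, inst.confidence + adjustment))
    return inst


def apply_history_rule(instances, delta=0.3):
    """Penalize an instance whose relation forms an implausible change
    from the previous relation seen for the same entity pair.
    The input is assumed to be in chronological order."""
    last_relation = {}
    for inst in instances:
        pair = frozenset((inst.subject, inst.obj))
        previous = last_relation.get(pair)
        if previous is not None and (previous, inst.relation) in IMPLAUSIBLE_CHANGES:
            inst.confidence = max(0.0, inst.confidence - delta)
        last_relation[pair] = inst.relation
    return instances


def save_reliable(instances, low=0.6, high=1.0):
    """Keep only instances whose confidence is within [low, high]."""
    return [i for i in instances if low <= i.confidence <= high]
```

In this sketch the context rule and the history rule each adjust the marked confidence independently, and `save_reliable` stands in for the storage device of claim 26 by returning the instances that would be written to the repository.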
CN2009101380558A 2009-05-06 2009-05-06 Method and equipment for filtering entity relationship instance Pending CN101882259A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2009101380558A CN101882259A (en) 2009-05-06 2009-05-06 Method and equipment for filtering entity relationship instance

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2009101380558A CN101882259A (en) 2009-05-06 2009-05-06 Method and equipment for filtering entity relationship instance

Publications (1)

Publication Number Publication Date
CN101882259A true CN101882259A (en) 2010-11-10

Family

ID=43054271

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2009101380558A Pending CN101882259A (en) 2009-05-06 2009-05-06 Method and equipment for filtering entity relationship instance

Country Status (1)

Country Link
CN (1) CN101882259A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103167030A (en) * 2013-03-07 2013-06-19 北京山海树科技有限公司 System and method for detecting and building relations in communication system
CN103561123A (en) * 2013-10-28 2014-02-05 北京国双科技有限公司 Method and device for determining IP segment affiliation
CN107977379A (en) * 2016-10-25 2018-05-01 百度国际科技(深圳)有限公司 Method and apparatus for mined information
CN109472032A (en) * 2018-11-14 2019-03-15 北京锐安科技有限公司 A kind of determination method, apparatus, server and the storage medium of entity relationship diagram
CN109885827A (en) * 2019-01-08 2019-06-14 北京捷通华声科技股份有限公司 A kind of recognition methods and system of the name entity based on deep learning

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103167030A (en) * 2013-03-07 2013-06-19 北京山海树科技有限公司 System and method for detecting and building relations in communication system
CN103167030B (en) * 2013-03-07 2016-08-03 北京山海树科技有限公司 A kind of relation in communication system detects and relation sets up system and method
CN103561123A (en) * 2013-10-28 2014-02-05 北京国双科技有限公司 Method and device for determining IP segment affiliation
CN103561123B (en) * 2013-10-28 2017-05-10 北京国双科技有限公司 Method and device for determining IP segment affiliation
CN107977379A (en) * 2016-10-25 2018-05-01 百度国际科技(深圳)有限公司 Method and apparatus for mined information
CN107977379B (en) * 2016-10-25 2022-06-28 百度国际科技(深圳)有限公司 Method and device for mining information
CN109472032A (en) * 2018-11-14 2019-03-15 北京锐安科技有限公司 A kind of determination method, apparatus, server and the storage medium of entity relationship diagram
CN109885827A (en) * 2019-01-08 2019-06-14 北京捷通华声科技股份有限公司 A kind of recognition methods and system of the name entity based on deep learning
CN109885827B (en) * 2019-01-08 2023-10-27 北京捷通华声科技股份有限公司 Deep learning-based named entity identification method and system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20101110