CN102651028B - Uncertain data provenance query processing method based on D-S evidence theory - Google Patents

Uncertain data provenance query processing method based on D-S evidence theory

Info

Publication number
CN102651028B
CN102651028B CN201210099515.2A CN201210099515A CN102651028B CN 102651028 B CN102651028 B CN 102651028B CN 201210099515 A CN201210099515 A CN 201210099515A CN 102651028 B CN102651028 B CN 102651028B
Authority
CN
China
Prior art keywords
data item
result
evidence
pedigree
probability assignment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201210099515.2A
Other languages
Chinese (zh)
Other versions
CN102651028A (en)
Inventor
岳昆
刘惟一
杨彦超
王源
田凯琳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yunnan University YNU
Original Assignee
Yunnan University YNU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yunnan University YNU filed Critical Yunnan University YNU
Priority to CN201210099515.2A priority Critical patent/CN102651028B/en
Publication of CN102651028A publication Critical patent/CN102651028A/en
Application granted granted Critical
Publication of CN102651028B publication Critical patent/CN102651028B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to an uncertain-data provenance query processing method based on D-S evidence theory. Taking selection, projection and join (SPJ) queries over uncertain data tables as representative, the method first obtains, from the provenance expression that describes the SPJ query, the basic probability assignment that each input data item gives to each result data item. Next, using the evidence combination rule of D-S evidence theory, it computes the combined effect of the uncertainty of multiple input data items on the uncertainty of each result data item, yielding the probability assignment of each result data item. Finally, it normalizes these probability assignments and computes the belief value and plausibility value of each result data item, which together measure the item's uncertainty. The result is consistent with the one obtained directly from the possible-world instances of the uncertain input data, so the provenance query results can be verified and evaluated on that basis.

Description

Uncertain data provenance query processing method based on D-S evidence theory
I. Technical Field
The invention discloses an uncertain-data provenance (lineage) query processing method based on Dempster-Shafer (D-S) evidence theory. It concerns methods that use D-S evidence theory to represent and reason about uncertainty in data, trace the sources of uncertainty through data processing, and answer provenance queries. It belongs to the fields of database technology and information processing.
II. Background Art
With the progress of technology and a deepening understanding of data acquisition and processing, uncertain data has received wide attention. It is pervasive in fields such as economics, logistics, finance, telecommunications and scientific computing, where it plays a pivotal role. Data in a probabilistic database carries uncertainty (for example probabilities or intervals), and so do query results; this is the principal difference between uncertain and deterministic data.
Provenance (also called lineage) refers to the whole process by which data is produced and evolves over time. In fields such as scientific and sensor data management, privacy protection and digital libraries, provenance makes it possible to trace data and the sources of its uncertainty, answer users' historical and uncertainty queries, improve the efficiency and accuracy of sensor data queries, return analysis results over privacy-protected data, and evaluate data quality and reliability. The provenance of uncertain data can therefore serve as an important technique for investigating the sources and evolution of data uncertainty.
A provenance expression records how data has been processed and has evolved. Provenance query processing computes the uncertainty of the result from the provenance expression and the uncertain input data, so as to trace the sources of uncertainty through the processing and evolution of uncertain data; uncertainty reasoning is the key to provenance query processing. Provenance information makes it possible to avoid enumerating all possible worlds and thus improves processing efficiency, which requires an effective provenance representation and a corresponding uncertainty-reasoning mechanism. Known provenance query processing methods represent provenance as Boolean formulas or graph structures, which reflect the correlations among related data, and compute the uncertainty of query results on the basis of probability theory. Gao Ming et al. (Chinese Journal of Computers, 2010, 33(3): 373-389) analyzed the status quo and challenges of uncertain-data provenance management; Huang Dongmei et al. (patent CN201110004234.X, 2011), building on ULDB, an uncertain data management system with provenance, used provenance functions to find the sources of uncertain marine monitoring data; Gao Ming (doctoral dissertation, Fudan University, 2011) gave a tree-based approximate description of uncertain-data provenance and a method for evaluating the uncertainty of target data; Yue Kun et al. (Chinese Journal of Computers, 2011, 34(10): 1897-1906) proposed, for provenance query processing, a representation of uncertain-data provenance based on probabilistic graphical models.
In known provenance query processing methods, uncertainty is computed on the basis of probability theory, which requires a complete probability space and rests on the assumption that the given uncertain data has complete prior probabilities. This ignores the fact that real uncertain data is often incomplete or partially missing, which impairs the accuracy of the results. For this reason, known methods apply D-S evidence theory to uncertain data management, introducing belief functions instead of probabilities to measure uncertainty in incomplete data. Li Fang et al. (Journal of Computer Applications, 2009, 29(11): 3092-3094) combined D-S evidence theory with decision trees and proposed a classification algorithm for uncertain data; Jiang Xiaohua (doctoral dissertation, Chongqing University, 2009) extended the ULDB system with evidence theory, proposing the concept of tuple confidence, the representation of subjectively uncertain data, the handling of null values, and the corresponding data query and update methods. These methods, however, do not address provenance query processing or the uncertainty reasoning it involves.
The present invention takes uncertainty reasoning as its core and the provenance expression and uncertain input data as its starting point. It treats data items and their probabilities as the evidence for result data items and the degree of trust in that evidence, respectively. It proposes a method for converting the probabilities of input data items into basic probability assignments in D-S evidence theory, establishes methods based on D-S evidence theory for computing the probability assignment, belief value and plausibility value of each result data item, and provides a mechanism that describes the uncertainty of provenance query results with belief and plausibility values, together with a strategy for verifying the validity of the results. The method provides a new theoretical and technical basis for provenance-based applications such as query optimization, result inference and quality assessment of uncertain data.
III. Summary of the Invention
The object of the present invention is to provide an uncertain-data provenance query processing method based on D-S evidence theory. Because uncertain data is often incomplete or partially missing, the method relies on D-S evidence theory and does not assume that prior knowledge is complete. It treats data items and their probabilities as the evidence for result data items and the degree of trust in that evidence, respectively, establishes an evidence-based method for uncertainty reasoning over the data, measures the uncertainty of provenance query results with belief and plausibility values, and provides a strategy for verifying the validity of the results. It thereby obtains accurate provenance query results in a more realistic way and lays a foundation for query optimization, result inference and quality assessment of uncertain data.
2. The invention is carried out in the following steps
The technical flow of the invention is as follows. First, taking selection, projection and join (SPJ) queries over two uncertain data tables as representative, the basic probability assignment that each input data item gives to each result data item is obtained from the provenance expression describing the SPJ query. Second, using Dempster's rule of combination from D-S evidence theory, the combined effect of the uncertainty of multiple input data items on the uncertainty of each result data item is computed, yielding the probability assignment of each result data item. Third, the probability assignments of the result data items are normalized and the belief value and plausibility value of each result data item are computed, which measures the uncertainty of the result data item. The outcome is consistent with the result obtained directly from the possible-world instances of the uncertain input data, so the provenance query results can be verified and evaluated on that basis.
(1) Obtain the basic probability assignments of the input data items
Let A and B be input data tables with tuple-level uncertainty.
[The layout of the tables appears as an image in the original and is not reproduced here.]
The SPJ query joins A with B and projects the join result onto attribute c, that is, $\pi_c(A \bowtie B)$, which yields a result table R containing attribute c. $\{r_1, r_2, \ldots, r_l\}$ are the data items in R (denoted by tuple identifiers). $\lambda: A \times B \to R$ is the provenance function, and $\lambda(r_j)$ ($1 \le j \le l$), the provenance expression of $r_j$, is a Boolean formula over the data items of A and B. A serves as the evidence table, that is, A provides the evidence for R: each data item in A holds several possible values of the same entity or event (separated by "||"). $\{a_1, a_2, \ldots, a_n\}$ are the data items in A. The probability (also called confidence) of the k-th possible value $a_{ik}$ of $a_i$ is denoted $p_{ik}$ ($1 \le i \le n$, $k \ge 1$), with $\sum_k p_{ik} \le 1$, where n is the number of data items (tuples) in A.
Similarly, $\{b_1, b_2, \ldots, b_{n'}\}$ are the data items in B. The probability of the y-th possible value $b_{xy}$ of $b_x$ is denoted $p_{xy}$ ($1 \le x \le n'$, $y \ge 1$), with $\sum_y p_{xy} \le 1$, where $n'$ is the number of data items (tuples) in B.
$m_i(r_j)$ denotes the basic probability assignment that evidence $a_i$ gives to result $r_j$.
① If $a_{ik} \wedge b_{xy} \in \lambda(r_j)$, then $m_i(r_j) = p_{ik} \cdot p_{xy}$.
② If $\sum_k p_{ik} < 1$, then $m_i(\Theta) = 1 - \sum_{j=1}^{l} m_i(r_j)$, where $\Theta$ denotes the subset of other data items in R (that is, unknown information).
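Rules ① and ② can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the dictionary layout, the function name `basic_probability_assignments`, the `"THETA"` label, and the choice to add the products when several conjuncts of the same evidence item support the same result are all hypothetical. The sample values are those of the traffic-accident embodiment.

```python
from collections import defaultdict

THETA = "THETA"  # hypothetical label for the unknown-information element


def basic_probability_assignments(lineage, p_a, p_b):
    """Derive m_i(r_j) for every evidence item i of table A.

    lineage: {result_id: [((i, k), (x, y)), ...]} where each pair is one
             conjunct a_ik AND b_xy of the provenance expression.
    p_a, p_b: {(item, alternative): probability} for tables A and B.
    """
    masses = defaultdict(dict)
    # Rule 1: a_ik AND b_xy in lambda(r_j)  =>  m_i(r_j) = p_ik * p_xy
    for result, conjuncts in lineage.items():
        for a_alt, b_alt in conjuncts:
            evidence = a_alt[0]
            contribution = p_a[a_alt] * p_b[b_alt]
            masses[evidence][result] = masses[evidence].get(result, 0.0) + contribution
    # Rule 2: if the alternatives of a_i sum to less than 1,
    # the remaining mass goes to THETA.
    totals = defaultdict(float)
    for (evidence, _k), p in p_a.items():
        totals[evidence] += p
    for evidence, total in totals.items():
        if total < 1.0 - 1e-12:  # small tolerance for float rounding
            masses[evidence][THETA] = 1.0 - sum(masses[evidence].values())
    return {evidence: dict(m) for evidence, m in masses.items()}


# Data of the embodiment: witnesses 21 and 22, drivers 31 and 32.
lineage = {
    41: [((21, 1), (31, 1))],
    42: [((21, 2), (32, 1)), ((22, 1), (32, 1))],
}
p_a = {(21, 1): 0.8, (21, 2): 0.2, (22, 1): 0.7}
p_b = {(31, 1): 1.0, (32, 1): 1.0}
bpa = basic_probability_assignments(lineage, p_a, p_b)
print(bpa)
```

For this input, evidence 21 assigns 0.8 to result 41 and 0.2 to result 42, while evidence 22 assigns 0.7 to result 42 and the remainder to `THETA`.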
(2) Compute the probability assignments of the result data items
Using Dempster's rule of combination from D-S evidence theory, combine the basic probability assignments $m_i(r_j)$ and $m_{i'}(r_{j'})$ of any two different pieces of evidence ($1 \le i, i' \le n$, $i \ne i'$; $r_j, r_{j'} \in \{r_1, r_2, \ldots, r_l\} \cup \Theta$). For the result data item $r = r_j \cap r_{j'}$, let $m(r)$ denote the result of combining $m_i(r_j)$ and $m_{i'}(r_{j'})$:
$m(r) = (m_i \oplus m_{i'})(r) = \sum_{r_j \cap r_{j'} = r} m_i(r_j) \cdot m_{i'}(r_{j'})$ (Formula 1)
Here $\oplus$ is the evidence combination operator, which is commutative and associative. The combination of (Formula 1) is therefore applied in the order $m_1, m_2, \ldots, m_n$ (n is the number of data items in the evidence table A), which gives the combined effect of the uncertainty of all input data items in A and B on the result r.
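The combination of (Formula 1) can be sketched as below. The representation is an assumption made for illustration: each focal element is a Python `frozenset` of result identifiers, $\Theta$ is modeled as the set of all result identifiers so that $r \cap \Theta = r$, and the mass that falls on the empty set is kept (unnormalized) under `frozenset()`. The names `combine` and `combine_all` are hypothetical.

```python
from functools import reduce
from itertools import product


def combine(m1, m2):
    """Formula 1: m(r) = sum over r_j & r_j' == r of m1(r_j) * m2(r_j').

    m1, m2: {frozenset_of_result_ids: mass}. The result is not normalized;
    conflicting mass accumulates under the empty frozenset.
    """
    combined = {}
    for (f1, w1), (f2, w2) in product(m1.items(), m2.items()):
        focal = f1 & f2
        combined[focal] = combined.get(focal, 0.0) + w1 * w2
    return combined


def combine_all(mass_functions):
    """Fold the pairwise rule over m_1, m_2, ..., m_n in order."""
    return reduce(combine, mass_functions)


# Values of the embodiment; THETA is modeled as the whole result set.
THETA = frozenset({41, 42})
m1 = {frozenset({41}): 0.8, frozenset({42}): 0.2}
m2 = {frozenset({42}): 0.7, THETA: 0.3}
m = combine_all([m1, m2])
for focal, mass in m.items():
    print(sorted(focal), round(mass, 2))
```

Because multiplication and set intersection are both commutative and associative, the order in which `combine_all` folds the mass functions does not change the combined masses beyond floating-point rounding.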
(3) Measure the uncertainty of the result data items
If $m(\emptyset) > 0$, the input data items serving as evidence conflict with one another when the uncertainty of the data items in R is inferred from the provenance expressions. Therefore, to compute the belief value of r ($r \in \{r_1, r_2, \ldots, r_l\} \cup \Theta$), the combined probability mass lost to the empty set $\emptyset$ is redistributed proportionally over the non-empty results: the normalization factor $K = 1 - m(\emptyset)$ is introduced and $m(r)$ is normalized, so that $\sum_{r \ne \emptyset} m(r) \cdot K^{-1} = 1$ still holds.
Then, for each result data item $r_j$ ($1 \le j \le l$), compute the belief value $\mathrm{Bel}(r_j) = m(r_j) \cdot K^{-1}$ and the plausibility value $\mathrm{Pl}(r_j) = 1 - \mathrm{Bel}(\neg r_j)$, which express, respectively, the degree of belief that $r_j$ in R is true and the degree of belief that it is not false. The interval formed by the belief value and the plausibility value measures the uncertainty of the result data item $r_j$, written $U(r_j) = [\mathrm{Bel}(r_j), \mathrm{Pl}(r_j)]$.
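The normalization and the interval $U(r_j)$ can be sketched as follows. The code assumes the same set-based representation as a conventional D-S implementation (focal elements as `frozenset` objects, conflict mass stored under `frozenset()`); the function name `belief_plausibility` and the parameter `frame` (the set of all result identifiers) are hypothetical. The input masses are the combined values of the embodiment.

```python
def belief_plausibility(m, item, frame):
    """Return (Bel, Pl) for a single result item.

    m:     combined, unnormalized masses {frozenset: mass}
    item:  identifier of the result data item r_j
    frame: frozenset of all result identifiers
    """
    k = 1.0 - m.get(frozenset(), 0.0)  # normalization factor K = 1 - m(empty)
    if k <= 0.0:
        raise ValueError("total conflict: the evidence cannot be normalized")
    normalized = {focal: mass / k for focal, mass in m.items() if focal}
    target = frozenset({item})
    complement = frame - target
    # Bel(r_j): normalized mass committed to r_j itself.
    bel = sum(mass for focal, mass in normalized.items() if focal <= target)
    # Pl(r_j) = 1 - Bel(not r_j)
    bel_not = sum(mass for focal, mass in normalized.items() if focal <= complement)
    return bel, 1.0 - bel_not


frame = frozenset({41, 42})
m = {frozenset(): 0.56, frozenset({41}): 0.24, frozenset({42}): 0.20}
bel_41, pl_41 = belief_plausibility(m, 41, frame)
bel_42, pl_42 = belief_plausibility(m, 42, frame)
print(round(bel_41, 3), round(pl_41, 3), round(bel_42, 3), round(pl_42, 3))
```

In this example no normalized mass remains on a set larger than a single item, so belief and plausibility coincide and each interval collapses to a point.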
(4) Verify and evaluate the validity of the results
Under the possible-world model of uncertain data, an uncertain data source is expanded exhaustively into many deterministic database instances (called possible-world instances). From the uncertain input data items, the possible-world instances that contain only $r_j$ ($1 \le j \le l$) and those that contain $r_j$ together with other data items give the probability bounds $L_j$ and $U_j$, respectively. It follows that $\mathrm{Bel}(r_j) \in [L_j, U_j]$, which shows that measuring the uncertainty of a result data item with $U(r)$ is reasonable.
In steps (1) to (4) above, the basic probability assignments of the input data items to the result data items are obtained from the provenance expressions; unknown information is taken into account, and there is no need to assume that prior knowledge is complete. Computing the probability assignments of the result data items with Dempster's rule of combination reflects the intrinsic relationships among the uncertain data items involved in the provenance and yields the combined effect of the input data uncertainty on the uncertainty of the query result.
3. Advantages and positive effects of the invention compared with the known art
(1) There is no need to assume that the prior knowledge of the uncertain data is complete. Uncertainty reasoning can be carried out directly on any given uncertain data, so that the sources of uncertainty can be traced through data processing and the likelihood of the processing results can be inferred. Compared with known probability-based methods, the method is more general and more widely applicable.
(2) The method quantitatively reflects the intrinsic relationships among the input data items serving as evidence and measures the combined effect of the input data uncertainty on the uncertainty of the result data, providing a finer-grained and more realistic provenance query processing mechanism. Compared with known provenance management methods built on the assumption that data items are independent, it can obtain more accurate uncertainty-reasoning results.
(3) Representing the uncertainty of provenance query results by the interval formed by the belief and plausibility values clearly shows how the completeness of prior knowledge in the input data affects the query result. Compared with known probability-based representations of query-result uncertainty, it is more flexible and more interpretable.
In summary, the invention establishes an uncertain-data provenance query processing method that applies whether or not prior knowledge is complete, reflecting the nature of provenance itself and the underlying needs of provenance management. The mature D-S evidence theory provides effective support for uncertainty reasoning over uncertain data and gives uncertain-data provenance query processing a new modeling approach and computation method. It also provides strong technical support for provenance-based applications such as query optimization, result inference and data quality evaluation, and for key open problems in uncertain data management.
IV. Description of the Drawings
Fig. 1 is the technical roadmap of the invention. It comprises three parts: obtaining the basic probability assignments of the input data items serving as evidence, computing the probability assignments of the result data items, and measuring the uncertainty of the result data items.
V. Detailed Description of the Embodiments
Embodiment: provenance query processing for identifying traffic-accident suspects
(1) Uncertain data and the provenance query
The uncertain data consists of the "witness investigation (Witness)" records and the vehicle administration office's "driving records (Driver)", shown in Table 1 and Table 2 respectively. The witness investigation records in Table 1 contain two evidence data items, namely the witness statements of "Zhang San" and "Li Si"; the "confidence" column gives the probability of each possible value of the evidence data. The "witness investigation" and "driving records" tables are joined on the "license plate number" attribute.
Table 1. Witness investigation (Witness)
[Tables 1 and 2 appear as images in the original and are not reproduced here.]
For the SPJ query $\pi_{\text{driver}}(\text{Witness} \bowtie \text{Driver})$, the query result "Suspect" and its provenance expressions are shown in Table 3. Here $\lambda(41) = (21,1) \wedge (31,1)$ means that result data item 41 is obtained by joining the 1st possible value of witness 21 with driver 31, and $\lambda(42) = ((21,2) \vee (22,1)) \wedge (32,1)$ means that result data item 42 is obtained by merging the 2nd possible value of witness 21 with the 1st possible value of witness 22 and then joining with driver 32. $\Theta$ is the set of other witness information of witness 22, with $m_2(\Theta) = 1 - 0.7 \times 1.0 = 0.3$. A "?" marks a confidence of a result data item that needs to be computed.
Table 3. Suspect
[Table 3 appears as an image in the original and is not reproduced here.]
(2) Obtain the basic probability assignments of the input data items
Let $m_1$ and $m_2$ be the basic probability assignment functions of evidence 21 and 22, respectively. From the "witness investigation" data in Table 1 and the "driving records" data in Table 2:
① According to the provenance expressions, $(21,1) \wedge (31,1) \in \lambda(41)$, so $m_1(41) = 0.8 \times 1.0 = 0.8$; and $(21,2) \wedge (32,1),\ (22,1) \wedge (32,1) \in \lambda(42)$, so $m_1(42) = 0.2 \times 1.0 = 0.2$ and $m_2(42) = 0.7 \times 1.0 = 0.7$.
② Because the statement of witness "Li Si" (item 22) is incomplete, $m_2(\Theta) = 1 - 0.7 \times 1.0 = 0.3$, where $\Theta$ is the set of other witness information of witness 22.
(3) Compute the probability assignments of the result data items
① Combine the basic probability assignments by Dempster's rule of combination. For all intersections of the result data items of the two basic probability assignment functions, namely 41 ∩ 42, 41 ∩ Θ, 42 ∩ 42 and 42 ∩ Θ, compute the corresponding probability assignment as $m_1(\cdot) \times m_2(\cdot)$ (m is the probability assignment function obtained by combining $m_1$ and $m_2$):
$m(41 \cap 42) = m_1(41) \times m_2(42) = m(\emptyset) = 0.8 \times 0.7 = 0.56$
$m(41 \cap \Theta) = m_1(41) \times m_2(\Theta) = m(41) = 0.8 \times 0.3 = 0.24$
$m(42 \cap 42) = m_1(42) \times m_2(42) = m(42) = 0.2 \times 0.7 = 0.14$
$m(42 \cap \Theta) = m_1(42) \times m_2(\Theta) = m(42) = 0.2 \times 0.3 = 0.06$
as shown in Table 4.
Table 4. Combination of the basic probability assignments
[Table 4 appears as an image in the original; it tabulates the four products listed above.]
② Using the combination method of (Formula 1), compute the probability assignment of each data item in the provenance query result (that is, of each suspected driver):
$m(\emptyset) = 0.56$, $m(41) = 0.24$, $m(42) = 0.14 + 0.06 = 0.20$
(4) Measure the uncertainty of the result data items
$m(\emptyset) = 0.56 > 0$ shows that the two pieces of evidence in the witness investigation records conflict in identifying the suspected driver. The normalization factor $K = 1 - m(\emptyset) = 0.44$ is therefore introduced to normalize $m(\cdot)$:
① The belief values of the suspected drivers "Wang Wu" (item 41) and "Zhao Liu" (item 42) are:
$\mathrm{Bel}(41) = m(41) \cdot K^{-1} = 0.24 / 0.44 = 0.545$, $\mathrm{Bel}(42) = m(42) \cdot K^{-1} = 0.20 / 0.44 = 0.455$
② The plausibility values of the suspected drivers "Wang Wu" and "Zhao Liu" are:
$\mathrm{Pl}(41) = 1 - \mathrm{Bel}(\neg 41) = 1 - \mathrm{Bel}(42) = 0.545$, $\mathrm{Pl}(42) = 1 - \mathrm{Bel}(\neg 42) = 1 - \mathrm{Bel}(41) = 0.455$
Therefore the uncertainties of the suspected drivers "Wang Wu" and "Zhao Liu" are:
$U(41) = [\mathrm{Bel}(41), \mathrm{Pl}(41)] = [0.545, 0.545]$, $U(42) = [\mathrm{Bel}(42), \mathrm{Pl}(42)] = [0.455, 0.455]$
(5) Verify and evaluate the validity of the results
The possible worlds of the "witness investigation" table are
$\{\emptyset, \{(21,1)\}, \{(21,2)\}, \{(22,1)\}, \{(21,1),(22,1)\}, \{(21,2),(22,1)\}\}$,
and the probabilities of the possible-world instances are
$P(\emptyset) = 0$, $P(\{(21,1)\}) = 0.8 \times 0.3 = 0.24$, $P(\{(21,2)\}) = 0.2 \times 0.3 = 0.06$, $P(\{(22,1)\}) = 0$, $P(\{(21,1),(22,1)\}) = 0.8 \times 0.7 = 0.56$, $P(\{(21,2),(22,1)\}) = 0.2 \times 0.7 = 0.14$.
Because every data item in the "driving records" table has confidence 1.0, the probabilities of the suspected drivers "Wang Wu" and "Zhao Liu" satisfy:
$P(\{(21,1)\}) \le P(41) \le P(\{(21,1)\}) + P(\{(21,1),(22,1)\})$, that is, $0.24 \le P(41) \le 0.8$,
$P(\{(21,2)\}) + P(\{(21,2),(22,1)\}) \le P(42) \le P(\{(21,2)\}) + P(\{(21,2),(22,1)\}) + P(\{(21,1),(22,1)\})$, that is, $0.20 \le P(42) \le 0.76$.
Since $\mathrm{Bel}(41) = 0.545 \in [0.24, 0.8]$ and $\mathrm{Bel}(42) = 0.455 \in [0.20, 0.76]$, measuring the uncertainty of the suspected drivers "Wang Wu" and "Zhao Liu" with $U(41)$ and $U(42)$ is verified to be reasonable.
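The possible-world check can be reproduced by brute-force enumeration, as sketched below. The sketch rests on assumptions that are not stated as code in the source: each witness item contributes exactly one of its alternatives or, when its probabilities sum to less than 1, is absent with the remaining probability; alternatives of different items are independent; and, because every driver record has confidence 1.0, each witness alternative maps directly to one result item (the `produces` dictionary). The function names are hypothetical, and worlds of probability 0 are simply not generated.

```python
from itertools import product


def possible_worlds(x_tuples):
    """Yield (world, probability) for {item: {alternative: probability}}."""
    choices = []
    for item in sorted(x_tuples):
        options = [((item, k), p) for k, p in x_tuples[item].items()]
        missing = 1.0 - sum(x_tuples[item].values())
        if missing > 1e-12:  # the item may be absent from the world
            options.append((None, missing))
        choices.append(options)
    for combo in product(*choices):
        world = frozenset(alt for alt, _ in combo if alt is not None)
        prob = 1.0
        for _, p in combo:
            prob *= p
        yield world, prob


def bounds(x_tuples, produces, item):
    """Lower bound: worlds whose only result is `item`.
    Upper bound: worlds whose results include `item`."""
    lower = upper = 0.0
    for world, prob in possible_worlds(x_tuples):
        results = {produces[alt] for alt in world}
        if item in results:
            upper += prob
            if results == {item}:
                lower += prob
    return lower, upper


witness = {21: {1: 0.8, 2: 0.2}, 22: {1: 0.7}}
produces = {(21, 1): 41, (21, 2): 42, (22, 1): 42}
low_41, up_41 = bounds(witness, produces, 41)
low_42, up_42 = bounds(witness, produces, 42)
print(round(low_41, 2), round(up_41, 2), round(low_42, 2), round(up_42, 2))
```

The enumeration grows exponentially with the number of uncertain items, which is why it is useful only as a validation step on small inputs, while the evidence-combination steps work from the provenance expressions alone.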

Claims (2)

1. An uncertain-data provenance query processing method based on D-S evidence theory, characterized in that: first, taking selection, projection and join (SPJ) queries over two uncertain data tables as representative, the basic probability assignment that each input data item gives to each result data item is obtained from the provenance expression describing the SPJ query; then, using Dempster's rule of combination from D-S evidence theory, the combined effect of the uncertainty of multiple input data items on the uncertainty of each result data item is computed, yielding the probability assignment of each result data item; then, the probability assignments of the result data items are normalized and the belief value and plausibility value of each result data item are computed, which measures the uncertainty of the result data item and is consistent with the result obtained directly from the possible-world instances of the uncertain input data; the concrete steps are:
(1) Obtain the basic probability assignments of the input data items
Let A and B be input data tables with tuple-level uncertainty,
[the layout of the tables appears as an image in the original and is not reproduced here];
the SPJ query joins A with B and projects the join result onto attribute c, that is, $\pi_c(A \bowtie B)$, which yields a result table R containing attribute c; $\{r_1, r_2, \ldots, r_l\}$ are the data items in R (denoted by tuple identifiers); $\lambda: A \times B \to R$ is the provenance function, and $\lambda(r_j)$ ($1 \le j \le l$), the provenance expression of $r_j$, is a Boolean formula over the data items of A and B; A serves as the evidence table, that is, A provides the evidence for R, and each data item in A holds several possible values of the same entity or event; $\{a_1, a_2, \ldots, a_n\}$ are the data items in A, and the probability of the k-th possible value $a_{ik}$ of $a_i$ is denoted $p_{ik}$ ($1 \le i \le n$, $k \ge 1$), with $\sum_k p_{ik} \le 1$, where n is the number of data items (tuples) in A;
similarly, $\{b_1, b_2, \ldots, b_{n'}\}$ are the data items in B, and the probability of the y-th possible value $b_{xy}$ of $b_x$ is denoted $p_{xy}$ ($1 \le x \le n'$, $y \ge 1$), with $\sum_y p_{xy} \le 1$, where $n'$ is the number of data items in B;
$m_i(r_j)$ denotes the basic probability assignment that evidence $a_i$ gives to result $r_j$;
① if $a_{ik} \wedge b_{xy} \in \lambda(r_j)$, then $m_i(r_j) = p_{ik} \cdot p_{xy}$;
② if $\sum_k p_{ik} < 1$, then $m_i(\Theta) = 1 - \sum_{j=1}^{l} m_i(r_j)$, where $\Theta$ denotes the subset of other data items in R;
(2) Compute the probability assignments of the result data items
Using Dempster's rule of combination from D-S evidence theory, combine the basic probability assignments $m_i(r_j)$ and $m_{i'}(r_{j'})$ of any two different pieces of evidence ($1 \le i, i' \le n$, $i \ne i'$; $r_j, r_{j'} \in \{r_1, r_2, \ldots, r_l\} \cup \Theta$); for the result data item $r = r_j \cap r_{j'}$, let $m(r)$ denote the result of combining $m_i(r_j)$ and $m_{i'}(r_{j'})$:
$m(r) = (m_i \oplus m_{i'})(r) = \sum_{r_j \cap r_{j'} = r} m_i(r_j) \cdot m_{i'}(r_{j'})$ (Formula 1)
where $\oplus$ is the evidence combination operator, which is commutative and associative; the combination of Formula 1 is therefore applied in the order $m_1, m_2, \ldots, m_n$, which gives the combined effect of the uncertainty of all input data items in A and B on the result r, n being the number of data items in the evidence table A;
(3) Measure the uncertainty of the result data items
If $m(\emptyset) > 0$, the input data items serving as evidence conflict with one another when the uncertainty of the data items in R is inferred from the provenance expressions; therefore, to compute the belief value of r ($r \in \{r_1, r_2, \ldots, r_l\} \cup \Theta$), the combined probability mass lost to the empty set $\emptyset$ is redistributed proportionally over the non-empty results: the normalization factor $K = 1 - m(\emptyset)$ is introduced and $m(r)$ is normalized, so that $\sum_{r \ne \emptyset} m(r) \cdot K^{-1} = 1$ still holds; then, for each result data item $r_j$ ($1 \le j \le l$), the belief value $\mathrm{Bel}(r_j) = m(r_j) \cdot K^{-1}$ and the plausibility value $\mathrm{Pl}(r_j) = 1 - \mathrm{Bel}(\neg r_j)$ are computed, which express, respectively, the degree of belief that $r_j$ in R is true and the degree of belief that it is not false; the interval formed by the belief value and the plausibility value measures the uncertainty of the result data item $r_j$, written $U(r_j) = [\mathrm{Bel}(r_j), \mathrm{Pl}(r_j)]$;
(4) Verify and evaluate the validity of the results
Under the possible-world model of uncertain data, an uncertain data source is expanded exhaustively into many deterministic database instances, called possible-world instances; from the uncertain input data items, the possible-world instances that contain only $r_j$ ($1 \le j \le l$) and those that contain $r_j$ together with other data items give the probability bounds $L_j$ and $U_j$, respectively; it follows that $\mathrm{Bel}(r_j) \in [L_j, U_j]$, which shows that measuring the uncertainty of a result data item with $U(r)$ is reasonable;
in steps (1) to (4) above, the basic probability assignments of the input data items to the result data items are obtained from the provenance expressions, unknown information is taken into account, and there is no need to assume that prior knowledge is complete; computing the probability assignments of the result data items with Dempster's rule of combination reflects the intrinsic relationships among the uncertain data items involved in the provenance and yields the combined effect of the input data uncertainty on the uncertainty of the query result.
2. The uncertain-data provenance query processing method based on D-S evidence theory according to claim 1, characterized in that it is applied to provenance query processing for identifying traffic-accident suspects, as follows:
(1) Uncertain data and the provenance query
The uncertain data consists of the "witness investigation" records and the vehicle administration office's "driving records", shown in Table 1 and Table 2 respectively; the witness investigation records in Table 1 contain two evidence data items, namely the witness statements of "Zhang San" and "Li Si"; the "confidence" column gives the probability of each possible value of the evidence data; the "witness investigation" and "driving records" tables are joined on the "license plate number" attribute;
Table 1. Witness investigation
[Tables 1 and 2 appear as images in the original and are not reproduced here.]
For the SPJ query $\pi_{\text{driver}}(\text{Witness} \bowtie \text{Driver})$, the query result "Suspect" and its provenance expressions are shown in Table 3, where $\lambda(41) = (21,1) \wedge (31,1)$ means that result data item 41 is obtained by joining the 1st possible value of witness 21 with driver 31, and $\lambda(42) = ((21,2) \vee (22,1)) \wedge (32,1)$ means that result data item 42 is obtained by merging the 2nd possible value of witness 21 with the 1st possible value of witness 22 and then joining with driver 32; $\Theta$ is the set of other witness information of witness 22, with $m_2(\Theta) = 1 - 0.7 \times 1.0 = 0.3$; a "?" marks a confidence of a result data item that needs to be computed;
Table 3. Suspect
[Table 3 appears as an image in the original and is not reproduced here.]
(2) Obtain the basic probability assignments of the input data items
Let $m_1$ and $m_2$ be the basic probability assignment functions of evidence 21 and 22, respectively; from the "witness investigation" data in Table 1 and the "driving records" data in Table 2:
① according to the provenance expressions, $(21,1) \wedge (31,1) \in \lambda(41)$, so $m_1(41) = 0.8 \times 1.0 = 0.8$; and $(21,2) \wedge (32,1),\ (22,1) \wedge (32,1) \in \lambda(42)$, so $m_1(42) = 0.2 \times 1.0 = 0.2$ and $m_2(42) = 0.7 \times 1.0 = 0.7$;
② because the statement of witness "Li Si" is incomplete, $m_2(\Theta) = 1 - 0.7 \times 1.0 = 0.3$, where $\Theta$ is the set of other witness information of witness 22;
(3) Compute the probability assignments of the result data items
① Combine the basic probability assignments by Dempster's rule of combination; for all intersections of the result data items of the two basic probability assignment functions, namely 41 ∩ 42, 41 ∩ Θ, 42 ∩ 42 and 42 ∩ Θ, compute the corresponding probability assignment as $m_1(\cdot) \times m_2(\cdot)$ (m is the probability assignment function obtained by combining $m_1$ and $m_2$):
$m(41 \cap 42) = m_1(41) \times m_2(42) = m(\emptyset) = 0.8 \times 0.7 = 0.56$
$m(41 \cap \Theta) = m_1(41) \times m_2(\Theta) = m(41) = 0.8 \times 0.3 = 0.24$
$m(42 \cap 42) = m_1(42) \times m_2(42) = m(42) = 0.2 \times 0.7 = 0.14$
$m(42 \cap \Theta) = m_1(42) \times m_2(\Theta) = m(42) = 0.2 \times 0.3 = 0.06$
as listed in Table 4;
Table 4. Combination of the basic probability assignments
[Table 4 appears as an image in the original; it tabulates the four products listed above.]
② using the combination method of Formula 1, compute the probability assignment of each data item in the provenance query result:
$m(41) = 0.24$, $m(42) = 0.14 + 0.06 = 0.20$
(4) uncertainty of tolerance result data item
Figure FSB00001108404800043
Illustrate that there is conflict in two evidences witnessing in the investigation records for obtaining accusing the driver, therefore introduce standardizing factor
Figure FSB00001108404800044
M () is carried out standardization processing:
1. The belief values for accusing drivers "Wang Wu" and "Zhao Liu" are, respectively:
Bel(41)=m(41)·K⁻¹=0.24/0.44=0.545, Bel(42)=m(42)·K⁻¹=0.20/0.44=0.455
2. The plausibility (likelihood) values for accusing drivers "Wang Wu" and "Zhao Liu" are, respectively:
Pl(41)=1-Bel(¬41)=1-Bel(42)=0.545, Pl(42)=1-Bel(¬42)=1-Bel(41)=0.455
Therefore, the uncertainties of accusing drivers "Wang Wu" and "Zhao Liu" are, respectively:
U(41)=[Bel(41),Pl(41)]=[0.545,0.545],U(42)=[Bel(42),Pl(42)]=[0.455,0.455]
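The normalization and the belief/plausibility intervals can be checked with the following Python sketch. It takes K = 1 - m(∅) = 0.44, an assumption that matches the divisor 0.44 used in the text, and it relies on the fact that after combination only the singleton items 41 and 42 carry mass; the function names are hypothetical.

```python
# Unnormalized combined masses and conflicting mass from the previous step.
m = {41: 0.24, 42: 0.20}
conflict = 0.56

# Assumed normalization factor: the mass not lost to conflict.
K = 1.0 - conflict

normalized = {item: mass / K for item, mass in m.items()}


def belief(item):
    """Bel of a singleton equals its normalized mass (singleton focal elements only)."""
    return normalized[item]


def plausibility(item):
    """Pl(A) = 1 - Bel(not A); the complement is the set of all other items."""
    return 1.0 - sum(mass for other, mass in normalized.items() if other != item)


uncertainty = {item: (belief(item), plausibility(item)) for item in m}
for item, (bel, pl) in uncertainty.items():
    print(item, round(bel, 3), round(pl, 3))
```

Because no mass remains on Θ after normalization in this example, the belief and plausibility of each item coincide, which is why both intervals collapse to a single value.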
(5) Verify and assess the validity of the result
The possible worlds of the "witness investigation" table are
Figure FSB00001108404800045
Figure FSB00001108404800046
The probabilities of the possible world instances are, respectively,
Figure FSB00001108404800047
P({(21,1)})=0.8×0.3=0.24, P({(21,2)})=0.2×0.3=0.06, P({(22,1)})=0, P({(21,1),(22,1)})=0.8×0.7=0.56, P({(21,2),(22,1)})=0.2×0.7=0.14. Because the confidence of every data item in the "drive recorder" table is 1.0, the probabilities of accusing drivers "Wang Wu" and "Zhao Liu" are bounded as follows:
P({(21,1)}) ≤ P(41) ≤ P({(21,1)})+P({(21,1),(22,1)}), i.e. 0.24 ≤ P(41) ≤ 0.8,
P({(21,2)})+P({(21,2),(22,1)}) ≤ P(42) ≤ P({(21,2)})+P({(21,2),(22,1)})+P({(21,1),(22,1)}), i.e. 0.20 ≤ P(42) ≤ 0.76.
Since Bel(41)=0.545 ∈ [0.24, 0.8] and Bel(42)=0.455 ∈ [0.20, 0.76], this verifies that it is reasonable to measure the uncertainty of accusing drivers "Wang Wu" and "Zhao Liu" with U(41) and U(42), respectively.
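The possible-world check can be sketched in Python as follows. The world probabilities are those listed in the text; the rule used for the bounds is an interpretation adopted for this example, not a statement of the patented procedure: the lower bound sums the worlds in which every witness tuple supports the item, and the upper bound sums the worlds in which at least one tuple supports it. The names `worlds`, `supporters` and `bounds` are hypothetical.

```python
# Possible worlds of the "witness investigation" table and their probabilities.
worlds = {
    frozenset({(21, 1)}): 0.8 * 0.3,
    frozenset({(21, 2)}): 0.2 * 0.3,
    frozenset({(22, 1)}): 0.0,
    frozenset({(21, 1), (22, 1)}): 0.8 * 0.7,
    frozenset({(21, 2), (22, 1)}): 0.2 * 0.7,
}

# Witness alternatives appearing in the provenance of each result item.
supporters = {41: {(21, 1)}, 42: {(21, 2), (22, 1)}}


def bounds(item):
    """Lower and upper probability of a result item over the possible worlds."""
    s = supporters[item]
    lower = sum(p for world, p in worlds.items() if world <= s)  # all tuples support
    upper = sum(p for world, p in worlds.items() if world & s)   # some tuple supports
    return lower, upper


bel = {41: 0.24 / 0.44, 42: 0.20 / 0.44}  # normalized belief values from step (4)
for item in (41, 42):
    low, high = bounds(item)
    print(item, round(low, 2), round(high, 2), low <= bel[item] <= high)
```

With these inputs the belief value of each item falls inside its possible-world interval, in line with the conclusion drawn in the text.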
CN201210099515.2A 2012-04-09 2012-04-09 Uncertain data provenance query processing method based on D-S evidence theory Expired - Fee Related CN102651028B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210099515.2A CN102651028B (en) 2012-04-09 2012-04-09 Uncertain data provenance query processing method based on D-S evidence theory

Publications (2)

Publication Number Publication Date
CN102651028A CN102651028A (en) 2012-08-29
CN102651028B true CN102651028B (en) 2013-10-30

Family

ID=46693036

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210099515.2A Expired - Fee Related CN102651028B (en) 2012-04-09 2012-04-09 Uncertain data provenance query processing method based on D-S evidence theory

Country Status (1)

Country Link
CN (1) CN102651028B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103593435B (en) * 2013-11-12 2017-02-22 河海大学 Approximate treatment system and method for uncertain data PT-TopK query
CN108257365B (en) * 2018-01-29 2020-04-24 杭州电子科技大学 Industrial alarm design method based on global uncertainty evidence dynamic fusion
CN108921414B (en) * 2018-06-22 2021-05-04 郑州大学 Social network trust degree calculation method based on evidence theory
CN110136012A (en) * 2019-05-14 2019-08-16 福建工程学院 A kind of accident auxiliary fix duty method based on block chain technology
CN110119852B (en) * 2019-05-28 2021-01-05 成都理工大学 Unified characterization method and system for uncertain mineralization information
US11763270B2 (en) * 2020-05-14 2023-09-19 RecycleGO Inc. Systems and methods for facilitating generation of a carbon offset based on processing of a recyclable item

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1428696A (en) * 2001-12-29 2003-07-09 杨炳儒 KDD* system based on double-library synergistic mechanism
CN101833538A (en) * 2010-05-11 2010-09-15 天津大学 Multiple qualitative probabilistic network integrating method based on rough set
CN102426599A (en) * 2011-11-09 2012-04-25 中国人民解放军信息工程大学 Method for detecting sensitive information based on D-S evidence theory

Similar Documents

Publication Publication Date Title
CN102651028B (en) Uncertain data provenance query processing method based on D-S evidence theory
Kuwajima et al. Engineering problems in machine learning systems
US7715961B1 (en) Onboard driver, vehicle and fleet data mining
CN112114579B (en) Industrial control system safety measurement method based on attack graph
CN111680153A (en) Big data authentication method and system based on knowledge graph
CN102222040A (en) Software creditability grade estimating method based on multiple-attribute entropy weight synthesis
CN110110529A (en) A kind of software network key node method for digging based on complex network
CN114511429A (en) Geological disaster danger level assessment method and device
Champneys et al. On the vulnerability of data-driven structural health monitoring models to adversarial attack
Malik et al. Building a secure platform for digital governance interoperability and data exchange using blockchain and deep learning-based frameworks
CN115587670A (en) Product quality diagnosis method and device based on index map
Abuabed et al. STRIDE threat model-based framework for assessing the vulnerabilities of modern vehicles
CN103970651A (en) Software architecture safety assessment method based on module safety attributes
CN114386046A (en) Unknown vulnerability detection method and device, electronic equipment and storage medium
KR20170130371A (en) How to Identify the User's Interaction Signature
Chattopadhyay et al. ROWBACK: RObust Watermarking for neural networks using BACKdoors
CN117151513A (en) Method, device, equipment and storage medium for evaluating traffic safety
Hussain et al. Governance in the internet of vehicles (IoV) context: Examination of information privacy, transport anxiety, intention, and usage
Long et al. Robust evaluation of binary collaborative recommendation under profile injection attack
Guibene et al. A Pattern Mining-Based False Data Injection Attack Detector for Industrial Cyber-Physical Systems
Liang et al. DTC-MDD: A spatiotemporal data acquisition technology for privacy-preserving in MCS
Liu et al. A novel method of ds evidence theory for multi-sensor conflicting information
Monteserin Potholes vs. Speed Bumps: A Multivariate Time Series Classification Approach.
CN115146785A (en) Object screening method, device, electronic equipment, storage medium and program product
Melnykov et al. Accounting for spot matching uncertainty in the analysis of proteomics data from two-dimensional gel electrophoresis

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20131030

Termination date: 20160409
