CN102004798B - Matching method of symmetrical issuing subscription system based on plural one-dimensional index - Google Patents

Info

Publication number
CN102004798B
Authority
CN
China
Prior art keywords
index
subscription
predicate
attribute
incident
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201010606649XA
Other languages
Chinese (zh)
Other versions
CN102004798A (en
Inventor
王波涛
王斌
信俊昌
王超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northeastern University China
Original Assignee
Northeastern University China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northeastern University China filed Critical Northeastern University China
Priority to CN201010606649XA priority Critical patent/CN102004798B/en
Publication of CN102004798A publication Critical patent/CN102004798A/en
Application granted granted Critical
Publication of CN102004798B publication Critical patent/CN102004798B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention discloses a matching method for a symmetric publish/subscribe system based on multiple one-dimensional indexes, belonging to the field of databases. The method comprises the following steps: the system receives the data submitted by a user; for an event, it queries the subscriptions that match the event and inserts the event into an event index; for a subscription, it queries the events that match the subscription and inserts the subscription into a subscription index. The method has the following advantages: (1) when events and subscriptions are matched, the query is a range query rather than a point query; and (2) when the subscription predicate index is built, equality predicates are also stored in B+ trees: an equality predicate is treated as the conjunction of two inequality predicates and decomposed into a greater-than predicate and a less-than predicate, which are inserted into the greater-than-predicate index tree and the less-than-predicate index tree respectively, and both predicates serve as counting conditions. As a result, the data structure of the system is simple and easy to implement, and the matching performance and dynamic maintenance performance of the invention are stable and scale well.

Description

A matching method for a symmetric publish/subscribe system based on multiple one-dimensional indexes
Technical field
The invention belongs to the field of databases, and in particular relates to a matching method for a symmetric publish/subscribe system based on multiple one-dimensional indexes.
Background technology
A publish/subscribe system is an application service platform centered on the personalized publication and retrieval of information, for example a platform for publishing and subscribing to stock, traffic, weather or news information. The information that a publisher provides to the platform is called an event, and the filter condition that describes the information a subscriber is interested in is called a subscription. In a traditional publish/subscribe system, the publisher places no constraints on the subscribers. There is, however, a class of information platforms, such as recruitment and job hunting, renting and rental seeking, or online barter, in which the publisher needs to screen and filter the recipients of its information. Take recruitment and job hunting as an example: a company that publishes wage information also has requirements on the age of the job seeker, and a job seeker who publishes his or her age also has requirements on the wage offered by the company. Unlike in a traditional publish/subscribe system, the users of such a platform are both producers and consumers of information. Such a platform is called a symmetric publish/subscribe system.
Whether the system is traditional or symmetric, matching events against subscriptions is one of the key techniques of a publish/subscribe system. In a traditional publish/subscribe system, an index is built over the subscriptions, whose filter conditions are expressed as predicates, in order to improve matching efficiency. According to the index structure, matching methods fall into three classes: 1) Methods based on multiple one-dimensional indexes: a one-dimensional index is built for the predicates of the same type defined on the same attribute; the index structure can be a red-black tree, a hash table or a B+Tree; matching consists of counting the satisfied predicates, and the representative is the Counting method. 2) Methods based on a high-dimensional index: subscriptions are regarded as objects in a high-dimensional space and indexed with a high-dimensional spatial index, so that matching is a query in that space; commonly used index structures are the R*Tree and the UBTree. 3) Methods based on decision trees: each attribute corresponds to one level of the tree, the branches at each level are determined by the predicates defined on that level, and matching is a search from the root of the tree to a leaf node.
These matching methods differ in matching efficiency, in the dynamic maintainability of the index and in sensitivity to the data distribution: 1) Methods based on multiple one-dimensional indexes, such as the Counting method, have good scalability and dynamic maintainability and are insensitive to the data distribution. However, when an event is filtered, each dimension corresponds to one attribute and the candidate set on each attribute is the whole subscription set rather than a subset of it, so the amount of computation is large and the implementation is complex. 2) Methods based on a high-dimensional index, such as the R-tree index, are simple to implement. The candidate set shrinks with every search step, and the amount of computation does not grow linearly with the number of dimensions as it does with multiple one-dimensional indexes, so the amount of computation is small. However, the maintenance cost of a high-dimensional index is high; when the number of dimensions is large, a query has to scan the whole index; dynamic maintainability is poor; and performance is sensitive to the data distribution. 3) Methods based on decision trees match quickly, but their dynamic maintenance cost is so high that they are not practical.
In summary, in a traditional publish/subscribe system the publisher is the producer of information and the subscriber is the consumer. In a symmetric publish/subscribe system the publisher also wants to impose constraints in order to screen or restrict the recipients of its information. Matching methods based on a high-dimensional index do support the matching of events and subscriptions in a symmetric publish/subscribe system, but their poor dynamic maintainability and their sensitivity to the data distribution severely limit their practical usefulness. Methods based on multiple one-dimensional indexes do not directly support symmetric matching: in the traditional methods an event is represented as a point and performs a point query on the subscriptions, whereas in a symmetric publish/subscribe system an event carries constraints and therefore has to perform a range query on the subscriptions. Traditional matching methods based on multiple one-dimensional indexes cannot handle this case.
Summary of the invention
To address the deficiencies of the above methods, the present invention proposes a matching method for a symmetric publish/subscribe system based on multiple one-dimensional indexes.
The matching model of the symmetric publish/subscribe system is first defined as follows. An event published by a publisher contains both a description of the event itself and constraints on the subscribers. On each attribute, the description of the event can be represented by an interval in the value space of that attribute; a description given by a single value is an interval whose start and end coincide. The publisher's constraints on subscribers are expressed as a set of predicates defined on the attributes; each predicate can likewise be represented by an interval in the value space of its attribute, and an equality predicate is an interval whose start and end coincide. A subscription submitted by a subscriber contains both the subscriber's matching conditions on published events and a description of the subscriber itself; the matching conditions, which are expressed as predicates, and the self-description can both be represented by intervals defined on the attributes. In summary, both events and subscriptions in a symmetric publish/subscribe system can be represented by a set of intervals;
Based on this model, the invention defines the matching of an event and a subscription as follows. For a single attribute of the symmetric publish/subscribe system, the parts of an event and of a subscription defined on that attribute are called a sub-event and a sub-subscription. If the intervals of the sub-event and the sub-subscription intersect, the sub-event and the sub-subscription match each other. If, over all attributes of the system, every sub-event of an event matches the corresponding sub-subscription of a subscription, the event and the subscription match;
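The matching definition above can be illustrated with a minimal Python sketch. It assumes closed intervals stored as (start, end) tuples and that an event and a subscription define the same set of attributes; the function names are illustrative and do not come from the patent.

```python
from typing import Dict, Tuple

Interval = Tuple[float, float]  # closed interval [start, end]


def intervals_intersect(a: Interval, b: Interval) -> bool:
    """Two closed intervals intersect when each starts no later than the other ends."""
    return a[0] <= b[1] and b[0] <= a[1]


def event_matches_subscription(event: Dict[str, Interval],
                               subscription: Dict[str, Interval]) -> bool:
    """An event and a subscription match only if they intersect on every attribute."""
    return all(intervals_intersect(event[attr], subscription[attr])
               for attr in subscription)
```

A point value such as age = 24 is simply the interval (24, 24), so the same test covers both point and range descriptions.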
The technical scheme of the invention is realized as follows. In a symmetric publish/subscribe system, events and subscriptions are symmetric: a user is both a provider and a consumer. Subscriptions and events differ only semantically and are processed identically, so the system must build an index and keep counts not only for subscriptions but also for events; the index structure, the way the index is built and used, and the counting method are the same for both. The matching method for a symmetric publish/subscribe system based on multiple one-dimensional indexes comprises the following steps:
Step 1: The system receives the data submitted by a user. The data contain the user's requirements and the user's own information, expressed as numerical values. Users are divided into two classes: the system designates the data submitted by one class of users as subscriptions and the data submitted by the other class as events. If the submitted data are an event, go to step 2; if they are a subscription, go to step 3;
Step 2: Query the subscriptions that match the event and insert the event into the event index;
Step 2-1: Query the subscription index. Decompose the event by attribute into a set of sub-events, one per attribute. For each sub-event, query the subscription index built in step 3-3 to find the matching sub-subscriptions, and increment the subscription match counter SCounter of the subscription to which each matching sub-subscription belongs, so that it can later be checked whether the subscription is satisfied. Suppose the arriving sub-event is the interval [EX, EY], where EX and EY are the numerical values of the start and end of the interval. The index queries and counting operations are as follows:
(1) Query the greater-than-predicate index of the corresponding attribute with EY: after locating the first leaf entry less than or equal to EY, follow the ordered linked list of the B+Tree leaf nodes and scan the leaf list of the index in descending order; for each predicate scanned, increment the match counter of the corresponding subscription by 1;
(2) Query the less-than-predicate index of the corresponding attribute with EX: after locating the first leaf entry greater than or equal to EX, follow the ordered linked list of the B+Tree leaf nodes and scan the leaf list of the index in ascending order; for each predicate scanned, increment the match counter of the corresponding subscription by 1;
Step 2-2: Output the matching subscriptions. Check the counters of every subscription found in step 2-1: if the precomputed count PCounter of a subscription equals its subscription match counter SCounter, the event satisfies the subscription, and the system outputs the subscription and delivers it to the user;
Step 2-3: Insert the event into the event index. Decompose the event by attribute into a set of sub-events, one per attribute. According to the type of the interval of each sub-event, insert it into the corresponding index and assign the sub-event a count; the count of the event, i.e. its precomputed count PCounter, is the sum of the counts of its sub-events. Each attribute has two indexes, one for the intervals that correspond to greater-than predicates and one for the intervals that correspond to less-than predicates; the data structure of each index is a B+Tree. Suppose the domain of the attribute is [DOMAIN_MIN, DOMAIN_MAX], where DOMAIN_MIN and DOMAIN_MAX are the minimum and maximum values of the attribute, and let SX and SY be the start and end values of the interval. Index insertion and counting for each sub-event proceed as follows:
(1) If no predicate is defined on the attribute, i.e. the interval has the form [DOMAIN_MIN, DOMAIN_MAX], every value lies within the interval, so the sub-event is satisfied by any subscription; no index entry or count is needed, and the count of the sub-event is 0;
(2) If the interval has the form [SX, DOMAIN_MAX], it corresponds to a greater-than predicate, and SX is inserted into the greater-than-predicate index. Because the right end of any interval cannot exceed the maximum value, the right endpoint of the sub-event need not be counted, and its count is 1;
(3) If the interval has the form [DOMAIN_MIN, SY], it corresponds to a less-than predicate, and SY is inserted into the less-than-predicate index. Because the left end of any interval cannot fall below the minimum value, the left endpoint of the sub-event need not be counted, and its count is 1;
(4) If the interval has the form [SX, SY], then whether or not SX equals SY it can be decomposed into [SX, DOMAIN_MAX] and [DOMAIN_MIN, SY]. By (2) and (3), SX is inserted into the greater-than-predicate index and SY into the less-than-predicate index, and the count of the sub-event is 2; the [SX, SY] interval is satisfied if and only if both decomposed predicates are satisfied;
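Cases (1) to (4) can be summarized in a short Python sketch. The function name `decompose` and the tuple encoding of predicates are assumptions made for illustration; the length of the returned list is the count assigned to the interval.

```python
from typing import List, Tuple

Interval = Tuple[float, float]
Predicate = Tuple[str, float]  # (">=", SX) or ("<=", SY)


def decompose(interval: Interval, domain: Interval) -> List[Predicate]:
    """Split an interval into the one-sided predicates that must be indexed."""
    sx, sy = interval
    domain_min, domain_max = domain
    predicates: List[Predicate] = []
    if sx > domain_min:           # left end is constrained: greater-than predicate
        predicates.append((">=", sx))
    if sy < domain_max:           # right end is constrained: less-than predicate
        predicates.append(("<=", sy))
    return predicates             # len(predicates) is 0, 1 or 2
```

An equality such as wage = 1000 needs no special handling: it is the interval (1000, 1000) and yields both a greater-than and a less-than predicate.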
Step 2-4: Return to step 1;
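The query and comparison in steps 2-1 and 2-2 can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: a sorted list searched with `bisect` stands in for the linked leaf level of a B+Tree, each index entry stores a subscription id directly instead of going through a predicate-to-subscription table, and the names `lower_bounds`, `upper_bounds`, `count_attribute` and `satisfied` are hypothetical. The sketch relies on the fact that a predicate x >= SX is satisfied by the sub-event [EX, EY] when SX <= EY, and a predicate x <= SY is satisfied when SY >= EX.

```python
import bisect
from collections import Counter
from typing import Dict, List, Tuple

Index = List[Tuple[float, int]]  # (bound value, subscription id), kept sorted


def count_attribute(lower_bounds: Index, upper_bounds: Index,
                    ex: float, ey: float, scounter: Counter) -> None:
    """Add one to SCounter for every indexed predicate satisfied by [ex, ey]."""
    # x >= SX holds when SX <= EY: locate EY, then scan in descending order.
    end = bisect.bisect_right(lower_bounds, (ey, float("inf")))
    for _, sub_id in reversed(lower_bounds[:end]):
        scounter[sub_id] += 1
    # x <= SY holds when SY >= EX: locate EX, then scan in ascending order.
    start = bisect.bisect_left(upper_bounds, (ex, float("-inf")))
    for _, sub_id in upper_bounds[start:]:
        scounter[sub_id] += 1


def satisfied(pcounter: Dict[int, int], scounter: Counter) -> List[int]:
    """A subscription is matched when its SCounter equals its PCounter."""
    return sorted(s for s, need in pcounter.items() if scounter[s] == need)
```

A subscription with no predicates at all has a PCounter of 0 and is therefore reported for every event, which agrees with case (1) of the insertion rules.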
Step 3: Query the events that match the subscription and insert the subscription into the subscription index;
Step 3-1: Query the event index. Decompose the subscription by attribute into a set of sub-subscriptions, one per attribute. For each sub-subscription, query the event index built in step 2-3 to find the matching sub-events, and increment the event match counter SCounter of the event to which each matching sub-event belongs, so that it can later be checked whether the event is satisfied. Suppose the arriving sub-subscription is the interval [EX, EY]. The index queries and counting operations are as follows:
(1) Query the greater-than-predicate index of the corresponding attribute with EY: after locating the first leaf entry less than or equal to EY, follow the ordered linked list of the B+Tree leaf nodes and scan the leaf list of the index in descending order; for each predicate scanned, increment the match counter of the corresponding event by 1;
(2) Query the less-than-predicate index of the corresponding attribute with EX: after locating the first leaf entry greater than or equal to EX, follow the ordered linked list of the B+Tree leaf nodes and scan the leaf list of the index in ascending order; for each predicate scanned, increment the match counter of the corresponding event by 1;
Step 3-2: Output the matching events. Check the counters of every event found in step 3-1: if the precomputed count PCounter of an event equals its event match counter SCounter, the subscription satisfies the event, and the system outputs the event and delivers it to the user;
Step 3-3: Insert the subscription into the subscription index. Decompose the subscription by attribute into a set of sub-subscriptions, one per attribute. According to the type of the interval of each sub-subscription, insert it into the corresponding index and assign the sub-subscription a count; the count of the subscription, i.e. its precomputed count PCounter, is the sum of the counts of its sub-subscriptions. Each attribute has two indexes, one for the intervals that correspond to greater-than predicates and one for the intervals that correspond to less-than predicates; the data structure of each index is a B+Tree. Suppose the domain of the attribute is [DOMAIN_MIN, DOMAIN_MAX]. Index insertion and counting for each sub-subscription proceed as follows:
(1) If no predicate is defined on the attribute, i.e. the interval has the form [DOMAIN_MIN, DOMAIN_MAX], every event lies within the interval, so the sub-subscription is satisfied by any event; no index entry or count is needed, and the count of the sub-subscription is 0;
(2) If the interval has the form [SX, DOMAIN_MAX], it corresponds to a greater-than predicate, and SX is inserted into the greater-than-predicate index. Because the right end of any event cannot exceed the maximum value, the right endpoint of the sub-subscription need not be counted, and its count is 1;
(3) If the interval has the form [DOMAIN_MIN, SY], it corresponds to a less-than predicate, and SY is inserted into the less-than-predicate index. Because the left end of any event cannot fall below the minimum value, the left endpoint of the sub-subscription need not be counted, and its count is 1;
(4) If the interval has the form [SX, SY], then whether or not SX equals SY it can be decomposed into [SX, DOMAIN_MAX] and [DOMAIN_MIN, SY]. By (2) and (3) of step 3-3, SX is inserted into the greater-than-predicate index and SY into the less-than-predicate index, and the count of the sub-subscription is 2; the [SX, SY] interval is satisfied if and only if both decomposed predicates are satisfied;
Step 3-4: Return to step 1.
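Steps 1 to 3 can be put together in a compact Python sketch that keeps one counting index for events and one for subscriptions and treats both sides identically. It is a simplified illustration, not the patented implementation: sorted lists maintained with `bisect` replace the B+Trees, index entries carry item ids directly, deletion is omitted, and the class and method names (`CountingIndex`, `SymmetricMatcher`, `submit_event`, `submit_subscription`) are hypothetical.

```python
import bisect
from collections import Counter
from typing import Dict, List, Tuple

Interval = Tuple[float, float]
Item = Dict[str, Interval]  # attribute name -> closed interval


class CountingIndex:
    """Per-attribute lower-bound and upper-bound indexes plus PCounter values."""

    def __init__(self, domains: Dict[str, Interval]) -> None:
        self.domains = domains
        self.lower = {a: [] for a in domains}   # sorted (SX, id): predicates x >= SX
        self.upper = {a: [] for a in domains}   # sorted (SY, id): predicates x <= SY
        self.pcounter: Dict[int, int] = {}

    def insert(self, item_id: int, item: Item) -> None:
        count = 0
        for attr, (sx, sy) in item.items():
            domain_min, domain_max = self.domains[attr]
            if sx > domain_min:
                bisect.insort(self.lower[attr], (sx, item_id))
                count += 1
            if sy < domain_max:
                bisect.insort(self.upper[attr], (sy, item_id))
                count += 1
        self.pcounter[item_id] = count

    def query(self, item: Item) -> List[int]:
        scounter: Counter = Counter()
        for attr, (ex, ey) in item.items():
            low = self.lower[attr]
            for _, i in low[:bisect.bisect_right(low, (ey, float("inf")))]:
                scounter[i] += 1                 # SX <= EY
            up = self.upper[attr]
            for _, i in up[bisect.bisect_left(up, (ex, float("-inf"))):]:
                scounter[i] += 1                 # SY >= EX
        return sorted(i for i, need in self.pcounter.items() if scounter[i] == need)


class SymmetricMatcher:
    """Events query the subscription index and vice versa, then get indexed."""

    def __init__(self, domains: Dict[str, Interval]) -> None:
        self.events = CountingIndex(domains)
        self.subscriptions = CountingIndex(domains)

    def submit_event(self, event_id: int, event: Item) -> List[int]:
        matched = self.subscriptions.query(event)     # step 2-1 and 2-2
        self.events.insert(event_id, event)           # step 2-3
        return matched

    def submit_subscription(self, sub_id: int, sub: Item) -> List[int]:
        matched = self.events.query(sub)              # step 3-1 and 3-2
        self.subscriptions.insert(sub_id, sub)        # step 3-3
        return matched
```

Because both directions reuse the same `CountingIndex` class, the symmetry described in the technical scheme shows up directly in the code: the only difference between the two submit methods is which index is queried and which is updated.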
Advantages of the invention. The method differs from the traditional Counting method in the following respects: 1) When events and subscriptions are matched, the query is a range query rather than a point query. A point query can be regarded as a special range query, so the invention supports matching both in traditional publish/subscribe systems and in symmetric ones. 2) When the subscription predicate index is built, equality predicates are also stored in B+ trees. An equality predicate is treated as the conjunction of two inequality predicates and decomposed into a greater-than predicate and a less-than predicate; when the index is built, the corresponding nodes are inserted into the greater-than-predicate index tree and the less-than-predicate index tree, and both predicates serve as counting conditions. This keeps the data structure of the system simple and easy to implement. 3) The invention supports symmetric publish/subscribe matching. Compared with matching methods based on high-dimensional indexes, its matching performance and dynamic maintenance performance are stable and scale well across different data sizes, predicate ratios, data distributions and selectivities.
Description of drawings
Fig. 1 is a block diagram of an example of symmetric publish/subscribe for the matching method of the invention;
Fig. 2 is a schematic diagram of the mapping between predicates and intervals in the matching method of the invention;
Fig. 3 is a schematic diagram of sub-subscription matching in the matching method of the invention;
Fig. 4 is a flow chart of the matching method of the invention;
Fig. 5 is a schematic diagram of the subscription index built by the matching method of the invention and the corresponding precomputed counts;
Fig. 6(1) to Fig. 6(4) are schematic diagrams of the subscription index trees built in the embodiment of the matching method of the invention;
Fig. 7 is a schematic diagram of the correspondence between predicates and subscriptions in the matching method of the invention;
Fig. 8(0) to Fig. 8(4) are schematic diagrams of an event being matched against subscriptions in the matching method of the invention;
Fig. 9 is a schematic diagram comparing index build times for the matching method of the invention;
Fig. 10 is a schematic diagram comparing matching times for different numbers of dimensions for the matching method of the invention;
Fig. 11 is a schematic diagram of matching times for different proportions of equality predicates for the matching method of the invention.
Embodiment
The invention is further described below with reference to the drawings and an embodiment.
In one embodiment of the invention, a recruitment and job-hunting system, recruitment information is called an event and job-hunting information is called a subscription. The embodiment uses the following 4 symmetric publish/subscribe data items, each of which has 2 attributes. The first attribute is "wage", with value range [0, 10000]; the second attribute is "age", with value range [0, 150]. The contents of the 4 data items are as follows:
Subscription 1: {wage = 1000, 20 <= age <= 32}
Subscription 2: {wage >= 200, age <= 60}
Subscription 3: {200 <= wage <= 600, NULL}
Event 1: {800 <= wage <= 1600, age = 24}
Assume that the data arrive in the order subscription 1, subscription 2, subscription 3, event 1, as shown in Fig. 1. Fig. 1 also shows the matching result that the system returns after event 1 is submitted: subscription 1 and subscription 2;
Fig. 2 and Fig. 3 illustrate the principle of the matching model of the symmetric publish/subscribe system of the invention. Fig. 2 is an example of the mapping between predicates and intervals, where S1: A<=X<=B, S2: X<=B, S3: X>=A and S4: X=A represent different types of predicate, and Min, Max, A and B denote the minimum and maximum values of the attribute on which the predicate is defined and the start and end of the interval. Predicates and intervals map one-to-one onto each other: S1 denotes the interval [A, B], S2 denotes [Min, B], S3 denotes [A, Max] and S4 denotes [A, A];
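The mapping from the four predicate types to intervals can be written as a one-line rule, shown here as a small Python sketch. The function name `to_interval` and its keyword arguments are illustrative assumptions: a missing bound is replaced by the corresponding end of the attribute domain.

```python
from typing import Optional, Tuple

Interval = Tuple[float, float]


def to_interval(domain: Interval, *, lower: Optional[float] = None,
                upper: Optional[float] = None) -> Interval:
    """Map a predicate with optional lower and upper bounds to a closed interval."""
    domain_min, domain_max = domain
    return (domain_min if lower is None else lower,
            domain_max if upper is None else upper)


AGE = (0, 150)
s1 = to_interval(AGE, lower=20, upper=32)   # S1: 20 <= age <= 32
s2 = to_interval(AGE, upper=60)             # S2: age <= 60
s3 = to_interval(AGE, lower=20)             # S3: age >= 20
s4 = to_interval(AGE, lower=24, upper=24)   # S4: age = 24
```

An attribute without any predicate maps to the whole domain, which is the NULL case of subscription 3 in the embodiment.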
Fig. 3 is a schematic diagram of sub-subscription matching. It lists all the possible ways in which an event interval and a subscription interval can overlap: the events that match subscription S are E1, E2, E3 and E6, and the events that do not match are E4 and E5;
The flow chart of the matching method of this embodiment is shown in Fig. 4. The main function of "system initialization" is to create and initialize the indexes used in matching and the data structures that hold the counter values. After initialization the indexes are empty and the counters are 0. The overall procedure comprises the following steps:
(1) When subscription 1 is input, the matching method executes the following steps:
Step 1: Subscription 1 is input; go to step 3;
Step 2 (query the subscriptions that match the event and insert the event into the event index) is not executed, because the input is a subscription;
Step 3: Query the events that match the subscription and insert the subscription into the subscription index;
Step 3-1: Query the event index: the event index is empty at this point, so there is nothing to query; go to step 3-2;
Step 3-2: Output the matching events: because the event index in step 3-1 is empty, no matching event is output; go to step 3-3;
Step 3-3: Insert the subscription into the subscription index, as follows:
The interval representation of subscription 1 is: Subscription 1: {[1000, 1000], [20, 32]};
Index insertion and counting for each sub-subscription follow case (4), so the predicates of subscription 1 can be equivalently rewritten as:
Subscription 1: {P1: wage <= 1000, P2: wage >= 1000, P3: age >= 20, P4: age <= 32}, with precomputed count PCounter1 = 4. At the same time, 1000 and P1 are inserted into the B+Tree index of the <= predicates of attribute 'wage'; 1000 and P2 are inserted into the B+Tree index of the >= predicates of attribute 'wage'; 20 and P3 are inserted into the B+Tree index of the >= predicates of attribute 'age'; and 32 and P4 are inserted into the B+Tree index of the <= predicates of attribute 'age'; go to step 3-4;
Step 3-4: Return to step 1;
(2) When subscription 2 and subscription 3 are input, the procedure is the same as for subscription 1; only the insertion into the subscription index in step 3-3 differs:
The interval representations of subscription 2 and subscription 3 are:
Subscription 2: {[200, 10000], [0, 60]}
Subscription 3: {[200, 600], [0, 150]}
Further:
By cases (2) and (3) of step 3-3, the predicates of subscription 2 can be equivalently rewritten as:
Subscription 2: {P5: wage >= 200, P6: age <= 60}, with precomputed count PCounter2 = 2. At the same time, 200 and P5 are inserted into the B+Tree index of the >= predicates of attribute 'wage', and 60 and P6 are inserted into the B+Tree index of the <= predicates of attribute 'age';
By cases (1) and (4) of step 3-3, the predicates of subscription 3 can be equivalently rewritten as:
Subscription 3: {P5: wage >= 200, P7: wage <= 600}, with precomputed count PCounter3 = 2. At the same time, 600 and P7 are inserted into the B+Tree index of the <= predicates of attribute 'wage'. P5 was already inserted into the index in the previous step, so subscription 3 is only added to the entry for P5 in the predicate-to-subscription mapping table;
After subscription 1, subscription 2 and subscription 3 have been input, the subscription index built by the system, the subscriptions with their precomputed counts PCounter, and the resulting subscription index trees are shown in Fig. 5, Fig. 6 and Fig. 7. Fig. 7 also shows the correspondence between predicates and subscriptions, which is used to look up the matching subscriptions when an event is input; P1, P2, P3, P4, P5, P6 and P7 denote the sub-predicates above;
Finally, step 3-4 is executed;
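The state reached after the three subscriptions have been inserted can be reproduced with a short Python sketch. It models only the predicate-to-subscription mapping table and the precomputed counts, with a plain dictionary in place of the B+Tree indexes; the function name `build_predicate_table` and the tuple encoding of predicates are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]
Predicate = Tuple[str, str, float]  # (attribute, operator, value)


def build_predicate_table(subscriptions: Dict[int, Dict[str, Interval]],
                          domains: Dict[str, Interval]):
    """Return the shared predicate table and the PCounter of every subscription."""
    table: Dict[Predicate, List[int]] = {}
    pcounter: Dict[int, int] = {}
    for sub_id, intervals in subscriptions.items():
        count = 0
        for attr, (sx, sy) in intervals.items():
            domain_min, domain_max = domains[attr]
            if sx > domain_min:   # a predicate already present is shared, not duplicated
                table.setdefault((attr, ">=", sx), []).append(sub_id)
                count += 1
            if sy < domain_max:
                table.setdefault((attr, "<=", sy), []).append(sub_id)
                count += 1
        pcounter[sub_id] = count
    return table, pcounter


DOMAINS = {"wage": (0, 10000), "age": (0, 150)}
SUBSCRIPTIONS = {
    1: {"wage": (1000, 1000), "age": (20, 32)},
    2: {"wage": (200, 10000), "age": (0, 60)},
    3: {"wage": (200, 600), "age": (0, 150)},
}
table, pcounter = build_predicate_table(SUBSCRIPTIONS, DOMAINS)
```

The table holds seven distinct predicates, matching P1 to P7, and the entry for wage >= 200 lists both subscription 2 and subscription 3, which is the sharing of P5 described in the text.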
(3) When the data submitted by the user are event 1, the matching method executes the following steps:
Step 1: When event 1 is input, step 1 obtains the data of event 1, whose interval representation is:
Event 1: {[800, 1600], [24, 24]}. The input is an event, so go to step 2;
Step 2: Query the subscriptions that match the event and insert the event into the event index:
Step 2-1: Query the subscription index. At this point subscription 1, subscription 2 and subscription 3 are already in the subscription index. The matching sub-subscriptions are found and the subscription match counters SCounter of the corresponding subscriptions are incremented, as shown in Fig. 8(0) to Fig. 8(4), where Fig. 8(0) is the initial state of the counters and EX and EY denote the start and end of an interval:
For attribute 'wage', EX = 800 and EY = 1600. By (1) of step 2-1, the B+Tree index of the >= predicates of attribute 'wage' is queried with EY = 1600, and the query result is P2 and P5. By the correspondence between predicates and subscriptions in Fig. 7, the match counters of subscription 1, subscription 2 and subscription 3 are each incremented by 1; the result is shown in Fig. 8(1);
By (2) of step 2-1, the B+Tree index of the <= predicates of attribute 'wage' is queried with EX = 800, and the query result is P1. By the correspondence between predicates and subscriptions in Fig. 7, the match counter of subscription 1 is incremented by 1; the result is shown in Fig. 8(2);
For attribute 'age', EX = 24 and EY = 24. Scanning the B+Tree index of the <= predicates of 'age' in ascending order gives P4 and P6, so the match counters of subscription 1 and subscription 2 are each incremented by 1; the result is shown in Fig. 8(3). Scanning the B+Tree index of the >= predicates of 'age' gives P3, so the match counter of subscription 1 is incremented by 1. Once the counting is complete, the state of the counters is as shown in Fig. 8(4); go to step 2-2;
Step 2-2: the subscription of output coupling; Equating with the value of match counter with the preparatory statistical number device of subscribing to 2 because subscribe to 1, is matching results of incident 1 so subscribe to 1 with subscribing to 2, and system's output subscribes to 1 and subscription 2; Shown in job hunting matching result among Fig. 1, execution in step 2-3;
Step 2-3: this incident is inserted case index, and its process and above-mentioned (1) (2) step 3-3 set up the similar process of subscribing to index, and this repeats no more again.
Time and space complexity of this embodiment: every subscription is decomposed into sub-subscriptions that are stored in the indexes, and the index data structure is a B+Tree, so the space complexity of this embodiment is O(n). During matching, the leaf linked lists of each attribute's "greater than" predicate B+Tree and "less than" predicate B+Tree must be scanned, so the matching time complexity of this embodiment is O(n), and the cost of an index insertion or deletion is O(log n).
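The two costs can be illustrated with a small Python sketch. It is only an illustration under assumptions: a sorted list with binary search stands in for the B+Tree, and the function name query_cost and the 1024 keys are hypothetical. Locating the start position takes a logarithmic number of probes, while the leaf scan visits every qualifying predicate, which in the worst case is all n of them.

```python
import bisect


def query_cost(sorted_keys, bound):
    """Return (upper bound on binary-search probes, entries scanned) for keys <= bound."""
    # Stand-in for the root-to-leaf descent: at most floor(log2(n)) + 1 probes.
    probes = len(sorted_keys).bit_length()
    # Stand-in for the leaf-chain scan: every key <= bound is visited once.
    scanned = bisect.bisect_right(sorted_keys, bound)
    return probes, scanned


keys = list(range(1, 1025))     # 1024 hypothetical predicate values
print(query_cost(keys, 1))      # (11, 1)
print(query_cost(keys, 1024))   # (11, 1024)
```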
The hardware platform is an HP DX2708MT with an Intel Core 2 6300 1.86 GHz CPU, 2 GB of memory and an 80 GB 7200 rpm hard disk. The simulation experiments were run on Debian GNU/Linux 4.0, and all programs were implemented in C++. The experimental results are shown in Figs. 9 to 11:
Fig. 9, Fig. 10 and Fig. 11 show, respectively, the effect of the number of dimensions on index insertion time, the effect of the number of dimensions on matching performance, and the effect of data distribution on matching performance. The comparison experiments use two data distributions, Zipf and Uniform, and two multi-dimensional indexes, UB-Tree and R*-tree; in the figures, Counting denotes the present invention, and UB-Tree and R*-tree denote the multi-dimensional indexes. Fig. 9 shows that, when the subscription index is built, UB-Tree and the present invention have good dynamic maintenance performance compared with R*-tree. Because the data of a publish/subscribe system are inserted and deleted dynamically, the system must have good dynamic maintenance performance. Fig. 10 and Fig. 11 show that, compared with UB-Tree, the matching performance of the present invention grows linearly as the number of dimensions changes and is stable across different data distributions.

Claims (1)

1. A matching method of a symmetric publish/subscribe system based on multiple one-dimensional indexes, characterized in that it comprises the following steps:
Step 1: the system receives the data submitted by a user. The data comprise the user's requirements and the user's own information, expressed in numeric form. Users are divided into two classes: the system selects the information data submitted by one class of users as subscriptions, and the information data submitted by the other class of users are then events. When the information data submitted by the user are an event, execute step 2; when they are a subscription, execute step 3;
Step 2: query the subscriptions that match the event and insert the event into the event index, then execute step 1;
Said querying of the subscriptions that match the event and inserting of the event into the event index comprises the following steps:
Step 2-1: query the subscription index, as follows: decompose the event by attribute into a set of sub-events, one per attribute. For each sub-event, query the subscription index built in step 3-3, check for matching sub-subscriptions, and increment the subscription match counter SCounter corresponding to each matching sub-subscription, for later use when checking whether the subscription is satisfied. Suppose the arriving event is [EX, EY], where EX and EY denote the values of the start point and end point of the interval respectively; the corresponding index and counting operations are as follows:
(1) query the "less than" predicate index of the corresponding attribute with EY: after finding the first leaf node less than or equal to EY, use the sequential linked list of the B+Tree leaf nodes to scan the leaf linked list of the index, in the direction from large to small, and perform match counting on the subscription corresponding to each scanned predicate, with a count value of 1;
(2) query the "greater than" predicate index of the corresponding attribute with EX: after finding the first leaf node greater than or equal to EX, use the sequential linked list of the B+Tree leaf nodes to scan the leaf linked list of the index, in the direction from small to large, and perform match counting on the subscription corresponding to each scanned predicate, with a count value of 1;
Step 2-2: output the matching subscriptions: check the index table entries of each subscription found in step 2-1. If the pre-computed count PCounter of a subscription in the index table equals its subscription match counter SCounter, then the event satisfies that subscription, and the system outputs the subscription and delivers it to the user;
Step 2-3: insert the event into the event index: decompose the event by attribute into a set of sub-events, one per attribute. According to the type of the interval of each sub-event, insert it into the corresponding index and count the sub-event at the same time. The count of the event is the sum of the counts of its sub-events, i.e. the pre-computed count PCounter. Each attribute has two indexes: one indexes the intervals corresponding to "greater than" predicates, and the other indexes the intervals corresponding to "less than" predicates; the index data structure is a B+Tree. Suppose the domain of the attribute is [DOMAIN_MIN, DOMAIN_MAX], where DOMAIN_MIN and DOMAIN_MAX are the minimum and maximum values of the attribute, and SX and SY used below are the start-point and end-point values of an interval. The index insertion and counting operations for each sub-subscription are as follows:
(1) if no predicate is defined on the attribute, i.e. the interval has the form [DOMAIN_MIN, DOMAIN_MAX], then because any event lies within this interval, any subscription is satisfied; no index entry or count is needed for it, and the count of this sub-event is 0;
(2) when the interval has the form [SX, DOMAIN_MAX], it corresponds to a "greater than" predicate, and SX is inserted into the index corresponding to "greater than" predicates; because the right end of any event lies within the maximum value, the right endpoint of this sub-event need not be counted, and its count is 1;
(3) when the interval has the form [DOMAIN_MIN, SY], it corresponds to a "less than" predicate, and SY is inserted into the index corresponding to "less than" predicates; because the left end of any event lies within the minimum value, the left endpoint of this sub-event need not be counted, and its count is 1;
(4) when the interval has the form [SX, SY], whether or not SX equals SY, it can be decomposed into the forms [SX, DOMAIN_MAX] and [DOMAIN_MIN, SY]. From (2) and (3), SX must be inserted into the "greater than" predicate index and SY into the "less than" predicate index, and the count of this sub-event is 2. The [SX, SY] form is satisfied if and only if both of the predicates it is decomposed into are satisfied;
Step 3: query the events that match the subscription and insert the subscription into the subscription index, then execute step 1;
Said querying of the events that match the subscription and inserting of the subscription into the subscription index comprises the following steps:
Step 3-1: query the event index: decompose the subscription by attribute into a set of sub-subscriptions, one per attribute. For each sub-subscription, query the event index built in step 2-3, check for matching sub-events, and increment the event match counter SCounter corresponding to each matching sub-event, for later use when checking whether the event is satisfied. Suppose the arriving event is [EX, EY]; the corresponding index and counting operations are as follows:
(1) query the "less than" predicate index of the corresponding attribute with EY: after finding the first leaf node less than or equal to EY, use the sequential linked list of the B+Tree leaf nodes to scan the leaf linked list of the index, in the direction from large to small, and perform match counting on the event corresponding to each scanned predicate, with a count value of 1;
(2) query the "greater than" predicate index of the corresponding attribute with EX: after finding the first leaf node greater than or equal to EX, use the sequential linked list of the B+Tree leaf nodes to scan the leaf linked list of the index, in the direction from small to large, and perform event counting on the subscription corresponding to each scanned predicate, with a count value of 1;
Step 3-2: output the matching events: check the index table entries of each subscription found in step 3-1. If the pre-computed count PCounter of an event in the index table equals its event match counter SCounter, then the subscription satisfies that event, and the system outputs the event and delivers it to the user;
Step 3-3: insert the subscription into the subscription index: decompose the subscription by attribute into a set of sub-subscriptions, one per attribute. According to the type of the interval of each sub-subscription, insert it into the corresponding index and count the sub-subscription at the same time. The count of the subscription is the sum of the counts of its sub-subscriptions, i.e. the pre-computed count PCounter. Each attribute has two indexes: one indexes the intervals corresponding to "greater than" predicates, and the other indexes the intervals corresponding to "less than" predicates; the index data structure is a B+Tree. Suppose the domain of the attribute is [DOMAIN_MIN, DOMAIN_MAX]. The index insertion and counting operations for each sub-subscription are as follows:
(1) if no predicate is defined on the attribute, i.e. the interval has the form [DOMAIN_MIN, DOMAIN_MAX], the count of this sub-subscription is 0;
(2) when the interval has the form [SX, DOMAIN_MAX], it corresponds to a "greater than" predicate; SX is inserted into the index corresponding to "greater than" predicates, and its count is 1;
(3) when the interval has the form [DOMAIN_MIN, SY], it corresponds to a "less than" predicate; SY is inserted into the index corresponding to "less than" predicates, and its count is 1;
(4) when the interval has the form [SX, SY], whether or not SX equals SY, it can be decomposed into the forms [SX, DOMAIN_MAX] and [DOMAIN_MIN, SY]. From (2) and (3) of step 3-3, SX must be inserted into the "greater than" predicate index and SY into the "less than" predicate index, and the count of this sub-subscription is 2. The [SX, SY] form is satisfied if and only if both of the predicates it is decomposed into are satisfied.
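The four interval cases of steps 2-3 and 3-3 can be sketched as a small Python function. The names decompose and precount and the domain bounds 0 and 10000 are hypothetical. The function only computes which keys would go into the two indexes and the resulting count, and leaves the B+Tree insertion itself out.

```python
DOMAIN_MIN, DOMAIN_MAX = 0, 10_000  # hypothetical attribute domain


def decompose(sx, sy, lo=DOMAIN_MIN, hi=DOMAIN_MAX):
    """Split interval [SX, SY] into index keys and a count.

    Returns (ge_key, le_key, count); a key of None means nothing is
    inserted into that index.
    """
    ge_key = sx if sx > lo else None  # cases (2) and (4): "greater than" index
    le_key = sy if sy < hi else None  # cases (3) and (4): "less than" index
    count = (ge_key is not None) + (le_key is not None)
    return ge_key, le_key, count


def precount(intervals):
    """PCounter: sum of the counts over all attributes of one subscription or event."""
    return sum(decompose(sx, sy)[2] for sx, sy in intervals)


print(decompose(DOMAIN_MIN, DOMAIN_MAX))  # (None, None, 0)  case (1)
print(decompose(800, DOMAIN_MAX))         # (800, None, 1)   case (2)
print(decompose(DOMAIN_MIN, 1600))        # (None, 1600, 1)  case (3)
print(decompose(24, 24))                  # (24, 24, 2)      case (4)
print(precount([(800, 1600), (24, 24)]))  # 4
```

An equality predicate such as [24, 24] is handled by case (4) like any other bounded interval, so it contributes one key to each index and a count of 2.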
CN201010606649XA 2010-12-27 2010-12-27 Matching method of symmetrical issuing subscription system based on plural one-dimensional index Expired - Fee Related CN102004798B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010606649XA CN102004798B (en) 2010-12-27 2010-12-27 Matching method of symmetrical issuing subscription system based on plural one-dimensional index

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201010606649XA CN102004798B (en) 2010-12-27 2010-12-27 Matching method of symmetrical issuing subscription system based on plural one-dimensional index

Publications (2)

Publication Number Publication Date
CN102004798A CN102004798A (en) 2011-04-06
CN102004798B true CN102004798B (en) 2012-05-23

Family

ID=43812160

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010606649XA Expired - Fee Related CN102004798B (en) 2010-12-27 2010-12-27 Matching method of symmetrical issuing subscription system based on plural one-dimensional index

Country Status (1)

Country Link
CN (1) CN102004798B (en)

Also Published As

Publication number Publication date
CN102004798A (en) 2011-04-06

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20120523

Termination date: 20151227

EXPY Termination of patent right or utility model