CN102004798A - Matching method of symmetrical issuing subscription system based on plural one-dimensional index - Google Patents


Info

Publication number: CN102004798A
Application number: CN 201010606649
Authority: CN (China)
Other versions: CN102004798B (en)
Other languages: Chinese (zh)
Inventors: 王波涛, 王斌, 信俊昌, 王超
Original assignee: Northeastern University China
Current assignee: Northeastern University China
Legal status: Granted; Expired - Fee Related (current status as listed by Google Patents, which is an assumption and not a legal conclusion)
Priority: CN201010606649XA (CN102004798B)
Prior art keywords: index, subscription, predicate, event, attribute

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a matching method for a symmetric publish/subscribe system based on multiple one-dimensional indexes. It belongs to the database field and comprises the following steps: the system receives the data submitted by a user; for an event, it queries the subscriptions that match the event and inserts the event into the event index; for a subscription, it queries the events that match the subscription and inserts the subscription into the subscription index. The method has the following advantages: (1) when events and subscriptions are matched, the query is a range query rather than a point query; (2) when the subscription predicate index is built, equality predicates are also stored in B+ trees: an equality predicate is treated as the conjunction of two inequality predicates and decomposed into a greater-than predicate and a less-than predicate, the node is inserted into the greater-than predicate index tree and the less-than predicate index tree respectively, and both predicates serve as counting conditions. The data structures of the system are therefore simple and easy to implement. The matching performance and the dynamic maintenance performance of the invention are stable and scale well.

Description

A matching method for a symmetric publish/subscribe system based on multiple one-dimensional indexes
Technical field
The invention belongs to the database field, and in particular relates to a matching method for a symmetric publish/subscribe system based on multiple one-dimensional indexes.
Background
A publish/subscribe system is an application service platform centered on the publication and retrieval of personalized information, for example a platform for publishing and subscribing to stock, traffic, weather or news information. The information a publisher provides to the platform is called an event, and the filter condition with which a subscriber describes the messages of interest is called a subscription. In a traditional publish/subscribe system, the publisher of information places no constraints on its subscribers. On another class of platforms, such as recruitment and job hunting, rental listing and rental seeking, or online barter, the publisher needs to screen and filter the recipients of its information. Take recruitment and job hunting as an example: a company that publishes salary information also has requirements on the age of the job seeker, and a job seeker who publishes his or her own age also has requirements on the salary the company offers. Unlike in a traditional publish/subscribe system, a user of such a platform is both a producer and a consumer of information. Such platforms are called symmetric publish/subscribe systems.
In both traditional and symmetric publish/subscribe systems, matching events against subscriptions is one of the key techniques. In a traditional publish/subscribe system, matching efficiency is improved by building a subscription index over the filter conditions, which are expressed as predicates. According to index structure, matching methods fall into three classes: 1) Methods based on multiple one-dimensional indexes: a one-dimensional index is built for the predicates of the same type defined on the same attribute. The index structure can be a red-black tree, a hash table or a B+Tree. Matching counts the predicates that are satisfied; the representative is the Counting method. 2) Methods based on high-dimensional indexes: a subscription is treated as an object in a high-dimensional space and indexed with a high-dimensional spatial index, so matching is a query in that space. Commonly used index structures are R*Tree and UBTree. 3) Methods based on decision trees: each attribute corresponds to one level of the tree, the branches at each level are determined by the predicates defined on that level, and matching is a search from the root node to a leaf node.
In terms of matching efficiency, dynamic maintainability of the index and sensitivity to data distribution, these methods have different characteristics. 1) Methods based on multiple one-dimensional indexes, such as the Counting method, scale well, are easy to maintain dynamically and are insensitive to data distribution. However, when an event is filtered, each dimension corresponds to one attribute and the candidate result set on each attribute is the whole subscription set rather than a subset of it, so the amount of computation is large and the implementation is complex. 2) Methods based on high-dimensional indexes, such as the R-Tree index, are simple to implement. The candidate result set shrinks with every search step, and the amount of computation does not grow linearly with the number of dimensions as it does for multiple one-dimensional indexes, so it is smaller. However, the maintenance cost of a high-dimensional spatial index is high, a query may need to scan the whole index when the dimensionality is high, dynamic maintainability is poor, and performance is sensitive to data distribution. 3) Methods based on decision trees match quickly, but their dynamic maintenance cost is so high that they are not practical.
In summary, in a traditional publish/subscribe system the publisher is the producer of information and the subscriber is the consumer. In a symmetric publish/subscribe system the publisher also wants to impose constraints in order to screen or restrict the recipients of its information. Matching methods based on high-dimensional indexes do support matching events against subscriptions in the symmetric setting, but their poor dynamic maintainability and their sensitivity to data distribution severely limit their usability in practice. Methods based on multiple one-dimensional indexes do not directly support symmetric matching: in the traditional method an event is represented as a point and issues a point query against the subscriptions, whereas in a symmetric system an event carries constraints and must issue a range query against the subscriptions. Traditional matching methods based on multiple one-dimensional indexes cannot handle this case.
Summary of the invention
To address the shortcomings of the above methods, the invention proposes a matching method for a symmetric publish/subscribe system based on multiple one-dimensional indexes.
The matching model of a symmetric publish/subscribe system is first defined as follows. An event issued by a publisher contains both a description of the event itself and constraints on the subscriber. On each attribute, the description of the event can be represented as an interval in the space of that attribute; a description given as a point is an interval whose start and end positions coincide. The publisher's constraints on the subscriber are expressed as a set of predicates, one predicate per attribute. Each predicate can likewise be represented as an interval in the space of its attribute; an equality predicate is an interval whose start and end positions coincide. A subscription submitted by a subscriber contains both the subscriber's matching conditions for published events and a description of the subscriber. Both the matching conditions, which are expressed as predicates, and the self-description can be represented as intervals defined on attributes. In summary, both events and subscriptions in a symmetric publish/subscribe system can be represented as a set of intervals.
Based on this model, the invention defines matching between an event and a subscription. For a single attribute of the symmetric publish/subscribe system, the parts of an event and of a subscription defined on that attribute are called a sub-event and a sub-subscription. If the intervals corresponding to a sub-event and a sub-subscription intersect, the sub-event and the sub-subscription match each other. Over all attributes of the system, if every sub-event of an event matches the corresponding sub-subscription of a subscription, the event and the subscription match.
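The matching definition above can be restated as a brute-force check, which is useful as a reference when testing an index-based implementation. This is only a sketch: the names `intervals_intersect` and `matches`, the attribute names and the sample values are illustrative assumptions and do not come from the patent. It also assumes that every item carries an interval for every attribute, with an unconstrained attribute represented by its whole domain.

```python
from typing import Dict, Tuple

Interval = Tuple[float, float]   # closed interval [start, end]
Item = Dict[str, Interval]       # attribute name -> interval


def intervals_intersect(a: Interval, b: Interval) -> bool:
    """Closed intervals intersect when each one starts no later than the other ends."""
    return a[0] <= b[1] and b[0] <= a[1]


def matches(event: Item, subscription: Item) -> bool:
    """An event matches a subscription when the intervals intersect on every attribute."""
    return all(intervals_intersect(event[attr], subscription[attr]) for attr in event)


# Hypothetical data: a point value is stored as an interval with equal ends.
event = {"x": (3, 7), "y": (5, 5)}
sub_a = {"x": (6, 9), "y": (0, 5)}   # intersects the event on both attributes
sub_b = {"x": (8, 9), "y": (0, 5)}   # the x intervals are disjoint

print(matches(event, sub_a), matches(event, sub_b))   # True False
```

The check costs one comparison pair per attribute and per candidate, so it scans every stored item; the indexed method described next avoids that full comparison.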
The technical solution of the invention is as follows. In a symmetric publish/subscribe system, events and subscriptions are symmetric: a user is both a provider and a consumer, events and subscriptions differ only semantically, and they are processed identically. The system builds indexes and counters for subscriptions and also for events; the index structure, the way the index is built and used, and the counting method are the same for both. A matching method for a symmetric publish/subscribe system based on multiple one-dimensional indexes comprises the following steps:
Step 1: The system receives the data submitted by a user. The data comprises the user's requirements and the user's own information, expressed as numerical values. Users are divided into two classes: the system designates the data submitted by one class of users as subscriptions and the data submitted by the other class as events. If the submitted data is an event, go to step 2; if it is a subscription, go to step 3;
Step 2: Query the subscriptions that match the event and insert the event into the event index;
Step 2-1: Query the subscription index. Decompose the event by attribute into a set of sub-events, one per attribute. For each sub-event, query the subscription index built in step 3-3, find the matching sub-subscriptions, and increment the subscription match counter SCounter of the subscription that owns each matching sub-subscription, so that it can later be checked whether the subscription is satisfied. Suppose the arriving sub-event is [EX, EY], where EX and EY are the numerical values of the start and end of the interval. The index queries and counting operations are as follows:
(1) Query the greater-than predicate index of the corresponding attribute with EY: find the first leaf node whose value is less than or equal to EY, then scan the leaf list of the index in descending order using the ordered linked list of the B+Tree leaf nodes. For each predicate scanned, increment the match counter of the corresponding subscription by 1;
(2) Query the less-than predicate index of the corresponding attribute with EX: find the first leaf node whose value is greater than or equal to EX, then scan the leaf list of the index in ascending order using the ordered linked list of the B+Tree leaf nodes. For each predicate scanned, increment the match counter of the corresponding subscription by 1;
Step 2-2: Output the matching subscriptions. Check the index table entries associated with each subscription found in step 2-1. If the pre-computed count PCounter of a subscription equals its subscription match counter SCounter, the event satisfies that subscription, and the system outputs the subscription and delivers it to the user;
Step 2-3: Insert the event into the event index. Decompose the event by attribute into a set of sub-events, one per attribute. According to the type of interval each sub-event corresponds to, insert the sub-event into the corresponding index and assign it a count value. The count of the event is the sum of the count values of its sub-events and is the pre-computed count PCounter. Each attribute has two indexes: one indexes the intervals corresponding to greater-than (>=) predicates and the other indexes the intervals corresponding to less-than (<=) predicates. The index data structure is a B+Tree. Suppose the domain of the attribute is [DOMAIN_MIN, DOMAIN_MAX], where DOMAIN_MIN and DOMAIN_MAX are the minimum and maximum values of the attribute, and let SX and SY denote the start value and end value of an interval. Index insertion and counting for each sub-event proceed as follows:
(1) If no predicate is defined on the attribute, i.e. the interval has the form [DOMAIN_MIN, DOMAIN_MAX], every interval on this attribute lies within that range, so the sub-event is satisfied by any subscription. No index entry or count is needed, and the count value of this sub-event is 0;
(2) If the interval has the form [SX, DOMAIN_MAX], it corresponds to a greater-than predicate. Insert SX into the index for greater-than predicates. Because the right end of any interval lies within the maximum value, the right endpoint of this sub-event need not be counted, and its count value is 1;
(3) If the interval has the form [DOMAIN_MIN, SY], it corresponds to a less-than predicate. Insert SY into the index for less-than predicates. Because the left end of any interval lies within the minimum value, the left endpoint of this sub-event need not be counted, and its count value is 1;
(4) If the interval has the form [SX, SY], whether or not SX equals SY, it can be decomposed into [SX, DOMAIN_MAX] and [DOMAIN_MIN, SY]. By (2) and (3), SX is inserted into the greater-than predicate index, SY is inserted into the less-than predicate index, and the count value of this sub-event is 2. The [SX, SY] interval is satisfied if and only if both predicates it decomposes into are satisfied;
Step 2-4: Return to step 1;
Step 3: Query the events that match the subscription and insert the subscription into the subscription index;
Step 3-1: Query the event index. Decompose the subscription by attribute into a set of sub-subscriptions, one per attribute. For each sub-subscription, query the event index built in step 2-3, find the matching sub-events, and increment the event match counter SCounter of the event that owns each matching sub-event, so that it can later be checked whether the event is satisfied. Suppose the arriving sub-subscription is [EX, EY]. The index queries and counting operations are as follows:
(1) Query the greater-than predicate index of the corresponding attribute with EY: find the first leaf node whose value is less than or equal to EY, then scan the leaf list of the index in descending order using the ordered linked list of the B+Tree leaf nodes. For each predicate scanned, increment the match counter of the corresponding event by 1;
(2) Query the less-than predicate index of the corresponding attribute with EX: find the first leaf node whose value is greater than or equal to EX, then scan the leaf list of the index in ascending order using the ordered linked list of the B+Tree leaf nodes. For each predicate scanned, increment the match counter of the corresponding event by 1;
Step 3-2: Output the matching events. Check the index table entries associated with each event found in step 3-1. If the pre-computed count PCounter of an event equals its event match counter SCounter, the subscription satisfies that event, and the system outputs the event and delivers it to the user;
Step 3-3: Insert the subscription into the subscription index. Decompose the subscription by attribute into a set of sub-subscriptions, one per attribute. According to the type of interval each sub-subscription corresponds to, insert the sub-subscription into the corresponding index and assign it a count value. The count of the subscription is the sum of the count values of its sub-subscriptions and is the pre-computed count PCounter. Each attribute has two indexes: one indexes the intervals corresponding to greater-than (>=) predicates and the other indexes the intervals corresponding to less-than (<=) predicates. The index data structure is a B+Tree. Suppose the domain of the attribute is [DOMAIN_MIN, DOMAIN_MAX]. Index insertion and counting for each sub-subscription proceed as follows:
(1) If no predicate is defined on the attribute, i.e. the interval has the form [DOMAIN_MIN, DOMAIN_MAX], every event lies within that range, so any event satisfies the sub-subscription. No index entry or count is needed, and the count value of this sub-subscription is 0;
(2) If the interval has the form [SX, DOMAIN_MAX], it corresponds to a greater-than predicate. Insert SX into the index for greater-than predicates. Because the right end of any event lies within the maximum value, the right endpoint of this sub-subscription need not be counted, and its count value is 1;
(3) If the interval has the form [DOMAIN_MIN, SY], it corresponds to a less-than predicate. Insert SY into the index for less-than predicates. Because the left end of any event lies within the minimum value, the left endpoint of this sub-subscription need not be counted, and its count value is 1;
(4) If the interval has the form [SX, SY], whether or not SX equals SY, it can be decomposed into [SX, DOMAIN_MAX] and [DOMAIN_MIN, SY]. By (2) and (3) of step 3-3, SX is inserted into the greater-than predicate index, SY is inserted into the less-than predicate index, and the count value of this sub-subscription is 2. The [SX, SY] interval is satisfied if and only if both predicates it decomposes into are satisfied;
Step 3-4: Return to step 1.
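Steps 1 to 3 can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class names `SortedBounds`, `CountingIndex` and `SymmetricMatcher` are hypothetical, sorted Python lists stand in for the B+Tree leaf chains, list slices replace the directional leaf scans (they visit the same entries), and every item is assumed to give an interval for every attribute.

```python
import bisect
from collections import defaultdict


class SortedBounds:
    """Sorted bounds with their owners; a stand-in for one B+Tree leaf chain."""

    def __init__(self):
        self.keys, self.ids = [], []

    def add(self, key, item_id):
        pos = bisect.bisect_right(self.keys, key)
        self.keys.insert(pos, key)
        self.ids.insert(pos, item_id)

    def owners_at_most(self, key):
        """Owners of all bounds <= key."""
        return self.ids[:bisect.bisect_right(self.keys, key)]

    def owners_at_least(self, key):
        """Owners of all bounds >= key."""
        return self.ids[bisect.bisect_left(self.keys, key):]


class CountingIndex:
    """Two one-dimensional indexes per attribute plus the PCounter table."""

    def __init__(self, domains):
        self.domains = domains                              # attr -> (min, max)
        self.lower = {a: SortedBounds() for a in domains}   # '>=' predicates
        self.upper = {a: SortedBounds() for a in domains}   # '<=' predicates
        self.pcounter = {}                                  # id -> indexed predicates

    def insert(self, item_id, intervals):
        """Steps 2-3 / 3-3: index only the bounds that actually constrain."""
        count = 0
        for attr, (start, end) in intervals.items():
            dmin, dmax = self.domains[attr]
            if start > dmin:
                self.lower[attr].add(start, item_id)
                count += 1
            if end < dmax:
                self.upper[attr].add(end, item_id)
                count += 1
        self.pcounter[item_id] = count

    def query(self, intervals):
        """Steps 2-1, 2-2 / 3-1, 3-2: count satisfied predicates, compare to PCounter."""
        scounter = defaultdict(int)
        for attr, (start, end) in intervals.items():
            for item_id in self.lower[attr].owners_at_most(end):
                scounter[item_id] += 1
            for item_id in self.upper[attr].owners_at_least(start):
                scounter[item_id] += 1
        return sorted(i for i, p in self.pcounter.items() if scounter[i] == p)


class SymmetricMatcher:
    def __init__(self, domains):
        self.index = {"event": CountingIndex(domains),
                      "subscription": CountingIndex(domains)}

    def submit(self, kind, item_id, intervals):
        """Step 1 dispatch: query the opposite index, then insert into the own index."""
        other = "event" if kind == "subscription" else "subscription"
        matched = self.index[other].query(intervals)
        self.index[kind].insert(item_id, intervals)
        return matched


matcher = SymmetricMatcher({"price": (0, 100)})          # hypothetical attribute
print(matcher.submit("subscription", "s1", {"price": (10, 20)}))   # []
print(matcher.submit("event", "e1", {"price": (15, 30)}))          # ['s1']
print(matcher.submit("subscription", "s2", {"price": (25, 100)}))  # ['e1']
```

Because events and subscriptions go through the same `CountingIndex` class, the symmetry described in the text shows up as two instances of one structure rather than two separate code paths.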
Advantages of the invention: the method differs from the traditional Counting method in the following ways. 1) When events and subscriptions are matched, the query is a range query rather than a point query. A point query can be regarded as a special range query, so the invention supports matching in both traditional and symmetric publish/subscribe systems. 2) When the subscription predicate index is built, equality predicates are also stored in B+ trees. An equality predicate is treated as the conjunction of two inequality predicates and decomposed into a greater-than predicate and a less-than predicate. When the index is built, the node is inserted into the corresponding greater-than predicate index tree and less-than predicate index tree respectively, and both predicates serve as counting conditions. This keeps the data structures of the system simple and easy to implement. 3) The invention supports symmetric publish/subscribe matching. Compared with matching methods based on high-dimensional indexes, across different data scales, predicate ratios, data distributions and selectivities, the matching performance and dynamic maintenance performance of the invention are stable and scale well.
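The decomposition behind advantage 2) can be sketched as a small function. The name `decompose` and the tuple encoding of predicates are illustrative assumptions; the four cases follow rules (1) to (4) of steps 2-3 and 3-3, and the length of the returned list is the count value contributed to PCounter.

```python
def decompose(interval, domain):
    """Split a closed interval into the inequality predicates that must be indexed.

    Returns a list of (operator, bound) pairs; its length is the count value.
    """
    start, end = interval
    domain_min, domain_max = domain
    predicates = []
    if start > domain_min:          # a real lower bound: greater-than predicate
        predicates.append((">=", start))
    if end < domain_max:            # a real upper bound: less-than predicate
        predicates.append(("<=", end))
    return predicates


AGE = (0, 150)  # domain of a hypothetical 'age' attribute

print(decompose((0, 150), AGE))   # rule (1): []
print(decompose((20, 150), AGE))  # rule (2): [('>=', 20)]
print(decompose((0, 60), AGE))    # rule (3): [('<=', 60)]
print(decompose((24, 24), AGE))   # rule (4), equality: [('>=', 24), ('<=', 24)]
```

An equality predicate therefore needs no separate hash index: it is just the case where both emitted bounds carry the same value.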
Brief description of the drawings
Fig. 1 is a block diagram of an example of symmetric publish/subscribe for the matching method of the invention;
Fig. 2 is a schematic diagram of the mapping between predicates and intervals in the matching method of the invention;
Fig. 3 is a schematic diagram of sub-subscription matching in the matching method of the invention;
Fig. 4 is a flowchart of the matching method of the invention;
Fig. 5 is a schematic diagram of the subscription index built by the matching method of the invention and the corresponding pre-computed counts;
Fig. 6 (1) to Fig. 6 (4) are schematic diagrams of the subscription index trees built in the embodiment of the matching method of the invention;
Fig. 7 is a schematic diagram of the correspondence between predicates and subscriptions in the matching method of the invention;
Fig. 8 (0) to Fig. 8 (4) are schematic diagrams of matching an event against subscriptions in the matching method of the invention;
Fig. 9 is a schematic diagram comparing index build times for the matching method of the invention;
Fig. 10 is a schematic diagram comparing matching times for different numbers of dimensions for the matching method of the invention;
Fig. 11 is a schematic diagram of matching times for different proportions of equality predicates for the matching method of the invention.
Embodiment
The invention is further described below with reference to the drawings and an embodiment.
In one embodiment of the invention, in a recruitment and job-hunting system, recruitment information is called an event and job-hunting information is called a subscription. The embodiment uses the following 4 symmetric publish/subscribe data items, each with 2 attributes. The first attribute is "wage", with value range [0, 10000]; the second attribute is "age", with value range [0, 150]. The 4 data items are as follows:
Subscription 1: {wage = 1000, 20 <= age <= 32}
Subscription 2: {wage >= 200, age <= 60}
Subscription 3: {200 <= wage <= 600, NULL}
Event 1: {800 <= wage <= 1600, age = 24}
Suppose the data arrives in the order subscription 1, subscription 2, subscription 3, event 1, as shown in Fig. 1. Fig. 1 also shows the matching result the system returns after event 1 is submitted: subscription 1 and subscription 2;
Fig. 2 and Fig. 3 illustrate the principle of the matching model of the symmetric publish/subscribe system. Fig. 2 is an example of the mapping between predicates and intervals, with S1: A <= X <= B, S2: X <= B, S3: X >= A, S4: X = A. S1, S2, S3 and S4 represent different types of predicates; Min and Max denote the minimum and maximum values of the attribute the predicate is defined on, and A and B denote the start and end of the interval. Predicates and intervals convert to each other one-to-one: S1 corresponds to [A, B], S2 to [Min, B], S3 to [A, Max], and S4 to [A, A];
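The mapping of Fig. 2 can be sketched in a few lines and applied to the four data items of the embodiment. The function name `to_interval` and the tuple encoding of predicates are assumptions made for illustration; `None` stands for NULL, i.e. no predicate on the attribute, which maps to the whole domain.

```python
DOMAINS = {"wage": (0, 10000), "age": (0, 150)}


def to_interval(predicate, domain):
    """Map one predicate to a closed interval, following the S1..S4 cases of Fig. 2."""
    dmin, dmax = domain
    if predicate is None:                 # NULL: no constraint on this attribute
        return (dmin, dmax)
    op = predicate[0]
    if op == "range":                     # S1: A <= X <= B  ->  [A, B]
        return (predicate[1], predicate[2])
    if op == "<=":                        # S2: X <= B       ->  [Min, B]
        return (dmin, predicate[1])
    if op == ">=":                        # S3: X >= A       ->  [A, Max]
        return (predicate[1], dmax)
    if op == "=":                         # S4: X = A        ->  [A, A]
        return (predicate[1], predicate[1])
    raise ValueError(f"unknown predicate operator: {op!r}")


DATA = {
    "subscription 1": {"wage": ("=", 1000), "age": ("range", 20, 32)},
    "subscription 2": {"wage": (">=", 200), "age": ("<=", 60)},
    "subscription 3": {"wage": ("range", 200, 600), "age": None},
    "event 1": {"wage": ("range", 800, 1600), "age": ("=", 24)},
}

for name, predicates in DATA.items():
    intervals = {a: to_interval(p, DOMAINS[a]) for a, p in predicates.items()}
    print(name, intervals)
# subscription 1 {'wage': (1000, 1000), 'age': (20, 32)}
# subscription 2 {'wage': (200, 10000), 'age': (0, 60)}
# subscription 3 {'wage': (200, 600), 'age': (0, 150)}
# event 1 {'wage': (800, 1600), 'age': (24, 24)}
```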
Fig. 3 illustrates sub-subscription matching. It lists all the ways in which an event interval can overlap a subscription interval: the events that match subscription S are E1, E2, E3 and E6, and the events that do not match are E4 and E5;
The flowchart of the matching method of this embodiment is shown in Fig. 4. "System initialization" creates and initializes the indexes used during matching and the data structures that hold the counter values. After initialization the indexes are empty and the counters are 0. The overall procedure comprises the following steps:
(1) When subscription 1 is input, the matching method executes the following steps:
Step 1: Subscription 1 is input; go to step 3;
Step 2: Query the subscriptions that match the event and insert the event into the event index;
Step 3: Query the events that match the subscription and insert the subscription into the subscription index;
Step 3-1: Query the event index. The event index is empty at this point, so there is nothing to query; go to step 3-2;
Step 3-2: Output the matching events. Because the event index in step 3-1 is empty, no matching event is output; go to step 3-3;
Step 3-3: Insert the subscription into the subscription index, as follows:
Subscription 1 in interval form is: subscription 1: {[1000, 1000], [20, 32]};
Rule (4) is applied for index insertion and counting of each sub-subscription, so the predicates of subscription 1 can be equivalently rewritten as:
Subscription 1: {P1: wage <= 1000, P2: wage >= 1000, P3: age >= 20, P4: age <= 32}, with pre-computed count PCounter1 = 4. At the same time, insert 1000 and P1 into the B+Tree index of the '<=' predicates of attribute 'wage', insert 1000 and P2 into the B+Tree index of the '>=' predicates of attribute 'wage', insert 20 and P3 into the B+Tree index of the '>=' predicates of attribute 'age', and insert 32 and P4 into the B+Tree index of the '<=' predicates of attribute 'age'; go to step 3-4;
Step 3-4: Return to step 1;
(2) When subscription 2 and subscription 3 are input, the procedure is the same as for subscription 1; only the insertion into the subscription index in step 3-3 differs:
Subscription 2 and subscription 3 in interval form are:
Subscription 2: {[200, 10000], [0, 60]}
Subscription 3: {[200, 600], [0, 150]}
Further:
According to (2) and (3) of step 3-3, the predicates of subscription 2 can be equivalently rewritten as:
Subscription 2: {P5: wage >= 200, P6: age <= 60}, with pre-computed count PCounter2 = 2. At the same time, insert 200 and P5 into the B+Tree index of the '>=' predicates of attribute 'wage', and insert 60 and P6 into the B+Tree index of the '<=' predicates of attribute 'age';
According to (1) and (4) of step 3-3, the predicates of subscription 3 can be equivalently rewritten as:
Subscription 3: {P5: wage >= 200, P7: wage <= 600}, with pre-computed count PCounter3 = 2. At the same time, insert 600 and P7 into the B+Tree index of the '<=' predicates of attribute 'wage'. P5 was already inserted into the index in the previous step, so subscription 3 is added to the entry for P5 in the predicate-to-subscription mapping table;
After subscription 1, subscription 2 and subscription 3 have been input, the subscription index built by the system, the subscriptions with their pre-computed counts PCounter, and the resulting subscription index trees are shown in Fig. 5, Fig. 6 and Fig. 7. Fig. 7 also shows the correspondence between predicates and subscriptions, which is used to look up the matching subscriptions when an event is input. P1, P2, P3, P4, P5, P6 and P7 denote the predicates above;
Finally, execute step 3-4;
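The index state reached after the three subscriptions can be reproduced with a short script. It is a sketch under assumptions: the dictionaries `keys`, `owners` and `pcounter` are hypothetical names, a sorted list per (attribute, operator) pair stands in for each B+Tree, and `owners` plays the role of the predicate-to-subscription mapping table of Fig. 7.

```python
import bisect
from collections import defaultdict

DOMAINS = {"wage": (0, 10000), "age": (0, 150)}
SUBSCRIPTIONS = {                       # interval forms taken from the embodiment
    1: {"wage": (1000, 1000), "age": (20, 32)},
    2: {"wage": (200, 10000), "age": (0, 60)},
    3: {"wage": (200, 600), "age": (0, 150)},
}

keys = defaultdict(list)     # (attr, op) -> sorted distinct bounds, one list per index
owners = defaultdict(list)   # (attr, op, bound) -> subscriptions sharing that predicate
pcounter = {}                # subscription id -> pre-computed count

for sid, intervals in SUBSCRIPTIONS.items():
    count = 0
    for attr, (start, end) in intervals.items():
        dmin, dmax = DOMAINS[attr]
        candidates = ((">=", start, start > dmin), ("<=", end, end < dmax))
        for op, bound, constrains in candidates:
            if not constrains:
                continue                                  # rule (1): nothing to index
            if not owners[(attr, op, bound)]:
                bisect.insort(keys[(attr, op)], bound)    # new predicate, new index key
            owners[(attr, op, bound)].append(sid)         # shared predicate, e.g. P5
            count += 1
    pcounter[sid] = count

print(pcounter)                        # {1: 4, 2: 2, 3: 2}
print(keys[("wage", ">=")])            # [200, 1000]   (P5, P2)
print(keys[("wage", "<=")])            # [600, 1000]   (P7, P1)
print(owners[("wage", ">=", 200)])     # [2, 3]        (P5 is shared)
```

Keeping one index key per distinct predicate and a separate owner list means a shared predicate such as P5 is scanned once and then credited to every subscription that uses it.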
(3) When the data submitted by the user is event 1, the matching method executes the following steps:
Step 1: When event 1 is input, step 1 obtains the data of event 1, whose interval form is:
Event 1: {[800, 1600], [24, 24]}. The input data is an event, so go to step 2;
Step 2: Query the subscriptions that match the event and insert the event into the event index:
Step 2-1: Query the subscription index. At this point subscription 1, subscription 2 and subscription 3 are already in the subscription index. Find the matching sub-subscriptions and increment the subscription match counter SCounter of the subscription that owns each matching sub-subscription, as shown in Fig. 8 (0) to Fig. 8 (4), where Fig. 8 (0) is the initial state of the counters. Let EX and EY denote the start and end of an interval:
For attribute 'wage', EX = 800 and EY = 1600. According to (1) of step 2-1, query the B+Tree index of the '>=' predicates of attribute 'wage' with EY = 1600. The query result is P2 and P5. According to the correspondence between predicates and subscriptions in Fig. 7, the match counters of subscription 1, subscription 2 and subscription 3 are each incremented by 1; the result is shown in Fig. 8 (1);
According to (2) of step 2-1, query the B+Tree index of the '<=' predicates of attribute 'wage' with EX = 800. The query result is P1. According to the correspondence between predicates and subscriptions in Fig. 7, the match counter of subscription 1 is incremented by 1; the result is shown in Fig. 8 (2);
For attribute 'age', EX = 24 and EY = 24. The ascending scan of the B+Tree index of the '<=' predicates of 'age' returns P4 and P6, so the match counters of subscription 1 and subscription 2 are each incremented by 1; the result is shown in Fig. 8 (3). The scan of the B+Tree index of the '>=' predicates of 'age' returns P3, so the match counter of subscription 1 is incremented by 1. The resulting counter state is shown in Fig. 8 (4); go to step 2-2;
Step 2-2: output the matching subscriptions. Because the pre-computed counters (PCounter) of subscription 1 and subscription 2 are equal to the values of their match counters, subscription 1 and subscription 2 are the matching results of event 1. The system outputs subscription 1 and subscription 2, as shown in the job-hunting matching result in Fig. 1. Proceed to step 2-3;
Step 2-3: insert this event into the event index. The process is similar to that of building the subscription index in step 3-3 of (1) and (2) above, and is not repeated here.
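The counting in steps 2-1 and 2-2 can be replayed in a few lines of Python. This is only an illustrative sketch: the scan results (P1 to P6) are taken from the walkthrough above, but Fig. 7 is not reproduced in this text, so the predicate-to-subscription map and the PCounter values below are assumptions chosen to be consistent with the counter increments described. The names `replay`, `predicate_owners` and `pcounter` are hypothetical.

```python
from collections import Counter

# Scan results quoted from the walkthrough of event 1 ([800, 1600], [24, 24]).
scan_results = [
    ("wage <=", ["P2", "P5"]),
    ("wage >=", ["P1"]),
    ("age <=", ["P4", "P6"]),
    ("age >=", ["P3"]),
]

# ASSUMPTION: hypothetical stand-in for Fig. 7 (predicate -> owning subscriptions).
predicate_owners = {
    "P1": [1], "P2": [1], "P3": [1], "P4": [1],
    "P5": [2, 3],
    "P6": [2],
}

# ASSUMPTION: hypothetical pre-computed counters (PCounter) per subscription.
pcounter = {1: 4, 2: 2, 3: 2}


def replay(scan_results, predicate_owners, pcounter):
    """Accumulate SCounter per subscription, then compare it with PCounter."""
    scounter = Counter()
    for _index_name, predicates in scan_results:
        for predicate in predicates:
            for subscription in predicate_owners[predicate]:
                scounter[subscription] += 1  # each scanned predicate counts 1
    matched = sorted(s for s, need in pcounter.items() if scounter[s] == need)
    return scounter, matched
```

Under these assumed tables, `replay(scan_results, predicate_owners, pcounter)` ends with counters 4, 2 and 1 for subscriptions 1, 2 and 3, so only subscriptions 1 and 2 are reported, in line with step 2-2.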
Time and space complexity of the present embodiment: all subscriptions are decomposed into sub-subscriptions and stored in the indexes, and the index data structure is B+Tree, so the space complexity of the present embodiment is O(n). During matching, the leaf linked lists of the greater-than predicate B+Tree and the less-than predicate B+Tree of each attribute must be scanned, so the matching time complexity of the present embodiment is O(n). The cost of an index insertion or deletion is O(log(n)).
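The matching bound can be written out as a short derivation. This is a sketch under notation that is not in the original: d is the number of attributes, n is the total number of index entries, n_a^{\le} and n_a^{\ge} are the sizes of the two indexes of attribute a, and k_a^{\le} and k_a^{\ge} are the numbers of leaf entries visited by the two scans.

```latex
T_{\text{match}}
  = \sum_{a=1}^{d} \Big[ O(\log n_a^{\le}) + k_a^{\le} + O(\log n_a^{\ge}) + k_a^{\ge} \Big],
\qquad
k_a^{\le} \le n_a^{\le}, \quad k_a^{\ge} \le n_a^{\ge}, \quad
\sum_{a=1}^{d} \big( n_a^{\le} + n_a^{\ge} \big) = n
\;\Longrightarrow\;
T_{\text{match}} = O(d \log n + n)
```

For a fixed number of attributes this gives O(n) in the worst case, which is reached when every leaf entry falls inside the scanned ranges. The O(log n) terms are the B+Tree descents that locate the start of each scan.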
Hardware platform: HP DX2708MT, CPU Intel Core 2 6300 1.86 GHz, 2 GB memory, 80 GB 7200 rpm hard disk. The simulation experiments were run on a Debian GNU/Linux 4.0 system, and all programs were implemented in C++. The experimental results are shown in Figs. 9 to 11:
Fig. 9, Fig. 10 and Fig. 11 show the influence of dimensionality on matching performance, the influence of changes in data distribution on matching performance, and the index insertion time. The comparison experiments use two data distributions, Zipf and Uniform. "Counting" denotes the present invention, while UB-Tree and R*-tree denote the multi-dimensional indexes it is compared with. As can be seen from Fig. 9, when the subscription index is built, UB-Tree and the present invention have good dynamic maintenance performance compared with R*-tree; because data in a publish/subscribe system is inserted and deleted dynamically, the system must have good dynamic maintenance performance. As can be seen from Fig. 10 and Fig. 11, compared with UB-Tree, the matching time of the present invention grows linearly as the dimensionality changes and remains stable across different data distributions.

Claims (3)

1. A matching method for a symmetric publish/subscribe system based on plural one-dimensional indexes, characterized by comprising the following steps:
Step 1: the system receives data submitted by users, the data comprising the user's requirements and the user's own information, expressed in numerical form. Users are divided into two classes: the system selects the information data submitted by one class of users as subscriptions, and the information data submitted by the other class of users as events. When the information data submitted by a user is an event, execute step 2; when the information data submitted by a user is a subscription, execute step 3;
Step 2: query the subscriptions that match the event and insert the event into the event index, then execute step 1;
Step 3: query the events that match the subscription and insert the subscription into the subscription index, then execute step 1.
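The control flow of claim 1 can be sketched as a small dispatcher. This is a minimal illustration, not the claimed implementation: plain Python lists stand in for the event index and the subscription index, and the class name `SymmetricBroker`, its `submit` method and the `matches` callback are hypothetical. The callback is a placeholder for the counting-based matching described in claims 2 and 3.

```python
class SymmetricBroker:
    """Routes each submitted item to step 2 (event) or step 3 (subscription)."""

    def __init__(self, matches):
        # ASSUMPTION: matches(event, subscription) -> bool is supplied by the caller.
        self.matches = matches
        self.events = []          # stand-in for the event index
        self.subscriptions = []   # stand-in for the subscription index

    def submit(self, kind, item):
        if kind == "event":
            # Step 2: find matching stored subscriptions, then store the event.
            hits = [s for s in self.subscriptions if self.matches(item, s)]
            self.events.append(item)
        elif kind == "subscription":
            # Step 3: find matching stored events, then store the subscription.
            hits = [e for e in self.events if self.matches(e, item)]
            self.subscriptions.append(item)
        else:
            raise ValueError(f"unknown item kind: {kind!r}")
        return hits  # control then returns to step 1 for the next submission
```

The symmetry is visible in the two branches: whichever side arrives is first matched against the stored items of the other side and then stored itself, so later arrivals on the other side can find it.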
2. The matching method for a symmetric publish/subscribe system based on plural one-dimensional indexes according to claim 1, characterized in that querying the subscriptions that match the event and inserting the event into the event index in step 2 comprises the following steps:
Step 2-1: query the subscription index, as follows: decompose the event by attribute into a set of sub-events, one per attribute. For each sub-event, query the subscription index built in step 3-3, check the matching sub-subscriptions, and increment the subscription match counter SCounter of the subscription to which each matching sub-subscription belongs, for later use when checking whether the subscription is satisfied. Assume the arriving event is [EX, EY], where EX and EY are the values of the start point and end point of the interval, respectively. The corresponding index and counting operations are as follows:
(1) Query the less-than predicate index of the corresponding attribute with EY: after finding the first leaf node less than or equal to EY, scan the leaf linked list of the index using the ordered linked list of the B+Tree leaf nodes, in the direction from large to small, and perform match counting on the subscription corresponding to each scanned predicate, with a count value of 1;
(2) Query the greater-than predicate index of the corresponding attribute with EX: after finding the first leaf node greater than or equal to EX, scan the leaf linked list of the index using the ordered linked list of the B+Tree leaf nodes, in the direction from small to large, and perform match counting on the subscription corresponding to each scanned predicate, with a count value of 1;
Step 2-2: output the matching subscriptions: check each subscription in the associated index table found in step 2-1. If the pre-computed count value PCounter of a subscription in the index table is equal to its subscription match counter SCounter, the event satisfies this subscription, and the system outputs the subscription and delivers it to the user;
Step 2-3: insert the event into the event index: decompose the event by attribute into a set of sub-events, one per attribute. According to the type of the interval of each sub-event, insert the sub-event into the corresponding index and count it at the same time. The count of the event is the sum of the count values of its sub-events, i.e. the pre-computed count PCounter. Each attribute has two indexes: one indexes the intervals corresponding to greater-than predicates, and the other indexes the intervals corresponding to less-than predicates. The index data structure is B+Tree. Assume the domain of the attribute is [DOMAIN_MIN, DOMAIN_MAX], where DOMAIN_MIN and DOMAIN_MAX are the minimum and maximum values of the attribute; SX and SY below are the start value and end value of the interval, respectively. The index insertion and counting operations for each sub-event are as follows:
(1) If no predicate is defined on the attribute, i.e. the interval has the pattern [DOMAIN_MIN, DOMAIN_MAX], then any event lies within the interval range and any subscription can be satisfied, so no index entry or count is needed for it; the count value of this sub-event is 0;
(2) When the interval has the pattern [SX, DOMAIN_MAX], it corresponds to a greater-than predicate, and SX is inserted into the index corresponding to greater-than predicates. Because the right end of any event lies within the maximum value, the right endpoint of this sub-event need not be counted; its count value is 1;
(3) When the interval has the pattern [DOMAIN_MIN, SY], it corresponds to a less-than predicate, and SY is inserted into the index corresponding to less-than predicates. Because the left end of any event lies within the minimum value, the left endpoint of this sub-event need not be counted; its count value is 1;
(4) When the interval has the pattern [SX, SY], regardless of whether SX equals SY, it can be decomposed into the forms [SX, DOMAIN_MAX] and [DOMAIN_MIN, SY]. By (2) and (3), SX is inserted into the greater-than predicate index and SY is inserted into the less-than predicate index, and the count value of this sub-event is 2. The [SX, SY] pattern is satisfied if and only if both decomposed predicates are satisfied.
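The four interval patterns of step 2-3 can be sketched as follows. This is a minimal illustration under stated assumptions: sorted Python lists maintained with `bisect.insort` stand in for the B+Trees (so insertion here is not O(log n)), the domain bounds are hypothetical values, and the function names `decompose` and `insert_item` are invented for this example.

```python
import bisect


def decompose(interval, domain):
    """Map one [SX, SY] interval to (greater-than key, less-than key, count)."""
    sx, sy = interval
    domain_min, domain_max = domain
    # A bound that coincides with the domain limit defines no predicate.
    ge_key = sx if sx > domain_min else None   # goes into the greater-than index
    le_key = sy if sy < domain_max else None   # goes into the less-than index
    count = int(ge_key is not None) + int(le_key is not None)
    return ge_key, le_key, count


def insert_item(item_id, intervals, domains, ge_index, le_index, pcounter):
    """Insert every per-attribute interval and record the item's PCounter."""
    total = 0
    for attr, interval in intervals.items():
        ge_key, le_key, count = decompose(interval, domains[attr])
        if ge_key is not None:
            bisect.insort(ge_index.setdefault(attr, []), (ge_key, item_id))
        if le_key is not None:
            bisect.insort(le_index.setdefault(attr, []), (le_key, item_id))
        total += count
    pcounter[item_id] = total   # sum of the per-attribute count values
    return total
```

With an assumed age domain of (0, 150), the interval (24, 24) yields one key in each index and a count of 2, while (0, 150) yields no keys and a count of 0. The same routine serves both sides, since step 3-3 applies the identical rules to subscriptions.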
3. The matching method for a symmetric publish/subscribe system based on plural one-dimensional indexes according to claim 1, characterized in that querying the events that match the subscription and inserting the subscription into the subscription index in step 3 comprises the following steps:
Step 3-1: query the event index: decompose the subscription by attribute into a set of sub-subscriptions, one per attribute. For each sub-subscription, query the event index built in step 2-3, check the matching sub-events, and increment the event match counter SCounter of the event to which each matching sub-event belongs, for later use when checking whether the event is satisfied. Assume the arriving interval is [EX, EY]. The corresponding index and counting operations are as follows:
(1) Query the less-than predicate index of the corresponding attribute with EY: after finding the first leaf node less than or equal to EY, scan the leaf linked list of the index using the ordered linked list of the B+Tree leaf nodes, in the direction from large to small, and perform match counting on the event corresponding to each scanned predicate, with a count value of 1;
(2) Query the greater-than predicate index of the corresponding attribute with EX: after finding the first leaf node greater than or equal to EX, scan the leaf linked list of the index using the ordered linked list of the B+Tree leaf nodes, in the direction from small to large, and perform match counting on the event corresponding to each scanned predicate, with a count value of 1;
Step 3-2: output the matching events: check each event in the associated index table found in step 3-1. If the pre-computed count value PCounter of an event in the index table is equal to its event match counter SCounter, the subscription matches this event, and the system outputs the event and delivers it to the user;
Step 3-3: insert the subscription into the subscription index: decompose the subscription by attribute into a set of sub-subscriptions, one per attribute. According to the type of the interval of each sub-subscription, insert the sub-subscription into the corresponding index and count it at the same time. The count of the subscription is the sum of the count values of its sub-subscriptions, i.e. the pre-computed count PCounter. Each attribute has two indexes: one indexes the intervals corresponding to greater-than predicates, and the other indexes the intervals corresponding to less-than predicates. The index data structure is B+Tree. Assume the domain of the attribute is [DOMAIN_MIN, DOMAIN_MAX]. The index insertion and counting operations for each sub-subscription are as follows:
(1) If no predicate is defined on the attribute, i.e. the interval has the pattern [DOMAIN_MIN, DOMAIN_MAX], the count value of this sub-subscription is 0;
(2) When the interval has the pattern [SX, DOMAIN_MAX], it corresponds to a greater-than predicate; SX is inserted into the index corresponding to greater-than predicates, and its count value is 1;
(3) When the interval has the pattern [DOMAIN_MIN, SY], it corresponds to a less-than predicate; SY is inserted into the index corresponding to less-than predicates, and its count value is 1;
(4) When the interval has the pattern [SX, SY], regardless of whether SX equals SY, it can be decomposed into the forms [SX, DOMAIN_MAX] and [DOMAIN_MIN, SY]. By (2) and (3) of step 3-3, SX is inserted into the greater-than predicate index and SY is inserted into the less-than predicate index, and the count value of this sub-subscription is 2. The [SX, SY] pattern is satisfied if and only if both decomposed predicates are satisfied.
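Steps 3-1 and 3-2 can be sketched as two range scans per attribute followed by a counter comparison. The sketch follows a literal reading of items (1) and (2): the less-than index is scanned downward from EY and the greater-than index upward from EX. Sorted lists of (key, id) tuples stand in for the B+Tree leaf chains, and the stored events, their integer ids and the function name `match` are assumptions made for illustration.

```python
import bisect
from collections import Counter

# ASSUMPTION: hypothetical event index contents, as (key, event id) per attribute.
event_le_index = {"wage": [(1200, 2), (1400, 3), (1500, 1)], "age": [(24, 1), (24, 2)]}
event_ge_index = {"wage": [(1000, 1)], "age": [(20, 3), (24, 1)]}
event_pcounter = {1: 4, 2: 2, 3: 2}   # pre-computed counts recorded at insertion


def match(query, ge_index, le_index, pcounter):
    """Return ids of stored items whose SCounter equals their PCounter."""
    scounter = Counter()
    for attr, (ex, ey) in query.items():
        le = le_index.get(attr, [])
        # (1) locate the last key <= EY, then scan from large to small
        end = bisect.bisect_right(le, (ey, float("inf")))
        for _key, item_id in reversed(le[:end]):
            scounter[item_id] += 1
        ge = ge_index.get(attr, [])
        # (2) locate the first key >= EX, then scan from small to large
        start = bisect.bisect_left(ge, (ex, float("-inf")))
        for _key, item_id in ge[start:]:
            scounter[item_id] += 1
    # Step 3-2: an item is output only when every one of its predicates was hit.
    return sorted(i for i, need in pcounter.items() if scounter[i] == need)
```

For the assumed data, `match({"wage": (800, 1600), "age": (24, 24)}, event_ge_index, event_le_index, event_pcounter)` returns `[1, 2]`: event 3 is hit only once because its greater-than key 20 on 'age' lies below EX=24, so its counter stays below its PCounter of 2. An item with a PCounter of 0 would always be returned, which corresponds to rule (1) of step 3-3.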
CN201010606649XA 2010-12-27 2010-12-27 Matching method of symmetrical issuing subscription system based on plural one-dimensional index Expired - Fee Related CN102004798B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010606649XA CN102004798B (en) 2010-12-27 2010-12-27 Matching method of symmetrical issuing subscription system based on plural one-dimensional index

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201010606649XA CN102004798B (en) 2010-12-27 2010-12-27 Matching method of symmetrical issuing subscription system based on plural one-dimensional index

Publications (2)

Publication Number Publication Date
CN102004798A true CN102004798A (en) 2011-04-06
CN102004798B CN102004798B (en) 2012-05-23

Family

ID=43812160

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010606649XA Expired - Fee Related CN102004798B (en) 2010-12-27 2010-12-27 Matching method of symmetrical issuing subscription system based on plural one-dimensional index

Country Status (1)

Country Link
CN (1) CN102004798B (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102819569A (en) * 2012-07-18 2012-12-12 中国科学院软件研究所 Matching method for data in distributed interactive simulation system
CN103678577A (en) * 2013-12-10 2014-03-26 新浪网技术(中国)有限公司 Method and device for updating data
WO2014153717A1 (en) * 2013-03-26 2014-10-02 Telefonaktiebolaget L M Ericsson (Publ) Method and communication node for managing communication event subscribers and computer program and medium for the same
CN105068879A (en) * 2015-08-31 2015-11-18 苏州大学张家港工业技术研究院 Target subscription retrieval method and apparatus
CN105373633A (en) * 2015-12-23 2016-03-02 江苏省现代企业信息化应用支撑软件工程技术研发中心 Top-k subscription inquiring and matching method of position sensing subscription/publishing system
CN105740337A (en) * 2016-01-22 2016-07-06 东南大学 Rapid event matching method in content-based publishing subscription system
CN108833466A (en) * 2018-04-27 2018-11-16 中南民族大学 The system and method for transportation network space text publish/subscribe
CN111416854A (en) * 2020-03-16 2020-07-14 海南大学 Cloud service publishing method, subscribing method, device and system
CN111949913A (en) * 2020-08-12 2020-11-17 上海交通大学 Efficient matching method and system for space-time perception publishing/subscribing system
CN113722332A (en) * 2021-09-09 2021-11-30 上海交通大学 Method and system for improving efficiency and robustness of matching algorithm based on data structure

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101295311A (en) * 2008-06-17 2008-10-29 浙江大学 Semantic matching algorithm of large scale issuance subscription system
US20090012947A1 (en) * 2006-03-06 2009-01-08 Whitehead Jeffrey A Method and system for correlating information

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
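Computer Engineering and Applications (《计算机工程与应用》) 20081231 Hu Ningjing et al. Multi-level index matching filter in publish/subscribe systems pp. 80-82 1-3 , No. 16 2 *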

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102819569B (en) * 2012-07-18 2015-01-07 中国科学院软件研究所 Matching method for data in distributed interactive simulation system
CN102819569A (en) * 2012-07-18 2012-12-12 中国科学院软件研究所 Matching method for data in distributed interactive simulation system
WO2014153717A1 (en) * 2013-03-26 2014-10-02 Telefonaktiebolaget L M Ericsson (Publ) Method and communication node for managing communication event subscribers and computer program and medium for the same
CN103678577A (en) * 2013-12-10 2014-03-26 新浪网技术(中国)有限公司 Method and device for updating data
CN105068879A (en) * 2015-08-31 2015-11-18 苏州大学张家港工业技术研究院 Target subscription retrieval method and apparatus
CN105068879B (en) * 2015-08-31 2018-08-17 苏州大学张家港工业技术研究院 A kind of method and device searched target and subscribed to
CN105373633A (en) * 2015-12-23 2016-03-02 江苏省现代企业信息化应用支撑软件工程技术研发中心 Top-k subscription inquiring and matching method of position sensing subscription/publishing system
CN105373633B (en) * 2015-12-23 2019-03-05 江苏省现代企业信息化应用支撑软件工程技术研发中心 The top-k query of subscription matching process of location aware subscription/publication system
CN105740337B (en) * 2016-01-22 2019-03-12 东南大学 A kind of event fast matching method in distribution subscription system based on content
CN105740337A (en) * 2016-01-22 2016-07-06 东南大学 Rapid event matching method in content-based publishing subscription system
CN108833466A (en) * 2018-04-27 2018-11-16 中南民族大学 The system and method for transportation network space text publish/subscribe
CN108833466B (en) * 2018-04-27 2021-05-14 中南民族大学 System and method for publishing/subscribing traffic network space text
CN111416854A (en) * 2020-03-16 2020-07-14 海南大学 Cloud service publishing method, subscribing method, device and system
CN111949913A (en) * 2020-08-12 2020-11-17 上海交通大学 Efficient matching method and system for space-time perception publishing/subscribing system
CN111949913B (en) * 2020-08-12 2024-04-09 上海交通大学 Efficient matching method and system for space-time perception publish/subscribe system
CN113722332A (en) * 2021-09-09 2021-11-30 上海交通大学 Method and system for improving efficiency and robustness of matching algorithm based on data structure
CN113722332B (en) * 2021-09-09 2024-03-26 上海交通大学 Method and system for improving efficiency and robustness of matching algorithm based on data structure

Also Published As

Publication number Publication date
CN102004798B (en) 2012-05-23

Similar Documents

Publication Publication Date Title
CN102004798B (en) Matching method of symmetrical issuing subscription system based on plural one-dimensional index
Zhao et al. Modeling MongoDB with relational model
US6609131B1 (en) Parallel partition-wise joins
CN104737162B (en) Automatic denormalization for the analytic type query processing in large-scale cluster
CN107451208B (en) Data searching method and device
CN106294695A (en) A kind of implementation method towards the biggest data search engine
CN105740337A (en) Rapid event matching method in content-based publishing subscription system
Tran et al. Structure index for RDF data
CN111008521A (en) Method and device for generating wide table and computer storage medium
CN109614402A (en) Multidimensional data query method and device
CN101710348B (en) Document data query method and server
CN106294374A (en) The method of small documents merging and data query system
Yang et al. Distributed similarity queries in metric spaces
CN108536824B (en) Data processing method and device
Li et al. Efficient subspace skyline query based on user preference using MapReduce
CN111126852A (en) BI application system based on big data modeling
US8239417B2 (en) System, method, and computer program product for accessing and manipulating remote datasets
CN104537091A (en) Networked relational data query method based on hierarchical identification routing
Yin et al. An industrial dynamic skyline based similarity joins for multidimensional big data applications
CN106202364B (en) XML data Partitioning optimization method and its system towards MapReduce
CN107656989A (en) The nearest Neighbor perceived in cloud storage system based on data distribution
CN107291938A (en) Order Query System and method
CN106708946A (en) Universal API table query method
CN101667202A (en) Parallel matching method of publish/subscribe system based on semantics under multi-core framework
Fu et al. ICA: an incremental clustering algorithm based on OPTICS

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20120523

Termination date: 20151227

EXPY Termination of patent right or utility model