CN103412883A - Semantic intelligent information publishing and subscribing method based on P2P technology - Google Patents

Info

Publication number
CN103412883A
CN103412883A CN2013103021876A CN201310302187A CN103412883A CN 103412883 A CN103412883 A CN 103412883A CN 2013103021876 A CN2013103021876 A CN 2013103021876A CN 201310302187 A CN201310302187 A CN 201310302187A CN 103412883 A CN103412883 A CN 103412883A
Authority
CN
China
Prior art keywords
node
information
data
semantic
dimension
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2013103021876A
Other languages
Chinese (zh)
Other versions
CN103412883B (en)
Inventor
王小峰
吴纯青
任沛阁
胡晓峰
黄杰
虞万荣
彭伟
陶静
孙浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology
Priority to CN201310302187.6A
Publication of CN103412883A
Application granted
Publication of CN103412883B
Legal status: Active (current)
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A semantic intelligent information publishing and subscribing method based on P2P technology comprises the following steps: (1) building the system topology and setting up a super node; (2) normalizing the high-dimensional attribute space; (3) partitioning the high-dimensional data and building and maintaining a global index tree; (4) reducing the dimensionality of the high-dimensional data: each node normalizes the data accumulation area assigned to it into a multidimensional hypercube in the high-dimensional attribute space, maps the high-dimensional data objects to a one-dimensional data space using the pyramid dimension-reduction method, and indexes them with a B+ tree; (5) managing the data objects with the i-Chord method; (6) performing semantics-based intelligent information publishing and subscribing. The method is simple in principle, easy to implement and deploy, and improves the fault tolerance, dynamic adaptability, and information distribution efficiency of the system.

Description

Semantic intelligent information publishing and subscribing method based on P2P technology
Technical field
The present invention relates to semantics-based intelligent information exchange in large-scale information networks, and in particular to a semantic intelligent information publishing and subscribing method based on P2P technology.
Background art
With the rapid development and wide adoption of computer networking, digital resources on the network are growing exponentially and taking increasingly diverse forms, while users' expectations for information acquisition keep rising. How to efficiently obtain the information a user is "interested" in from massive and heterogeneous network resources has therefore become a growing concern.
Network environments are large-scale, decentrally controlled, loosely coupled, autonomous, and dynamic, and researchers have proposed the publish/subscribe (Publish/Subscribe, hereinafter P/S) technique for such environments. A P/S system consists of three parts: publishers, subscribers, and event agents. A publisher is an object that generates events, that is, an information producer; a subscriber is an object that consumes events, that is, an information consumer; the event agent is the publish/subscribe middleware. Publishers publish information to the event agent in the form of "events", subscribers register with the event agent the events they are interested in, and the event agent routes each published event promptly and reliably to the interested subscribers. P/S is an asynchronous communication model that supports both one-to-many and many-to-many interaction. It fully decouples the participants in time, space, and control flow, and also supports anonymous communication, so it is well suited to the loosely coupled communication needs of large-scale distributed network systems.
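The decoupling described above can be illustrated with a minimal sketch. It is not taken from the source: the class name `EventBroker`, the topic strings, and the synchronous callback delivery are all assumptions chosen for brevity, and matching here is exact topic matching with no semantics.

```python
from collections import defaultdict


class EventBroker:
    """Toy event agent: exact-match topics only, delivered synchronously."""

    def __init__(self):
        # topic -> list of subscriber callbacks
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, event):
        # The publisher never learns who, if anyone, receives the event.
        for callback in self._subscribers.get(topic, []):
            callback(event)


received = []
broker = EventBroker()
broker.subscribe("weather", received.append)
broker.publish("weather", {"city": "Changsha", "temp_c": 21})
broker.publish("sports", {"score": "2-1"})  # no subscriber, so nothing is delivered
```

The publisher and subscriber only share the broker and a topic name, which is the space decoupling the paragraph refers to. A real event agent would also queue events to decouple the parties in time.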
Existing publish/subscribe research is still at a developing stage. Problems remain in areas such as reliability and information distribution efficiency, and many key techniques are still unresolved. In terms of topology, for example, existing P/S middleware is typically designed as either a centralized system or an unstructured P2P system. A centralized topology relies on a single server to mediate between publishers and subscribers (for example SIENA from the University of Colorado, Gryphon from the IBM research center, and JEDI); its drawback is that the server easily becomes a performance bottleneck, and if the server fails the whole system stops working. An unstructured P2P topology (such as Hermes, proposed at the University of Cambridge) usually routes information by flooding, gossiping, or random walk; because the overlay is unstructured and nodes are dynamic, event routes are hard to maintain and the system scales poorly.
On the other hand, in an open network environment information resources take different forms, and both structural heterogeneity (different users represent the same event with different structures, for example some events are in Map form and others in XML form) and semantic heterogeneity (different users use different terms for the same event, or the same term for different concepts) are widespread. Existing P/S systems (such as CORBA, Scribe, Bayeux, and JEDI) remain weak in expressiveness: they describe events only by their structural information and lack a semantic understanding of the events themselves. Matching between events and subscriptions is exact matching, which is easily disturbed by synonyms and near-synonyms and may return many erroneous results that deviate from the user's intended meaning, so intelligent matching based on information semantics cannot be achieved.
To strengthen the semantic expressiveness of the system and achieve intelligent matching based on information semantics, the heterogeneous digital resources in an information network can be abstracted as points or feature vectors in a high-dimensional attribute space, and the semantic similarity between data objects can be measured by the distance between points or the angle between feature vectors. A high-dimensional attribute space, however, brings the "curse of dimensionality": data in such a space are sparsely distributed and tend to lie near the surface of the space, so semantic similarity search becomes too costly and query efficiency is low. Dimensionality reduction maps data objects from a hard-to-handle high-dimensional space to a low-dimensional space, which effectively shrinks the search space and improves retrieval efficiency, and is one of the effective ways of addressing the curse of dimensionality. For example, the Chinese patent application entitled "An embedded dimension reduction method based on image data structure protection" divides the vectors of the original data set into similar and non-similar subsets according to their pairwise distances, applies different embedding operations to the different subsets to transform the distances, and then projects the new distance matrix to reduce the dimensionality. However, data objects in real information networks are complex and varied, and their forms and semantic attributes keep changing, so it is difficult to abstract them all into vectors of fixed dimension and fixed type. Moreover, a digital resource can define many attributes in the high-dimensional attribute space, but many of them are irrelevant to a given search (concepts from medicine, for example, would not appear in a computer science search). It is therefore necessary to map the various data objects uniformly into an attribute space of fixed structure and to prune attributes irrelevant to the search, which reduces the computation required for semantic similarity search and further improves search efficiency.
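As a small illustration of measuring semantic similarity by the angle between feature vectors, the sketch below computes cosine similarity in plain Python. The three-attribute space and the sample vectors are hypothetical and serve only to show the calculation.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


# Hypothetical attribute order: ["network", "routing", "medicine"]
doc_a = [3.0, 1.0, 0.0]
doc_b = [6.0, 2.0, 0.0]  # same direction as doc_a
doc_c = [0.0, 0.0, 4.0]  # shares no attribute with doc_a
```

Vectors that point in the same direction, such as `doc_a` and `doc_b`, score close to 1 regardless of their length, while vectors with no attribute in common, such as `doc_a` and `doc_c`, score 0.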
In summary, existing publish/subscribe systems fall short in dynamic adaptability, fault tolerance, and self-organization. They are also limited in expressiveness and lack a semantic understanding of the events themselves, so semantics-based intelligent information exchange between users cannot be achieved.
Summary of the invention
The technical problem to be solved by the present invention is as follows: in view of the problems in the prior art, the invention provides a semantic intelligent information publishing and subscribing method based on P2P technology that is simple in principle, easy to implement and deploy, and improves the fault tolerance, dynamic adaptability, and information distribution efficiency of the system.
To solve the above technical problem, the invention adopts the following technical solution:
A semantic intelligent information publishing and subscribing method based on P2P technology, comprising the following steps:
(1) Build the system topology and set up a super node: using structured P2P technology, the multiple event agents of the P/S system are organized into a Chord ring, and a super node is designated on the ring to extract the attributes of the data resources in the information network and construct the attribute space; the data resources in the information network are abstracted as points or vectors in the high-dimensional attribute space;
(2) Normalize the high-dimensional attribute space: on the super node, the vector space model is used to represent each piece of data in the network as a high-dimensional point vector in the high-dimensional attribute space, so that all data in the network are mapped into one high-dimensional attribute space, expressed mathematically as a high-order matrix; latent semantic indexing is used to remove attributes that have little relevance to information retrieval, and the original high-dimensional attribute space is approximated by an attribute subspace;
(3) Partition the high-dimensional data and build and maintain the global index tree: the super node SN partitions the high-dimensional data in the attribute space into different data accumulation areas and assigns each area to a different node; the super node also maintains an index tree built from the information of all nodes, called the global index tree, which is used to distribute event information to the nodes of the Chord ring and to determine which agent nodes a subscription request must visit;
(4) Reduce the dimensionality of the high-dimensional data: each node normalizes the data accumulation area assigned to it into a multidimensional hypercube in the high-dimensional attribute space, then maps the high-dimensional data objects to a one-dimensional data space using the pyramid dimension-reduction method, and indexes them with a B+ tree;
(5) Manage the data objects with the i-Chord method: after the high-dimensional data set has been mapped to the one-dimensional data space, the Chord protocol is used to organize and maintain the network data; an order-preserving function is set up to map data in the one-dimensional data space to the Chord resource identifier space; each Chord node corresponds to one data accumulation area and stores and manages the high-dimensional data in that area, and each node maintains a routing table for fast queries;
(6) Perform semantics-based intelligent subscription and publication: each node in the system maintains a subscription table that records the subscriptions semantically related to that node. When a subscriber sends a subscription request to the system, the global index tree is first searched to determine the agent nodes semantically related to the request, and the request is sent to those agent nodes; each agent node adds a subscription record to its subscription table, registering the association between the request and the node, then determines the exact search range in the high-dimensional attribute space from the subscription request and subscription condition, searches that range for event information semantically identical to the request, and returns it to the subscriber. When a publisher sends event information to the system, the global index tree is first searched to determine the host node semantically related to the event, and the event information is sent to the host node; the host node consults its subscription table, and if the event information semantically matches a subscription in the table, the event is actively pushed to the user.
As a further improvement of the present invention, the specific steps of step (1) are:
(1.1) A P/S system may contain multiple event agents; these are organized into an event agent network in the form of a Chord ring, with each event agent corresponding to one node of the network; each event agent stores data resources of the information network according to a certain rule and keeps information about some of the other nodes;
(1.2) The most capable node in the Chord ring is selected as the super node SN; SN periodically checks the capabilities of the other nodes in the ring and selects a candidate super node from them, and the candidate super node backs up the important information held on SN;
(1.3) The super node SN is responsible for extracting the attributes of data resources of different forms and constructing the multidimensional attribute space; the data resources in the information network are abstracted as points or vectors in the high-dimensional attribute space.
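Step (1.2) can be sketched as a simple ranking. The embodiment names computing power, storage space, online time, and bandwidth as the capability factors, but it does not say how they are combined, so the weighted sum, the `WEIGHTS` values, and the names `NodeStats`, `capability`, and `elect` below are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class NodeStats:
    node_id: int
    cpu: float        # all four measures are assumed to be normalized to [0, 1]
    storage: float
    uptime: float
    bandwidth: float


# Hypothetical weights; the source does not specify how capabilities are combined.
WEIGHTS = {"cpu": 0.3, "storage": 0.2, "uptime": 0.3, "bandwidth": 0.2}


def capability(node):
    """Weighted sum of the node's normalized capability measures."""
    return sum(getattr(node, name) * weight for name, weight in WEIGHTS.items())


def elect(nodes):
    """Return (super_node, candidate): the two highest-scoring nodes."""
    ranked = sorted(nodes, key=capability, reverse=True)
    return ranked[0], ranked[1]


ring = [
    NodeStats(8, 0.9, 0.8, 0.9, 0.7),
    NodeStats(21, 0.4, 0.5, 0.6, 0.3),
    NodeStats(42, 0.7, 0.9, 0.8, 0.8),
]
super_node, candidate = elect(ring)
```

Re-running `elect` periodically mirrors the regular capability check in (1.2); the runner-up is the node that would hold the backup of SN's important information.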
As a further improvement of the present invention, the specific steps of step (2) are:
(2.1) Following the vector space model, each piece of high-dimensional data in the network is described as an attribute vector, and the system organizes all high-dimensional data in the network into a matrix. If the information network contains $d$ pieces of high-dimensional data described by $t$ attributes, it is described as a $t \times d$ matrix $A$ in which each column represents one piece of high-dimensional data. The matrix element $a_{ij}$ is the value of attribute $i$ in data object $j$ and represents the importance of attribute $i$ in object $j$; if data object $j$ does not have attribute $i$, then $a_{ij} = 0$;
(2.2) Most elements of the matrix are 0, which means that for retrieval purposes most attributes of a given piece of high-dimensional data are useless, so the initial matrix is replaced by a low-rank approximation. Suppose the set of high-dimensional data in the network is expressed as the matrix $A$ with rank $r$; the singular value decomposition of $A$ factors it into the product of three matrices:
$$A = U \Sigma V^{T}$$
where $U = (u_1, u_2, \ldots, u_r)$ is a $t \times r$ matrix, $\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_r)$ is an $r \times r$ diagonal matrix, $V = (v_1, \ldots, v_r)$ is a $d \times r$ matrix, and the $\sigma_i$ are the singular values of $A$, with $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r$;
(2.3) Only the $l$ largest singular values are retained and the others are discarded, so that the rank-$r$ matrix $A$ is approximated by the rank-$l$ matrix $A_l$:
$$A_l = U_l \Sigma_l V_l^{T}$$
where $U_l = (u_1, u_2, \ldots, u_l)$, $\Sigma_l = \mathrm{diag}(\sigma_1, \ldots, \sigma_l)$, $V_l = (v_1, \ldots, v_l)$, and the rows of $V_l \Sigma_l$ are the semantic vectors of the high-dimensional data.
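A short worked derivation may help show why the rows of $V_l \Sigma_l$ act as semantic vectors. It is a sketch that uses only the definitions in step (2) and the orthonormality of the columns of $U$; the symbols $a_j$ (column $j$ of $A$), $q$ (a new event or subscription vector), and $\hat{q}$ (its reduced form) are introduced here for illustration and do not appear in the source.

$$A^{T} U_l = V \Sigma U^{T} U_l = V_l \Sigma_l$$

Row $j$ of $V_l \Sigma_l$ is therefore $a_j^{T} U_l$, that is, the coordinates of data object $j$ with respect to the $l$ retained directions $u_1, \ldots, u_l$. Under the same reasoning, a new $t$-dimensional vector $q$ could be placed in the reduced space as

$$\hat{q} = U_l^{T} q$$

and compared with the stored semantic vectors there. The price of discarding the smaller singular values is given by the standard truncation error of the singular value decomposition:

$$\|A - A_l\|_F^2 = \sum_{i=l+1}^{r} \sigma_i^2$$

so the approximation is good whenever the discarded singular values are small relative to the retained ones.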
As a further improvement of the present invention, the specific steps of step (3) are:
(3.1) The data resources in the information network are abstracted as points or vectors in the high-dimensional attribute space. On the principle that data resources close to each other in the attribute space have similar semantics, the super node SN clusters and partitions the high-dimensional data distributed in the multidimensional attribute space into multiple mutually disjoint data accumulation areas and assigns each area to a different node;
(3.2) Besides its own routing table, the super node SN maintains an index tree built from the resource identifier ranges of all nodes on the ring, called the global index GI, which serves two purposes:
(3.2.1) Distributing event information to nodes quickly. Each node on the ring is responsible for one data accumulation area, and all data in that area are assigned to that node. When a piece of event information asks to join, GI is queried to determine which data accumulation area the event belongs to, which determines the host node of the event; then, using the identifier of the host node as the search key, the Chord routing protocol delivers the event information to the host node;
(3.2.2) Determining the nodes a subscription request must visit. When a subscription request is entered at some node, the request is first sent to SN, which searches GI to determine which nodes are responsible for data accumulation areas that intersect the semantic space of the request, and returns the identifiers of those nodes. The returned nodes are called agent nodes. Then, using the identifiers of the agent nodes as search keys, the subscription request is routed to those nodes, where semantic matching is carried out.
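The two uses of GI in step (3.2) can be sketched with a sorted list and binary search standing in for the index tree. This is an assumption made for brevity: the source does not specify the tree structure, and the sketch further assumes that a subscription has already been translated into a range of resource identifiers `[low, high]`. The class and method names are hypothetical.

```python
from bisect import bisect_left


class GlobalIndex:
    """Sorted node identifiers; each node owns the keys in (previous_id, own_id]."""

    def __init__(self, node_ids):
        self.node_ids = sorted(node_ids)

    def host_node(self, key):
        """Node whose range contains key, wrapping past the largest identifier."""
        pos = bisect_left(self.node_ids, key)
        return self.node_ids[pos % len(self.node_ids)]

    def agent_nodes(self, low, high):
        """Nodes whose ranges intersect the key range [low, high], with low <= high."""
        first = bisect_left(self.node_ids, low)
        last = bisect_left(self.node_ids, high)
        count = len(self.node_ids)
        hits = (self.node_ids[p % count] for p in range(first, last + 1))
        return list(dict.fromkeys(hits))  # drop duplicates, keep ring order


gi = GlobalIndex([8, 21, 42, 56])
host = gi.host_node(30)          # key 30 lies in (21, 42]
agents = gi.agent_nodes(10, 45)  # range touches (8, 21], (21, 42] and (42, 56]
```

A point lookup serves event distribution in (3.2.1), and the range lookup returns every node a subscription must visit in (3.2.2). In both cases the returned identifiers would then be used as Chord search keys.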
As a further improvement of the present invention, the specific steps of step (4) are:
(4.1) Each data accumulation area is normalized into a $d$-dimensional hypercube in the high-dimensional data space, and the hypercube is scaled so that the side length in every dimension is 1 and its center is $(0.5, 0.5, \ldots, 0.5)$. Then, taking the center of the hypercube as the apex and each $(d-1)$-dimensional hyperplane bounding the area as a base, each $d$-dimensional data accumulation area is divided into $2d$ pyramids;
(4.2) Each pyramid is divided into several slices parallel to its base, each slice corresponding to one data page of the B+ tree. The high-dimensional data in the data accumulation area are then mapped to one-dimensional values according to their distance to the pyramid base, and the reduced data are organized and managed with the B+ tree. The dimension-reduction formula is:
$$y_v = i \cdot c + (j + h_v) = i \cdot c + (j + |0.5 - h_v'|)$$
where $i$ is the number of the hypercube containing the high-dimensional data object $v$, $j$ is the number of its pyramid, $h_v'$ is the distance from $v$ to the base plane of that pyramid, and $h_v = |0.5 - h_v'|$ is the distance from $v$ to the plane through the pyramid apex. After dimension reduction, the one-dimensional values of the data objects in hypercube $i$ are confined to the interval $[i \cdot c, (i+1) \cdot c]$, where $c$ is a sufficiently large constant that ensures the data objects in each data partition have index keys distinct from those of other partitions.
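The mapping in step (4.2) can be sketched as follows. The source does not say how the $2d$ pyramids are numbered, so the sketch assumes the convention of the original pyramid technique: pyramids $0$ to $d-1$ have their base on the face where coordinate $k$ equals 0, and pyramids $d$ to $2d-1$ on the face where it equals 1. The function name and the default value of `c` are hypothetical; `c` only needs to exceed $2d$.

```python
def pyramid_value(point, cube_no=0, c=1000.0):
    """Map a point of the unit hypercube [0, 1]^d to a one-dimensional value.

    Returns (y_v, j, h_v) with y_v = cube_no * c + (j + h_v).
    """
    d = len(point)
    # Dimension in which the point lies farthest from the centre 0.5.
    j_max = max(range(d), key=lambda k: abs(0.5 - point[k]))
    # Assumed numbering: base on face x_k = 0 gives k, base on face x_k = 1 gives k + d.
    j = j_max if point[j_max] < 0.5 else j_max + d
    # Distance from the point to the plane through the apex (the cube centre).
    h_v = abs(0.5 - point[j_max])
    return cube_no * c + (j + h_v), j, h_v


y1, j1, h1 = pyramid_value((0.2, 0.6, 0.5))                # first pyramid of cube 0
y2, j2, h2 = pyramid_value((0.5, 0.9), cube_no=2, c=100.0)  # a pyramid based on x_1 = 1
```

Because `h_v` never exceeds 0.5, the integer part of `j + h_v` identifies the pyramid and the fractional part the slice height, so points in the same slice of the same pyramid receive neighbouring keys in the B+ tree.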
As a further improvement of the present invention, the specific steps of step (5) are:
(5.1) Suppose the Chord resource identifier space has size $2^m$. An order-preserving function $h$ maps the one-dimensional values obtained by dimension reduction, in order, into the interval $[0, 2^m)$; that is, for a high-dimensional data point $v$ in the attribute space, the i-Chord resource identifier of $v$ is:
$$key_v = \mathrm{ichord}(v) = h(y_v) = h(i \cdot c + (j + h_v)) \in [0, 2^m)$$
(5.2) Each node $N_i$ on the Chord ring stores and manages the high-dimensional data of one data accumulation area. Let the node identifier of $N_i$ be $Nkey_i$; the resource identifiers of the high-dimensional data in its area lie in the interval $(Nkey_{i-1}, Nkey_i]$;
(5.3) Each node $N_i$ maintains a routing table, that is, a table of pointers to other nodes on the ring. The routing table has $m$ entries, and the $k$-th entry ($1 \le k \le m$) is the first node on the Chord ring whose identifier is equal to or greater than $(Nkey_i + 2^{k-1}) \bmod 2^m$, that is, $\mathrm{successor}((Nkey_i + 2^{k-1}) \bmod 2^m)$. When any node receives a request with keyword $key$, it first checks whether its own identifier equals $key$ and, if so, returns directly. Otherwise the node searches its routing table for the node with the largest identifier that does not exceed $key$ and forwards the query to that node. This process repeats until the request reaches a node $N_k$ such that $key$ lies between $N_k$ and its successor $N_{k+1}$, at which point $N_k$ reports its successor $N_{k+1}$ as the reply to the request;
(5.4) When the high-dimensional data in a data accumulation area become too dense for the node to manage effectively, the area is split into two or more new areas, and idle nodes on the Chord ring are selected to share the load of the original node. When the data in adjacent data accumulation areas are so sparse that node resources are wasted, the areas are merged: one of the two responsible nodes is chosen as the responsible node of the new area and the other is set to idle.
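Steps (5.1) and (5.3) can be sketched in a single process as below. The routing follows the standard Chord closest-preceding-finger rule, which matches the description in (5.3). The linear scaling used for the order-preserving function, the parameter `y_max`, and the ten-node ring are assumptions for illustration; the source only requires that $h$ preserve order.

```python
from bisect import bisect_left

M = 6            # identifier bits, so the ring has 2**M = 64 positions
RING = 2 ** M


def ichord_key(y, y_max):
    """Order-preserving map from [0, y_max) to integer identifiers in [0, 2**M)."""
    return min(int(y / y_max * RING), RING - 1)


def successor(node_ids, ident):
    """First node whose identifier is >= ident, wrapping around the ring."""
    pos = bisect_left(node_ids, ident % RING)
    return node_ids[pos % len(node_ids)]


def finger_table(node_ids, n):
    """Entry k (k = 1..M) is successor((n + 2**(k-1)) mod 2**M)."""
    return [successor(node_ids, (n + 2 ** (k - 1)) % RING) for k in range(1, M + 1)]


def between(x, a, b, closed_right=False):
    """True if x lies in the ring interval (a, b), or (a, b] when closed_right is set."""
    if closed_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b


def lookup(node_ids, start, key):
    """Route from node `start` towards `key`; return (responsible node, nodes visited)."""
    node, path = start, [start]
    while True:
        if node == key:
            return node, path
        succ = successor(node_ids, node + 1)
        if between(key, node, succ, closed_right=True):
            return succ, path
        # Closest preceding finger: the farthest entry strictly between node and key.
        node = next((f for f in reversed(finger_table(node_ids, node))
                     if between(f, node, key)), succ)
        path.append(node)


nodes = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]   # hypothetical sorted node identifiers
owner, visited = lookup(nodes, 8, 54)
```

With these identifiers, node 8 forwards a request for key 54 to node 42, which forwards it to node 51, which answers with its successor 56. Each hop at least halves the remaining distance on the ring, which is why a routing table of $m$ entries suffices.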
As a further improvement of the present invention, the specific steps of step (6) are:
(6.1) Each node maintains a subscription table that records all subscriptions related to that node. When event information arrives at a node, the node queries its subscription table, and if the event information semantically matches a subscription in the table, the event is actively pushed to the user;
(6.2) When a subscriber submits a subscription request to a node $N_i$ on the ring, that node searches the global index tree to determine the agent nodes semantically related to the request and routes the request to them. Each agent node runs the semantic similarity search algorithm for the subscription, and if it finds event information semantically identical to the request, it passes the event information back to $N_i$. At the same time, the agent node stores the subscription request, subscription condition, routing path, and search range in its subscription table;
(6.3) When a publisher publishes event information to a node $N_j$ on the ring, that node searches the global index tree to determine the host node semantically related to the event, and the event information is routed to and stored on the host node according to the Chord routing protocol. At the same time, the host node semantically matches the attributes of the event against the subscription requests in its subscription table, and on a successful match pushes the event information directly to the subscriber;
(6.4) When a subscriber issues a request to cancel a subscription to a node on the ring, that node searches the global index tree to determine the agent nodes semantically related to the subscription and routes the request to them; each agent node finds and deletes the corresponding subscription in its subscription table.
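The subscription table behaviour in steps (6.1) to (6.4) can be sketched for a single node as below. The source does not define the matching rule, so the sketch assumes a subscription is a centre point plus a radius in the attribute space and that an event matches when its Euclidean distance to the centre is within the radius. The class `AgentNode` and its fields are hypothetical, and routing between nodes is omitted.

```python
import math


class AgentNode:
    """Toy node holding stored events and a subscription table."""

    def __init__(self):
        self.subscriptions = {}  # subscription id -> (subscriber, centre, radius)
        self.events = []         # event vectors stored on this node
        self.pushed = []         # (subscriber, event) pairs delivered so far

    @staticmethod
    def _matches(event, centre, radius):
        # Assumed matching rule: the event lies within `radius` of the centre.
        return math.dist(event, centre) <= radius

    def subscribe(self, sub_id, subscriber, centre, radius):
        """Register the subscription and return already-stored matching events."""
        self.subscriptions[sub_id] = (subscriber, centre, radius)
        return [e for e in self.events if self._matches(e, centre, radius)]

    def publish(self, event):
        """Store the event and push it to every matching subscriber."""
        self.events.append(event)
        for subscriber, centre, radius in self.subscriptions.values():
            if self._matches(event, centre, radius):
                self.pushed.append((subscriber, event))

    def unsubscribe(self, sub_id):
        self.subscriptions.pop(sub_id, None)


node = AgentNode()
node.publish((0.1, 0.1))
hits = node.subscribe("s1", "alice", (0.0, 0.0), 0.5)  # finds the stored event
node.publish((0.3, 0.2))                               # inside the range, pushed
node.publish((0.9, 0.9))                               # outside the range
node.unsubscribe("s1")
node.publish((0.0, 0.1))                               # no subscription left
```

Subscribing returns matches among events that are already stored, while publishing pushes new events to existing subscriptions, which corresponds to the two directions described in (6.2) and (6.3).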
Compared with the prior art, the advantages of the present invention are as follows:
The invention is simple in principle and easy to implement and deploy. Built on structured P2P technology, it can construct a semantics-based publish/subscribe system and improves the fault tolerance, dynamic adaptability, and information distribution efficiency of the system. By mapping event information and subscription information into a multidimensional attribute space, the system supports a semantic understanding of the information itself, which strengthens its expressiveness and enables intelligent matching based on information semantics. At the same time, normalizing and simplifying the multidimensional attribute space and reducing the dimensionality of the high-dimensional data eliminate the curse of dimensionality introduced by the multidimensional attribute space, so data organization and semantics-based intelligent matching can be carried out efficiently.
Description of the drawings
Fig. 1 is a flow chart of the method of the invention.
Fig. 2 is a schematic diagram of mapping high-dimensional data from the high-dimensional space to the one-dimensional space in the invention.
Fig. 3 is a schematic diagram of the data lookup process and the subscription table in the invention.
Fig. 4 is a schematic diagram of subscription request processing in the invention.
Fig. 5 is a schematic diagram of a range query based on information semantics in the invention.
Detailed description of the embodiments
The invention is described in further detail below with reference to the drawings and specific embodiments.
As shown in Fig. 1, the specific steps of the semantic intelligent information publishing and subscribing method based on P2P technology of the invention are:
(1) Build the system topology and set up a super node: using structured P2P technology, the multiple event agents of the P/S system are organized into a Chord ring, and a super node is designated on the ring to extract the attributes of the data resources in the information network and construct the attribute space; the data resources in the information network are abstracted as points or vectors in the high-dimensional attribute space.
(2) Normalize the high-dimensional attribute space: information in present-day networks falls into many categories, and different information has inconsistent attributes. To address this, the vector space model (VSM) is used on the super node to represent each piece of data in the network as a high-dimensional point vector in the high-dimensional attribute space, so that all data in the network are mapped into one high-dimensional attribute space, expressed mathematically as a high-order matrix. To reduce noise and synonym interference during indexing, latent semantic indexing (LSI) is used to remove attributes that have little relevance to information retrieval, and the original high-dimensional attribute space is approximated by an attribute subspace.
(3) Partition the high-dimensional data and build and maintain the global index tree: the super node SN partitions the high-dimensional data in the attribute space into different data accumulation areas and assigns each area to a different node; the super node also maintains an index tree built from the information of all nodes, called the global index tree, which is used to distribute event information to the nodes of the Chord ring and to determine which agent nodes a subscription request must visit.
(4) Reduce the dimensionality of the high-dimensional data: each node normalizes the data accumulation area assigned to it into a multidimensional hypercube in the high-dimensional attribute space, then maps the high-dimensional data objects to a one-dimensional data space using the pyramid dimension-reduction method, and indexes them with a B+ tree.
(5) Manage the data objects with the i-Chord method: after the high-dimensional data set has been mapped to the one-dimensional data space, the Chord protocol is used to organize and maintain the network data. To address the mismatch between the one-dimensional data space and the Chord resource identifier space, an order-preserving function is set up to map data in the one-dimensional data space to the Chord resource identifier space; each Chord node corresponds to one data accumulation area and stores and manages the high-dimensional data in that area, and each node maintains a routing table for fast queries. When the amount of data managed by a node becomes too large, the load balance of the system can be maintained through operations such as node insertion and departure.
(6) Perform semantics-based intelligent subscription and publication: each node in the system maintains a subscription table that records the subscriptions semantically related to that node. When a subscriber sends a subscription request to the system, the global index tree is first searched to determine the agent nodes semantically related to the request, and the request is sent to those agent nodes; each agent node adds a subscription record to its subscription table, registering the association between the request and the node, then determines the exact search range in the high-dimensional attribute space from the subscription request and subscription condition, searches that range for event information semantically identical to the request, and returns it to the subscriber. When a publisher sends event information to the system, the global index tree is first searched to determine the host node semantically related to the event, and the event information is sent to the host node; the host node consults its subscription table, and if the event information semantically matches a subscription in the table, the event is actively pushed to the user.
In this embodiment, the specific steps of step (1) above are:
(1.1) A P/S system may contain multiple event agents; these are organized into an event agent network in the form of a Chord ring, with each event agent corresponding to one node of the network. Each event agent stores data resources of the information network according to a certain rule (the semantic information of the events), serves a certain number of clients, and keeps information about some of the other nodes, so that subscriptions and event information reach semantically related nodes at low cost, efficiently, and reliably.
(1.2) The most capable node in the Chord ring (judged on several aspects such as the node's computing power, storage space, online time, and bandwidth) is selected as the super node SN. SN periodically checks the capabilities of the other nodes in the ring and selects a candidate super node from them; the candidate backs up the important information held on SN so that it can replace SN as the new super node if SN fails.
(1.3) To obtain the semantic information of the data resources in the network, the super node SN is responsible for extracting the attributes of data resources of different forms and constructing the multidimensional attribute space; the data resources in the information network are abstracted as points or vectors in the high-dimensional attribute space.
In this embodiment, the specific steps of step (2) above are:
(2.1) Following the vector space model, each item of high-dimensional data information in the network is described as an attribute vector, whose attributes may be concepts, keywords, terms, or other elements related to the given item. The system then organizes all high-dimensional information in the network into a matrix: if the information network contains d items of high-dimensional data information described by t attributes, it can be described as a $t \times d$ matrix A, each column of which represents one item of high-dimensional information. The matrix element $a_{ij}$ is the value of attribute i in data object j and represents the importance of attribute i in object j (for example, computed from the frequency with which a term occurs in a document); if item j does not have attribute i, $a_{ij}$ is 0.
(2.2) Most elements of the matrix are 0, meaning that most of the attributes used to describe an item of high-dimensional information are useless for retrieval. To reduce unnecessary computation during semantic information retrieval, the initial matrix is replaced by a low-rank approximation. Suppose the set of high-dimensional information in the network is expressed as a matrix A of rank r; the singular value decomposition of A factors it into the product of three matrices:
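$$A = U \Sigma V^{T}$$
where $U = (u_1, u_2, \ldots, u_r)$ is a $t \times r$ matrix, $\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_r)$ is an $r \times r$ diagonal matrix, $V = (v_1, \ldots, v_r)$ is a $d \times r$ matrix, and the $\sigma_i$ are the singular values of A, with $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r$.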
(2.3) To reduce computation, speed up information retrieval, and at the same time avoid interference from synonyms during semantic matching, only the l largest singular values of the matrix are retained and the others are discarded, approximating the rank-r matrix A by the rank-l matrix $A_l$:
$$A_l = U_l \Sigma_l V_l^{T}$$
where $U_l = (u_1, u_2, \ldots, u_l)$, $\Sigma_l = \mathrm{diag}(\sigma_1, \ldots, \sigma_l)$, $V_l = (v_1, \ldots, v_l)$, and the rows of $V_l \Sigma_l$ are the semantic vectors of the high-dimensional information. Retrieval of high-dimensional information can then be carried out in this low-rank semantic matrix.
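As an illustration of step (2.1), the following minimal sketch builds the attribute matrix A from a few text items. It assumes that attributes are whitespace-separated terms and that importance is the raw term frequency; the function name `build_attribute_matrix` and the sample documents are hypothetical. The singular value decomposition of steps (2.2) and (2.3) is not shown and would normally be delegated to a numerical library.

```python
from collections import Counter

def build_attribute_matrix(documents):
    """Return (terms, A), where A is a t x d matrix stored as a list of rows.

    A[i][j] is the number of times term i occurs in document j (an assumed
    importance measure); 0 means document j does not have attribute i.
    """
    counts = [Counter(doc.lower().split()) for doc in documents]
    terms = sorted(set().union(*counts))          # the t attributes
    A = [[c[term] for c in counts] for term in terms]
    return terms, A

# Hypothetical sample data: d = 3 documents.
docs = ["chord ring routing", "semantic routing routing", "semantic index"]
terms, A = build_attribute_matrix(docs)
for term, row in zip(terms, A):
    print(term, row)
```

Each column of `A` is the attribute vector of one document, matching the column-per-item convention used above.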
In this embodiment, the specific steps of step (3) above are:
(3.1) The data resources of the information network are abstracted as points or vectors in the high-dimensional attribute space. On the principle that data resources close to each other in the attribute space have similar semantics, the super node SN clusters and partitions the high-dimensional data distributed in the multidimensional attribute space, dividing it into multiple mutually disjoint data clustering regions, and assigns each region to a different node.
(3.2) In addition to its own routing table, the super node SN maintains an index tree built from the resource identifier range information of all nodes on the ring, called the global index GI. GI serves the following two purposes:
(3.2.1) Quickly distributing event information to nodes. Each node in the ring is responsible for one data clustering region, and all data information within that region is assigned to that node. When a request to add event information arrives, GI is queried to determine which data clustering region the event information belongs to, which determines the host node of the event information; then, using the identifier of the host node as the search key, the event information is delivered to the host node via the Chord routing protocol.
(3.2.2) Determining the nodes a subscription request needs to visit. When a subscription request is submitted at a node, the request is first sent to SN to search GI, which determines which nodes are responsible for data clustering regions that intersect the semantic space of the request and returns the identifiers of those nodes. The returned nodes are called agent nodes and may hold subscription results. Then, using the agent nodes' identifiers as search keys, the subscription request is routed to those nodes for further semantic matching.
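The two uses of GI described in (3.2.1) and (3.2.2) can be sketched as follows. This is a simplified illustration, not the patented index structure: it assumes each data clustering region is an axis-aligned box and keeps the regions in a flat list rather than a tree. The names `Region`, `GlobalIndex`, `host_node`, and `agent_nodes` are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence

@dataclass
class Region:
    node_id: int            # identifier of the node responsible for the region
    low: Sequence[float]    # lower corner of the (assumed) bounding box
    high: Sequence[float]   # upper corner of the (assumed) bounding box

    def contains(self, point):
        return all(l <= x <= h for l, x, h in zip(self.low, point, self.high))

    def intersects(self, q_low, q_high):
        return all(l <= qh and ql <= h
                   for l, h, ql, qh in zip(self.low, self.high, q_low, q_high))

class GlobalIndex:
    def __init__(self, regions):
        self.regions = list(regions)

    def host_node(self, point):
        """Use (3.2.1): node whose region contains the event (first match wins)."""
        for region in self.regions:
            if region.contains(point):
                return region.node_id
        return None

    def agent_nodes(self, q_low, q_high):
        """Use (3.2.2): nodes whose regions intersect the subscription's range."""
        return [r.node_id for r in self.regions if r.intersects(q_low, q_high)]

# Hypothetical two-region example in a 2-dimensional attribute space.
gi = GlobalIndex([Region(10, (0.0, 0.0), (0.5, 1.0)),
                  Region(20, (0.5, 0.0), (1.0, 1.0))])
print(gi.host_node((0.2, 0.3)))                 # event falls in node 10's region
print(gi.agent_nodes((0.4, 0.4), (0.6, 0.6)))   # subscription spans both regions
```

The returned identifiers would then serve as Chord search keys for routing the event or subscription.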
In this embodiment, the specific steps of step (4) above are:
(4.1) Each data clustering region is regularized into a d-dimensional hypercube in the high-dimensional data space, and the hypercube is normalized so that its side length in every dimension is 1 and its center point is (0.5, 0.5, ..., 0.5). Then, taking the center of the hypercube as the apex and each (d-1)-dimensional hyperplane of the data clustering region as a base, each d-dimensional data clustering region (hypercube) is divided into 2d pyramids.
(4.2) Each pyramid is divided into several slices parallel to its base, each slice corresponding to one data page of a B+ tree. The high-dimensional data in the data clustering region is then mapped to a one-dimensional value according to the distance from each data point to the base of its pyramid, and the dimension-reduced data is organized and managed with a B+ tree. The dimensionality reduction formula is:
$$y_v = i \cdot c + (j + h_v) = i \cdot c + (j + |0.5 - h_v'|)$$
where i is the number of the cube containing the high-dimensional data object v (distinguishing the different data clustering regions), j is the pyramid number, $h_v'$ is the distance from v to the base plane of its pyramid, and $h_v = |0.5 - h_v'|$ is the distance from v to the plane through the pyramid apex. After dimensionality reduction, the one-dimensional values of the data objects in each high-dimensional cube are confined to the interval $[i \cdot c, (i+1) \cdot c]$, where c is a sufficiently large constant, which guarantees that the data objects in each data partition have index keys different from those of other partitions.
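The mapping of step (4.2) can be sketched as below. The text does not state how the pyramid number j is chosen, so this sketch assumes the usual Pyramid-Technique rule: the pyramid is selected by the coordinate that deviates most from the center 0.5, with numbers 0..d-1 for the low side and d..2d-1 for the high side. Under that rule $h_v$ equals the deviation of that coordinate from 0.5. The function name `pyramid_key` and the requirement c ≥ 2d are likewise assumptions.

```python
def pyramid_key(v, cube_no, c):
    """Map a point v of a normalized unit hypercube to y_v = i*c + (j + h_v).

    cube_no is the cube number i; c is the per-cube key range constant.
    """
    d = len(v)
    if c < 2 * d:
        # j + h_v can approach 2d, so a smaller c would let cube ranges overlap.
        raise ValueError("c must be at least 2*d")
    # Dimension with the largest deviation from the center (assumed rule).
    k = max(range(d), key=lambda t: abs(0.5 - v[t]))
    h_v = abs(0.5 - v[k])               # distance to the plane through the apex
    j = k if v[k] < 0.5 else k + d      # pyramid number in 0 .. 2d-1
    return cube_no * c + (j + h_v)

# Hypothetical 2-dimensional points with c = 4.
print(pyramid_key((0.1, 0.5), cube_no=0, c=4))
print(pyramid_key((0.5, 0.9), cube_no=1, c=4))
```

Because j + h_v stays below 2d, any c of at least 2d keeps the keys of cube i inside its own interval.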
In this embodiment, the specific steps of step (5) above are:
(5.1) Suppose the size of the Chord resource identifier space is $2^m$. To manage the dimension-reduced one-dimensional data with Chord resource identifiers, an order-preserving (monotonic) function h maps the one-dimensional values, in order, into the interval $[0, 2^m)$. That is, for a high-dimensional data point v in the attribute space (belonging to the hypercube with cube number i), the i-Chord resource identifier of v is:
$$key_v = \mathrm{ichord}(v) = h(y_v) = h(i \cdot c + (j + h_v)) \in [0, 2^m)$$
(5.2) Each node $N_i$ on the Chord ring is responsible for storing and managing the high-dimensional data information of one data clustering region. Let the node identifier of $N_i$ be $Nkey_i$; the resource identifiers of the high-dimensional data information in this region are distributed in the interval $(Nkey_{i-1}, Nkey_i]$.
(5.3) Each node $N_i$ maintains a routing table, i.e. a finger table, that points to other nodes on the ring. The routing table has m entries (m being the number of identifier bits), where the k-th entry ($1 \le k \le m$) is the first node on the Chord ring whose identifier is equal to or greater than $(Nkey_i + 2^{k-1}) \bmod 2^m$, i.e. $\mathrm{successor}((Nkey_i + 2^{k-1}) \bmod 2^m)$. When a node receives a request with keyword key, it first checks whether its own identifier equals key, and if so returns directly. Otherwise the node searches its routing table for the node with the largest identifier that does not exceed key and forwards the query to that node. This process repeats until the request reaches a node $N_k$ such that key lies between $N_k$ and its successor $N_{k+1}$, at which point $N_k$ reports its successor $N_{k+1}$ as the reply to the request.
(5.4) The range of resource identifiers a node is responsible for is limited, so a node cannot maintain an unlimited amount of high-dimensional data information. When the high-dimensional data in a data clustering region becomes so dense that the node can no longer manage it effectively, the region is split into two or more new regions, and idle nodes on the Chord ring are selected to share the load of the original node. When the data in adjacent data clustering regions is so sparse that node resources are wasted, the regions are merged: one of the two executing nodes is selected as the executing node of the new data clustering region and the other is set as an idle node. These operations effectively achieve system load balancing.
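The finger table and lookup procedure of (5.3) can be illustrated with a small single-process simulation. This is a sketch under assumptions: the ring is a sorted list of node identifiers held in memory, m = 6, and the sample identifiers and the helper names (`successor`, `finger_table`, `lookup`) are illustrative only; a real deployment would exchange messages between nodes.

```python
from bisect import bisect_left

M = 6                 # number of identifier bits (assumed for the example)
RING = 2 ** M         # size of the identifier space

def successor(nodes, ident):
    """First node whose identifier is >= ident, wrapping around the ring."""
    i = bisect_left(nodes, ident % RING)
    return nodes[i % len(nodes)]

def finger_table(nodes, n):
    """Entry k (1 <= k <= M) is successor((n + 2^(k-1)) mod 2^M)."""
    return [successor(nodes, (n + 2 ** (k - 1)) % RING) for k in range(1, M + 1)]

def in_interval(x, a, b, closed_right):
    """True if x lies in the ring interval (a, b) or (a, b]."""
    if a < b:
        return a < x < b or (closed_right and x == b)
    return x > a or x < b or (closed_right and x == b)

def lookup(nodes, start, key):
    """Return (responsible node, list of nodes visited) for a key."""
    n, hops = start, [start]
    while True:
        if key == n:
            return n, hops
        succ = successor(nodes, (n + 1) % RING)
        if in_interval(key, n, succ, closed_right=True):
            return succ, hops
        # Forward to the largest finger that still precedes the key.
        nxt = next((f for f in reversed(finger_table(nodes, n))
                    if in_interval(f, n, key, closed_right=False)), n)
        if nxt == n:
            return succ, hops
        n = nxt
        hops.append(n)

nodes = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]   # sorted sample identifiers
print(finger_table(nodes, 8))
print(lookup(nodes, 8, 54))
```

Each hop at least halves the remaining identifier distance to the key, which is why the number of hops grows only logarithmically with the number of nodes.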
In this embodiment, the specific steps of step (6) above are:
(6.1) To implement the semantic-based information publish/subscribe function of the system, each node maintains a subscription table recording all subscription information related to that node. When event information arrives at a node, the node's subscription table is queried; if the event information semantically matches a subscription in the table, the event is actively pushed to the user.
(6.2) When a subscriber submits a subscription request to a node $N_i$ on the ring, that node searches the global index tree to determine the agent nodes semantically related to the request and routes the request to them. Each agent node executes a similarity search algorithm based on the subscription semantics (such as a point query, range query, k-nearest-neighbor query, or nearest-neighbor query); if event information semantically identical to the subscription request is found, it is passed back to $N_i$. At the same time, the agent node stores the subscription request, subscription conditions, routing path, search range, and related information in its subscription table.
(6.3) When a publisher publishes event information to a node $N_j$ on the ring, that node searches the global index tree to determine the host node semantically related to the event information, and the event information is routed to and stored on the host node according to the Chord routing protocol. At the same time, the host node semantically matches the event attribute information against the subscription requests in its subscription table; if a match succeeds, the event information is pushed directly to the subscriber.
(6.4) When a subscriber issues an unsubscribe message to a node on the ring, that node searches the global index tree to determine the agent nodes semantically related to the subscription and routes the request to them; each agent node finds the corresponding subscription in its subscription table and deletes it.
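Steps (6.1) to (6.4) can be sketched for a single node as follows. The sketch assumes that a subscription is a query point with a radius and that "semantic matching" means Euclidean distance within that radius; routing between nodes is omitted. The names `Subscription`, `Node`, and `pushed` are hypothetical.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Subscription:
    subscriber: str
    center: tuple      # query point in the (reduced) attribute space
    radius: float      # assumed matching condition: distance <= radius

@dataclass
class Node:
    events: list = field(default_factory=list)
    subscriptions: dict = field(default_factory=dict)   # the subscription table
    pushed: list = field(default_factory=list)          # (subscriber, event) log

    def subscribe(self, sub_id, sub):
        """(6.2): register the subscription and return already-stored matches."""
        self.subscriptions[sub_id] = sub
        return [e for e in self.events if math.dist(e, sub.center) <= sub.radius]

    def publish(self, event):
        """(6.3): store the event and push it to every matching subscriber."""
        self.events.append(event)
        for sub in self.subscriptions.values():
            if math.dist(event, sub.center) <= sub.radius:
                self.pushed.append((sub.subscriber, event))

    def unsubscribe(self, sub_id):
        """(6.4): delete the subscription record if present."""
        self.subscriptions.pop(sub_id, None)

node = Node()
node.publish((0.1, 0.1))
matches = node.subscribe("s1", Subscription("alice", (0.0, 0.0), 0.5))
node.publish((0.2, 0.2))     # matches, so it is pushed to alice
node.publish((0.9, 0.9))     # outside the radius, not pushed
node.unsubscribe("s1")
node.publish((0.0, 0.1))     # no subscription left, nothing pushed
print(matches, node.pushed)
```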
The present invention is described in detail below with a specific application example, whose detailed implementation steps are:
1) Network data acquisition: Networks today contain massive digital resources in different formats (such as text, images, music, and video), and the system cannot directly understand the semantic information of these resources. The digital resources in the network are therefore first collected and formalized so that they can be recognized and further processed by machines.
2) Constructing the system topology and establishing the super node: Multiple event agents are organized into an event agent network with a Chord ring structure, each event agent corresponding to one node in the network. Each event agent stores data resources of the information network according to a certain rule (the semantic information of events), serves a certain number of clients, and keeps information about some of the other nodes, so that subscriptions and event information reach semantically related nodes at low cost, efficiently, and reliably. The node with the strongest capability in the system (considering computing power, online time, storage space, bandwidth, and other factors) is set as the super node SN, and a backup node for the super node is determined, to guarantee normal operation of the system when the super node fails. To obtain the semantic information of the data resources in the network, the super node SN is responsible for extracting the attributes of data resources of different forms to obtain a multidimensional attribute space, in which the data resources of the information network are abstracted as points or vectors of the high-dimensional attribute space.
3) Normalizing the high-dimensional attribute space: According to the vector space model (VSM), the attribute information of data objects is extracted and each data object in the network is described as a high-dimensional attribute vector, mapping the network's digital resources into the high-dimensional attribute space. The set of network data resources can be expressed as a matrix in which each element value represents the importance of an attribute in a data object; for example, a set of d data objects in the network described by t attributes is a $t \times d$ matrix A.
In the information retrieval process, network data resources represented in the high-dimensional attribute space have many attributes irrelevant to a query (which appear as many zero elements in the matrix). To speed up queries while avoiding synonymy and noise, the matrix is simplified by replacing the initial matrix with a low-rank approximation. Suppose the set of data resources in the network is expressed as a matrix A of rank r. The singular value decomposition of A factors it into the product of three matrices: $A = U \Sigma V^{T}$, where $U = (u_1, u_2, \ldots, u_r)$ is a $t \times r$ matrix, $\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_r)$ is an $r \times r$ diagonal matrix, $V = (v_1, \ldots, v_r)$ is a $d \times r$ matrix, and the $\sigma_i$ are the singular values of A, with $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r$. Then only the l largest singular values are retained and the others discarded, approximating the rank-r matrix A by the rank-l matrix $A_l$: $A_l = U_l \Sigma_l V_l^{T}$, where $U_l = (u_1, u_2, \ldots, u_l)$, $\Sigma_l = \mathrm{diag}(\sigma_1, \ldots, \sigma_l)$, $V_l = (v_1, \ldots, v_l)$, and the rows of $V_l \Sigma_l$ are the semantic vectors of the high-dimensional information, so that network resources can be expressed as vectors in an l-dimensional attribute space.
4) Clustering and partitioning the data and establishing the global index tree: On the principle that data objects closer together in the attribute space are more likely to have similar semantics, the data objects in the attribute space are clustered and partitioned. The high-dimensional data in the space is divided into multiple mutually disjoint data clustering regions, keeping the data within each region as close to uniformly distributed as possible.
In addition, the super node maintains the global index tree GI, which holds the range information of the data clustering region corresponding to each node. When a new data object joins the system, GI quickly determines from the object's coordinates which node it belongs to, so that the new object can be promptly delivered to that node. When a new subscription request arrives, the semantic space range of the request is determined and GI is queried to quickly identify the data clustering regions that intersect the request; the request is then sent to the relevant nodes for further precise querying, which speeds up queries. The global index GI should also be refreshed periodically to avoid the effects of changes in system nodes.
5) Dimensionality reduction of high-dimensional data: To eliminate the "curse of dimensionality" that affects information retrieval in the high-dimensional attribute space, the high-dimensional data information is reduced in dimension, as shown in Fig. 2.
The data clustering regions produced by data partitioning may be irregular in shape. Each region is regularized into a high-dimensional hypercube, each hypercube is assigned a cube number i, and normalization is applied so that the cube's side length is 1. Then, taking the center of the hypercube as the apex and each (d-1)-dimensional hyperplane of the data clustering region as a base, each d-dimensional data clustering region (hypercube) is divided into 2d pyramids, and each pyramid is assigned a pyramid number j.
Each pyramid is divided into several slices parallel to its base, each slice corresponding to one data page of a B+ tree. The high-dimensional data in the data clustering region is then mapped to a one-dimensional value according to the distance from each data point to the base of its pyramid, as shown in Fig. 2, and the dimension-reduced data is organized and managed with a B+ tree. The dimensionality reduction formula is $y_v = i \cdot c + (j + h_v) = i \cdot c + (j + |0.5 - h_v'|)$. After dimensionality reduction, the one-dimensional values of the data objects in each high-dimensional cube are confined to the interval $[i \cdot c, (i+1) \cdot c]$, where c is a sufficiently large constant, which guarantees that the data objects in each data partition have index keys different from those of other partitions.
6) Managing data objects with the i-Chord method: The Chord method is used to store and manage network data information, which distributes information retrieval processing and improves the efficiency of the system. First, the resource identifier of each data object is constructed from the one-dimensional data generated in part 5). Suppose the size of the Chord resource identifier space is $2^m$; an order-preserving function h maps the one-dimensional data values, in order, into the interval $[0, 2^m)$. The mapping formula is: $key_v = \mathrm{ichord}(v) = h(y_v) = h(i \cdot c + (j + h_v)) \in [0, 2^m)$.
To facilitate data queries, each node on the Chord ring stores and manages the data objects of one data clustering region, and the data information corresponding to a segment of resource identifiers on the ring is stored on that segment's successor node. For example, if the node identifier of node $N_i$ is $Nkey_i$, the high-dimensional data objects whose resource identifiers lie in the interval $(Nkey_{i-1}, Nkey_i]$ are stored on node $N_i$. Each node maintains a routing table, and the system uses the routing tables to look up the data object corresponding to a given resource identifier (Fig. 3 illustrates the process of looking up key=28 starting from node N4). To maintain system load balance, when the data objects in a data clustering region become too dense or too sparse, the region is split or merged and the corresponding node operations are carried out at the same time, to keep system performance stable and resources optimally used.
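The text does not specify the order-preserving function h. One simple candidate, shown here purely as an assumption, is linear scaling of the key range $[0, n \cdot c)$ of n cubes onto the integers $[0, 2^m)$. The function name `ichord_key` and its parameters are hypothetical.

```python
def ichord_key(y, num_cubes, c, m):
    """Order-preserving map from y in [0, num_cubes*c) to an int in [0, 2**m).

    Linear scaling is only one possible choice of h; it is non-decreasing,
    so distinct y values may share an identifier when m is small.
    """
    y_max = num_cubes * c
    if not 0 <= y < y_max:
        raise ValueError("y is outside the key range of the cubes")
    return min(int(y / y_max * 2 ** m), 2 ** m - 1)

# Hypothetical values: 2 cubes, c = 4, 6-bit identifiers.
ys = [0.4, 2.1, 3.05, 7.4]
keys = [ichord_key(y, num_cubes=2, c=4, m=6) for y in ys]
print(keys)
```

Because the mapping never reverses the order of two values, a contiguous range of one-dimensional keys lands on a contiguous arc of the Chord ring.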
7) Implementing semantic-based publish/subscribe: The processing of a subscription request is shown in Fig. 4. Each node maintains a subscription table (as shown in Fig. 3). When a user sends a subscription request to a node, the system checks the subscription table. If no identical subscription request exists, the system searches the global index tree to determine the agent nodes semantically related to the request, routes the request to those agent nodes, and executes the similarity search algorithm on them; event information semantically identical to the request is found and returned to the user, and the user's subscription request, the search results, and related information are added to the subscription table of the agent node. If an identical user request already exists in the subscription table, the result is looked up quickly from the subscription table and returned, and the subscription table is updated. When new event information joins the system, it is stored on the corresponding host node according to its resource identifier, and the host node's subscription table is checked for subscriptions that semantically match it; if one exists, the data object is actively pushed to the corresponding user. When a user sends an unsubscribe message, the system looks up the subscription table and deletes the corresponding subscription information.
Similarity search methods fall into four classes: the point query (Point Query), which finds the target object p in data space S identical to a given query point q; the range query (Range Query), which, for a given query point q and threshold r, finds all target objects p in data space S satisfying d(p, q) ≤ r; the nearest-neighbor query (Nearest Neighbor Query), which finds the target object p in data space S nearest to a given query point q; and the k-nearest-neighbor query (KNN Query), which finds the k target objects p in data space S nearest to a given query point q.
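The four query classes can be stated compactly as brute-force reference functions. This sketch assumes d(p, q) is the Euclidean distance and scans the whole data space S; it shows only the semantics of each query, not the indexed search described in this example. The function names are illustrative.

```python
import heapq
import math

def point_query(S, q):
    """Objects identical to the query point q."""
    return [p for p in S if p == q]

def range_query(S, q, r):
    """All objects p with d(p, q) <= r."""
    return [p for p in S if math.dist(p, q) <= r]

def nearest_neighbor_query(S, q):
    """The single object nearest to q."""
    return min(S, key=lambda p: math.dist(p, q))

def knn_query(S, q, k):
    """The k objects nearest to q, closest first."""
    return heapq.nsmallest(k, S, key=lambda p: math.dist(p, q))

S = [(0, 0), (1, 0), (0, 2), (3, 3)]   # hypothetical data space
q = (0, 0)
print(point_query(S, q), range_query(S, q, 1))
print(nearest_neighbor_query(S, q), knn_query(S, q, 3))
```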
Take the range query as an example, as shown in Fig. 5 (using a two-dimensional space for illustration). First, the query range (the circular area in the figure) is derived from the coordinates of the query point and its query radius, and it is determined whether the range intersects any data clustering region in the data space. If a data clustering region intersects the query range, the two boundary index keys $y_{low}$ and $y_{high}$ at which each pyramid in the region intersects the query range are determined, and a range search is performed in the B+ tree using these two keys, yielding a point set (the points in the shaded area of the figure). Each data point in the point set is then checked precisely for membership in the query range, and finally the result is output.
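The filter-and-refine procedure above can be sketched for a single normalized hypercube (cube number 0). Several simplifications are assumed: a sorted list searched with `bisect` stands in for the B+ tree, the pyramid number follows the largest-deviation rule of the standard Pyramid-Technique, and the bounds $y_{low}$ and $y_{high}$ are derived conservatively from the bounding box of the query circle using only each pyramid's own dimension, so the candidate set may be larger than with tighter bounds. All function names are hypothetical.

```python
import math
from bisect import bisect_left, bisect_right

def pyramid_value(v):
    """Key j + h_v of a point v in the unit hypercube (assumed pyramid rule)."""
    d = len(v)
    k = max(range(d), key=lambda t: abs(0.5 - v[t]))
    return (k if v[k] < 0.5 else k + d) + abs(0.5 - v[k])

def candidate_intervals(q_low, q_high):
    """Conservative (y_low, y_high) for every pyramid the query box can touch."""
    d = len(q_low)
    intervals = []
    for k in range(d):
        lo, hi = max(q_low[k], 0.0), min(q_high[k], 1.0)
        if lo < 0.5:      # pyramid k covers the side x_k < 0.5
            intervals.append((k + max(0.0, 0.5 - hi), k + (0.5 - lo)))
        if hi >= 0.5:     # pyramid k + d covers the side x_k >= 0.5
            intervals.append((k + d + max(0.0, lo - 0.5), k + d + (hi - 0.5)))
    return intervals

def pyramid_range_query(points, q, r):
    index = sorted((pyramid_value(p), p) for p in points)   # B+ tree stand-in
    keys = [key for key, _ in index]
    q_low = [x - r for x in q]
    q_high = [x + r for x in q]
    result = []
    for y_low, y_high in candidate_intervals(q_low, q_high):
        # Filter step: one-dimensional range search between y_low and y_high.
        lo, hi = bisect_left(keys, y_low), bisect_right(keys, y_high)
        # Refine step: exact distance test on the candidate points.
        result.extend(p for _, p in index[lo:hi] if math.dist(p, q) <= r)
    return result

pts = [(0.1, 0.5), (0.5, 0.9), (0.45, 0.55), (0.9, 0.1), (0.6, 0.5)]
print(sorted(pyramid_range_query(pts, (0.5, 0.5), 0.15)))
```

The refine step guarantees that the final answer is exact even though the one-dimensional filter is only approximate.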
The above is only a preferred embodiment of the present invention. The protection scope of the present invention is not limited to the above embodiment; all technical solutions falling under the idea of the present invention belong to its protection scope. It should be noted that, for those of ordinary skill in the art, improvements and modifications made without departing from the principle of the present invention should also be regarded as falling within the protection scope of the present invention.

Claims (7)

1. A semantic intelligent information publishing and subscribing method based on P2P technology, characterized in that the steps are:
(1) Constructing the system topology and establishing a super node: structured P2P technology is used to organize the topology of the multiple event agents in a P/S system into a Chord ring structure, and a super node is designated on the ring for extracting the attributes of the data resources of the information network and constructing the attribute space, the data resources of the information network being abstracted as points or vectors in the high-dimensional attribute space;
(2) Normalizing the high-dimensional attribute space: on the super node, the vector space model is used to express the data information in the network as high-dimensional point vectors in the high-dimensional attribute space, so that all data information in the network is mapped into one high-dimensional attribute space and is expressed mathematically as a high-order matrix; latent semantic indexing is used to remove information attributes of little relevance to information retrieval, approximately replacing the original high-dimensional attribute space with an attribute subspace;
(3) Partitioning the high-dimensional data and establishing and maintaining the global index tree: the super node SN divides the high-dimensional data in the attribute space into different data clustering regions and assigns each region to a different node; the super node also maintains an index tree built from the information of all nodes, called the global index tree, which is used to distribute event information to nodes in the Chord ring and to determine the agent nodes a subscription request needs to visit;
(4) Dimensionality reduction of the high-dimensional data: each node regularizes the data clustering region assigned to it into a multidimensional hypercube in the high-dimensional attribute space, then uses the pyramid dimensionality reduction method to map the high-dimensional data objects to a one-dimensional data space, and indexes them with a B+ tree;
(5) Managing data objects with the i-Chord method: after the high-dimensional data set has been mapped to the one-dimensional data space, the Chord protocol is used to organize and maintain the network data information; an order-preserving function is set to map the data in the one-dimensional data space to the Chord resource identifier space; each Chord node corresponds to one data clustering region and stores and manages the high-dimensional data in that region, and each node maintains a routing table to facilitate fast information queries;
(6) Implementing semantic-based intelligent information subscription and publication: each node in the system maintains a subscription table that records the subscription information semantically related to that node; when a subscriber sends a subscription request to the system, the global index tree is first searched to determine the agent nodes semantically related to the request, the request is sent to those agent nodes, and each agent node adds a subscription record to its subscription table, registering the association between the subscription request and the node; the agent node then determines a precise search range in the high-dimensional attribute space according to the subscription request and subscription conditions, searches that range for event information semantically identical to the subscription request, and returns it to the subscriber; when a publisher sends event information to the system, the global index tree is first searched to determine the host node semantically related to the event, the event information is sent to the host node, and the host node's subscription table is consulted; if the event information semantically matches a subscription in the table, the event is actively pushed to the user.
2. The semantic intelligent information publishing and subscribing method based on P2P technology according to claim 1, characterized in that the specific steps of said step (1) are:
(1.1) a P/S system may contain multiple event agents; the event agents are organized into an event agent network with a Chord ring structure, each event agent corresponding to one node in the network; each event agent stores data resources of the information network according to a certain rule and keeps information about some of the other nodes;
(1.2) the node with the strongest capability in the Chord ring is selected as the super node SN; the super node SN periodically checks the capability of the other nodes in the ring and selects a candidate super node from among them, and the candidate super node backs up the important information on SN;
(1.3) the super node SN is responsible for extracting the attributes of data resources of different forms and constructing a multidimensional attribute space, in which the data resources of the information network are abstracted as points or vectors of the high-dimensional attribute space.
3. The semantic intelligent information publishing and subscribing method based on P2P technology according to claim 1, characterized in that the specific steps of said step (2) are:
(2.1) following the vector space model, each item of high-dimensional data information in the network is described as an attribute vector, and the system then organizes all high-dimensional information in the network into a matrix; if the information network contains d items of high-dimensional data information described by t attributes, it is described as a $t \times d$ matrix A, each column of which represents one item of high-dimensional information; the matrix element $a_{ij}$ is the value of attribute i in data object j and represents the importance of attribute i in object j; if item j does not have attribute i, $a_{ij}$ is 0;
(2.2) most elements of the matrix are 0, meaning that most of the attributes used to describe an item of high-dimensional information are useless for retrieval, so the initial matrix is replaced by a low-rank approximation; suppose the set of high-dimensional information in the network is expressed as a matrix A of rank r; the singular value decomposition of A factors it into the product of three matrices:
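$$A = U \Sigma V^{T}$$
where $U = (u_1, u_2, \ldots, u_r)$ is a $t \times r$ matrix, $\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_r)$ is an $r \times r$ diagonal matrix, $V = (v_1, \ldots, v_r)$ is a $d \times r$ matrix, and the $\sigma_i$ are the singular values of A, with $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r$;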
(2.3) only the l largest singular values of the matrix are retained and the others are discarded, approximating the rank-r matrix A by the rank-l matrix $A_l$:
$$A_l = U_l \Sigma_l V_l^{T}$$
where $U_l = (u_1, u_2, \ldots, u_l)$, $\Sigma_l = \mathrm{diag}(\sigma_1, \ldots, \sigma_l)$, $V_l = (v_1, \ldots, v_l)$, and the rows of $V_l \Sigma_l$ are the semantic vectors of the high-dimensional information.
4. The semantic intelligent information publishing and subscribing method based on P2P technology according to claim 1, characterized in that the specific steps of said step (3) are:
(3.1) the data resources of the information network are abstracted as points or vectors in the high-dimensional attribute space; on the principle that data resources close to each other in the attribute space have similar semantics, the super node SN clusters and partitions the high-dimensional data distributed in the multidimensional attribute space, dividing it into multiple mutually disjoint data clustering regions, and assigns each region to a different node;
(3.2) in addition to its own routing table, the super node SN maintains an index tree built from the resource identifier range information of all nodes on the ring, called the global index GI;
(3.2.1) quickly distributing event information to nodes; each node in the ring is responsible for one data clustering region, and all data information within that region is assigned to that node; when a request to add event information arrives, GI is queried to determine which data clustering region the event information belongs to, which determines the host node of the event information; then, using the identifier of the host node as the search key, the event information is delivered to the host node via the Chord routing protocol;
(3.2.2) determining the nodes a subscription request needs to visit; when a subscription request is submitted at a node, the request is first sent to SN to search GI, which determines which nodes are responsible for data clustering regions that intersect the semantic space of the request and returns the identifiers of those nodes; the returned nodes are called agent nodes; then, using the agent nodes' identifiers as search keys, the subscription request is routed to those nodes for further semantic matching.
5. The semantic intelligent information publishing and subscribing method based on P2P technology according to claim 1, characterized in that the specific steps of said step (4) are:
(4.1) each data clustering region is regularized into a d-dimensional hypercube in the high-dimensional data space, and the hypercube is normalized so that its side length in every dimension is 1 and its center point is (0.5, 0.5, ..., 0.5); then, taking the center of the hypercube as the apex and each (d-1)-dimensional hyperplane of the data clustering region as a base, each d-dimensional data clustering region is divided into 2d pyramids;
(4.2) each pyramid is divided into several slices parallel to its base, each slice corresponding to one data page of a B+ tree; the high-dimensional data in the data clustering region is then mapped to a one-dimensional value according to the distance from each data point to the base of its pyramid, and the dimension-reduced data is organized and managed with a B+ tree; the dimensionality reduction formula is:
$$y_v = i \cdot c + (j + h_v) = i \cdot c + (j + |0.5 - h_v'|)$$
where i is the number of the cube containing the high-dimensional data object v, j is the pyramid number, $h_v'$ is the distance from v to the base plane of its pyramid, and $h_v = |0.5 - h_v'|$ is the distance from v to the plane through the pyramid apex; after dimensionality reduction, the one-dimensional values of the data objects in each high-dimensional cube are confined to the interval $[i \cdot c, (i+1) \cdot c]$, where c is a sufficiently large constant, guaranteeing that the data objects in each data partition have index keys different from those of other partitions.
6. The semantic intelligent information publishing and subscribing method based on P2P technology according to claim 1, characterized in that the specific steps of step (5) are:
(5.1) Suppose the resource identifier space of the Chord system has size $2^m$. Use an order-preserving function $h$ to map the dimension-reduced one-dimensional values, in order, into the interval $[0, 2^m)$; that is, for a high-dimensional data point $v$ in the attribute space, the i-Chord resource identifier of $v$ is:
$key_v = \mathrm{ichord}(v) = h(y_v) = h(i \cdot c + (j + h_v)) \in [0, 2^m)$
(5.2) Each node $N_i$ on the Chord ring is responsible for storing and managing the high-dimensional data information of one data gathering district. Let $Nkey_i$ be the node identifier of $N_i$; the resource identifiers of the high-dimensional data information in this district are distributed in the interval $(Nkey_{i-1}, Nkey_i]$;
(5.3) Each node $N_i$ maintains a routing table, i.e. a pointer table, that points to other nodes on the ring. The routing table has m entries, where the k-th entry ($1 \le k \le m$) is the first node on the Chord ring whose identifier is equal to or greater than $(Nkey_i + 2^{k-1}) \bmod 2^m$, i.e. $\mathrm{successor}((Nkey_i + 2^{k-1}) \bmod 2^m)$. When any node receives a request whose keyword is key, it first checks whether its own node identifier equals key, and if so returns directly. Otherwise, the node searches its routing table for the node with the largest identifier that does not exceed key and forwards the query request to that node. This process is repeated until the request reaches a node $N_k$ such that key lies between $N_k$ and its successor node $N_{k+1}$, at which point $N_k$ reports its successor node $N_{k+1}$ as the reply to the request;
(5.4) When the high-dimensional data in a data gathering district is so dense that its node cannot manage it effectively, split the district into two or more new data gathering districts and correspondingly select idle nodes on the Chord ring to share the load of the original node. When the data in adjacent data gathering districts is so sparse that node resources are wasted, merge the districts, select one of the two responsible nodes as the node responsible for the new data gathering district, and set the other as an idle node.
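Steps (5.1) to (5.3) can be illustrated with a single-process simulation. The claim only requires h to be order-preserving, so the linear scaling in `order_preserving_id` and its upper bound `y_max` are assumptions, as are all function names. The routing loop follows the standard Chord lookup: a node answers directly if it owns the key, reports its successor if the key lies in the ring interval between itself and that successor, and otherwise forwards to the farthest routing-table entry that still precedes the key.

```python
from bisect import bisect_left
from typing import List, Tuple


def order_preserving_id(y: float, y_max: float, m: int) -> int:
    """Monotonically map a reduced key y in [0, y_max) into [0, 2**m)."""
    return min(int(y / y_max * (1 << m)), (1 << m) - 1)


def successor(nodes: List[int], ident: int) -> int:
    """First node identifier >= ident on the ring, wrapping past 2**m - 1."""
    return nodes[bisect_left(nodes, ident) % len(nodes)]


def finger_table(nodes: List[int], n: int, m: int) -> List[int]:
    """Entry k (1 <= k <= m) is successor((n + 2**(k-1)) mod 2**m)."""
    return [successor(nodes, (n + (1 << (k - 1))) % (1 << m))
            for k in range(1, m + 1)]


def between(x: int, a: int, b: int, closed_right: bool) -> bool:
    """Ring interval test: x in (a, b] if closed_right, else x in (a, b)."""
    if a < b:
        return a < x <= b if closed_right else a < x < b
    return x > a or (x <= b if closed_right else x < b)


def lookup(nodes: List[int], start: int, key: int, m: int) -> Tuple[int, List[int]]:
    """Return (node responsible for key, list of nodes visited).

    `nodes` must be sorted and must contain `start`.
    """
    n, path = start, [start]
    while True:
        if n == key:                      # the node's own identifier is the key
            return n, path
        fingers = finger_table(nodes, n, m)
        if between(key, n, fingers[0], closed_right=True):
            return fingers[0], path       # key lies in (n, successor(n)]
        # Forward to the farthest routing-table entry that precedes the key.
        n = next(f for f in reversed(fingers)
                 if between(f, n, key, closed_right=False))
        path.append(n)


# Hypothetical ring with m = 6 (identifiers 0..63) and ten nodes.
ring = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
owner, path = lookup(ring, start=8, key=54, m=6)
print(owner, path)

# A reduced key y_v = 30.3 with an assumed upper bound of 100 on all keys.
key_id = order_preserving_id(30.3, 100.0, 6)
print(key_id, lookup(ring, start=8, key=key_id, m=6)[0])
```

In a real deployment each node would hold only its own routing table and forward the request over the network; recomputing the tables from a shared node list is a simplification for illustration.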
7. The semantic intelligent information publishing and subscribing method based on P2P technology according to claim 1, characterized in that the specific steps of step (6) are:
(6.1) Each node maintains a subscription table that records all subscription information related to the node. When event information arrives at a node, the node queries its subscription table; if the event information semantically matches a subscription in the table, the event is actively pushed to the user;
(6.2) When a subscriber submits subscription request information to a node $N_i$ on the ring, this node searches the global search tree to determine the agent nodes semantically related to the subscription request and routes the subscription request to those agent nodes, where a similarity algorithm based on subscription semantics is executed. If event information semantically identical to the subscription request is found, the event information is passed back to $N_i$. At the same time, the agent node stores the subscription request, subscription conditions, routing path and search range information in its subscription table;
(6.3) When a publisher publishes event information to a node $N_j$ on the ring, this node searches the global search tree to determine the host node semantically related to the event information, and the event information is routed to and stored on the host node according to the Chord routing protocol. At the same time, the host node semantically matches the event attribute information against the subscription request information in its subscription table; if the match succeeds, the event information is pushed directly to the subscriber;
(6.4) When a subscriber issues an unsubscribe message to a node on the ring, this node searches the global search tree to determine the agent nodes semantically related to the subscription information and routes the request to the corresponding agent nodes, which find and delete the corresponding subscription information in their subscription tables.
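The subscription-table behavior of steps (6.1) to (6.4) can be sketched for a single node. The patent does not specify the semantic matching algorithm in this claim, so the sketch substitutes a simple range predicate per attribute; routing to agent and host nodes is omitted, and the class and field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Event = Dict[str, float]
Condition = Dict[str, Tuple[float, float]]  # attribute -> (low, high)


def matches(event: Event, condition: Condition) -> bool:
    """Stand-in for semantic matching: every constrained attribute is in range."""
    return all(attr in event and lo <= event[attr] <= hi
               for attr, (lo, hi) in condition.items())


@dataclass
class Node:
    subscriptions: Dict[str, Tuple[str, Condition]] = field(default_factory=dict)
    events: List[Event] = field(default_factory=list)
    delivered: List[Tuple[str, Event]] = field(default_factory=list)

    def subscribe(self, sub_id: str, subscriber: str, condition: Condition) -> List[Event]:
        """Store the subscription and return already stored matching events."""
        self.subscriptions[sub_id] = (subscriber, condition)
        return [e for e in self.events if matches(e, condition)]

    def publish(self, event: Event) -> None:
        """Store the event and push it to every matching subscriber."""
        self.events.append(event)
        for subscriber, condition in self.subscriptions.values():
            if matches(event, condition):
                self.delivered.append((subscriber, event))

    def unsubscribe(self, sub_id: str) -> None:
        """Delete the subscription if it exists."""
        self.subscriptions.pop(sub_id, None)


node = Node()
node.publish({"temp": 21.0})                       # stored before any subscription
found = node.subscribe("s1", "alice", {"temp": (20.0, 25.0)})
node.publish({"temp": 30.0})                       # outside the range, not pushed
node.publish({"temp": 22.5})                       # pushed to alice
node.unsubscribe("s1")
node.publish({"temp": 23.0})                       # no subscription left
print(found, node.delivered)
```

The `delivered` list stands in for the active push to the subscriber; in the method described above the push would travel back along the stored routing path.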
CN201310302187.6A 2013-07-17 2013-07-17 Semantic intelligent information distribution subscription method based on P2P technology Active CN103412883B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310302187.6A CN103412883B (en) 2013-07-17 2013-07-17 Semantic intelligent information distribution subscription method based on P2P technology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310302187.6A CN103412883B (en) 2013-07-17 2013-07-17 Semantic intelligent information distribution subscription method based on P2P technology

Publications (2)

Publication Number Publication Date
CN103412883A true CN103412883A (en) 2013-11-27
CN103412883B CN103412883B (en) 2016-09-28

Family

ID=49605895

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310302187.6A Active CN103412883B (en) 2013-07-17 2013-07-17 Semantic intelligent information distribution subscription method based on P2P technology

Country Status (1)

Country Link
CN (1) CN103412883B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1625119A (en) * 2004-12-09 2005-06-08 中国科学院软件研究所 Routing method of pub/sub system on structural P2P network
US20120166556A1 (en) * 2010-12-23 2012-06-28 Electronics And Telecommunications Research Institute Method, device and system for real-time publish subscribe discovery based on distributed hash table
CN102547471A (en) * 2010-12-08 2012-07-04 中国科学院声学研究所 Method and system for obtaining candidate cooperation node in P2P streaming media system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Shen Yanyu et al.: "Publish/Subscribe System Based on Structured P2P" (基于结构化P2P的发布订阅系统), Computer Systems & Applications (《计算机系统应用》), vol. 21, no. 2, 15 February 2012 (2012-02-15), pages 130 - 134 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106326295B (en) * 2015-07-01 2021-12-14 中兴通讯股份有限公司 Semantic data storage method and device
CN106326295A (en) * 2015-07-01 2017-01-11 中兴通讯股份有限公司 Method and device for storing semantic data
CN107404512B (en) * 2016-05-19 2021-03-05 华为技术有限公司 Resource subscription method, resource subscription device and resource subscription system
US10637794B2 (en) 2016-05-19 2020-04-28 Huawei Technologies Co., Ltd. Resource subscription method, resource subscription apparatus, and resource subscription system
CN107404512A (en) * 2016-05-19 2017-11-28 华为技术有限公司 Resource subscription method, resource subscription device and resource subscription system
CN106060154B (en) * 2016-06-30 2019-04-19 江苏省现代企业信息化应用支撑软件工程技术研发中心 Subscription publication matching process and device based on topic model
CN106060154A (en) * 2016-06-30 2016-10-26 江苏省现代企业信息化应用支撑软件工程技术研发中心 Subscribing-publishing matching method and device based on topic model
CN109558410A (en) * 2018-12-14 2019-04-02 北京邮电大学 Event matches algorithm based on multi-dimensional content in a kind of information distribution system
CN115037624A (en) * 2021-03-06 2022-09-09 瞻博网络公司 Global network state management
CN112765207A (en) * 2021-04-07 2021-05-07 中国人民解放军国防科技大学 Resource big data representation, storage and query method
CN112765207B (en) * 2021-04-07 2021-06-18 中国人民解放军国防科技大学 Resource big data processing, storing and inquiring method
CN114844948A (en) * 2021-12-14 2022-08-02 合肥哈工轩辕智能科技有限公司 Client cache optimization method and device of real-time distribution system
CN114844948B (en) * 2021-12-14 2024-05-31 合肥哈工轩辕智能科技有限公司 Client cache optimization method and device of real-time distribution system

Also Published As

Publication number Publication date
CN103412883B (en) 2016-09-28

Similar Documents

Publication Publication Date Title
US11176114B2 (en) RAM daemons
CN103412883B (en) Semantic intelligent information distribution subscription method based on P2P technology
Ding et al. Efficient and progressive algorithms for distributed skyline queries over uncertain data
US8688708B2 (en) Storing and retrieving objects on a computer network in a distributed database
CN106815338A (en) A kind of real-time storage of big data, treatment and inquiry system
US9477772B2 (en) Storing and retrieving objects on a computer network in a distributed database
Wang et al. Research and implementation on spatial data storage and operation based on Hadoop platform
Hajeer et al. Handling big data using a data-aware HDFS and evolutionary clustering technique
US20230024345A1 (en) Data processing method and apparatus, device, and readable storage medium
Mohammed et al. A review of big data environment and its related technologies
CN102981913B (en) Inference control method and inference control system with support on large-scale distributed incremental computation
CN113806446A (en) Rapid retrieval method for mass data of big data
Liu et al. Parallelizing uncertain skyline computation against n‐of‐N data streaming model
Bai et al. Adaptive query relaxation and top‐k result sorting of fuzzy spatiotemporal data based on XML
Deng et al. Spatial-keyword skyline publish/subscribe query processing over distributed sliding window streaming data
Huang Geopubsubhub: A geospatial publish/subscribe architecture for the world-wide sensor web
CN111562990B (en) Lightweight serverless computing method based on message
Boroujeni et al. A Novel Replication Strategy for Efficient XML Data Broadcast in Wireless Mobile Networks.
Li et al. A PR-quadtree based multi-dimensional indexing for complex query in a cloud system
Li et al. A novel approach for mining probabilistic frequent itemsets over uncertain data streams
Metre et al. Efficient processing of continuous spatial-textual queries over geo-textual data stream
Ren et al. haps: Supporting effective and efficient full-text p2p search with peer dynamics
Cortés et al. GeoTrie: A scalable architecture for location-temporal range queries over massive geotagged data sets
Chu et al. A cloud-based trajectory index scheme
Tavuseh et al. Optimization of resources discovery in grid computing using bloom filter

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant