CN101521655A - Method for searching and releasing information, system and synonymy node clustering method and device therefor - Google Patents

Info

Publication number
CN101521655A
Authority
CN
China
Prior art keywords
node
keyword
synonym
search
hash
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN200810007704A
Other languages
Chinese (zh)
Other versions
CN101521655B (en)
Inventor
倪润特
康特
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN2008100077046A priority Critical patent/CN101521655B/en
Publication of CN101521655A publication Critical patent/CN101521655A/en
Application granted granted Critical
Publication of CN101521655B publication Critical patent/CN101521655B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention provide a synonym node clustering method and device, an information search method, an information publishing method, and an information processing system, which address the low search efficiency of the prior art. The search method comprises: obtaining keywords and obtaining the corresponding ontology according to the keywords; hashing at least one keyword to obtain its hash value and reaching a target node according to the hash value; and, taking the target node as the source node, searching the target node and its nearby synonym nodes according to the keywords and the corresponding ontology, using the links to the synonym nodes stored in the routing table of the target node. Because those links are stored in the target node's routing table, information close to the ontology can be found and synonym results obtained in fewer hops, which improves search efficiency.

Description

Information search and publishing methods and systems, and synonym node clustering method and device
Technical field
The present invention relates to communication technology, and in particular to a method and device for synonym node clustering, an information search method and system, an information publishing method and system, and an information processing system.
Background art
Most existing peer-to-peer search techniques (such as Chord, CAN, and Kademlia) are based on the DHT (Distributed Hash Table) protocol and search by file name or keyword. Most search engines (such as Google, Yahoo, and AltaVista) provide full-text search. Both kinds of search are based on text matching. Search techniques based only on text matching do not consider the search context, and therefore often return results that are irrelevant to the user's query.
Semantic search matches not only words but also concepts, and can therefore improve on traditional search results. Because the results are based on concepts, they can be confined to the user's search context, which gives better search quality. This requires that, when publishing in any domain, not only the file name or keywords of a resource are published, but also the metadata related to the resource, expressed using the semantics of the resource. In most cases the search terms a user chooses are based on the user's knowledge of the resource's information domain, so semantic search can return context-based, meaningful results and increase the accuracy of the search results. Annotating information with semantic concepts helps publish information in an organized way and thereby improves information search over peer-to-peer technology.
In the prior art, there are several solutions for semantic search in peer-to-peer networks, including:
One, a hybrid hierarchical and peer-to-peer ontology-based global service discovery system (A Hybrid Hierarchical and Peer-to-Peer Ontology-based Global Service Discovery System)
This system is a hybrid hierarchical service discovery framework for local area and wide area networks. It uses the description logic variant of the Web Ontology Language (OWL DL) for service description and search. The logic capabilities of OWL allow service content to be distributed across nodes. OWL DL is used in the design of the global service discovery framework mainly to provide scalability and logical service descriptions. Services in this system have detailed content descriptions, can be found from any location, and can be searched according to the content the user is looking for.
Two, a peer-to-peer semantic search network
This semantic search network provides a distributed environment for sharing semantic knowledge in the network, and a platform for creating, exchanging, and sharing knowledge. The scheme has two main parts: first, the ability to define n-ary relations, relations between ontologies, and semantic knowledge discovery within the semantic network framework; second, knowledge sharing over the peer-to-peer network, so that information in the network can be searched.
In the above prior art, most peer-to-peer semantic solutions apply only to a specific domain or service, and no general architecture is yet available for structured peer-to-peer overlay networks that support semantic search. During a search, the results for synonyms are obtained by running a separate search for each synonym. This requires many hops, so search efficiency is low.
Summary of the invention
Embodiments of the invention provide a method and device for synonym node clustering, an information search method, an information publishing method, and an information processing system, which can solve the low search efficiency of the prior art.
Embodiments of the invention provide a method of synonym node clustering, the method comprising:
obtaining the keyword corresponding to a node, and obtaining the synonym terms of the keyword through a thesaurus service;
hashing the synonym terms, and determining the synonym nodes of the node according to the resulting hash values;
maintaining, through the underlying overlay protocol, links to the synonym nodes in the node's own routing table.
Embodiments of the invention provide a device for synonym node clustering, the device comprising:
a word hash service unit, configured to hash a keyword to obtain the hash value of the keyword;
a thesaurus service unit, configured to return, for a given word, the synonym terms that have the same meaning as the word;
a synonym clustering service unit, configured to: obtain, according to the keyword corresponding to a node, the synonym terms of the keyword from the thesaurus service unit; obtain the hash values of the synonym terms from the word hash service unit, and determine the synonym nodes of the node according to these hash values; and maintain, through the underlying overlay protocol, the links between the node and its synonym nodes in the node's own routing table.
Embodiments of the invention provide an information processing system comprising a search classifier, a word hash service unit, and a synonym clustering service unit;
the search classifier is configured to: obtain a keyword and obtain the ontology corresponding to the keyword; request the word hash service unit to obtain the hash value of at least one keyword, and reach a target node according to the hash value; and, taking the target node as the source node, search the target node and its nearby synonym nodes according to the keyword and the corresponding ontology;
the word hash service unit is configured to hash a keyword to obtain the hash value of the keyword;
the synonym clustering service unit is configured to maintain, in a node's own routing table, the links between the node and its synonym nodes.
Embodiments of the invention also provide an information search method, comprising:
obtaining a keyword, and obtaining the corresponding ontology according to the keyword;
hashing at least one keyword to obtain its hash value, and reaching a target node according to the hash value;
taking the target node as the source node and, using the links between the target node and its synonym nodes stored in the target node's routing table, searching the target node and its nearby synonym nodes according to the keyword and the corresponding ontology.
Embodiments of the invention also provide an information publishing method, comprising:
obtaining an ontology and, according to the obtained ontology, annotating the data to be published with at least one annotation keyword;
hashing at least one annotation keyword to obtain its hash value and, according to the hash value, publishing the annotation keyword and the obtained ontology;
determining the synonym nodes of the publication target node according to the annotation keyword, and maintaining, in the routing table of the publication target node, the links between that node and its synonym nodes.
According to embodiments of the invention, at publication time the synonym nodes of the publication target node are determined from the annotation keyword, and the links between the publication target node and its synonym nodes are maintained in that node's routing table through the underlying overlay protocol. At search time, the target node is reached first; then, taking the target node as the source node and using the links to its synonym nodes stored in its routing table, the target node and its nearby synonym nodes are searched according to the keyword and the corresponding ontology. Information close to the ontology can therefore be found, and synonym search results obtained, in a smaller number of hops, which improves search efficiency.
Brief description of the drawings
Fig. 1 shows the information search system of embodiment one of the invention;
Fig. 2 shows the information publishing system of embodiment two of the invention;
Fig. 3 shows the information processing system of embodiment three of the invention;
Fig. 4 shows one synonym node clustering device of embodiment four of the invention;
Fig. 5 shows another synonym node clustering device of embodiment four of the invention;
Fig. 6 shows the domain-specific information search flow of embodiment six of the invention;
Fig. 7 shows the information search flow of embodiment six of the invention that uses an ontology and the thesaurus service unit;
Fig. 8 shows the information publishing flow of embodiment seven of the invention.
Detailed description of the embodiments
To help those of ordinary skill in the art understand and implement the invention, embodiments of the invention are now described with reference to the accompanying drawings.
Embodiment one
As shown in Fig. 1, embodiment one describes an information search system comprising: a search classifier, an ontology manager, a word hash service unit, a thesaurus service unit, and a synonym clustering service unit. In this embodiment the ontology manager is preferably placed in a server farm in the P2P network, the server farm consisting of high-performance nodes. The other parts of the system can be distributed across the individual nodes.
The search classifier obtains a keyword, obtains the corresponding ontology according to the keyword, and searches for information according to the keyword and the ontology. For example, when the keyword is the descriptive information of a domain (domain description for short), the search classifier sends an ontology download request to the ontology manager, downloads the one or more ontologies corresponding to the domain description from the ontology manager, and searches according to the keyword and the downloaded ontologies. Every node in the peer-to-peer network has a search classifier. If the node hosting the search classifier already stores some ontologies, the search classifier can first check, before sending a download request to the ontology manager, whether an ontology corresponding to the keyword exists locally. If not, it downloads the corresponding ontology from the ontology manager and then searches according to the keyword and the downloaded ontology; if so, it searches directly according to the keyword and the local ontology.
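The local check before a download can be sketched as follows. This is only an illustration under stated assumptions: the class names `OntologyManager` and `SearchClassifier`, the `download` method, and the representation of an ontology as a plain string are all hypothetical and do not come from the patent.

```python
from typing import Dict, List


class OntologyManager:
    """Hypothetical stand-in for the ontology manager and its ontology table."""

    def __init__(self, table: Dict[str, List[str]]):
        self.table = table      # domain description -> ontologies
        self.downloads = 0      # counts download requests, for illustration only

    def download(self, domain: str) -> List[str]:
        self.downloads += 1
        return list(self.table.get(domain, []))


class SearchClassifier:
    """Hypothetical search classifier that keeps already downloaded ontologies."""

    def __init__(self, manager: OntologyManager):
        self.manager = manager
        self.local: Dict[str, List[str]] = {}

    def ontologies_for(self, keyword: str) -> List[str]:
        # Ask the ontology manager only when nothing is stored locally.
        if keyword not in self.local:
            self.local[keyword] = self.manager.download(keyword)
        return self.local[keyword]
```

In this sketch a second call to `ontologies_for` with the same keyword is answered from the node's local store and sends no further download request.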
Information search here means the following. For given keywords (usually ontology parameter values), the word hash service unit is first asked to apply the hash function to the keywords and obtain their hash values. Using these hash values, the peer-to-peer network is searched through the underlying overlay protocol; the search parameters can include the keyword hash values, the related keywords (usually ontology parameter values), and the ontology itself. During the search, ontology parameter values can be compared using a cosine comparison method, and search results are returned according to the comparison values. In detail, the search can proceed as follows: first, request the word hash service unit to obtain the hash value of at least one keyword; next, reach the target node according to this hash value; then, taking the target node as the source node, search the target node and its nearby synonym nodes according to the keyword and the corresponding ontology; finally, optionally return the search results. Once the corresponding ontology has been obtained, it is used as follows: the keywords are expressed in the form of the ontology, that is, the keywords are assigned to the parameters of the ontology, so that the populated ontology carries the assigned keywords; the populated ontology is then used for the search.
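The text does not say how ontology parameter values are turned into vectors for the cosine comparison. A minimal sketch, assuming a simple bag-of-words representation of the parameter values (the functions `vectorize`, `cosine_similarity`, and `rank` are hypothetical names chosen for illustration), could look like this:

```python
import math
from collections import Counter
from typing import Dict, List


def vectorize(params: Dict[str, str]) -> Counter:
    """Assumed representation: count the words in all ontology parameter values."""
    return Counter(word.lower() for value in params.values() for word in value.split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query: Counter, candidates: Dict[str, Counter]) -> List[str]:
    """Order candidate resources by decreasing similarity to the query."""
    return sorted(candidates,
                  key=lambda name: cosine_similarity(query, candidates[name]),
                  reverse=True)
```

A populated ontology from the query and the populated ontologies stored on a node would each be passed through `vectorize`, and the node would return its results in the order given by `rank`.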
The ontology manager is placed in a server farm in the peer-to-peer network. It stores and maintains ontologies and provides an interface through which any node in the network can upload or download ontologies. On receiving an ontology download request from the search classifier, it looks up its own ontology table according to the keyword carried in the request. The ontology table maps domain descriptions to ontologies: for a given keyword, the manager looks up the domain description in the table and returns the ontology corresponding to the keyword. In other embodiments, whether or not the ontology manager holds a corresponding ontology, it can additionally interact with other ontology managers to obtain the ontologies they hold for the given keyword. It finally returns the one or more ontologies corresponding to the keyword, optionally together with the corresponding ontologies held by the other ontology managers.
The thesaurus service unit supports synonym search. Specifically, for a given word, the thesaurus service unit returns the terms that have the same meaning as the word, that is, its synonym terms; a given word can have one or more synonym terms. The synonym clustering service unit obtains synonym terms through the thesaurus service unit and clusters nodes synonymously according to them. The ontology manager can also use the thesaurus service unit to obtain the synonym terms of an ontology's semantics, where the semantics of an ontology can be the ontology concept.
The word hash service unit hashes a keyword to obtain the hash value of the keyword.
The synonym clustering service unit maintains the relation between ontology semantics and nodes. Through the underlying application overlay protocol, it ensures that the closer in meaning the semantics of two ontologies are, the closer the link between the nodes hosting them, that is, the fewer hops between those nodes. For example, when the semantics of two ontologies hosted on different nodes are synonym terms of each other, the link between the two hosting nodes is kept short; in the extreme case, the link between the two nodes can be kept at one hop. When the keywords corresponding to the DHT keys of two nodes are synonym terms, the two nodes are considered synonymous and are synonym nodes of each other. Using the underlying application overlay protocol, the synonym clustering service unit shortens the links between nodes considered synonymous as far as possible and reduces the number of hops between them, thereby completing the synonym clustering of nodes so that synonym results can be obtained in fewer hops. This also means that more useful results can be returned for a user's search. Synonym clustering can be implemented as follows: obtain the keyword corresponding to the node, which can be the keyword corresponding to the node's own DHT key and/or a keyword of an ontology stored on the node; obtain the synonym terms of that keyword through the thesaurus service; hash the synonym terms, and determine the synonym nodes of the node according to the resulting hash values; and, through the underlying overlay protocol, maintain links to the synonym nodes in the node's own routing table. Because the node maintains links to its synonym nodes in its routing table, the synonym nodes can be reached directly from the target node over these links, which reduces the number of hops from the target node to its synonym nodes and achieves synonym clustering.
For example, consider a node whose ID is 9000 (hereinafter "node 9000"; other nodes are named likewise). Suppose the value of its DHT key is "8990", where hash(movie) = 8990, so that the DHT key "8990" of node 9000 represents the keyword "movie". The synonym clustering service unit queries the thesaurus service unit for the synonyms of "movie", and the thesaurus service unit returns synonym terms such as "cinema", "film", and "picture". The synonym clustering service unit then has the word hash service unit apply the hash function to these synonym terms; suppose hash(cinema) = 7000, hash(film) = 6000, and hash(picture) = 5000. A lookup is then performed for each hash value. Suppose the DHT maintains these values (7000, 6000, 5000) at nodes 7010, 6015, and 5020, that is, value "7000" corresponds to the node with ID 7010, value "6000" to the node with ID 6015, and value "5000" to the node with ID 5020. Finally, in the underlying overlay routing table, the synonym clustering service unit keeps close links to the nodes hosting these synonym terms, that is, it keeps the links between node 9000 and the three other nodes considered synonymous (node 7010, node 6015, and node 5020) short. Therefore, when "movie" (8990) is searched, node 9000 is reached first; then, with node 9000 as the source, the synonym nodes (node 7010, node 6015, and node 5020) are each reached. In this way the synonym nodes (7010, 6015, 5020) are reached in fewer hops, and useful synonym search results are provided. Reaching a node means that a node sends a search command or other information to the target node according to its routing table; for example, the searching node sends the search command to node 9000, or node 9000 sends the search command to the synonym nodes (node 7010, node 6015, and node 5020). Node 9000 is the target node that the search classifier determines from the hash value of the keyword. The target node can specifically comprise: a local search unit, configured to search the information stored on the node itself according to the keyword and ontology sent by the search classifier; a synonym node search unit, configured to search the information stored on the synonym nodes according to the keyword and ontology; and a result feedback unit, configured to feed the search results of the local search unit and the synonym node search unit back to the search classifier.
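The example can be replayed in a few lines of code. This is a sketch rather than the patented implementation: the hash values and node IDs are the supposed values from the example, the thesaurus is a fixed table, and the rule that a key is held by the first node whose ID is not smaller than the key (a Chord-style successor rule) is an assumption that happens to agree with the example. All function names are hypothetical.

```python
import bisect
from typing import Dict, List, Set

# Supposed values taken from the example; a real word hash service would compute them.
WORD_HASH = {"movie": 8990, "cinema": 7000, "film": 6000, "picture": 5000}
THESAURUS = {"movie": ["cinema", "film", "picture"]}
NODE_IDS = [5020, 6015, 7010, 9000]  # sorted IDs of the nodes in the overlay


def responsible_node(key: int) -> int:
    """Assumed lookup rule: first node whose ID is >= key, wrapping around."""
    index = bisect.bisect_left(NODE_IDS, key)
    return NODE_IDS[index % len(NODE_IDS)]


def synonym_nodes(keyword: str) -> List[int]:
    """Hash every synonym term and find the node that holds it."""
    return [responsible_node(WORD_HASH[term]) for term in THESAURUS.get(keyword, [])]


def cluster(keyword: str, routing_tables: Dict[int, Set[int]]) -> int:
    """Record direct links to the synonym nodes in the home node's routing table."""
    home = responsible_node(WORD_HASH[keyword])
    routing_tables.setdefault(home, set()).update(synonym_nodes(keyword))
    return home
```

Calling `cluster("movie", tables)` with an empty `tables` dictionary returns 9000 and leaves node 9000 with direct links to nodes 7010, 6015, and 5020, so a search for "movie" can continue to every synonym node in one further hop.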
The rule that the closer the semantics of ontologies are in meaning, the closer the link between their hosting nodes, can be realized in two ways. In the first, the node keeps a separate synonym node routing table that stores the links to its synonym nodes; when the node's synonym nodes are to be queried, they are reached directly according to this table and the search is run on them. In the second, the routing table of the underlying protocol is used to keep the links to the synonym nodes short. For example, the ID of a near-synonym node is first obtained through the thesaurus service unit and the word hash service unit; suppose it is 5000. The routing table of the underlying protocol is then consulted for a link to that node. Suppose the routing table records no direct link to node 5000, and the node in the routing table nearest to 5000 has ID 4800, so that reaching node 5000 requires first passing through node 4800. In that case a shorter link can be sought through the underlying protocol, for example a direct link to node 5000, and the underlying protocol then keeps this direct link to node 5000 in the routing table. The route to node 5000 is thereby reduced from at least two hops (one hop to node 4800, and at least one hop from node 4800 to node 5000) to a single hop (directly to node 5000). Other embodiments of the invention can use the same approach to keep the links between hosting nodes short when the semantics of their ontologies are close in meaning.
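The hop reduction can be illustrated with a toy routing model. The sketch below makes several assumptions that are not in the text: each routing table is a set of node IDs, a message is always forwarded to the known node whose ID is numerically closest to the destination, and the originating node has ID 9000. The names `hops` and `add_shortcut` are hypothetical.

```python
from typing import Dict, Set


def hops(links: Dict[int, Set[int]], source: int, target: int, max_hops: int = 16) -> int:
    """Count hops under an assumed greedy rule: forward to the known node nearest the target."""
    current, count = source, 0
    while current != target:
        if count >= max_hops or not links.get(current):
            raise LookupError("no route to target")
        current = min(links[current], key=lambda node: abs(node - target))
        count += 1
    return count


def add_shortcut(links: Dict[int, Set[int]], node: int, synonym_node: int) -> None:
    """Keep a direct link to the synonym node in the node's routing table."""
    links.setdefault(node, set()).add(synonym_node)
```

With `links = {9000: {4800}, 4800: {5000}}`, the route from 9000 to 5000 takes two hops through node 4800; after `add_shortcut(links, 9000, 5000)` the same route takes one hop.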
The following search techniques can be used when searching:
1. Alphabetical partitioned hashing of the word hash service: this technique searches using the Alphabetical Partitioned Hashing Technique.
2. Wildcard search technique: this technique searches using wildcards.
3. Range search technique: a range search is performed by generating the hash values of the maximum and minimum of the search range and returning all search results within that range. Here the hash values of the maximum and minimum of the search range refer to node IDs. The search algorithm is as follows:
(1) Obtain the search range.
(2) Let MIN_VALUE be the minimum of the range and MAX_VALUE the maximum of the range.
(3) MIN_VALUE_HASH = alphabetical partitioned hash(MIN_VALUE), MAX_VALUE_HASH = alphabetical partitioned hash(MAX_VALUE).
(4) Reach the node designated by MIN_VALUE_HASH and start searching, then traverse by successor hops up to the node designated by MAX_VALUE_HASH to complete the search.
(5) Obtain the search results.
For example, for the range search [1000, 2000], with alphabetical partitioned hash(1000) = 4000 and alphabetical partitioned hash(2000) = 5000, the search traverses by successor hops from 4000 to 5000 and then returns all results obtained, that is, all relevant information in the range 4000 to 5000.
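The range search steps can be sketched as below. The alphabetical partitioned hash itself is not specified in the text, so the sketch substitutes a hypothetical order-preserving function chosen only so that 1000 maps to 4000 and 2000 maps to 5000 as in the example. The node store, the successor rule, and the function names are likewise assumptions, and wrap-around of the identifier space is not handled.

```python
import bisect
from typing import Dict, List


def order_preserving_hash(value: int) -> int:
    """Hypothetical stand-in for the alphabetical partitioned hash: keeps ordering,
    and maps 1000 -> 4000 and 2000 -> 5000 as in the example."""
    return value + 3000


def range_search(store: Dict[int, List[str]], low: int, high: int) -> List[str]:
    """Walk the nodes from the one designated by hash(low) to the one designated by hash(high)."""
    ids = sorted(store)                                       # node IDs in successor order
    start = bisect.bisect_left(ids, order_preserving_hash(low))
    end = bisect.bisect_left(ids, order_preserving_hash(high))
    results: List[str] = []
    for node_id in ids[start:end + 1]:                        # successor hops, end node included
        results.extend(store[node_id])
    return results
```

Here `store` maps each node ID to the items kept on that node; a search for [1000, 2000] visits only the nodes whose IDs lie between 4000 and 5000.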
Corresponding to the above search techniques, there are the following search types:
1. Point search
In this search type, the exact file name or keyword of a resource is searched. By applying the hash function to the file name, the exact target node holding the resource can be reached.
2. Similarity search
In this search type, the user does not know the exact details of the resource. The approximate search is issued using wildcards.
3. Range search
In this search type, the user provides a range of search values, and the nodes within the range designated by these values are then searched. Each node can be searched by point search or similarity search, and the search can further extend to the node's synonym nodes.
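For the similarity search type, the text only states that the query is issued with wildcards. As an illustration of wildcard matching against the keywords stored on a single node, the following sketch uses shell-style patterns from Python's standard `fnmatch` module. How a wildcard query is routed across nodes is not described in the text and is not modelled here; the index layout and the function name are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Dict, List


def wildcard_search(index: Dict[str, List[str]], pattern: str) -> List[str]:
    """Return the resources whose keyword matches a shell-style wildcard pattern."""
    lowered = pattern.lower()
    return sorted(resource
                  for keyword, resources in index.items()
                  if fnmatchcase(keyword.lower(), lowered)
                  for resource in resources)
```

With an index such as `{"movie": ["m1"], "movies": ["m2"], "music": ["x"]}`, the pattern `mov*` selects the resources stored under "movie" and "movies" but not those under "music".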
Embodiment two
As shown in Fig. 2, embodiment two describes an information publishing system comprising: an information classifier, an ontology manager, a word hash service unit, a thesaurus service unit, and a synonym clustering service unit. In this embodiment the ontology manager is preferably placed in a server farm in the P2P network, the server farm consisting of high-performance nodes. The other parts of the system can be distributed across the individual nodes.
Every node in the network has an information classifier. After supplying a keyword to the ontology manager, the information classifier of a node can download the ontology corresponding to the keyword from the ontology manager. If no ontology corresponds to the keyword, the information classifier can create its own ontology for the keyword and use it to publish resources. If the downloaded ontology is not sufficiently related to the semantics, the ontology can be modified and updated. The created, modified, or updated ontology can be uploaded to the ontology manager. The information classifier annotates the data to be published, that is, the data already present on the node, according to the obtained ontology; it then obtains from the word hash service unit the hash values of the keywords used to annotate the data; finally, it publishes each annotation keyword according to the keyword's hash value, publishing at the same time the related annotation keywords and the corresponding ontology. The related annotation keywords are explained as follows: data is usually annotated with several keywords, whereas a particular publication usually uses the hash value of only one keyword, so the other keywords, or all the keywords, can serve as the related keywords of that keyword.
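The publishing steps can be sketched as follows. Everything concrete in the sketch is an assumption made for illustration: the identifier space of 10000, the use of SHA-1 as the word hash, the successor-style placement rule, the record layout, and the names `word_hash` and `publish`. The annotation is modelled as a mapping from ontology parameters to annotation keywords.

```python
import bisect
import hashlib
from typing import Dict, List

ID_SPACE = 10000  # assumed size of the identifier space


def word_hash(keyword: str) -> int:
    """Hypothetical word hash service: SHA-1 reduced to the identifier space."""
    digest = hashlib.sha1(keyword.lower().encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % ID_SPACE


def publish(nodes: Dict[int, List[dict]], resource: str,
            annotation: Dict[str, str], ontology: str) -> Dict[str, int]:
    """Publish one record per annotation keyword on the node that holds its hash."""
    ids = sorted(nodes)
    keywords = list(annotation.values())
    placed: Dict[str, int] = {}
    for keyword in keywords:
        key = word_hash(keyword)
        node_id = ids[bisect.bisect_left(ids, key) % len(ids)]   # assumed placement rule
        nodes[node_id].append({
            "keyword": keyword,
            "related": [k for k in keywords if k != keyword],    # the other annotation keywords
            "ontology": ontology,
            "annotation": dict(annotation),                      # ontology parameters with values
            "resource": resource,
        })
        placed[keyword] = node_id
    return placed
```

Each record carries the other annotation keywords as related keywords, so a search that arrives through one keyword can still compare the full annotation against the query.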
The ontology manager is placed in a server farm in the peer-to-peer network. It stores and maintains ontologies and provides an interface through which any node in the network can upload or download ontologies. On receiving an ontology download request from the information classifier, it looks up its own ontology table according to the keyword carried in the request. The ontology table maps domain descriptions to ontologies: for a given keyword, the manager looks up the domain description in the table and returns the ontology corresponding to the keyword. In other embodiments, whether or not the ontology manager holds a corresponding ontology, it can additionally interact with other ontology managers to obtain the ontologies they hold for the given keyword. It finally returns the one or more ontologies corresponding to the keyword, optionally together with the corresponding ontologies held by the other ontology managers. In addition, the ontology manager can receive and store a new ontology uploaded by the information classifier, or receive and store a modified ontology that the information classifier has updated.
The word hash service unit hashes a keyword to obtain the hash value of the keyword.
The synonym clustering service unit maintains the relationship between the semantics of ontologies and the nodes that hold them. Using the underlying application-layer overlay protocol, it ensures that the closer the meanings of two ontologies are, the closer the link between the nodes holding them, that is, the fewer hops between those nodes. For example, when the ontologies held on two different nodes are synonyms of each other, the link maintained between the two nodes is kept short; in the extreme case, the link between the two nodes can be kept to one hop. When the keywords represented by the DHT keys of two nodes are synonyms, the two nodes are regarded as synonymous and are synonym nodes of each other. Through the underlying application-layer overlay protocol, the synonym clustering service unit shortens the links between synonymous nodes as far as possible and reduces the number of hops between them, thereby clustering the synonym nodes so that synonym results can be obtained in fewer hops.
Synonym clustering can be implemented as follows: obtain the keyword corresponding to the node, where the keyword can be the keyword corresponding to the node's own DHT key value and/or the keyword of an ontology stored on the node; obtain the synonyms of that keyword through the dictionary service; perform a hash calculation on each synonym, and determine the synonym node of the node according to the resulting hash value; and, through the underlying overlay protocol, maintain the link to the synonym node in the node's own routing table. Because the node maintains the link to its synonym node in its routing table, the synonym node can be reached directly from the destination node over this link, which reduces the number of hops from the destination node to its synonym node and achieves synonym clustering.
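The clustering steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not name a hash function or a rule for mapping a hash value to a node, so SHA-1 modulo a small identifier space and a "first node ID at or after the hash value" rule are used here as stand-ins, and the names `hash_keyword`, `responsible_node`, and `cluster_synonyms` are hypothetical.

```python
import bisect
import hashlib

ID_SPACE = 10_000  # assumed size of the identifier space


def hash_keyword(keyword):
    """Map a keyword to an ID. SHA-1 modulo ID_SPACE is an assumption."""
    digest = hashlib.sha1(keyword.lower().encode("utf-8")).hexdigest()
    return int(digest, 16) % ID_SPACE


def responsible_node(node_ids, key):
    """Return the first node ID >= key, wrapping around (assumed rule)."""
    ordered = sorted(node_ids)
    index = bisect.bisect_left(ordered, key)
    return ordered[index % len(ordered)]


def cluster_synonyms(node_id, keywords, thesaurus, node_ids, routing_table):
    """Record links from node_id to the nodes responsible for the synonyms
    of its keywords. thesaurus stands in for the dictionary service."""
    for keyword in keywords:
        for synonym in thesaurus.get(keyword, []):
            synonym_node = responsible_node(node_ids, hash_keyword(synonym))
            if synonym_node != node_id:
                routing_table.setdefault(node_id, set()).add(synonym_node)
    return routing_table
```

With a thesaurus such as `{"movie": ["film", "picture"]}`, calling `cluster_synonyms` for the node responsible for "movie" leaves that node holding direct links to the nodes responsible for "film" and "picture", which is what allows a later search to reach them without a fresh overlay lookup.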
In the publishing system of this embodiment, once the corresponding ontology has been obtained, it is used as follows: the keywords are expressed in the form of the corresponding ontology, that is, the parameters of the ontology are assigned the keywords as values. The assigned ontology thus carries the keywords used for the assignment, and it is this assigned ontology that is then published.
In other embodiments, the information classifier can specifically comprise an ontology transfer unit and an ontology processing unit. The ontology transfer unit downloads an ontology from the ontology manager according to a keyword, and uploads to the ontology manager any ontology that the ontology processing unit has modified or created. The ontology processing unit, if the downloaded ontology is not sufficiently related to the semantics of the keyword, modifies or updates that ontology or creates a new ontology for the keyword; the created, modified, or updated ontology is then uploaded to the ontology manager by the ontology transfer unit.
Embodiment three
This embodiment combines the search system of Embodiment one with the publishing system of Embodiment two. As shown in Figure 3, Embodiment three describes an information processing system comprising: a search classifier, an ontology manager, a vocabulary hash service unit, a dictionary service unit, a synonym clustering service unit, and an information classifier.
In the information processing system of this embodiment, the function of each component is as described for the corresponding component in Embodiment one and Embodiment two, and is not repeated here.
Embodiment four
As shown in Figure 4, Embodiment four provides a synonym node clustering device, comprising:
A vocabulary hash service unit, which performs a hash operation on a keyword to obtain the hash value of that keyword;
A dictionary service unit, which returns, for a given word, the synonyms that have the same meaning as that word;
A synonym clustering service unit, which obtains the synonyms of the keyword corresponding to a node from the dictionary service unit; obtains the hash values of those synonyms from the vocabulary hash service unit and determines the synonym node of the node according to these hash values; and, through the underlying overlay protocol, maintains the link between the node and its synonym node in the node's own routing table.
In other embodiments, as shown in Figure 4, the synonym clustering service unit can specifically comprise: a synonym node determining unit, which, according to the keyword corresponding to the node, requests the synonyms of that keyword from the dictionary service unit and requests the hash values of those synonyms from the vocabulary hash service unit, so as to determine the synonym node of the node; an underlying protocol execution unit, which, according to the result determined by the synonym node determining unit, obtains multiple links between the node and the synonym node through the underlying protocol; and a link selection unit, which selects the shortest of the links obtained by the underlying protocol execution unit as the link in the routing table of the underlying overlay protocol.
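The role of the link selection unit can be illustrated with a short Python sketch. It assumes, purely for illustration, that each candidate link is represented as the list of node IDs it passes through and that "shortest" means fewest hops; the text does not fix either choice, and the function names are hypothetical.

```python
def select_shortest_link(links):
    """Return the candidate link with the fewest hops.

    Each link is a list of node IDs from the node to its synonym node,
    so its hop count is len(link) - 1.
    """
    if not links:
        raise ValueError("no candidate links")
    return min(links, key=lambda link: len(link) - 1)


def install_synonym_link(routing_table, synonym_node, links):
    """Keep only the shortest link to synonym_node; return its hop count."""
    best = select_shortest_link(links)
    routing_table[synonym_node] = best
    return len(best) - 1


candidates = [[8995, 12, 40, 3120], [8995, 3120], [8995, 77, 3120]]
table = {}
print(install_synonym_link(table, 3120, candidates))  # 1
```

In this example the one-hop link is retained, which corresponds to the extreme case mentioned in the description where two synonym nodes are a single hop apart.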
In other embodiments, as shown in Figure 5, the synonym clustering service unit comprises: a synonym node determining unit, which, according to the keyword corresponding to the node, requests the synonyms of that keyword from the dictionary service unit and requests the hash values of those synonyms from the vocabulary hash service unit, so as to determine the synonym node of the node; an underlying protocol execution unit, which, according to the result determined by the synonym node determining unit, obtains the link between the node and the synonym node through the underlying protocol; and a synonym routing table unit, which maintains a synonym node routing table that stores the links between the node and its synonym nodes.
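The variant with a dedicated synonym node routing table can be pictured as a node that keeps two tables side by side. The following Python sketch is one possible shape for that data structure, not the patent's own; the class name, field names, and the representation of a link as a list of node IDs are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    # Ordinary overlay routes: destination node ID -> next-hop node ID.
    routing_table: dict = field(default_factory=dict)
    # Separate synonym table: synonym node ID -> link (list of node IDs).
    synonym_table: dict = field(default_factory=dict)

    def save_synonym_link(self, synonym_node, link):
        """Store or refresh the link from this node to a synonym node."""
        if not link or link[0] != self.node_id or link[-1] != synonym_node:
            raise ValueError("link must run from this node to the synonym node")
        self.synonym_table[synonym_node] = list(link)

    def synonym_nodes(self):
        """Return the known synonym node IDs in ascending order."""
        return sorted(self.synonym_table)
```

Keeping the synonym links apart from the ordinary routes means the overlay's own routing entries are left untouched, at the cost of maintaining a second table.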
In the search system of Embodiment one, the publishing system of Embodiment two, and the information processing system of Embodiment three, the synonym clustering service unit can be implemented in the same way as the synonym clustering service unit of Embodiment four.
Embodiment five
Embodiment five provides a synonym node clustering method, comprising: obtaining the keyword corresponding to a node, and obtaining the synonyms of the keyword through the dictionary service; performing a hash calculation on the synonyms, and determining the synonym node of the node according to the hash value obtained; and maintaining, through the underlying overlay protocol, the link to the synonym node in the node's own routing table.
Maintaining the link to the synonym node in the node's own routing table through the underlying overlay protocol can specifically be: setting up a synonym node routing table in the node, which stores the link to the synonym node; or obtaining multiple links between the node and the synonym node through the underlying overlay protocol, and selecting the shortest of these links as the link in the routing table of the underlying overlay protocol.
In this embodiment, the keyword corresponding to the node itself is: the keyword corresponding to the node's own DHT key value, and/or the keyword of an ontology stored on the node.
Embodiment six
This embodiment describes the information search method, in which ontologies with similar meanings are stored on nearby nodes. Depending on whether a domain descriptor is specified during the search, the search falls into one of two procedures: a specified-domain search, and an intelligent search that uses ontologies and the dictionary service unit. The two procedures are described in turn below.
1. Specified-domain search. As shown in Figure 6, the search procedure is as follows:
Step 21: Obtain the domain descriptor keyword.
Step 22: According to the descriptor keyword, determine whether an ontology corresponding to the given domain descriptor exists locally. If not, perform step 23 and then step 25; otherwise, perform step 24 and then step 25.
Step 23: Download the ontology corresponding to the descriptor keyword from the ontology manager. Step 22 is optional: step 23 can be performed directly after step 21, followed by step 25.
Step 24: Obtain the corresponding ontology locally according to the descriptor keyword.
Step 25: Obtain the search type. The search type is point search, similarity search, or range search.
Step 26: Search according to the search type, the search keywords, and the ontology.
If the search type is point search, step 26 comprises:
A. Obtain the search keywords expressed in the form of the corresponding ontology. The search keywords are keywords supplied according to the parameters of the corresponding ontology, and can be the parameter values supplied for those parameters.
B. Obtain the hash values of these keywords from the vocabulary hash service unit.
C. Search using the hash value of each keyword, or of any one keyword, together with the corresponding ontology as search parameters; that is, assign the search keywords to the parameters of the corresponding ontology to form a search ontology.
D. After the search target node is reached according to the hash value, compare the search ontology with the published ontologies and output the search results.
E. Obtain search results from the synonym nodes by an operation similar to step D.
If the search type is similarity search, step 26 is: search using a wildcard search technique. If the search type is range search, step 26 is: search using a range search technique.
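As a rough illustration of steps 22 to 26, the Python sketch below first resolves the ontology from a local cache or a download, then compares one published value against a query according to the search type. The text only names "wildcard" and "range" techniques without specifying them, so shell-style patterns from the standard `fnmatch` module and an inclusive lower and upper bound are assumptions here, as are the function names and the `download` callback.

```python
from fnmatch import fnmatchcase


def get_ontology(descriptor, local_cache, download):
    """Steps 22 to 24: use the local copy if present, otherwise download it."""
    if descriptor not in local_cache:
        local_cache[descriptor] = download(descriptor)
    return local_cache[descriptor]


def value_matches(search_type, published_value, query):
    """Step 26: compare one published parameter value with the query."""
    if search_type == "point":
        return published_value == query
    if search_type == "similar":
        # query is a wildcard pattern such as "Tit*" (case-sensitive)
        return fnmatchcase(published_value, query)
    if search_type == "range":
        low, high = query  # inclusive bounds
        return low <= published_value <= high
    raise ValueError("unknown search type: " + str(search_type))
```

For instance, `value_matches("similar", "Titanic", "Tit*")` and `value_matches("range", 1997, (1990, 2000))` both hold, while a point search requires the published value and the query to be equal.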
The search procedure is illustrated below using the example of a user terminal searching for and downloading the movie Titanic. The user terminal requests the movie domain through the interface, and in the ontology download step passes the descriptor of the movie domain to the search classifier; the descriptor of the movie domain can be, for example, "Movie". The search classifier downloads the ontology corresponding to the movie domain from the ontology manager. This ontology can comprise the following parameters: "Movie Name", "Director", "Type", and so on. The ontology is returned to the user terminal, which presents the parameters as fields and asks the user to supply their values. Once the user terminal has obtained the values of these parameter fields, it can search for the movie Titanic using these field values.
The user terminal sends a search instruction. Suppose the field values in the search instruction are: Movie Name=Titanic, Director=Steven Spielberg, Type=Mixed. In the information search step, these field values and the ontology are submitted to the search classifier, which then asks the vocabulary hash service unit to hash the field values ("Titanic", "Steven Spielberg", "Mixed", and so on). A search command is then constructed. For example, one search command comprises the hash value of "Titanic" (assume it is 8990) and the search parameter values ("Titanic", "Steven Spielberg", "Mixed"). Different search commands can be constructed from the individual parameter values and the ontology corresponding to the movie domain; for example, another search command can comprise the hash value of "Steven Spielberg" and the same search parameter values ("Titanic", "Steven Spielberg", "Mixed").
Routing is performed according to the search command. Assume the destination node corresponding to 8990 is the node with ID 8995; the search command is then routed to node 8995, where the search parameter values (Movie Name=Titanic, Director=Steven Spielberg, Type=Mixed) and the ontology corresponding to the movie domain are compared with the ontologies published on node 8995 to obtain results. If the keyword "Titanic" has synonym nodes, the search is also routed to those synonym nodes, where the same search parameter values and ontology are compared with the published ontologies to obtain results.
The search parameter values in the search command relate to the ontology corresponding to the movie domain as follows: the search parameter values ("Titanic", "Steven Spielberg", "Mixed") are assigned to the parameters ("Movie Name", "Director", "Type") of that ontology, for example Movie Name=Titanic, Director=Steven Spielberg, Type=Mixed, and the ontology carrying the search parameter values is then included in the search command.
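The Titanic example can be traced in a small Python sketch. The hash value 8990 and destination node 8995 come from the example itself; the synonym node ID 3120, the extra node 500, the "Format" values, and all function names are hypothetical, and the hash-to-node mapping is given as a plain lookup table rather than a real overlay lookup.

```python
MOVIE_PARAMS = ["Movie Name", "Director", "Type"]


def build_search_command(params, values, routing_keyword, hash_values):
    """Assign the values to the ontology parameters and attach one hash value."""
    search_ontology = dict(zip(params, values))
    return {"hash": hash_values[routing_keyword], "ontology": search_ontology}


def matches(search_ontology, published):
    """True if every assigned search parameter equals the published value."""
    return all(published.get(name) == value
               for name, value in search_ontology.items())


def run_search(command, destination_of, synonym_links, published_on):
    """Search the destination node first, then its synonym nodes."""
    destination = destination_of[command["hash"]]
    hits = []
    for node_id in [destination] + list(synonym_links.get(destination, [])):
        for published in published_on.get(node_id, []):
            if matches(command["ontology"], published):
                hits.append((node_id, published))
    return hits


values = ["Titanic", "Steven Spielberg", "Mixed"]
command = build_search_command(MOVIE_PARAMS, values, "Titanic", {"Titanic": 8990})
entry = {"Movie Name": "Titanic", "Director": "Steven Spielberg",
         "Type": "Mixed", "Format": "avi"}
published_on = {8995: [entry], 3120: [dict(entry, Format="mkv")], 500: [entry]}
hits = run_search(command, {8990: 8995}, {8995: [3120]}, published_on)
print([node_id for node_id, _ in hits])  # [8995, 3120]
```

Node 500 also holds a matching entry but is never visited, because the search only follows the destination node and the synonym links stored for it.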
2. Intelligent search using ontologies and the dictionary service unit. As shown in Figure 7, the search procedure is as follows:
Step 31: Obtain the keywords.
Step 32: Obtain all ontologies corresponding to the keywords. Here, "all ontologies corresponding to the keywords" disregards the domain an ontology belongs to: an ontology is considered to correspond to a keyword as long as the descriptor of the ontology's domain contains that keyword. A concrete implementation can be: obtain one or more keywords, look up each keyword among the domain descriptors in the ontology table, and return every ontology that corresponds to any of the keywords.
Step 33: Provide search suggestions. First, collect all keywords, namely the keywords contained in all the ontologies corresponding to the obtained keywords, together with the obtained keywords themselves. Then combine these keywords and present the different combinations for the user to choose from.
Step 34: Carry out the search according to the user's selection. The search here can be the concrete search procedure given in step 26.
If no ontology corresponds to the keywords, or the user does not select any ontology, the search keywords are hashed and search results are provided based on those hash values.
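Steps 32 and 33 can be sketched as follows. The ontology table is modelled as a dictionary from domain descriptor to parameter names, "contains the keyword" is read as a case-insensitive substring test, and the suggestions are limited to combinations of at most `max_size` keywords to keep the list short. All three choices, and the function names, are assumptions made for this illustration.

```python
from itertools import combinations


def ontologies_for(keywords, ontology_table):
    """Step 32: return every ontology whose domain descriptor contains
    any of the keywords, regardless of domain."""
    return {
        descriptor: params
        for descriptor, params in ontology_table.items()
        if any(keyword.lower() in descriptor.lower() for keyword in keywords)
    }


def suggest(keywords, matched, max_size=2):
    """Step 33: pool the given keywords with the keywords of the matched
    ontologies, then list combinations for the user to choose from."""
    pool = list(keywords)
    for params in matched.values():
        for param in params:
            if param not in pool:
                pool.append(param)
    upper = min(max_size, len(pool))
    return [combo
            for size in range(1, upper + 1)
            for combo in combinations(pool, size)]


table = {"Movie": ["Movie Name", "Director"], "Music": ["Artist"]}
matched = ontologies_for(["Movie"], table)
print(len(suggest(["Movie"], matched)))  # 6
```

When no ontology matches, `suggest` returns combinations of the original keywords only, which mirrors the fallback in which the search keywords are simply hashed and searched.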
Embodiment seven
As shown in Figure 8, this embodiment describes the publishing procedure.
Step 41: The user terminal uses a keyword to download the ontology corresponding to that keyword from the ontology manager. If no corresponding ontology exists, the user can create an ontology for the keyword and upload it to the ontology manager. The user terminal can also modify a downloaded ontology and upload the modified ontology to the ontology manager.
The concrete steps of publishing the ontology are as follows:
Step 42: Annotate the data to be published according to the ontology obtained in step 41. The keywords used in the annotation can serve as search keywords in the search procedure.
Annotating the data to be published according to the obtained ontology means assigning values to the parameters of the obtained ontology according to the relevant information about the data to be published.
Step 43: Obtain the hash value of each annotation keyword from the vocabulary hash service.
Step 44: Publish each annotation keyword, together with the related annotation keywords and the corresponding ontology. The related annotation keywords are some or all of the keywords used for annotation, and may further include descriptive information, that is, words that are not themselves published as keywords but that describe the data.
For example, suppose the user wants to publish "Titanic Movie" as a downloadable file. First, the user queries the ontology manager to obtain the ontology corresponding to "Movie". The ontology manager returns that ontology, which can comprise the following parameters: "Movie Name", "Director", "Type", "Format", and so on. The information classifier annotates the data according to this ontology. For example, suppose the relevant information about the data to be published is: movie name "Titanic", director "Steven Spielberg", movie type "Mixed", movie format "avi", movie size "1G", and so on; then "Titanic Movie" is annotated as "Movie Name=Titanic", "Director=Steven Spielberg", "Type=Mixed", "Format=avi". The user annotates the ontology parameters according to what the user knows about the data to be published. All parameters of the ontology ("Movie Name", "Director", "Type", "Format") can be annotated, or only some of them; for example, a user who does not know the value of "Type" annotates only "Movie Name=Titanic", "Director=Steven Spielberg", "Format=avi". The annotation can be made directly by the user or with any annotation tool. The annotation keywords are sent to the vocabulary hash service unit and hashed to obtain the corresponding hash values.
The release message for "Movie Name", here the release message for "Titanic", comprises: the hash value (the hash value of "Titanic"), the related keywords ("Titanic", "Steven Spielberg", "Mixed", "avi"), and the corresponding ontology. The release messages for the other keywords ("Steven Spielberg", "Mixed", "avi") are published in a similar way; for example, the release message for "Steven Spielberg" comprises: the hash value (the hash value of "Steven Spielberg"), the related keywords ("Titanic", "Steven Spielberg", "Mixed", "avi"), and the corresponding ontology. Publishing can proceed as follows: the hash value in a release message is used to determine the publishing destination node, and the release message is then sent to that node. For example, if the hash value of "Titanic" is 8990 and the publishing destination node ID corresponding to 8990 is 9000, the publishing destination node is determined from this correspondence to be the node with ID 9000, and the release message for "Titanic" is sent to the node with ID 9000.
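The publishing example can be followed end to end in the Python sketch below, for the case where the user does not know the value of "Type". Only the hash value 8990 and the destination node 9000 come from the example; the other hash values, the node IDs 500 and 5000, the "first node ID at or after the hash value" rule, and the function names are assumptions made for illustration.

```python
import bisect


def annotate(ontology_params, known_info):
    """Assign values to the ontology parameters the publisher knows;
    parameters with no known value are left out."""
    return {name: known_info[name]
            for name in ontology_params if name in known_info}


def build_release_messages(annotated, hash_of):
    """One release message per annotation keyword, each carrying the
    related keywords and the assigned ontology."""
    keywords = list(annotated.values())
    return [
        {"hash": hash_of[keyword], "keywords": keywords, "ontology": annotated}
        for keyword in keywords
    ]


def publish(messages, node_ids, store):
    """Send each message to the first node ID >= its hash value (assumed rule)."""
    ordered = sorted(node_ids)
    for message in messages:
        index = bisect.bisect_left(ordered, message["hash"]) % len(ordered)
        store.setdefault(ordered[index], []).append(message)
    return store


params = ["Movie Name", "Director", "Type", "Format"]
known = {"Movie Name": "Titanic", "Director": "Steven Spielberg", "Format": "avi"}
# Only 8990 is taken from the example; 4100 and 120 are made up.
hash_of = {"Titanic": 8990, "Steven Spielberg": 4100, "avi": 120}
annotated = annotate(params, known)
store = publish(build_release_messages(annotated, hash_of), [500, 5000, 9000], {})
print(sorted(store))  # [500, 5000, 9000]
```

Each of the three nodes ends up holding one release message with the same related keywords and ontology, so a later search that hashes any one of the annotation keywords can find the published data.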
In this embodiment, after the keywords and the ontology are published, the procedure further comprises: the publishing destination node determines its synonym nodes according to the annotation keywords, and maintains the links between itself and its synonym nodes in its routing table, thereby clustering the publishing destination node with its synonym nodes.
According to the embodiments of the invention, when information is published, the synonym nodes of the publishing destination node are determined according to the annotation keywords, and the links between the publishing destination node and its synonym nodes are maintained in its routing table through the underlying overlay protocol. When information is searched, the destination node is reached first; then, with the destination node as the source node, the destination node and its nearby synonym nodes are searched according to the keywords and the corresponding ontology, using the links to the synonym nodes stored in the destination node's routing table. Information similar to the ontology can therefore be found, and synonym search results obtained, in a smaller number of hops, which improves search efficiency.
Although the present invention has been described by way of embodiments, those of ordinary skill in the art will appreciate that many variations and modifications can be made without departing from the spirit and substance of the invention, and that the scope of the invention is defined by the appended claims.

Claims (19)

1. A method of synonym node clustering, characterized in that the method comprises:
obtaining the keyword corresponding to a node, and obtaining the synonyms of the keyword through a dictionary service;
performing a hash calculation on the synonyms, and determining the synonym node of the node according to the hash value obtained by the hash calculation;
maintaining, through the underlying overlay protocol, a link to the synonym node in the routing table of the node itself.
2. The method of claim 1, characterized in that maintaining, through the underlying overlay protocol, a link to the synonym node in the routing table of the node itself is specifically: setting up a synonym node routing table in the node, which stores the link to the synonym node; or obtaining multiple links between the node and the synonym node through the underlying overlay protocol, and selecting the shortest of the multiple links as the link in the routing table of the underlying overlay protocol.
3. The method of claim 1 or 2, characterized in that the keyword corresponding to the node itself is: the keyword corresponding to the DHT key value of the node itself, and/or the keyword of an ontology stored in the node.
4. A device for synonym node clustering, characterized in that the device comprises:
a vocabulary hash service unit, which performs a hash operation on a keyword to obtain the hash value of that keyword;
a dictionary service unit, which returns, for a given word, the synonyms that have the same meaning as that word;
a synonym clustering service unit, which obtains the synonyms of the keyword corresponding to a node from the dictionary service unit; obtains the hash values of the synonyms from the vocabulary hash service unit and determines the synonym node of the node according to these hash values; and, through the underlying overlay protocol, maintains the link between the node and its synonym node in the routing table of the node itself.
5. The device of claim 4, characterized in that the synonym clustering service unit comprises:
a synonym node determining unit, which, according to the keyword corresponding to the node, requests the synonyms of the keyword from the dictionary service unit and requests the hash values of the synonyms from the vocabulary hash service unit, so as to determine the synonym node of the node;
an underlying protocol execution unit, which, according to the result determined by the synonym node determining unit, obtains multiple links between the node and the synonym node through the underlying protocol;
a link selection unit, which selects the shortest of the multiple links obtained by the underlying protocol execution unit as the link in the routing table of the underlying overlay protocol.
6. The device of claim 4, characterized in that the synonym clustering service unit comprises:
a synonym node determining unit, which, according to the keyword corresponding to the node, requests the synonyms of the keyword from the dictionary service unit and requests the hash values of the synonyms from the vocabulary hash service unit, so as to determine the synonym node of the node;
an underlying protocol execution unit, which, according to the result determined by the synonym node determining unit, obtains the link between the node and the synonym node through the underlying protocol;
a synonym routing table unit, which maintains a synonym node routing table that stores the link between the node and the synonym node.
7. An information processing system, characterized in that the system comprises: a search classifier, a vocabulary hash service unit, and a synonym clustering service unit;
the search classifier obtains a keyword and obtains the ontology corresponding to the keyword; requests the vocabulary hash service unit to obtain the hash value of at least one keyword, and reaches a destination node according to that hash value; and, with the destination node as the source node, searches the destination node and its nearby synonym nodes according to the keyword and the corresponding ontology;
the vocabulary hash service unit performs a hash operation on a keyword to obtain the hash value of that keyword;
the synonym clustering service unit maintains, in the routing table of a node itself, the link between that node and its synonym node.
8. The system of claim 7, characterized in that the system further comprises an ontology manager, which stores and maintains ontologies, maintains the correspondence between keywords and ontologies, and provides an interface through which any node in the network can download ontologies;
the search classifier further determines whether an ontology corresponding to a given keyword exists locally; if not, it downloads the ontology through the ontology manager and then performs the information search according to the keyword and the ontology; otherwise, it obtains the corresponding ontology locally according to the keyword and performs the information search directly according to the keyword and the ontology.
9. The system of claim 8, characterized in that the ontology manager is located in a server cluster of high-performance nodes in the peer-to-peer network, and the search classifier, the vocabulary hash service unit, and the synonym clustering service unit are distributed across the nodes.
10. The system of claim 7, 8, or 9, characterized in that the destination node comprises:
a local search unit, which searches the information stored on the node itself according to the keyword and the ontology;
a synonym node searching unit, which searches the information stored on the synonym nodes according to the keyword and the ontology;
a result feedback unit, which feeds the search results of the local search unit and the synonym node searching unit back to the search classifier.
11. The system of claim 7, characterized in that the system further comprises an information classifier, which obtains an ontology and, according to the obtained ontology, annotates the data to be published using at least one annotation keyword; and obtains the hash value of the annotation keyword from the vocabulary hash service unit and publishes one or more annotation keywords together with the obtained ontology.
12. The system of claim 11, characterized in that the system further comprises an ontology manager, which stores and maintains ontologies, maintains the correspondence between keywords and ontologies, and provides an interface through which any node in the network can upload and download ontologies;
the information classifier comprises an ontology transfer unit and an ontology processing unit;
the ontology transfer unit downloads an ontology from the ontology manager according to a keyword, and uploads to the ontology manager any ontology modified or created by the ontology processing unit;
the ontology processing unit, if the downloaded ontology is not sufficiently related to the semantics of the keyword, modifies or updates that ontology or creates a new ontology for the keyword; the created, modified, or updated ontology is uploaded to the ontology manager by the ontology transfer unit.
13. An information search method, characterized in that the method comprises:
obtaining a keyword, and obtaining the corresponding ontology according to the keyword;
performing a hash calculation on at least one keyword to obtain its hash value, and reaching a destination node according to the hash value;
with the destination node as the source node, searching the destination node and its nearby synonym nodes according to the keyword and the corresponding ontology, using the links between the destination node and its synonym nodes stored in the routing table of the destination node.
14. The method of claim 13, characterized in that, after the keyword is obtained, the method further comprises: determining whether an ontology corresponding to the keyword exists locally; if not, downloading the ontology corresponding to the keyword from the ontology manager; otherwise, obtaining the corresponding ontology locally according to the keyword.
15. The method of claim 13, characterized in that, after the ontology is obtained, the method further comprises: obtaining a search type, the search type being point search, similarity search, or range search; and searching according to the keyword and the corresponding ontology specifically comprises: searching, according to the keyword and the corresponding ontology, in the manner specified by the search type.
16. The method of claim 13, characterized in that searching according to the keyword and the corresponding ontology is: assigning the keyword to the parameters of the corresponding ontology, and searching according to the assigned ontology.
17. An information publishing method, characterized in that the method comprises:
obtaining an ontology and, according to the obtained ontology, annotating the data to be published using at least one annotation keyword;
performing a hash calculation on at least one annotation keyword to obtain its hash value, and publishing the annotation keyword and the obtained ontology according to the hash value;
determining the synonym nodes of the publishing destination node according to the annotation keyword, and maintaining, in the routing table of the publishing destination node, the links between the publishing destination node and its synonym nodes.
18. The method of claim 17, characterized in that obtaining the ontology specifically comprises: downloading an ontology by keyword, creating an ontology, or obtaining an ontology by editing an existing ontology.
19. The method of claim 17, characterized in that publishing the annotation keyword and the obtained ontology according to the hash value is: assigning the annotation keyword to the parameters of the obtained ontology, and sending the assigned ontology to the publishing destination node according to the hash value.
CN2008100077046A 2008-02-29 2008-02-29 Method for searching and releasing information, system and synonymy node clustering method and device therefor Expired - Fee Related CN101521655B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2008100077046A CN101521655B (en) 2008-02-29 2008-02-29 Method for searching and releasing information, system and synonymy node clustering method and device therefor

Publications (2)

Publication Number Publication Date
CN101521655A true CN101521655A (en) 2009-09-02
CN101521655B CN101521655B (en) 2011-11-16

Family

ID=41082033

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2008100077046A Expired - Fee Related CN101521655B (en) 2008-02-29 2008-02-29 Method for searching and releasing information, system and synonymy node clustering method and device therefor

Country Status (1)

Country Link
CN (1) CN101521655B (en)

Also Published As

Publication number Publication date
CN101521655B (en) 2011-11-16

Similar Documents

Publication Publication Date Title
US9589006B2 (en) Method and apparatus for multidimensional data storage and file system with a dynamic ordered tree structure
US7664742B2 (en) Index data structure for a peer-to-peer network
US8938459B2 (en) System and method for distributed index searching of electronic content
CN104820717B (en) Method and system for storing and managing massive numbers of small files
CN101399688A (en) Publishing method and device for distributed region lookup zone
EP2629212A1 (en) Method for storing and searching tagged content items in a distributed system
Castano et al. Helios: a general framework for ontology-based knowledge sharing and evolution in P2P systems
CN101739398A (en) Distributed database multi-join query optimization algorithm
US20130166654A1 (en) Method and Arrangement in a Peer-to-Peer Network
CN108282525B (en) Video resource management system and method based on peer-to-peer network
CN102891872B (en) Method and system for data storage and query in a peer-to-peer network
CN101635741B (en) Method and system for querying resources in a distributed network
CN101521655B (en) Method for searching and releasing information, system and synonymy node clustering method and device therefor
US20120317275A1 (en) Methods and devices for node distribution
KR20090094313A (en) Method and system for publishing the content, method and system for querying the content
Bender et al. Bookmark-driven Query Routing in Peer-to-Peer Web Search.
US20090049179A1 (en) Establishing of a semantic multilayer network
CN102760137A (en) Distributed full-text search method and distributed full-text search system
Löser et al. Semantic methods for p2p query routing
Löser et al. On Ranking Peers in Semantic Overlay Networks.
Amer-Yahia et al. Interactive exploration of composite items
Sangpachatanaruk et al. Semantic driven hashing (sdh): an ontology-based search scheme for the semantic aware network (sa net)
Obidallah et al. A Taxonomy to Characterize Web Service Discovery Approaches, Looking at Five Perspectives
KR100913441B1 (en) Method for Searching Semantic Resource Using Semantic Space Mapping of Resource
Zhang et al. FR-Index: A multi-dimensional indexing framework for switch-centric data centers

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20111116

Termination date: 20190228