CN103605708B - Method and system for inferring a keyword from its keyword hash value in KAD networks - Google Patents


Info

Publication number
CN103605708B
CN103605708B CN201310556473.5A CN201310556473A CN103605708B CN 103605708 B CN103605708 B CN 103605708B CN 201310556473 A CN201310556473 A CN 201310556473A CN 103605708 B CN103605708 B CN 103605708B
Authority
CN
China
Prior art keywords
keyword
resource file
candidate keywords
related resource
filename
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310556473.5A
Other languages
Chinese (zh)
Other versions
CN103605708A (en)
Inventor
程学旗
冯凯
孙庆
刘备
席鹏弼
王元卓
刘悦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS
Priority to CN201310556473.5A
Publication of CN103605708A
Application granted
Publication of CN103605708B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G06F16/1834 Distributed file systems implemented based on peer-to-peer networks, e.g. gnutella

Abstract

The present invention provides a method and system for inferring a keyword from its keyword hash value in a KAD network. The method uses the keyword hash value to search the KAD network for related resource files and obtains their file names. It then performs word segmentation on those file names, yielding candidate keywords together with the number of times each appears in the file names of the related resource files. The method further infers the keyword from these occurrence counts. The invention can accurately recover the keyword information corresponding to a keyword hash value in a KAD network, which facilitates supervision of KAD networks and improves network security.

Description

Method and system for inferring a keyword from its keyword hash value in KAD networks
Technical field
The present invention relates to peer-to-peer network technology, and in particular to a method and system for inferring a keyword from its keyword hash value in a KAD network.
Background
With the rapid development of P2P technology in recent years, P2P traffic has come to account for 48% to 80% of Internet traffic. P2P architectures have also kept evolving, from the original unstructured peer-to-peer networks to today's structured P2P networks. The distributed hash table (DHT) is the main technique for implementing structured P2P networks. The Kademlia (KAD) protocol is one DHT implementation, the result of research published in 2002 by Petar Maymounkov and David Mazières of New York University. By using the exclusive-or (XOR) operation as the basis of its distance metric, it establishes a new DHT topology that greatly improves routing lookup speed compared with other algorithms.
With the spread of eMule and its support for the Kademlia protocol, KAD has become a widely deployed and widely used DHT network, and the resources it shares and transfers (including keyword resources and file resources) number in the hundreds of millions. In a KAD network, a node ID is a 128-bit binary string. The node's ID is usually generated with the MD4 hash function when the node starts for the first time; the randomness of the hash ensures that node IDs are uniformly distributed and collision-free. Among KAD network resources, keyword resources are used to index file information. Their data format in the KAD network is <key, value>, where key is the ID value (that is, the keyword hash value) that a hash function generates for a keyword obtained by splitting a file name according to the word segmentation rules, and value contains a range of information about the file resources whose names contain that keyword, such as file name, file size, and file ID. When searching for a keyword resource, KAD first computes the keyword's hash value with the MD4 hash function and then performs an iterative lookup.
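As a rough illustration of the <key, value> format described above, the following Python sketch models a keyword resource in memory. The names `keyword_id`, `FileInfo`, `KeywordResource`, and `publish` are hypothetical, and the hash is a stand-in: KAD uses MD4, which may not be available in every Python build, so the sketch truncates SHA-256 to 128 bits instead.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List


def keyword_id(keyword: str) -> int:
    """Return a 128-bit ID for a keyword.

    Stand-in only: KAD uses MD4; this sketch truncates SHA-256 to 16 bytes.
    """
    digest = hashlib.sha256(keyword.encode("utf-8")).digest()[:16]
    return int.from_bytes(digest, "big")


@dataclass
class FileInfo:  # hypothetical record for one file described in a value
    file_id: str
    file_name: str
    file_size: int


@dataclass
class KeywordResource:  # hypothetical <key, value> pair
    key: int
    value: List[FileInfo] = field(default_factory=list)


def publish(index: Dict[int, KeywordResource], keyword: str, info: FileInfo) -> None:
    """Index a file under the hash of one keyword taken from its name."""
    key = keyword_id(keyword)
    index.setdefault(key, KeywordResource(key)).value.append(info)


index: Dict[int, KeywordResource] = {}
publish(index, "future", FileInfo("f1", "star-future.mp4", 1024))
publish(index, "future", FileInfo("f2", "future(live).avi", 2048))

# A lookup by hash returns file metadata but never the keyword itself.
hit = index[keyword_id("future")]
print([f.file_name for f in hit.value])
```

The point of the sketch is that an observer who only sees the key learns nothing directly about the keyword, while the value still exposes file names that contain it.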
MD4, the algorithm used to derive the ID value from a keyword, is a one-way (non-invertible) algorithm, so the ability to infer keyword information from a keyword hash value is of considerable significance. For example, popular terms can be inferred from keyword hash values observed in the KAD network, and further action can then be taken based on those popular terms for the purpose of security control. However, no existing research result achieves this goal.
Summary of the invention
According to one embodiment of the present invention, a method for inferring a keyword from its keyword hash value in a KAD network is provided, the method comprising:
Step 1): using the keyword hash value to search the KAD network for related resource files, and obtaining the file names of the related resource files;
Step 2): performing word segmentation on the file names of the related resource files to obtain candidate keywords and the number of times each appears in the file names of the related resource files;
Step 3): inferring the keyword from the number of times the candidate keywords appear in the file names of the related resource files.
In one embodiment, step 3) comprises:
selecting the candidate keywords whose number of occurrences in the file names of the related resource files is greater than or equal to a predetermined threshold;
when more than one candidate keyword is selected, computing the hash of each selected candidate keyword and taking as the inference result the candidate keyword whose hash equals the keyword hash value;
when exactly one candidate keyword is selected, taking that candidate keyword as the inference result;
otherwise, determining that the inference result is empty.
In a further embodiment, the predetermined threshold is the number of related resource files.
In one embodiment, step 2) comprises:
finding the delimiters in the file names of the related resource files and taking the words separated by the delimiters as candidate keywords; and
counting the number of times each candidate keyword appears in the file names of the related resource files.
In a further embodiment, the delimiters include punctuation marks.
According to one embodiment of the present invention, a system for inferring a keyword from its keyword hash value in a KAD network is also provided, the system comprising:
a search module, configured to use the keyword hash value to search the KAD network for related resource files and obtain the file names of the related resource files;
an inference module, configured to perform word segmentation on the file names of the related resource files to obtain candidate keywords and the number of times each appears in the file names of the related resource files, and to infer the keyword from the number of times the candidate keywords appear in the file names of the related resource files.
In one embodiment, the inference module is configured to select the candidate keywords whose number of occurrences in the file names of the related resource files is greater than or equal to a predetermined threshold;
when more than one candidate keyword is selected, to compute the hash of each selected candidate keyword and take as the inference result the candidate keyword whose hash equals the keyword hash value;
when exactly one candidate keyword is selected, to take that candidate keyword as the inference result;
otherwise, to determine that the inference result is empty.
In one embodiment, the inference module is configured to find the delimiters in the file names of the related resource files, take the words separated by the delimiters as candidate keywords, and count the number of times each candidate keyword appears in the file names of the related resource files.
The present invention can accurately recover the keyword information corresponding to a keyword hash value in a KAD network, which facilitates supervision of the network and improves network security.
Brief description of the drawings
Fig. 1 is a flowchart of a method for inferring a keyword from a keyword hash value according to one embodiment of the invention;
Fig. 2 is a flowchart of a method for searching the KAD network for related resource files using a keyword hash value according to one embodiment of the invention;
Fig. 3 is a flowchart of a method for segmenting resource file names according to one embodiment of the invention;
Figs. 4A and 4B are, respectively, a schematic diagram of keyword hash values and a schematic diagram of the keywords inferred from them according to one embodiment of the invention.
Detailed description
The present invention is described below with reference to the accompanying drawings and specific embodiments.
According to one embodiment of the present invention, a method for inferring a keyword from its keyword hash value in a KAD network is provided, as shown in Fig. 1. In brief, the method uses a keyword hash value to search the KAD network for related resource files and obtain a list of resource file names, and then infers the target keyword from that list.
Fig. 2 shows an embodiment of obtaining, from the keyword hash value, the resource file name list of the related resource files in the KAD network. The steps are as follows:
Step S101: obtain a keyword hash value, for example by monitoring the KAD network or by manual input.
Step S102: use the keyword hash value to search the KAD network for related resource files.
Here, a resource file is any type of file in the KAD network, including text files, video files, audio files, compressed archives, and so on. A related resource file is a resource file found by searching the KAD network whose file name contains the keyword corresponding to the keyword hash value.
In one embodiment, this step can use the resource search function of the KAD protocol to find related resource files, which retrieves as many resources related to the target keyword as possible. In a KAD network, a keyword resource is stored on the nodes closest to the keyword, where distance is measured between the keyword hash value and the node ID using KAD's XOR metric. The internal retrieval process is therefore as follows: first, iteratively search the KAD network for the nodes closest to the keyword hash value; once these nodes are found, send them a keyword resource search request; on receiving the request, these nodes return the keyword resources they store to the requester.
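To make the notion of "closest nodes" concrete, here is a minimal Python sketch of the XOR distance used by Kademlia and of picking the nearest node IDs to a keyword hash value. It uses 4-bit IDs for readability (real KAD IDs are 128 bits), and the function names and the parameter `k` are illustrative assumptions; the real lookup is iterative over the network rather than a sort over a known list.

```python
from typing import Iterable, List


def xor_distance(a: int, b: int) -> int:
    """Kademlia distance between two IDs: their bitwise XOR, read as an integer."""
    return a ^ b


def closest_nodes(target: int, node_ids: Iterable[int], k: int = 3) -> List[int]:
    """Return the k node IDs with the smallest XOR distance to the target."""
    return sorted(node_ids, key=lambda n: xor_distance(n, target))[:k]


# Toy example with 4-bit IDs.
nodes = [0b0001, 0b0100, 0b0111, 0b1000]
keyword_hash = 0b0110

nearest = closest_nodes(keyword_hash, nodes, k=2)
print([format(n, "04b") for n in nearest])  # ['0111', '0100']
```

In this toy case the distances are 7, 2, 1, and 14, so the keyword resource would be requested from nodes 0111 and 0100.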
There may be one related resource file or several, and it is also possible that no related resource file is found. The case of no related resource files is not considered further here; it can simply be treated as the target keyword not being found.
Step S103: obtain the resource file name list of the related resource files found.
Once the related file resources have been found, for example once the information in the keyword resource has been obtained, the resource file names can be taken from the value of the keyword resource (as described above, value includes the file name, file size, file ID, and so on) to build the resource file name list. In one embodiment, each entry of the resource file name list may include elements such as the file ID and the file name.
Because the previous step may find one or more related resource files, the resource file name list may correspondingly contain a single entry or multiple entries.
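Step S103 can be sketched as follows, assuming each node's reply is a list of dictionaries with `file_id`, `file_name`, and `file_size` fields (a hypothetical wire format, not the actual KAD message layout). Dropping repeated file IDs is also an assumption of this sketch, made because several nodes may return the same file.

```python
from typing import Dict, Iterable, List, Tuple


def build_file_name_list(
    responses: Iterable[List[Dict[str, object]]],
) -> List[Tuple[str, str]]:
    """Collect (file_id, file_name) entries from the values returned by nodes."""
    seen = set()
    entries: List[Tuple[str, str]] = []
    for value in responses:          # one value per replying node
        for info in value:           # one record per related resource file
            file_id = str(info["file_id"])
            if file_id in seen:      # assumption: count each file only once
                continue
            seen.add(file_id)
            entries.append((file_id, str(info["file_name"])))
    return entries


responses = [
    [{"file_id": "a1", "file_name": "Star-Future.mp4", "file_size": 10}],
    [
        {"file_id": "a1", "file_name": "Star-Future.mp4", "file_size": 10},
        {"file_id": "b2", "file_name": "Future_Star.avi", "file_size": 20},
    ],
]
entries = build_file_name_list(responses)
print(entries)
```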
Fig. 3 shows one embodiment of inferring the target keyword from the resource file name list, comprising the following steps:
Step S201: obtain the resource file name list.
Step S202: segment all file names in the resource file name list to obtain candidate keywords.
In one embodiment, segmentation can be based on the delimiters in a file name: find the delimiters in the file name and take the words they separate. Delimiters can be punctuation marks; that is, in a file name the characters "( ) [ ] { } < > , . _ - ! : ; /" act as delimiters. For example, given a file named "China-Star of the Future(Talent Show)Episode 1.mp4", the characters '-', '(' and ')' are delimiters, and the file name can be split into four candidate keywords: "China", "Star of the Future", "Talent Show", and "Episode 1". The file extension is not considered here.
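The delimiter-based segmentation of step S202 can be sketched in a few lines of Python. The delimiter set is the one listed above; treating whitespace as part of a word (so that "Star of the Future" stays one candidate) follows the worked example and is an assumption of this sketch, as is the function name `tokenize`.

```python
import os
import re
from typing import List

# Delimiter characters listed in the description; whitespace is deliberately
# not a delimiter here, matching the worked example.
DELIMITERS = "()[]{}<>,._-!:;/"
_SPLIT = re.compile("[" + re.escape(DELIMITERS) + "]+")


def tokenize(file_name: str) -> List[str]:
    """Split a file name into candidate keywords, ignoring the extension."""
    stem, _extension = os.path.splitext(file_name)
    words = (w.strip() for w in _SPLIT.split(stem))
    return [w for w in words if w]  # drop empty pieces between delimiters


print(tokenize("China-Star of the Future(Talent Show)Episode 1.mp4"))
# ['China', 'Star of the Future', 'Talent Show', 'Episode 1']
```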
Step S203: build a candidate keyword list from the candidate keywords obtained.
In one embodiment, the candidate keyword list can be represented as <candidate keyword, count>, where the candidate keyword is a word obtained by the segmentation in the previous step and the count is the number of times that candidate keyword appears in the whole resource file name list.
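A <candidate keyword, count> list maps naturally onto `collections.Counter`. The sketch below counts each candidate at most once per file name, which is an assumption (the description only speaks of the number of occurrences in the list) chosen so that a count equal to the number of files means "appears in every file name". The helper name `candidate_counts` and the sample file names are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable

# Same delimiter set as in the description, written as a character class.
_SPLIT = re.compile(r"[()\[\]{}<>,._\-!:;/]+")


def candidate_counts(file_names: Iterable[str]) -> Counter:
    """Build the <candidate keyword, count> list for a set of file names."""
    counts: Counter = Counter()
    for name in file_names:
        stem = name.rsplit(".", 1)[0]  # drop the extension, if any
        # A set, so a word repeated inside one name is counted once.
        words = {w.strip() for w in _SPLIT.split(stem) if w.strip()}
        counts.update(words)
    return counts


names = [
    "China-Star of the Future(Talent Show)Episode 1.mp4",
    "Star of the Future_Episode 2.avi",
    "Star of the Future-finale.mkv",
]
counts = candidate_counts(names)
print(counts.most_common(1))  # [('Star of the Future', 3)]
```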
Step S204: infer the target keyword corresponding to the keyword hash value from a predetermined threshold and the count of each candidate keyword in the candidate keyword list.
In one embodiment, the predetermined threshold can be set empirically. In one embodiment, the predetermined threshold is the number of files in the resource file name list (that is, the number of related resource files), and inferring the target keyword falls into the following three cases:
When exactly one candidate keyword in the candidate keyword list has a count greater than or equal to the number of files in the resource file name list, that candidate keyword is the target keyword;
When several candidate keywords have counts greater than or equal to the number of files in the resource file name list, compute the hash value of each of these candidate keywords and compare it with the keyword hash value obtained in step S101; a candidate whose hash value matches is the target keyword. If none matches, the target keyword corresponding to this keyword hash value has not been found.
When every candidate keyword has a count smaller than the number of files in the resource file name list, the target keyword corresponding to the keyword hash value has not been found.
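The three cases of step S204 can be sketched as a single function. The hash function is passed in as a parameter; the demo uses a truncated SHA-256 as a stand-in because MD4, which KAD actually uses, may not be available in every Python build. The names `infer_keyword` and `stand_in_hash` are hypothetical.

```python
import hashlib
from typing import Callable, Dict, Optional


def stand_in_hash(word: str) -> bytes:
    """128-bit stand-in for KAD's MD4 keyword hash (NOT the real algorithm)."""
    return hashlib.sha256(word.encode("utf-8")).digest()[:16]


def infer_keyword(
    counts: Dict[str, int],
    threshold: int,
    target_hash: bytes,
    hash_fn: Callable[[str], bytes],
) -> Optional[str]:
    """Apply the three cases: one candidate, several candidates, or none."""
    selected = [w for w, n in counts.items() if n >= threshold]
    if len(selected) == 1:
        return selected[0]            # case 1: the only candidate is the result
    for word in selected:             # case 2: disambiguate by hashing
        if hash_fn(word) == target_hash:
            return word
    return None                       # case 3, or no hash matched


target = stand_in_hash("future")
print(infer_keyword({"star": 3, "future": 3, "china": 1}, 3, target, stand_in_hash))  # future
print(infer_keyword({"star": 3, "china": 1}, 3, target, stand_in_hash))               # star
print(infer_keyword({"star": 2, "china": 1}, 3, target, stand_in_hash))               # None
```

Note that, as in the description, the single-candidate case returns the candidate without hashing it; a stricter variant could verify the hash in that case as well.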
According to one embodiment of the present invention, a system for inferring a keyword from its keyword hash value in a KAD network is also provided, comprising a search module and an inference module.
The search module is configured to use the keyword hash value to search the KAD network for related resource files and obtain the file names of the related resource files. The inference module is configured to perform word segmentation on the file names of the related resource files, obtaining candidate keywords and the number of times the candidate keywords appear in the file names of the related resource files. The inference module is further configured to infer the keyword from the number of times the candidate keywords appear in the file names of the related resource files.
When inferring the target keyword, the inference module first selects the candidate keywords whose number of occurrences in the file names of the related resource files is greater than or equal to the predetermined threshold. When more than one candidate keyword is selected, it computes the hash of each selected candidate keyword and takes as the inference result the candidate keyword whose hash equals the keyword hash value. When exactly one candidate keyword is selected, it takes that candidate keyword as the inference result. Otherwise, the target keyword corresponding to this keyword hash value has not been found, and the inference result is determined to be empty.
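The two-module structure can be sketched end to end as below. Everything here is illustrative: `SearchModule` wraps an in-memory dictionary instead of a real KAD lookup, `stand_in_hash` is a truncated SHA-256 rather than MD4, and the threshold is taken to be the number of related files, as in the embodiment above.

```python
import hashlib
import re
from collections import Counter
from typing import Callable, Dict, List, Optional


def stand_in_hash(word: str) -> bytes:
    """128-bit stand-in for KAD's MD4 keyword hash (NOT the real algorithm)."""
    return hashlib.sha256(word.encode("utf-8")).digest()[:16]


class SearchModule:
    """Looks up file names by keyword hash; a dict stands in for the KAD network."""

    def __init__(self, index: Dict[bytes, List[str]]) -> None:
        self.index = index

    def file_names(self, keyword_hash: bytes) -> List[str]:
        return list(self.index.get(keyword_hash, []))


class InferenceModule:
    """Segments file names, counts candidates, and infers the keyword."""

    _split = re.compile(r"[()\[\]{}<>,._\-!:;/]+")

    def __init__(self, hash_fn: Callable[[str], bytes]) -> None:
        self.hash_fn = hash_fn

    def infer(self, file_names: List[str], keyword_hash: bytes) -> Optional[str]:
        counts: Counter = Counter()
        for name in file_names:
            stem = name.rsplit(".", 1)[0]  # ignore the extension
            counts.update({w.strip() for w in self._split.split(stem) if w.strip()})
        threshold = len(file_names)        # threshold = number of related files
        selected = [w for w, n in counts.items() if n >= threshold]
        if len(selected) == 1:
            return selected[0]
        for word in selected:
            if self.hash_fn(word) == keyword_hash:
                return word
        return None                        # inference result is empty


h = stand_in_hash("Future")
search = SearchModule({h: ["Star-Future(ep1).mp4", "Future_Star.avi"]})
inference = InferenceModule(stand_in_hash)
result = inference.infer(search.file_names(h), h)
print(result)  # Future
```

Here both "Star" and "Future" appear in every file name, so the hash comparison is what selects "Future".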
To demonstrate the effectiveness of the provided method for inferring a keyword from its keyword hash value, the inventors carried out a number of experiments in the KAD network. Hash values were first computed for a set of preset keywords, giving the hash values shown in Fig. 4A. The corresponding target keywords were then inferred from these hash values using the method provided by the invention, with the results shown in Fig. 4B. The results in Fig. 4B agree with the preset keywords, which indicates that the method can accurately and effectively identify the keyword corresponding to a keyword hash value.
It should be noted and understood that various modifications and improvements can be made to the invention described in detail above without departing from the spirit and scope of the invention as set out in the appended claims. The scope of the claimed technical solution is therefore not limited by any specific exemplary teaching given here.

Claims (10)

1. A method for inferring a keyword from its keyword hash value in a KAD network, the method comprising:
Step 1): using the keyword hash value to search the KAD network for related resource files, and obtaining the file names of the related resource files;
Step 2): performing word segmentation on all file names of the related resource files to obtain candidate keywords and the number of times the candidate keywords appear in all file names of the related resource files; and
Step 3): inferring the keyword from the number of times the candidate keywords appear in all file names of the related resource files.
2. The method according to claim 1, wherein step 3) comprises:
selecting the candidate keywords whose number of occurrences in the file names of the related resource files is greater than or equal to a predetermined threshold;
when more than one candidate keyword is selected, computing the hash of each selected candidate keyword and taking as the inference result the candidate keyword whose hash equals the keyword hash value;
when exactly one candidate keyword is selected, taking that candidate keyword as the inference result;
otherwise, determining that the inference result is empty.
3. The method according to claim 2, wherein the predetermined threshold is the number of related resource files.
4. The method according to any one of claims 1-3, wherein step 2) comprises:
finding the delimiters in the file names of the related resource files and taking the words separated by the delimiters as candidate keywords; and
counting the number of times the candidate keywords appear in the file names of the related resource files.
5. The method according to claim 4, wherein the delimiters include punctuation marks.
6. A system for inferring a keyword from its keyword hash value in a KAD network, the system comprising:
a search module, configured to use the keyword hash value to search the KAD network for related resource files and obtain the file names of the related resource files;
an inference module, configured to perform word segmentation on all file names of the related resource files to obtain candidate keywords and the number of times the candidate keywords appear in all file names of the related resource files, and to infer the keyword from the number of times the candidate keywords appear in all file names of the related resource files.
7. The system according to claim 6, wherein the inference module is configured to select the candidate keywords whose number of occurrences in the file names of the related resource files is greater than or equal to a predetermined threshold;
when more than one candidate keyword is selected, to compute the hash of each selected candidate keyword and take as the inference result the candidate keyword whose hash equals the keyword hash value;
when exactly one candidate keyword is selected, to take that candidate keyword as the inference result;
otherwise, to determine that the inference result is empty.
8. The system according to claim 7, wherein the predetermined threshold is the number of related resource files.
9. The system according to any one of claims 6-8, wherein the inference module is configured to find the delimiters in the file names of the related resource files, take the words separated by the delimiters as candidate keywords, and count the number of times the candidate keywords appear in the file names of the related resource files.
10. The system according to claim 9, wherein the delimiters include punctuation marks.
CN201310556473.5A 2013-11-11 2013-11-11 Method and system for inferring a keyword from its keyword hash value in KAD networks Active CN103605708B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310556473.5A CN103605708B (en) 2013-11-11 2013-11-11 Method and system for inferring a keyword from its keyword hash value in KAD networks

Publications (2)

Publication Number Publication Date
CN103605708A CN103605708A (en) 2014-02-26
CN103605708B true CN103605708B (en) 2017-12-08

Family

ID=50123931

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310556473.5A Active CN103605708B (en) 2013-11-11 2013-11-11 Method and system for inferring a keyword from its keyword hash value in KAD networks

Country Status (1)

Country Link
CN (1) CN103605708B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104102748B (en) * 2014-08-08 2017-12-22 中国联合网络通信集团有限公司 File Mapping method and device and file recommendation method and device
CN110442773B (en) * 2019-08-13 2023-07-18 深圳市网心科技有限公司 Node caching method, system and device in distributed system and computer medium
CN114465933B (en) * 2022-04-13 2022-06-14 中国科学院合肥物质科学研究院 Block chain network transmission method and transmission medium based on KAD (Kad-based binary) model

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008154823A1 (en) * 2007-06-21 2008-12-24 Tencent Technology (Shenzhen) Company Limited Searching method, system and device
CN102082820A (en) * 2010-12-14 2011-06-01 西北工业大学 EMule file sharing system oriented comprehensive pollution method
CN102087666A (en) * 2011-01-30 2011-06-08 华东师范大学 Indexes based on covering relationship between nodes and key words, constructing method and query method thereof
CN103167029A (en) * 2013-03-06 2013-06-19 中国科学院计算技术研究所 Method and device for discovering specific resources on eMule network
CN103226591A (en) * 2013-04-15 2013-07-31 厦门亿联网络技术股份有限公司 Method and device for supporting quick access of multiple keywords
CN103258052A (en) * 2013-05-28 2013-08-21 中国科学院计算技术研究所 Method for discovering related resources on eMule network

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7895338B2 (en) * 2003-03-18 2011-02-22 Siemens Corporation Meta-search web service-based architecture for peer-to-peer collaboration and voice-over-IP
FR2917259B1 (en) * 2007-06-08 2009-08-21 Alcatel Lucent Sas USE OF A PREFIXED HASH TREE (PHT) FOR LOCATION OF SERVICES WITHIN A POST-TO-POST COMMUNICATION NETWORK

Also Published As

Publication number Publication date
CN103605708A (en) 2014-02-26

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant