CN102136975B - Large-scale network environment-oriented similarity network construction method - Google Patents
- Publication number
- CN102136975B (application CN201110044235.7A)
- Authority
- CN
- China
- Prior art keywords
- network
- community
- layer
- topic
- resource
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to a method for constructing a similarity network in a large-scale network resource environment. It aims to enrich the semantics of the links among resources, form a semantic virtual layer based on similarity relations, and improve the low efficiency and low accuracy of conventional network resource management and network services. The method comprises the following steps: first, using a divide-and-conquer strategy, the resources are coarsely divided into several smaller resource blocks, and each block is then processed in finer detail, which reduces the complexity of processing the large-scale resources directly; second, a three-layer network model comprising a resource layer, a topic layer and a community layer is constructed to ensure the connectivity of the constructed network; third, feedback mechanisms are built at the topic layer and the community layer respectively, so that more resources are merged into the constructed network; and finally, a human-computer interaction mechanism is added so that users can correct the links of the machine-built similarity network and make them more consistent with human thinking.
Description
Technical field
The present invention relates to a method for constructing a similarity network in a large-scale network resource environment, and more specifically to a method for establishing similarity links among massive text resources so as to form a similarity network.
Background technology
A similarity network is a semantic layer that organizes network resources by similarity relations. It aims to support semantically rich network services such as intelligent browsing and intelligent search. Conventional methods for constructing a similarity network, however, are impractical in a large-scale network resource environment, for two reasons. The first is high time complexity: if similarity is computed directly for every pair of resources, the running time grows quadratically with the number of resources. The second is the poor quality of the resulting links: a low similarity threshold preserves the connectivity of the constructed network but lowers its accuracy, whereas a high threshold preserves accuracy but reduces connectivity. The present application combines a series of strategies and techniques to solve the organization and management problems raised by massive resources.
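The quadratic growth mentioned above can be made concrete with a small count of pairwise comparisons. The sketch below is illustrative only: the function names and the assumption that resources are split into equal-sized blocks are not from the patent.

```python
def pairwise_comparisons(n):
    """Number of unordered pairs among n resources: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def blocked_comparisons(n, blocks):
    """Pairs compared when n resources are split into equal-sized blocks and
    similarity is computed only inside each block.
    Assumption for illustration: `blocks` divides n evenly."""
    size = n // blocks
    return blocks * pairwise_comparisons(size)


if __name__ == "__main__":
    n = 1_000_000
    print("direct:", pairwise_comparisons(n))
    print("1000 blocks:", blocked_comparisons(n, 1_000))
```

For one million resources, direct comparison needs 499,999,500,000 pairs, while 1,000 blocks of 1,000 resources need 499,500,000 pairs in total, roughly a thousandfold reduction before any cost of forming the blocks is counted.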
Summary of the invention
The object of the invention is to address the management difficulty caused by the massive amount of information on the network, as well as the inefficiency and inaccuracy of the network services provided on top of it, by providing a similarity network construction method for large-scale network environments that links text resources having similarity relations together to form a semantic virtual layer based on similarity.
To achieve the above object, the concept of the invention is as follows:
First, a divide-and-conquer strategy is used: the resources are first coarsely divided into several smaller resource blocks, and each block is then processed in finer detail, which reduces the complexity of processing the large-scale resources directly;
Second, a three-layer network model comprising a resource layer, a topic layer and a community layer is built to ensure the connectivity of the constructed network;
Third, feedback mechanisms are built at the topic layer and the community layer respectively, so that more resources are incorporated into the constructed network;
Finally, a human-computer interaction mechanism is added so that users can revise the links of the machine-built similarity network and make the links more consistent with human thinking.
According to the above inventive concept, the present invention adopts the following technical solution:
(1) based on the idea of the acquaintance immunization algorithm, adaptively mine the centers of potential similarity communities in the network;
(2) using the adaptively obtained community centers, mine the coarse similarity communities in the large-scale network resources by Boolean operations;
(3) based on matrix reasoning, mine the topic centers within the similarity communities;
(4) according to the obtained topic centers, form topics with a modified k-means algorithm;
(5) build a similarity network with a three-layer network structure comprising a community layer, a topic layer and a resource layer;
(6) use a human-computer interaction mechanism to adjust the links in the established similarity network.
Compared with existing methods for constructing semantic link networks, the present invention has the following significant advantages. It targets large-scale network resources and adopts an overall divide-and-conquer strategy, which reduces the time complexity of direct construction. When similarity communities are mined, the selection of community centers is inspired by the idea of the immunization algorithm, so the mined community centers are more accurate than randomly chosen ones; in addition, a Boolean-operation algorithm is designed to form the similarity communities, which avoids the heavy time cost of numerical computation. When different topics are produced within a coarse similarity community, a topic center is not a single text resource but a frequent pattern of keywords produced by matrix reasoning, which is more accurate, has stronger representative power, and is generated adaptively. Furthermore, topics are formed following the idea of the k-means algorithm, with a similarity threshold added so that text resources of low similarity are not grouped into a topic, which would reduce the accuracy of the constructed network. Finally, the constructed similarity network has a three-layer structure in which different layers manage knowledge at different levels, which matches the hierarchical structure of knowledge and increases the number of network links.
A preferred embodiment of the present invention is described in detail as follows:
The specific implementation steps of this similarity network construction method for large-scale network environments are as follows:
(1) Based on the idea of the acquaintance immunization algorithm, adaptively mine the centers of potential similarity communities in the network. A text is selected at random from the resource collection, several texts similar to it are found, and the main content of these texts is extracted as the center of a potential similarity community.
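Step (1) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: texts are modeled as keyword sets, similarity is taken to be the Jaccard coefficient, and the "main content" of the similar texts is taken to be the keywords that occur in at least a given share of them. The names `jaccard`, `mine_community_center`, `sim_threshold` and `min_share` are hypothetical.

```python
import random
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity of two keyword sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mine_community_center(texts, sim_threshold=0.3, min_share=0.5, rng=None):
    """Pick a random text, gather the texts similar to it, and return the
    keywords shared by at least `min_share` of them as the candidate center."""
    rng = rng or random.Random()
    seed = rng.choice(texts)
    # The seed itself is included, since its similarity to itself is 1.0.
    neighbours = [t for t in texts if jaccard(seed, t) >= sim_threshold]
    counts = Counter(k for t in neighbours for k in t)
    needed = min_share * len(neighbours)
    return frozenset(k for k, c in counts.items() if c >= needed)


if __name__ == "__main__":
    corpus = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "b", "e"}]
    print(sorted(mine_community_center(corpus, rng=random.Random(0))))
```

The center is derived from the group of similar texts rather than from the randomly chosen text alone, which reflects the reasoning that a group's shared content is a more likely community center than any single random text.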
(2) community center obtaining according to self adaptation, based on Boolean calculation, excavates the coarse similar community in large-scale network resource.The similar community center forming according to previous step, does logical “and” operation to similar community center one by one the Internet resources of magnanimity, and satisfactory resource division, in the similar community of correspondence, is formed to several coarse similar communities.
(3) Based on matrix reasoning, mine the topic centers within each similarity community. Each coarse similarity community is represented as a matrix, and matrix reasoning is used to mine multi-item frequent patterns, from which the different topic centers are formed adaptively. The matrix is defined as follows: each row represents one text resource and each column corresponds to one keyword of the texts; if a text resource contains a keyword, the corresponding entry is set to 1, and otherwise it is set to 0. A new matrix operation is also defined: every row of the former matrix is combined with every column of the latter matrix by a logical AND operation, and the results that satisfy the requirement are recorded in the result matrix as the basis for the next round of keyword frequent-pattern mining. Let the former matrix be frontier, the latter matrix be later, and the result matrix be product; the computation is then defined as:
[formula not preserved in the source text]
(m and n denote the numbers of rows and columns of the matrix frontier; n and p denote the numbers of rows and columns of the matrix later)
This matrix operation is used to mine multi-item frequent itemsets, which form the potential topic centers. The formula for mining k-item frequent patterns is defined as:
[formula not preserved in the source text]
where the matrix D is the matrix representation of a coarse similarity community, D^T is the transpose of D, and the result matrix records the k-item frequent sets. Finally, the mined frequent itemsets serve as the candidate set of potential topic centers.
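Because the original formulas are not available here, the following is only one possible reading of the row-by-column logical AND operation, not the patent's definition. It assumes that each entry of the result counts the positions where a row of the former matrix and a column of the latter are both 1, and that an itemset "satisfies the requirement" when this count reaches a minimum support. The helper names and `min_support` are hypothetical.

```python
def transpose(matrix):
    """Transpose a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


def and_product(frontier, later):
    """product[i][j] = number of positions t where frontier[i][t] AND
    later[t][j] are both 1 (frontier is m x n, later is n x p, both non-empty)."""
    n, p = len(later), len(later[0])
    return [[sum(row[t] & later[t][j] for t in range(n)) for j in range(p)]
            for row in frontier]


def frequent_patterns(D, k, min_support):
    """Keyword index tuples of size k contained in at least `min_support`
    texts. D[i][j] == 1 when text i contains keyword j."""
    columns = transpose(D)  # columns[j] lists which texts contain keyword j
    level = {(j,): col for j, col in enumerate(columns) if sum(col) >= min_support}
    for _ in range(k - 1):
        extended = {}
        for items, rows in level.items():
            # Extend only with higher keyword indices to avoid duplicates.
            for j in range(items[-1] + 1, len(columns)):
                both = [a & b for a, b in zip(rows, columns[j])]
                if sum(both) >= min_support:
                    extended[items + (j,)] = both
        level = extended
    return {items: sum(rows) for items, rows in level.items()}
```

With this reading, `and_product(transpose(D), D)` gives the number of texts shared by every keyword pair, so its off-diagonal entries that reach `min_support` correspond to the 2-item frequent patterns returned by `frequent_patterns(D, 2, min_support)`.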
(4) According to the obtained topic centers, form topics with a modified k-means algorithm. A similarity threshold is set and the k-means method is used to produce topics within each community, which ensures that the formed topics contain no text resource whose similarity is below the threshold.
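The thresholded variant of k-means in step (4) might look like the sketch below. The details are assumptions made for illustration: texts and centers are keyword sets, similarity is the Jaccard coefficient, and a center is updated to the keywords held by at least half of its members. The names `form_topics`, `threshold` and `max_iter` are hypothetical.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity of two keyword sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def form_topics(texts, centers, threshold=0.3, max_iter=10):
    """Assign each text to its most similar center, but only when the
    similarity reaches `threshold`; texts below it stay unassigned."""
    centers = [set(c) for c in centers]
    assignment = {}
    for _ in range(max_iter):
        new_assignment = {}
        for tid, kws in texts.items():
            sims = [jaccard(kws, c) for c in centers]
            best = max(range(len(centers)), key=sims.__getitem__)
            if sims[best] >= threshold:
                new_assignment[tid] = best
        if new_assignment == assignment:
            break  # assignments are stable
        assignment = new_assignment
        # Assumed update rule: keep keywords held by at least half the members.
        for idx in range(len(centers)):
            members = [texts[t] for t, c in assignment.items() if c == idx]
            if members:
                counts = Counter(k for m in members for k in m)
                centers[idx] = {k for k, n in counts.items() if 2 * n >= len(members)}
    topics = {idx: [] for idx in range(len(centers))}
    for tid, idx in assignment.items():
        topics[idx].append(tid)
    unassigned = [t for t in texts if t not in assignment]
    return topics, unassigned
```

The difference from plain k-means is the threshold test: a text that is not similar enough to any center is left out of every topic instead of being forced into the nearest one.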
(5) Build a similarity network with a three-layer network structure comprising a community layer, a topic layer and a resource layer. The community layer links all community centers; the topic layer links the cores of all topics within one community; the resource layer manages the network resources within one topic.
(6) Use a human-computer interaction mechanism to adjust the links in the established similarity network.
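Steps (5) and (6) can be sketched as a small data structure plus an adjustment function. This is a hedged illustration: the dictionary layout, the choice to link every pair of centers within a layer, and the names `build_three_layer_network` and `adjust_links` are assumptions, not details given in the patent.

```python
from itertools import combinations


def build_three_layer_network(communities):
    """communities: {community_id: {topic_id: [resource_id, ...]}}.
    Assumption: within a layer, every pair of centers is linked."""
    community_links = list(combinations(sorted(communities), 2))
    topic_links = {cid: list(combinations(sorted(topics), 2))
                   for cid, topics in communities.items()}
    resource_layer = {(cid, tid): list(res)
                      for cid, topics in communities.items()
                      for tid, res in topics.items()}
    return {"community_layer": community_links,
            "topic_layer": topic_links,
            "resource_layer": resource_layer}


def adjust_links(links, add=(), remove=()):
    """Apply user corrections: drop the links in `remove`, then append the
    links in `add`. Links are undirected, so pairs are stored sorted."""
    def norm(pair):
        return tuple(sorted(pair))

    removed = {norm(p) for p in remove}
    result = [norm(p) for p in links if norm(p) not in removed]
    for p in add:
        if norm(p) not in result:
            result.append(norm(p))
    return result
```

A user correction then amounts to a call such as `adjust_links(net["topic_layer"]["c1"], add=[("t1", "t3")], remove=[("t1", "t2")])`, which returns a new link list and leaves the machine-built one untouched.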
Claims (3)
1. A similarity network construction method for large-scale network environments, characterized in that the operating steps are as follows:
(1) based on the idea of the acquaintance immunization algorithm, adaptively mine the centers of potential similarity communities in the network;
(2) using the adaptively obtained community centers, mine the coarse similarity communities in the large-scale network resources by Boolean operations;
(3) based on matrix reasoning, mine the topic centers within the similarity communities;
(4) according to the obtained topic centers, form topics with a modified k-means algorithm; a similarity threshold is set and the k-means method is used to form topics within each community, which ensures that the formed topics contain no text resource below the similarity threshold;
(5) build a similarity network with a three-layer network structure comprising a community layer, a topic layer and a resource layer;
(6) use a human-computer interaction mechanism to adjust the links in the established similarity network;
wherein the topic centers in step (3) are obtained by matrix-based reasoning, for which a new matrix operation is defined: every row of the former matrix is combined with every column of the latter matrix by a logical operation, the results that satisfy the requirement are recorded in the result matrix as the basis for the next round of keyword frequent-pattern mining, and the frequent patterns finally obtained with the matrix operation serve as the candidate set of potential topic centers.
2. The similarity network construction method for large-scale network environments according to claim 1, characterized in that the similarity community centers in step (1) are formed under the inspiration of the idea of the immunization algorithm: the probability that a randomly selected text resource becomes the center of a potential similarity community is much smaller than the probability that the main content of several texts similar to that randomly selected text becomes such a center; therefore, a text is selected at random from the resource collection, several texts similar to it are found, and the main content of these texts is extracted as the center of a potential similarity community.
3. The similarity network construction method for large-scale network environments according to claim 1, characterized in that the similarity network built in step (5) is designed with a three-layer network structure comprising a community layer, a topic layer and a resource layer, wherein the community layer links all community centers, the topic layer links the cores of all topics within one community, and the resource layer manages the network resources within one topic.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201110044235.7A CN102136975B (en) | 2011-02-24 | 2011-02-24 | Large-scale network environment-oriented similarity network construction method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201110044235.7A CN102136975B (en) | 2011-02-24 | 2011-02-24 | Large-scale network environment-oriented similarity network construction method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102136975A (en) | 2011-07-27 |
CN102136975B (en) | 2014-04-02 |
Family
ID=44296635
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201110044235.7A (Expired - Fee Related; CN102136975B (en)) | Large-scale network environment-oriented similarity network construction method | 2011-02-24 | 2011-02-24 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102136975B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2618274A1 (en) * | 2012-01-18 | 2013-07-24 | Alcatel Lucent | Method for providing a set of services of a first subset of a social network to a user of a second subset of said social network |
EP2741249A1 (en) * | 2012-12-04 | 2014-06-11 | Alcatel Lucent | Method and device for optimizing information diffusion between communities linked by interaction similarities |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101369921A (en) * | 2008-09-12 | 2009-02-18 | 中国科学技术大学 | Self-similar network service generation method |
CN101571853A (en) * | 2009-05-22 | 2009-11-04 | 哈尔滨工程大学 | Evolution analysis device and method for contents of network topics |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7035838B2 (en) * | 2002-12-06 | 2006-04-25 | General Electric Company | Methods and systems for organizing information stored within a computer network-based system |
Also Published As
Publication number | Publication date |
---|---|
CN102136975A (en) | 2011-07-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104281617A (en) | Domain knowledge-based multilayer association rules mining method and system | |
CN102768670B (en) | Webpage clustering method based on node property label propagation | |
CN102609528B (en) | Frequent mode association sorting method based on probabilistic graphical model | |
CN102073700A (en) | Discovery method of complex network community | |
CN107819756B (en) | Method for improving mining income | |
CN104361036A (en) | Association rule mining method for alarm event | |
CN106484754A (en) | Based on hierarchical data and the knowledge forest layout method of diagram data visualization technique | |
CN103150163A (en) | Map/Reduce mode-based parallel relating method | |
CN105404637A (en) | Data mining method and device | |
Fadaei et al. | Enhanced K-means re-clustering over dynamic networks | |
Zhai et al. | A two-layer algorithm based on PSO for solving unit commitment problem | |
CN102136975B (en) | Large-scale network environment-oriented similarity network construction method | |
CN106844934B (en) | Smart city planning and designing expert system and smart city planning and designing method | |
Le et al. | A novel algorithm for mining high utility itemsets | |
CN103577899B (en) | A kind of service combining method combined with QoS based on creditability forceast | |
CN104834709A (en) | Parallel cosine mode mining method based on load balancing | |
Seol et al. | Reduction of association rules for big data sets in socially-aware computing | |
CN105678382B (en) | A kind of concept lattice merging method and system based on sub- Formal Context attributes similarity | |
Hong et al. | The study of improved FP-growth algorithm in MapReduce | |
Hou et al. | Simulating the dynamics of urban land quantity in China from 2020 to 2070 under the Shared Socioeconomic Pathways | |
Zhou et al. | Identifying technology evolution pathways by integrating citation network and text mining | |
CN109829056A (en) | Predicate explains the fact that template-driven Abductive reasoning method | |
CN104268270A (en) | Map Reduce based method for mining triangles in massive social network data | |
CN106156259A (en) | A kind of user behavior information displaying method and system | |
CN104036024A (en) | Spatial clustering method based on GACUC (greedy agglomerate category utility clustering) and Delaunay triangulation network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20140402; Termination date: 20170224 |