CN102663108A - Medicine corporation finding method based on parallelization label propagation algorithm for complex network model - Google Patents

Medicine corporation finding method based on parallelization label propagation algorithm for complex network model

Info

Publication number
CN102663108A
Authority
CN
China
Prior art keywords
medicine
label
parallelization
propagation algorithm
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012101111712A
Other languages
Chinese (zh)
Other versions
CN102663108B (en)
Inventor
王崇骏
刘正
杨鸿超
孙道平
谢俊元
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University
Priority to CN2012101111712A
Publication of CN102663108A
Application granted
Publication of CN102663108B
Expired - Fee Related
Anticipated expiration


Abstract

The invention provides a method for discovering medicine communities (corporations) based on a parallelized label propagation algorithm on a complex network model. The method comprises a networking stage and a mining stage. The networking stage includes: a) preprocessing to generate a traditional Chinese medicine (TCM) data set and formatting it as initial text data; b) deploying the initial text data to a Hadoop platform; c) building the TCM network in parallel; and d) ending the networking stage. The mining stage includes: a) obtaining the TCM network text file generated in step c) of the networking stage; b) deploying the TCM network text file to the Hadoop platform; c) running the parallelized label propagation algorithm to discover medicine communities; and d) ending the mining stage. The method builds a TCM network model, uses parallelization to improve the scalability and running speed of both network construction and label propagation, effectively mines communities of medicines with similar properties from compound prescriptions, and thereby assists research on TCM compatibility rules.

Description

Medicine community (corporation) discovery method based on a parallelized label propagation algorithm on a complex network model
Technical Field
The present invention relates to a method for modeling Chinese medicines as a complex network, and to a technique that applies a parallel label propagation algorithm to this traditional Chinese medicine (TCM) network in order to mine communities of Chinese medicines.
Background Art
Data mining techniques can be used to analyze TCM compound prescription data intelligently and to find potential medicine compatibility rules. One common class of application in TCM data mining is medicine clustering. It is based on a transaction-item model (each compound prescription is treated as a transaction made up of several medicines and stored in a transaction database) and aggregates similar medicines to find groups of medicines that are frequently prescribed together. Traditional TCM clustering algorithms based on the transaction-item model have difficulty mining medicines that are only indirectly compatible, and they often ignore uncommon medicines, which hinders in-depth study of the compatibility rules of each medicine. The present invention models the Chinese medicine network with a complex network model and applies a community discovery algorithm to that network to mine groups of medicines with similar properties.
Research on community structure in complex network analysis has a long history and spans fields such as computer science, sociology, and the life sciences. Analyzing and revealing the community structure of a network is important for understanding the network's structure and analyzing its characteristics. Community discovery in a TCM complex network has much the same goal as traditional medicine clustering based on the transaction-item model: both aggregate medicines that are frequently prescribed together into the same class and mine medicines with similar properties in order to study compatibility rules.
Building a TCM complex network on a complex network model breaks with the convention that TCM data mining is always modeled on transaction items. Applying the label propagation algorithm from complex network analysis allows deeper mining of medicine communities: it finds groups of medicines that have similar properties and are prescribed together relatively frequently within a community, and it overcomes two defects of traditional transaction-item clustering, namely the inability to find indirect compatibility and the neglect of uncommon medicines.
Recently, with the rapid growth of TCM compound prescription data, non-parallel algorithms are no longer suitable for community discovery on larger TCM data sets.
Summary of the Invention
The technical problem to be solved by the invention is to model Chinese medicines as a complex network and to apply a parallel label propagation algorithm to this model so that medicine communities can be found quickly and effectively.
To solve this problem, the medicine community discovery method based on a parallelized label propagation algorithm on a complex network model comprises the following steps:
1) Networking stage:
a) preprocess to generate the Chinese medicine data set and format it as initial text data;
b) deploy the initial text data to the Hadoop platform;
c) build the Chinese medicine network, i.e. the TCM network, in parallel; the nodes of this network are medicines, and an edge connects two nodes whose SC_AB is greater than a given threshold;
d) end.
2) Mining stage:
a) obtain the TCM network text file generated by step 1)-c;
b) deploy this TCM network text file to the Hadoop platform;
c) run the parallelized label propagation algorithm, i.e. label propagation parallelized with the MapReduce framework, in which each node iteratively updates its own label (its community number) from its neighbors' information, to discover medicine communities;
d) end.
The preprocessing in step 1)-a extracts the medicine composition of every compound prescription in the TCM compound prescription data.
The deployment in step 1)-b uploads the initial text data generated in step 1)-a to the distributed file system (HDFS) of the Hadoop platform.
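As a rough illustration of the preprocessing in step 1)-a, the sketch below turns raw ingredient strings into one text line per compound prescription, ready to be uploaded. The function name `format_prescriptions`, the set of delimiters, and the assumption that each medicine name is a single token are all made up for this example; the source does not specify the text format.

```python
import re

def format_prescriptions(records):
    """Turn raw ingredient strings into one normalized text line per prescription."""
    lines = []
    for raw in records:
        # Split on commas, semicolons, Chinese punctuation, or whitespace
        # (an assumed delimiter set, not taken from the source).
        names = [n.strip() for n in re.split(r"[,;、，；\s]+", raw) if n.strip()]
        # Drop repeated medicines while keeping first-seen order.
        unique = list(dict.fromkeys(names))
        if unique:
            lines.append(" ".join(unique))
    return lines

# Hypothetical raw records; the third one is empty and is skipped.
raw_records = ["Ginseng, Licorice;Poria", "Licorice  Ginger, Licorice", "  "]
text_lines = format_prescriptions(raw_records)
for line in text_lines:
    print(line)
```

Each output line then corresponds to one compound prescription, which matches the "one line of text data per prescription" convention used by the networking steps.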
Further, the detailed process of step 1)-c is as follows:
1) assign a unique ID to each TCM compound prescription, i.e. to each line of text data;
2) build an inverted index from each medicine to the IDs of the compound prescriptions that contain it;
3) assign each medicine a unique identifier id that also carries the frequency with which the medicine occurs in compound prescriptions;
4) restore the inverted index, i.e. run the inverted-index algorithm once more, so that each compound prescription line is read into a Map function of this subtask and the compound prescription text data is recovered;
5) each Map function reads one line of text and parses the medicine node information;
6) determine whether the medicines contained in the compound prescription in this Map function can still form pairwise joint key-value pairs <Key, Value>; if so, go to 7), otherwise go to 8);
7) build the joint key-value pair <Key, Value>;
8) the <Key, Value> pairs are sent to Reduce through shuffle and sort; Reduce receives the [Value] array formed under each identical Key, computes the pairwise measure between medicines with the following formula, and writes the medicine pairs whose measure exceeds the set threshold to a file saved in HDFS:
$$SC_{AB} = \frac{|F_A \cap F_B|}{\min\{|F_A|, |F_B|\}}$$
where |F_A ∩ F_B| is the number of times medicines A and B are prescribed together, min{|F_A|, |F_B|} is the number of occurrences of whichever of A and B is prescribed less often, and SC_AB is the ratio of the co-occurrence count of A and B to the smaller of their occurrence counts;
9) read the medicine-pair file generated in step 8), i.e. the edge set of the medicine complex network, and format it as an adjacency list that stores the topology of the TCM network;
10) end.
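The networking steps above can be sketched on a single machine as follows. This is a minimal in-memory illustration, not the Hadoop implementation described in the text: the function name `build_tcm_network`, the toy prescriptions, and the threshold of 0.6 are assumptions chosen for the example.

```python
from collections import defaultdict
from itertools import combinations

def build_tcm_network(prescriptions, threshold):
    """Return an adjacency list linking medicines whose SC_AB exceeds threshold."""
    # Steps 1-2: number the prescriptions and build the inverted index
    # medicine -> set of prescription IDs; its size is the medicine's frequency.
    inverted = defaultdict(set)
    for pid, medicines in enumerate(prescriptions, start=1):
        for medicine in medicines:
            inverted[medicine].add(pid)

    # Steps 5-7: count how often each unordered medicine pair is co-prescribed.
    co_occurrence = defaultdict(int)
    for medicines in prescriptions:
        for a, b in combinations(sorted(set(medicines)), 2):
            co_occurrence[(a, b)] += 1

    # Step 8: SC_AB = |F_A ∩ F_B| / min(|F_A|, |F_B|); keep pairs above threshold.
    adjacency = defaultdict(set)
    for (a, b), n_ab in co_occurrence.items():
        sc = n_ab / min(len(inverted[a]), len(inverted[b]))
        if sc > threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)

    # Step 9: format the edge set as an adjacency list.
    return {medicine: sorted(nbrs) for medicine, nbrs in adjacency.items()}

# Toy data: A occurs 3 times, B, C and D twice each; only A-B has SC = 2/2 = 1.0.
prescriptions = [["A", "B", "C"], ["A", "B"], ["A", "D"], ["C", "D"]]
network = build_tcm_network(prescriptions, threshold=0.6)
print(network)
```

Because the denominator is the smaller of the two frequencies, a rarely used medicine that almost always appears alongside a common one still scores highly, which is how the measure keeps uncommon medicines in the network.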
Further, in step 2)-c each node updates its own label from the labels of its neighbor nodes (usually taking the most frequent label; if several labels share the highest frequency, one of them is chosen at random). The parallelized label propagation algorithm as a whole is iterative, and the iteration terminates when the node labels are essentially stable, for example when the labels of more than 90% of the nodes no longer change. The flow of a single iteration of the parallelized label propagation algorithm is as follows:
1) set a unique initial label id for each medicine node;
2) each Map function reads one line of text from HDFS and stores it in the variable Value;
3) parse the data in Value; the temporary array entry Tmp[0] holds the node id, and Tmp[1] holds the adjacency list AdjList and Label;
4) send the node data structure;
5) determine whether Label contains only one label, i.e. whether this is the first iteration; if so, go to 6), otherwise go to 7);
6) set the variable V = label1;
7) set the variable V = label1&&label2, where label1 is the label from iteration t-1 and label2 is the label from iteration t-2;
8) set the variable i = 0;
9) determine whether i is less than AdjList.length; if so, go to step 10), otherwise go to step 12);
10) send <AdjList.get(i), V>;
11) increment i by 1 and return to step 9);
12) the Map process ends, and Hadoop performs shuffle and sort;
13) Reduce parses the [Value] array, stores the node structure in the data structure AdjLabelPA, and uses the temporary linked lists ls_1 and ls_2 to store the l_1 and l_2 values passed over by each neighbor (ls_2 is empty unless two labels are present);
14) find the new node label with the following formula:
$$C_x(t) = f\left(C_{x_1}(t-1), \ldots, C_{x_k}(t-1),\ w \cdot C_{x_1}(t-2), \ldots, w \cdot C_{x_k}(t-2)\right)$$
15) where C_{x_k}(t-1) is the label of node x_k at iteration t-1, and the function f returns the label that occurs most frequently among those passed over by the neighbor nodes;
16) update the t-1 label and the t label in AdjLabelPA to C_x(t-1) and C_x(t) respectively;
17) save the result of this iteration to the distributed file system HDFS;
18) end.
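One way to read the update formula in the iteration above is as a weighted vote: each neighbor contributes one vote for its label from iteration t-1 and w votes for its label from iteration t-2. The sketch below implements that reading on a single machine. The function name `propagate_once`, the value w = 0.5, and the seeded random tie-break are assumptions for illustration; the source does not state the value of w or how f applies it.

```python
import random
from collections import defaultdict

def propagate_once(adjacency, prev, prev2=None, w=0.5, rng=None):
    """Synchronous update: every node adopts the label with the highest vote."""
    rng = rng or random.Random(0)  # seeded so that tie-breaks are reproducible
    new_labels = {}
    for node, neighbors in adjacency.items():
        votes = defaultdict(float)
        for nb in neighbors:
            votes[prev[nb]] += 1.0      # label from iteration t-1
            if prev2 is not None:
                votes[prev2[nb]] += w   # label from iteration t-2, weighted by w
        if not votes:
            new_labels[node] = prev[node]  # isolated node keeps its label
            continue
        best = max(votes.values())
        candidates = sorted(lbl for lbl, score in votes.items() if score == best)
        new_labels[node] = rng.choice(candidates)  # random choice among ties
    return new_labels

# Toy network: triangle A-B-C with D attached to C.
adjacency = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"], "D": ["C"]}
labels_t0 = {"A": 1, "B": 1, "C": 1, "D": 4}
labels_t1 = propagate_once(adjacency, labels_t0)             # first pass: one label per node
labels_t2 = propagate_once(adjacency, labels_t1, labels_t0)  # later pass: t-1 and t-2 labels
print(labels_t1, labels_t2)
```

Carrying the older label along with a smaller weight damps the oscillation that purely synchronous label propagation can show, which is presumably why the Map step sends both labels.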
The medicine community discovery method based on a parallelized label propagation algorithm on a complex network model builds a TCM complex network model, uses parallelization to improve the scalability and running speed of network construction and label propagation, effectively mines communities of medicines with similar properties from compound prescriptions, and helps research on medicine compatibility rules.
Description of the Drawings
Fig. 1 is the operational flowchart of medicine community discovery.
Fig. 2 is the flowchart of the medicine community discovery method based on a parallelized label propagation algorithm on a complex network model.
Fig. 3 is the flowchart for generating the Chinese medicine (TCM) network.
Fig. 4 is the flowchart for mining medicine communities on the Chinese medicine (TCM) network with the label propagation algorithm (one iteration).
Detailed Description of the Embodiments
To make the technical content of the invention easier to understand, specific embodiments are described below with reference to the accompanying drawings.
As shown in Fig. 1, the mining workflow obtains TCM compound prescription data through prescription database queries, extraction from irregular text data, and similar means; generates text data through preprocessing such as data normalization and formatting; then builds the TCM complex network in parallel on the Hadoop platform; and finally runs the parallelized label propagation algorithm on this network to mine medicine communities.
Building the network from TCM compound prescription data and mining medicine communities with the parallelized label propagation algorithm are the key steps of the invention. The idea of the invention is to mine medicine communities effectively through complex network modeling and the parallelized label propagation algorithm while improving the algorithm's scalability and running speed.
The flowchart of the medicine community discovery method based on a parallelized label propagation algorithm on a complex network model is shown in Fig. 2.
Step 0 is the initial state of the medicine community discovery method of the invention.
In the networking stage (steps 1-3), step 1 obtains the initial TCM compound prescription data for networking from a database or other irregular text data and formats it as text data so that it can be uploaded to the distributed file system (HDFS) of the Hadoop platform.
Step 2 builds the Chinese medicine (TCM) network in parallel on the initial data set, which involves running the inverted index twice and building pairwise joint key-value pairs of medicines.
Step 3 saves the generated TCM network to HDFS on the Hadoop platform.
In the mining stage (steps 4-5), step 4 runs the parallelized label propagation algorithm on the TCM network generated in step 3.
Step 5 saves the mining result to HDFS.
Step 6 is the end step of the medicine community discovery method of the invention.
Fig. 3 describes step 2 of Fig. 2 in detail.
Step 20 is the initial step.
Step 21 assigns a unique ID value to each TCM compound prescription, starting from 1.
Step 22 builds the inverted index from medicines to compound prescription IDs.
Step 23 assigns an id to each medicine, starting from the label 1%N, where N is the frequency with which the medicine occurs in compound prescriptions, i.e. the length of its inverted-index entry.
Step 24 restores the inverted index, i.e. runs the inverted-index algorithm once more, so that each compound prescription line is read into a Map function of this subtask.
Step 25 determines whether the medicines contained in the compound prescription in this Map function can still form pairwise joint key-value pairs; if so, step 26 is carried out, otherwise step 27 (at which point the Map process of this subtask has ended).
Step 26 builds the joint key-value pair <Key, Value> (where Key is less than Value).
Step 27 computes the value of SC_AB in the Reduce function with formula (1):
$$SC_{AB} = \frac{|F_A \cap F_B|}{\min\{|F_A|, |F_B|\}} \qquad (1)$$
where |F_A ∩ F_B| is the number of times medicines A and B are prescribed together, min{|F_A|, |F_B|} is the number of occurrences of whichever of A and B is prescribed less often, and SC_AB is the ratio of the co-occurrence count of A and B to the smaller of their occurrence counts.
Step 28 saves the result to HDFS.
Step 29 is the end of Fig. 3.
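The Map, shuffle, and Reduce phases of Fig. 3 can be imitated in plain Python to show why the frequency N is carried inside each medicine id. The sketch assumes ids of the form "id%N" (one reading of step 23), so that the reducer can evaluate formula (1) without a separate frequency lookup. The function names and the toy data are hypothetical, and no Hadoop API is used.

```python
from collections import defaultdict

def map_pairs(medicine_ids):
    """Map: emit <Key, Value> for every medicine pair in one prescription, Key < Value."""
    ordered = sorted(set(medicine_ids), key=lambda s: int(s.split("%")[0]))
    for i, key in enumerate(ordered):
        for value in ordered[i + 1:]:
            yield key, value

def shuffle(pairs):
    """Shuffle and sort: group all Values that share the same Key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(sorted(grouped.items()))

def reduce_sc(key, values, threshold):
    """Reduce: compute SC_AB for each partner of `key` and keep pairs above threshold."""
    freq_key = int(key.split("%")[1])
    co_counts = defaultdict(int)
    for value in values:
        co_counts[value] += 1  # number of prescriptions containing both medicines
    edges = []
    for value, n_ab in sorted(co_counts.items()):
        sc = n_ab / min(freq_key, int(value.split("%")[1]))
        if sc > threshold:
            edges.append((key, value, sc))
    return edges

# Toy prescriptions: medicine 1 occurs 3 times, medicines 2, 3 and 4 twice each.
rows = [["1%3", "2%2", "3%2"], ["1%3", "2%2"], ["1%3", "4%2"], ["3%2", "4%2"]]
grouped = shuffle(pair for row in rows for pair in map_pairs(row))
edges = [e for key, values in grouped.items() for e in reduce_sc(key, values, 0.6)]
print(edges)
```

Emitting each pair only once with Key < Value halves the shuffle traffic and guarantees that all evidence for a given pair reaches the same reducer.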
The parallelized label propagation algorithm as a whole is iterative, and the iteration terminates when the node labels are essentially stable. Fig. 4 describes in detail one iteration of the label propagation algorithm in step 4 of Fig. 2, as follows:
Step 40 is the initial step.
Step 41 sets a unique label for each medicine node.
In step 42, each Map function reads one line of text from the TCM network file in HDFS and stores it in the variable Value.
Step 43 parses the variable Value; the temporary array entry Tmp[0] holds the node id, and Tmp[1] holds the adjacency list AdjList and Label.
Step 44 sends the node data structure.
Step 45 determines whether Label contains only one label (the first iteration); if so, step 46 is carried out, otherwise step 47.
Step 46 sets the variable V = label1.
Step 47 sets the variable V = label1&&label2, where label1 is the label from iteration t-1 and label2 is the label from iteration t-2.
Step 48 sets the variable i = 0.
Step 49 determines whether i is less than AdjList.length; if so, step 50 is carried out, otherwise step 52.
Step 50 sends <AdjList.get(i), V>.
Step 51 increments i by 1.
Step 52 is the shuffle and sort process of the Hadoop platform.
In step 53, Reduce receives <Key, [Value]>.
In step 54, Reduce parses the [Value] array and stores the node structure in the data structure AdjLabelPA; the temporary linked list ls_1 stores the iteration t-1 labels sent by each neighbor node, and ls_2 stores the iteration t-2 labels sent by each neighbor node.
Step 55 uses the function f to return the label L with the highest frequency when ls_1 and ls_2 are considered together.
Step 56 updates the labels of iterations t-1 and t-2 in AdjLabelPA.
Step 57 saves the result to HDFS.
Step 58 is the end step of Fig. 4.
Note: the label propagation algorithm runs for multiple iterations; iteration ends when the labels of more than 90% of the nodes in the network remain unchanged.
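The termination rule in the note can be sketched as a driver loop around a single-iteration update. To keep the example short and reproducible it deliberately simplifies the method in two ways, both of which are assumptions of this sketch: only the labels from iteration t-1 are used, and ties are broken by taking the smallest label instead of a random one. The names `propagate` and `run_label_propagation` and the `max_iters` safety cap are hypothetical.

```python
from collections import Counter

def propagate(adjacency, labels):
    """One synchronous pass: each node takes its neighbors' most frequent label."""
    new_labels = {}
    for node, neighbors in adjacency.items():
        if not neighbors:
            new_labels[node] = labels[node]
            continue
        counts = Counter(labels[nb] for nb in neighbors)
        best = max(counts.values())
        # Simplification: deterministic tie-break (smallest label), not random.
        new_labels[node] = min(lbl for lbl, c in counts.items() if c == best)
    return new_labels

def run_label_propagation(adjacency, stable_fraction=0.9, max_iters=50):
    """Iterate until more than `stable_fraction` of the nodes keep their label."""
    if not adjacency:
        return {}, 0
    labels = {node: node for node in adjacency}  # unique initial label per node
    for iteration in range(1, max_iters + 1):
        new_labels = propagate(adjacency, labels)
        unchanged = sum(new_labels[n] == labels[n] for n in adjacency)
        labels = new_labels
        if unchanged / len(adjacency) > stable_fraction:
            return labels, iteration
    return labels, max_iters  # cap reached without meeting the criterion

# Two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
adjacency = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
final_labels, iterations = run_label_propagation(adjacency)
print(final_labels, iterations)
```

The iteration cap is a practical safeguard rather than part of the described method: synchronous updates can oscillate on some graphs, in which case the 90% criterion alone would never be met.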
In summary, the invention uses parallelization to improve the scalability and running speed of network construction and label propagation, so that the community discovery algorithm runs quickly and efficiently on large volumes of compound prescription data, effectively mines communities of medicines with similar properties from compound prescriptions, and helps research on medicine compatibility rules.
Those of ordinary skill in the art to which the invention pertains may make various changes and modifications without departing from the spirit and scope of the invention. The scope of protection of the invention is therefore defined by the claims.

Claims (5)

1. A medicine community discovery method based on a parallelized label propagation algorithm on a complex network model, characterized in that it comprises the following steps:
1) Networking stage:
a) preprocess to generate the Chinese medicine data set and format it as initial text data;
b) deploy the initial text data to the Hadoop platform;
c) build the Chinese medicine network in parallel; the nodes of this network are medicines, and an edge connects two nodes whose SC_AB is greater than a given threshold;
d) end.
2) Mining stage:
a) obtain the Chinese medicine network text file generated by step 1)-c;
b) deploy this Chinese medicine network text file to the Hadoop platform;
c) run the parallelized label propagation algorithm, i.e. label propagation parallelized with the MapReduce framework, in which each node iteratively updates its own label from its neighbors' information, to discover medicine communities;
d) end.
2. The medicine community discovery method based on a parallelized label propagation algorithm on a complex network model according to claim 1, characterized in that the preprocessing in step 1)-a extracts the medicine composition of every compound prescription in the TCM compound prescription data.
3. The medicine community discovery method based on a parallelized label propagation algorithm on a complex network model according to claim 1, characterized in that the deployment in step 1)-b uploads the initial text data generated in step 1)-a to the distributed file system of the Hadoop platform.
4. The medicine community discovery method based on a parallelized label propagation algorithm on a complex network model according to claim 1, characterized in that the detailed process of step 1)-c is as follows:
1) assign a unique ID to each TCM compound prescription, i.e. to each line of text data;
2) build an inverted index from each medicine to the IDs of the compound prescriptions that contain it;
3) assign each medicine a unique identifier id that also carries the frequency with which the medicine occurs in compound prescriptions;
4) restore the inverted index, i.e. run the inverted-index algorithm once more, so that each compound prescription line is read into a Map function of this subtask and the compound prescription text data is recovered;
5) each Map function reads one line of text and parses the medicine node information;
6) determine whether the medicines contained in the compound prescription in this Map function can still form pairwise joint key-value pairs <Key, Value>; if so, go to 7), otherwise go to 8);
7) build the joint key-value pair <Key, Value>;
8) the <Key, Value> pairs are sent to Reduce through shuffle and sort; Reduce receives the [Value] array formed under each identical Key, computes the pairwise measure between medicines with the following formula, and writes the medicine pairs whose measure exceeds the set threshold to a file saved in the distributed file system:
$$SC_{AB} = \frac{|F_A \cap F_B|}{\min\{|F_A|, |F_B|\}}$$
where |F_A ∩ F_B| is the number of times medicines A and B are prescribed together, min{|F_A|, |F_B|} is the number of occurrences of whichever of A and B is prescribed less often, and SC_AB is the ratio of the co-occurrence count of A and B to the smaller of their occurrence counts;
9) read the medicine-pair file generated in step 8), i.e. the edge set of the medicine complex network, and format it as an adjacency list that stores the topology of the TCM network;
10) end.
5. The medicine community discovery method based on a parallelized label propagation algorithm on a complex network model according to claim 1, characterized in that the parallelized label propagation algorithm in step 2)-c is iterative as a whole, the iteration terminates when the node labels are essentially stable, and one iteration of the parallelized label propagation algorithm is as follows:
1) set a unique initial label id for each medicine node;
2) each Map function reads one line of text from HDFS and stores it in the variable Value;
3) parse the data in Value; the temporary array entry Tmp[0] holds the node id, and Tmp[1] holds the adjacency list AdjList and Label;
4) send the node data structure;
5) determine whether Label contains only one label, i.e. whether this is the first iteration; if so, go to 6), otherwise go to 7);
6) set the variable V = label1;
7) set the variable V = label1&&label2, where label1 is the label from iteration t-1 and label2 is the label from iteration t-2;
8) set the variable i = 0;
9) determine whether i is less than AdjList.length; if so, go to step 10), otherwise go to step 12);
10) send <AdjList.get(i), V>;
11) increment i by 1 and return to step 9);
12) the Map process ends, and Hadoop performs shuffle and sort;
13) Reduce parses the [Value] array, stores the node structure in the data structure AdjLabelPA, and uses the temporary linked lists ls_1 and ls_2 to store the l_1 and l_2 values passed over by each neighbor (ls_2 is empty unless two labels are present);
14) find the new node label with the following formula:
$$C_x(t) = f\left(C_{x_1}(t-1), \ldots, C_{x_k}(t-1),\ w \cdot C_{x_1}(t-2), \ldots, w \cdot C_{x_k}(t-2)\right)$$
15) where C_{x_k}(t-1) is the label of node x_k at iteration t-1, and the function f returns the label that occurs most frequently among those passed over by the neighbor nodes;
16) update the t-1 label and the t label in AdjLabelPA to C_x(t-1) and C_x(t) respectively;
17) save the result of this iteration to the distributed file system;
18) end.
CN2012101111712A 2012-04-16 2012-04-16 Medicine corporation finding method based on parallelization label propagation algorithm for complex network model Expired - Fee Related CN102663108B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012101111712A CN102663108B (en) 2012-04-16 2012-04-16 Medicine corporation finding method based on parallelization label propagation algorithm for complex network model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2012101111712A CN102663108B (en) 2012-04-16 2012-04-16 Medicine corporation finding method based on parallelization label propagation algorithm for complex network model

Publications (2)

Publication Number Publication Date
CN102663108A true CN102663108A (en) 2012-09-12
CN102663108B CN102663108B (en) 2013-11-13

Family

ID=46772599

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012101111712A Expired - Fee Related CN102663108B (en) 2012-04-16 2012-04-16 Medicine corporation finding method based on parallelization label propagation algorithm for complex network model

Country Status (1)

Country Link
CN (1) CN102663108B (en)


Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
M.E.J NEWMAN: "《Modularity and community structure in network》", 《PROC.NATL.ACAD.SCI.USA103》 *
刘洋: "《基于MapReduce的中医药并行数据挖掘服务》", 《中国优秀硕士学位论文全文数据库(电子期刊)》 *
刘熙等: "《基于最大频繁项集的层次聚类方法》", 《广西师范大学学报:自然科学版》 *
陈波: "《中药复方配伍的数据挖掘系统的构建》", 《中国优秀硕士学位论文全文数据库(电子期刊)》 *
马延妮: "《在线社会网络团结构分析》", 《中国优秀硕士学位论文全文数据库(电子期刊)》 *


Also Published As

Publication number Publication date
CN102663108B (en) 2013-11-13


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C56 Change in the name or address of the patentee
CP02 Change in the address of a patent holder

Address after: Hongqiao Industrial Park in Taixing city of Jiangsu province Taizhou city 225400 six Wei Hong Kong Avenue

Patentee after: Nanjing University

Address before: 210093 Nanjing, Gulou District, Jiangsu, No. 22 Hankou Road

Patentee before: Nanjing University

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20131113

Termination date: 20180416
