CN102663108A - Medicine corporation finding method based on parallelization label propagation algorithm for complex network model - Google Patents
- Publication number
- CN102663108A (application number CN2012101111712A)
- Authority
- CN
- China
- Prior art keywords
- medicine
- label
- parallelization
- propagation algorithm
- node
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Machine Translation (AREA)
- Devices For Executing Special Programs (AREA)
Abstract
The invention provides a medicine community discovery method based on a parallelized label propagation algorithm for a complex network model. The method comprises a networking stage and a mining stage. The networking stage comprises: a) preprocessing to generate a traditional Chinese medicine (TCM) data set and formatting the data set as text data; b) deploying the initial text data to a Hadoop platform; c) building the TCM network in parallel; and d) completing the networking stage. The mining stage comprises: a) obtaining the TCM network text file generated in step c) of the networking stage; b) deploying the TCM network text file to the Hadoop platform; c) running the parallelized label propagation algorithm to find medicine communities; and d) completing the mining stage. The method builds a TCM network model, uses parallelization to improve the scalability and running speed of network construction and of the label propagation algorithm, effectively mines communities of medicines with similar properties from compound prescriptions, and thereby supports research on the compatibility rules of traditional Chinese medicine.
Description
Technical field
The present invention relates to a method for modeling traditional Chinese medicine (TCM) as a complex network, and to a technique that applies a parallel label propagation algorithm to this TCM medicine network in order to mine communities of Chinese medicines.
Background art
Data mining techniques make it possible to analyze TCM compound prescription data intelligently and to discover latent rules of medicine compatibility, that is, of how medicines are combined in prescriptions. One common application in TCM data mining is medicine clustering. It is based on the transaction-item model (each compound prescription is treated as a transaction made up of several medicines and stored in a transaction database) and groups similar medicines to find sets of medicines that are frequently prescribed together. Traditional medicine clustering algorithms based on the transaction-item model have difficulty finding medicines that are only indirectly compatible, and they often ignore uncommon medicines, which hinders in-depth study of the compatibility rules of each individual medicine. The present invention instead models the Chinese medicine network as a complex network and applies a community discovery algorithm to it to mine groups of medicines with similar medicinal properties.
Research on community structure in complex network analysis has a long history and spans computer science, sociology, the life sciences and other fields. Analyzing and revealing the community structure of a network is important for understanding its structure and its characteristics. Community discovery in a TCM complex network has much the same purpose as traditional medicine cluster analysis based on the transaction-item model: both aggregate medicines that are frequently prescribed together into one class and mine medicines with similar properties, in order to study compatibility rules.
Building a Chinese medicine complex network on the complex network model breaks with the convention that TCM data mining is always based on the transaction-item model. Using the label propagation algorithm from complex network analysis, medicine communities can be mined in depth to find groups of medicines that have similar properties and are prescribed together relatively frequently within the community. This overcomes the inability of traditional transaction-item clustering algorithms to find indirect compatibility and their tendency to ignore uncommon medicines.
Recently, with the rapid growth of TCM compound prescription data, non-parallel algorithms are no longer suitable for community discovery on larger-scale TCM data.
Summary of the invention
The technical problem to be solved by the present invention is to model TCM as a complex network and to apply a parallel label propagation algorithm to this model, so that medicine communities can be found quickly and effectively.
To solve this problem, the medicine community discovery method based on a parallelized label propagation algorithm for a complex network model comprises the following steps:
1) Networking stage:
a) preprocess the data to generate the Chinese medicine data set and format it as initial text data;
b) deploy the initial text data to the Hadoop platform;
c) build the Chinese medicine network (the TCM network) in parallel; the nodes of this network are medicines, and an edge connects two nodes A and B whose $SC_{AB}$ is greater than a given threshold;
d) end.
2) Mining stage:
a) obtain the Chinese medicine network text file generated in step 1)-c;
b) deploy this TCM network text file to the Hadoop platform;
c) run the parallelized label propagation algorithm, i.e. a label propagation algorithm parallelized with the MapReduce framework, in which each node iteratively updates its own label (its community number) from the information of its neighbors, in order to find medicine communities;
d) end.
In step 1)-a, the preprocessing extracts the medicine composition of every compound prescription in the TCM compound prescription data.
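The preprocessing of step 1)-a could be sketched roughly as follows. This is only an illustration under stated assumptions: the record layout, the function name `format_prescriptions`, the comma separator and the medicine names are all hypothetical, since the patent does not specify the text format.

```python
def format_prescriptions(records):
    """Turn raw (name, medicine list) records into one text line per prescription."""
    lines = []
    for _name, medicines in records:
        # Strip whitespace, drop empty entries, de-duplicate while keeping order.
        seen = []
        for medicine in medicines:
            medicine = medicine.strip()
            if medicine and medicine not in seen:
                seen.append(medicine)
        # Prescriptions without any medicine carry no network information.
        if seen:
            lines.append(",".join(seen))  # assumed separator
    return lines


# Hypothetical sample data, not taken from the patent.
records = [
    ("formula-1", ["ginseng", " licorice ", "ginseng"]),
    ("formula-2", ["licorice", "ginger"]),
    ("formula-3", []),
]
lines = format_prescriptions(records)
```

Each resulting line corresponds to one compound prescription, which matches the later statement that one line of text data is one prescription.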
In step 1)-b, the deployment uploads the initial text data generated in step 1)-a to the distributed file system (HDFS) of the Hadoop platform.
Further, the detailed process of step 1)-c is as follows:
1) assign a unique ID to each compound prescription, i.e. to each line of text data;
2) build an inverted index from each medicine to the IDs of the compound prescriptions that contain it;
3) assign each medicine a unique identifier id, which also records the frequency with which the medicine occurs in the compound prescriptions;
4) restore the inverted index, i.e. run the inverted-index algorithm once more so that each compound prescription line is read into one Map function of this subtask, recovering the compound prescription text data;
5) each Map function reads one line of text and parses out the medicine node information;
6) determine whether the medicines of the compound prescription in this Map function can still be combined in pairs into a joint key-value pair <Key, Value>; if so, go to 7), otherwise go to 8);
7) create the joint key-value pair <Key, Value>;
8) the <Key, Value> pairs are passed through shuffle && sort to Reduce; Reduce receives the [Value] array formed under each identical Key, computes the measure between the two medicines with the following formula, and writes the medicine pairs whose measure exceeds the set threshold to a file saved in HDFS:
$$SC_{AB} = \frac{|F_A \cap F_B|}{\min\{|F_A|, |F_B|\}}$$
where $|F_A \cap F_B|$ is the number of times medicines A and B are prescribed together, $\min\{|F_A|, |F_B|\}$ is the number of occurrences of whichever of A and B is prescribed less often, and $SC_{AB}$ is the ratio of the co-occurrence count of A and B to the smaller of their occurrence counts;
9) read the medicine pair file generated in 8), i.e. the edge set of the medicine complex network, and format it as an adjacency list to store the topology of the Chinese medicine network;
10) end.
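The network construction steps above can be imitated in memory with a short Python sketch. It is not the Hadoop implementation described in the patent: the function name `build_tcm_network`, the comma-separated input lines and the threshold value are assumptions, and the inverted-index round trip of steps 2) and 4) is replaced by a direct frequency count.

```python
from collections import defaultdict
from itertools import combinations


def build_tcm_network(lines, threshold):
    """In-memory imitation of the map / shuffle / reduce steps for network building."""
    # Count in how many prescriptions each medicine occurs (|F_A|).
    freq = defaultdict(int)
    for line in lines:
        for medicine in set(line.split(",")):
            freq[medicine] += 1

    # "Map": emit every unordered medicine pair of a prescription as a key.
    # "Shuffle": grouping by key is reduced here to a co-occurrence counter.
    cooc = defaultdict(int)
    for line in lines:
        for a, b in combinations(sorted(set(line.split(","))), 2):
            cooc[(a, b)] += 1

    # "Reduce": compute SC_AB and keep pairs above the threshold as edges.
    adjacency = defaultdict(set)
    for (a, b), n_ab in cooc.items():
        sc = n_ab / min(freq[a], freq[b])
        if sc > threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)

    # Adjacency-list form: node -> sorted neighbour list.
    return {node: sorted(nbrs) for node, nbrs in adjacency.items()}


# Hypothetical prescriptions over medicines a, b, c, d.
lines = ["a,b,c", "a,b", "a,d", "a,c"]
network = build_tcm_network(lines, threshold=0.6)
```

In this toy input, medicine `d` occurs only once but always together with `a`, so the pair still scores 1.0 and the edge is kept, while the pair `b`, `c` scores 0.5 and is dropped.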
Further, in step 2)-c each node updates its own label from the labels of its neighbor nodes (normally the label with the highest frequency; if several labels share the highest frequency, one of them is chosen at random). The parallelized label propagation algorithm as a whole is iterative, and the iteration terminates when the node labels are essentially stable, for example when more than 90% of the node labels no longer change. The flow of one iteration of the parallelized label propagation algorithm is as follows:
1) set a unique initial label id for each medicine node;
2) each Map function reads one line of text from HDFS and stores it in the variable Value;
3) parse the data in Value into a temporary array Tmp: Tmp[0] holds the node id, and Tmp[1] holds the adjacency list AdjList and the labels Label;
4) emit the node data structure;
5) determine whether Label contains only one label, i.e. whether this is the first iteration; if so, go to 6), otherwise go to 7);
6) set variable V = label 1;
7) set variable V = label 1 && label 2, where label 1 is the label from iteration t-1 and label 2 is the label from iteration t-2;
8) set variable i = 0;
9) determine whether i is less than AdjList.length; if so, go to step 10), otherwise go to step 12);
10) emit <AdjList.get(i), V>;
11) increment i by 1 and go to 9);
12) the Map phase ends and Hadoop performs shuffle && sort;
13) Reduce parses the [Value] array, stores the node structure in the data structure AdjLabelPA, and uses the temporary linked lists $ls_1$ and $ls_2$ to store the received $l_1$ and $l_2$ values respectively (if two labels are present; otherwise $ls_2$ is empty);
14) find the new label of the node with the following formula:
$$C_x(t) = f\big(C_{x_1}(t-1), \ldots, C_{x_k}(t-1)\big)$$
15) where $C_{x_k}(t-1)$ is the label of node $x_k$ at iteration t-1, and the function f returns the label passed over most frequently by the neighbor nodes;
16) update the t-1 label and the t label in AdjLabelPA to $C_x(t-1)$ and $C_x(t)$ respectively;
17) save the result of this iteration to the distributed file system HDFS;
18) end.
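One iteration of the label update described above might look like the following in-memory sketch. It is a simplification under stated assumptions: the name `lpa_iteration` is hypothetical, the adjacency list is a plain dictionary rather than an HDFS file, and only the current label is kept, whereas the patent also carries the label of the previous iteration.

```python
import random
from collections import Counter, defaultdict


def lpa_iteration(adjacency, labels, rng):
    """One synchronous label propagation round, written as map + reduce."""
    # "Map": every node sends its current label to each neighbour.
    inbox = defaultdict(list)
    for node, neighbours in adjacency.items():
        for nb in neighbours:
            inbox[nb].append(labels[node])

    # "Reduce": each node adopts the most frequent label it received,
    # breaking ties at random; isolated nodes keep their label.
    new_labels = {}
    for node in adjacency:
        received = inbox.get(node)
        if not received:
            new_labels[node] = labels[node]
            continue
        counts = Counter(received)
        best = max(counts.values())
        candidates = sorted(lab for lab, c in counts.items() if c == best)
        new_labels[node] = rng.choice(candidates)
    return new_labels


# Hypothetical network: a triangle 1-2-3 with a pendant node 4 attached to 3.
adjacency = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
labels = {1: "x", 2: "x", 3: "x", 4: "y"}
updated = lpa_iteration(adjacency, labels, random.Random(0))
```

Here node 4 receives only the label `x` from its single neighbor and therefore joins the community of the triangle. Passing a seeded `random.Random` keeps the random tie-breaking reproducible.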
The medicine community discovery method based on a parallelized label propagation algorithm for a complex network model of the present invention builds a complex network model of Chinese medicines, uses parallelization to improve the scalability and running speed of both network construction and the label propagation algorithm, and can effectively mine communities of medicines with similar properties from compound prescriptions, which helps in studying compatibility rules.
Description of drawings
Fig. 1 is the operational flowchart of medicine community discovery.
Fig. 2 is the flowchart of the medicine community discovery method based on a parallelized label propagation algorithm for a complex network model of the present invention.
Fig. 3 is the flowchart for generating the Chinese medicine (TCM) network.
Fig. 4 is the flowchart for mining medicine communities on the Chinese medicine (TCM) network with the label propagation algorithm (one iteration).
Embodiment
To make the technical content of the present invention easier to understand, a specific embodiment is described below with reference to the accompanying drawings.
As shown in Fig. 1, the medicine mining workflow first obtains TCM compound prescription data by querying prescription databases, extracting data from irregular text, and so on. Text data is then generated by preprocessing such as data normalization and formatting. The Chinese medicine complex network is then built in parallel on the Hadoop platform, and finally the parallelized label propagation algorithm is run on this network to mine medicine communities.
Building a network from the TCM compound prescription data and mining medicine communities with the parallelized label propagation algorithm are the key steps of this invention. The idea of the invention is to mine medicine communities effectively through complex network modeling and a parallelized label propagation algorithm, while improving the scalability and running speed of the algorithm.
The flowchart of the medicine community discovery method based on a parallelized label propagation algorithm for a complex network model of the present invention is shown in Fig. 2.
In the networking stage (steps 1-3), step 1 obtains the initial TCM compound prescription data for network construction from a database or from other irregular text data, and formats it as text data so that it can be uploaded to the distributed file system (HDFS) of the Hadoop platform;
In the mining stage (steps 4-5), step 4 runs the parallelized label propagation algorithm on the TCM network generated in step 3;
Fig. 3 describes step 2 of Fig. 2 in detail.
Step 21 assigns a unique ID value to each compound prescription, starting from 1;
Step 25 determines whether the medicines of the compound prescription in this Map function can still be combined in pairs into a joint key-value pair; if so, step 26 is executed, otherwise step 27 is executed (note that at this point the Map phase of this subtask has ended);
The measure between two medicines is $SC_{AB} = \frac{|F_A \cap F_B|}{\min\{|F_A|, |F_B|\}}$, where $|F_A \cap F_B|$ is the number of times medicines A and B are prescribed together, $\min\{|F_A|, |F_B|\}$ is the number of occurrences of whichever of A and B is prescribed less often, and $SC_{AB}$ is the ratio of the co-occurrence count of A and B to the smaller of their occurrence counts;
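As a worked illustration with made-up counts (they are not taken from the patent): suppose medicine A appears in 10 compound prescriptions, medicine B appears in 4, and the two are prescribed together in 3. Then

$$SC_{AB} = \frac{|F_A \cap F_B|}{\min\{|F_A|, |F_B|\}} = \frac{3}{\min\{10, 4\}} = \frac{3}{4} = 0.75$$

With a hypothetical threshold of 0.5, an edge between A and B would be added to the network. Because the denominator is the smaller of the two occurrence counts, a rarely prescribed medicine that nearly always accompanies a common one still receives a high score, which appears consistent with the stated aim of not ignoring uncommon medicines.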
The parallelized label propagation algorithm as a whole is iterative, and the iteration terminates when the node labels are essentially stable. Fig. 4 describes one iteration of the label propagation algorithm of step 4 in Fig. 2 in detail, as follows:
Step 41 sets a unique label for each medicine node;
Step 42: each Map function reads one line of text from the Chinese medicine network file in HDFS and stores it in the variable Value;
Step 43 parses Value into the temporary array Tmp: Tmp[0] holds the node id, and Tmp[1] holds the adjacency list AdjList and Label;
Step 44 emits the node data structure;
Step 49 determines whether i is less than AdjList.length; if so, step 50 is executed, otherwise step 52;
Step 50 emits <AdjList.get(i), V>;
Step 51 increments i by 1;
Step 53: Reduce receives <Key, [Value]>;
Step 56 updates the labels of iterations t-1 and t-2 in AdjLabelPA;
Step 57 saves the result to HDFS;
Step 58 is the end step of Fig. 4;
Note: the label propagation algorithm runs for multiple iterations; iteration ends when the labels of more than 90% of the nodes in the network remain unchanged.
In summary, the present invention uses parallelization to improve the scalability and running speed of network construction and of the label propagation algorithm, so that the community discovery algorithm can run quickly and efficiently on large volumes of compound prescription data, and can effectively mine communities of medicines with similar properties, which helps in studying compatibility rules.
Those of ordinary skill in the art to which the present invention belongs may make various changes and modifications without departing from the spirit and scope of the invention. The scope of protection of the present invention is therefore defined by the claims.
Claims (5)
1. A medicine community discovery method based on a parallelized label propagation algorithm for a complex network model, characterized in that it comprises the following steps:
1) Networking stage:
a) preprocess the data to generate the Chinese medicine data set and format it as initial text data;
b) deploy the initial text data to the Hadoop platform;
c) build the Chinese medicine network in parallel; the nodes of this network are medicines, and an edge connects two nodes A and B whose $SC_{AB}$ is greater than a given threshold;
d) end.
2) Mining stage:
a) obtain the Chinese medicine network text file generated in step 1)-c;
b) deploy this Chinese medicine network text file to the Hadoop platform;
c) run the parallelized label propagation algorithm, i.e. a label propagation algorithm parallelized with the MapReduce framework, in which each node iteratively updates its own label from the information of its neighbors, in order to find medicine communities;
d) end.
2. The medicine community discovery method based on a parallelized label propagation algorithm for a complex network model according to claim 1, characterized in that the preprocessing in step 1)-a extracts the medicine composition of every compound prescription in the TCM compound prescription data.
3. The medicine community discovery method based on a parallelized label propagation algorithm for a complex network model according to claim 1, characterized in that the deployment in step 1)-b uploads the initial text data generated in step 1)-a to the distributed file system of the Hadoop platform.
4. The medicine community discovery method based on a parallelized label propagation algorithm for a complex network model according to claim 1, characterized in that the detailed process of step 1)-c is as follows:
1) assign a unique ID to each compound prescription, i.e. to each line of text data;
2) build an inverted index from each medicine to the IDs of the compound prescriptions that contain it;
3) assign each medicine a unique identifier id, which also records the frequency with which the medicine occurs in the compound prescriptions;
4) restore the inverted index, i.e. run the inverted-index algorithm once more so that each compound prescription line is read into one Map function of this subtask, recovering the compound prescription text data;
5) each Map function reads one line of text and parses out the medicine node information;
6) determine whether the medicines of the compound prescription in this Map function can still be combined in pairs into a joint key-value pair <Key, Value>; if so, go to 7), otherwise go to 8);
7) create the joint key-value pair <Key, Value>;
8) the <Key, Value> pairs are passed through shuffle && sort to Reduce; Reduce receives the [Value] array formed under each identical Key, computes the measure between the two medicines with the following formula, and writes the medicine pairs whose measure exceeds the set threshold to a file saved in the distributed file system:
$$SC_{AB} = \frac{|F_A \cap F_B|}{\min\{|F_A|, |F_B|\}}$$
where $|F_A \cap F_B|$ is the number of times medicines A and B are prescribed together, $\min\{|F_A|, |F_B|\}$ is the number of occurrences of whichever of A and B is prescribed less often, and $SC_{AB}$ is the ratio of the co-occurrence count of A and B to the smaller of their occurrence counts;
9) read the medicine pair file generated in 8), i.e. the edge set of the medicine complex network, and format it as an adjacency list to store the topology of the Chinese medicine network;
10) end.
5. The medicine community discovery method based on a parallelized label propagation algorithm for a complex network model according to claim 1, characterized in that the parallelized label propagation algorithm of step 2)-c is iterative as a whole, the iteration terminates when the node labels are essentially stable, and one iteration of the parallelized label propagation algorithm is as follows:
1) set a unique initial label id for each medicine node;
2) each Map function reads one line of text from HDFS and stores it in the variable Value;
3) parse the data in Value into a temporary array Tmp: Tmp[0] holds the node id, and Tmp[1] holds the adjacency list AdjList and the labels Label;
4) emit the node data structure;
5) determine whether Label contains only one label, i.e. whether this is the first iteration; if so, go to 6), otherwise go to 7);
6) set variable V = label 1;
7) set variable V = label 1 && label 2, where label 1 is the label from iteration t-1 and label 2 is the label from iteration t-2;
8) set variable i = 0;
9) determine whether i is less than AdjList.length; if so, go to step 10), otherwise go to step 12);
10) emit <AdjList.get(i), V>;
11) increment i by 1 and go to 9);
12) the Map phase ends and Hadoop performs shuffle && sort;
13) Reduce parses the [Value] array, stores the node structure in the data structure AdjLabelPA, and uses the temporary linked lists $ls_1$ and $ls_2$ to store the received $l_1$ and $l_2$ values respectively (if two labels are present; otherwise $ls_2$ is empty);
14) find the new label of the node with the following formula:
$$C_x(t) = f\big(C_{x_1}(t-1), \ldots, C_{x_k}(t-1)\big)$$
15) where $C_{x_k}(t-1)$ is the label of node $x_k$ at iteration t-1, and the function f returns the label passed over most frequently by the neighbor nodes;
16) update the t-1 label and the t label in AdjLabelPA to $C_x(t-1)$ and $C_x(t)$ respectively;
17) save the result of this iteration to the distributed file system;
18) end.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2012101111712A CN102663108B (en) | 2012-04-16 | 2012-04-16 | Medicine corporation finding method based on parallelization label propagation algorithm for complex network model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102663108A true CN102663108A (en) | 2012-09-12 |
CN102663108B CN102663108B (en) | 2013-11-13 |
Family
ID=46772599
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2012101111712A Expired - Fee Related CN102663108B (en) | 2012-04-16 | 2012-04-16 | Medicine corporation finding method based on parallelization label propagation algorithm for complex network model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102663108B (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102708285A (en) * | 2012-04-24 | 2012-10-03 | 河海大学 | Coremedicine excavation method based on complex network model parallelizing PageRank algorithm |
CN105096297A (en) * | 2014-05-05 | 2015-11-25 | 中兴通讯股份有限公司 | Graph data partitioning method and device |
CN105159922A (en) * | 2015-08-03 | 2015-12-16 | 同济大学 | Label propagation algorithm-based posting data-oriented parallelized community discovery method |
CN105677648A (en) * | 2014-11-18 | 2016-06-15 | 四三九九网络股份有限公司 | Community detection method and system based on label propagation algorithm |
CN106126649A (en) * | 2016-06-24 | 2016-11-16 | 北京千安哲信息技术有限公司 | A kind of similar Chinese crude drug method for digging and device |
CN106933985A (en) * | 2017-02-20 | 2017-07-07 | 广东省中医院 | A kind of analysis of core side finds method |
Non-Patent Citations (5)
Title |
---|
M.E.J NEWMAN: "《Modularity and community structure in network》", 《PROC.NATL.ACAD.SCI.USA103》 * |
刘洋: "《基于MapReduce的中医药并行数据挖掘服务》", 《中国优秀硕士学位论文全文数据库(电子期刊)》 * |
刘熙等: "《基于最大频繁项集的层次聚类方法》", 《广西师范大学学报:自然科学版》 * |
陈波: "《中药复方配伍的数据挖掘系统的构建》", 《中国优秀硕士学位论文全文数据库(电子期刊)》 * |
马延妮: "《在线社会网络团结构分析》", 《中国优秀硕士学位论文全文数据库(电子期刊)》 * |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102708285A (en) * | 2012-04-24 | 2012-10-03 | 河海大学 | Coremedicine excavation method based on complex network model parallelizing PageRank algorithm |
CN102708285B (en) * | 2012-04-24 | 2015-05-13 | 河海大学 | Coremedicine excavation method based on complex network model parallelizing PageRank algorithm |
CN105096297A (en) * | 2014-05-05 | 2015-11-25 | 中兴通讯股份有限公司 | Graph data partitioning method and device |
CN105677648A (en) * | 2014-11-18 | 2016-06-15 | 四三九九网络股份有限公司 | Community detection method and system based on label propagation algorithm |
CN105677648B (en) * | 2014-11-18 | 2018-08-28 | 四三九九网络股份有限公司 | A kind of Combo discovering method and system based on label propagation algorithm |
CN105159922A (en) * | 2015-08-03 | 2015-12-16 | 同济大学 | Label propagation algorithm-based posting data-oriented parallelized community discovery method |
CN105159922B (en) * | 2015-08-03 | 2018-08-24 | 同济大学 | The parallelization Combo discovering method towards consignment data based on label propagation algorithm |
CN106126649A (en) * | 2016-06-24 | 2016-11-16 | 北京千安哲信息技术有限公司 | A kind of similar Chinese crude drug method for digging and device |
CN106126649B (en) * | 2016-06-24 | 2019-07-23 | 北京千安哲信息技术有限公司 | A kind of similar Chinese medicine method for digging and device |
CN106933985A (en) * | 2017-02-20 | 2017-07-07 | 广东省中医院 | A kind of analysis of core side finds method |
CN106933985B (en) * | 2017-02-20 | 2020-06-26 | 广东省中医院 | Analysis and discovery method of core party |
Also Published As
Publication number | Publication date |
---|---|
CN102663108B (en) | 2013-11-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109739994B (en) | API knowledge graph construction method based on reference document | |
CN102663108B (en) | Medicine corporation finding method based on parallelization label propagation algorithm for complex network model | |
CN102207946B (en) | Knowledge network semi-automatic generation method | |
CN104035975B (en) | It is a kind of to realize the method that remote supervisory character relation is extracted using Chinese online resource | |
CN103116574B (en) | From the method for natural language text excavation applications process body | |
CN101593200A (en) | Chinese Web page classification method based on the keyword frequency analysis | |
US8793251B2 (en) | Input partitioning and minimization for automaton implementations of capturing group regular expressions | |
CN105893382A (en) | Priori knowledge based microblog user group division method | |
US20140372105A1 (en) | Submatch Extraction | |
US20220237220A1 (en) | Template generation using directed acyclic word graphs | |
CN103488637B (en) | A kind of method carrying out expert Finding based on dynamics community's excavation | |
CN102708285B (en) | Coremedicine excavation method based on complex network model parallelizing PageRank algorithm | |
CN113901214B (en) | Method and device for extracting form information, electronic equipment and storage medium | |
CN103927176A (en) | Method for generating program feature tree on basis of hierarchical topic model | |
CN101377816B (en) | Method and system for matching paralleling multiple-mode of matching regulation including displacement indication symbol | |
CN113704420A (en) | Method and device for identifying role in text, electronic equipment and storage medium | |
WO2018205459A1 (en) | Target user acquisition method and apparatus, electronic device and medium | |
CN110765276A (en) | Entity alignment method and device in knowledge graph | |
CN103699568A (en) | Method for extracting hyponymy relation of field terms from wikipedia | |
CN104281695A (en) | Combination theory based quasi natural language semantic information extraction method and system | |
Jiang et al. | A semantic-based approach to service clustering from service documents | |
CN106156259A (en) | A kind of user behavior information displaying method and system | |
Burget | Hierarchies in html documents: Linking text to concepts | |
CN105868363A (en) | Webpage page text extraction method and system based on fuzzy logic | |
Carme et al. | The lixto project: Exploring new frontiers of web data extraction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
C56 | Change in the name or address of the patentee | ||
CP02 | Change in the address of a patent holder | Address after: Hongqiao Industrial Park in Taixing city of Jiangsu province Taizhou city 225400 six Wei Hong Kong Avenue. Patentee after: Nanjing University. Address before: 210093 Nanjing, Gulou District, Jiangsu, No. 22 Hankou Road. Patentee before: Nanjing University |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20131113. Termination date: 20180416 |