KR20120043977A - A method for retrieving a gene-disease-chemical relationship using multi-dimensional indexes from huge bio-literatures - Google Patents

A method for retrieving a gene-disease-chemical relationship using multi-dimensional indexes from huge bio-literatures

Info

Publication number
KR20120043977A
Authority
KR
South Korea
Prior art keywords
gene
disease
compound
index
sentence
Prior art date
Application number
KR1020100105299A
Other languages
Korean (ko)
Other versions
KR101448731B1 (en)
Inventor
김태경
오정수
이상혁
허보경
Original Assignee
한국생명공학연구원
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 한국생명공학연구원
Priority to KR20100105299A
Publication of KR20120043977A
Application granted
Publication of KR101448731B1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2264 Multidimensional index structures
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Bioethics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)

Abstract

PURPOSE: A method for extracting gene-disease-chemical relations from bio-literature using multidimensional indexes is provided to quickly extract gene-disease-chemical relations based on an inverted index and multidimensional indexes, thereby supporting precise sentence-level search. CONSTITUTION: Multidimensional indexes for diseases, genes, and chemicals are built from bio-literature. The multidimensional indexes are stored according to an index storage structure. Diseases, genes, and chemicals are analyzed multidimensionally when a search word is received from a user.

Description

A method for retrieving a gene-disease-chemical relationship using multi-dimensional indexes from huge bio-literatures

FIELD OF THE INVENTION The present invention relates to text mining techniques in the field of bioinformatics, and more particularly to a method for effectively extracting gene-disease-compound relationships from large-scale biotechnology literature by applying multidimensional indexes, which increases the efficiency and accuracy of search and enables multidimensional analysis of genes, diseases, and compounds.

Conventionally, in the field of biology, the results of a large number of biological experiments are published in the literature every year, and accordingly, the strategic use of such information is becoming increasingly important.

In addition, the only way to identify gene-disease-compound relationships from biomedical literature has been keyword search on PubMed. However, about 10,000 documents are currently managed on PubMed, so this approach has clear limits.

Therefore, there is an increasing demand for an infrastructure that allows information of interest to be found quickly in such a large volume of documents, enabling the verification and inference of life phenomena.

As an example of the prior art for identifying gene-disease-compound relationships from biomedical literature as described above, see, for example, "PolySearch: a web-based text mining system for extracting relationships between human diseases, genes, mutations, drugs and metabolites," published May 16, 2008, Nucleic Acids Research, 2008, Vol.

That is, the document discloses a system that allows the user to query by disease or gene and retrieve the related mutations, symptoms, and drugs.

However, the gene-disease-compound relationship analysis technique disclosed in the above-mentioned document has the disadvantage that it considers only X -> Y relationships and cannot analyze X, Y -> Z relationships.

In addition, as another example of the prior art, "Integration of text- and data-mining using ontologies successfully selects disease gene candidates," published February 22, 2005, Nucleic Acids Research, 2005, Vol. 33, No. 5, describes selecting candidate disease-causing genes using ontology, text mining, and data mining techniques.

In addition, as another example of the prior art, "Text-mining and information-retrieval services for molecular biology," published on June 28, 2005, Genome Biology, 2005, 6:224 (doi:10.1186/gb-2005-6-7-224), discloses a technique for automatically extracting functional relationships between genes and proteins from text through text mining in molecular biology.

However, there are limitations in identifying gene-disease-compound relationships by keyword-based retrieval from a large-scale biotechnology literature using the methods described in the prior art as described above.

First, the prior-art methods described above produce so-called false positives, that is, results reported as positive that are actually negative, because the scope of the query is broad. As a result, far more documents are retrieved than necessary, and the user takes a long time to check the information.

Second, the prior-art methods described above have no highlight function for genes, diseases, and compounds, so it is difficult for a user to identify them in a sentence at a glance.

Third, the prior-art methods described above often cannot provide summary information on gene-disease-compound relationships, and where summary information is presented, most of it is the result of manual work and therefore cannot incorporate new information in real time.

Therefore, to solve the problems of the prior art described above, it is desirable to provide a new method for extracting gene-disease-compound relationships from large-scale biotechnology literature that can extract those relationships quickly and flexibly, search and confirm them at the sentence level, and implement an intuitive user interface using indexes. However, no system or method has yet satisfied all of these requirements.

The present invention has been made to solve the problems of the prior art described above. An object of the present invention is therefore to provide a method of extracting gene-disease-compound relationships from large-scale biotechnology literature using multidimensional indexes, which extracts the relationships quickly and flexibly through a multidimensional index structure, allows them to be searched and verified at the sentence level, and implements an intuitive user interface using the indexes.

In order to achieve the above object, the present invention provides a method for extracting a gene-disease-compound relationship from large-scale biotechnology literature, comprising: constructing a multidimensional index for diseases, genes, and compounds from the large-scale biotechnology literature; storing the constructed multidimensional index according to a predetermined index storage structure; and, when a user inputs a search word, performing multidimensional analysis of diseases, genes, and compounds from the large-scale biotechnology literature using the stored index.

The multidimensional index construction comprises: extracting only documents whose Abstract field is not null from the PubMed database; dividing the contents of each abstract into sentence units and creating a sentence table through curation; constructing an inverted index for the sentence table; and comparing the synonym dictionaries for genes, diseases, and compounds with the inverted index to build a dimension index for each of genes, diseases, and compounds.

The sentence table may be stored in the order of [pubmed id, sentence id, sentence].

In addition, the index storage structure is a star schema structure.

In addition, in the index storage structure, the disease index stores information on the pubmed ID, sentence number, disease ID, disease name, start position, and end position, and the standard disease name and synonym information for the disease are stored in association with it.

Further, in the index storage structure, the gene index stores information on the pubmed ID, sentence number, gene ID, gene name, start position, and end position, and the standard gene name and synonym information for the gene are stored in association with it.

In addition, in the index storage structure, the compound index stores information on the pubmed ID, sentence number, compound ID, compound name, start position, and end position, and the compound name and synonym information for the compound are stored in association with it.

In addition, the index storage structure is configured so that a multidimensional analysis model can be established by adding index information for other analysis dimensions in addition to the disease index, the gene index, and the compound index.

In addition, the method is configured to analyze the sentences in one dimension (gene, disease, compound), two dimensions (gene-disease, disease-gene, gene-compound, compound-gene, disease-compound, compound-disease relationship), and three dimensions (gene-disease-compound relationship).

In addition, in the above method, when a user inputs a search word, a color or highlight for each gene, disease, and compound is applied on the screen showing the search results, which provides a visual effect and allows the user to understand the contents intuitively.

In addition, in the method, when the user inputs a search word, the user can grasp the content at the sentence level and then view the entire abstract, so that the abstract can be checked with the sentence as the starting point.

Further, in the method, when a user inputs a search word, keywords related to the search word are displayed, and when one of the keywords is selected, the search results and abstracts corresponding to the search word and the keyword are displayed, so that the user can easily analyze the relationship between the search word and the keyword.

In addition, the method is configured so that the necessary analysis can be performed immediately by accessing the index using SQL, without writing a separate program for extracting gene-disease-compound relationships.

As described above, the present invention provides a method of extracting gene-disease-compound relationships from large-scale biotechnology literature using a multidimensional index that extracts the relationships quickly by using an inverted index and a multidimensional index, supports precise sentence-level search, and can extract X, Y -> Z relationships in addition to X -> Y relationships.

That is, according to the present invention, abstract files are imported from the PubMed database, each abstract is separated into sentence units, inverted indexes are generated for the positions of genes, diseases, and compounds in the separated sentences, and a dimension index is created by name for each of genes, diseases, and compounds. Synonym dictionaries are used to improve search accuracy, and the indexes are linked to the sentences so that multidimensional analysis can be performed.

Therefore, according to the present invention, it is possible to derive the relationship between the biotechnological entities from a large volume of literature, and this can be applied to deriving new relational information from literatures of various fields such as chemistry and physics as well as biotechnology.

FIG. 1 is a view for explaining a procedure for constructing a multidimensional index for disease-gene-compound from large-scale literature in a method for extracting a gene-disease-compound relationship from large-scale biotechnology literature according to the present invention.
FIG. 2 is a view for explaining a storage structure for extracting a disease-gene-compound relationship in a method of extracting a gene-disease-compound relationship from large-scale biotechnology literature according to the present invention.
FIG. 3 is a view showing a screen of a basic search result extracted by applying a multidimensional analysis structure in a method for extracting a gene-disease-compound relationship from large-scale biotechnology literature according to the present invention.
FIG. 4 is a view showing a screen that provides the entire abstract of an extracted sentence in the method for extracting a gene-disease-compound relationship from large-scale biotechnology literature according to the present invention.
FIG. 5 is a diagram illustrating an input screen and a result screen for multidimensional analysis in a method of extracting a gene-disease-compound relationship from large-scale biotechnology literature according to the present invention.
FIG. 6 is a view showing the structure of the SQL for extracting a compound-disease relationship in the method for extracting a gene-disease-compound relationship from large-scale biotechnology literature according to the present invention.
FIG. 7 is a view showing a gene-compound relationship analysis screen as an embodiment of a method for extracting a gene-disease-compound relationship from large-scale biotechnology literature according to the present invention.
FIG. 8 is a view showing a gene-gene relationship analysis screen as an embodiment of a method for extracting a gene-disease-compound relationship from large-scale biotechnology literature according to the present invention.
FIG. 9 is a view showing a disease-gene-abstract relationship analysis screen as an embodiment of a method for extracting a gene-disease-compound relationship from large-scale biotechnology literature according to the present invention.
FIG. 10 is a view showing a disease-gene relationship analysis screen as an embodiment of a method for extracting a gene-disease-compound relationship from large-scale biotechnology literature according to the present invention.

Hereinafter, with reference to the accompanying drawings, the details of the method for extracting the gene-disease-compound relationship from the large-scale biotechnology literature according to the present invention will be described.

Hereinafter, it is to be noted that the following description is only an embodiment for carrying out the present invention, and the present invention is not limited to the contents of the embodiments described below.

That is, as described below, the method of extracting a gene-disease-compound relationship from large-scale biotechnology literature according to the present invention relates to a gene-disease-compound relationship analysis technique that builds, from large-scale literature, a multidimensional index with a star schema structure for analyzing genes, diseases, and compounds, and uses that index to highlight the genes, diseases, and compounds included in the search results.

In addition, the present invention can be applied to information retrieval systems usable across biotechnology, for example to support various search services by adding genes recently discovered in relation to diseases of interest to biotechnology researchers, or by adding dimensions such as organism or body part (anatomy).

Subsequently, with reference to FIGS. 1 to 10, the specific structure of the method of extracting a gene-disease-compound relationship from large-scale biotechnology literature according to the present invention is described.

First, FIG. 1 illustrates a process of constructing a multidimensional index in a method of extracting a gene-disease-compound relationship from large-scale biotechnology literature according to the present invention.

That is, as shown in FIG. 1, the procedure for constructing a multidimensional index for disease-gene-compound from large-scale literature first imports the abstract files from the PubMed database, then separates each abstract into sentence units, and creates an index of the position of each gene, disease, or compound present in the sentences.

Here, in constructing each of the indexes, the synonym term dictionary is used to increase the search accuracy.

As described above, after the index is generated, the indexes and sentences are connected to each other so that the user can perform multidimensional analysis.

More specifically, the above-described procedure for constructing a multidimensional index first extracts documents from the PubMed database, with the extraction condition that only documents whose Abstract field is not null are retrieved (step 1).

Subsequently, the content of each abstract is divided into sentence units to form a sentence table through curation, which is stored, for example, in the order of [pubmed id, sentence id, sentence] (step 2).
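Steps 1 and 2 can be illustrated with a minimal Python sketch. The record layout (`pmid`, `abstract`), the sample text, and the regular-expression sentence splitter are assumptions made for illustration only; the patent does not specify how sentences are segmented or curated.

```python
import re

# Hypothetical input records; the field names are assumptions, not the PubMed schema.
records = [
    {"pmid": 1001, "abstract": "BRCA1 is linked to breast cancer. Tamoxifen is a common therapy."},
    {"pmid": 1002, "abstract": None},  # dropped in step 1 because the abstract is null
]

def build_sentence_table(records):
    """Return rows of (pubmed_id, sentence_id, sentence)."""
    rows = []
    for rec in records:
        if not rec["abstract"]:  # step 1: keep only documents with a non-null abstract
            continue
        # step 2: naive split on whitespace that follows '.', '!' or '?'
        sentences = re.split(r"(?<=[.!?])\s+", rec["abstract"].strip())
        for sentence_id, sentence in enumerate(sentences, start=1):
            rows.append((rec["pmid"], sentence_id, sentence))
    return rows

sentence_table = build_sentence_table(records)
```

A real pipeline would need a splitter that handles abbreviations and decimal numbers, which this naive pattern does not.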

Next, an inverted index is constructed for the sentence table obtained in the above step (step 3).
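As a rough illustration of step 3, the following sketch builds an inverted index that maps each token to the sentences containing it. The tokenization rule (lowercased runs of letters and digits) and the sample rows are assumptions; the patent does not describe the tokenizer or the index engine it uses.

```python
import re
from collections import defaultdict

# Hypothetical sentence table rows: (pubmed_id, sentence_id, sentence)
sentence_table = [
    (1001, 1, "BRCA1 is linked to breast cancer."),
    (1001, 2, "Tamoxifen is a common therapy for breast cancer."),
]

def build_inverted_index(rows):
    """Map each lowercased token to the set of (pubmed_id, sentence_id) keys containing it."""
    index = defaultdict(set)
    for pubmed_id, sentence_id, sentence in rows:
        for token in re.findall(r"[a-z0-9]+", sentence.lower()):
            index[token].add((pubmed_id, sentence_id))
    return index

inverted = build_inverted_index(sentence_table)
```

Looking up `inverted["cancer"]` then returns the keys of both sentences without scanning the sentence text again.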

Subsequently, the gene, disease, and compound synonym dictionaries are compared with the sentence inverted index to build each dimension index (step 4).
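One way to read step 4 is sketched below for the disease dimension: the inverted index narrows each synonym to candidate sentences, and a string match then records the start and end positions. The dictionary contents, the identifier `D001`, and the function name are hypothetical, and the matching strategy is an assumption rather than the patented procedure.

```python
import re
from collections import defaultdict

sentence_table = [
    (1001, 1, "BRCA1 is linked to breast cancer."),
    (1001, 2, "Tamoxifen is a common therapy for mammary carcinoma."),
]

# Hypothetical synonym dictionary: lowercase synonym -> (disease_id, standard_name)
disease_synonyms = {
    "breast cancer": ("D001", "Breast Neoplasms"),
    "mammary carcinoma": ("D001", "Breast Neoplasms"),
}

def build_dimension_index(rows, synonyms):
    """Return rows of (pubmed_id, sentence_id, entity_id, standard_name, start, end)."""
    sentences = {(pmid, sid): text for pmid, sid, text in rows}
    inverted = defaultdict(set)
    for key, text in sentences.items():
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            inverted[token].add(key)
    index = []
    for synonym, (entity_id, standard_name) in synonyms.items():
        tokens = re.findall(r"[a-z0-9]+", synonym)
        if not tokens:
            continue
        # Candidate sentences must contain every token of the synonym.
        candidates = set.intersection(*(inverted.get(t, set()) for t in tokens))
        for key in sorted(candidates):
            pattern = r"\b" + re.escape(synonym) + r"\b"
            # Offsets assume ASCII text, where lowercasing keeps positions unchanged.
            for m in re.finditer(pattern, sentences[key].lower()):
                # The end position is exclusive, as in Python slicing.
                index.append((key[0], key[1], entity_id, standard_name, m.start(), m.end()))
    return index

disease_index = build_dimension_index(sentence_table, disease_synonyms)
```

Because both synonyms map to the same standard name, a search for either surface form can be resolved to the same disease entry.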

Next, with reference to FIG. 2, the method of storing the index constructed as described above is described.

FIG. 2 shows an index storage structure for extracting a disease-gene-compound relationship.

That is, a key feature of the present invention is the storage structure shown in FIG. 2, which allows a sentence to be viewed from the perspective (dimension) of each of the disease, gene, and compound indexes. This structure is known technically as a 'star schema'.

More specifically, as shown in FIG. 2, the disease index, for example, stores information on the pubmed ID, sentence number, disease ID, disease name, start position, and end position, and the standard disease name and synonym information for the disease are stored in association with it.

Similarly, the gene index stores information on the pubmed ID, sentence number, gene ID, gene name, start position, and end position, and the standard gene name and synonym information for the gene are stored in association with it.

In addition, the compound index stores information on the pubmed ID, sentence number, compound ID, compound name, start position, and end position, and the compound name and synonym information for the compound are stored in association with it.

In addition, for other analysis dimensions, only the corresponding index information needs to be added, following the structure above, so that another analysis dimension can easily be added to establish a multidimensional analysis model.
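The storage structure described above could be laid out roughly as follows, using Python's built-in `sqlite3` module. All table and column names are assumptions chosen to mirror the fields listed in the text; FIG. 2 is not reproduced here, so the actual schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Central sentence table, keyed by (pubmed_id, sentence_id).
conn.execute("""
    CREATE TABLE sentence (
        pubmed_id INTEGER,
        sentence_id INTEGER,
        sentence TEXT,
        PRIMARY KEY (pubmed_id, sentence_id)
    )""")

# One index table and one synonym table per analysis dimension.
for dim in ("disease", "gene", "compound"):
    conn.execute(f"""
        CREATE TABLE {dim}_index (
            pubmed_id INTEGER,
            sentence_id INTEGER,
            {dim}_id TEXT,
            {dim}_name TEXT,
            start_pos INTEGER,
            end_pos INTEGER,
            FOREIGN KEY (pubmed_id, sentence_id)
                REFERENCES sentence (pubmed_id, sentence_id)
        )""")
    conn.execute(f"""
        CREATE TABLE {dim}_synonym (
            {dim}_id TEXT,
            standard_name TEXT,
            synonym TEXT
        )""")

tables = sorted(row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"))
```

Under this layout, adding a further dimension such as organism or anatomy only requires appending its name to the tuple in the loop; the sentence table and the existing indexes stay unchanged.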

In this case, examples of query types that can be processed include searching for sentences or abstracts that contain a desired keyword, and searching for sentences or abstracts that satisfy one or more conditions for each type.

In other words, the present invention dramatically increases the performance and accuracy of search by using indexes instead of directly accessing all of the roughly 100 million sentences. The sentences can be analyzed in one dimension (gene, disease, compound), two dimensions (gene-disease, disease-gene, gene-compound, compound-gene, disease-compound, compound-disease relationship), and three dimensions (gene-disease-compound relationship).
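The one-, two-, and three-dimensional analyses can be pictured as set intersections over sentence keys. The sketch below is an in-memory illustration with made-up data and hypothetical function names; it is not the patented implementation, which stores the indexes in a database.

```python
# Hypothetical dimension indexes: entity name -> set of (pubmed_id, sentence_id) keys.
gene_index = {"BRCA1": {(1001, 1), (1002, 3)}}
disease_index = {"breast cancer": {(1001, 1), (1002, 3), (1003, 2)}}
compound_index = {"tamoxifen": {(1002, 3), (1003, 2)}, "aspirin": {(1004, 1)}}

def sentences_for(gene=None, disease=None, compound=None):
    """Intersect the sentence keys of every dimension the caller specified."""
    selected = []
    if gene is not None:
        selected.append(gene_index.get(gene, set()))
    if disease is not None:
        selected.append(disease_index.get(disease, set()))
    if compound is not None:
        selected.append(compound_index.get(compound, set()))
    return set.intersection(*selected) if selected else set()

def related_compounds(gene, disease):
    """X, Y -> Z: count sentences where each compound co-occurs with the gene and disease."""
    shared = sentences_for(gene=gene, disease=disease)
    counts = {name: len(keys & shared) for name, keys in compound_index.items()}
    return {name: n for name, n in counts.items() if n > 0}

one_d = sentences_for(gene="BRCA1")
two_d = sentences_for(gene="BRCA1", disease="breast cancer")
three_d = sentences_for(gene="BRCA1", disease="breast cancer", compound="tamoxifen")
```

Only the small key sets are touched during a query; the sentence text itself is fetched afterwards for the keys that survive the intersection.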

In addition, this storage structure can provide a very flexible structure that can easily add another dimension of analysis.

Next, FIG. 3 shows a basic search result extracted by applying the multidimensional analysis structure described above, in which the keyword search results are presented sentence by sentence.

That is, as shown in FIG. 3, when the user enters a search word, a color is applied to each gene, disease, and compound on the screen showing the search results, which not only provides a visual effect but also lets the user understand the content intuitively.

Here, the gene, disease, and compound information in each sentence is taken from the index.
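Because each index entry stores a start and end position, highlighting can be done without re-scanning the sentence for entity names. The following sketch wraps each mention in an HTML `span` whose class names the dimension; the markup, the class names, and the assumption that mentions do not overlap are all illustrative choices, not details from the patent.

```python
import html

def highlight(sentence, mentions):
    """Wrap each (start, end, kind) mention in a span; mentions are assumed not to overlap."""
    parts = []
    cursor = 0
    for start, end, kind in sorted(mentions):
        parts.append(html.escape(sentence[cursor:start]))
        parts.append(f'<span class="{kind}">{html.escape(sentence[start:end])}</span>')
        cursor = end
    parts.append(html.escape(sentence[cursor:]))
    return "".join(parts)

sentence = "BRCA1 is linked to breast cancer."
# Hypothetical index entries for this sentence; end positions are exclusive.
mentions = [(19, 32, "disease"), (0, 5, "gene")]
marked = highlight(sentence, mentions)
```

A stylesheet can then assign one color per class, which corresponds to the per-dimension coloring described for FIG. 3.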

FIG. 4 shows a screen that provides the entire abstract of an extracted sentence.

That is, as shown in FIG. 4, when the user inputs a search word, the user can grasp the content at the sentence level and then view the entire abstract, so that the abstract can be checked with the sentence as the starting point.

FIG. 5 shows an input screen and a result screen for multidimensional analysis.

That is, as shown in FIG. 5, when a user inputs a search word for a compound, search words for related diseases are displayed through synonym processing, and when the user selects one of them, the search results and abstracts for the compound and the disease are displayed, allowing the user to easily perform compound-disease relationship analysis.

FIG. 6 shows the SQL structure for extracting the compound-disease relationship as shown in FIG. 5.

That is, without the need to write a separate program for extracting gene-disease-compound relationships, the necessary analysis can be performed immediately by accessing the index using SQL, as shown in FIG. 6.
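The SQL of FIG. 6 is not reproduced in this text, so the query below is only a plausible sketch of a compound-disease analysis: it joins a compound index and a disease index on the sentence key and counts co-occurring sentences. The table names, column names, and sample rows are assumptions, and SQLite is used through Python's `sqlite3` module purely to make the example runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE compound_index (pubmed_id INTEGER, sentence_id INTEGER, compound_name TEXT);
CREATE TABLE disease_index  (pubmed_id INTEGER, sentence_id INTEGER, disease_name TEXT);
INSERT INTO compound_index VALUES
    (1002, 3, 'tamoxifen'), (1003, 2, 'tamoxifen'), (1004, 1, 'aspirin');
INSERT INTO disease_index VALUES
    (1002, 3, 'breast cancer'), (1003, 2, 'breast cancer'),
    (1003, 2, 'osteoporosis'), (1004, 1, 'stroke');
""")

# Diseases that share a sentence with the given compound, most frequent first.
query = """
SELECT d.disease_name, COUNT(*) AS n_sentences
FROM compound_index AS c
JOIN disease_index AS d
  ON d.pubmed_id = c.pubmed_id AND d.sentence_id = c.sentence_id
WHERE c.compound_name = ?
GROUP BY d.disease_name
ORDER BY n_sentences DESC, d.disease_name
"""
rows = conn.execute(query, ("tamoxifen",)).fetchall()
```

Swapping the two index tables in the join gives the reverse analysis (compounds for a given disease), and joining a third index table extends the same pattern to gene-disease-compound queries.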

In other words, the present invention configured as described above has the following features. First, as shown in FIGS. 3 and 4, it supports keyword-based logical search at the sentence level, and these screens serve as the basis for confirming the final results of multidimensional analysis. Second, through the multidimensional index structure shown in FIG. 2, ad hoc queries can be performed from each viewpoint of gene, disease, and compound.

FIGS. 7 to 10 show practical applications of multidimensional analysis results using the method of the present invention as described above.

That is, as shown in FIGS. 7 to 10, the present invention enables various multidimensional analyses, such as gene-compound relationship analysis, gene-gene relationship analysis, disease-gene-abstract analysis, and disease-gene analysis.

As described above, by supporting high-performance sentence-level logical search, the present invention solves the problem that biotechnology literature search currently does not support sentence-level search.

In addition, according to the present invention, the user's intuitive understanding is improved through the highlight function for gene, disease, and compound keywords in the search results, and a flexible, high-performance analysis service that uses the multidimensional index of genes, diseases, and compounds can be provided.

That is, the present invention can provide biomedical text mining services for various cases, for example: listing the diseases associated with a specific gene, listing the genes associated with a specific disease, retrieving abstracts that contain a specific disease and gene, listing the genes that co-occur with a specific gene, listing the compounds related to a specific gene, listing the diseases related to a specific compound, listing the diseases related to a specific body part, listing the compounds related to a specific body part, and listing the compounds related to a specific species.

The details of the method for extracting a gene-disease-compound relationship from large-scale biotechnology literature according to the present invention have been described above through embodiments of the present invention. However, the present invention is not limited to these embodiments, and it is obvious that various modifications, changes, combinations, and substitutions may be made by those skilled in the art according to design needs and various other factors.

Claims (13)

A method for extracting a gene-disease-compound relationship from large-scale biotechnology literature, the method comprising:
Constructing a multidimensional index for diseases, genes, and compounds from the large-scale biotechnology literature;
Storing the constructed multidimensional index according to a predetermined index storage structure; and
Performing, when a user enters a search term, multidimensional analysis of diseases, genes, and compounds from the large-scale biotechnology literature using the stored index.
The method of claim 1,
wherein constructing the multidimensional index comprises:
Extracting only documents in which the Abstract field is not null from the PubMed database;
Dividing the contents of each abstract into sentence units to form a sentence table through curation;
Building an inverted index on the sentence table; and
Comparing each synonym dictionary for genes, diseases, and compounds with the inverted index to construct respective dimension indexes for genes, diseases, and compounds.
The method of claim 2,
wherein, in the step of forming the sentence table,
the sentence table is stored in the order of [pubmed id, sentence id, sentence].
The method of claim 1,
wherein the index storage structure is a star schema structure.
The method of claim 4,
wherein, in the index storage structure, the disease index stores information on the pubmed ID, sentence number, disease ID, disease name, start position, and end position, and the standard disease name and synonym information for the disease are stored in association with it.
The method of claim 4,
wherein, in the index storage structure, the gene index stores information on the pubmed ID, sentence number, gene ID, gene name, start position, and end position, and the standard gene name and synonym information for the gene are stored in association with it.
The method of claim 4,
wherein, in the index storage structure, the compound index stores information on the pubmed ID, sentence number, compound ID, compound name, start position, and end position, and the compound name and synonym information for the compound are stored in association with it.
The method of claim 4,
wherein the index storage structure is configured so that a multidimensional analysis model can be established by adding index information for other analysis dimensions in addition to the disease index, the gene index, and the compound index.
The method of claim 1,
wherein the method is configured to analyze the sentences in one dimension (gene, disease, compound), two dimensions (gene-disease, disease-gene, gene-compound, compound-gene, disease-compound, compound-disease relationship), and three dimensions (gene-disease-compound relationship).
The method of claim 1,
wherein, when the user enters a search word, a color or highlight for each gene, disease, and compound is applied on the screen displaying the search results, which provides a visual effect and allows the user to understand the content intuitively.
The method of claim 1,
wherein, when a user enters a search term, the method is configured so that the user can grasp the content at the sentence level and then view the entire abstract.
The method of claim 1,
wherein, when the user enters a search word, keywords related to the search word are displayed, and when one of the keywords is selected, the search results and abstracts corresponding to the search word and the keyword are displayed, so that the user can easily analyze the relationship between the search word and the keyword.
The method of claim 1,
wherein the method is configured so that the necessary analysis can be performed immediately by accessing the index using SQL, without writing a separate program for extracting gene-disease-compound relationships.
KR20100105299A 2010-10-27 2010-10-27 A method for retrieving a gene-disease-chemical relationship using multi-dimensional indexes from huge bio-literatures KR101448731B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR20100105299A KR101448731B1 (en) 2010-10-27 2010-10-27 A method for retrieving a gene-disease-chemical relationship using multi-dimensional indexes from huge bio-literatures

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
KR20100105299A KR101448731B1 (en) 2010-10-27 2010-10-27 A method for retrieving a gene-disease-chemical relationship using multi-dimensional indexes from huge bio-literatures

Publications (2)

Publication Number Publication Date
KR20120043977A (en) 2012-05-07
KR101448731B1 (en) 2014-10-21

Family

ID=46263933

Family Applications (1)

Application Number Title Priority Date Filing Date
KR20100105299A KR101448731B1 (en) 2010-10-27 2010-10-27 A method for retrieving a gene-disease-chemical relationship using multi-dimensional indexes from huge bio-literatures

Country Status (1)

Country Link
KR (1) KR101448731B1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101839572B1 (en) * 2017-11-21 2018-03-16 연세대학교 산학협력단 Apparatus Analyzing Disease-related Genes and Method thereof
KR20190003231A (en) * 2017-06-30 2019-01-09 광주과학기술원 A method for normalizing biomedical names
CN110472037A (en) * 2019-08-21 2019-11-19 北京大学第三医院(北京大学第三临床医学院) A kind of index of medical literature and the extracting method and system of numerical value
KR20210123756A (en) * 2020-04-06 2021-10-14 광주과학기술원 A relevant document extraction system for gene chemical disease
KR102522954B1 (en) * 2022-10-13 2023-04-17 이동근 System and method for providing predictive information by artificial intelligence-based data analysis

Also Published As

Publication number Publication date
KR101448731B1 (en) 2014-10-21

Similar Documents

Publication Publication Date Title
Song et al. Detecting the knowledge structure of bioinformatics by mining full-text collections
Tsuruoka et al. Discovering and visualizing indirect associations between biomedical concepts
Baran et al. pubmed2ensembl: a resource for mining the biological literature on genes
US9613125B2 (en) Data store organizing data using semantic classification
US20140108424A1 (en) Data store organizing data using semantic classification
US9081847B2 (en) Data store organizing data using semantic classification
Kim et al. Open Agile text mining for bioinformatics: the PubAnnotation ecosystem
KR101448731B1 (en) A method for retrieving a gene-disease-chemical relationship using multi-dimensional indexes from huge bio-literatures
Arnaboldi et al. Wormicloud: a new text summarization tool based on word clouds to explore the C. elegans literature
CN103034656A (en) Chapter content tiering method and device, and article content tiering method and device
JP2005122231A (en) Screen display system and screen display method
Wildgaard et al. Advancing PubMed? A comparison of third-party PubMed/Medline tools
Canakoglu et al. Integrative warehousing of biomolecular information to support complex multi-topic queries for biomedical knowledge discovery
KR20130057715A (en) Method for providing deep domain knowledge based on massive science information and apparatus thereof
Hassani-Pak et al. Enhancing data integration with text analysis to find proteins implicated in plant stress response
Lee et al. Using annotations from controlled vocabularies to find meaningful associations
Moftah et al. Methods to access structured and semi-structured data in bioinformatics databases: A perspective
Kusakunniran et al. Journal co-citation analysis for identifying trends of inter-disciplinary research: an exploratory case study in a university
Krishnappa et al. A Bibliometric Study on Bioinformatics: An Analytical Study
Newman et al. A scale-out RDF molecule store for improved co-identification, querying and inferencing
Devignes et al. BioRegistry: Automatic extraction of metadata for biological database retrieval and discovery
Kumar et al. MetaRNA‐Seq: An Interactive Tool to Browse and Annotate Metadata from RNA‐Seq Studies
Cheung Inferring novel relationships through over-representation analysis of medical subjects in biomedical bibliographies
Chen et al. Gene ontology evidence sentence retrieval using combinatorial applications of semantic class and rule patterns
Chun et al. Smart searching system for virtual science brain

Legal Events

Date Code Title Description
A201 Request for examination
E902 Notification of reason for refusal
E701 Decision to grant or registration of patent right
GRNT Written decision to grant
FPAY Annual fee payment

Payment date: 20170824

Year of fee payment: 4