WO2021080540A1 - A disease estimation system and method according to gene expression values - Google Patents

A disease estimation system and method according to gene expression values

Info

Publication number
WO2021080540A1
Authority
WO
WIPO (PCT)
Prior art keywords
gene expression
value
gene
values
class
Prior art date
Application number
PCT/TR2020/050969
Other languages
French (fr)
Inventor
Alper Yilmaz
Original Assignee
Yildiz Teknik Universitesi
Priority date
2019-10-23
Filing date
2020-10-21
Publication date
2021-04-29
Application filed by Yildiz Teknik Universitesi
Publication of WO2021080540A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00 ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G16B25/10 Gene or protein expression profiling; Expression-ratio estimation or normalisation
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Genetics & Genomics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Biophysics (AREA)
  • Public Health (AREA)
  • Molecular Biology (AREA)
  • Bioethics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Epidemiology (AREA)
  • Evolutionary Computation (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)

Abstract

The present invention is a disease estimation system (100) comprising a processor unit (110) which receives the gene expression values of an individual associated with gene descriptors, wherein said processor unit (110) is configured to carry out the steps of: transforming the gene expression value of each gene descriptor into binary base and dividing each gene expression value transformed into binary base into a first part, a second part and a third part; forming one gene expression pixel (210) for each gene expression value such that the first part is the value of the red component, the second part is the value of the green component and the third part is the value of the blue component; forming a gene expression image (200) comprising the gene expression pixels for said individual; accessing a memory unit comprising a deep learning model formed by training on the reference gene expression images of individuals in at least one first class and in at least one second class; and receiving the gene expression image (200) as input to the deep learning model and classifying said gene expression image (200).

Description

A DISEASE ESTIMATION SYSTEM AND METHOD ACCORDING TO GENE EXPRESSION VALUES
TECHNICAL FIELD
The present invention relates to a disease estimation system comprising a processor unit which receives the gene expression values of an individual associated with gene descriptors.
PRIOR ART
Early diagnosis is important in diseases such as cancer. The cancer diagnosis methods known in the present art are essentially based on mutation analysis. These systems follow a kit-based rationale and are designed for specific mutation analyses. Thus, the method used for diagnosis detects only specific mutations which are already known.
Gene expression data is a collection of numerical values. A gene expression value expresses the amount of a gene characteristic, or of a mutation thereof, which belongs to a person. Interpreting these numerical data manually is difficult and prone to error.
Another existing method is cancer diagnosis from gene expression. In this method, the average of the genes in normal individuals and the average of the genes in individuals who suffer from cancer are compared, and the genes which show a meaningful change are detected and used as markers. However, in this method, genes which have a great effect on the disease but whose effect is masked by the averaging may go undetected, and the disease may not be diagnosed.
As a result of the abovementioned problems, an improvement is required in the related technical field.
BRIEF DESCRIPTION OF THE INVENTION
The present invention relates to a disease estimation system and method for eliminating the abovementioned disadvantages and for bringing new advantages to the related technical field. An object of the present invention is to provide a disease estimation system and method which eliminate operator errors.
Another object of the present invention is to provide a system and method which realize accelerated disease estimation.
In order to realize the abovementioned objects and the objects which are to be deduced from the detailed description below, the present invention is a disease estimation system comprising a processor unit which receives the gene expression values of an individual associated with gene descriptors. Accordingly, the improvement of the subject matter disease estimation system is that said processor unit is configured to carry out the steps of:
- transforming the gene expression value of each gene descriptor into binary base and dividing each gene expression value transformed into binary base into a first part, a second part and a third part,
- forming one gene expression pixel for each gene expression value such that the first part is the value of the red component, the second part is the value of the green component and the third part is the value of the blue component,
- forming a gene expression image comprising the gene expression pixels for said individual,
- accessing a memory unit comprising a deep learning model formed by training on the reference gene expression images of individuals in at least one first class and in at least one second class,
- receiving the gene expression image as input to the deep learning model and classifying said gene expression image.
Thus, disease estimation is realized without any operator error, in other words, human error. Since the gene expression value of each gene is taken into consideration, the probability of error is further reduced, and estimation with increased accuracy is provided.
In a preferred embodiment of the present invention, the first class comprises the reference gene expression values of individuals who suffer from cancer, and the second class comprises the gene expression values of healthy individuals.
The present invention is moreover a disease estimation method realized by a processor unit which receives the gene expression values of an individual associated with gene descriptors, wherein said method comprises the following steps:
- transforming the gene expression value of each gene descriptor into binary base and dividing each gene expression value transformed into binary base into a first part, a second part and a third part,
- forming one gene expression pixel for each gene expression value such that the first part is the value of the red component, the second part is the value of the green component and the third part is the value of the blue component,
- forming a gene expression image comprising the gene expression pixels for said individual,
- accessing a memory unit comprising a deep learning model formed by training on the reference gene expression images of individuals in at least one first class and in at least one second class,
- receiving the gene expression image as input to the deep learning model and classifying said gene expression image.
BRIEF DESCRIPTION OF THE FIGURES
Figure 1 is a representative view of the system.
Figure 2 is a representative view of the gene expression image.
DETAILED DESCRIPTION OF THE INVENTION
In this detailed description, the subject matter is explained with reference to examples, solely in order to make the subject more understandable and without any restrictive effect.
With reference to Figure 1, the present invention is a disease estimation system (100) which expresses the gene expression values of an individual as an image and which, in accordance with these images, estimates whether the individuals are ill or healthy, or which disease the individuals suffer from.
The disease estimation system (100) comprises a processor unit (110) which provides estimation of a disease by executing software, formed by command lines, provided in a memory unit (120). Said processor unit (110) can comprise one or more processors (GPU, CPU, etc.). The memory unit (120) can comprise a memory, or a suitable combination of memory types, which stores data in a permanent and/or temporary manner. The processor unit (110) is associated with the memory unit (120) such that it can read and write data. The disease estimation system (100) can comprise an input/output unit (130) which is associated with the processor unit (110) and which provides data input and data output. The disease estimation system (100) can moreover comprise a display unit (140) on which the processor unit (110) displays data.
The disease estimation system (100) can be a general-purpose computer.
Gene expression values can take values between 0 and 16,300,000. Each gene has a gene expression value, and these values change from person to person. In Table 1, two genes and their gene expression values are given as examples. In this detailed description, the codes, names, identity numbers, etc. assigned to the genes are defined as gene descriptors.
Table 1 (image not reproduced): two example genes and their gene expression values.
The processor unit (110) essentially receives the gene expression values of a patient for whom the disease estimation will be made, transforms these gene expression values into a gene expression image (200) in RGB form, compares them with the reference gene expression values recorded in the memory unit (120), classifies them with respect to their similarities, and makes a disease estimation.
In more detail, the processor unit (110) receives the gene expression values associated with the gene descriptors of an individual. The processor unit (110) transforms each gene expression value into binary base and separates the value in binary base into three parts. In this possible embodiment, each part consists of 8 bits. For instance, the gene expression value 1880505 of the gene with the gene descriptor ENSG00000171428 is 000111001011000110111001 in binary base. Table 2 gives, as an example, this gene expression value written in binary base and separated into its three parts.
Table 2 (image not reproduced): the binary form 000111001011000110111001 separated into the three 8-bit parts 00011100, 10110001 and 10111001.
A gene expression pixel (210) is formed such that the first part comprises the value of the red component, the second part comprises the value of the green component and the third part comprises the value of the blue component. In this example, the red, green and blue components of the pixel are (28, 177, 185). This can also be seen in Table 1.
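The transformation described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name expression_to_rgb is hypothetical, and the code assumes that gene expression values are non-negative integers that fit in 24 bits (three parts of 8 bits each), which holds for the stated range of 0 to 16,300,000.

```python
def expression_to_rgb(value: int) -> tuple[int, int, int]:
    """Split a gene expression value into three 8-bit parts (red, green, blue).

    Assumption: the value is a non-negative integer below 2**24.
    """
    if not 0 <= value < 2 ** 24:
        raise ValueError("gene expression value does not fit in 24 bits")
    bits = format(value, "024b")  # zero-padded 24-digit binary string
    # First 8 bits -> red, next 8 bits -> green, last 8 bits -> blue.
    red, green, blue = (int(bits[i:i + 8], 2) for i in (0, 8, 16))
    return red, green, blue


print(format(1880505, "024b"))     # 000111001011000110111001
print(expression_to_rgb(1880505))  # (28, 177, 185)
```

For the example gene ENSG00000171428, the three parts 00011100, 10110001 and 10111001 evaluate to 28, 177 and 185, and 28 x 65536 + 177 x 256 + 185 gives back 1880505.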
In a similar manner, these processes are carried out for all genes which are received as input, and one gene expression pixel (210) is formed for each gene. The gene expression pixels (210) are placed at the locations to which they have been mapped beforehand, and a gene expression image (200) is formed.
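The image-forming step can be illustrated as follows. The description does not state how genes are mapped to pixel locations, so the row-major mapping over sorted gene descriptors used here is purely an assumption, as are the function names, the second gene descriptor and its value.

```python
def row_major_positions(descriptors, width):
    """Hypothetical pre-mapping: sorted descriptors fill the image row by row."""
    return {d: divmod(i, width) for i, d in enumerate(sorted(descriptors))}


def form_gene_expression_image(expression, positions, width, height):
    """Place one RGB pixel per gene at its pre-mapped (row, column) location."""
    # Cells without a mapped gene stay black.
    image = [[(0, 0, 0)] * width for _ in range(height)]
    for descriptor, value in expression.items():
        row, col = positions[descriptor]
        # Three 8-bit parts of the 24-bit value: red, green, blue.
        image[row][col] = ((value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF)
    return image


expression = {
    "ENSG00000171428": 1880505,    # example value from the description
    "HYPOTHETICAL_GENE_2": 65280,  # made-up gene and value, for illustration only
}
positions = row_major_positions(expression, width=2)
image = form_gene_expression_image(expression, positions, width=2, height=1)
print(image)  # [[(28, 177, 185), (0, 255, 0)]]
```

Whatever mapping is chosen, it has to be identical for the reference images and for the image to be classified, so that a given pixel location always corresponds to the same gene.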
The memory unit (120) comprises a deep learning model formed by training on the reference gene expression images (200) taken from ill and healthy individuals. The reference gene expression images (200) are formed in the same manner as the gene expression images (200). The reference gene expression images are classified according to the condition of the individual from whom the image is taken. In this possible embodiment, they are classified into two classes, namely a first class and a second class. The first class can represent ill individuals or individuals who suffer from cancer, and the second class can represent healthy individuals. The first class and the second class can comprise sub-classes according to the details of the disease and health condition. The deep learning model is formed according to the reference gene expression images and classifies an image taken as input.
The processor unit (110) carries out classification by applying the deep learning model, which places the gene expression image (200), obtained from the gene expression values received as input, in the class of one of the reference gene expression images (200). Afterwards, the processor unit (110) has this classification result displayed on the display unit (140).
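The data flow of this classification step (labelled reference images in, a class label out) can be sketched as below. Note that this is not a deep learning model: the description does not specify a network architecture, so a simple nearest-centroid classifier is swapped in as a stand-in to keep the sketch self-contained. All function names and pixel values are made up for illustration.

```python
def flatten(image):
    """Turn a row-by-row RGB image into one flat list of channel values."""
    return [channel for row in image for pixel in row for channel in pixel]


def train_centroids(reference_images, labels):
    """Stand-in for training: average the flattened reference images per class."""
    sums, counts = {}, {}
    for image, label in zip(reference_images, labels):
        vector = flatten(image)
        if label not in sums:
            sums[label] = [0.0] * len(vector)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vector)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def classify(image, centroids):
    """Assign the image to the class whose centroid is closest (squared distance)."""
    vector = flatten(image)

    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(vector, centroids[label]))

    return min(centroids, key=distance)


# Synthetic 1 x 2 reference images; the values carry no biological meaning.
references = [
    [[(200, 10, 10), (190, 20, 15)]],
    [[(210, 5, 20), (180, 25, 10)]],
    [[(10, 200, 10), (20, 190, 15)]],
    [[(5, 210, 20), (25, 180, 10)]],
]
labels = ["first class", "first class", "second class", "second class"]
centroids = train_centroids(references, labels)
print(classify([[(195, 15, 12), (185, 22, 11)]], centroids))  # first class
```

In the system described, the stand-in would be replaced by the deep learning model stored in the memory unit (120); the inputs and the output keep the same shape.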
The protection scope of the present invention is set forth in the annexed claims and cannot be restricted to the illustrative disclosures given above in the detailed description. This is because a person skilled in the relevant art can obviously produce similar embodiments in the light of the foregoing disclosures, without departing from the main principles of the present invention.
REFERENCE NUMBERS
100 Disease estimation system
110 Processor unit
120 Memory unit
130 Input/output unit
140 Display unit
200 Gene expression image
210 Gene expression pixel

Claims

1. A disease estimation system (100) comprising a processor unit (110) which receives the gene expression values of an individual associated with gene descriptors, wherein said processor unit (110) is configured to realize the steps of:
- transforming the gene expression value of each gene descriptor into binary base and dividing each gene expression value transformed into binary base to a first part, a second part and a third part,
- forming one each gene expression pixel (210) for each gene expression value such that the first part is the value of the red component, the second part is the value of the green component and the third part is the value of the blue component,
- forming a gene expression image (200) comprising the gene expression pixels for said individual.
2. The disease estimation system (100) according to claim 1, wherein the first class comprises the reference gene values of the individuals who suffer from cancer and the second class comprises the gene expression values of healthy individuals.
3. A disease estimation method realized by a processor unit (110) which receives the gene expression values of an individual associated with gene descriptors, wherein the subject matter method comprises the steps of:
- transforming the gene expression value of each gene descriptor into binary base and dividing each gene expression value transformed into binary base to a first part, a second part and a third part,
- forming one each gene expression pixel (210) for each gene expression value such that the first part is the value of the red component, the second part is the value of the green component and the third part is the value of the blue component,
- forming a gene expression image (200) comprising the gene expression pixels for said individual,
- accessing a memory unit comprising a deep learning model formed by teaching the reference gene expression images of the individuals in at least one first class and in at least one second class,
- receiving the gene expression image (200) as input to the deep learning model and classifying said gene expression image (200).
4. The disease estimation method (100) according to claim 2, wherein the first class comprises the reference gene values of the individuals who suffer from cancer and the second class comprises the gene expression values of healthy individuals.
PCT/TR2020/050969 2019-10-23 2020-10-21 A disease estimation system and method according to gene expression values WO2021080540A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
TR2019/16321 2019-10-23
TR201916321 2019-10-23

Publications (1)

Publication Number Publication Date
WO2021080540A1 (en) 2021-04-29

Family

ID=75620196

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/TR2020/050969 WO2021080540A1 (en) 2019-10-23 2020-10-21 A disease estimation system and method according to gene expression values

Country Status (1)

Country Link
WO (1) WO2021080540A1 (en)

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DARENDELI B. N. , AL-QABBANI H., YILMAZ A.: "Kanser Teşhisinde Derin Öğrenme ile Sınıflandırma", L.TEMEL ONKOLOJI SEMPOZYUMU, 11 May 2018 (2018-05-11), Izmir, Turkey, pages 48 *
SHARMA ALOK, VANS EDWIN, SHIGEMIZU DAICHI, BOROEVICH KEITH A., TSUNODA TATSUHIKO: "DeepInsight: A methodology to transform a non-image data to an image for convolution neural network architecture", SCIENTIFIC REPORTS, vol. 9, 6 August 2019 (2019-08-06), pages 11399, XP055819564 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20879774; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 20879774; Country of ref document: EP; Kind code of ref document: A1)