CN112309498B - Gene detection method and device based on deep learning and fluorescence spectrum - Google Patents

Gene detection method and device based on deep learning and fluorescence spectrum

Info

Publication number
CN112309498B
Authority
CN
China
Prior art keywords
fluorescence spectrum
spectrum image
fluorescence
gene
minimum value
Prior art date
Legal status
Active
Application number
CN202011636282.6A
Other languages
Chinese (zh)
Other versions
CN112309498A (en)
Inventor
Li Bin (李斌)
Current Assignee
Wuhan Niufusi Biological Technology Co ltd
Original Assignee
Wuhan Niufusi Biological Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Wuhan Niufusi Biological Technology Co ltd filed Critical Wuhan Niufusi Biological Technology Co ltd
Priority to CN202011636282.6A priority Critical patent/CN112309498B/en
Publication of CN112309498A publication Critical patent/CN112309498A/en
Application granted granted Critical
Publication of CN112309498B publication Critical patent/CN112309498B/en

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B 20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B 20/30 Detection of binding sites or motifs
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q 1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q 1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q 1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q 1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N 21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N 21/62 Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light
    • G01N 21/63 Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light optically excited
    • G01N 21/64 Fluorescence; Phosphorescence
    • G01N 21/6428 Measuring fluorescence of fluorescent products of reactions or of fluorochrome labelled reactive substances, e.g. measuring quenching effects, using measuring "optrodes"
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N 21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N 21/62 Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light
    • G01N 21/63 Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light optically excited
    • G01N 21/64 Fluorescence; Phosphorescence
    • G01N 21/645 Specially adapted constructive features of fluorimeters
    • G01N 21/6456 Spatial resolved fluorescence measurements; Imaging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/50 Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • G06T 5/70
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B 20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B 20/20 Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q 2600/00 Oligonucleotides characterized by their use
    • C12Q 2600/156 Polymorphic or mutational markers
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N 21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N 21/62 Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light
    • G01N 21/63 Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light optically excited
    • G01N 21/64 Fluorescence; Phosphorescence
    • G01N 2021/6417 Spectrofluorimetric devices
    • G01N 2021/6421 Measuring at two or more wavelengths
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N 21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N 21/62 Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light
    • G01N 21/63 Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light optically excited
    • G01N 21/64 Fluorescence; Phosphorescence
    • G01N 21/6428 Measuring fluorescence of fluorescent products of reactions or of fluorochrome labelled reactive substances, e.g. measuring quenching effects, using measuring "optrodes"
    • G01N 2021/6439 Measuring fluorescence of fluorescent products of reactions or of fluorochrome labelled reactive substances, e.g. measuring quenching effects, using measuring "optrodes" with indicators, stains, dyes, tags, labels, marks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10056 Microscopic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20024 Filtering details
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]

Abstract

The invention relates to a gene detection method and a gene detection device based on deep learning and fluorescence spectroscopy, wherein the method comprises the following steps: acquiring fluorescence spectrum images of different markers of a plurality of genes, randomly selecting M fluorescence spectrum images of different wave bands for each gene, fusing them into one mixed fluorescence spectrum image, and enhancing that image; filtering the background noise of the mixed spectrum image according to a Gaussian peak hypothesis and a local self-adaptive polynomial fitting algorithm to obtain a second mixed spectrum image, and then performing feature extraction on the second mixed spectrum image according to a maximum and minimum value self-adaptive algorithm to obtain the peak signal features of the fluorescence spectrum image; and then training a convolutional neural network model on these data and using the trained model to obtain the gene detection result. The method combines traditional filtering and image processing to extract the features of the fluorescence spectrum image, and the fused fluorescence spectra of different markers in the sample improve the robustness, generalization capability and accuracy of the convolutional neural network model.

Description

Gene detection method and device based on deep learning and fluorescence spectrum
Technical Field
The invention relates to the field of biological information and deep learning, in particular to a gene detection method and a gene detection device based on deep learning and fluorescence spectrum.
Background
Fluorescence-based detection methods are an extremely important class of analytical techniques in analytical chemistry, including fluorescence excitation/emission spectroscopy, phase-resolved fluorescence analysis, time-resolved fluorescence analysis, fluorescence immunolabeling analysis, three-dimensional fluorescence analysis, and the like. Fluorescence analysis offers high sensitivity, good selectivity and a wide linear working range, and can easily be adapted to new analytical requirements through chemical means such as synthetic modification, so it is widely applied in the fields of environmental analysis, medical analysis, biological imaging, gene detection and the like.
The accuracy and speed of fluorescence-based gene detection depend on how specifically a gene marker (a nucleotide fragment) binds to the targeted gene fragment. For unknown or newly discovered gene fragments, a marker with good specificity often cannot be obtained in a short time, which makes gene detection difficult; on the other hand, latent faults in, or improper operation of, the detection equipment can cause sample contamination, weak characteristic-wavelength signals or strong background noise, so that the detection result is difficult to determine or deviates substantially.
Disclosure of Invention
The invention provides a gene detection method based on deep learning and fluorescence spectrum in order to accelerate the screening process of gene markers and improve the robustness and accuracy of gene detection, and the method comprises the following steps: acquiring fluorescence spectrum images of different markers of a plurality of genes, randomly selecting M fluorescence spectrum images with different wave bands for each gene, fusing the fluorescence spectrum images into one fluorescence spectrum image, and enhancing the fluorescence spectrum image to obtain a first mixed spectrum image; filtering the background noise of the first mixed spectrum image according to a Gaussian peak hypothesis and a local self-adaptive polynomial fitting algorithm to obtain a second mixed spectrum image; performing feature extraction on the second mixed spectrum image according to a maximum value and minimum value self-adaptive algorithm to obtain peak signal features of the fluorescence spectrum image; taking a gene sequence as a target label, taking a peak signal characteristic corresponding to each gene and a first mixed spectrum image as samples, training a convolutional neural network until the error of the convolutional neural network is lower than a threshold value, and stopping training to obtain a trained convolutional neural network; and inputting the fluorescence spectrum image of the gene to be detected into the trained convolutional neural network to obtain a gene detection result.
In some embodiments of the present invention, the obtaining of fluorescence spectrum images of different markers of a plurality of genes, randomly selecting M fluorescence spectrum images of different wavebands for each gene, fusing the M fluorescence spectrum images into one fluorescence spectrum image, and enhancing the fluorescence spectrum image to obtain a first mixed spectrum image includes the following steps: acquiring fluorescence spectrum images of different markers of multiple genes; clustering the fluorescence spectrum images according to a target gene, an excitation waveband, the chemical species of a marker and a fluorescence color in sequence to obtain a fluorescence spectrum image data set; randomly selecting M fluorescence spectrum images of different wave bands of a gene from the fluorescence spectrum image dataset and fusing the fluorescence spectrum images into one fluorescence spectrum image, wherein M is less than or equal to 4; and enhancing the fluorescence spectrum image to obtain a first mixed spectrum image.
In some embodiments of the present invention, the filtering the background noise of the first mixed spectrum image according to the gaussian peak hypothesis and the local adaptive polynomial fitting algorithm to obtain the second mixed spectrum image includes the following steps: filtering is carried out by adopting a Savitzky-Golay sliding window average filtering algorithm, wherein the Savitzky-Golay sliding window average filtering algorithm is expressed as follows:
$$y[i] = \sum_{k=-N}^{N} w_k \, x[i+k]$$

wherein x[i] is the i-th data point in the raw spectral data, y[i] is the filtered value of x[i], 2N+1 is the window size centered at point x[i], k is the offset of each position in the sliding window from the center point, and w_k is the weight coefficient for that position of the window; and determining all extreme points according to the change of the derivative of the spectral data of the first mixed spectrum image, and thereby determining the position and the peak value of the fluorescence band corresponding to each marker.
In some embodiments of the present invention, the performing feature extraction on the second mixed spectrum image according to a maximum and minimum adaptive algorithm to obtain a peak signal feature of a fluorescence spectrum image includes the following steps: finding out minimum value points of all fluorescence intensities in the fluorescence spectrum image; searching a maximum value point between two adjacent minimum value points, dividing data between the two minimum value points into a left segment and a right segment according to the position of the maximum value point, and respectively setting the minimum value point positioned on the left side of the maximum value point and the minimum value point positioned on the right side of the maximum value point as a left minimum value and a right minimum value; subtracting the left minimum value from all data of the left segment, and subtracting the right minimum value from all data of the right segment; and traversing all two adjacent minimum value points and executing the steps.
In some embodiments of the present invention, the convolutional neural network is a LeNet-5 network, and the LeNet-5 network includes two convolutional layers, two pooling layers, a first fully-connected layer, and a second fully-connected layer. Further, a Dropout layer is included after the first fully-connected layer.
In a second aspect of the invention, there is provided a use of the gene detection method based on deep learning and fluorescence spectroscopy, wherein the method of the first aspect of the invention is applied to the detection of Leber's hereditary optic neuropathy (LHON).
In a third aspect of the present invention, a gene detection apparatus based on deep learning and fluorescence spectroscopy is provided, including an acquisition module, a filtering module, an extraction module, a training module, and an output module, where the acquisition module is configured to acquire fluorescence spectrum images of different markers of a plurality of genes, randomly select M fluorescence spectrum images of different bands for each gene, fuse the fluorescence spectrum images into one fluorescence spectrum image, and enhance the fluorescence spectrum image to obtain a first mixed spectrum image; the filtering module is used for filtering the background noise of the first mixed spectrum image according to a Gaussian peak hypothesis and a local self-adaptive polynomial fitting algorithm to obtain a second mixed spectrum image; the extraction module is used for extracting the characteristics of the second mixed spectrum image according to a maximum value and minimum value self-adaptive algorithm to obtain the peak signal characteristics of the fluorescence spectrum image; the training module is used for training the convolutional neural network until the error of the convolutional neural network is lower than a threshold value by taking the gene sequence as a target label and taking the peak signal characteristic corresponding to each gene and the first mixed spectrum image as samples, and stopping training to obtain the trained convolutional neural network; and the output module is used for inputting the fluorescence spectrum image of the gene to be detected into the trained convolutional neural network to obtain a gene detection result.
Further, the acquisition module comprises a collection module, a clustering module, a fusion module and an enhancement module, wherein the collection module is used for collecting fluorescence spectrum images of different markers of multiple genes; the clustering module is used for clustering the fluorescence spectrum images according to a target gene, an excitation waveband, a chemical type of a marker and a fluorescence color in sequence to obtain a fluorescence spectrum image data set; the fusion module is used for randomly selecting M fluorescence spectrum images of different wave bands of a gene from the fluorescence spectrum image data set and fusing the fluorescence spectrum images into one fluorescence spectrum image, wherein M is less than or equal to 4; and the enhancement module is used for enhancing the fluorescence spectrum image to obtain a first mixed spectrum image.
In a fourth aspect of the present invention, there is provided an electronic apparatus comprising: one or more processors; storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to carry out the method of the first aspect of the invention.
In a fifth aspect of the invention, there is provided a computer-readable medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method of the first aspect of the invention.
Beneficial effects:
1. The method filters background noise through the Gaussian peak hypothesis and a local self-adaptive polynomial fitting algorithm, and further improves the signal-to-noise ratio of the samples by image enhancement, thereby reducing over-fitting and under-fitting of the model and shortening training time;
2. Fluorescence spectra of different markers in different wave bands are fused by a clustering method, which improves the ability of the spectra to characterize gene information, enriches the diversity of the fluorescence spectrum sample data set, and improves the robustness, generalization capability and accuracy of the convolutional neural network model in predicting the target gene sequence;
3. The gene detection method based on deep learning and fluorescence spectroscopy has the advantages of high throughput and high sensitivity.
Drawings
FIG. 1 is a basic flow diagram of a method for gene detection based on deep learning and fluorescence spectroscopy in some embodiments of the invention;
FIG. 2 is a schematic representation of a first mixed spectral image after normalization in some embodiments of the invention;
FIG. 3 shows a second mixed spectrum and its feature map in some embodiments of the invention;
FIG. 4 is a schematic structural diagram of a gene detection device based on deep learning and fluorescence spectroscopy in some embodiments of the present invention;
FIG. 5 is a basic structural diagram of the electronic apparatus of the present invention.
Detailed Description
The principles and features of this invention are described below in conjunction with the following drawings, which are set forth by way of illustration only and are not intended to limit the scope of the invention.
Referring to fig. 1 to 3, in a first aspect of the present invention, there is provided a gene detection method based on deep learning and fluorescence spectroscopy, comprising the steps of: s101, acquiring fluorescence spectrum images of different markers of a plurality of genes, randomly selecting M fluorescence spectrum images of different wave bands for each gene, fusing the fluorescence spectrum images into one fluorescence spectrum image, and enhancing the fluorescence spectrum image to obtain a first mixed spectrum image; s102, filtering background noise of the first mixed spectrum image according to a Gaussian peak hypothesis and a local self-adaptive polynomial fitting algorithm to obtain a second mixed spectrum image; s103, extracting the characteristics of the second mixed spectrum image according to a maximum value and minimum value self-adaptive algorithm to obtain peak signal characteristics of the fluorescence spectrum image; s104, taking the gene sequence as a target label, taking the peak signal characteristic corresponding to each gene and the first mixed spectrum image as samples, training the convolutional neural network until the error of the convolutional neural network is lower than a threshold value, and stopping training to obtain the trained convolutional neural network; and S105, inputting the fluorescence spectrum image of the gene to be detected into the trained convolutional neural network to obtain a gene detection result.
Referring to fig. 2, in step S101 of some embodiments of the present invention, the acquiring fluorescence spectrum images of different markers of a plurality of genes, randomly selecting M fluorescence spectrum images of different wavelength bands for each gene, fusing the M fluorescence spectrum images into one fluorescence spectrum image, and enhancing the fluorescence spectrum image to obtain a first mixed spectrum image includes the following steps: acquiring fluorescence spectrum images of different markers of multiple genes; clustering the fluorescence spectrum images according to a target gene, an excitation waveband, the chemical species of a marker and a fluorescence color in sequence to obtain a fluorescence spectrum image data set; randomly selecting M fluorescence spectrum images of different wave bands of a gene from the fluorescence spectrum image dataset and fusing the fluorescence spectrum images into one fluorescence spectrum image, wherein M is less than or equal to 4; and enhancing the fluorescence spectrum image to obtain a first mixed spectrum image. In the image enhancement, the discrimination or contrast is improved by sharpening, thresholding, edge detection, and the like.
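For illustration only (this is not part of the claimed embodiment), the following minimal Python sketch shows one way the fusion and enhancement of step S101 could be carried out, assuming the M (M ≤ 4) band images are equally sized grayscale arrays. The equal-weight averaging and Laplacian sharpening used here are example choices; the description above names sharpening, thresholding and edge detection as possible enhancement operations, and the final min-max normalization mirrors the normalized image of FIG. 2.

```python
import numpy as np
from scipy import ndimage

def fuse_and_enhance(band_images):
    """Fuse up to M (M <= 4) band images of one gene and enhance the result."""
    assert 1 <= len(band_images) <= 4, "this embodiment uses M <= 4 bands"
    stack = np.stack([np.asarray(img, dtype=np.float64) for img in band_images])
    fused = stack.mean(axis=0)                     # simple equal-weight fusion

    # Laplacian sharpening raises the contrast of peak regions.
    sharpened = fused - 0.5 * ndimage.laplace(fused)

    # Min-max normalisation to [0, 1] (cf. the normalised image of FIG. 2).
    lo, hi = sharpened.min(), sharpened.max()
    return (sharpened - lo) / (hi - lo + 1e-12)
```

The returned array would then serve as the first mixed spectrum image for the subsequent filtering step.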
It can be understood that the fluorescence emission spectrum, referred to as the fluorescence spectrum for short, can be obtained by fixing the excitation wavelength and intensity, measuring the emitted fluorescence intensity at different wavelengths, and plotting the curve of fluorescence intensity versus emission wavelength. Generally, the wavelength range of the first mixed spectrum image or the second mixed spectrum image is 450 nm-750 nm. The common gene marker green fluorescent protein (GFP) contains 238 amino acids and has a molecular weight of 27 kDa; its crystal structure is a secondary structure in which β-sheets form a barrel enclosing one α-helix and are held together by a regular band of hydrogen bonds. In addition, red fluorescent protein, yellow fluorescent protein, blue fluorescent protein and their modified derivatives and mutants can emit fluorescence of different wavelengths or colors under excitation light. Therefore, considering both the diversity of the samples (covering fluorescence of four different colors) and the redundancy of information, the value of M is set to be less than or equal to 4.
In step S102 of some embodiments of the present invention, the filtering of the background noise of the first mixed spectrum image according to the Gaussian peak hypothesis and the local self-adaptive polynomial fitting algorithm to obtain the second mixed spectrum image includes the following steps: filtering with a Savitzky-Golay sliding-window average filtering algorithm, wherein the Savitzky-Golay sliding-window average filtering algorithm is expressed as:

$$y[i] = \sum_{k=-N}^{N} w_k \, x[i+k]$$

wherein x[i] is the i-th data point in the raw spectral data, y[i] is the filtered value of x[i], 2N+1 is the window size centered at point x[i], k is the offset of each position in the sliding window from the center point, and w_k is the weight coefficient for that position of the window, determined by the window size and the degree of the fitted polynomial; and determining all extreme points according to the change of the derivative of the spectral data of the first mixed spectrum image, and thereby determining the position and the peak value of the fluorescence band corresponding to each marker.
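For illustration, a brief Python sketch of this filtering and extremum-detection step is given below; scipy.signal.savgol_filter implements the same sliding-window polynomial (Savitzky-Golay) smoothing, and the window length of 11 and polynomial degree of 3 are example values rather than parameters prescribed by this embodiment.

```python
import numpy as np
from scipy.signal import savgol_filter

def denoise_and_locate_bands(intensity, window_length=11, polyorder=3):
    """Smooth a 1-D spectrum and locate its peaks (band positions) and valleys."""
    smoothed = savgol_filter(intensity, window_length=window_length, polyorder=polyorder)

    # Extreme points are where the sign of the first derivative changes:
    # a positive-to-negative change marks a peak, the reverse marks a valley.
    sign_change = np.diff(np.sign(np.diff(smoothed)))
    peaks = np.where(sign_change < 0)[0] + 1
    valleys = np.where(sign_change > 0)[0] + 1
    return smoothed, peaks, valleys
```

The indices in peaks correspond to the fluorescence bands of the individual markers, and smoothed[peaks] gives their peak values.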
Referring to fig. 3, in step S103 of some embodiments of the present invention, the performing feature extraction on the second mixed spectrum image according to a maximum and minimum adaptive algorithm to obtain the peak signal features of the fluorescence spectrum image includes the following steps: finding all minimum value points of the fluorescence intensity in the fluorescence spectrum image; searching for the maximum value point between two adjacent minimum value points, dividing the data between the two minimum value points into a left segment and a right segment according to the position of the maximum value point, and taking the minimum value point on the left of the maximum value point as the left minimum value and the minimum value point on the right as the right minimum value; subtracting the left minimum value from all data of the left segment, and subtracting the right minimum value from all data of the right segment; and traversing all pairs of adjacent minimum value points and executing the above steps. It should be noted that, because the fluorescence intensity is low and easily disturbed by laser interference, the peak signal features and the original fluorescence spectrum image are used together as the sample, which reduces the complexity of feature extraction while reducing the loss of feature information from the original spectrum.
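A minimal Python sketch of this maximum/minimum adaptive feature extraction is given below for a one-dimensional smoothed intensity curve. The description above does not specify how the shared minimum between two adjacent pairs is treated, so the sketch excludes the right minimum from the right segment and lets the next pair handle it; this boundary choice is an assumption.

```python
import numpy as np

def max_min_peak_features(smoothed):
    """Baseline-correct a smoothed spectrum segment by segment, as described above."""
    y = np.asarray(smoothed, dtype=float).copy()

    # All local-minimum indices of the fluorescence intensity curve.
    sign_change = np.diff(np.sign(np.diff(y)))
    minima = np.where(sign_change > 0)[0] + 1
    min_vals = y[minima].copy()                        # record minima before editing y

    # Traverse every pair of adjacent minima.
    for i in range(len(minima) - 1):
        left, right = minima[i], minima[i + 1]
        peak = left + int(np.argmax(y[left:right + 1]))   # maximum between the minima
        y[left:peak + 1] -= min_vals[i]                   # left segment minus left minimum
        y[peak + 1:right] -= min_vals[i + 1]              # right segment minus right minimum
    return y
```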
In step S104 of some embodiments of the present invention, the convolutional neural network is a LeNet-5 network, and the LeNet-5 network includes two convolutional layers, two pooling layers, a first fully-connected layer, and a second fully-connected layer; further, a Dropout layer is included after the first fully-connected layer. The convolution kernels of the convolutional layers have a size of 5 × 1, i.e., one-dimensional convolution is used. To avoid over-fitting and improve generalization performance, a Batch Normalization (BN) layer is added after each neural network layer, and a random deactivation (Dropout) layer is applied after the first fully-connected layer. It can be understood that, to balance the training time and accuracy of the model, the numbers of convolutional, pooling, fully-connected and Dropout layers can be increased or decreased on the basis of the above network model, or the model structure can even be changed to AlexNet, VGGNet, GoogLeNet, Inception, ResNet, etc.
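One possible realisation of this network in PyTorch is sketched below. Only the overall structure (two 5 × 1 one-dimensional convolutions, two pooling layers, two fully-connected layers, Batch Normalization after each layer and Dropout after the first fully-connected layer) follows the description above; the channel counts, the input length of 1024 spectral points and the dropout rate of 0.5 are illustrative assumptions.

```python
import torch
from torch import nn

class SpectrumLeNet5(nn.Module):
    """LeNet-5 style 1-D CNN for mixed fluorescence spectra (illustrative sketch)."""

    def __init__(self, in_len=1024, n_classes=4):      # in_len assumed divisible by 4
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 6, kernel_size=5, padding=2), nn.BatchNorm1d(6), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(6, 16, kernel_size=5, padding=2), nn.BatchNorm1d(16), nn.ReLU(),
            nn.MaxPool1d(2),
        )
        self.classifier = nn.Sequential(
            nn.Linear(16 * (in_len // 4), 120), nn.BatchNorm1d(120), nn.ReLU(),
            nn.Dropout(0.5),                  # random deactivation after the first FC layer
            nn.Linear(120, n_classes),        # second FC layer -> target gene labels
        )

    def forward(self, x):                     # x: (batch, 1, in_len)
        return self.classifier(self.features(x).flatten(1))

# Example: a batch of eight spectra, each with 1024 intensity values.
model = SpectrumLeNet5()
logits = model(torch.randn(8, 1, 1024))       # -> shape (8, n_classes)
```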
In some embodiments of the present invention, the trained convolutional neural network described above is applied to gene prediction for Leber's hereditary optic neuropathy (LHON). The disease is associated with the point mutations ND4: G11778A, ND6: T14484C and ND1: G3460A of the mitochondrial genes, respectively, and the detection process is as follows:
For the ND1:3460 site, the most preferred forward primer is the 3347-3365 sequence, i.e., SEQ ID No. 1: 5'-taatcgcaatggcattcc-3'; the reverse primer is the 3542-3560 base sequence, i.e., SEQ ID No. 2: 5'-gtagaagagcgatggtga-3'; the length of the amplified fragment is 213 bp;
For the ND4:11778 site, the most preferred forward primer is the 11660-11679 sequence, i.e., SEQ ID No. 3: 5'-attctcatccaaaccccct-3'; the reverse primer is the 11871-11889 base sequence, i.e., SEQ ID No. 4: 5'-cccagtaggttaatagtgg-3'; the length of the amplified fragment is 230 bp;
For the ND6:14484 site, the most preferred forward primer is the 14359-14377 sequence, i.e., SEQ ID No. 5: 5'-acagcgatggctattgag-3'; the reverse primer is the 14586-14603 base sequence, i.e., SEQ ID No. 6: 5'-atcaacgcccataatcat-3'; the length of the amplified fragment is 244 bp;
The primers and base fluorescent markers were prepared into a mixed solution, which was mixed with each of 30 blood samples and subjected to a fluorescence reaction to obtain 30 fluorescence maps; the fluorescence maps were input into the trained convolutional neural network to obtain the corresponding gene sequences; and the corresponding gene sequences were compared against the 11778, 14484 and 3460 sites of the DNA to determine whether each blood sample contained an LHON mutant gene. The results showed that 2 cases of the 11778 mutation, 1 case of the 14484 mutation and 1 case of the 3460 mutation were detected in total, consistent with the results obtained by analysis with a fully automated analyzer.
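For illustration, the following Python sketch shows how the sequences predicted by the network might be screened against the three LHON sites. The wild-type and mutant bases follow the G3460A, G11778A and T14484C mutations listed above; the 0-based indexing into the amplicon and the use of the forward-primer start position as the fragment origin are simplifying assumptions, not part of the described embodiment.

```python
# Reference (wild-type) and mutant bases for the three LHON point mutations:
# ND1: G3460A, ND4: G11778A, ND6: T14484C.
LHON_SITES = {3460: ("G", "A"), 11778: ("G", "A"), 14484: ("T", "C")}

def screen_amplicon(predicted_seq, fragment_start):
    """Report any LHON mutation found in one predicted amplicon sequence."""
    hits = []
    for pos, (wild, mutant) in LHON_SITES.items():
        idx = pos - fragment_start            # 0-based offset within the amplicon (assumption)
        if 0 <= idx < len(predicted_seq) and predicted_seq[idx].upper() == mutant:
            hits.append(f"{wild}{pos}{mutant}")
    return hits

# Example: for the ND1 amplicon whose forward primer starts at position 3347,
# screen_amplicon(predicted_seq, fragment_start=3347) reports "G3460A" if present.
```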
Referring to fig. 4, in a second aspect of the present invention, there is provided a gene detection apparatus 1 based on deep learning and fluorescence spectroscopy, including an obtaining module 11, a filtering module 12, an extracting module 13, a training module 14, and an output module 15, where the obtaining module 11 is configured to obtain fluorescence spectrum images of different markers of a plurality of genes, randomly select M fluorescence spectrum images of different bands for each gene, fuse the fluorescence spectrum images into one fluorescence spectrum image, and enhance the fluorescence spectrum image to obtain a first mixed spectrum image; the filtering module 12 is configured to filter the background noise of the first mixed spectrum image according to a gaussian peak hypothesis and a local adaptive polynomial fitting algorithm to obtain a second mixed spectrum image; the extraction module 13 is configured to perform feature extraction on the second mixed spectrum image according to a maximum and minimum adaptive algorithm to obtain a peak signal feature of the fluorescence spectrum image; the training module 14 is configured to use the gene sequence as a target label, use the peak signal characteristic corresponding to each gene and the first mixed spectrum image as samples, train the convolutional neural network until an error of the convolutional neural network is lower than a threshold, and stop training to obtain a trained convolutional neural network; and the output module 15 is used for inputting the fluorescence spectrum image of the gene to be detected into the trained convolutional neural network to obtain a gene detection result.
Further, the obtaining module 11 includes a collection module, a clustering module, a fusion module, and an enhancement module, and the collection module is configured to collect fluorescence spectrum images of different markers of multiple genes; the clustering module is used for clustering the fluorescence spectrum images according to a target gene, an excitation waveband, a chemical type of a marker and a fluorescence color in sequence to obtain a fluorescence spectrum image data set; the fusion module is used for randomly selecting M fluorescence spectrum images of different wave bands of a gene from the fluorescence spectrum image data set and fusing the fluorescence spectrum images into one fluorescence spectrum image, wherein M is less than or equal to 4; and the enhancement module is used for enhancing the fluorescence spectrum image to obtain a first mixed spectrum image.
Referring to fig. 5, an electronic device 500 may include a processing means (e.g., central processing unit, graphics processor, etc.) 501 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)502 or a program loaded from a storage means 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data necessary for the operation of the electronic apparatus 500 are also stored. The processing device 501, the ROM 502, and the RAM 503 are connected to each other through a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
The following devices may be connected to the I/O interface 505 in general: input devices 506 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 507 including, for example, a Liquid Crystal Display (LCD), speakers, vibrators, and the like; a storage device 508 including, for example, a hard disk; and a communication device 509. The communication means 509 may allow the electronic device 500 to communicate with other devices wirelessly or by wire to exchange data. While fig. 5 illustrates an electronic device 500 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided. Each block shown in fig. 5 may represent one device or may represent multiple devices as desired.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 509, or installed from the storage means 508, or installed from the ROM 502. The computer program, when executed by the processing device 501, performs the above-described functions defined in the methods of embodiments of the present disclosure. It should be noted that the computer readable medium described in the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In embodiments of the disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In embodiments of the present disclosure, however, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the electronic device, or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more computer programs which, when executed by the electronic device, cause the electronic device to carry out the gene detection method based on deep learning and fluorescence spectroscopy described in the above embodiments.
computer program code for carrying out operations for embodiments of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C + +, Python, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (9)

1. A gene detection method based on deep learning and fluorescence spectroscopy is characterized by comprising the following steps:
acquiring fluorescence spectrum images of different markers of a plurality of genes, randomly selecting M fluorescence spectrum images with different wave bands for each gene, fusing the fluorescence spectrum images into one fluorescence spectrum image, and enhancing the fluorescence spectrum image to obtain a first mixed spectrum image;
filtering the background noise of the first mixed spectrum image according to a Gaussian peak hypothesis and a local self-adaptive polynomial fitting algorithm to obtain a second mixed spectrum image;
performing feature extraction on the second mixed spectrum image according to a maximum value and minimum value self-adaptive algorithm to obtain peak signal features of the fluorescence spectrum image; the step of extracting the characteristics of the second mixed spectrum image according to the maximum value and minimum value self-adaptive algorithm to obtain the peak signal characteristics of the fluorescence spectrum image comprises the following steps: finding out minimum value points of all fluorescence intensities in the fluorescence spectrum image; searching a maximum value point between two adjacent minimum value points, dividing data between the two minimum value points into a left segment and a right segment according to the position of the maximum value point, and respectively setting the minimum value point positioned on the left side of the maximum value point and the minimum value point positioned on the right side of the maximum value point as a left minimum value and a right minimum value; subtracting the left minimum value from all data of the left segment, and subtracting the right minimum value from all data of the right segment; traversing all two adjacent minimum value points, and executing the steps;
taking a gene sequence as a target label, taking a peak signal characteristic corresponding to each gene and a first mixed spectrum image as samples, training a convolutional neural network until the error of the convolutional neural network is lower than a threshold value, and stopping training to obtain a trained convolutional neural network;
and inputting the fluorescence spectrum image of the gene to be detected into the trained convolutional neural network to obtain a gene detection result.
2. The gene detection method based on deep learning and fluorescence spectroscopy of claim 1, wherein the obtaining of fluorescence spectrum images of different markers of a plurality of genes, the randomly selecting and fusing M fluorescence spectrum images of different wave bands into one fluorescence spectrum image for each gene, and the enhancing of the fluorescence spectrum image to obtain the first mixed spectrum image comprises the following steps:
acquiring fluorescence spectrum images of different markers of multiple genes;
clustering the fluorescence spectrum images according to a target gene, an excitation waveband, the chemical species of a marker and a fluorescence color in sequence to obtain a fluorescence spectrum image data set;
randomly selecting M fluorescence spectrum images of different wave bands of a gene from the fluorescence spectrum image dataset and fusing the fluorescence spectrum images into one fluorescence spectrum image, wherein M is less than or equal to 4;
and enhancing the fluorescence spectrum image to obtain a first mixed spectrum image.
3. The gene detection method based on deep learning and fluorescence spectrum of claim 1, wherein the step of filtering the background noise of the first mixed spectrum image according to the Gaussian peak hypothesis and the local adaptive polynomial fitting algorithm to obtain the second mixed spectrum image comprises the following steps:
filtering is carried out by adopting a Savitzky-Golay sliding window average filtering algorithm, wherein the Savitzky-Golay sliding window average filtering algorithm is expressed as follows:
$$y[i] = \sum_{k=-N}^{N} w_k \, x[i+k]$$

wherein x[i] is the i-th data point in the raw spectral data, y[i] is the filtered value of x[i], 2N+1 is the window size centered at point x[i], k is the offset of each position in the sliding window from the center point, and w_k is the weight coefficient for that position of the window;
and determining all extreme points according to the derivative change of the spectral data of the first mixed spectral image, and further determining the position and the peak value of the fluorescence waveband corresponding to each marker.
4. The gene detection method based on deep learning and fluorescence spectroscopy of claim 1, wherein the convolutional neural network is a LeNet-5 network, the LeNet-5 network comprises two convolutional layers, two pooling layers, a first fully-connected layer and a second fully-connected layer, and a Dropout layer is included after the first fully-connected layer.
5. The gene detection method based on deep learning and fluorescence spectroscopy as claimed in claim 4, wherein the method is used for gene detection of Leber's hereditary optic neuropathy at the mitochondrial ND1:3460, ND4:11778 and ND6:14484 sites, wherein: the forward primer for detecting the mitochondrial gene ND1: G3460A point mutation is SEQ ID No. 1, and the reverse primer sequence is SEQ ID No. 2; the forward primer for detecting the mitochondrial gene ND4: G11778A point mutation is SEQ ID No. 3, and the reverse primer sequence is SEQ ID No. 4; the forward primer for detecting the mitochondrial gene ND6: T14484C point mutation is SEQ ID No. 5, and the reverse primer sequence is SEQ ID No. 6.
6. A gene detection device based on deep learning and fluorescence spectrum is characterized by comprising an acquisition module, a filtering module, an extraction module, a training module and an output module,
the acquisition module is used for acquiring fluorescence spectrum images of different markers of a plurality of genes, randomly selecting M fluorescence spectrum images with different wave bands for each gene, fusing the fluorescence spectrum images into one fluorescence spectrum image, and enhancing the fluorescence spectrum image to obtain a first mixed spectrum image;
the filtering module is used for filtering the background noise of the first mixed spectrum image according to a Gaussian peak hypothesis and a local self-adaptive polynomial fitting algorithm to obtain a second mixed spectrum image;
the extraction module is used for extracting the characteristics of the second mixed spectrum image according to a maximum value and minimum value self-adaptive algorithm to obtain the peak signal characteristics of the fluorescence spectrum image; the step of extracting the characteristics of the second mixed spectrum image according to the maximum value and minimum value self-adaptive algorithm to obtain the peak signal characteristics of the fluorescence spectrum image comprises the following steps: finding out minimum value points of all fluorescence intensities in the fluorescence spectrum image; searching a maximum value point between two adjacent minimum value points, dividing data between the two minimum value points into a left segment and a right segment according to the position of the maximum value point, and respectively setting the minimum value point positioned on the left side of the maximum value point and the minimum value point positioned on the right side of the maximum value point as a left minimum value and a right minimum value; subtracting the left minimum value from all data of the left segment, and subtracting the right minimum value from all data of the right segment; traversing all two adjacent minimum value points, and executing the steps;
the training module is used for training the convolutional neural network until the error of the convolutional neural network is lower than a threshold value by taking the gene sequence as a target label and taking the peak signal characteristic corresponding to each gene and the first mixed spectrum image as samples, and stopping training to obtain the trained convolutional neural network;
and the output module is used for inputting the fluorescence spectrum image of the gene to be detected into the trained convolutional neural network to obtain a gene detection result.
7. The gene detection device based on deep learning and fluorescence spectroscopy of claim 6, wherein the acquisition module comprises a collection module, a clustering module, a fusion module, and an enhancement module,
the collection module is used for collecting fluorescence spectrum images of different markers of multiple genes;
the clustering module is used for clustering the fluorescence spectrum images according to a target gene, an excitation waveband, a chemical type of a marker and a fluorescence color in sequence to obtain a fluorescence spectrum image data set;
the fusion module is used for randomly selecting M fluorescence spectrum images of different wave bands of a gene from the fluorescence spectrum image data set and fusing the fluorescence spectrum images into a fluorescence spectrum image, wherein M is less than or equal to 4;
the enhancement module is used for enhancing the fluorescence spectrum image to obtain a first mixed spectrum image.
8. An electronic device, comprising: one or more processors; storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to carry out the method according to any one of claims 1-5.
9. A computer-readable medium, on which a computer program is stored, wherein the computer program, when being executed by a processor, carries out the method according to any one of claims 1-5.
CN202011636282.6A 2020-12-31 2020-12-31 Gene detection method and device based on deep learning and fluorescence spectrum Active CN112309498B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011636282.6A CN112309498B (en) 2020-12-31 2020-12-31 Gene detection method and device based on deep learning and fluorescence spectrum

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011636282.6A CN112309498B (en) 2020-12-31 2020-12-31 Gene detection method and device based on deep learning and fluorescence spectrum

Publications (2)

Publication Number Publication Date
CN112309498A (en) 2021-02-02
CN112309498B (en) 2021-04-09

Family

ID=74487585

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011636282.6A Active CN112309498B (en) 2020-12-31 2020-12-31 Gene detection method and device based on deep learning and fluorescence spectrum

Country Status (1)

Country Link
CN (1) CN112309498B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112861986B (en) * 2021-03-02 2022-04-22 广东工业大学 Method for detecting blood fat subcomponent content based on convolutional neural network
CN113962904B (en) * 2021-11-26 2023-02-10 江苏云脑数据科技有限公司 Method for filtering and denoising hyperspectral image
CN114720436B (en) * 2022-01-24 2023-05-12 四川农业大学 Agricultural product quality parameter detection method and equipment based on fluorescence hyperspectral imaging
CN114885094B (en) * 2022-03-25 2024-03-29 北京旷视科技有限公司 Image processing method, image processor, image processing module and device
CN115602245B (en) * 2022-09-09 2023-10-03 郑州思昆生物工程有限公司 Method, device, equipment and storage medium for screening fluorescent images
CN116596933B (en) * 2023-07-18 2023-09-29 深圳赛陆医疗科技有限公司 Base cluster detection method and device, gene sequencer and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104266955A (en) * 2014-09-02 2015-01-07 上海凯度机电科技有限公司 High content image flow biological microscopic analysis system
EP3324320A1 (en) * 2016-11-21 2018-05-23 Johnson & Johnson Vision Care Inc. Biomedical sensing methods and apparatus for the detection and prevention of lung cancer states
CN109580527A (en) * 2019-01-18 2019-04-05 重庆医科大学 A kind of infrared spectrum analysis identifying abo blood group based on histotomy
CN110689036A (en) * 2018-07-06 2020-01-14 塔塔咨询服务有限公司 Method and system for automatic chromosome classification
WO2020115755A1 (en) * 2018-12-07 2020-06-11 Rangasamy Naidu Educational Trust Nanodiamond with fluorescent and sparkling properties
CN111968705A (en) * 2020-07-23 2020-11-20 北斗生命科学(广州)有限公司 Gene sequencing order processing method, system and medium based on cloud architecture

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10580130B2 (en) * 2017-03-24 2020-03-03 Curadel, LLC Tissue identification by an imaging system using color information

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104266955A (en) * 2014-09-02 2015-01-07 上海凯度机电科技有限公司 High content image flow biological microscopic analysis system
EP3324320A1 (en) * 2016-11-21 2018-05-23 Johnson & Johnson Vision Care Inc. Biomedical sensing methods and apparatus for the detection and prevention of lung cancer states
CN110689036A (en) * 2018-07-06 2020-01-14 塔塔咨询服务有限公司 Method and system for automatic chromosome classification
WO2020115755A1 (en) * 2018-12-07 2020-06-11 Rangasamy Naidu Educational Trust Nanodiamond with fluorescent and sparkling properties
CN109580527A (en) * 2019-01-18 2019-04-05 重庆医科大学 A kind of infrared spectrum analysis identifying abo blood group based on histotomy
CN111968705A (en) * 2020-07-23 2020-11-20 北斗生命科学(广州)有限公司 Gene sequencing order processing method, system and medium based on cloud architecture

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhou LN et al., "Rice Blast Prediction Model Based on Analysis of Chlorophyll Fluorescence Spectrum", Spectral Analysis, 2014-12-31, Vol. 34, No. 4, pp. 1003-1006 *

Also Published As

Publication number Publication date
CN112309498A (en) 2021-02-02

Similar Documents

Publication Publication Date Title
CN112309498B (en) Gene detection method and device based on deep learning and fluorescence spectrum
US20200302603A1 (en) Method of computing tumor spatial and inter-marker heterogeneity
US10371639B2 (en) Detecting fluorescent material in a stained particle by comparison with an unstained particle over a plurality of frequency bands and by estimating a linear combination of base vectors
McRae et al. Robust blind spectral unmixing for fluorescence microscopy using unsupervised learning
EP3005293B1 (en) Image adaptive physiologically plausible color separation
US9128055B2 (en) Information processing apparatus, information processing method, program, and method of correcting intensity of fluorescence spectrum
Du et al. Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching
US6750964B2 (en) Spectral imaging methods and systems
US20120112098A1 (en) Enhancing visual assessment of samples
US20120015825A1 (en) Analytical systems and methods with software mask
CN103003683B (en) Apparatus, system, and method for increasing measurement accuracy in a particle imaging device using light distribution
CN111095358B (en) Slide glass image color deconvolution system and method for assisting tissue sample analysis
WO2019181845A1 (en) Biological tissue analyzing device, biological tissue analyzing program, and biological tissue analyzing method
US20170206655A1 (en) Systems and methods for color deconvolution
Bengtsson et al. Computer‐aided diagnostics in digital pathology
EP3961194B1 (en) Method and apparatus for multiplexed imaging of biomolecules through iterative unmixing of fluorophore signals
CN104350378B (en) Method and apparatus for the performance of measure spectrum system
CN113777053B (en) High-flux detection method and device based on quantum dot fluorescence and multispectral camera
US11377685B2 (en) Base sequence determination apparatus, capillary array electrophoresis apparatus, and method
WO2018042752A1 (en) Signal analysis device, signal analysis method, computer program, measurement device, and measurement method
US11361411B2 (en) Neighbor influence compensation
Zhao et al. Hyperspectral imaging analysis of a photonic crystal bead array for multiplex bioassays
Schmidt et al. Skin Whole‐Mount Immunofluorescent Staining Protocol, 3D Visualization, and Spatial Image Analysis
Huang et al. Applications of machine learning tools for ultra-sensitive detection of lipoarabinomannan with plasmonic grating biosensors in clinical samples of tuberculosis
CN117388249A (en) Method and device for identifying luminous pearl

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant