CN108251517A

CN108251517A - A method for analyzing the relative quantities of similar sequences in a system

Info

Publication number: CN108251517A
Application number: CN201711477562.5A
Authority: CN
Inventors: 李阳; 谢先荣; 王世阳
Original assignee: Wuhan Elders Biotechnology Co Ltd
Current assignee: Wuhan Elders Biotechnology Co Ltd
Priority date: 2017-12-29
Filing date: 2017-12-29
Publication date: 2018-07-06

Abstract

The invention discloses a method for analyzing the relative quantities of similar sequences in a system, comprising the following steps: amplifying at least two homologous sequences to be detected with a single primer pair; subjecting the amplification product to first-generation sequencing; submitting the peak data obtained by sequencing, together with the homologous sequences serving as reference difference sequences, to a computer program for comparison; and calculating, for each sequence, the mean signal intensity over all difference sites. The ratio of the mean signal intensities of the sequences represents the quantity ratio of the homologous sequences. For similar sequences in a system that are not all expressed, or that differ in expression level, the invention allows their relative quantities to be analyzed simply and conveniently by a single first-generation sequencing run, which is of great significance for research on the corresponding functional genes.

Description

A method for analyzing the relative quantities of similar sequences in a system

Technical field

The invention belongs to the technical field of molecular biology, and in particular relates to a method for analyzing the relative quantities of similar sequences in a system.

Background art

Functional genomics uses the information and products provided by structural genomics to develop and apply new experimental methods that analyze gene function comprehensively at the genome or system level, shifting biological research from the study of single genes or proteins to the simultaneous, systematic study of multiple genes or proteins. Gene function includes: biological function, such as a protein kinase phosphorylating a specific protein; cellular function, such as participation in intercellular and intracellular signaling pathways; and developmental function, such as participation in morphogenesis. Such functional genes often have multiple copies in the genome. The copies are highly similar in sequence, differing in only a small number of bases or even in only a few SNPs. However, these differing copies are not all expressed, or they differ in expression level under particular conditions, so studying the relative quantities of the multiple copies of a functional gene in the genome is extremely difficult.

Understanding the relative abundance of these similar homologous genes in the cell is important for determining and studying the function of the corresponding genes. PCR is a molecular biology technique for amplifying specific DNA fragments. Conventional PCR readily determines the cases in which homologous genes are all expressed or all not expressed. However, for cases in which the homologous genes differ only in quantity, no simple and convenient analysis method is currently available.

Summary of the invention

To solve the problems in the prior art, the purpose of the invention is to provide a method for analyzing the relative quantities of similar sequences in a system, by which the relative abundance of homologous gene products can be determined simply and quickly using first-generation sequencing.

To achieve the above purpose, the invention adopts the following technical solution:

The invention provides a method for analyzing the relative quantities of similar sequences in a system, comprising the following steps: amplifying at least two homologous sequences to be detected with a single primer pair; subjecting the amplification product to first-generation sequencing; submitting the peak data obtained by sequencing, together with the homologous sequences serving as reference difference sequences, to a computer program for comparison; and calculating, for each sequence, the mean signal intensity over all difference sites. The ratio of the mean signal intensities of the sequences represents the quantity ratio of the homologous sequences.

The first-generation sequencing is based on capillary electrophoresis: after electrophoresis, the fluorescence signals along the lane are scanned and recorded. Specifically, the first-generation sequencing result data are those of the data software developed by ABI, and the peak data are the fluorescence signal intensities corresponding to A, T, C and G at a given distance from the electrophoresis starting point.

Preferably, the amplification process has the same amplification efficiency for all of the homologous sequences.

Preferably, the primer pair matches the homologous sequences 100%, and the amplification target DNA selected for the amplification process contains 4 to 30 difference sites.

Preferably, the reference difference sequences are all of the homologous sequences within the amplification target DNA.

Preferably, two homologous sequences are compared at a time.

Preferably, the comparison performed by the computer program comprises: first determining all difference sites from the submitted reference difference sequences, and recording the specific bases at these difference sites and their positions in the sequence; then, using this position information, looking up the signal intensities of the four bases at these positions in the sequencing result file data and recording them; collecting the signal intensities of the differing bases at all difference sites; normalizing the signal intensities at each site; and calculating, for each sequence, the mean intensity over all difference sites. The ratio of the mean signal intensities of the sequences represents the quantity ratio of the homologous sequences in the starting system.
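The normalization and ratio steps can be sketched as follows. This is a minimal illustration, not the program used in the invention: the function name `relative_quantity_ratio`, the data layout and the intensity values are hypothetical, and normalization is assumed here to mean dividing each base's signal by the sum of the two competing signals at that site, which the text does not specify.

```python
from statistics import mean


def relative_quantity_ratio(sites, intensities):
    """Return the mean normalized signal of sequence A divided by that of sequence B.

    sites: list of (position, base_a, base_b) tuples, one per difference site.
    intensities: dict mapping position -> {"A": .., "C": .., "G": .., "T": ..}.
    """
    norm_a, norm_b = [], []
    for position, base_a, base_b in sites:
        signal_a = intensities[position][base_a]
        signal_b = intensities[position][base_b]
        total = signal_a + signal_b
        if total == 0:
            continue  # no usable signal at this site
        # Assumed normalization: each base's share of the two competing signals.
        norm_a.append(signal_a / total)
        norm_b.append(signal_b / total)
    return mean(norm_a) / mean(norm_b)


# Hypothetical difference sites and four-channel intensities (not patent data).
toy_sites = [(18, "C", "G"), (30, "G", "C"), (33, "A", "T")]
toy_intensities = {
    18: {"A": 10, "C": 600, "G": 300, "T": 5},
    30: {"A": 0, "C": 200, "G": 400, "T": 0},
    33: {"A": 500, "C": 0, "G": 0, "T": 250},
}
print(relative_quantity_ratio(toy_sites, toy_intensities))
```

In this toy input the base belonging to sequence A carries twice the signal of the base belonging to sequence B at every site, so the computed ratio is close to 2.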

Compared with the prior art, the beneficial effects of the invention are as follows: for similar sequences in a system that are not all expressed, or that differ in expression level, the relative quantities of the similar sequences can be analyzed simply and conveniently by a single first-generation sequencing run, which is of great significance for research on the corresponding functional genes.

Description of the drawings

Fig. 1 is the peak data chart of the sequencing result in the embodiment, as opened with the data software.

Detailed description of the embodiments

The technical solution of the invention is described clearly and completely below with reference to the accompanying drawing. The described embodiments are evidently only some of the embodiments of the invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the invention, without creative effort, fall within the scope of protection of the invention.

Embodiment 1

(1) In transgenic rice plants, promoter Pg1090 and promoter Paction drive transcription of the gfp and gfp (m) fragments, respectively. Plant RNA is extracted and reverse transcribed to obtain the gfp and gfp (m) homologous fragments, shown as SEQ ID NO.1 and SEQ ID NO.2 in the sequence listing, respectively.

(2) The gfp and gfp (m) homologous fragments to be detected are amplified with a single primer pair, and the amplification product is subjected to first-generation sequencing. The primer pair is shown as SEQ ID NO.3 and SEQ ID NO.4 in the sequence listing:

GFP-F:ttcttcaagg acgacggcaa

GFP-R:aagttggcct ttatcccgtt

The primer pair matches the homologous fragments to be detected 100%, and the amplification target DNA selected for the amplification process contains 4 to 30 difference sites.

(3) Sequencing yields the result file 2017121889hmapz_GFP.ab1; the peak data chart of the sequencing result, as opened with the data software, is shown in Fig. 1. The peak data obtained by sequencing and the homologous sequences SEQ ID NO.1 and SEQ ID NO.2, serving as reference difference sequences, are submitted together to the computer program for comparison. The comparison process is shown in the following table:

Note: Base_A: the base at the difference site in sequence A (gfp); Base_B: the base at the difference site in sequence B (gfp (m)); Peak_A: the signal intensity of the difference-site base of sequence A (gfp); Peak_B: the signal intensity of the difference-site base of sequence B (gfp (m)); Normalized_A: the signal intensity of the difference-site base of sequence A (gfp) after normalization; Normalized_B: the signal intensity of the difference-site base of sequence B (gfp (m)) after normalization.

All difference sites are determined from the submitted reference difference sequences, and the specific bases at these difference sites and their positions in the sequence are recorded, giving 21 difference sites in total. Using this position information, the signal intensities of the four bases at these positions are then looked up in the sequencing result file data and recorded. The signal intensities of the differing bases at all difference sites are collected, the signal intensities at each site are normalized, and the mean intensity over all difference sites is calculated for each sequence. The ratio of the mean signal intensities of the sequences represents the quantity ratio of the homologous sequences in the starting system.
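The first step, locating the difference sites, can be reproduced directly from the two reference sequences in the sequence listing. The sketch below is an illustration only: the function name `difference_sites` is hypothetical, and it assumes the two sequences have equal length and align position by position without gaps, as SEQ ID NO.1 and SEQ ID NO.2 do.

```python
# SEQ ID NO.1 (gfp) and SEQ ID NO.2 (gfp (m)), copied from the sequence listing.
SEQ_GFP = (
    "ttcttcaagg acgacggcaa ctacaagacg cgagctgagg tgaagttcga gggcgacacg "
    "ctcgtcaacc gtatcgagct caagggaatc gacttcaagg aggacggaaa cattctcgga "
    "cacaagctgg agtacaacta caactcccac aacgtttaca taatggcgga caagcagaag "
    "aacggcatta aggccaactt"
).replace(" ", "")

SEQ_GFP_M = (
    "ttcttcaagg acgacgggaa ctacaagacc cgtgcagagg tcaagttcga gggggacacc "
    "ctcgtgaaca gaatcgagct caagggtatc gacttcaagg aggacggtaa catactcggt "
    "cacaagctcg agtacaacta caactcgcac aacgtataca ttatggccga caagcagaag "
    "aacgggataa aggccaactt"
).replace(" ", "")


def difference_sites(seq_a, seq_b):
    """Return (1-based position, base in A, base in B) for every mismatching position."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length (gap-free alignment assumed)")
    return [
        (index + 1, base_a, base_b)
        for index, (base_a, base_b) in enumerate(zip(seq_a, seq_b))
        if base_a != base_b
    ]


sites = difference_sites(SEQ_GFP, SEQ_GFP_M)
print(len(sites))  # number of difference sites between gfp and gfp (m)
for position, base_a, base_b in sites:
    print(position, base_a, base_b)
```

A position-by-position comparison of the two 200-base sequences yields 21 mismatching positions, in agreement with the 21 difference sites reported in this embodiment.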

The calculation gives a relative quantity ratio of A (gfp) : B (gfp (m)) = 1.712828.
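The reported ratio can also be read as the share of each template in the mixture. The conversion below is plain arithmetic on the reported value; the symbols r, f_A and f_B are notation introduced here for illustration, and the conversion assumes only these two sequences are present.

```latex
r = \frac{N_A}{N_B} = 1.712828, \qquad
f_A = \frac{r}{1 + r} = \frac{1.712828}{2.712828} \approx 0.631, \qquad
f_B = \frac{1}{1 + r} \approx 0.369
```

Under that assumption, roughly 63% of the amplified template would be gfp and roughly 37% gfp (m).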

Based on the above embodiment and the disclosure of the invention, those skilled in the art can simply and conveniently analyze the relative quantities of two or more homologous sequences in the same system, which is of great significance for research on functional genes.

Although embodiments of the invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions and variations may be made to these embodiments without departing from the principles and spirit of the invention. The scope of the invention is defined by the appended claims.

SEQUENCE LISTING

<110>Wuhan Ai Deshi bio tech ltd

<120>A method for analyzing the relative quantities of similar sequences in a system

<130>

<160> 4

<170> PatentIn version 3.3

<210> 1

<211> 200

<212> DNA

<213>Rice gfp segments

<400> 1

ttcttcaagg acgacggcaa ctacaagacg cgagctgagg tgaagttcga gggcgacacg 60

ctcgtcaacc gtatcgagct caagggaatc gacttcaagg aggacggaaa cattctcgga 120

cacaagctgg agtacaacta caactcccac aacgtttaca taatggcgga caagcagaag 180

aacggcatta aggccaactt 200

<210> 2

<211> 200

<212> DNA

<213>Rice gfp (m) segments

<400> 2

ttcttcaagg acgacgggaa ctacaagacc cgtgcagagg tcaagttcga gggggacacc 60

ctcgtgaaca gaatcgagct caagggtatc gacttcaagg aggacggtaa catactcggt 120

cacaagctcg agtacaacta caactcgcac aacgtataca ttatggccga caagcagaag 180

aacgggataa aggccaactt 200

<210> 3

<211> 20

<212> DNA

<213>Artificial sequence GFP-F

<400> 3

ttcttcaagg acgacggcaa 20

<210> 4

<211> 20

<212> DNA

<213>Artificial sequence GFP-R

<400> 4

aagttggcct ttatcccgtt 20

Claims

1. A method for analyzing the relative quantities of similar sequences in a system, characterized by comprising the following steps: amplifying at least two homologous sequences to be detected with a single primer pair; subjecting the amplification product to first-generation sequencing; submitting the peak data obtained by sequencing, together with the homologous sequences serving as reference difference sequences, to a computer program for comparison; and calculating, for each sequence, the mean signal intensity over all difference sites, wherein the ratio of the mean signal intensities of the sequences represents the quantity ratio of the homologous sequences.

2. The method for analyzing the relative quantities of similar sequences in a system according to claim 1, characterized in that the amplification process has the same amplification efficiency for all of the homologous sequences.

3. The method for analyzing the relative quantities of similar sequences in a system according to claim 2, characterized in that the primer pair matches the homologous sequences 100%, and the amplification target DNA selected for the amplification process contains 4 to 30 difference sites.

4. The method for analyzing the relative quantities of similar sequences in a system according to claim 3, characterized in that the reference difference sequences are all of the homologous sequences within the amplification target DNA.

5. The method for analyzing the relative quantities of similar sequences in a system according to claim 1, characterized in that two homologous sequences are compared at a time.

6. The method for analyzing the relative quantities of similar sequences in a system according to any one of claims 1 to 5, characterized in that the comparison performed by the computer program comprises: first determining all difference sites from the submitted reference difference sequences, and recording the specific bases at these difference sites and their positions in the sequence; then, using this position information, looking up the signal intensities of the four bases at these positions in the sequencing result file data and recording them; collecting the signal intensities of the differing bases at all difference sites; normalizing the signal intensities at each site; and calculating, for each sequence, the mean intensity over all difference sites, wherein the ratio of the mean signal intensities of the sequences represents the quantity ratio of the homologous sequences in the starting system.