CN105260395B - The storage of STR data and paternity test sequence comparison method based on inverted index structure - Google Patents

Publication number
CN105260395B
Authority
CN
China
Prior art keywords
str
data
sample
index
inverted index
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201510590067.XA
Other languages
Chinese (zh)
Other versions
CN105260395A (en)
Inventor
刘健
李宝娟
高东怀
许卫中
孙茂
许浩
靳豪杰
张军超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fourth Military Medical University FMMU
Original Assignee
Fourth Military Medical University FMMU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fourth Military Medical University FMMU
Priority to CN201510590067.XA
Publication of CN105260395A
Application granted
Publication of CN105260395B
Legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures

Abstract

The invention discloses an STR data storage and paternity-test ranked comparison method based on an inverted index structure, belonging to the technical field of data storage and processing. The method has two main parts. The first is an STR data storage method based on an inverted index structure, which establishes a separate data field for each STR locus selected for the samples and stores the STR data within each data field as an inverted index. The second is a paternity-test ranked comparison method which, using the per-field inverted indexes, calculates the kinship between a relative-search sample (the sample submitted by a person searching for relatives) and the samples in the database, enabling fast, stable and reliable online searching for relatives.

Description

The storage of STR data and paternity test sequence comparison method based on inverted index structure
Technical field
The invention belongs to the technical field of data storage and processing, and in particular relates to an STR data storage and paternity-test ranked comparison method based on an inverted index structure.
Background art
According to incomplete statistics, about 500,000 people nationwide are currently searching for relatives. Most of them are war orphans left by historical causes (such as Japanese war orphans), children orphaned by natural disasters, and abducted persons displaced by social problems. In recent years, with the continuing development of biotechnology, searching for relatives by genetic techniques has become increasingly feasible.
Gene-based relative searching mainly uses paternity testing to detect human genetic markers and, according to the laws of inheritance, to identify the genetic relationship between a suspected parent and child. DNA is the basic carrier of human genetic information, and human chromosomes consist mainly of DNA. Each human somatic cell has 22 pairs of autosomes and 1 pair of sex chromosomes, 46 chromosomes in total, inherited from the father and the mother. Each parent contributes half of the offspring's chromosomes, which pair up after fertilization to form the chromosomes of the offspring. Because the human genome comprises about 3 billion nucleotides, and exchange and recombination before germ-cell formation are random, no two people other than identical twins have the same nucleotide sequence; this is human genetic polymorphism. Despite this polymorphism, each person's chromosomes can only come from his or her parents, which is the theoretical basis of DNA paternity testing. At present, the technique most widely used in paternity testing is identification based on short tandem repeats (STR). Because it is highly sensitive, highly individual-specific and fully digital, it has become a globally accepted identification technique. A typical set of STR locus data is as follows:
Site STR
D8S1179 13/14
D21S11 31/32
D7S820 11/12
CSF1PO 10/13
D3S1358 15/16
D5S818 11/13
D13S317 8/12
D16S539 9/12
D2S1338 17/23
D19S433 14/14
vWA 16/18
D12S391 18/21
D18S51 13/13
AMEL X/Y
D6S1043 12/19
FGA 22/23
At present, solutions to the paternity-testing problem mostly rely on relational databases to store and compare STR data in order to determine whether sample donors are parent and child. For one locus, the STR data consist mainly of two numbers, one inherited from the father and the other from the mother. Suppose 16 loci are tested for each sample (including one sex locus), so that each sample has two allele values at every locus. For two tested persons who are biological parent and child, at least one allele value must be identical at each of the 15 STR loci. To decide whether two individuals are parent and child, up to 4 comparisons are therefore needed at each locus, and up to 60 comparisons for 15 loci. As the number of samples stored in the system grows, the amount of comparison grows with it. Thus, although relational storage and comparison solve the storage and retrieval problem of a relative-search database to some extent, the characteristics of human STR locus data make it ill suited to the table-based storage of a relational database, and this greatly reduces the efficiency of STR comparison. In addition, genetic information can mutate; once STR data have mutated, relative searching and paternity testing with STR data become even more difficult.
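The comparison count described above can be illustrated with a minimal Python sketch. The function names and the two sample genotypes are hypothetical and are not part of the patent; the sketch assumes both samples were typed at the same loci.

```python
from itertools import product

def shares_allele(a, b):
    """Return (match, comparisons) for two genotypes such as ("13", "14")."""
    comparisons = 0
    for x, y in product(a, b):          # at most 2 x 2 = 4 allele comparisons
        comparisons += 1
        if x == y:
            return True, comparisons
    return False, comparisons

def naive_parent_child_check(sample_a, sample_b):
    """Compare two samples locus by locus, the way a pairwise table scan would."""
    total = 0
    consistent = True
    for locus in sample_a:              # assumes sample_b has the same loci
        match, n = shares_allele(sample_a[locus], sample_b[locus])
        total += n
        consistent = consistent and match
    return consistent, total

# Hypothetical genotypes at two loci with no shared allele (worst case).
a = {"D8S1179": ("13", "14"), "D21S11": ("31", "32")}
b = {"D8S1179": ("10", "12"), "D21S11": ("29", "30")}
print(naive_parent_child_check(a, b))   # (False, 8)
```

With 15 loci the same loop performs at most 15 × 4 = 60 allele comparisons per pair of samples, and a full scan repeats this for every stored sample.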
Summary of the invention
To overcome the above problems of the prior art, the object of the invention is to provide an STR data storage and paternity-test ranked comparison method based on an inverted index structure. The method effectively improves the robustness and comparison efficiency of a relative-search database while ensuring the reliability of the search results.
The invention is achieved through the following technical solution:
The STR data storage and paternity-test ranked comparison method based on an inverted index structure disclosed by the invention comprises the following steps:
1) STR data storage based on an inverted index structure
First, all STR data are preprocessed and the STR data of each sample are arranged into a standard format; then each locus is treated as a data field, and each data field stores the STR data of its own locus; finally, the STR data are stored as an inverted index;
2) Paternity-test ranked comparison based on the STR data stored as an inverted index
First, the STR data of the relative-search sample are preprocessed and arranged into the standard format; then the STR data of each locus are compared within the corresponding data field, and a final parent-child relationship index is formed; finally, whether a parent-child relationship exists between samples is judged: if the parent-child relationship index is higher than a specified value, the donor of the candidate sample and the donor of the relative-search sample are considered to be parent and child; otherwise, no parent-child relationship is considered to exist between them.
In step 1), the STR data are preprocessed and the STR data of each sample are arranged into the standard format, specifically as follows:
The sample data set is denoted $X=\{x_1, x_2, \ldots, x_n\}$;
where $x_i$ denotes the STR data of the $i$-th individual, $x_i=\{str_j:(v_{j1}/v_{j2})\}$, in which $str_j$ denotes the name of the $j$-th STR locus and $v_{jk}$ denotes the STR value at locus $j$ on the $k$-th chromosome ($k=1,2$).
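As a rough illustration of this standard format, the following Python sketch turns raw (locus, genotype) rows into a dictionary keyed by locus name. The function name `to_standard_format` is hypothetical; alleles are kept as strings here, which is an assumption made so that values such as 30.2 or X are preserved unchanged.

```python
def to_standard_format(rows):
    """Turn (locus, "a/b") rows into {locus: (a, b)}.

    Alleles stay strings so that values such as "30.2" or "X" survive unchanged.
    """
    sample = {}
    for locus, genotype in rows:
        first, second = genotype.strip().split("/")
        sample[locus.strip()] = (first, second)
    return sample

x1 = to_standard_format([
    ("D8S1179", "14/15"),
    ("D21S11", "30.2/31"),
    ("AMEL", "X/X"),
])
print(x1)
```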
The data fields in step 1) are established as follows:
The STR data of all samples are traversed to build the set of STR locus names $STR_N=\{str_1, str_2, \ldots, str_m\}$; for each $str_i$ in $STR_N$, a separate data field is established, denoted $d_i$, $i=1,2,\ldots,m$.
In step 1), the STR data are stored as an inverted index as follows: the sample data set $X$ is traversed; for any $x_i$, each $str_j:(v_{j1}/v_{j2})$ is traversed; if an index for $v_{j1}$ exists in the data field $d_m$ corresponding to $str_j$, $x_i$ is added to that index; if no inverted index for $v_{j1}$ exists, the index is created and $x_i$ is added to it; $v_{j2}$ is processed in the same way.
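The storage step can be sketched in Python as a two-level mapping from locus name to allele value to a set of sample IDs. This is only one possible in-memory layout, chosen here for illustration; the names `new_store` and `add_sample` are hypothetical, and the two genotypes are taken from samples 00001 and 00002 of Table 1 in the embodiment.

```python
from collections import defaultdict

def new_store():
    """One data field per locus; each field maps an allele value to a set of sample IDs."""
    return defaultdict(lambda: defaultdict(set))

def add_sample(fields, sample_id, genotype):
    """Add one sample in standard format {locus: (allele1, allele2)} to the index."""
    for locus, alleles in genotype.items():
        field = fields[locus]                 # creates the data field on first use
        for allele in alleles:
            field[allele].add(sample_id)      # creates the index entry on first use

fields = new_store()
add_sample(fields, "00001", {"D8S1179": ("14", "15"), "D21S11": ("30.2", "31")})
add_sample(fields, "00002", {"D8S1179": ("13", "15"), "D21S11": ("29", "32.2")})
print(sorted(fields["D8S1179"]["15"]))        # ['00001', '00002']
```

Using a set for each index entry means a homozygous genotype such as 13/13 adds the sample only once.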
In step 2), the STR data of the relative-search sample are preprocessed as follows:
The relative-search sample is arranged into the following format: $y=\{str_j:(v_{j1}/v_{j2})\}$, where $str_j$ denotes the name of the $j$-th STR locus and $v_{jk}$ denotes the STR value at locus $j$ on the $k$-th chromosome.
The parent-child relationship index in step 2) is calculated as follows:
For sample $y$, each $str_j:(v_{j1}/v_{j2})$ is traversed; if a data field $d_m$ corresponding to $str_j$ exists, the sample sets indexed by $v_{j1}$ and $v_{j2}$ are obtained and denoted $X_{j1}$ and $X_{j2}$ respectively;
The union of $X_{j1}$ and $X_{j2}$ is taken and denoted $X_j=X_{j1}\cup X_{j2}$;
After the $X_j$ corresponding to each $str_j$ has been obtained, the union of all $X_j$ is calculated, $X=X_1\cup X_2\cup\ldots\cup X_J$; each element of $X$ is a candidate sample;
For each element $x_i$ in $X$, calculate $q_i=\sum_{j=1}^{J}s_{ij}$, where the per-locus score $s_{ij}=1$ if $x_i\in X_j$ and $s_{ij}=0$ otherwise;
Then $q_i$ is the parent-child relationship index of candidate sample $x_i$.
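The calculation can be sketched in Python as follows. The index is assumed to be a plain nested dictionary `{locus: {allele: set of sample IDs}}`, and `relationship_index` is a hypothetical name. The scoring rule (one point per locus at which the candidate appears in $X_j$) follows the per-locus 0/1 scores shown in Table 3 of the embodiment. The small index below holds samples 00001 to 00003 from Table 1 at two loci only.

```python
from collections import Counter

def relationship_index(fields, query):
    """Return {sample_id: q}: the number of loci sharing at least one allele with the query."""
    scores = Counter()
    for locus, (v1, v2) in query.items():
        postings = fields.get(locus)
        if postings is None:                 # no data field for this locus: skip it
            continue
        x_j = postings.get(v1, set()) | postings.get(v2, set())   # X_j = X_j1 ∪ X_j2
        for sample_id in x_j:                # each member of X_j scores 1 at locus j
            scores[sample_id] += 1
    return dict(scores)                      # the keys form the candidate set X

fields = {
    "D8S1179": {"10": {"00003"}, "13": {"00002", "00003"},
                "14": {"00001"}, "15": {"00001", "00002"}},
    "D7S820": {"8": {"00002"}, "9": {"00002"},
               "10": {"00001"}, "11": {"00001", "00003"}},
}
query = {"D8S1179": ("13", "15"), "D7S820": ("11", "11")}
print(relationship_index(fields, query))
```

Only samples that share at least one allele with the query at some locus are ever touched, which is where the inverted index saves work compared with scanning every stored sample.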
Whether a parent-child relationship exists is judged in step 2) as follows:
The candidate samples $x_i$ are sorted in descending order of $q_i$; if $q_i\ge\theta$, the donor of relative-search sample $y$ and the donor of candidate sample $x_i$ are considered to be parent and child; otherwise, no parent-child relationship is considered to exist between them; here $\theta$ is a threshold preset by the system, equal to the number of loci minus 1.
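A minimal Python sketch of the ranking and threshold step is given below. The function name `rank_candidates` is hypothetical, ties are broken by sample ID purely to make the output deterministic (the patent does not specify a tie-break), and the scores are the $q_i$ values listed in Table 3 of the embodiment.

```python
def rank_candidates(scores, locus_count, theta=None):
    """Sort candidates by descending q and flag those with q >= theta."""
    if theta is None:
        theta = locus_count - 1              # default threshold: number of loci minus 1
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [(sample_id, q, q >= theta) for sample_id, q in ranked]

# q values as listed in Table 3 (16 loci, so theta = 15).
scores = {"00001": 10, "00002": 11, "00003": 8, "00004": 16, "00005": 10}
for row in rank_candidates(scores, locus_count=16):
    print(row)
```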
Compared with the prior art, the invention has the following beneficial technical effects:
1. Higher query efficiency
Traditional relative-search databases usually use a relational database and improve query efficiency through vertical partitioning, views, statistics and similar optimizations. These retrieval methods are relatively complex and are not easily understood by operation and maintenance staff who lack the relevant background. The invention establishes a separate data field for each locus and stores the paired STR locus data in each field using an inverted index structure, which greatly improves the query efficiency of the system.
2. Higher scalability
Traditional relative-search databases built on relational databases usually require searchers to be tested at a specific set of loci, which greatly limits the scope of use of the database and is very inconvenient for users. The invention places no specific requirement on the loci used; if the system needs to be extended to additional loci, only the corresponding data fields need to be added and the basic data structure need not be modified, which greatly widens the scope of application of the system.
3. Avoids the influence of gene mutation on paternity testing
Gene mutation is one of the major obstacles to accurate paternity testing in a relative-search database. Because of mutation, the STR locus data of parent and child do not necessarily coincide at every locus. When a relational database stores the STR locus data, exact matching across loci therefore becomes hard to carry out: the WHERE conditions of the SQL statement are difficult to match precisely, which greatly limits the effectiveness of paternity testing. The invention stores the STR locus data in inverted indexes in separate data fields; at query time, only the comparison scores in the individual data fields are needed to calculate the parent-child relationship index between samples, which gives the likelihood that the donors are genetically related and allows candidates to be ranked accordingly. This largely avoids the influence of gene mutation on paternity testing. The invention thus effectively improves the robustness and comparison efficiency of the relative-search database while ensuring the reliability of the search results.
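The mutation-tolerance argument can be illustrated with a small Python sketch. The parent and child genotypes below are hypothetical and cover only four loci; the child's D7S820 genotype is written as if one inherited allele had mutated, so the pair no longer shares an allele at that locus.

```python
def shares_allele(a, b):
    """True if two genotypes have at least one allele value in common."""
    return bool(set(a) & set(b))

def relationship_index(parent, child):
    """Number of loci at which the two genotypes share at least one allele."""
    return sum(shares_allele(parent[locus], child[locus])
               for locus in child if locus in parent)

# Hypothetical pair; D7S820 in the child is shown as mutated (no shared allele).
parent = {"D8S1179": ("13", "14"), "D21S11": ("29", "31"),
          "D7S820": ("10", "11"), "vWA": ("16", "18")}
child = {"D8S1179": ("14", "15"), "D21S11": ("31", "32.2"),
         "D7S820": ("9", "12"), "vWA": ("18", "19")}

# Strict test: every locus must match, as an exact WHERE-style condition would require.
exact_match = all(shares_allele(parent[locus], child[locus]) for locus in child)
q = relationship_index(parent, child)
theta = len(child) - 1                      # number of loci minus 1
print(exact_match, q, q >= theta)           # False 3 True
```

The strict all-loci test rejects the pair, while the score-based test with $\theta$ set to the number of loci minus 1 still accepts it.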
Brief description of the drawings
Fig. 1 is a schematic diagram of the overall framework of the system;
Fig. 2 shows the STR locus data storage based on an inverted index structure used in the invention;
Fig. 3 is a flow chart of the algorithm for STR data storage based on an inverted index structure;
Fig. 4 is a flow chart of the algorithm for paternity-test ranked comparison based on the STR data stored as an inverted index;
Fig. 5 and Fig. 6 show the data structure of the D8S1179 field before and after sample No. 00002 is stored, respectively;
Fig. 7 and Fig. 8 show a relative-search system prototype designed and implemented according to the principle of the invention; Fig. 7 shows the STR data stored using the inverted index structure, and Fig. 8 shows the search results produced by the relative-search algorithm.
Detailed description of the embodiments
The invention is described in further detail below with reference to specific embodiments, which explain the invention rather than limit it.
Referring to Fig. 1, the STR data storage and paternity-test ranked comparison method based on an inverted index structure according to the invention mainly comprises two parts. The first is the STR data storage method based on an inverted index structure (see Fig. 3), which establishes a separate data field for each STR locus selected for the samples and stores the STR data within each data field as an inverted index. The second is the paternity-test ranked comparison method (see Fig. 4), which uses the per-field inverted indexes to calculate the kinship between the relative-search sample and the samples in the database, enabling fast, stable and reliable online searching for relatives.
1. STR data storage method based on an inverted index structure
The STR data storage method based on an inverted index structure comprises the following steps: first, all STR data are preprocessed and the STR data of each sample are arranged into the standard format; then each locus is treated as a data field, and each data field stores the STR data of its own locus; finally, the STR data are stored as an inverted index. The detailed procedure is shown in Fig. 3, specifically:
Step 1: Data preprocessing. The data to be stored are arranged into the following format: the sample data set is denoted $X=\{x_1, x_2, \ldots, x_n\}$, where $x_i$ denotes the STR data of the $i$-th individual and can be expressed as $x_i=\{str_j:(v_{j1}/v_{j2})\}$, in which $str_j$ denotes the name of the $j$-th STR locus and $v_{jk}$ denotes the STR value at locus $j$ on the $k$-th chromosome.
Step 2: Establish the data fields. The STR data of all samples are traversed to build the set of STR locus names $STR_N=\{str_1, str_2, \ldots, str_m\}$; for each $str_i$ in $STR_N$, a separate data field is established, denoted $d_i$.
Step 3: Data storage. The sample data set $X$ is traversed; for any $x_i$, each $str_j:(v_{j1}/v_{j2})$ is traversed; if an index for $v_{j1}$ exists in the data field $d_m$ corresponding to $str_j$, $x_i$ is added to that index; if no inverted index for $v_{j1}$ exists, the index is created and $x_i$ is added to it. $v_{j2}$ is processed in the same way.
The STR locus data stored by the above method are shown in Fig. 2. D8S1179, D21S11 and so on at the top are the data fields corresponding to the STR loci; at the lower left are the data keys, whose numbers are STR values; at the lower right are the inverted index lists corresponding to the data keys, in which each number is the ID of a sample donor. For example, the list for STR value 12 contains 1, 5, 7, 13 and 22, meaning that one chromosome of each of samples 1, 5, 7, 13 and 22 has the value 12 at locus D3S1358.
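The layout described for Fig. 2 can be pictured as a nested mapping. The Python sketch below contains only the single entry mentioned in the text (value 12 at locus D3S1358); the helper name `donors_with_allele` is hypothetical, and the actual contents of Fig. 2 are not reproduced here.

```python
# Data field (locus) -> data key (STR value) -> inverted index list of donor IDs.
fields = {
    "D3S1358": {
        "12": [1, 5, 7, 13, 22],
    },
}

def donors_with_allele(fields, locus, value):
    """Return the donor IDs indexed under one STR value of one data field."""
    return fields.get(locus, {}).get(value, [])

print(donors_with_allele(fields, "D3S1358", "12"))   # [1, 5, 7, 13, 22]
print(donors_with_allele(fields, "D3S1358", "15"))   # []
```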
2. Paternity-test ranked comparison method based on an inverted index structure
The STR data storage method based on an inverted index structure yields the STR locus data storage structure shown in Fig. 2. When searching for relatives, the paternity-test ranked comparison method based on the inverted index structure is used, which mainly comprises the following steps: first, the STR data of the relative-search sample are preprocessed and arranged into the standard format; then the STR data of each locus are compared within the corresponding data field, and a final parent-child relationship index is formed; finally, whether a parent-child relationship exists between samples is judged: if the parent-child relationship index is higher than a specified value, the donor of the candidate sample and the donor of the relative-search sample are considered to be parent and child; otherwise, no parent-child relationship is considered to exist between them.
Step 1: Data preprocessing. The relative-search sample is arranged into the following format: $y=\{str_j:(v_{j1}/v_{j2})\}$, where $str_j$ denotes the name of the $j$-th STR locus and $v_{jk}$ denotes the STR value at locus $j$ on the $k$-th chromosome.
Step 2: Calculate the parent-child relationship index. For sample $y$, each $str_j:(v_{j1}/v_{j2})$ is traversed; if a data field $d_m$ corresponding to $str_j$ exists, the sample sets indexed by $v_{j1}$ and $v_{j2}$ are obtained and denoted $X_{j1}$ and $X_{j2}$ respectively. The union of $X_{j1}$ and $X_{j2}$ is taken and denoted $X_j=X_{j1}\cup X_{j2}$. After the $X_j$ corresponding to each $str_j$ has been obtained, the union of all $X_j$ is calculated, $X=X_1\cup X_2\cup\ldots\cup X_J$; each element of $X$ is a candidate sample. For each element $x_i$ in $X$, calculate $q_i=\sum_{j=1}^{J}s_{ij}$, where the per-locus score $s_{ij}=1$ if $x_i\in X_j$ and $s_{ij}=0$ otherwise.
Then $q_i$ is the parent-child relationship index of candidate sample $x_i$.
Step 3: Judge whether a parent-child relationship exists. The candidate samples $x_i$ are sorted in descending order of $q_i$; if $q_i\ge\theta$, the donor of relative-search sample $y$ and the donor of candidate sample $x_i$ are considered to be parent and child; otherwise, no parent-child relationship is considered to exist between them. Here $\theta$ is a threshold preset by the system, which can generally be set to the number of loci minus 1.
A specific example follows:
The samples to be stored are shown in Table 1, and the relative-search sample is shown in Table 2.
Table 1. Example samples to be stored in the relative-search database
Sample ID 00001 00002 00003 00004 00005 ……
D8S1179 14/15 13/15 10/13 13/15 13/14 ……
D21S11 30.2/31 29/32.2 30/31.2 29/32.2 29/30 ……
D7S820 10/11 8/9 11/11 11/11 10/12 ……
CSF1PO 10/11 12/14 10/13 11/12 10/10 ……
D3S1358 15/16 16/16 16/17 15/15 15/16 ……
D5S818 10/11 11/12 11/13 10/11 10/13 ……
D13S317 12/12 11/11 11/12 10/11 11/11 ……
D16S539 10/13 9/11 11/12 11/12 10/12 ……
D2S1338 20/23 21/23 18/24 20/23 18/19 ……
D19S433 13/14 13/13 14/15 13/15.2 13/14 ……
vWA 17/20 14/16 16/17 13/14 17/19 ……
D12S391 18/21 19/20 17/17.3 18/19 18/18 ……
D18S51 13/14 13/15 14/15 13/16 14/17 ……
AMEL X/X X/Y X/Y X/Y X/Y ……
D6S1043 14/21.3 19/20 13/14 10/19 17/18 ……
FGA 19/22 19/24 23/24 24/26 23/23 ……
Table 2. Example relative-search sample
Site STR
D8S1179 13/15
D21S11 29/31
D7S820 11/11
CSF1PO 11/11
D3S1358 15/15
D5S818 10/12
D13S317 10/10
D16S539 9/11
D2S1338 18/23
D19S433 13/14
vWA 14/14
D12S391 18/19
D18S51 13/15
AMEL X/Y
D6S1043 10/18
FGA 23/24
1. STR data storage based on an inverted index structure
Step 1: Data preprocessing. All samples to be stored are arranged into the standard format. Taking sample No. 00001 as an example, the result is:
x1={D8S1179:(14/15), D21S11:(30.2/31), D7S820:(10/11), CSF1PO:(10/11), D3S1358:(15/16), D5S818:(10/11), D13S317:(12/12), D16S539:(10/13), D2S1338:(20/23), D19S433:(13/14), vWA:(17/20), D12S391:(18/21), D18S51:(13/14), AMEL:(X/X), D6S1043:(14/21.3), FGA:(19/22)}
Step 2: Establish the data fields. In this example all samples have exactly the same locus names, so a total of 16 data fields are established, one per locus.
Step 3: Data storage. Take the sample with ID 00002 as an example; at this point the database already contains the sample with ID 00001, as shown in Fig. 5. The first data item, D8S1179:(13/15), is obtained first. The data field D8S1179 already exists in the database and contains an index for 15 but none for 13, so an index for 13 is created and 00002 is added to the indexes for 13 and 15, as shown in Fig. 6. The remaining data fields of sample No. 00002 are then traversed in turn in the same way.
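The storage step for sample 00002 can be traced with a short Python sketch. The starting state of the D8S1179 field is inferred from Table 1 (sample 00001 is 14/15) rather than copied from Fig. 5, and the function name `store` is hypothetical.

```python
# State of the D8S1179 data field after only sample 00001 (14/15) has been stored.
d8s1179 = {"14": ["00001"], "15": ["00001"]}

def store(field, sample_id, alleles):
    """Add a sample to one data field; return the index keys that had to be created."""
    created = []
    for allele in alleles:
        if allele not in field:              # no index for this value yet: create it
            field[allele] = []
            created.append(allele)
        if sample_id not in field[allele]:   # avoid duplicates for homozygous genotypes
            field[allele].append(sample_id)
    return created

new_keys = store(d8s1179, "00002", ("13", "15"))
print(new_keys)    # ['13']
print(d8s1179)     # {'14': ['00001'], '15': ['00001', '00002'], '13': ['00002']}
```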
2. Paternity-test ranked comparison method based on an inverted index structure
Step 1: Data preprocessing.
The relative-search sample in Table 2 is arranged into the following format:
y={D8S1179:(13/15), D21S11:(29/31), D7S820:(11/11), CSF1PO:(11/11), D3S1358:(15/15), D5S818:(10/12), D13S317:(10/10), D16S539:(9/11), D2S1338:(18/23), D19S433:(13/14), vWA:(14/14), D12S391:(18/19), D18S51:(13/15), AMEL:(X/Y), D6S1043:(10/18), FGA:(23/24)}
Step 2: Calculate the parent-child relationship index.
For sample y, the first data field, D8S1179, is considered first: indexes for 13 and 15 exist in it, and the union of their sample sets is Xj={00001, 00002, 00003, 00004, 00005, ...}. On this basis the union over all data fields is calculated, X={00001, 00002, 00003, 00004, 00005, ...}. For each element xi of X, its score is calculated, as shown in Table 3:
Table 3. Sample scores
Sample ID 00001 00002 00003 00004 00005 ……
D8S1179 1 1 1 1 1 ……
D21S11 0 1 0 1 1 ……
D7S820 1 0 1 1 0 ……
CSF1PO 1 0 0 1 0 ……
D3S1358 1 0 0 1 1 ……
D5S818 1 1 0 1 1 ……
D13S317 0 0 0 1 0 ……
D16S539 0 1 1 1 0 ……
D2S1338 1 1 1 1 1 ……
D19S433 1 1 1 1 1 ……
vWA 0 1 0 1 0 ……
D12S391 1 1 0 1 1 ……
D18S51 1 1 1 1 0 ……
AMEL 1 1 1 1 1 ……
D6S1043 0 0 0 1 1 ……
FGA 0 1 1 1 1 ……
qi 10 11 8 16 10
Step 3: Judge whether a parent-child relationship exists.
The candidate samples are sorted in descending order of score. With θ=15, it can be determined that sample 00004 and y have a parent-child relationship.
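The whole worked example can be re-run with a short Python sketch that stores the five samples of Table 1, scores them against the relative-search sample of Table 2, and applies θ=15. All function names are hypothetical, and ties are broken by sample ID only to make the output deterministic. One caveat: with the Table 1 values as printed, sample 00001 (30.2/31) shares allele 31 with the query (29/31) at D21S11, so this sketch scores it 11 rather than the 10 listed in Table 3. The final decision is unaffected.

```python
from collections import defaultdict

LOCI = ("D8S1179 D21S11 D7S820 CSF1PO D3S1358 D5S818 D13S317 D16S539 "
        "D2S1338 D19S433 vWA D12S391 D18S51 AMEL D6S1043 FGA").split()

TABLE_1 = {   # genotypes in LOCI order, transcribed from Table 1
    "00001": "14/15 30.2/31 10/11 10/11 15/16 10/11 12/12 10/13 20/23 13/14 17/20 18/21 13/14 X/X 14/21.3 19/22",
    "00002": "13/15 29/32.2 8/9 12/14 16/16 11/12 11/11 9/11 21/23 13/13 14/16 19/20 13/15 X/Y 19/20 19/24",
    "00003": "10/13 30/31.2 11/11 10/13 16/17 11/13 11/12 11/12 18/24 14/15 16/17 17/17.3 14/15 X/Y 13/14 23/24",
    "00004": "13/15 29/32.2 11/11 11/12 15/15 10/11 10/11 11/12 20/23 13/15.2 13/14 18/19 13/16 X/Y 10/19 24/26",
    "00005": "13/14 29/30 10/12 10/10 15/16 10/13 11/11 10/12 18/19 13/14 17/19 18/18 14/17 X/Y 17/18 23/23",
}
TABLE_2 = "13/15 29/31 11/11 11/11 15/15 10/12 10/10 9/11 18/23 13/14 14/14 18/19 13/15 X/Y 10/18 23/24"

def parse(row):
    """Split a row of "a/b" genotypes into {locus: (a, b)}; alleles stay strings."""
    return {locus: tuple(g.split("/")) for locus, g in zip(LOCI, row.split())}

def build_index(samples):
    """Build {locus: {allele: set of sample IDs}} from the stored samples."""
    fields = defaultdict(lambda: defaultdict(set))
    for sample_id, row in samples.items():
        for locus, alleles in parse(row).items():
            for allele in alleles:
                fields[locus][allele].add(sample_id)
    return fields

def score(fields, query_row):
    """Parent-child relationship index q for every candidate sample."""
    q = defaultdict(int)
    for locus, (v1, v2) in parse(query_row).items():
        if locus not in fields:
            continue
        postings = fields[locus]
        for sample_id in postings.get(v1, set()) | postings.get(v2, set()):
            q[sample_id] += 1
    return dict(q)

q = score(build_index(TABLE_1), TABLE_2)
theta = len(LOCI) - 1                                   # 15
ranked = sorted(q.items(), key=lambda kv: (-kv[1], kv[0]))
matches = [sample_id for sample_id, qi in ranked if qi >= theta]
print(ranked)
print(matches)                                          # ['00004']
```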
The database system prototype designed according to this system is shown in Fig. 7 and Fig. 8: Fig. 7 shows the STR data stored using the inverted index structure described in this patent, and Fig. 8 shows the search results.
In summary, the invention proposes an STR data storage and paternity-test ranked comparison method based on an inverted index structure. By establishing data fields and building indexes on STR values, the method stores paired STR locus data in the form of inverted indexes. On this basis, the paternity-test ranked comparison method based on the inverted index structure calculates the parent-child relationship index between samples, achieving fast paternity-test comparison, improving retrieval and query efficiency, and reducing the influence of gene mutation on paternity-test comparison. In addition, the use of data fields greatly widens the scope of application of the method.

Claims (4)

1. An STR data storage and paternity-test ranked comparison method based on an inverted index structure, characterized by comprising the following steps:
1) STR data storage based on an inverted index structure
First, all STR data are preprocessed and the STR data of each sample are arranged into a standard format; then each locus is treated as a data field, and each data field stores the STR data of its own locus; finally, the STR data are stored as an inverted index;
The STR data are preprocessed and the STR data of each sample are arranged into the standard format, specifically as follows:
The sample data set is denoted $X=\{x_1, x_2, \ldots, x_n\}$;
where $x_i$ denotes the STR data of the $i$-th individual, $x_i=\{str_j:(v_{j1}/v_{j2})\}$, in which $str_j$ denotes the name of the $j$-th STR locus and $v_{jk}$ denotes the STR value at locus $j$ on the $k$-th chromosome;
2) Paternity-test ranked comparison based on the STR data stored as an inverted index
First, the STR data of the relative-search sample are preprocessed and arranged into the standard format; then the STR data of each locus are compared within the corresponding data field, and a final parent-child relationship index is formed; finally, whether a parent-child relationship exists between samples is judged: if the parent-child relationship index is higher than a specified value, the donor of the candidate sample and the donor of the relative-search sample are considered to be parent and child; otherwise, no parent-child relationship is considered to exist between them;
wherein the STR data of the relative-search sample are preprocessed as follows:
The relative-search sample is arranged into the following format: $y=\{str_j:(v_{j1}/v_{j2})\}$, where $str_j$ denotes the name of the $j$-th STR locus and $v_{jk}$ denotes the STR value at locus $j$ on the $k$-th chromosome;
The parent-child relationship index is calculated as follows:
For sample $y$, each $str_j:(v_{j1}/v_{j2})$ is traversed; if a data field $d_m$ corresponding to $str_j$ exists, the sample sets indexed by $v_{j1}$ and $v_{j2}$ are obtained and denoted $X_{j1}$ and $X_{j2}$ respectively;
The union of $X_{j1}$ and $X_{j2}$ is taken and denoted $X_j=X_{j1}\cup X_{j2}$;
After the $X_j$ corresponding to each $str_j$ has been obtained, the union of all $X_j$ is calculated, $X=X_1\cup X_2\cup\ldots\cup X_J$; each element of $X$ is a candidate sample;
For each element $x_i$ in $X$, calculate $q_i=\sum_{j=1}^{J}s_{ij}$, where the per-locus score $s_{ij}=1$ if $x_i\in X_j$ and $s_{ij}=0$ otherwise;
Then $q_i$ is the parent-child relationship index of candidate sample $x_i$.
2. The STR data storage and paternity-test ranked comparison method based on an inverted index structure according to claim 1, characterized in that the data fields in step 1) are established as follows:
The STR data of all samples are traversed to build the set of STR locus names $STR_N=\{str_1, str_2, \ldots, str_m\}$; for each $str_i$ in $STR_N$, a separate data field is established, denoted $d_i$, $i=1,2,\ldots,m$.
3. The STR data storage and paternity-test ranked comparison method based on an inverted index structure according to claim 1, characterized in that in step 1) the STR data are stored as an inverted index as follows: the sample data set $X$ is traversed; for any $x_i$, each $str_j:(v_{j1}/v_{j2})$ is traversed; if an index for $v_{j1}$ exists in the data field $d_m$ corresponding to $str_j$, $x_i$ is added to that index; if no inverted index for $v_{j1}$ exists, the index is created and $x_i$ is added to it; $v_{j2}$ is processed in the same way.
4. The STR data storage and paternity-test ranked comparison method based on an inverted index structure according to claim 1, characterized in that whether a parent-child relationship exists is judged in step 2) as follows:
The candidate samples $x_i$ are sorted in descending order of $q_i$; if $q_i\ge\theta$, the donor of relative-search sample $y$ and the donor of candidate sample $x_i$ are considered to be parent and child; otherwise, no parent-child relationship is considered to exist between them; here $\theta$ is a threshold preset by the system, equal to the number of loci minus 1.
CN201510590067.XA 2015-09-16 2015-09-16 The storage of STR data and paternity test sequence comparison method based on inverted index structure Expired - Fee Related CN105260395B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510590067.XA CN105260395B (en) 2015-09-16 2015-09-16 The storage of STR data and paternity test sequence comparison method based on inverted index structure

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510590067.XA CN105260395B (en) 2015-09-16 2015-09-16 The storage of STR data and paternity test sequence comparison method based on inverted index structure

Publications (2)

Publication Number Publication Date
CN105260395A CN105260395A (en) 2016-01-20
CN105260395B true CN105260395B (en) 2018-05-01

Family

ID=55100087

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510590067.XA Expired - Fee Related CN105260395B (en) 2015-09-16 2015-09-16 The storage of STR data and paternity test sequence comparison method based on inverted index structure

Country Status (1)

Country Link
CN (1) CN105260395B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110349634B (en) * 2019-07-11 2022-09-16 顾永才 System for searching discrete relatives by using gene technology

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101169628A (en) * 2007-11-14 2008-04-30 中控科技集团有限公司 Data storage method and device
US8775410B2 (en) * 2009-02-09 2014-07-08 The Hong Kong Polytechnic University Method for using dual indices to support query expansion, relevance/non-relevance models, blind/relevance feedback and an intelligent search interface

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on short tandem repeats; 王冰梅 et al.; Journal of Beihua University (Natural Science); 2006-02-28; Vol. 7, No. 1; pp. 43-46 *

Also Published As

Publication number Publication date
CN105260395A (en) 2016-01-20

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20180501

Termination date: 20190916
