CN109190089A - Probabilistic synthesis sorting method

Info

Publication number: CN109190089A
Application number: CN201811035247.1A
Authority: CN
Inventors: 李园白; 杨阳; 刘方舟; 王静; 王琳; 张颖; 张一颖; 李萌; 杜昱
Original assignee: Institute Of Information On Traditional Chinese Medicine Cacms
Current assignee: Institute Of Information On Traditional Chinese Medicine Cacms
Priority date: 2018-09-06
Filing date: 2018-09-06
Publication date: 2019-01-11
Anticipated expiration: 2038-09-06
Also published as: CN109190089B

Abstract

The present invention relates to the field of computer technology and provides a probabilistic synthesis sorting method. The method mainly comprises the following steps: decomposing past experimental results into sequence lines that each contain only two comparison elements; counting the repetition frequency of each sequence line; and selecting a starting sequence line and cyclically calculating the position accuracy of each comparison element until an optimal sequence line is obtained in which all comparison elements have the highest position accuracy. The invention is the first to merge multiple sequence lines on the basis of correct probability. A sequence line need not contain all elements, and the order of some elements may be inconsistent between different sequence lines. Instead of exhaustively enumerating all possible sequence lines and then comparing their correct probabilities, the method screens by highest correct probability, building up the sequence line with the highest correct probability step by step, and finally reviews the related elements of that high-probability sequence line. This greatly reduces the amount of calculation and gives a more accurate result.

Description

Probabilistic synthesis sorting method

Technical field

The present invention relates to the field of computer technology, and in particular to a probabilistic synthesis sorting method.

Background art

In scientific research, some experiments produce ranking results. If the ranking results of multiple experiments can be brought together and used comprehensively to form a single combined ranking, this has real research value.

The conventional approach is to pool the measured values of the elements from all experiments and sort them. However, experimental conditions, measuring instruments and experimental methods differ, so different studies of the same experimental object yield different results. The magnitudes measured for the same object can differ considerably between experiments, sometimes by an order of magnitude, which makes it hard to decide which values to keep. A combined ranking based only on measured values is therefore highly inaccurate. For example, consider composition-measurement experiments that compare the content of active ingredient A of Chinese medicine X across several provinces, with the aim of finding which province's X has the highest content of A and hence the best medicinal quality. Some experiments measure provinces Jia, Yi and Bing, some measure Yi, Bing and Ding, and some measure Jia, Yi and Ding; each yields concrete content values of A in X for the provinces it covers. When a researcher wants a comparison across all provinces, such as which province is highest or second highest, sorting the reported values directly by size is highly inaccurate because the experimental conditions differ. In one experiment the measured values for Jia, Yi and Bing may be 0.6 mg/ml, 0.5 mg/ml and 0.4 mg/ml, while in another experiment the measured values for Ding, Yi and Jia are 10 mg/ml, 9 mg/ml and 8 mg/ml; sorting directly by measured value cannot give an accurate combined ranking.

Because experimental conditions are consistent within a single experiment, the ranking of the provinces covered by that experiment is accurate. For example, one experiment yields Jia > Yi > Bing, and another yields Ding > Yi > Jia. To rank all provinces by the content of active ingredient A of Chinese medicine X, a sorting method is needed that integrates the ranking results of the individual experiments. Such a method must not be limited by the differing experimental conditions of each study, and it must also resolve cases where different experiments rank the same provinces differently.

Summary of the invention

The technical problem to be solved by the present invention is to provide a probabilistic synthesis sorting method that can make comprehensive use of the ranking results of different scientific experiments and provide a ranking result with higher accuracy.

To solve the above technical problem, the present invention provides a probabilistic synthesis sorting method comprising the following steps:

S1: define the data set composed of past experimental ranking results as data set P, decompose each sequence line in data set P into sequence lines that contain only two comparison elements, and define the data set composed of all sequence lines containing only two comparison elements as data set Q;

S2: count the repetition frequency of each sequence line in data set Q;

S3: take the sequence line with the highest frequency of occurrence in data set Q as the starting sequence line; on the basis of comparison elements q₁ and q₂ in the starting sequence line, add one at a time each comparison element q_n that newly appears in subsequent sequence lines; combine comparison element q_n with comparison elements q₁ to q_{n-1} and list the n sequence lines that contain comparison elements q₁ to q_n; these n sequence lines form data set M, wherein the position of comparison element q_n differs in every sequence line of data set M, n is a positive integer, and the maximum value of n is the number of comparison elements in data set Q;

S4: following the method of step S1, decompose each sequence line in data set M into n-1 sequence lines that each contain only comparison element q_n and one of the comparison elements q₁ to q_{n-1}; the n groups of n-1 sequence lines decomposed from data set M form data set R;

S5: search data set Q for each sequence line in data set R, compare the order relation of the sequence line in data set Q with that of the sequence line in data set R, and mark the correct frequency or error frequency of the sequence line in data set R according to the comparison result;

S6: calculate the position accuracy of comparison element q_n in each sequence line of data set M, using the formula: position accuracy = (sum of the correct frequencies of the corresponding group of sequence lines in data set R) / (sum of the correct frequencies and error frequencies of that group) × 100%;

S7: select the sequence line of data set M in which comparison element q_n has the highest position accuracy as the starting sequence line for calculating the position accuracy of comparison element q_{n+1}; return to step S3 and repeat steps S3 to S6 to obtain the sequence line in which comparison element q_{n+1} has the highest position accuracy; repeat this cycle until the optimal sequence line, in which every comparison element has the highest position accuracy, is obtained.

Further, step S1 further comprises: combining any two comparison elements in the same sequence line p_n of data set P and ordering them according to their order relation in sequence line p_n, to obtain a sequence line containing only those two comparison elements.

Further, step S5 further comprises: searching data set Q for each sequence line in data set R; if a first sequence line in data set R is identical to a first sequence line in data set Q, marking the frequency of occurrence of that first sequence line in data set Q as the correct frequency of the first sequence line in data set R; if a second sequence line in data set R is the opposite of a second sequence line in data set Q, marking the frequency of occurrence of that second sequence line in data set Q as the error frequency of the second sequence line in data set R; and if a third sequence line in data set R has neither an identical nor an opposite sequence line in data set Q, marking the correct frequency of the third sequence line in data set R as 0.

Further, in step S6, if comparison element q_n has the same position accuracy in two or more sequence lines of data set M and that accuracy is the highest, all of those sequence lines are taken as starting sequence lines and the calculation continues with the next newly appearing comparison element q_{n+1}.

Further, the method comprises step S8 after step S7: calculating the mean position accuracy of all comparison elements in the optimal sequence line by summing the position accuracy of each comparison element in the optimal sequence line and dividing by the number of comparison elements, to obtain the mean position accuracy of the optimal sequence line.

Further, the method comprises step S9 after step S8: reviewing the position accuracy of the comparison elements in the optimal sequence line by searching data set Q for all sequence lines that contain comparison element q_n, comparing each such sequence line with the optimal sequence line, and marking the correct frequency or error frequency of comparison element q_n according to the comparison of order relations.

Further, step S9 further comprises: if the comparison shows that the order relation is the same, marking the frequency of occurrence in data set Q of the sequence line containing comparison element q_n as a position-correct frequency of comparison element q_n in the optimal sequence line; if the comparison shows that the order relation is opposite, marking that frequency of occurrence as a position-error frequency of comparison element q_n in the optimal sequence line; and if the order relation of a sequence line in data set Q containing comparison element q_n does not appear in the optimal sequence line, marking the position accuracy of comparison element q_n in the optimal sequence line as 0.

Further, the method comprises step S10 after step S9: calculating the mean position accuracy of all comparison elements in the optimal sequence line after the review.

The above technical solution of the present invention has the following advantages. The invention addresses the problem of combining multiple sequence lines and is the first to merge them on the basis of correct probability. A sequence line need not contain all elements, and the order of some elements may be inconsistent between different sequence lines. Instead of exhaustively enumerating all possible sequence lines and then comparing their correct probabilities, the invention screens by highest correct probability, building up the sequence line with the highest correct probability step by step, and finally reviews the related elements of that sequence line. This greatly reduces the amount of calculation, and the accuracy of the ranking result is consistent with that of exhaustively enumerating all sequence lines.

Brief description of the drawings

Fig. 1 is a flow diagram of the probabilistic synthesis sorting method of the present invention.

Detailed description of the embodiments

To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawing. The described embodiments are evidently some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.

Fig. 1 is a flow diagram of the probabilistic synthesis sorting method of the present invention. As shown in Fig. 1, the method comprises the following steps:

S1: define the data set composed of past experimental ranking results as data set P, decompose each sequence line in data set P into sequence lines that contain only two comparison elements, and define the data set composed of all sequence lines containing only two comparison elements as data set Q.

In step S1, the data set composed of past experimental ranking results is defined as data set P, each sequence line in data set P is decomposed into sequence lines that contain only two comparison elements, and the data set composed of all such two-element sequence lines is defined as data set Q. The sequence lines in data set Q are obtained by decomposing the sequence lines in data set P: any two comparison elements in the same sequence line p_n of data set P are combined and ordered according to their order relation in p_n, giving a sequence line that contains only those two comparison elements.

For example, data set P contains 6 sequence lines: p₁: A>C>B>D, p₂: A>B>D>C>Y, p₃: A>B>Y, p₄: Y>D, p₅: A>D>C>Y and p₆: A>B>D>C. The goal is to obtain a ranking of the 5 elements A, C, B, D and Y. Choose any two comparison elements in sequence line p₁: A>C>B>D, say A and C; according to their order relation in p₁ they form the sequence line A>C. Similarly, comparison elements A and B form the sequence line A>B according to their order relation in p₁, and so on. Decomposing sequence line p₁: A>C>B>D in this way yields the 6 sequence lines shown in Table 1 (C>D is among them, consistent with its frequency of 1 in Table 2).

Table 1

Serial number	1	2	3	4	5	6
Sequence line	A>C	A>B	A>D	C>B	C>D	B>D
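The decomposition of step S1 can be sketched as follows. This is a minimal illustration, not the patent's own implementation: the function name `decompose` and the representation of a sequence line as a list ordered from highest to lowest are assumptions made for the example. For a four-element line such as p₁ it yields all C(4,2) = 6 ordered pairs, including C>D.

```python
from itertools import combinations

def decompose(line):
    """Split one sequence line (highest first) into two-element sequence lines.

    `decompose` is a hypothetical helper name. combinations() preserves the
    input order, so each pair keeps the relative order the two elements had
    in the original sequence line.
    """
    return [(hi, lo) for hi, lo in combinations(line, 2)]

p1 = ["A", "C", "B", "D"]          # p1: A>C>B>D from the example
for hi, lo in decompose(p1):
    print(f"{hi}>{lo}")
```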

S2: count the repetition frequency of each sequence line in data set Q.

In step S2, the repeated sequence lines in data set Q are aggregated and the repetition frequency of each sequence line is counted. For example, after all 6 sequence lines in data set P above are decomposed, 13 distinct sequence lines are obtained; their repetition frequencies are shown in Table 2.

Table 2

Serial number	1	2	3	4	5	6	7	8	9	10	11	12	13
Sequence line	A>B	A>D	A>C	D>C	B>D	A>Y	C>Y	B>C	B>Y	D>Y	C>B	C>D	Y>D
Frequency	4	4	4	3	3	3	2	2	2	2	1	1	1
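Steps S1 and S2 together can be sketched with a frequency counter. This is an illustrative sketch only: the names `P`, `Q` and `build_q` are chosen for the example, and each sequence line is written as a string of single-letter elements ordered from highest to lowest.

```python
from collections import Counter
from itertools import combinations

# Data set P from the example: six sequence lines, highest element first.
P = ["ACBD", "ABDCY", "ABY", "YD", "ADCY", "ABDC"]

def build_q(lines):
    """Decompose every line into ordered pairs and count how often each occurs."""
    q = Counter()
    for line in lines:
        q.update(combinations(line, 2))   # pairs keep the order of the line
    return q

Q = build_q(P)
for (hi, lo), freq in Q.most_common():
    print(f"{hi}>{lo}\t{freq}")
```

The counter holds 13 distinct two-element sequence lines whose frequencies agree with Table 2; the order among equal frequencies may differ from the table.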

S3: take the sequence line with the highest frequency of occurrence in data set Q as the starting sequence line; on the basis of comparison elements q₁ and q₂ in the starting sequence line, add one at a time each comparison element q_n that newly appears in subsequent sequence lines; combine comparison element q_n with comparison elements q₁ to q_{n-1} and list the n sequence lines that contain comparison elements q₁ to q_n; these n sequence lines form data set M, wherein the position of comparison element q_n differs in every sequence line of data set M, n is a positive integer, and the maximum value of n is the number of comparison elements in data set Q.

In step S3, the sequence line with the highest frequency of occurrence in data set Q is taken as the starting sequence line. For example, in Table 2, A>B, which occurs 4 times, is taken as the starting sequence line. On the basis of comparison elements A and B, the comparison elements that newly appear in subsequent sequence lines are added one at a time, such as comparison element D, which newly appears in the second sequence line. The new comparison element D is combined with the earlier comparison elements A and B to list 3 sequence lines containing A, B and D: (1) A>B>D, (2) A>D>B and (3) D>A>B. The position of comparison element D differs in each of them, so all possible positions of D are listed. These 3 sequence lines form data set M. Here n is a positive integer whose maximum value is the number of comparison elements in data set Q.
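Building data set M amounts to inserting the new element at every possible position of the current sequence line. A minimal sketch, assuming a sequence line is a list ordered from highest to lowest and using the hypothetical helper name `candidates`:

```python
def candidates(base, new):
    """Return every sequence line obtained by inserting `new` into `base`.

    A base line of n-1 elements has n insertion points, so n candidate
    lines are produced. Positions are tried from last to first so that the
    output follows the order (1) A>B>D, (2) A>D>B, (3) D>A>B used in the text.
    """
    return [base[:i] + [new] + base[i:] for i in range(len(base), -1, -1)]

M = candidates(["A", "B"], "D")
for line in M:
    print(">".join(line))
```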

S4: following the method of step S1, decompose each sequence line in data set M into n-1 sequence lines that each contain only comparison element q_n and one of the comparison elements q₁ to q_{n-1}; the n groups of n-1 sequence lines decomposed from data set M form data set R.

In step S4, each sequence line in data set M is decomposed, following the method of step S1, into n-1 sequence lines that each contain only q_n and one of q₁ to q_{n-1}, giving n groups of n-1 sequence lines in total. For example, the 3 sequence lines (1) A>B>D, (2) A>D>B and (3) D>A>B formed from comparison elements A, B and D are decomposed into 3 groups of sequence lines. Each group contains 2 sequence lines, and each sequence line contains only comparison element D and one of A and B, as shown in Table 3. The 3×2 sequence lines in Table 3 form data set R.
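The grouping described for step S4 can be sketched as below. The helper name `pairs_with` is an assumption for this example; it keeps only the two-element sequence lines that involve the newly added element, one group per candidate line.

```python
def pairs_with(line, new):
    """Two-element sequence lines of `line` that contain the element `new`.

    Elements ranked above `new` give (other, new); elements ranked below
    give (new, other). A line of n elements yields n-1 pairs.
    """
    k = line.index(new)
    above = [(other, new) for other in line[:k]]
    below = [(new, other) for other in line[k + 1:]]
    return above + below

M = [["A", "B", "D"], ["A", "D", "B"], ["D", "A", "B"]]
R = [pairs_with(line, "D") for line in M]   # 3 groups of 2 sequence lines
for line, group in zip(M, R):
    print(">".join(line), [f"{hi}>{lo}" for hi, lo in group])
```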

Table 3

S5: search data set Q for each sequence line in data set R, compare the order relation of the sequence line in data set Q with that of the sequence line in data set R, and mark the correct frequency or error frequency of the sequence line in data set R according to the comparison result.

In step S5, data set Q is searched for each sequence line in data set R. If a first sequence line in data set R is identical to a first sequence line in data set Q, the frequency of occurrence of that sequence line in data set Q is marked as the correct frequency of the first sequence line in data set R. For example, searching data set Q (Table 2) for the sequence line A>D from data set R (Table 3) finds an identical sequence line with a frequency of 4, so the correct frequency of A>D in data set R is marked as 4.

If a second sequence line in data set R is the opposite of a second sequence line in data set Q, the frequency of occurrence of that sequence line in data set Q is marked as the error frequency of the second sequence line in data set R. For example, searching data set Q (Table 2) for the sequence line D>B from data set R (Table 3) finds the opposite sequence line B>D with a frequency of 3, so the error frequency of D>B in data set R is marked as 3.

If a third sequence line in data set R has neither an identical nor an opposite sequence line in data set Q, the correct frequency of the third sequence line in data set R is marked as 0. In other words, the sequence line in data set R does not occur in data set Q, which shows that no experiment produced this result, so its correct frequency can be marked as 0.

After data set Q has been searched for all sequence lines in data set R, the result shown in Table 4 is obtained; Table 4 lists the correct and error frequencies of the 3×2 sequence lines.
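The lookup of step S5 can be sketched as a pair of dictionary queries. The dictionary `Q` below copies the frequencies of Table 2; the function name `mark` is hypothetical. One assumption is made explicit: when data set Q contains both a sequence line and its opposite (for example B>C and C>B), both counts are returned, which is consistent with the 81.82% reported for element C in Table 6.

```python
# Frequencies of the two-element sequence lines, copied from Table 2.
Q = {
    ("A", "B"): 4, ("A", "D"): 4, ("A", "C"): 4, ("D", "C"): 3,
    ("B", "D"): 3, ("A", "Y"): 3, ("C", "Y"): 2, ("B", "C"): 2,
    ("B", "Y"): 2, ("D", "Y"): 2, ("C", "B"): 1, ("C", "D"): 1,
    ("Y", "D"): 1,
}

def mark(pair, q):
    """Return (correct frequency, error frequency) of a two-element line.

    The identical line in q counts as correct, the opposite line as error;
    a line absent in both directions gets (0, 0).
    """
    hi, lo = pair
    return q.get((hi, lo), 0), q.get((lo, hi), 0)

print(mark(("A", "D"), Q))   # identical line A>D found
print(mark(("D", "B"), Q))   # only the opposite line B>D found
```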

Table 4

S6: calculate the position accuracy of comparison element q_n in each sequence line of data set M, using the formula: position accuracy = (sum of the correct frequencies of the corresponding group of sequence lines in data set R) / (sum of the correct frequencies and error frequencies of that group) × 100%.

In step S6, the position accuracy of comparison element q_n in each sequence line of data set M is calculated with the formula: position accuracy = (sum of the correct frequencies of the corresponding group of sequence lines in data set R) / (sum of the correct and error frequencies of that group) × 100%. For example, in Table 4, comparison element D occupies a different position in each of the 3 sequence lines (1) A>B>D, (2) A>D>B and (3) D>A>B in data set M. By the formula, the position accuracy of D in sequence line (1) A>B>D is (4 correct + 3 correct) / (4 correct + 3 correct) × 100% = 100%; in sequence line (2) A>D>B it is 4 correct / (4 correct + 3 errors) × 100% = 57.14%; and in sequence line (3) D>A>B it is 0 correct / (4 errors + 3 errors) × 100% = 0. The results are shown in Table 5.
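The position accuracy formula can be checked numerically with a short sketch. The dictionary `Q` copies Table 2, and `position_accuracy` is a hypothetical function name; returning 0 when no related sequence line exists in Q is an assumption made to avoid division by zero.

```python
# Frequencies of the two-element sequence lines, copied from Table 2.
Q = {
    ("A", "B"): 4, ("A", "D"): 4, ("A", "C"): 4, ("D", "C"): 3,
    ("B", "D"): 3, ("A", "Y"): 3, ("C", "Y"): 2, ("B", "C"): 2,
    ("B", "Y"): 2, ("D", "Y"): 2, ("C", "B"): 1, ("C", "D"): 1,
    ("Y", "D"): 1,
}

def position_accuracy(line, new, q):
    """Position accuracy (percent) of element `new` inside `line`."""
    k = line.index(new)
    correct = wrong = 0
    for i, other in enumerate(line):
        if i == k:
            continue
        # Order the pair as it appears in the candidate line.
        pair = (other, new) if i < k else (new, other)
        correct += q.get(pair, 0)        # identical line in Q
        wrong += q.get(pair[::-1], 0)    # opposite line in Q
    total = correct + wrong
    return 100.0 * correct / total if total else 0.0

for cand in (["A", "B", "D"], ["A", "D", "B"], ["D", "A", "B"]):
    print(">".join(cand), f"{position_accuracy(cand, 'D', Q):.2f}%")
```

Run on the three candidate lines, the sketch gives 100.00%, 57.14% and 0.00%, matching the worked values in the text.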

Table 5

S7: select the sequence line of data set M in which comparison element q_n has the highest position accuracy as the starting sequence line for calculating the position accuracy of comparison element q_{n+1}; return to step S3 and repeat steps S3 to S6 to obtain the sequence line in which comparison element q_{n+1} has the highest position accuracy; repeat this cycle until the optimal sequence line, in which every comparison element has the highest position accuracy, is obtained.

In step S7, the sequence line of data set M in which comparison element q_n has the highest position accuracy is selected. For example, among the 3 sequence lines (1) A>B>D, (2) A>D>B and (3) D>A>B of data set M, comparison element D has its highest position accuracy, 100%, in sequence line (1) A>B>D, so (1) A>B>D is selected as the starting sequence line for calculating the next new comparison element C. The method returns to step S3 for the next newly appearing comparison element C and repeats steps S3 to S6 to obtain the sequence line in which C has the highest position accuracy, and so on until the optimal sequence line, in which every comparison element has the highest position accuracy, is obtained.

For example, the position accuracy of comparison elements C and Y can then be calculated. Starting from sequence line A>B>D, in which comparison element D has the highest position accuracy, comparison element C is combined with A>B>D to give 4 sequence lines: (1) A>B>D>C, (2) A>B>C>D, (3) A>C>B>D and (4) C>A>B>D. The method above gives the sequence line in which C has the highest position accuracy, and the position accuracy of comparison element Y is then calculated in the same way. The optimal sequence line of the 5 comparison elements is finally A>B>D>C>Y, and the position accuracy of each comparison element in it is shown in Table 6.

Table 6

Comparison element	A	B	D	C	Y
Position accuracy	100.00%	100.00%	100.00%	81.82%	90.00%
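The loop over steps S3 to S7 can be sketched end to end as follows. This is a simplified illustration under stated assumptions, not the patent's implementation: `Q` copies Table 2 in table order, the names `accuracy` and `merge` are hypothetical, new elements are taken in the order they first appear in Table 2, and ties are resolved by keeping the first best candidate rather than branching into several starting sequence lines.

```python
# Frequencies of the two-element sequence lines, in the order of Table 2.
Q = {
    ("A", "B"): 4, ("A", "D"): 4, ("A", "C"): 4, ("D", "C"): 3,
    ("B", "D"): 3, ("A", "Y"): 3, ("C", "Y"): 2, ("B", "C"): 2,
    ("B", "Y"): 2, ("D", "Y"): 2, ("C", "B"): 1, ("C", "D"): 1,
    ("Y", "D"): 1,
}

def accuracy(line, new, q):
    """Position accuracy (percent) of `new` in `line` (steps S4 to S6)."""
    k = line.index(new)
    correct = wrong = 0
    for i, other in enumerate(line):
        if i == k:
            continue
        pair = (other, new) if i < k else (new, other)
        correct += q.get(pair, 0)
        wrong += q.get(pair[::-1], 0)
    total = correct + wrong
    return 100.0 * correct / total if total else 0.0

def merge(q):
    """Greedy merge: extend the best line by one new element at a time."""
    line = list(max(q, key=q.get))             # first highest-frequency pair
    acc = {e: accuracy(line, e, q) for e in line}
    pending = []                               # new elements, in table order
    for pair in q:
        for e in pair:
            if e not in line and e not in pending:
                pending.append(e)
    for new in pending:
        # Step S3: insert `new` at every position of the current line.
        cands = [line[:i] + [new] + line[i:] for i in range(len(line), -1, -1)]
        # Step S7: keep the candidate with the highest position accuracy.
        line = max(cands, key=lambda c: accuracy(c, new, q))
        acc[new] = accuracy(line, new, q)
    return line, acc

line, acc = merge(Q)
print(">".join(line))
print({e: f"{a:.2f}%" for e, a in acc.items()})
```

With the Table 2 frequencies the sketch ends at A>B>D>C>Y with the position accuracies listed in Table 6.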

This embodiment takes the highest-frequency sequence line containing only two comparison elements as the starting sequence line and extends it according to the position accuracy calculated for each newly added element, until all related elements have been included and the final sequence line with the highest accuracy is formed. The implementation does not exhaustively enumerate all possible sequence lines and then compare their correct probabilities; instead it screens by highest correct probability, building up the sequence line with the highest correct probability step by step.

In step S6, if comparison element q_n has the same position accuracy in two or more sequence lines of data set M and that accuracy is the highest, all of those sequence lines are taken as starting sequence lines and the calculation continues with the next newly appearing comparison element q_{n+1}. For example, suppose that among the 3 sequence lines (1) A>B>D, (2) A>D>B and (3) D>A>B in data set M, comparison element D had the same, highest position accuracy in (1) A>B>D and (2) A>D>B. Both sequence lines would then be retained, and each would be used as a starting sequence line for calculating the next newly appearing comparison element C. Because there is then more than one starting sequence line, more than one optimal sequence line will finally be obtained. If additional starting sequence lines arise repeatedly while the comparison elements are being calculated, the number of optimal sequence lines finally obtained can be large.

Step S8 follows step S7: calculate the mean position accuracy of all comparison elements in the optimal sequence line.

In step S8, the mean position accuracy of all comparison elements in the optimal sequence line is calculated: the position accuracies of the comparison elements in the optimal sequence line are summed and divided by the number of comparison elements. When there are several optimal sequence lines, the mean position accuracy of each is calculated and the lines are ranked by that value. For example, the mean position accuracy for Table 6 above is (100.00% + 100.00% + 100.00% + 81.82% + 90.00%) / 5 = 94.36%.
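The arithmetic of step S8 can be reproduced with a few lines. The dictionary `table6` copies the values of Table 6, and `mean_accuracy` is a hypothetical helper name.

```python
def mean_accuracy(acc):
    """Mean of the per-element position accuracies (percent)."""
    return sum(acc.values()) / len(acc)

# Position accuracies of the optimal sequence line A>B>D>C>Y (Table 6).
table6 = {"A": 100.00, "B": 100.00, "D": 100.00, "C": 81.82, "Y": 90.00}
print(f"{mean_accuracy(table6):.2f}%")
```

When several optimal sequence lines exist, the same function can be applied to each and the lines sorted by the returned value.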

Several starting sequence lines may be chosen while the comparison elements are being calculated. For example, when the very first starting sequence line is selected, data set Q contains 3 sequence lines with the highest frequency of occurrence, A>B, A>D and A>C, and any of them can serve as the starting sequence line for the subsequent calculation. In addition, equal position accuracies arising during the calculation add starting sequence lines for the next comparison element. Calculations carried out from several starting sequence lines can therefore yield several optimal sequence lines. Calculating the mean position accuracy of all comparison elements in each optimal sequence line makes it possible to rank the optimal sequence lines further when there are several.

The position accuracy calculated so far for each comparison element starts from the highest-frequency sequence line and extends it step by step from 2 elements to many. Only the elements that precede each newly added comparison element are considered, so the relevant sequence lines can be listed without a particularly large amount of calculation. From a more comprehensive point of view, however, all elements should take part in calculating the position accuracy of a given element: not only the elements added before it, but also those added after it.

Therefore, step S9 follows step S8: the position accuracy of the comparison elements in the optimal sequence line is reviewed. Data set Q is searched for all sequence lines that contain comparison element q_n, and each such sequence line is compared with the optimal sequence line. If the order relation is the same, the frequency of occurrence in data set Q of the sequence line containing q_n is marked as a position-correct frequency of q_n in the optimal sequence line. If the order relation is opposite, that frequency of occurrence is marked as a position-error frequency of q_n in the optimal sequence line. If the order relation of a sequence line in data set Q containing q_n does not appear in the optimal sequence line, the position accuracy of q_n in the optimal sequence line is marked as 0.

This step reviews the position accuracy of each element and the mean position accuracy of all comparison elements. For example, with optimal sequence line A>B>D>C>Y and the sequence-line frequencies of data set Q in Table 2, searching for all sequence lines that involve comparison element A gives the result shown in Table 7. The reviewed position accuracy of comparison element A is therefore correct frequency 15 / total frequency 15 × 100% = 100%.

Table 7

Two-element sequence line containing A	Frequency	Compared with optimal sequence line
A>B	4	Correct
A>C	4	Correct
A>D	4	Correct
A>Y	3	Correct
Total	15	Correct total: 15

Similarly, the position accuracy of comparison element B is reviewed, with the result shown in Table 8. The reviewed position accuracy of comparison element B is correct frequency 11 / total frequency 12 × 100% = 91.67%.

Table 8

Two-element sequence line containing B	Frequency	Compared with optimal sequence line
A>B	4	Correct
B>D	3	Correct
B>C	2	Correct
B>Y	2	Correct
C>B	1	Error
Total	12	Correct total: 11
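The review of step S9 can be sketched as a scan over every two-element sequence line in data set Q that contains the element under review. `Q` copies Table 2 and `review` is a hypothetical function name; the sketch assumes that every element occurring in Q also occurs in the optimal sequence line.

```python
# Frequencies of the two-element sequence lines, copied from Table 2.
Q = {
    ("A", "B"): 4, ("A", "D"): 4, ("A", "C"): 4, ("D", "C"): 3,
    ("B", "D"): 3, ("A", "Y"): 3, ("C", "Y"): 2, ("B", "C"): 2,
    ("B", "Y"): 2, ("D", "Y"): 2, ("C", "B"): 1, ("C", "D"): 1,
    ("Y", "D"): 1,
}

def review(optimal, element, q):
    """Reviewed position accuracy (percent) of `element` in `optimal`."""
    rank = {e: i for i, e in enumerate(optimal)}   # 0 = highest
    correct = wrong = 0
    for (hi, lo), freq in q.items():
        if element not in (hi, lo):
            continue
        if rank[hi] < rank[lo]:      # same order as in the optimal line
            correct += freq
        else:                        # opposite order
            wrong += freq
    total = correct + wrong
    return 100.0 * correct / total if total else 0.0

optimal = ["A", "B", "D", "C", "Y"]
for e in ("A", "B"):
    print(e, f"{review(optimal, e, Q):.2f}%")
```

For A and B the sketch reproduces the 100% and 91.67% of Tables 7 and 8.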

Table 9 lists the reviewed position accuracy of all comparison elements. From these reviewed values, the mean position accuracy of all comparison elements after review can also be calculated. Finally, the sequence lines are ranked by mean accuracy, and the sequence line with the highest mean accuracy is provided to the user.

Table 9

The sorting starts from a two-element sequence line and lengthens it as later elements are added, so the number of elements taking part in the ranking grows as elements are added. Elements that joined the ranking earlier therefore cannot be compared with elements that have not yet joined. The review step of the present invention lets each element be compared for accuracy against the ranking of all elements, which makes the calculation more comprehensive without making the amount of calculation too large.

The method of the invention is illustrated below using experiments that measure the active ingredients of genuine medicinal materials.

Genuine medicinal materials, also known as authentic medicinal herbs, are medicinal materials that have been selected through long-term clinical use in traditional Chinese medicine and are produced in a specific region by a specific production process. Compared with the same medicinal material produced in other areas, they are of better quality and efficacy and are better known. Radix Salviae Miltiorrhizae is one such genuine medicinal material. Many active-ingredient determination experiments assess the quality of Radix Salviae Miltiorrhizae from a province by studying the active ingredients in that province's material. Tanshinone I is a common and important active ingredient of Radix Salviae Miltiorrhizae. To compare which province has the highest Tanshinone I content, the provinces are ranked using the method of the invention.

The provinces covered by each experimental study are ranked by Tanshinone I content to form the sequence lines from which the two-element sequence lines are derived; some examples are shown in Table 10. The data in Table 10 were selected from published technical journals: part of the Tanshinone I measurement results for different provinces are listed as the starting point for the sorting.

Table 10

Table 11 lists the result of decomposing the sequence line of each experiment into sequence lines containing only two comparison elements.

Table 11

Experiment No.	Two-element sequence line
3	Henan Province > Anhui Province
3	Henan Province > Shaanxi Province
3	Henan Province > Gansu Province
3	Henan Province > Sichuan Province
3	Henan Province > Shandong Province
3	Henan Province > Hebei Province
3	Anhui Province > Shaanxi Province
3	Anhui Province > Gansu Province
3	Anhui Province > Sichuan Province
3	Anhui Province > Shandong Province
3	Anhui Province > Hebei Province
3	Shaanxi Province > Gansu Province
……	……
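The decomposition shown in Table 11 amounts to listing every ordered pair of a ranking. The sketch below illustrates this under stated assumptions: the helper name `decompose` is hypothetical, and the input is only the leading part of the ranking of experiment 3 (Henan, Anhui, Shaanxi, Gansu), which is the portion that the listed rows of Table 11 confirm.

```python
from itertools import combinations


def decompose(sequence_line):
    """Split one ranking (best first) into all two-element sequence lines."""
    # combinations() keeps the input order, so each pair is (higher, lower).
    return list(combinations(sequence_line, 2))


# Leading part of the ranking of experiment 3, as implied by Table 11.
experiment_3_prefix = ["Henan", "Anhui", "Shaanxi", "Gansu"]

pairs = decompose(experiment_3_prefix)
for higher, lower in pairs:
    print(f"{higher} > {lower}")
```

A ranking of k elements yields k(k-1)/2 two-element sequence lines, so the four provinces above produce six pairs, all of which appear in Table 11.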

Table 12 lists the frequency of each two-element sequence line, counted over all experiments.

Table 12

Two-element sequence line	Frequency
Shandong Province > Henan Province	10
Henan Province > Anhui Province	8
Shandong Province > Hebei Province	8
Shandong Province > Sichuan Province	8
Shandong Province > Anhui Province	7
Henan Province > Hebei Province	6
Hebei Province > Anhui Province	6
Henan Province > Sichuan Province	6
Shandong Province > Jiangsu Province	5
Sichuan Province > Henan Province	4
Jiangsu Province > Anhui Province	4
……	……
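The frequency count behind Table 12 can be sketched with a counter over the decomposed pairs. The data below are made up for illustration (three hypothetical experiments, not the journal data used in the embodiment), and the variable names are assumptions. Opposite orderings of the same two provinces are counted as separate sequence lines, as in Table 12, where "Henan Province > Sichuan Province" and "Sichuan Province > Henan Province" appear as separate rows.

```python
from collections import Counter

# Hypothetical decomposed pairs (higher, lower) from three made-up experiments.
decomposed = [
    [("Shandong", "Henan"), ("Shandong", "Anhui"), ("Henan", "Anhui")],
    [("Shandong", "Henan"), ("Henan", "Anhui")],
    [("Shandong", "Henan"), ("Anhui", "Henan")],
]

# Count how often each two-element sequence line occurs across experiments.
freq = Counter(pair for experiment in decomposed for pair in experiment)

# The most frequent pair serves as the starting sequence line.
start_line, start_freq = freq.most_common(1)[0]
print(start_line, start_freq)
```

In the listed rows of Table 12, the corresponding starting sequence line would be "Shandong Province > Henan Province", with a frequency of 10.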

Table 13 lists the accuracy of the optimal sequencing lines obtained by the method of this embodiment. Multiple optimal sequencing lines are formed through repeated cyclic calculation of the correct probability and are listed in descending order of accuracy.

Table 13

Table 14 lists the results of reviewing the accuracy of each optimal sequencing line: the mean correct probability is recalculated, and the sequence lines are ranked and listed by mean correct probability for use by researchers.

Table 14

Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A Probabilistic Synthesis sort method, characterized by comprising the following steps:

S1: a data set composed of the ranking results of past experiments is defined as data set P; each sequence line in data set P is decomposed into sequence lines containing only two comparison elements, and the data set composed of all sequence lines containing only two comparison elements is defined as data set Q;

S3: the sequence line with the highest frequency of occurrence in data set Q is taken as the starting sequence line; on the basis of comparison elements q₁ and q₂ in the starting sequence line, the newly appearing comparison element qₙ in subsequent sequence lines is added one at a time; the comparison element qₙ is combined with comparison elements q₁ to qₙ₋₁, and n sequence lines containing comparison elements q₁ to qₙ₋₁ are listed, the n sequence lines forming data set M, wherein the position of the comparison element qₙ is different in each sequence line of data set M, n is a positive integer, and the maximum value of n is the number of comparison elements in data set Q;

S4: each sequence line in data set M is decomposed, by the method of step S1, into n-1 sequence lines each containing only the comparison element qₙ and one of the comparison elements q₁ to qₙ₋₁; the n groups of n-1 sequence lines decomposed from data set M form data set R;

S5: each sequence line in data set R is searched for in data set Q, the ordering relations of the sequence lines in data set Q and in data set R are compared, and the correct frequency or wrong frequency of each sequence line in data set R is marked according to the comparison result;

S6: the position accuracy of the comparison element qₙ in each sequence line of data set M is calculated by the formula: position accuracy = sum of the correct frequencies of the group of sequence lines in data set R / sum of the correct frequencies and wrong frequencies of that group of sequence lines in data set R × 100%;

S7: the sequence line of data set M in which the comparison element qₙ has the highest position accuracy is selected as the starting sequence line for calculating the position accuracy of comparison element qₙ₊₁; the method returns to step S3 and repeats steps S3 to S6 to obtain the sequence line in which the comparison element qₙ₊₁ has the highest position accuracy; this loop is executed until the optimal sequencing line, in which all comparison elements have the highest position accuracy, is obtained.

2. The Probabilistic Synthesis sort method according to claim 1, characterized in that step S1 further comprises: combining any two comparison elements of a sequence line pₙ of data set P and ordering them according to their ordering relation in the sequence line pₙ, to obtain a sequence line containing only those two comparison elements.

3. The Probabilistic Synthesis sort method according to claim 1, characterized in that step S5 further comprises: searching data set Q for each sequence line in data set R; if a first sequence line in data set R is identical to a first sequence line in data set Q, marking the frequency of occurrence of the first sequence line in data set Q as the correct frequency of the first sequence line in data set R; if a second sequence line in data set R is opposite to a second sequence line in data set Q, marking the frequency of occurrence of the second sequence line in data set Q as the wrong frequency of the second sequence line in data set R; if a third sequence line in data set R has neither an identical nor an opposite sequence line in data set Q, marking the correct frequency of the third sequence line in data set R as 0.

4. The Probabilistic Synthesis sort method according to claim 1, characterized in that, in step S6, if the comparison element qₙ has the same position accuracy in two or more sequence lines of data set M and that accuracy is the highest position accuracy, all of the two or more sequence lines are taken as starting sequence lines, and calculation continues with the next newly appearing comparison element qₙ₊₁.

5. The Probabilistic Synthesis sort method according to claim 1, characterized by further comprising, after step S7, step S8: calculating the mean position accuracy of all comparison elements in the optimal sequencing line, by summing the position accuracy of each comparison element in the optimal sequencing line and dividing by the number of comparison elements, to obtain the mean position accuracy of the optimal sequencing line.

6. The Probabilistic Synthesis sort method according to any one of claims 1 to 5, characterized by further comprising, after step S8, step S9: reviewing the position accuracy of the comparison elements in the optimal sequencing line, by searching data set Q for all sequence lines containing the comparison element qₙ, comparing each sequence line containing the comparison element qₙ with the optimal sequencing line, and marking the correct frequency or wrong frequency of the comparison element qₙ according to the comparison result of the ordering relations.

7. The Probabilistic Synthesis sort method according to claim 6, characterized in that step S9 further comprises: if the comparison result is that the ordering relations are the same, marking the frequency of occurrence in data set Q of the sequence line containing the comparison element qₙ as the position-correct frequency of the comparison element qₙ in the optimal sequencing line; if the comparison result is that the ordering relations are opposite, marking the frequency of occurrence in data set Q of the sequence line containing the comparison element qₙ as the position-wrong frequency of the comparison element qₙ in the optimal sequencing line; if the ordering relation of a sequence line in data set Q containing the comparison element qₙ does not appear in the optimal sequencing line, marking the position accuracy of the comparison element qₙ in the optimal sequencing line as 0.

8. The Probabilistic Synthesis sort method according to claim 7, characterized by further comprising, after step S9, step S10: calculating the mean position accuracy of all comparison elements in the optimal sequencing line after review.
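For illustration only, steps S3 to S7 of claim 1 can be sketched as a greedy insertion loop. This is a simplified reading under stated assumptions, not the claimed implementation: the function names and the toy frequencies are hypothetical, new elements are assumed to be taken in descending order of pair frequency, and ties in position accuracy are resolved by keeping the first candidate rather than carrying all tied lines forward as claim 4 describes. A pair that is absent from data set Q contributes 0 to both frequencies, in line with claim 3.

```python
def position_accuracy(line, new_elem, freq):
    """Position accuracy (percent) of new_elem in one candidate line (S4-S6)."""
    idx = line.index(new_elem)
    correct = wrong = 0
    for j, other in enumerate(line):
        if other == new_elem:
            continue
        # Two-element sequence line implied by the candidate line.
        pair = (new_elem, other) if idx < j else (other, new_elem)
        correct += freq.get(pair, 0)      # same order found in Q
        wrong += freq.get(pair[::-1], 0)  # opposite order found in Q
    total = correct + wrong
    return 100.0 * correct / total if total else 0.0


def merge_sequence_lines(freq):
    """Greedy sketch of S3-S7; returns one line, ties broken by first found."""
    start = max(freq, key=freq.get)  # most frequent pair is the start (S3)
    line = list(start)
    remaining = []
    # Assumption: new elements are visited in descending pair frequency.
    for pair in sorted(freq, key=freq.get, reverse=True):
        for elem in pair:
            if elem not in line and elem not in remaining:
                remaining.append(elem)
    for elem in remaining:
        # Data set M: the new element tried at every possible position.
        candidates = [line[:i] + [elem] + line[i:] for i in range(len(line) + 1)]
        line = max(candidates, key=lambda c: position_accuracy(c, elem, freq))
    return line


# Hypothetical toy data set Q: (higher, lower) -> frequency.
toy_freq = {("A", "B"): 4, ("B", "C"): 3, ("A", "C"): 2, ("C", "B"): 1}
result = merge_sequence_lines(toy_freq)
print(result)  # ['A', 'B', 'C']
```

With these toy frequencies, placing C last scores 5/6 (about 83.33%), against 50% in the middle position and about 16.67% in the first position, so the line A, B, C is kept.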